Senior Site Reliability Engineer - HashiCorp Network, Infrastructure Services
Bengaluru East, Karnataka
Hybrid
Senior
Full Time
Posted January 07, 2026
Tech Stack
hashicorp
amazon-web-services
terraform
kubernetes
python
golang
typescript
microsoft-typescript
datadog
amazon-ec2
amazon-s3
amazon-vpc
google-vpc
amazon-rds-for-mysql
aws-iam
amazon-eks
prometheus
vault
consul
postgresql
rabbitmq
Job Description
**Introduction**
A career in IBM Software means you'll be part of a team that transforms our customers' challenges into industry-leading solutions. We are an infinitely curious team, always seeking new possibilities, and dedicated to creating the world's leading AI-powered, cloud-native software solutions. Our renowned legacy creates endless global opportunities for our network of IBMers. We are a team of deep product experts, ensuring exceptional client experiences, with a focus on delivery, excellence, and obsession over customer outcomes. This position involves contributing to HashiCorp's offerings, now part of IBM, which empower organizations to automate and secure multi-cloud and hybrid environments. You will join a team managing the lifecycle of infrastructure and security, enhancing IBM's cloud solutions to ensure enterprises achieve efficiency, security, and scalability in their cloud journey.
**Your Role And Responsibilities**
**Our Team**
The Vault Radar Infrastructure team builds and maintains the core systems that power our cloud and on-prem platforms. We focus on reliability, scalability, and security so the product team can ship features confidently. Our core stack includes Nomad, Consul, Vault, Terraform, Postgres, RabbitMQ, and AWS services.
**About The Role**
As a Site Reliability Engineer focusing on network, infrastructure, and test operations, you’ll help design, build, and support the networking foundations that connect our cloud and on-prem products. You’ll work with senior engineers to ensure reliable, secure connectivity between services and environments, and to automate routine tasks for faster, safer delivery.
**In This Role, You Will**
- Infrastructure as Code (IaC): Design and deploy AWS cloud infrastructure using Terraform.
- Container Management: Orchestrate workloads with Nomad and Kubernetes.
- Automation: Develop tools in Python, Go, and TypeScript to automate deployments and maintenance.
- Observability: Use Datadog for comprehensive monitoring, logging, and alerting.
- Testing: Maintain automated testing frameworks for infrastructure and pipelines.
- Reliability & Response: Manage capacity planning, participate in on-call rotations, conduct post-mortems, and collaborate with development teams to ensure system resilience and scalability.
**Preferred Education**
Master's Degree
**Required Technical And Professional Expertise**
- Experience: Proven experience in an SRE/DevOps role managing production environments.
- AWS Expertise: Deep knowledge of core AWS services (EC2, S3, VPC, RDS, IAM, EKS, etc.).
- IaC & Automation: Hands-on experience with Terraform, Nomad or Kubernetes orchestration, and scripting in Python/Go/TypeScript.
- Monitoring: Experience implementing monitoring/logging systems (Datadog, Prometheus, etc.).
- Fundamentals: Strong understanding of Linux and networking fundamentals.
- Methodologies: Familiarity with CI/CD pipelines and methodologies.
- Soft Skills: Strong problem-solving, analytical, and communication skills.
**Preferred Technical And Professional Experience**
- Education in Computer Science or a related technical field.
- Relevant certifications, e.g., AWS Certified DevOps Engineer, Certified Kubernetes Administrator (CKA), Terraform Associate, or similar.
- Experience with software such as Terraform, Vault, Nomad, Consul, Postgres, and RabbitMQ.
- Experience in defining and tracking Service Level Indicators (SLIs) and Service Level Objectives (SLOs).