On-site
$14k - $20k
Senior
Full Time
Posted January 08, 2026
Tech Stack
hashicorp
amazon-web-services
terraform
kubernetes
python
golang
typescript
microsoft-typescript
datadog
amazon-ec2
amazon-s3
amazon-vpc
google-vpc
amazon-rds-for-mysql
aws-iam
amazon-eks
prometheus
vault
consul
postgresql
rabbitmq
Job Description
**Our Team**
The Vault Radar Infrastructure team builds and maintains the core systems that power our cloud and on-prem platforms. We focus on reliability, scalability, and security so the product team can ship features confidently. Our core stack includes Nomad, Consul, Vault, Terraform, Postgres, RabbitMQ and AWS services.
**About the Role**
As a Site Reliability Engineer focusing on network, infrastructure and test operations, you'll help design, build, and support the networking foundations that connect our cloud and on-prem products. You'll work with senior engineers to ensure reliable, secure connectivity between services and environments, and to automate routine tasks for faster, safer delivery.
**In this role, you will:**
- **Infrastructure as Code (IaC):** Design and deploy AWS cloud infrastructure using Terraform.
- **Container Management:** Orchestrate workloads with Nomad and Kubernetes.
- **Automation:** Develop tools in Python, Go, and TypeScript to automate deployments and maintenance.
- **Observability:** Use Datadog for comprehensive monitoring, logging, and alerting.
- **Testing:** Maintain automated testing frameworks for infrastructure and pipelines.
- **Reliability & Response:** Manage capacity planning, participate in on-call rotations, conduct post-mortems, and collaborate with development teams to ensure system resilience and scalability.
- Required education
- Bachelor's Degree
- Preferred education
- Master's Degree
- Required technical and professional expertise
- **Experience:** Proven experience in an SRE/DevOps role managing production environments.
- **AWS Expertise:** Deep knowledge of core AWS services (EC2, S3, VPC, RDS, IAM, EKS, etc.).
- **IaC & Automation:** Hands-on experience with Terraform, Nomad or Kubernetes orchestration, and scripting in Python/Go/TypeScript.
- **Monitoring:** Experience implementing monitoring/logging systems (Datadog, Prometheus, etc.).
- **Fundamentals:** Strong understanding of Linux and networking fundamentals.
- **Methodologies:** Familiarity with CI/CD pipelines and methodologies.
- **Soft Skills:** Strong problem-solving, analytical, and communication skills.
- Preferred technical and professional experience
- Education in Computer Science or a related technical field.
- Relevant certifications (e.g., AWS Certified DevOps Engineer, Certified Kubernetes Administrator (CKA), Terraform Associate, or similar).
- Experience with software such as Terraform, Vault, Nomad, Consul, Postgres, and RabbitMQ.
- Experience defining and tracking Service Level Indicators (SLIs) and Service Level Objectives (SLOs).