MLOps Engineer at Master-Works

This role aims to design, implement, and maintain scalable, secure, and reliable **MLOps infrastructure** and **CI/CD pipelines** to enable rapid and high-quality delivery of **machine learning models** and data-driven services to production. The role bridges **ML/Development** and **Operations**, driving automation, reliability, monitoring, and operational excellence across environments.

**Key Responsibilities**

- Build and operate end-to-end pipelines for training, validation, packaging, and deployment across dev/test/prod.
- Implement CI/CD for **code, data, and model artifacts** with quality gates, approvals, and rollbacks.
- Deploy and scale ML services using **Docker** and **Kubernetes** (real-time and batch), with safe rollout strategies.
- Set up **model registry & experiment tracking** and enforce reproducible, versioned releases (e.g., MLflow or equivalent).
- Implement monitoring/alerting for service health, latency, errors, resource usage, plus ML signals (**drift, data quality, model performance**).
- Define operational standards (SLIs/SLOs, incident response, RCA, runbooks) and continuously improve reliability.
- Enforce security best practices (IAM/RBAC, secrets management, network controls, audit logging) and collaborate with DS/ML/Data teams.

**Requirements**

- 3–7 years in MLOps/DevOps/Platform roles with production ML exposure.
- Strong CI/CD + automation, solid Python and Linux, strong troubleshooting.
- Hands-on with Docker + Kubernetes and observability tools (Prometheus/Grafana, ELK, OpenTelemetry or similar).
