KubeCraftJobs

DevOps & Cloud Job Board

Solutions Architect / Data Architect (100% Remote)

Jobs via Dice

Location not specified

Remote
Mid Level
Part Time
Posted January 05, 2026

Tech Stack

microsoft-azure azure-devops github-enterprise azure-databricks unity-catalog dbt terraform serverless python kubernetes azure-data-factory


Job Description

Dice is the leading career destination for tech experts at every stage of their careers. Our client, Apetan Consulting, is seeking the following. Apply via Dice today!

Solutions Architect / Data Architect (100% Remote)

**Reliability, Monitoring, and Incident Management**
- Design, implement, and maintain comprehensive monitoring and alerting for Lakehouse platform components, ingestion jobs, key data assets, and system health indicators (a toy freshness check is sketched at the end of this posting).
- Oversee end-to-end reliability engineering, including capacity planning, throughput tuning, job performance optimization, and preventative maintenance (e.g., integration runtime (IR) updates, compute policy reviews).
- Participate in, and help shape, the on-call rotation for high-priority incidents affecting production workloads, including rapid diagnosis and mitigation during off-hours as needed.
- Develop and maintain incident response runbooks, escalation pathways, stakeholder communication protocols, and operational readiness checklists.
- Lead or participate in post-incident Root Cause Analyses, ensuring durable remediation and preventing recurrence.
- Conduct periodic DR and failover simulations, validating RPO/RTO and documenting improvements.

This role is foundational to ensuring 24/7/365 availability and timely delivery of mission-critical data for clinical, financial, operational, and analytical needs.

**Pipeline Reliability, Ingestion Patterns & Data Quality**
- Strengthen and standardize ingestion pipelines (ADF: landing → raw → curated), including watermarking, incremental logic, backfills, and retry/cancel/resume patterns (see the sketch following the responsibilities below).
- Collaborate with the Data Engineer to modernize logging, automated anomaly detection, pipeline health dashboards, and DQ validation automation.
- Provide architectural guidance, code reviews, mentoring, and best-practice patterns to distributed engineering teams across MedStar.
- Support stabilization of existing ingestion and transformation pipelines across clinical (notes, OHDSI), financial, operational, and quality use cases.

**DevOps, CI/CD, and Infrastructure as Code**
- Administer and improve CI/CD pipelines using Azure DevOps or GitHub Enterprise.
- Support automated testing, environment promotion, and rollback patterns for Databricks and dbt assets.
- Maintain and extend Terraform (or adopt Terraform from another IaC background) for Databricks, storage, networking, compute policies, and related infrastructure.
- Promote version control standards, branching strategies, and deployment governance across data engineering teams.

**Security, FinOps, and Guardrails Partnership**
- Partner with Enterprise Architecture and Security on platform access controls, identity strategy, encryption, networking, and compliance.
- Implement and enforce cost tagging, compute policies, and alerts supporting FinOps transparency and cost governance.
- Collaborate with the team defining agentic coding guardrails, ensuring the Lakehouse platform supports safe and compliant use of AI-assisted code generation and execution.
- Help assess and optimize serverless SQL, serverless Python, and job compute patterns for cost-efficiency and reliability.

**Mentorship, Collaboration, & Distributed Enablement**
- Mentor the mid-level Data Engineer on Databricks, ADF, dbt, observability, DevOps, Terraform, and operational engineering patterns.
- Provide guidance, design patterns, and code review support to multiple distributed data engineering teams (Finance, MCPI, Safety/Risk, Quality, Digital Transformation, etc.).
- Lead platform knowledge-sharing efforts through documentation, workshops, and best-practice guidance.
- Demonstrate strong collaboration skills, balancing independence with alignment across teams.
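The watermarking and retry/resume duties above refer to standard incremental-ingestion techniques. Purely as an illustrative sketch, not a requirement from the posting: `fetch_rows_since`, `load_rows`, and the in-memory watermark store below are hypothetical stand-ins for the ADF activities and lakehouse control tables a real pipeline would use.

```python
import time
from datetime import datetime, timezone

# Hypothetical in-memory watermark store; in practice this would be a
# control table in the lakehouse keyed by pipeline name.
_WATERMARKS: dict[str, datetime] = {}
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def get_watermark(pipeline: str) -> datetime:
    """Return the last successfully loaded event time for a pipeline."""
    return _WATERMARKS.get(pipeline, EPOCH)

def set_watermark(pipeline: str, value: datetime) -> None:
    _WATERMARKS[pipeline] = value

def with_retries(fn, attempts: int = 3, base_delay: float = 2.0):
    """Retry a callable with exponential backoff on transient errors."""
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except (ConnectionError, TimeoutError) as exc:
            if attempt == attempts:
                raise
            delay = base_delay * 2 ** (attempt - 1)
            print(f"transient failure ({exc!r}); retrying in {delay:.0f}s")
            time.sleep(delay)

def run_incremental_load(pipeline: str, fetch_rows_since, load_rows) -> int:
    """One incremental run: pull rows newer than the watermark, load them,
    then advance the watermark only after a successful load, so a failed
    or cancelled run simply resumes from the old watermark next time."""
    watermark = get_watermark(pipeline)
    rows = with_retries(lambda: fetch_rows_since(watermark))
    if not rows:
        return 0
    with_retries(lambda: load_rows(rows))
    # Advance to the max event time actually loaded, not "now", so rows
    # arriving between fetch and commit are not skipped.
    set_watermark(pipeline, max(r["updated_at"] for r in rows))
    return len(rows)

if __name__ == "__main__":
    # Toy source/sink standing in for an ADF copy activity and a merge.
    source = [
        {"id": 1, "updated_at": datetime(2026, 1, 1, tzinfo=timezone.utc)},
        {"id": 2, "updated_at": datetime(2026, 1, 3, tzinfo=timezone.utc)},
    ]
    fetch = lambda since: [r for r in source if r["updated_at"] > since]
    load = lambda rows: print(f"loaded {len(rows)} rows")
    print(run_incremental_load("demo", fetch, load))  # loads 2 rows
    print(run_incremental_load("demo", fetch, load))  # 0: nothing new
```

Under this pattern a backfill is just a watermark reset to an earlier timestamp, and cancelled runs resume naturally because the watermark only advances after a successful load.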
**Optional / Nice-to-Have: OHDSI Platform Support** (Not required for hiring; can be learned on the job.)
- Assist with or support operational administration of the OHDSI/OMOP stack (Atlas, WebAPI, vocabularies, Kubernetes deployments).
- Collaborate with partners to ensure the OHDSI platform is secure, maintainable, and well-integrated with the Lakehouse.

**Required Qualifications**
- 5+ years in cloud data engineering, platform engineering, or solution architecture.
- Strong hands-on expertise in Azure Databricks:
  - Unity Catalog
  - Workspaces & external locations
  - SQL/Python notebooks & Jobs
  - Cluster/warehouse governance
- Solid working experience with Azure Data Factory (pipelines, IRs, linked services).
- Strong SQL and Python engineering skills.
- Experience with CI/CD in Azure DevOps or GitHub Enterprise.
- Experience with Terraform, or with another IaC framework and a willingness to adopt Terraform.
- Demonstrated ability to design or support monitoring, alerting, logging, or reliability systems.
- Strong communication, collaboration, and problem-solving skills.

**Preferred Qualifications (Optional)**
- Advanced Terraform experience.
- Familiarity with healthcare, HIPAA, PHI, or regulated environments.
- Experience with Purview or enterprise cataloging.
- Exposure to OHDSI/OMOP.
- Experience optimizing or refactoring legacy ingestion pipelines.
- Experience supporting secure, controlled AI/agentic execution environments.
- Experience with Epic EHR data exchange and/or the Epic Caboodle or Cogito analytics suites.

**Personal Attributes**
- Hands-on, pragmatic, and operationally minded.
- Comfortable leading both architecture and implementation.
- Collaborative and mentorship-oriented; thrives in small core teams with broad influence.
- Values platform stability, observability, and hardening over shiny features.
- Curious and adaptable, especially with emerging AI-assisted engineering patterns.
- Ability to remain calm and effective during incidents and high-pressure situations.
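The monitoring and DQ-automation duties described in the responsibilities ultimately reduce to codified checks against SLAs. A minimal, hedged sketch of a data-freshness alert follows; the asset names, the four-hour SLA, and the `check_freshness` helper are invented for illustration and are not taken from the posting.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical threshold; a real value would come from the asset's SLA.
FRESHNESS_SLA = timedelta(hours=4)

def check_freshness(asset: str, last_loaded_at: datetime,
                    now: datetime | None = None) -> str | None:
    """Return an alert message if an asset has breached its freshness SLA,
    else None. In production this would feed a dashboard or pager."""
    now = now or datetime.now(timezone.utc)
    lag = now - last_loaded_at
    if lag > FRESHNESS_SLA:
        return (f"ALERT: {asset} is stale by {lag - FRESHNESS_SLA} "
                f"(last load {last_loaded_at:%Y-%m-%d %H:%M} UTC)")
    return None

if __name__ == "__main__":
    now = datetime(2026, 1, 5, 12, 0, tzinfo=timezone.utc)
    # 7h-old asset breaches the 4h SLA; 1h-old asset passes.
    print(check_freshness("curated.encounters",
                          datetime(2026, 1, 5, 5, 0, tzinfo=timezone.utc), now))
    print(check_freshness("curated.claims",
                          datetime(2026, 1, 5, 11, 0, tzinfo=timezone.utc), now))
```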