About the position
As a Cloud DevOps Engineer at ICA, you will own and evolve our AWS infrastructure, CI/CD pipelines, and EKS platform, and define best practices for reliability, security, and cost.
Responsibilities
• Own and evolve our AWS infrastructure across ECS/Fargate and EKS (Kubernetes), including RDS/Postgres, S3, IAM, and VPC.
• Own and evolve CI/CD pipelines (GitHub Actions, Argo CD) and Infrastructure-as-Code (Terraform).
• Set up alerting, log aggregation, and performance dashboards (e.g., CloudWatch, Datadog, or OpenTelemetry).
• Implement secure-by-default practices; support SOC 2 / HIPAA readiness.
• Write Python/Bash scripts for backups, monitoring, deployment hooks, etc.
• Own Postgres operational concerns including tuning, access control, backups, and zero-downtime migrations.
• Design and own our EKS platform: lead workload migrations and define cluster architecture, autoscaling strategy (Karpenter), GitOps workflows (Argo CD), and reliability standards.
• Make and document infrastructure decisions; define best practices for reliability, security, and cost.
Requirements
• 5+ years in DevOps, DevSecOps, or SRE roles
• Experience owning production systems end-to-end, including design, rollout, and ongoing operation
• Strong AWS experience, especially ECS/EKS and Terraform
• Experience implementing or improving monitoring systems and incident workflows
• Comfort with scripting (Python, Bash) for automation and operational tooling
• Familiarity with Postgres in production
• Authorized to work in the United States and able to obtain a Public Trust clearance (required)
Nice-to-haves
• Exposure to Ansible or other configuration management tools
• Experience supporting compliance frameworks (HIPAA, SOC 2)
• Exposure to AI/ML infrastructure (e.g., GPU workloads, model deployment)
• Experience migrating workloads from ECS or similar platforms to Kubernetes
Benefits
• Health Insurance: 100% employer-paid premiums (ICA covers the full cost of one of three offered medical plans)
• Dental Insurance
• Vision Insurance
• Health Spending Account
• Flexible Spending Account
• Life and Disability Insurance
• 401(k) plan with company match
• Paid Time Off (Vacation, Sick Leave and Holidays)
• Education and Professional Development Assistance
• Remote work from anywhere within the continental United States