1. Cloud Infrastructure Design & Operations: Design, build, and optimize highly available and secure cloud architectures on AWS.
2. CI/CD Pipeline Development: Design and maintain end-to-end pipelines from code commit to deployment (build, test, artifact management, release, and rollback) to improve release frequency and quality.
3. Operations Automation: Reduce manual toil by automating configuration management, change execution, and incident response with code and tooling (e.g., Ansible, Terraform, custom scripts/platforms).
4. Infrastructure as Code (IaC): Manage AWS cloud resources and network configurations using Terraform to ensure environment consistency, auditability, and reproducibility.
5. CDN Operations: Configure, monitor, tune, and troubleshoot AWS CDN services to ensure fast, reliable delivery of static assets and streaming media.
6. Monitoring, Alerting & Observability: Build and maintain monitoring, logging, and distributed tracing systems to enable rapid issue identification and drive capacity and performance optimization.
Requirements:
1. Education & Experience:
– Bachelor’s degree in Computer Science, Software Engineering, Telecommunications, or a related field.
– 3+ years of experience in DevOps, SRE, Cloud Operations, or Infrastructure Engineering.
2. Cloud Platform:
– Proficient with AWS: hands-on experience with core services including EC2, VPC, IAM, S3, RDS, Lambda, EKS/ECS.
– Solid understanding of cloud-native principles; familiar with containers (Docker), orchestration (Kubernetes or the managed AWS EKS service), service mesh, and serverless technologies.
3. CI/CD:
– Proficient in designing and maintaining CI/CD pipelines (e.g., GitLab CI, GitHub Actions, ArgoCD).
– Familiar with pipeline design principles: multi-environment management, branching strategies, artifact management, blue-green/canary deployments, and rollback mechanisms.
4. Observability:
– Experience building and optimizing monitoring stacks such as Prometheus, Grafana, ELK/Loki.
5. Operations Automation & Programming:
– Strong automation programming skills: proficient in at least one scripting/programming language (e.g., Shell, Python, Go) for writing automation scripts and tools.
– Able to design automation solutions for operational scenarios and integrate with existing monitoring, ticketing, and configuration management systems.
6. Infrastructure as Code (IaC):
– Proficient with Terraform: proven hands-on experience managing AWS resources with a modular design.
– Familiar with Ansible for configuration management and batch operations (e.g., system initialization, application deployment, configuration distribution).
7. CDN Operations:
– Hands-on CDN operations experience: familiar with AWS CloudFront configuration, caching strategies, and domain and certificate management.
– Able to perform bandwidth/traffic and cost analysis, origin optimization, troubleshooting, and SLA assurance.
8. Soft Skills:
– Strong documentation habits, communication and collaboration skills, and the ability to drive initiatives across teams.
– Solid problem analysis and post-mortem skills; able to participate in or lead root cause analysis and drive improvement actions.