About the Role
We’re looking for a senior infrastructure and operations engineer to own and evolve our platform reliability. You’ll design, operate, and maintain our Kubernetes-based infrastructure, build reliable monitoring and alerting pipelines, and ensure our systems remain stable under real-world load and failure conditions.
This is a hands-on role for someone who has deep experience running production systems at scale and who focuses on making infrastructure predictable and stable. You’ll work across Kubernetes, networking, CI/CD, Cloudflare, and observability to create a platform that engineers can trust.
What You’ll Do
- Design, deploy, and maintain production Kubernetes clusters.
- Own cluster reliability, upgrades, security, and performance.
- Build and operate monitoring, logging, and alerting pipelines.
- Ensure full-stack observability across infrastructure and services.
- Design and maintain CI/CD pipelines that are fast, reproducible, and safe.
- Improve deployment strategies (rollouts, canaries, rollbacks).
- Automate infrastructure provisioning and configuration.
- Investigate and resolve production incidents.
- Improve system resilience, redundancy, and recovery strategies.
- Define SLOs/SLIs and track reliability targets.
- Optimize and maintain our Cloudflare setup (caching, routing, security, edge behavior).
- Work closely with engineering teams to improve operational practices.
- Identify and remove single points of failure.
What We’re Looking For
Must-have
- Senior-level experience operating production infrastructure.
- Deep, hands-on expertise with Kubernetes (cluster internals, networking, storage, security).
- Strong networking fundamentals (TCP/IP, routing, DNS, TLS, load balancing).
- Experience debugging distributed systems and network-related issues.
- Experience optimizing CDN and edge setups, including Cloudflare.
- Strong experience building monitoring and observability systems.
- Experience with metrics, logs, traces, and alerting pipelines.
- Experience designing reliable CI/CD pipelines.
- Strong Linux fundamentals.
- Experience with infrastructure as code and automation.
- Ability to debug issues across the entire stack.
- Experience handling incidents and conducting postmortems.
Nice-to-have
- Experience with multi-cluster or multi-region setups.
- Experience with high-throughput or data-heavy systems.
- Experience with Elasticsearch or large-scale data infrastructure.
- Experience with service meshes.
- Experience with cost optimization and capacity planning.
- Experience in regulated or reliability-focused environments.
How You Work
- You assume infrastructure will fail and design accordingly.
- You prioritize reliability, visibility, and recoverability.
- You build systems that engineers trust in production.
- You automate carefully and deliberately.
- You are calm and methodical during incidents.
- You focus on long-term stability over short-term fixes.
- You document and standardize important processes.
Example Problems You Might Work On
- Hardening Kubernetes clusters for high availability and safe upgrades.
- Debugging network latency or connectivity issues across services.
- Optimizing Cloudflare caching, routing, and edge security rules.
- Building monitoring pipelines that provide reliable signals.
- Designing alerting that is actionable and low-noise.
- Improving deployment reliability and rollback safety.
- Removing single points of failure in production systems.
- Ensuring observability across all services and data pipelines.
Why Join Range
- Join one of the fastest-growing sectors in Web3 as stablecoins reach mass adoption.
- Competitive compensation with meaningful equity upside.
- Strong potential for career growth and leadership opportunities.
- Remote-first culture with bi-yearly international off-sites.
- Opportunities for global conference travel and ecosystem engagement.
- Health and wellness benefits.
How to Apply
Send us:
- A short introduction and your background.
- Examples of infrastructure or platform work you’ve led.
- Any public write-ups, repositories, or talks (if available).
We’re particularly interested in engineers who have built and operated Kubernetes platforms, improved network reliability, and optimized CDN/edge setups such as Cloudflare in production.