27 August 2025

Who You Are

We’re looking for someone with a genuine engineering mindset – you’re deeply curious about how things work under the hood and passionate about building robust, elegant solutions. You take ownership of your work from conception to production, treating infrastructure as a craft. You have an innate desire to build and improve systems, not just maintain them. You’re the type of person who investigates root causes rather than applying quick fixes, and you get satisfaction from solving complex distributed systems challenges. You thrive in environments where you can dive deep into technical problems while maintaining a holistic view of the infrastructure landscape.

Responsibilities:

  • Infrastructure Management & Operations
    • Implement and manage container orchestration platforms (Kubernetes, Nomad, Talos) for the Inco testnet and mainnet deployments
    • Ensure infrastructure scalability, reliability, and security through implementation of industry best practices
    • Manage hybrid cloud and on-premises infrastructure with a focus on performance optimization
    • Deploy and maintain blockchain nodes (Ethereum, Solana, etc.) ensuring high availability and performance
  • Monitoring & Observability
    • Build comprehensive monitoring and alerting systems using Prometheus, Grafana, and Loki
    • Design and implement distributed tracing and logging infrastructure
    • Create custom dashboards and metrics for network performance monitoring
    • Establish SLIs/SLOs and implement proactive alerting strategies
  • Automation & Infrastructure as Code
    • Develop and maintain infrastructure as code using Terraform and Ansible
    • Implement GitOps workflows using ArgoCD or similar tools for continuous deployment
    • Build secure CI/CD pipelines with automated security scanning and compliance checks
  • Security & Compliance
    • Implement defense-in-depth security strategies including network segmentation, secrets management (Vault), and vulnerability scanning
    • Manage security baselines, conduct regular audits, and coordinate penetration testing
    • Prepare and maintain infrastructure for compliance audits (SOC2, ISO 27001)
    • Configure VPNs, firewalls, IDS/IPS, and implement zero-trust architecture
  • System Administration & Platform Engineering
    • Configure and optimize distributed systems for high-performance computing workloads
    • Perform network engineering, including routing, load balancing, firewall configuration, and network segmentation
    • Manage distributed systems, bare metal servers, and virtualization platforms
  • Documentation & Collaboration
    • Create comprehensive documentation for infrastructure, security procedures, and incident response playbooks
    • Collaborate with protocol engineering and development teams on secure SDLC practices
    • Contribute to security awareness and training programs

Required Experience:

  • 5+ years of DevOps/SRE experience with production infrastructure management
  • Cloud expertise: Hands-on experience with at least one major cloud provider (AWS, GCP, Azure)
  • Container orchestration: Production experience with Kubernetes, Nomad, and/or Talos
  • IaC proficiency: Hands-on experience with at least one of Terraform and Ansible, along with a GitOps (ArgoCD) implementation
  • Monitoring stack: Production experience with Prometheus, Grafana, and Loki
  • Linux administration: Advanced skills in system administration and security hardening
  • Security implementation: Experience with vulnerability scanning, secrets management, and security automation in CI/CD
  • Networking: Strong understanding of VPNs, firewalls, load balancers, and network security
  • Scripting: Proficiency in Python, Go, or Bash for automation
  • Compliance: Experience with security audits and implementing compliance controls

Preferred Qualifications:

  • Advanced infrastructure: Bare metal provisioning, mkosi/Yocto, virtualization, Intel SGX/TDX/AMD SEV
  • Multi-cloud/hybrid architecture: Designing and operating across multiple cloud providers simultaneously with on-premises infrastructure
  • Blockchain: Understanding of distributed systems security and high-assurance key management systems
  • SOC2: Direct experience preparing for and passing compliance audits
  • Distributed systems: Understanding of distributed systems security and high-assurance key management systems
  • Performance optimization: Experience with high-performance computing, GPU clusters, or latency-critical systems
  • Database operations: Managing PostgreSQL, Redis, or time-series databases at scale
  • Service mesh: Experience with Istio, Linkerd, or similar service mesh technologies
  • Edge computing: Experience with CDN configuration, edge deployments, or geo-distributed systems
  • Open source contributions: Active contributor to infrastructure or DevOps tooling projects

Employment Type: On-site
