Site Reliability Engineer with Nomad

Duration: Permanent

Pay Rate: 120K – 140K annually on W2, plus benefits

Job Description:

Technical/Functional Skills

  • Hands-on experience building and operating Nomad clusters.
  • Experience running Docker-based workloads in production on a platform like Nomad.
  • Strong sense of ownership, customer service, and integrity demonstrated through clear communication.
  • Deep understanding of Linux and large-scale system administration.
  • Understanding of standard networking protocols and concepts such as HTTP, DNS, TCP/IP, the OSI model, subnetting, and load balancing strategies.
  • [nice to have] Coding experience in a high-level programming language such as Python or Golang.

 

Roles & Responsibilities:

  • Keep the lights on: on-call duty and alert handling.
  • Manage new buildouts (additions and decommissions).
  • Develop and maintain scripts for environment monitoring and task automation (Python, Ansible, Puppet).
  • Set up and manage monitoring tools such as Graphite, Prometheus, InfluxDB, and Grafana.
  • Set priorities and work efficiently in a fast-paced environment.
  • Measure and optimize system performance.
  • Deliver high-quality results on time.

 

Qualifications: Bachelor’s degree

Experience Required: 8+ years

Shift: Day