DevOps/On call

Remote (US timezones) · Singapore

What we need

Yep is looking for a Site Reliability Engineer with deep knowledge of Linux and distributed systems to help take care of its distributed crawler and ensure all systems are up and running 24/7. Working experience with bare-metal servers and the ability to participate in a daily on-call rotation are required.

Our system is in large part custom OCaml code and also employs third-party technologies: Debian, ELK, Puppet, and anything else that will solve the task at hand. In this role, be prepared to deal with a 25-petabyte storage cluster, 2,000 bare-metal servers, experimental large-scale deployments, and all kinds of software bugs and hardware deviations on a daily basis.

If you possess a healthy desire to automate everything while being able to quickly resolve urgent issues manually, then we want you! We strive to keep humans away from doing repetitive jobs that can be done by computers and focus instead on foreseeing problems and defining programmatic means to handle them.

Basic requirements:

  • Deep understanding of operating system and networking fundamentals

  • Practical knowledge of Linux userspace and kernel internals

  • Working experience with bare-metal servers

  • Participation in on-call rotation (6 hours every weekday + one weekend per month)

The ideal candidate is expected to:

  • Understand the whole technology stack at all levels: from network and user-space code to OS internals and hardware

  • Independently investigate and resolve infrastructure issues on live production systems, including dealing with hardware problems and interacting with datacenters

  • Develop internal automation: monitoring, setup, statistics

  • Foresee potential problems and prevent them from happening, and apply first aid to infrastructure failures when necessary

  • Help developers with deployment and integration

  • Make well-reasoned technical choices and take responsibility for them

  • Approach problems with a practical mindset and suppress perfectionism when time is a priority

  • Set up automated systems to control infrastructure

  • Possess a healthy detestation for complex shell scripts

  • Be ready to work in a small team and make responsible, independent decisions

Who we are

Yep is a search engine created by the Ahrefs team, which is behind one of the leading SEO toolsets worldwide. We’ve been crawling the entire web 24/7 since 2010, storing petabytes of information about live websites and backlinks.

We’re now looking to form the core product team for Yep.

We are a lean and robust team that strongly believes better technology leads to better solutions for real-world problems. We worship functional languages and static typing, extensively employ code generation and meta-programming, value code clarity and predictability, and are constantly seeking to automate repetitive tasks and eliminate boilerplate, guided by DRY and following KISS. If there is any new technology that will make our life easier, we’ll be the first to give it a try. We solve problems at all levels, from simplifying an overloaded UI and removing code bloat in middleware to tracking bugs all the way down to the CPU.

We rely heavily on open-source code as the only viable way to build maintainable systems, and we contribute back.

Our motto is, "First do it, then do it right, then do it better."

What you get

We offer:

  • Competitive compensation package

  • Informal and thriving work atmosphere

  • [SG office] First-class workplace equipment (hardware & tools)

  • Above-average perks and fringe benefits

Work location for this role could be:

  • Remote (US timezones)

  • Singapore

Apply for this job

To apply for this job, drop us a note at jobs@yep.com.

Please include:

  • Salary expectations.

  • Your CV and a short description of how we can benefit each other.

  • Date of availability.
