Senior DevOps Engineer
You may know Gatsby, a wildly popular open-source project that has nearly 50k GitHub stars and a thriving community of more than 3,000 contributors. Beyond open source, we’re also a newly commercializing business, one that helps the professional developer build blazing-fast apps and websites without needing to become a performance expert.
As a remote-first, community-focused team, Gatsby’s core values include:
- Prioritize the customer
- Expect excellence, have empathy
- Take ownership
- Grow through inclusivity
- Collaborate by default
Why we’re hiring
Gatsby Cloud is growing fast, and we’re building products that make our users’ projects easier to manage, scale, and improve. The team is small with a million interesting problems to solve.
The mission of the Infrastructure team is to architect, build, operate, and enhance Gatsby Cloud infrastructure in support of product initiatives. The team strives to improve performance, reliability, security, and scalability.
As we’re growing, we need more engineers to support these initiatives! You’ll be working closely with our small but amazing team, solving interesting problems and helping us anticipate how we can improve our infrastructure as we scale.
- Help us better understand the Gatsby Cloud user’s experience. We’re always trying to create better monitoring and alerting that allows us to understand what’s happening in Gatsby Cloud and react to customer issues quickly.
- Plan, scope, and execute on large initiatives. We approach our work with specific company goals in mind, and we’re looking for someone who is opinionated on how we can reach them. Right now, we’re deeply focused on our product’s quality and stability as we scale.
- Participate in planning, retrospectives, and other team rituals. We’re still a small, relatively new team, and we’re eager for every team member to help us improve our processes and focus on the most important work, including proposing new practices that have worked well for you!
- Debug and investigate complex and interesting customer issues.
- Be available for the on-call rotation during working hours.
- Write scripts to improve developer tools.
- Demonstrated experience scaling commercial SaaS systems. Bonus points if you grew a system from something small. We’re growing fast, and will continue to do so, and we need someone who is able and willing to question our architecture along the way.
- Experience with Kubernetes cluster design and managing Kubernetes at scale.
- Experience with cloud hosting services, particularly with awareness of Google Cloud’s offerings.
- Proven experience in operations. You’re opinionated about tooling, and have ideas about what we can do better or differently. You’re active, responsible, and thoughtful about metrics. All of this is driven by a desire to understand and relate to the customer experience.
- Experience with nginx.
- Experience with search and data stores such as Elasticsearch, Lucene, or Cassandra.
- Experience creating and maintaining monitoring solutions.
- Experience with Terraform, Ansible, and other infrastructure-as-code (IaC) tools.
- You have experience with security. You’re willing to share your opinions about the most-impactful steps we can take to improve in this area.
- You feel comfortable working on Node.js applications. This will make it easier for you to dive into our Node.js backend applications as needed.
- You have hands-on experience with Gatsby.
The best parts of this job
- You’ll be at the cutting edge of website development, working on one of the fastest-growing site-building frameworks on the market.
- You’ll feel a deep sense of ownership. This role will play a key part in shaping our future as we scale. We’ll welcome your thoughts and opinions about how to improve our infrastructure with open arms.
- An incredible team to learn from and mentor. From domain experts to incredibly talented early-career developers, the Gatsby Cloud team will challenge you, and you will challenge them.
- Challenging technical problems. These include scaling, container orchestration, and running untrusted code at scale. These problems are complex, but rewarding and oh-so-fun.
The worst parts of this job
- Shifting context. You will sometimes have to shift context, whether due to changing priorities or an urgent customer-facing issue.
- We’re a really distributed team. The Infrastructure team, in particular, has contributors from Pacific Time to India Standard Time. We’re passionate about remote work, and strive to create sustainable work schedules for everyone, but this sometimes results in long feedback loops.
- We don’t know what we don’t know. We’ll probably lean on you to suggest improvements to processes that we acknowledge are clunky, and also to point out areas where we need structure we haven’t even scoped yet.
Details of the role
- Type of Work: Full-time
- Location: Remote (preference for UTC-8 to UTC+4)
- Engineering Level: Level 4 (see our Engineering Levels Guide)
Benefits and perks
- Unlimited vacation policy, with a minimum of 15 days paid vacation time
- Amazing health, dental, and vision insurance for you and your family (US only)
- 3 months of paid parental leave covering both adoption and foster placement
- Stock options in a fast-growing startup
- We are distributed-first, so skip the commute
- Setup costs for a home office OR coworking/private office reimbursement
- New laptop of your choice
- WiFi and cell phone reimbursement
- Fly to cool locations 3x/year for company-wide meetups (once it’s safe to travel again!)