Site Reliability Engineer
Okta
⚲ Barcelona
Requirements
- Go
- Kubernetes
- Docker
- Terraform
- ArgoCD
Job description
Okta is The World’s Identity Company. We free everyone to safely use any technology, anywhere, on any device or app. Our flexible and neutral products, Okta Platform and Auth0 Platform, provide secure access, authentication, and automation, placing identity at the core of business security and growth.

At Okta, we celebrate a variety of perspectives and experiences. We are not looking for someone who checks every single box - we’re looking for lifelong learners and people who can make us better with their unique experiences. Join our team! We’re building a world where Identity belongs to you.

Auth0 provides an unparalleled authentication experience for hundreds of millions of users worldwide. Our commitment to reliability is a key foundation of our product and our dedication to exceeding customer availability expectations is a core engineering focus.

As a mid-level Site Reliability Engineer, you'll join our SRE team based in Europe to ensure our production systems are not only operational but also resilient, scalable, and ready for exponential growth. This isn't just about keeping the lights on; it's about directly contributing to the platform's core resiliency and robustness. You'll be a hands-on builder, crafting solutions that make our system more reliable by design.

What you’ll do:
- Design and build custom software in Go to enhance the platform's reliability, resiliency, and redundancy.
- Partner with engineering teams to embed reliability principles, improving the availability, performance, and observability of our services.
- Use your deep understanding of infrastructure and observability principles to identify opportunities for improvement within the product and implement solutions.
- Contribute to our on-call rotation, providing rapid, effective response to critical incidents and using your expertise to troubleshoot, mitigate or accurately escalate production issues.
- Develop and refine our SRE tooling and processes, focusing on automation and operational efficiency.
- Define, document, and champion reliability best practices across the organisation.

We're looking for someone who is not just looking for a job, but a career-defining opportunity to tackle complex challenges at a massive scale. If you're a curious and motivated engineer who's passionate about building reliability directly into the platform, we'd love to hear from you.