JustJoin.IT Hybrid Mid New

Senior Site Reliability Engineer – Cloud & DevOps

ITDS

⚲ Krakow

24 150 - 30 450 PLN net (B2B)

Requirements

  • Cloud Infrastructure
  • Site Reliability Engineering
  • Production Application Support
  • Java
  • Python
  • Jenkins
  • Grafana
  • DevOps
  • Ansible
  • Prometheus

Job description

Unleash the power of reliability and shape the future of cloud and DevOps innovation! Krakow-based opportunity with hybrid work model.

As a Senior Site Reliability Engineer – Cloud & DevOps, you will be working for our client, a global leader in IT solutions, dedicated to ensuring the highest levels of system availability and performance. You will support and optimize production services, implement SRE best practices, and lead critical incident resolutions to drive seamless digital operations and continuous improvement.

Your main responsibilities:

  • Act as a technical lead for supporting highly available (24x7) production services within a global DevOps team.
  • Implement and promote SRE best practices to enhance service availability, performance, and security.
  • Resolve incidents, conduct root cause analysis, and facilitate post-incident reviews to prevent recurrence.
  • Design and review software architecture, defining application SLIs and SLOs to optimize operational health.
  • Build and maintain observability frameworks using tools such as Prometheus and Grafana, automating alerts and monitoring.
  • Plan and execute application and infrastructure migrations, disaster recovery exercises, and product upgrades.
  • Review support queries, develop automation solutions, and enhance self-service capabilities to improve user experience.
  • Provide on-call support during scheduled rotations, ensuring rapid response to critical issues.
  • Participate in scheduled maintenance activities, including weekend tasks, to maintain system reliability with minimal user disruption.

You're ideal for this role if you have:

  • Minimum of 7 years of professional experience in Production Application Support or Site Reliability Engineering.
  • Proven expertise with automation, build, and monitoring tools such as Ansible, Jenkins, Prometheus, and Grafana.
  • Exceptional analytical and troubleshooting skills in high-pressure environments.
  • Strong full-stack engineering skills with Java, Python, JavaScript, NodeJS, React, and SQL.
  • In-depth understanding of the Software Development Life Cycle (SDLC) and its principles.
  • Excellent communication skills to collaborate effectively across global, cross-functional teams.
  • Prior experience supporting large Atlassian Jira and Confluence Data Centre instances is a plus.
  • Ability to learn quickly and adapt to new technologies.

Language required for the role:

  • Fluent English.

Eligibility for the role:

  • Only candidates with an existing legal right to work in the European Union will be considered for this role.

#MAKEYourCareerBETTER

Interested? Apply now and include your CV (preferably in English) along with a statement confirming your consent to the processing and storage of your personal data.