JustJoin.IT Hybrid Mid New

IT - Site Reliability Engineer

emagine Polska

⚲ Pune

Requirements

  • Incident management
  • Configuration management
  • Configuration Management (ITIL)
  • Operations
  • Python
  • Cloud
  • Powershell
  • Security
  • Microsoft Azure
  • CI/CD

Job description

Introduction & Summary

We are seeking a dedicated Site Reliability Engineer (SRE) to join our team. The ideal candidate will possess a strong technical background and operational excellence in ensuring the reliability, availability, and performance of critical systems. You will play a key role in monitoring, troubleshooting, and resolving issues, while leveraging your expertise in observability for robust incident management.

Main Responsibilities

Your core duties will include:

  • Monitoring production systems and services using observability tools.
  • Responding to incidents, alerts, and outages in real time.
  • Participating in a rotating on-call schedule.
  • Designing, implementing, and maintaining observability solutions.
  • Collaborating with development and infrastructure teams to ensure system reliability.
  • Automating operational tasks and documenting procedures.
  • Conducting post-incident reviews and proposing monitoring enhancements.

Key Requirements

  • Bachelor's degree in Information Technology, Computer Science or related field.
  • 2-5 years of experience in cloud and operations engineering.
  • Proficiency with Azure services; AWS and GCP experience is a plus.
  • Hands-on experience with Infrastructure-as-Code (IaC) tools like Terraform.
  • Strong scripting skills in Python, Bash or PowerShell.
  • Familiarity with GitLab CI/CD tools integrated with Azure.
  • Proficiency in monitoring and logging tools.

Nice to Have

  • Master's degree or relevant certifications.

Other Details

This position involves a 24/7 shift rotation, ensuring continuous system reliability and performance. The role emphasizes proactive monitoring and efficient incident response in a collaborative environment.