NoFluffJobs On-site Senior

Site Reliability Engineer

Antal

⚲ Kraków

30 240 - 38 640 PLN (B2B)

Requirements

  • Grafana
  • Prometheus
  • Loki
  • Splunk
  • Unix
  • Linux
  • Cloud
  • GCP
  • RDBMS
  • Ansible
  • Jenkins
  • GitHub Actions
  • Control-M
  • Communication skills

Job description

About the project:

Site Reliability Engineer
📍 Kraków (Hybrid, minimum 2 days/week in the office)
💼 Employment type: B2B

Are you looking for an opportunity to join a high-impact project in a global financial institution that invests heavily in cloud, AI, and DevOps? We're building a new Site Reliability Engineering (SRE) team in Kraków to support a mission-critical Counterparty Credit Risk (CCR) platform, and we're looking for experienced engineers to join the journey.

As part of this role, you'll contribute to the stability, scalability, and observability of a high-volume, distributed platform operating on both Google Cloud Platform and on-prem infrastructure.

What you'll do:

  • Ensure the reliability and high availability of production systems used in global credit risk management.
  • Monitor, detect, and troubleshoot incidents in distributed systems running in cloud and hybrid environments.
  • Implement observability tools (Grafana, Prometheus, Loki, etc.) and improve monitoring and alerting strategies.
  • Lead root cause analysis (RCA) and post-incident reviews to improve resilience and operational efficiency.
  • Collaborate with developers, DevOps engineers, and global support teams to implement SRE best practices.
  • Contribute to CI/CD automation, deployment pipelines, and security/vulnerability remediation.

What we offer:

  • The chance to build and shape a new SRE team supporting a critical platform for global risk management.
  • Work in a modern technology stack: Java, GCP, Apache Beam, Spring Boot, DevOps tooling.
  • Hybrid working model with at least 2 days/week in our Kraków office.
  • Stable, long-term project with excellent opportunities for growth and learning.

📩 Interested? Apply now and take the next step in your career with a team that's redefining reliability at a global scale.

Requirements:

  • 5+ years of experience in supporting or developing distributed systems (Java-based environments preferred).
  • Hands-on experience with monitoring and logging tools: Grafana, Prometheus, Loki, Splunk, etc.
  • Solid understanding of Unix/Linux systems, cloud infrastructure (GCP preferred), and databases (RDBMS).
  • Experience with CI/CD tooling (such as Ansible, Jenkins, and GitHub Actions) and with vulnerability management.
  • Familiarity with job scheduling tools (e.g., Control-M or equivalent).
  • Strong communication skills and ability to drive technical discussions with multiple support teams.
  • Experience working in Agile/Scrum teams.