NoFluffJobs On-site Senior

Senior Site Reliability Engineer

Link Group

⚲ Warszawa

28 560 - 38 640 PLN (B2B)

Requirements

SRE
DevOps
AWS Cloud Services
IaC
Docker
CI/CD Pipelines
GitHub Actions
PostgreSQL
Amazon RDS
SQL
VPC
DNS
Troubleshooting
dig
traceroute
UNIX/Linux
Prometheus
Grafana
Datadog
Dynatrace
Automation
AI
Problem-Solving
Incident management skills

Job description

About the project: We are looking for an experienced Site Reliability Engineer to ensure the reliability, scalability, and performance of large-scale cloud-based web applications. You will work closely with software development, cloud operations, and platform teams to build and maintain resilient infrastructure and improve system stability.

Requirements:
- 5+ years of experience in SRE, DevOps, or similar roles
- Strong experience with AWS cloud services and Infrastructure-as-Code tools
- Hands-on experience with Kubernetes and containerized environments
- Proficiency in Docker and CI/CD pipelines (e.g., GitHub Actions)
- Solid understanding of databases (e.g., PostgreSQL, Amazon RDS) and SQL
- Knowledge of networking concepts (VPC, DNS, troubleshooting tools like dig/traceroute)
- Strong Linux/Unix administration skills
- Experience with observability tools (e.g., Prometheus, Grafana, Datadog, Dynatrace)
- Familiarity with automation and AI-based solutions in infrastructure
- Strong problem-solving and incident management skills

Daily tasks:
- Design and maintain monitoring, alerting, and incident response systems to ensure high availability
- Collaborate closely with engineering, product, and architecture teams
- Build and manage cloud infrastructure using Infrastructure-as-Code (e.g., Terraform, Pulumi) on AWS
- Operate and optimize Kubernetes environments (e.g., EKS)
- Develop and maintain containerized applications using Docker
- Improve CI/CD pipelines and drive automation across deployment processes
- Implement and manage observability tools (logging, metrics, tracing)
- Participate in incident management, postmortems, and reliability improvements
- Support capacity planning, disaster recovery, and system scaling
- Contribute to security, compliance, and operational best practices
- Develop automation and AI-driven solutions for monitoring and incident prevention

🔍 Job Ad Decoder

🔴

ensure the reliability, scalability, and performance of large-scale cloud-based web applications

You will be expected to maintain and improve the system's key metrics, which may mean pressure and long hours when problems occur.

🔴

work closely with software development, cloud operations, and platform teams

You will need to communicate and collaborate effectively with many different teams, which may mean negotiating priorities and resolving conflicts.

🔴

improve system stability

This may mean that the current system is unstable and needs substantial repair work, not just minor improvements.

🔴

Familiarity with automation and AI-based solutions in infrastructure

Although it sounds modern, this may mean the company is only beginning to explore these technologies and expects you to actively implement them from scratch.

🔴

drive automation across

You will be expected to both initiate and carry out automation, which may mean significant responsibility and the need to persuade others to adopt new solutions.

2026-05-18
