Site Reliability Engineer
GLOBALTECH POLAND SPÓŁKA Z OGRANICZONĄ ODPOWIEDZIALNOŚCIĄ
⚲ Warszawa, Wola
Salary: to be agreed
Requirements
- JavaScript
- Python
- Google Cloud Platform
Job description
Our requirements:
3+ years of experience in production support, site reliability engineering (SRE), or DevOps - preferably supporting Apigee APIs.
Strong understanding of cloud infrastructure (GCP preferred) and observability tools.
Proficiency in Python or shell scripting for automation and troubleshooting.
Strong analytical, communication, and incident management skills.
Bachelor’s degree in Computer Science, Engineering, or a related field.
Proficiency in programming languages such as Python and JavaScript.
Excellent problem-solving and analytical skills.
English proficiency at B2-C1 level and Polish proficiency at B1-B2 level.
Ability to work proactively with a high level of initiative and accuracy.
Critical thinking ability ranging from moderately to highly complex tasks.
Nice to have:
Experience with CI/CD tools and Alerts/Monitoring automation.
Familiarity with API integrations.
Responsibilities:
Serve as the first line of defense for production incidents, ensuring rapid triage, root cause analysis, and resolution.
Monitor system health and performance of deployed APIs and integrating applications.
Track and investigate issues related to latency, failures, or broken integrations, escalating to the API engineering group where appropriate.
Collaborate with platform engineers to implement observability, logging, and alerting best practices for API services.
Build diagnostic tools, runbooks, and automated workflows to improve incident response time and reduce manual intervention.
Maintain knowledge bases and playbooks for repeatable troubleshooting and knowledge transfer.
Partner with governance and compliance teams to ensure incidents are documented and remediated in line with internal policy.
Contribute to retrospectives and continuous improvement efforts to harden production systems.
We offer:
working in a global environment with international market-focused projects
using English on a daily basis
private medical care
onboarding training in your first days of work – you will get to know our company better
training for employees: with us you will develop your professional and personal potential
lunch pass/Pluxee
multisport cards at preferential prices
option to join UNUM group life insurance
fresh fruits every Wednesday and delicious coffee from Praska Palarnia every
🔍 Job Ad Decoder
🔴
Serve as the first line of defense for production incidents, ensuring rapid triage, root cause analysis, and resolution.
You will be the first person to respond to outages, which may mean high pressure and the need to act quickly in stressful situations.
🟡
Ability to work proactively with a high level of initiative and accuracy.
You are expected to identify problems and propose solutions on your own, not just follow instructions.
🟡
Critical thinking ability ranging from moderately to highly complex tasks.
You will have to deal with problems that have no obvious solution and require deeper analysis.
🟡
escalating to the API engineering group where appropriate.
Your role may be limited to initial diagnosis, while the actual fixes for more complex bugs may be handed off to other teams.
🔴
reduce manual int
You will probably have to write scripts and tools that automate repetitive tasks, which may require additional time beyond your core duties.