NoFluffJobs Remote work Senior New

Lead DevOps / Monitoring & Automation Engineer

Ework Group

⚲ Remote

25 200 - 29 400 PLN (B2B)

Requirements

  • Python
  • CI/CD
  • Datadog

Job description

About the project:

🔹 For our Client we are looking for a Lead DevOps / Monitoring & Automation Engineer 🔹

Work location: Remote. In exceptional cases, presence at the office (Katowice/Łódź) may be required.

A highly capable Lead Developer focused on application reliability, observability, and test automation. Experienced in designing and deploying integration tests for complex, multi-component systems and in implementing production monitoring solutions. Combines deep Python expertise with a strong DevOps mindset to ensure system health through automated testing, proactive alerting, and insightful dashboards. A collaborative leader who mentors teams, drives technical clarity, and champions sustainable engineering practices.

Requirements:

Core Technical Skills:

  • Python Development: Able to write clean, maintainable Python code for automation and testing purposes using a modern IDE (e.g., VS Code, PyCharm).
  • Integration Testing: Proficient with pytest and similar frameworks to design, write, and deploy comprehensive integration tests that validate end-to-end functionality and inter-component communication across distributed applications.
  • Test Automation & CI/CD: Experience automating the execution of test suites within CI/CD pipelines (e.g., GitLab CI, Jenkins, GitHub Actions) to ensure continuous quality.
  • Application Monitoring: Hands-on experience with Application Performance Monitoring (APM) tools (such as Datadog or similar) to observe production applications.
  • Observability Configuration: Skilled in configuring monitoring agents, defining custom metrics, and creating informative dashboards to visualize application health and performance.
  • Alerting & Incident Response: Proven ability to design and implement automated alerting policies based on key performance indicators (KPIs), error rates, and thresholds to enable proactive incident response.
  • Debugging & Analysis: Proficient in using telemetry data and logs to debug production issues, analyze trends, and collaborate with teams to improve application stability.
  • Cloud & DevOps Basics: Familiar with deploying applications and test suites to serverless environments like AWS Lambda behind API Gateway. Comfortable using boto3 (the AWS SDK for Python) and command-line tools (e.g., curl) for testing and debugging.

Leadership & Collaboration:

  • Strong communicator who can lead technical discussions and translate complex system health data into actionable insights for development teams.
  • Encourages a culture of transparency, learning, and shared responsibility for production stability.
  • Skilled at mentoring developers in testing best practices and observability-driven development.
  • Thrives in cross-functional collaboration, working closely with development and operations teams to enhance system reliability.

Professional Traits:

  • Proactive and highly engaged in ensuring team and system success.
  • Brings structure and calmness to incident management and complex engineering environments.
  • Driven by a desire to build reliable systems through rigorous testing and intelligent monitoring.
  • Values code quality, maintainability, and sustainable engineering practices.

✔️ Nice-to-have:

Supplementary Technical Skills:

  • Advanced Python: Comfortable writing and debugging asyncio-based asynchronous code. Strong understanding of multithreading and concurrency in Python.
  • Infrastructure as Code (IaC): Experience with Terraform for managing cloud resources.
  • Wider Cloud Exposure: Familiarity with deploying Python services to Azure Container Apps or other container orchestration platforms (e.g., Kubernetes).
  • Alternative Observability Tools: Experience with open-source monitoring stacks like Prometheus and Grafana.
  • Security & Authentication: Familiarity with authentication schemes including Basic Auth and OAuth, particularly in the context of testing secured API endpoints.
  • LLM & AI Ecosystem: Knowledge of modern LLM APIs, experience with LangChain for orchestration or LangSmith for observability, or an understanding of the Model Context Protocol (MCP).

Broader Technical Background:

  • Familiarity with a wider range of DevOps tools and practices beyond the core CI/CD tools mentioned above.
  • Experience with other cloud providers or hybrid cloud environments.