MLOps Engineer
DEVAPO SPÓŁKA Z OGRANICZONĄ ODPOWIEDZIALNOŚCIĄ
⚲ Warsaw, Ochota
To be agreed
Requirements
- Python
- Docker
- Kubernetes
- MLflow
- Kubeflow
- Airflow
- AWS SageMaker
- Azure ML
- GCP Vertex AI
- GitHub Actions
- GitLab CI
- Jenkins
- Azure DevOps
- Terraform
- Pulumi
- CloudFormation
- Prometheus
- Grafana
- Datadog
- Databricks
- Azure AI Foundry
- AWS Bedrock
- Qdrant
- Weaviate
- Pinecone
- pgvector
Job description
Our requirements:
Proven experience running ML/AI systems in production — you’ve dealt with model drift, pipeline failures, and scaling issues in real environments
Strong Python skills and hands-on experience with MLOps tooling: MLflow, Kubeflow, Airflow, or similar
Solid experience with containerization (Docker) and orchestration (Kubernetes) in production settings
Working knowledge of at least one major cloud platform (AWS SageMaker, Azure ML, or GCP Vertex AI) and its ML services
Experience with CI/CD tools (GitHub Actions, GitLab CI, Jenkins, or Azure DevOps) applied to ML workflows
Infrastructure as Code experience (Terraform, Pulumi, or CloudFormation)
Understanding of ML fundamentals — you don’t need to build models, but you need to understand what makes them break in production
Experience with monitoring and observability tools (Prometheus, Grafana, Datadog, or similar)
English B2+ — client-facing role, calls and written communication included
Nice to have:
Experience with LLM serving infrastructure (vLLM, TGI, Triton Inference Server)
Databricks, Azure AI Foundry, or AWS Bedrock
GPU infrastructure management and cost optimization
Kafka or streaming pipelines for real-time inference
Experience with vector databases (Qdrant, Weaviate, Pinecone, pgvector) in production RAG setups
Familiarity with AI governance and regulatory context (EU AI Act, GDPR)
About the project:
We are looking for an MLOps Engineer who knows that a model is only as good as the pipeline behind it — someone who has actually kept ML systems running in production, not just deployed a tutorial to a notebook. You will work on international projects for clients in banking, insurance, and telco (US, Netherlands, UK), building the infrastructure that makes AI reliable at scale.
Responsibilities:
Designing, building, and maintaining CI/CD pipelines for ML model training, evaluation, and deployment
Managing model lifecycle end-to-end — from experiment tracking and versioning to production serving and monitoring
Setting up and maintaining infrastructure for ML workloads on cloud platforms (AWS, Azure, or GCP)
Implementing monitoring, alerting, and observability for deployed models — detecting drift, latency issues, and quality degradation
Building and managing feature stores, data pipelines, and ETL processes that feed ML models
Containerizing and orchestrating ML services using Docker and Kubernetes
Collaborating with data scientists and ML engineers to streamline the path from experimentation to production
Implementing Infrastructure as Code (Terraform, Pulumi, or CloudFormation) for reproducible ML environments
Defining and enforcing MLOps best practices, standards, and documentation across teams
We offer:
Funded certifications and training
Private medical care (Medicover)
Multisport card
English language classes
Flexible working hours
Team meetups and integration events
Referral bonus
🔍 Job Ad Decoder
🟡
Proven experience running ML/AI systems in production — you’ve dealt with model drift, pipeline failures, and scaling issues in real environments
Experience solving the problems that arise in real production ML/AI systems is expected, not just theoretical knowledge.
🟡
you don’t need to build models, but you need to understand what makes them break in production
You don’t need to be an expert at building models, but you must understand their limitations and potential problems in a production environment.
🟡
English B2+ — client-facing role, calls and written communication included
Good English is expected because you will communicate with the client, which may mean frequent calls and writing documentation.
🟢
Experience with LLM serving infrastructure (vLLM, TGI, Triton Inference Server)
Experience with specific tools for serving language models is welcome, which may indicate the future direction of the project.
🔴
GPU infrastructure management and cost optimization
This may mean that the project makes heavy use of GPU resources and that cost optimization will be an important part of the work.