Senior ML/AI Engineer
ITEAMLY SPÓŁKA Z OGRANICZONĄ ODPOWIEDZIALNOŚCIĄ
⚲ Warszawa
23,000–34,000 PLN net (+ VAT) / month
Requirements
- Python
Job description
Our requirements:
5+ years in ML engineering / applied AI / search & retrieval / backend AI systems
Strong Python — production services, not just notebooks
Hands-on with LLMs: retrieval, embeddings, structured outputs, tool/agent orchestration
Experience building evaluation pipelines (golden cases, automated checks, human review loops)
Understanding of retrieval failure modes: stale context, hallucination amplification, ranking drift
Knowledge graphs / graph-based memory (Neo4j or similar)
Nice to have:
Fine-tuning, distillation, or local model deployment
Experience with sovereignty-driven model strategy
Agentic systems or long-term memory for AI products
Pragmatic reasoning across vector / lexical / graph / hybrid retrieval approaches
About the project:
• APIs, async jobs, observability, schema validation, production debugging
• Strong bias toward measurable quality improvements over demo-only progress
• AI-native dev tools: Claude Code, Codex, or similar
Responsibilities:
Build scalable memory and retrieval systems for high-context AI workflows
Hybrid retrieval across knowledge graphs, vector, lexical, and structured approaches
Evaluation workflows comparing retrieval and memory strategies on real corpora
Agent memory interfaces — enabling AI workflows to query, inspect, and reuse context
Quality metrics: recall, provenance, latency, cost
Routing logic — deciding when to use lightweight retrieval, deeper reasoning, or agentic workflows
Fine-tuning, distillation, and model adaptation pipelines
Observability for AI workflows: retrieval failures, model regressions, cost budgets
We offer:
Fully remote work,
Opportunity to work on exciting AI/ML projects,
Flexible cooperation model,
Career growth and exposure to modern technologies,
Friendly and collaborative work environment.
🔍 Job Ad Decoder
🟡
Strong Python — production services, not just notebooks
The candidate is expected to build stable, scalable Python applications, not just experiment in interactive environments.
🟡
Hands-on with LLMs: retrieval, embeddings, structured outputs, tool/agent orchestration
Practical experience with large language models is required, including integrating them and applying them to concrete use cases.
🟡
Experience building evaluation pipelines (golden cases, automated checks, human review loops)
The candidate must be able to design and implement model quality evaluation systems that combine automation with human verification.
🟡
Understanding of retrieval failure modes: stale context, hallucination amplification, ranking drift
A deep understanding of information retrieval problems in an AI context, and the ability to solve them, is expected.
🟢
Strong bias toward measurable quality improvements over demo-only progress
The priority is real, measurable quality improvements, not just demonstrations of working prototypes.