AI Engineer (RAG & On-Prem LLMs)
⚲ Warszawa
17 000 - 23 000 PLN (PERMANENT)
Requirements
- Python
- Machine learning
- AI
- ML
- RAG
- LLAMA
- Deep learning
- DeepSeek
- Mistral
- PyTorch
- Hugging Face
- LangChain
- Red Hat
- OpenShift
- NLP
Job description
About the project:
- Private medical care co-financing
- Sports card
- Training & learning opportunities
- Life insurance co-financing
Requirements:
- At least 3 years of professional experience in ML/NLP roles, including 2+ years working with RAG systems
- Proven experience deploying and operating LLM-based solutions in on-prem or hybrid environments
- Hands-on experience with vLLM, LiteLLM, and open-source LLMs such as LLAMA 3.2, DeepSeek, or Mistral
- Strong Python skills and experience with frameworks such as PyTorch, Hugging Face Transformers, and LangChain
- Experience with vector databases (e.g. Neo4j)
- Familiarity with Linux-based systems and Red Hat OpenShift
- Strong problem-solving and analytical skills
- Ability to clearly communicate complex AI concepts to non-technical stakeholders
- Bachelor's, Master's, or PhD degree in Computer Science, Artificial Intelligence, or a related field
- Knowledge of English (B2+/C1)
Daily tasks:
- Architect, implement, and optimize end-to-end Retrieval-Augmented Generation (RAG) pipelines for enterprise use cases in on-premises environments
- Design and integrate retrieval mechanisms (e.g. vector databases such as Neo4j) with generative models (e.g. LLAMA 3.2, Mistral)
- Fine-tune and optimize retrieval and generation components to achieve high accuracy and low latency
- Implement and customize inference servers using vLLM and LiteLLM for efficient and scalable LLM serving
- Integrate open-source large language models with proprietary data sources and enterprise APIs
- Design GPU-optimized, scalable on-prem infrastructure for model training and inference, ensuring security and data governance compliance
- Collaborate with DevOps teams to containerize workflows using Docker and Kubernetes and automate MLOps pipelines
- Apply performance optimization techniques such as quantization, pruning, and dynamic batching
- Monitor system performance, troubleshoot bottlenecks, and ensure high availability
- Work closely with data engineers and business stakeholders to translate business requirements into technical AI solutions in telco environments
🔍 Job Ad Decoder
🔴
Private medical care co-financing
The company covers part of the cost of private medical care; the coverage is not necessarily full or free.
🔴
Sports card
Access to a sports card, which usually requires an employee co-payment or has a limited scope.
🔴
Training & learning opportunities
Access to training and learning opportunities, which may be limited by budget or availability.
🔴
Life insurance co-financing
The company co-finances life insurance, which means the employee will also have to cover part of the cost.
🔴
Bachelor's, Master's, or PhD degree in Computer Science, Artificial Intelligence, or a related field
Although several degree levels are listed, a higher degree is often preferred, or specific experience that can outweigh formal education.