T Hub - AI Expert – RAG & LLM Systems
T-Mobile
⚲ Warszawa
Requirements
- Python
Job description
Requirements:
- Bachelor's, Master's, or PhD degree in Computer Science, Artificial Intelligence, or a related field
- at least 3 years of professional experience in ML/NLP roles, including 2+ years working with RAG systems
- proven experience deploying and operating LLM-based solutions in on-prem or hybrid environments
- hands-on experience with vLLM, LiteLLM, and open-source LLMs such as LLAMA 3.2, DeepSeek, or Mistral
- strong Python skills and experience with frameworks such as PyTorch, Hugging Face Transformers, and LangChain
- experience with vector databases (e.g. Neo4j)
- familiarity with Linux-based systems and Red Hat OpenShift
- strong problem-solving and analytical skills
- ability to clearly communicate complex AI concepts to non-technical stakeholders

About the company:
Working at T-Mobile's T Hub offers a unique and highly rewarding experience in the IT market. As a leader in the telecommunications industry, T-Mobile not only provides a platform to hone your technical skills but also empowers you to be a catalyst for innovation. You'll have the opportunity to work at the forefront of cutting-edge technologies, from 5G to IoT and AI, shaping the future of connectivity. Our commitment to fostering a diverse and inclusive workplace means you'll collaborate with a wide range of talented professionals, learning and growing together. T-Mobile encourages a culture of continuous learning and development, offering mentorship and support to help you thrive in your career.

Responsibilities:
- architect, implement, and optimize end-to-end Retrieval Augmented Generation (RAG) pipelines for enterprise use cases in on-premises environments
- design and integrate retrieval mechanisms (e.g. vector databases such as Neo4j) with generative models (e.g. LLAMA 3.2, Mistral)
- fine-tune and optimize retrieval and generation components to achieve high accuracy and low latency
- implement and customize inference servers using vLLM and LiteLLM for efficient and scalable LLM serving
- integrate open-source large language models with proprietary data sources and enterprise APIs
- design GPU-optimized, scalable on-prem infrastructure for model training and inference, ensuring security and data governance compliance
- collaborate with DevOps teams to containerize workflows using Docker and Kubernetes and automate MLOps pipelines
- apply performance optimization techniques such as quantization, pruning, and dynamic batching
- monitor system performance, troubleshoot bottlenecks, and ensure high availability
- work closely with data engineers and business stakeholders to translate business requirements into technical AI solutions in telco environments

We offer:
- stable employment based on an employment contract
- private medical care and life insurance
- access to professional training platforms such as Percipio, Coursera, and Rodos
- flexible benefits platform – you choose what suits you best
- an additional day off for your birthday or name day
- free parking space
- flexible home/office working model
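For candidates unfamiliar with the RAG pattern named in the responsibilities, a minimal sketch follows. It is purely illustrative: the bag-of-words "embedding" and in-memory store are hypothetical stand-ins for a production embedding model and a vector database such as Neo4j, and the assembled prompt would in practice be sent to an LLM served via vLLM or LiteLLM rather than returned as a string.

```python
import math
from collections import Counter

def embed(text):
    """Toy embedding: bag-of-words term counts (stand-in for a real embedding model)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    num = sum(a[t] * b[t] for t in set(a) & set(b))
    den = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return num / den if den else 0.0

class InMemoryVectorStore:
    """Stand-in for a vector database; stores (embedding, text) pairs."""
    def __init__(self):
        self.docs = []

    def add(self, text):
        self.docs.append((embed(text), text))

    def top_k(self, query, k=2):
        q = embed(query)
        ranked = sorted(self.docs, key=lambda d: cosine(q, d[0]), reverse=True)
        return [text for _, text in ranked[:k]]

def build_prompt(query, store, k=2):
    """Retrieval step: fetch the top-k passages and ground the generation prompt in them."""
    context = "\n".join(store.top_k(query, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

# Hypothetical enterprise snippets for demonstration only.
store = InMemoryVectorStore()
store.add("Our 5G core runs on Red Hat OpenShift clusters.")
store.add("Billing data is exposed through the enterprise API gateway.")
store.add("The cafeteria menu changes every Monday.")
prompt = build_prompt("Where does the 5G core run?", store, k=1)
```

The retrieved context is injected ahead of the question, so the generative model answers from enterprise data rather than from its parametric memory; swapping in a real embedding model and vector index changes only `embed` and the store.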