Bulldogjob Remote work Senior

AI Infrastructure Engineer (GPU)

Pragmatike

⚲ Dubai

Negotiable

Requirements

Python
vLLM
TGI
Triton
Terraform

Job description

About the Role

Pragmatike is recruiting on behalf of a fast-scaling, well-funded distributed cloud infrastructure startup building next-generation AI-native cloud services. The company is redefining how compute is delivered by providing GPU-powered infrastructure for AI/ML workloads, secure storage, and high-speed data transfer through a decentralized architecture that significantly reduces environmental impact compared to traditional cloud providers.

We are seeking an AI Infrastructure Engineer with strong experience in production-grade model serving and infrastructure for AI systems. This is a highly technical, hands-on role focused on building scalable, reliable, and efficient ML inference platforms that power real-time AI applications.

You will be responsible for designing and operating the core infrastructure that serves machine learning models at scale. You will work closely with infrastructure, platform, and applied AI teams to ensure high availability, low latency, and cost-efficient inference systems. Strong ownership, production mindset, and experience with distributed GPU systems are essential.

Your Responsibilities

- Build and operate production-grade model serving infrastructure using frameworks such as vLLM, TGI, Triton, or equivalent

- Design and implement robust deployment pipelines with blue/green and canary rollout strategies for ML models

- Develop and maintain auto-scaling systems, multi-model serving architectures, and intelligent request routing layers

- Optimize GPU utilization, memory efficiency, network throughput, and model artifact storage performance

- Design observability systems for tracking inference latency, throughput, GPU usage, cost metrics, and system health

- Manage model registries and CI/CD pipelines enabling automated and reproducible model deployments

- Own the full lifecycle of ML systems from development through production, including operational support and on-call responsibilities

- Define engineering best practices and contribute to platform scalability in a fast-moving startup environment

🔍 Job Ad Decoder

🔴

fast-scaling, well-funded distributed cloud infrastructure startup

The company is growing quickly and has plenty of funding, but as a startup it may have unstable processes and a fast pace of work.

🔴

redefining how compute is delivered

The company claims to be introducing innovative solutions, but the actual impact and success of these changes may still be unproven.

🔴

significantly reduces environmental impact compared to traditional cloud providers

This is a catchy marketing slogan that may be hard to verify and may have no direct bearing on an engineer's day-to-day duties.

🟡

highly technical, hands-on role

You will be expected to work actively on code and infrastructure, not just manage a team or delegate tasks.

🔴

Strong ownership

You will bear full responsibility for your tasks and projects, which may mean heavy pressure and having to solve problems on your own.

2026-06-08
