AI/ ML Engineer (Data & Schema Engineering)
Acaisoft
⚲ Warszawa
25 200 - 31 920 PLN (B2B)
Requirements
- Python
- Machine learning
- AI
- LLM
Job description
About the project: Hi there! If you’re looking for a high-impact position in an ambitious software house, we’ve got a match for you! We are currently searching for an ML/AI Engineer to work with our US client. The company integrates production-grade, governed AI workflows into complex enterprise systems by addressing common friction points such as poor integration and a lack of domain expertise. The organization combines deep domain knowledge with cutting-edge capabilities to accelerate the deployment of reliable AI products.
Working hours: generally flexible; however, daily availability between 6:00 PM and 9:00 PM CEST is required due to regular meetings with the client’s team.
Requirements:
- 7+ years of industry experience, including 3+ years in a similar role focused on AI/ML systems
- Strong proficiency in Python and SQL for data processing, analysis, and pipeline development
- Hands-on experience with AI/LLM frameworks, particularly LangChain
- Proven experience designing and evaluating AI models, including benchmarking, test set creation, and performance analysis
- Solid understanding of data schema design, dataset generation, and data quality standards for AI applications
- Experience building and maintaining scalable data and evaluation pipelines in production environments
Daily tasks:
- Design, implement, and maintain data schema definitions for AI platform inputs, outputs, and intermediate representations
- Ensure schema compatibility with APIs, databases, and downstream systems, including versioning as the platform evolves
- Develop and curate benchmark test sets to evaluate AI/LLM performance across defined use cases
- Build and maintain benchmarking pipelines, run systematic evaluations, and deliver structured performance reports
- Identify performance gaps and collaborate with cross-functional teams to prioritize model improvements
- Generate synthetic and curated datasets, applying data augmentation and labeling techniques while ensuring data quality, diversity, compliance, and proper documentation