AI Engineer
⚲ Kraków
To be agreed
Requirements
- A/B testing
- Node.js
- Next.js
- LangChain
- OpenRouter
- Prompt Engineering
Job description
Ruby Labs is a tech company with a portfolio of consumer products in health, education, and entertainment (100M+ annual users). We’re looking for a senior AI Engineer (Node.js / Next.js / TypeScript) to shape our AI infrastructure and drive production-ready LLM experiences. You’ll work in a modern stack, making data-driven decisions around model performance, reliability, and cost. You’ll take full ownership of key AI features from experimentation to live production.
Responsibilities:
• Advanced Prompt Engineering: Designing complex, dynamic prompt templates with conditional logic and efficiently reusing information and context within prompts to maximize generation quality and reasoning.
• Structured Outputs & Schemas: Implementing various response schemes (JSON mode, function calling, Zod/JSON schemas) to ensure AI outputs are predictable and ready for seamless integration into application logic.
• Prompt Engineering & Evaluations: Building robust evaluation pipelines and using Langfuse to collect feedback and score the quality of responses in real time.
• Tracing & Debugging: Performing deep debugging of complex LLM chains using Langfuse traces to identify bottlenecks and optimize for cost, latency, and context window usage.
• AI A/B Testing: Running systematic experiments across different models via OpenRouter (e.g., comparing Claude 3.5 Sonnet vs. GPT-4o) and analyzing results based on quantitative metrics.
• Data-Driven Decisions: Making deployment decisions for new prompts or models strictly based on quantitative benchmarks and trace data, rather than intuition.
• Output Scoring & Analysis: Developing scoring systems to analyze the “Problem → Solution” chain and identify root causes of hallucinations or logic errors using Langfuse analytics.
• Model Performance & Fine-Tuning: Regularly re-evaluating model performance as new architectures emerge and performing fine-tuning when necessary to meet specific domain requirements.
Qualifications:
• Node.js & Next.js: Deep knowledge of the stack to build reliable services and handle complex LLM-generated data.
• Dynamic Prompting Skills: Proven experience in building prompts where content is highly dependent on input variables and context injection.
• OpenRouter Experience: Experience working with unified APIs, managing rate limits, and selecting the most cost-effective models for specific tasks.
• Langfuse (or similar): Understanding of LLM observability principles — setting up tracing, creating test datasets, and integrating scoring systems.
• Evaluation Methodology: Experience with frameworks like RAGAS or building custom “LLM-as-a-judge” systems.
• Analytical Mindset: Ability to transform raw generation logs into actionable business metrics and technical insights.
• Iterative Mindset: Focus on continuous product improvement through constant feedback loops.
• Fluency in English and Russian/Ukrainian.
Nice to have:
• Fine-Tuning: Practical experience in fine-tuning models for specific domain tasks or JSON compliance.
• RAG Architecture: Understanding how to build and optimize Retrieval-Augmented Generation systems, including indexing, retrieval, and re-ranking.
• Python: Basic knowledge for working with data science scripts or AI evaluation libraries.
We Offer:
• Remote Work Environment: Embrace the freedom to work from anywhere, anytime, promoting a healthy work-life balance.
• Unlimited PTO: Enjoy unlimited paid time off to recharge and prioritize your well-being, without counting days.
• Paid National Holidays: Celebrate and relax on national holidays with paid time off to unwind and recharge.
• Company-provided MacBook: Experience seamless productivity with top-notch Apple MacBooks provided to all employees who need them.
• Flexible Independent Contractor Agreement: Unlock the benefits of flexibility, autonomy, and entrepreneurial opportunities. Benefit from tax advantages, networking opportunities, reduced employment obligations, and the freedom to work from anywhere.
• Work on an outstanding product: Our platform is trendy, fast-scaling, and redefining how people change their professional roles.
• Make a direct impact: As a Senior AI Engineer, your insights will guide product decisions, influence the roadmap, and shape user experiences.
• Grow with us: This is an ambitious journey with huge potential – perfect for someone eager to grow alongside a cutting-edge product and market.
🔍 Job Ad Decoder
🔴
shape our AI infrastructure and drive production-ready LLM experiences
You are expected to build the AI infrastructure from scratch and ship production-ready LLM solutions, which may mean heavy responsibility and a lack of ready-made tooling.
🔴
You’ll take full ownership of key AI features from experimentation to live production
You will be fully responsible for the entire lifecycle of key AI features, from idea through deployment and maintenance.
🔴
Advanced Prompt Engineering
This may mean the company expects deep understanding and practical skill in building advanced prompts, not just basic familiarity with the topic.
🟡
making data-driven decisions around model performance, reliability, and cost
You are expected to make decisions based on data analysis of model performance, reliability, and cost, which requires analytical and monitoring skills.
🔴
Building robust evaluation pipelines
You will have to design and implement model quality evaluation systems on your own, which can be time-consuming and requires a good understanding of metrics.