JustJoin.IT Remote work Senior

QA and Performance Testing Engineering Lead

Link Group

⚲ Warszawa, Kraków, Wrocław, Poznań, Gdańsk

130 - 150 PLN/h net (B2B)

Requirements

  • CI/CD
  • Testing
  • GitHub Actions
  • Leadership
  • Azure DevOps

Job description

QA & Performance Engineering Lead (AI/LLM Focus)

The Role

We are seeking a high-caliber QA and Performance Engineering Lead to spearhead the testing strategy for enterprise-grade AI and LLM solutions. In this role, you will define the architecture for functional, non-functional, and performance testing, ensuring that complex AI agent workflows and large-scale applications meet the highest standards of reliability and compliance. You will act as a bridge between traditional QA excellence and the cutting-edge requirements of GenAI evaluation.

Core Responsibilities & Technical Expertise

  • Strategic QA Leadership: Leverage 10+ years of experience leading enterprise-wide testing initiatives within Fortune 500 environments to design comprehensive QA architectures.
  • AI/LLM Specialized Evaluation: Implement advanced metrics for model assessment, including BLEU, ROUGE, perplexity, and specialized scoring for hallucination and grounding rates.
  • Performance & Resilience Engineering: Build frameworks for load, stress, and chaos testing to ensure system stability under extreme conditions and peak workloads.
  • Automation & Orchestration: Engineer robust CI/CD test pipelines using Azure DevOps or GitHub Actions, focusing on automated API testing (Pytest/Postman) and integrated test harnesses.
  • Agentic Workflow Validation: Design testing strategies for multi-step AI agents, covering tool chaining, orchestration, and context injection accuracy.
  • Data Governance & Compliance: Apply deep knowledge of data lineage (Purview/Unity Catalog) and maintain strict traceability and auditability standards required in regulated industries.
  • Lifecycle Management: Oversee model release gates, registry promotions, and the management of synthetic datasets and versioning.

Key Deliverables

  • Unified Testing Framework: A standardized taxonomy and coverage model spanning unit, integration, E2E, and AI agent workflows.
  • AI Evaluation Suite: A comprehensive suite for validating model consistency, toxicity, and correctness, supported by Proof-of-Concept (PoC) validations.
  • Automated Performance Harness: Scalable workload models designed for peak-load scenarios and resiliency benchmarking.
  • Smart Quality Gates: Automated pass/fail scoring mechanisms embedded directly into release pipelines across all quality dimensions.
  • Advanced Observability: Implementation of "Golden Dashboards" tracking real-time metrics such as latency-per-thought, grounding quality, and functional pass rates.

Professional Profile

  • Expertise in Enterprise QA Architecture (Functional + Non-functional + Performance).
  • Deep understanding of ML/LLM lifecycle and model promotion pipelines.
  • Strong background in Regulated Industries (ensuring compliance and audit readiness).
  • Hands-on experience with Synthetic Data generation and dataset versioning.