Principal Data Engineer (PST Overlap)
Appliscale
⚲ Kraków
30 000 - 34 000 PLN net (B2B)
Requirements
- Databricks
- Apache Spark
- Data Build Tool (DBT)
- Apache Airflow
- Atlan
- Amazon AWS
- Datadog
- Tableau Cloud
- Monte Carlo Data
- Fivetran
Job description
About the role
Our client is one of the largest game studios, known for its highly successful MOBA and FPS franchises. You will join the Data Foundations team, which collects and uses data to improve the player experience. Data Foundations is an initiative accountable for decision science, data products, data capture, data warehousing, and governance. Its mission is to harness the power of data for player-centric decisions and AI/ML products that make it better to be a player.

You'll work closely with a technical lead to expand data consumption and reporting capabilities and build solutions that improve the development lifecycle for data engineers and insight analysts throughout the organization. You'll help enable advancements and fast iteration in Machine Learning and GenAI pipelines.

As a Principal Data Engineer, you will design, build, and maintain scalable data pipelines and models that power critical data products, including centralized player data models and publishing activity data systems. These systems support understanding of player onboarding funnels, engagement patterns, support efficiency, and content effectiveness. You will collaborate with product managers, analysts, and engineers to ensure our data infrastructure is reliable, performant, and directly tied to delivering better player experiences.

Responsibilities
- Please note: availability to attend afternoon/evening meetings is a requirement for this role, as most of the team is located on the US West Coast (LA and Seattle)
- Design, build, and optimize scalable ETL pipelines for structured and semi-structured data supporting Insights use cases, growth metrics, and other publishing/marketing-focused data sets
- Design and implement data models using industry best practices that capture a complete ecosystem view of internal customer experiences, while ensuring accuracy, scalability, and long-term usability
- Architect and implement robust, maintainable, and high-performance data solutions
- Automate workflows to reduce manual intervention and enhance data processing efficiency, including automation for content, growth, and pubsports areas of coverage
- Optimize query performance and resolve pipeline bottlenecks to improve data accessibility
- Evaluate and adopt new tools, frameworks, and methodologies to advance data engineering capabilities
- Support cost optimization by ensuring scalable and efficient data solutions
- Ensure data quality, governance, and compliance with regulatory standards (e.g., GDPR, CCPA)
- Contribute to engineering discipline by shaping infrastructure, craft standards, tooling, and organizational best practices

Required qualifications
- Minimum of 8 years of commercial work experience in data engineering or a related field
- Bachelor's or higher degree in Computer Science, Software Engineering, or a related field
- Proficiency in Python, essential for data processing and analysis tasks
- Hands-on experience with DBT (Data Build Tool) and Airflow
- Commercial experience using PySpark and Databricks
- Proficiency in AWS cloud infrastructure
- Effective communication and teamwork skills

Nice to have
- Experience with ML Operations and GenAI pipelines and infrastructure
- Experience in the gaming industry, particularly with online multiplayer games
- Experience working with cross-discipline organizations that build data products
- Proficiency in large-scale data manipulation across various data types
- Demonstrated ability to troubleshoot and optimize complex ETL pipelines
- Proven experience mentoring and guiding other engineers

Tech stack
- Databricks
- Unity Catalog
- Apache Spark + PySpark + SQL
- DBT (Data Build Tool)
- Apache Airflow + Astronomer
- Atlan
- Tableau Cloud
- Monte Carlo Data
- Fivetran
- Datadog
- Git/GitHub
- AWS
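As context for the ETL responsibilities above, here is a minimal sketch of the kind of semi-structured-to-flat transform such pipelines perform, written in plain Python for brevity (in this stack the equivalent logic would typically run as PySpark on Databricks, orchestrated by Airflow). All field names and the sample event are hypothetical, purely for illustration:

```python
import json

# Hypothetical raw player event, as it might arrive semi-structured
# from a game client; field names here are illustrative only.
raw_event = json.dumps({
    "player_id": "p-123",
    "event": "match_end",
    "ts": "2024-01-15T20:30:00Z",
    "payload": {"mode": "ranked", "duration_s": 1840, "won": True},
})

def flatten_event(line: str) -> dict:
    """Flatten one JSON event into a flat row suitable for a warehouse table."""
    event = json.loads(line)
    payload = event.get("payload", {})
    return {
        "player_id": event["player_id"],
        "event": event["event"],
        "ts": event["ts"],
        # Prefix nested payload keys so column names stay unambiguous
        # after flattening.
        **{f"payload_{key}": value for key, value in payload.items()},
    }

row = flatten_event(raw_event)
print(row["payload_mode"])  # → ranked
```

In a PySpark job the same flattening would usually be expressed with nested-column selection over a DataFrame rather than per-record Python, but the shape of the transform is the same.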