Pracuj.pl | Remote work | Senior

Data Engineer

PRETIUS SOFTWARE SP. Z O.O.

⚲ Warszawa, Włochy

140–170 PLN net (+ VAT) / hour

Requirements

  • AWS
  • Azure
  • SQL
  • Power Query
  • Python
  • PostgreSQL
  • Airflow
  • CI/CD
  • ETL/ELT
  • BigQuery
  • Apache Spark
  • Kafka
  • Power BI

Job description

Our requirements:

  • 8+ years of experience in data engineering, analytics engineering, or similar data-focused roles
  • Expert-level proficiency in Python for data processing, pipeline development, and automation
  • Advanced SQL skills, including query optimization and complex analytical transformations
  • Strong experience with relational and analytical databases (e.g., PostgreSQL, Snowflake, BigQuery, Redshift, Synapse)
  • Hands-on experience designing and implementing data warehouse architectures (ETL/ELT, batch, near-real-time)
  • Proven experience with big data processing frameworks such as Apache Spark (PySpark, Spark SQL)
  • Strong cloud experience across AWS, Azure, and/or GCP, including core data services
  • Experience building and operating scalable data pipelines using orchestration tools (Airflow, ADF, Prefect, Dagster)
  • Understanding of distributed systems principles and large-scale data processing challenges
  • Strong knowledge of data quality, governance, security, and compliance best practices
  • Experience with DevOps practices, including CI/CD, Git, and Infrastructure as Code (Terraform or equivalent)
  • Ability to design scalable, production-grade data solutions in complex enterprise environments

Nice to have:

  • Familiarity with streaming technologies (Kafka, Kinesis, Pub/Sub)
  • Experience with dbt and BI tools (Power BI, Tableau, Looker)

About the project:

At Pretius, we are looking for a Data Engineer to join a project building a global-scale platform in the field of gaming and lotteries.
Responsibilities:

  • Design, build, and maintain scalable, production-grade data pipelines using Python (ETL/ELT) and orchestration tools
  • Write and optimize advanced SQL queries for efficient data extraction, transformation, and performance tuning
  • Design and implement scalable data models (star/snowflake schema) for analytics and reporting
  • Build and maintain end-to-end data warehouse solutions, including batch and near-real-time ingestion, data marts, and semantic layers
  • Work with Apache Spark (PySpark, Spark SQL) for large-scale data processing and analytics
  • Develop and operate cloud data solutions across AWS, Azure, and/or GCP (e.g., S3, Glue, EMR, Redshift, ADLS, Data Factory, Synapse, BigQuery)
  • Design scalable, secure, and cost-efficient data architectures with FinOps awareness
  • Build and maintain reliable data pipelines using orchestration tools (Airflow, ADF, Prefect, Dagster) with proper scheduling, retries, and monitoring
  • Ensure data reliability through validation, monitoring, idempotent design, and failure recovery mechanisms
  • Develop streaming and real-time data pipelines using Kafka, Kinesis, Pub/Sub, or Event Hubs where required
  • Implement data quality, governance, and security standards (PII protection, encryption, RBAC, data lineage)
  • Apply DevOps practices including Git, CI/CD, Infrastructure as Code, and production monitoring
  • Integrate external APIs and SaaS data sources into data platforms

We offer:

  • Long-term relationships based on fair principles and reliability
  • Co-financing of the Multisport card and Medicover private healthcare
  • A modern office available to the team
  • Team-bonding activities, internal courses, conferences, and certifications
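To make the "retries, idempotent design, and failure recovery" responsibility concrete, here is a minimal plain-Python sketch of the pattern (the `with_retries` helper, `load_batch` step, and batch-tracking set are illustrative assumptions, not part of the posting; in practice an orchestrator like Airflow supplies the retry logic and a warehouse table tracks loaded batches):

```python
import time

def with_retries(fn, attempts=3, backoff_s=0.1):
    """Run fn, retrying on failure with linear backoff, like an
    orchestrator's task-level retry policy (hypothetical helper)."""
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except Exception:
            if attempt == attempts:
                raise  # out of retries: surface the failure for recovery
            time.sleep(backoff_s * attempt)

# Stands in for warehouse-side state (e.g., a "loaded_batches" table).
loaded_batches = set()

def load_batch(batch_id, rows, target):
    """Idempotent load step: re-running the same batch_id after a
    retry or crash never duplicates rows in the target."""
    if batch_id in loaded_batches:
        return 0  # already loaded; a repeated run is a safe no-op
    target.extend(rows)
    loaded_batches.add(batch_id)
    return len(rows)
```

Because `load_batch` checks its batch marker first, wrapping it as `with_retries(lambda: load_batch("2024-01-01", rows, warehouse))` is safe even if a retry re-executes a step that had partially succeeded, which is the point of pairing retries with idempotent design.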