Pracuj.pl On-site Senior

Data Engineer – Data Lake (f/m/x)

Sii Sp. z o.o.

⚲ Białystok, Centrum, Bydgoszcz, Gdańsk, Oliwa, Katowice, Kraków, Podgórze, Lublin, Łódź, Śródmieście, Piła, Poznań, Wilda, Rzeszów, Szczecin, Toruń, Warszawa, Mokotów, Wrocław, Fabryczna

Requirements

  • Apache Spark
  • Python
  • ETL/ELT
  • AWS
  • Docker
  • Kubernetes
  • Kafka

Job description

Our requirements:

  • Strong experience in Data Engineering or Big Data-related roles
  • Proficiency in Python, Scala, or Java
  • Hands-on experience with tools such as Apache Spark, PySpark, or similar frameworks
  • Previous work with Data Lake technologies (e.g., AWS S3, Azure Data Lake, Databricks, BigQuery)
  • Knowledge of ETL/ELT processes and orchestration tools (e.g., Airflow, Data Factory)
  • Good understanding of SQL and data modeling
  • Experience with distributed systems and large-scale data processing
  • Familiarity with Docker and Kubernetes
  • Strong analytical and problem-solving skills
  • Fluent Polish (required)
  • Residence in Poland (required)

Nice to have:

  • Experience with streaming technologies (e.g., Kafka)
  • Knowledge of data governance tools
  • Familiarity with CI/CD processes in data projects

About the project:

You will join an international project in the healthcare and life sciences industry, focused on building and evolving a modern Data Lake platform that supports large-scale data processing and analytics. The solution enables data-driven decision-making in a highly regulated environment, with a strong emphasis on data quality, security, and compliance. The environment is cloud-based and relies on modern big data technologies and sound engineering practices.

As a Data Engineer, you will be responsible for designing, developing, and maintaining data pipelines and the Data Lake architecture. You will work closely with cross-functional teams, including data scientists and business stakeholders, to deliver reliable and efficient data solutions.

Responsibilities:

  • Designing and developing scalable data pipelines for batch and real-time data processing
  • Building and optimizing Data Lake architecture for analytical use cases
  • Integrating multiple data sources and ensuring seamless data flow across systems
  • Ensuring data quality, consistency, and governance (data lineage, access control)
  • Optimizing storage and processing performance using modern data formats and partitioning strategies
  • Monitoring, troubleshooting, and improving data pipeline performance
  • Collaborating with stakeholders to translate business needs into technical solutions
  • Following best practices in data engineering and continuously improving the platform

We offer:

  • Great Place to Work since 2015: we owe this title to feedback from our workers, and we keep implementing new ideas
  • Employment stability: PLN 2.1 billion in revenue, no debt, on the market since 2006
  • Profit sharing with workers: over PLN 76M has been allocated for this purpose since 2022
  • Attractive benefits package: private healthcare, benefits cafeteria platform, car discounts, and more
  • Comfortable workplace: class A offices or remote work
  • Dozens of fascinating projects for prestigious brands from all over the world, which you can switch between using the Job Changer application
  • PLN 1,000,000 per year for your ideas: with this amount, we support the passions and voluntary actions of our workers
  • Investment in your growth: meetups, webinars, a training platform, and a technology blog, whichever you choose
  • Fantastic atmosphere created by all Sii Power People