Data Engineer
emagine Polska
Stockholm Metropolitan Area
Requirements
- Automation
- Machine Learning (ML)
- User Experience (UX)
- JavaScript
- Java
- Python
- Testing
- Scala
- Elasticsearch
- CI/CD
Job description
About the role
As a Data Engineer, you will work closely with Fullstack Engineers, UX Designers, and Data Scientists to help transform a core business domain within the private equity industry. You will be part of a small team designing end-to-end solutions: ingesting raw data from multiple sources, running state-of-the-art ML prediction pipelines, and serving large datasets to web applications used for algorithmic deal sourcing by deal professionals.

We believe that a great team culture is built by great people who enjoy their work together. We regularly organize team activities such as game nights, hack days, and hackathons. You'll also enjoy a modern Stockholm office with great perks, including an on-site full-time barista.

What you'll do
- Work on the data-intensive parts of an internal product used for deal sourcing.
- Design data models and build, maintain, and optimize data pipelines that merge large data sources in real time.
- Develop data pipelines using a modern technology stack, including tools such as Airflow, Dataflow, Bigtable, BigQuery, Kubeflow/TFX, Kafka, Elasticsearch, FAISS, and Neo4j.
- Build business-critical applications and contribute to a shared codebase using Java, Scala, Python, and JavaScript.

Who are you?
- Able to think creatively and find innovative ways to bring complex systems into production.
- Passionate about building great products and working closely with stakeholders.
- A strong communicator and a collaborative team player.
- Humble, with a growth mindset.
- Comfortable in a fast-paced environment and motivated to get things done.
- Experienced in, and enthusiastic about, coding and building production-grade applications.
- Committed to automated testing, pair (or mob) programming, peer reviews, and strong software craftsmanship in a team setting.

Nice-to-have
- Experience working with large datasets using tools such as Kafka, Apache Beam, BigQuery, Kubeflow, Elasticsearch, or Neo4j.
- Background in machine learning or analytics.
- Understanding of CI/CD release automation, infrastructure as code, Kubernetes, and GCP.