Senior ML Engineer
SQUARE ONE RESOURCES sp. z o.o.
⚲ Warszawa, Mokotów
PLN 22,000–26,000 net (+ VAT) / month
Requirements
- Python
- Pandas
- scikit-learn
- TensorFlow
- PyTorch
- Pydantic
- SQL
- AWS SageMaker
- AWS Lambda
- AWS CloudFormation
- AWS CDK
- AWS Step Functions
- GenAI
- AWS Bedrock
- Azure OpenAI API
- AWS Textract
- PyMuPDF
- OpenCV
- Pillow
- AWS Redshift
- AWS S3
- AWS EMR
- AWS Glue
Job description
Our requirements:
Advanced knowledge of Python (native, Pandas, scikit-learn, TensorFlow or PyTorch, PyStats, Pydantic)
Experience with AWS tools for ML Engineering and ML deployment (SageMaker, Lambda, CloudFormation/CDK, Step Functions)
Advanced knowledge of SQL and Data Modeling
Experience with GenAI for document intelligence, including prompt engineering, RAG (Retrieval-Augmented Generation), multimodal models (vision + text), and production deployment using AWS Bedrock or Azure OpenAI APIs
Experience in experiment design (power analysis and hypothesis testing)
Proficiency in both written and verbal communication, required for a remote and largely asynchronous work environment
Demonstrated capacity to clearly and concisely communicate complex technical problems and propose iterative solutions
Experience owning a feature from concept to production, including proposal, discussion, and execution
Nice to have:
Experience with document processing tools (AWS Textract, Azure Document Intelligence, or similar OCR/layout detection systems)
Experience with PDF and Image processing libraries (e.g. PyMuPDF, OpenCV, Pillow)
Experience in Machine Learning/Data Science (e.g., ML algorithm selection, feature engineering, model training, hyperparameter tuning, supervised and unsupervised learning implementation, building model pipelines, using Machine Learning tools/libraries/frameworks)
Experience working with AWS big data technologies (Redshift, S3, EMR, Glue, etc.)
About the project:
The role involves designing, implementing, and optimizing document intelligence pipelines and ML models for document classification, segmentation, and field extraction on AWS platforms.
Responsibilities:
Design and implement end-to-end document intelligence pipelines on AWS
Develop and optimize ML models for document classification, segmentation, and field extraction
Build scalable data processing systems handling PDFs of up to 2,000 pages
Collaborate with subject matter experts to create and refine requirements for extraction
Own features from research through production deployment and monitoring
Establish evaluation frameworks and quality metrics for extraction accuracy
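To make the last responsibility more concrete, here is a minimal sketch of what a field-level quality metric for extraction could look like. It is an illustration only, not the employer's actual evaluation framework: the names `FieldScores`, `normalize`, and `score_fields`, the exact-match rule, and the sample invoice fields are all assumptions introduced for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldScores:
    precision: float
    recall: float
    f1: float


def normalize(value: str) -> str:
    """Lowercase and collapse whitespace so trivial formatting differences are ignored."""
    return " ".join(value.lower().split())


def score_fields(predicted: dict[str, str], expected: dict[str, str]) -> FieldScores:
    """Score one document by exact match on normalized field values (assumed rule)."""
    correct = sum(
        1
        for name, value in predicted.items()
        if name in expected and normalize(value) == normalize(expected[name])
    )
    # Precision: share of predicted fields that are right.
    precision = correct / len(predicted) if predicted else 0.0
    # Recall: share of expected fields that were found and right.
    recall = correct / len(expected) if expected else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return FieldScores(precision, recall, f1)


# Hypothetical ground truth and model output for a single invoice.
expected = {"invoice_number": "INV-001", "total": "1 250.00", "currency": "PLN"}
predicted = {"invoice_number": "inv-001", "total": "1250.00"}

scores = score_fields(predicted, expected)
print(scores)
```

In this sample only `invoice_number` matches: `total` differs by an internal space and `currency` is missing. A real framework would likely need type-aware normalization for amounts and dates, and aggregation across many documents.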
🔍 Job Ad Decoder
🔴
Experience owning a feature from concept to production, including proposal, discussion, and execution
You are expected to drive a project independently from idea to deployment, which may mean significant responsibility and potentially a lack of support.
🔴
Proficiency in both written and verbal communication, required for a remote and largely asynchronous work environment
Remote, asynchronous work means you will have to manage your own time and communicate mainly in writing, which may be a challenge for people who prefer direct contact.
🔴
Demonstrated capacity to clearly and concisely communicate complex technical problems and propose iterative solutions
You are expected not only to solve technical problems but also to communicate them effectively and propose solutions, which may require presentation and negotiation skills.
🟡
Advanced knowledge of Python (native, Pandas, scikit-learn, TensorFlow or PyTorch, PyStats, Pydantic)
The requirement for advanced knowledge of Python and its libraries may mean that you will work on complex projects requiring a deep understanding of these tools.
🟡
Experience with AWS tools for ML Engineering and ML deployment (SageMaker, Lambda, CloudFormation/CDK, Step Functions)
Familiarity with AWS tools for ML deployment suggests the project will make heavy use of this cloud, which may require continuous learning and adaptation to the AWS ecosystem.