About the Role
What you’ll do
• Apply knowledge of data characteristics and data supply patterns to develop rules and tracking processes that support the data quality model.
• Prepare data for analytical use by building data pipelines to gather data from multiple sources and systems.
• Integrate, consolidate, cleanse and structure data for use by our clients in our solutions.
• Design, create, and interpret large and highly complex datasets.
• Stay up-to-date with the latest trends and advancements in GCP and related technologies, actively proposing and evaluating new solutions.
• Understand best practices for data management, maintenance, reporting, and security, and apply that knowledge to improve our solutions.
• Implement security best practices in pipelines and infrastructure.
• Develop and implement data quality checks and troubleshoot data anomalies.
• Provide guidance and mentorship to junior data engineers.
• Review dataset implementations performed by junior data engineers.
What experience you need
• BS degree in a STEM major or equivalent discipline; master's degree strongly preferred
• 5+ years of experience as a data engineer or related role
• Cloud certification strongly preferred
• Advanced skills in programming languages such as Python or SQL, and intermediate experience with scripting languages
• Intermediate-level understanding of and experience with Google Cloud Platform (GCP) and overall cloud computing concepts, as well as basic knowledge of other cloud environments
• Experience building and maintaining moderately complex data pipelines, troubleshooting issues, and transforming and loading data so it can be consumed and reused by future projects
• Experience designing and implementing moderately complex data models and optimizing them to improve performance
• Advanced Git usage and CI/CD integration skills
What could set you apart
• Master’s degree in a related field is a strong plus
• Exposure to Vertex AI, GCP AI/ML services (AutoML, BigQuery ML, Cloud Run, etc.), or similar cloud technologies
• Strong foundational skills in the Linux operating system
• Understanding of NLP, deep learning, and generative architectures (Transformers, Diffusion Models, etc.)
• Background in credit risk, financial data analytics, or risk modeling
• Experience working with large datasets on a big data platform (e.g., Google Cloud, AWS, Snowflake, Hadoop)
• Experience in Business Intelligence, data visualization, and customer insights generation
• Familiarity with data governance, model bias mitigation, and regulatory frameworks (GDPR, AI Act, SEC compliance)
• Experience with MLOps practices, model monitoring, and CI/CD for AI workflows
• Knowledge of prompt tuning, fine-tuning, and parameter-efficient methods (LoRA, PEFT)
• Hands-on experience with RAG, multi-modal AI, and hybrid AI architectures
• Contributions to the AI community through publications, open-source projects, or conference presentations