Digital Health Data Engineer - Remote (LATAM)
Excellent opportunity to work REMOTELY with a U.S.-based company. Candidates living in Mexico or anywhere else in Latin America are welcome to apply.
About the Company
Bydrec, Inc. is a California-based company that connects top tech talent from Latin America with U.S. companies looking to expand their development teams. Learn more at bydrec.com.
Position Summary
We are seeking a Digital Health Data Engineer with expertise in extracting features from and analyzing multimodal time-series data from biosensors such as accelerometers, ECG, PPG, and EEG. This role will also focus on developing advanced data pipelines for digital health applications, including audio and video data processing.
The ideal candidate has strong experience with Python, cloud-native architectures, AWS Batch, containerization, and time-series databases. This person will lead data exploration initiatives, drive technical innovation, and collaborate across teams to advance healthcare solutions through the development of digital biomarkers.
Responsibilities
- Design, build, and maintain data pipelines to ensure seamless integration and high-performance processing of large-scale time-series datasets.
- Develop rapid quality-control (QC) metrics for dashboards that present complex datasets to stakeholders.
- Provide Python expertise, supporting team members with questions and troubleshooting while promoting best practices in code quality and development.
- Communicate insights and results through reports, presentations, and technical documentation.
Qualifications
Required:
- Bachelor’s degree with 5+ years of industry experience, or a Master’s degree with 3+ years of experience, in Computer Science, Data Science, Bioinformatics, or a related quantitative field.
- Advanced English proficiency (written and spoken).
- Strong proficiency in Python, with the ability to mentor and support the team in solving complex Python-related challenges. Proficiency in R is also required.
- Experience in data visualization for complex and large-scale datasets, particularly time-series data, using tools such as Power BI.
- Expertise in SQL, PySpark, and Ray clusters for data engineering and large-scale analysis.
- Experience with containerization tools such as Docker and deploying data workflows in modern environments.
- Hands-on experience with machine learning, particularly with large datasets.
- Familiarity with cloud technologies, including AWS and Snowflake.
- Experience working with multimodal time-series biosensor data, such as accelerometer, ECG, PPG, or EEG signals.
- Strong communication skills, with the ability to collaborate across teams and present complex technical concepts clearly.
Nice to Have:
- Familiarity with large language models (LLMs) and applying generative AI approaches such as Retrieval-Augmented Generation (RAG) in digital health environments.
- Knowledge of GPU computing, high-performance computing, and cloud-native architectures.
- Experience with relational and cloud databases, including PostgreSQL and Redshift.
- Hands-on experience managing and optimizing cloud infrastructure using AWS and Snowflake, including tools such as Kubernetes, AWS Bedrock, AWS Batch, Athena, AWS Glue, KIRO, and S3 integrations.
- Familiarity with cardiovascular, neuroscience, or epidemiology datasets.
- Experience with FDA submissions, validation processes, and working within GxP environments.