SandboxAQ is a high-growth company delivering AI solutions that address some of the world's greatest challenges. The company’s Large Quantitative Models (LQMs) power advances in life sciences, financial services, navigation, cybersecurity, and other sectors.
We are a global team that is tech-focused and includes experts in AI, chemistry, cybersecurity, physics, mathematics, medicine, engineering, and other specialties. The company emerged from Alphabet Inc. as an independent, growth capital-backed company in 2022, funded by leading investors and supported by a braintrust of industry leaders.
At SandboxAQ, we’ve cultivated an environment that encourages creativity, collaboration, and impact. By investing deeply in our people, we’re building a thriving, global workforce poised to tackle the world's epic challenges. Join us to advance your career in pursuit of an inspiring mission, in a community of like-minded people who value entrepreneurialism, ownership, and transformative impact.
SandboxAQ’s AI Simulation team is advancing the frontiers of drug and materials discovery by integrating physics-based simulations with cutting-edge AI. We are looking for an experienced and innovative Machine Learning Engineer to develop AI systems that are capable of reasoning across complex biological systems over multi-modal datasets—including genomics data, clinical information, and physics-based simulations.
In this role, you will work with a team to architect and train AI systems (e.g., foundation models) that enable a deeper understanding of biological mechanisms and accelerate scientific discovery. You will bring expertise in Large Language Models, NGS sequencing pipelines, and multi-modal data processing (especially multi-omics), and you will collaborate closely with a high-performing, interdisciplinary team of drug discovery scientists, computational chemists, physicists, AI researchers, bioinformaticians, and software engineers.
Key responsibilities
- Develop robust, scalable ML software for predictive and generative modeling tasks related to genomics data (e.g., Interactome, Cell & Tissue modeling)
- Design and implement ML algorithms to enhance NGS sequencing pipelines
- Apply reasoning techniques—including LLMs, Graph Neural Networks, Gen AI models—for extracting insights to advance drug discovery from simulation, omics data, and literature
- Identify, ingest, and curate relevant data sources. Own data quality control, validation, and integration workflows
- Research and prototype novel bioinformatics and deep learning approaches to interpret human genetic variants, gene regulation mechanisms and disease pathways using diverse multimodal data (e.g. multi-omics, single-cell data, proteomics, genomics, biomedical imaging)
- Communicate complex ideas effectively across audiences, including internal collaborators, external stakeholders, and clients—tailoring technical depth as needed
- Contribute to the scientific community through patent filings, peer-reviewed publications, white papers, and conference presentations
- Ph.D. in Computer Science, Computational Biology, High-Performance Computing, or a related field
- 3–5 years of hands-on experience, preferably in the private sector, working on one or more of the following:
  - Large Language Models and GenAI techniques
  - NGS sequencing pipelines
  - Graph neural networks
- Experience in processing and curating multi-modal data—including large-scale omics, clinical datasets, and scientific literature
- Proficiency in running analyses and training machine learning or deep learning models in high-performance computing (HPC) environments, particularly those using GPUs
- Strong collaboration mindset, with the ability to identify problems and communicate technical concepts clearly to both technical and non-technical stakeholders
- Demonstrated ability to dive deep into technically complex problems and a track record of driving initiatives through to completion
- Familiarity with advanced AI concepts, including:
  - Generative AI (LLMs, Biological Foundation Models, Diffusion & Optimal Transport techniques)
  - ML-based advancements in NGS sequencing pipelines
  - Biomedical Imaging
- A good grasp of molecular biology concepts, particularly the central dogma (DNA, RNA, protein), and of related high-throughput technologies such as RNA-seq, epigenomics, single-cell omics, and spatial omics
- Working knowledge of graph databases and graph data structures
- Strong publication record in peer-reviewed venues (e.g., NeurIPS, ICLR, ICML, CVPR, ECCV, ICCV)
- Willingness to travel up to 25% for conferences, customer engagements, team offsites, or internal meetings
The US base salary range for this full-time position is expected to be $167k - $234k per year. Our salary ranges are determined by role and level. Within the range, individual pay is determined by factors including job-related skills, experience, and relevant education or training. This role may be eligible for annual discretionary bonuses and equity.