Distillation Lead

Waabi

Reposted 10 Days Ago

Remote or Hybrid

Hiring Remotely in Canada

Senior level

The Distillation Lead will define and implement strategies for model distillation and compression, collaborating with various teams to deliver efficient models for real-time systems and simulations.

The summary above was generated by AI

Waabi, founded by AI visionary Raquel Urtasun, is the leader in Physical AI. With a world-class team, we're unlocking the next era of autonomous transportation with technology that's powering commercial autonomous trucks and robotaxis. Waabi is backed by and partners with world leaders in AI, automotive, logistics, and deep tech.

With offices in Toronto, San Francisco, Dallas, and Pittsburgh, Waabi is growing quickly and looking for diverse, innovative, and collaborative candidates who want to impact the world in a positive way. To learn more, visit www.waabi.ai

Waabi’s Physical AI platform is powered by state-of-the-art ML models that must be deployed efficiently across diverse use cases, from onboard vehicle inference to large-scale simulation. As the Distillation Lead, you will own the strategy and execution for distillation across Waabi's AI stack, ensuring our most capable models run efficiently in every deployment context. You will partner closely with ML Platform, Infrastructure, Onboard Autonomy, and Simulation teams to deliver compressed models that meet the performance requirements of both real-time onboard systems and high-throughput simulation pipelines.

You will…

- Define and drive the technical strategy for model distillation and compression across Waabi's AI stack (perception, world models, and planning), with an eye toward both onboard deployment and simulation use cases.

- Design, implement, and scale state-of-the-art distillation and efficiency pipelines, which may include:

  - Distillation for generative models (diffusion, autoregressive, flow-matching, video models)
  - Quantization-aware training (QAT) and post-training quantization (PTQ)
  - Knowledge distillation (feature-level, response-based, and relation-based)
  - Structured and unstructured pruning and sparsification
  - Low-rank factorization and efficient architecture design
  - Speculative decoding and other inference-time efficiency techniques

- Collaborate closely with ML Platform, Infrastructure, Onboard Autonomy, and Simulation teams to integrate compressed models into production pipelines and meet latency, memory, and throughput targets across deployment contexts.

- Define rigorous benchmarks and evaluation frameworks to characterize efficiency vs. quality trade-offs across models and hardware targets.

- Mentor and guide researchers and engineers working in the distillation and model efficiency space, setting a high technical bar and fostering a culture of rigorous experimentation.

- Champion best practices for model compression across the organization; disseminate knowledge through internal design reviews, documentation, and technical talks.

- Stay at the cutting edge of model efficiency research; contribute to the broader scientific community through publications and open-source contributions.

Qualifications:

- Deep distillation expertise: You have extensive hands-on experience designing and implementing distillation, quantization, pruning, and model compression techniques for large-scale neural networks, with demonstrated impact in production settings.

- Strong research and engineering foundation: A Bachelor's or Master's degree in Machine Learning, Computer Vision, Robotics, or a related field, or equivalent industry experience; relevant hands-on experience in model distillation and efficiency is what matters most. Expert Python and PyTorch (or JAX) skills with experience in large-scale distributed training.

- Technical leadership: You have a proven track record of setting technical direction and driving projects from conception to production. You inspire and elevate those around you through deep technical expertise and mentorship.

- Cross-functional collaboration: You have experience working closely with infrastructure, platform, and autonomy teams to deploy compressed models under real engineering constraints.

- Clear communicator: You can communicate complex technical trade-offs clearly to diverse audiences and drive alignment across research and engineering teams.

Bonus:

- Experience with hardware-aware optimization (TensorRT, ONNX, custom CUDA kernels, hardware-specific quantization).

- Publications at top-tier ML/CV venues (NeurIPS, ICML, CVPR, ICLR, ECCV) in model compression, efficient deep learning, or related areas.

- Experience distilling large generative models (diffusion models, LLMs, VLMs, or video models).

- Background in autonomous vehicles or robotics.

The US yearly salary range for this role is: $195,000 - $286,000 USD in addition to competitive perks & benefits. Waabi (US) Inc.’s yearly salary ranges are determined based on several factors in accordance with the Company’s compensation practices. The salary base range is reflective of the minimum and maximum target for new hire salaries for the position across all US locations. Note: The Company provides additional compensation for employees in this role, including equity incentive awards and an annual performance bonus.

Perks/Benefits:

- Competitive compensation and equity awards.

- Health and Wellness benefits encompassing Medical, Dental and Vision coverage (for full-time employees only).

- Unlimited Vacation.

- Flexible hours and Work from Home support.

- Daily drinks, snacks and catered meals (when in office).

- Regularly scheduled team-building activities and social events, held on-site, off-site, and virtually.

- As we grow, this list continues to evolve!

Waabi is a technology start-up building technologies to transform the way the world moves. Join our talented team to be a part of the future and to make an impact!

Waabi is an equal opportunity employer. We celebrate diversity and are committed to creating a supportive, inclusive, and accessible workplace for all our employees. We seek applicants of all backgrounds and identities, across race, color, ethnicity, national origin or ancestry, age, citizenship, religion, sex, sexual orientation, gender identity or expression, military or veteran status, marital status, pregnancy or parental status, caregiver status, disability, or any other characteristic protected by law. We make workplace accommodations for qualified individuals with disabilities as required by applicable law. If you need a reasonable accommodation to participate in the job application or interview process, please let our recruiting team know.
