Coursera

Site Reliability Engineer, II

Reposted 16 Days Ago
Canada
Junior
The Site Reliability Engineer will enhance Coursera's infrastructure by maintaining observability systems, automating deployments, collaborating with teams, and supporting incident response. The role focuses on reliability and cost optimization across services.
The summary above was generated by AI

Coursera was launched in 2012 by Andrew Ng and Daphne Koller with a mission to provide universal access to world-class learning. It is now one of the largest online learning platforms in the world, with 183 million registered learners as of June 30, 2025. Coursera partners with over 350 leading university and industry partners to offer a broad catalog of content and credentials, including courses, Specializations, Professional Certificates, and degrees. Coursera’s platform innovations enable instructors to deliver scalable, personalized, and verified learning experiences to their learners. Institutions worldwide rely on Coursera to upskill and reskill their employees, citizens, and students in high-demand fields such as GenAI, data science, technology, and business. Coursera is a Delaware public benefit corporation and a B Corp.

We’re a global platform aiming to transform lives through learning by offering transformative courses, certificates, and degrees that empower learners worldwide to advance their careers through skill mastery. We’re looking for inventors, innovators, and lifelong learners eager to shape the future of education. If you’re ready to build the global programs and tools that fuel the power of online learning, join Team Coursera.

At Coursera, we are committed to building a globally diverse team and are thrilled to extend employment opportunities to individuals in any country where we have a legal entity. We require candidates to possess eligible working rights and have a compatible timezone overlap with their team to facilitate seamless collaboration. 

Coursera is committed to enabling flexibility and workspace choices for employees. Our interviews and onboarding are entirely virtual, providing a smooth and efficient experience for our candidates. As an employee, you can select your main way of working, whether it's from home, one of our offices or hubs, or a co-working space near you.

Job Overview:

Our SRE team is part of the Coursera Infrastructure group that builds the foundation that keeps Coursera reliable, scalable, and efficient. We partner with product and platform teams to deliver resilient systems through automation, observability, and operational excellence. From incident response to infrastructure as code, we enable fast, safe, and cost-aware delivery of global learning experiences.

We are hiring an IC3 Site Reliability Engineer (SRE) based in Canada to join our SRE team. This role will support reliability, observability, infrastructure automation, and cost optimization efforts across multiple services. The engineer will work closely with senior SREs to build scalable and efficient systems using our AWS-based tech stack, and gain hands-on experience with real-world SRE projects. Joining this team means working on high-impact projects that keep Coursera running smoothly for millions of learners and partners.

Responsibilities:

  • Contribute to building and maintaining observability systems (e.g., metrics, logs, dashboards)
  • Assist in automating infrastructure provisioning, system configuration, and reducing toil
  • Participate in on-call rotations and support incident response processes
  • Collaborate with senior engineers on improving the reliability and scalability of services
  • Implement cost monitoring tools and assist in cloud resource optimization
  • Support disaster recovery planning, compliance tasks, and documentation

Basic Qualifications:

  • 2+ years of experience in Site Reliability, DevOps, or Backend Engineering roles
  • Hands-on experience with at least one cloud platform (e.g., AWS, GCP, Azure)
  • Experience with monitoring and logging tools (e.g., Datadog, CloudWatch, Sumo Logic, Grafana)
  • Familiarity with Infrastructure as Code tools (e.g., Terraform, Ansible)
  • Experience writing automation scripts and backend systems in Java, Python, Bash or similar languages

Preferred Qualifications:

  • Exposure to incident management processes and tools (e.g., PagerDuty)
  • Familiarity with containerized infrastructure (e.g., Docker, Kubernetes)
  • Experience working on cost visibility or optimization in cloud environments
  • Knowledge of version control systems and CI/CD practices
  • Experience contributing to disaster recovery or multi-region infrastructure
  • Knowledge of security/compliance practices (e.g., audit logging, access controls)

If this opportunity interests you, you might like these courses on Coursera:

  • Site Reliability Engineering: Measuring and Managing Reliability – Learn SRE fundamentals including SLIs, SLOs, and error budgets
  • Introduction to Cloud Computing – Understand core cloud concepts, including AWS services and architecture
  • Getting Started with Terraform for Cloud Infrastructure Automation – Learn infrastructure-as-code using Terraform with hands-on AWS examples

Compensation:

Coursera offers competitive pay and equitable compensation practices. Our job titles may span more than one career level. The targeted hiring base salary range for this role is CAD $113,600 to $170,400 for all Canada candidates. The actual base pay depends on many factors, including but not limited to prior work experience, training/education, transferable skills, business needs, and geographical location. The base pay range is subject to change and may be modified in the future. This role may also be eligible for variable pay, equity, and benefits.

Coursera is an Equal Employment Opportunity Employer and considers all qualified applicants without regard to race, color, religion, sex, sexual orientation, gender identity, age, marital status, national origin, protected veteran status, disability, or any other legally protected class.
 
If you are an individual with a disability and require a reasonable accommodation to complete any part of the application process, please contact us at [email protected].
 
For California Candidates, please review our CCPA Applicant Notice here.
For our Global Candidates, please review our GDPR Recruitment Notice here.
 

Top Skills

Ansible
AWS
Azure
Bash
CloudWatch
Datadog
Docker
GCP
Grafana
Java
Kubernetes
Python
Sumo Logic
Terraform
