ClickUp

Staff Site Reliability Engineer

Reposted 17 Days Ago
Hybrid
Canada
Mid level
As a Staff Site Reliability Engineer at ClickUp, you will enhance the stability and reliability of cloud infrastructure, lead system design, respond to incidents, and mentor team members.
The summary above was generated by AI

At ClickUp, we’re not just building software. We’re architecting the future of work! In a world overwhelmed by work sprawl, we saw a better way. That’s why we created the first truly converged AI workspace, unifying tasks, docs, chat, calendar, and enterprise search, all supercharged by context-driven AI, empowering millions of teams to break free from silos, reclaim their time, and unlock new levels of productivity. At ClickUp, you’ll have the opportunity to learn, use, and pioneer AI in ways that shape not only our product, but the future of work itself. Join us and be part of a bold, innovative team that’s redefining what’s possible! 🚀

ClickUp is the world’s only all-in-one productivity platform that flexes to the way people want to work. It replaces all individual workplace productivity tools with a single, unified platform that includes project management, document collaboration, whiteboards, spreadsheets, and AI. With our headquarters based in San Diego and a rapidly expanding global presence, we are shaping the future of work. Join our team at ClickUp, one of the fastest-growing SaaS companies worldwide, and help millions of users be more productive - saving them at least one day every week. 🦄
 
We are looking for driven and innovative software engineers with a strong site reliability engineering (SRE) discipline, or an interest in this area, to help us make ClickUp the "one app to rule them all". As an SRE at ClickUp, your primary role will be improving the stability, availability, and reliability of the globally distributed, cloud-based infrastructure that powers our app for thousands of users daily. If you are a rockstar engineer with an entrepreneurial, fast-paced mindset who is ready to own, drive, and tackle some of the most complex problems out there, we would love to hear from you!
 
What you'll do:
  • Lead designing and building systems for maximum performance, reliability, and scalability.
  • Serve as a lead in partnership with engineering teams on product design, decisions, and troubleshooting.
  • Improve overall stability and observability, including the metrics that track uptime.
  • Champion our monitoring infrastructure.
  • Implement and improve our general site reliability posture (error and downtime budgets, MTTD and MTTR improvements, better alerting and notifications, minimizing customer impact from incidents, etc.).
  • Respond to and troubleshoot downtime events while actively developing safeguards to prevent them.
  • Participate in brainstorming sessions with the engineering team and contribute ideas to our technology and algorithms.
  • Mentor members of the team to improve overall excellence.
 
What we’re looking for:
  • 4-6+ years of experience with the Amazon Web Services ecosystem (EC2, ECS, VPC, Redis, RDS, ALB, ECR).
  • Experience working with Kubernetes.
  • Experience managing production-critical infrastructure, with a DevOps mindset.
  • Familiarity with SRE best practices and procedures.
  • Experience with IaC (CDK, Terraform) and CI/CD (GitHub Actions, ArgoCD).
  • Familiarity with containerization (Docker).
  • Knowledge of network, firewall, and security best practices.
  • Experience with self-healing automation and monitoring tools (Datadog, CloudWatch).
  • Knowledge of relational databases, preferably PostgreSQL (not mandatory).
  • A strong, operationally focused self-starter and problem-solver.
  • Excellent interpersonal, written, and oral communication skills.
  • Experience with application security testing is a plus (not mandatory).
  • Familiarity or experience with Node.js is a plus (not mandatory).
  • Experience managing Linux-based EC2 instances.
 


Unsure if you meet all the qualifications of this job description but are deeply excited about the role? We hire based on ambition, grit, and a passion for improving the way people work. If you think ClickUp is the company for you, we encourage you to apply!

At ClickUp, we assess every candidate based on the potential impact they can have. We hire the best people for the job and support each person’s journey to build their boldest career.
 
ClickUp is an Equal Opportunity Employer, and qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, or national origin.

ClickUp collects and processes personal data in accordance with applicable data protection laws.

  • If you are a European Job Applicant, see our privacy policy for further details.
  • If you are a Philippine Job Applicant, see our privacy policy and our Philippine Data Privacy Notice for further details.

Please note we are unable to sponsor or take over sponsorship of an employment visa for roles outside of engineering and product at this time. Sponsorship for engineering and product roles is not guaranteed, but is instead based on the business needs for that specific role at that time. Please reach out to the recruiter with any questions.

ClickUp Talent Acquisition will only initiate contact via an @clickup.com email or through our official careers portal on clickup.com. We will never request fees, payments, or sensitive personal information. Please disregard any offers received outside these channels and report them to [email protected].

Top Skills

ALB
Amazon Web Services
ArgoCD
CDK
CloudWatch
Datadog
Docker
EC2
ECR
ECS
GitHub Actions
Kubernetes
Linux
Postgres
RDS
Redis
Terraform
VPC
