Solace

Senior Cloud Site Reliability Engineer

Reposted 20 Hours Ago
In-Office
Ottawa, ON
Senior level
The Senior Cloud Site Reliability Engineer will ensure the health of Solace Cloud services, manage production incidents, optimize operations, and implement infrastructure tooling across multiple cloud platforms.
The summary above was generated by AI

Solace helps companies connect and integrate all of their assets through the power of event-driven architecture. Our technology makes it easy to unlock data silos and capture events occurring across large enterprises; stream information about those events everywhere it needs to be in real time; and give the apps, AI agents and people who receive it the power to react immediately with decisive actions and smart decisions.

  

Many of the world’s biggest companies trust Solace to modernize their IT infrastructure by embracing trends like AI, cloud and IoT so they can create awesome experiences for their customers, partners and employees. 

  

So, the next time you drive a car, order furniture online, fly in a plane, or check your bank balance on your phone, your positive experience could be a direct result of our technology and your hard work.
 

Overview 

This position is for a Senior Cloud Site Reliability Engineer. You will be responsible for the daily operations of Solace Cloud, our market-leading SaaS offering, across leading cloud providers and platforms such as Amazon Web Services, Microsoft Azure, Google Cloud Platform, and Kubernetes.

What You Will Do: 

  • Ensure that Solace Cloud services are healthy and reliable, and that SLAs are being met
  • Design and implement our infrastructure tooling, observability, and automation 
  • Contribute to making production operations more efficient and less error-prone
  • Apply expert-level knowledge to handle production incidents in production-grade multi-cloud environments according to industry-standard incident management processes
  • Process service and provisioning requests from customers
  • Manage customer escalations and drive resolution in mission-critical, high-impact production environments
  • Work directly with customers to identify, troubleshoot, and resolve operational issues. 
  • Apply expert debugging knowledge in Linux and Kubernetes to detect operational issues
  • Join the on-call rotation and provide 24x7 off-hours support

 

Ideally, You Will Be: 

  • Highly technical, excited by technology, and eager to stay up to date in a rapidly evolving environment. 
  • An expert in cloud networking solutions
  • Able to debug at a system level and resolve incidents in complex cloud-based environments
  • An expert in site reliability engineering and incident response
  • A strong communicator who can articulate complex technical issues clearly and concisely, and who is comfortable getting on the phone with customers
  • Experienced in SaaS operations and customer-facing technical support 

 

Required Skills: 

  • Proven expertise with public cloud provider (AWS, Azure, GCP) services and features
  • Proven expertise with managed Kubernetes platforms such as Amazon Elastic Kubernetes Service, Azure Kubernetes Service, and Google Kubernetes Engine
  • Hands-on experience with monitoring tools such as Datadog, Kibana, and Prometheus
  • Hands-on experience with infrastructure automation using Terraform and AWS CloudFormation
  • Hands-on expertise in debugging production alerts  
  • Expert-level understanding of Linux Operating Systems 
  • Programming experience in languages such as Groovy, Python, and Go
  • Certified Kubernetes Administrator 
  • Certified Cloud Administrator (AWS, Azure, or GCP) 

 

Why You’ll Want to Join Us at Solace 

  • We have an awesome team! You’ll get to work with some of the smartest individuals in the business. 
  • We believe in work-life balance, and that it’s important to love what you do. 
  • We have adopted a hybrid work model to create an inclusive environment for everyone. 
  • We live by our values every day: craftsmanship, trust, courage, freedom, momentum, humility, and human experience.  
  • Our training programs are top-notch. 
  • We like to brag about our stellar customer lineup! 
  • We are social – we like to keep things simple and fun! 
  • We are one of the top-ranked employers on Glassdoor. 
  • We have a sense of humour and make cool videos on cool topics like MITT and this! 

  

We understand that experience takes on various shapes and sizes. Not sure you meet all the requirements? We still want to hear from you! Your unique experience could be exactly what we are looking for. 

  

At Solace, we believe that diversity and inclusion drive innovation and growth, both in business and in life. We strive to create an enriching and safe workplace where you can be who you are. If you want to do the best work of your career and feel supported every step of the way, we encourage you to join us! 

  

Accommodations are available upon request for anyone taking part in the hiring process. Let us know how we can help! We thank all candidates for their interest; however, only those selected to continue in the selection process will be contacted.

 

Top Skills

AWS
Azure
CloudFormation
Datadog
GCP
Go
Groovy
Kibana
Kubernetes
Prometheus
Python
Terraform


