Everbridge
Staff Platform Site Reliability Specialist (Observability & Kubernetes)
The Staff Platform Site Reliability Specialist will manage and enhance the observability platform, ensuring system reliability and scalability while collaborating with engineering teams and managing EKS clusters.
Everbridge is seeking a Staff Platform Site Reliability Specialist to own, operate, and evolve our enterprise observability platform. In this role, you will be responsible for the upkeep, reliability, scalability, and strategic growth of Everbridge’s observability stack, EKS, and supporting services, ensuring our engineering teams have deep visibility into system health, performance, and reliability across a large-scale, cloud-native environment. You will also work with other cloud technologies across AWS and GCP.
We’re looking for someone who shows up for the team, not just themselves. This role works best for a person who communicates clearly, collaborates easily, and treats interactions with other teams with respect and professionalism. You should be comfortable being involved, offering support, and helping move work forward without ego. We value people who build trust, keep things running smoothly, and make the teams around them better.
What You'll Do:
- Observability Platform Ownership
- Lead the design, operation, and evolution of Everbridge’s observability stack
- Build and maintain a highly available, scalable observability platform
- Standardize instrumentation, dashboards, alerts, and SLOs
- Support incident response, root cause analysis, and capacity planning
- Operate and scale Grafana and related technologies:
  - Grafana Loki (logs)
  - Grafana Mimir (metrics)
  - Grafana Tempo (tracing)
  - Grafana Alerting
- Maintain the reliability and security of the EKS clusters that run the observability platform
- Manage cluster lifecycle and upgrades
- Infrastructure as Code & Automation
- Terraform for infrastructure provisioning
- HashiCorp Packer
- GitLab CI/CD at scale
Grafana Stack & Telemetry
Kubernetes
What You'll Bring:
- 6+ years in SRE / Platform Engineering
- Strong Grafana ecosystem experience
- Kubernetes and Amazon EKS expertise
- Terraform proficiency
The reasonably estimated salary for this role at Everbridge ranges from $135,000 to $165,000 CAD and may also include variable compensation. Actual compensation is based on factors such as the candidate's skills, qualifications, and experience. In addition, Everbridge offers a wide range of best-in-class, comprehensive, and inclusive employee benefits for this role, including healthcare, dental care, mental health benefits, disability income benefits, life and AD&D insurance, a retirement savings plan with employer match, and paid time off.
Fair Chance Statement US & Canada
We are committed to providing equal employment opportunities in compliance with all applicable Federal, Provincial/State and Local laws, including the California Fair Chance Act and any local County Fair Chance Ordinance (or local equivalent). Pursuant to these and other relevant regulations, we consider qualified applicants with criminal histories in a manner consistent with the law.
For roles subject to background checks, the following material job duties may be affected by an applicant’s criminal history:
- Access to sensitive or confidential information, such as financial records, proprietary data, or client information.
- Management of cash, company funds, or other valuable assets.
- Work in environments requiring heightened security measures.
- Compliance with contractual or regulatory requirements specific to the position.
We evaluate each applicant's criminal history individually, considering its nature, timing, and relevance to the specific job duties, while maintaining our commitment to fair hiring practices and promoting workplace equity.
About Everbridge
Everbridge empowers enterprises and government organizations to anticipate, mitigate, respond to, and recover stronger from critical events. In today’s unpredictable world, resilient organizations minimize impact to people and operations, absorb stress, and return to productivity faster when deploying critical event management (CEM) technology. Everbridge digitizes organizational resilience by combining intelligent automation with the industry’s most comprehensive risk data to Keep People Safe and Organizations Running™. For more information, visit www.everbridge.com, read the company blog, and follow on Twitter. Everbridge… Empowering Resilience
Everbridge is an Equal Opportunity/Affirmative Action Employer. All qualified applicants will receive consideration for employment without regard to race, creed, color, religion, sex (including sexual orientation and gender identity), national origin, disability, protected veteran status, or any other characteristic protected by applicable federal, state, or local law.
Top Skills
AWS
EKS
GCP
GitLab CI/CD
Grafana
HashiCorp Packer
Kubernetes
Terraform

