Zealogics LLC
Site Reliability Engineer – Azure & Microsoft 365 Automation (Remote Opportunity)
Key Responsibilities:
- Lead investigation and resolution of critical, recurring, or high-impact incidents across Azure and Microsoft 365 automation workflows.
- Deep-dive into PowerShell, Bicep, and YAML scripts to identify logic errors, misconfigurations, or scalability limitations within automated provisioning workflows.
- Debug and optimize .NET (C#) components within Azure Functions or related application layers used in workflow orchestration.
- Analyze usage patterns and telemetry data from Azure Monitor, Application Insights, and Log Analytics to identify systemic issues or opportunities for automation enhancement.
- Implement fixes and design improvements to automation logic that reduce manual intervention and improve workflow reliability (e.g., auto-remediation scripts, retry logic).
- Own and evolve the automation framework for Teams and SharePoint Online (SPO) lifecycle operations, including create/delete, external sharing restrictions, and role/ownership changes.
- Collaborate with product owners and architects to introduce new automation use cases or extend existing workflows.
- Conduct post-incident reviews (PIRs) for high-severity incidents, drive root cause analysis (RCA), and implement corrective actions.
- Mentor L1 and L2 engineers, conduct knowledge-sharing sessions, and support onboarding of new team members.
- Stay updated with changes in Azure, Microsoft 365 APIs, and automation tooling (PowerShell modules, Bicep schema updates, etc.).
- Provide guidance on architecture and best practices for automation reliability.
Required Skills & Experience:
- 12+ years of experience in cloud platform engineering, DevOps, or site reliability engineering (SRE) roles with a focus on automation and operational excellence.
- Proficiency in PowerShell scripting, including writing reusable modules, automation logic, and error handling for production workloads.
- Extensive experience with Infrastructure as Code using Bicep, including authoring, debugging, and deploying templates for complex Azure resources.
- Strong understanding of CI/CD processes and YAML pipelines, with hands-on experience in automating build/release workflows in Azure DevOps.
- Proficient in .NET (C#), especially for debugging Azure Functions or working on backend components integrated into M365 automation flows.
- In-depth knowledge of Microsoft 365 platform, including API usage, Teams & SharePoint Online provisioning, governance, and permissions management.
- Proven ability to troubleshoot and optimize Azure-native services such as API Management, Azure Functions, Storage, Service Bus, Key Vault, and Container Apps.
- Skilled in telemetry and observability, leveraging Azure Monitor, Log Analytics, Kusto queries, and custom logging to proactively identify issues.
- Experience conducting root cause analysis, post-incident reviews, and implementing system-wide improvements to reduce incident frequency and MTTR.
- Experience in mentoring support engineers, contributing to runbook creation, and improving team capability over time.
- Strong analytical, documentation, collaboration, and stakeholder communication skills.