CrowdStrike

Sr. Problem Management Engineer – Engineering Service Management (Remote)

Posted Yesterday
Remote or Hybrid
Hiring Remotely in CA
Senior level
The Senior Problem Management Engineer will lead process transformation, implement automation, and focus on improving incident management and service stability using AI/ML.
The summary above was generated by AI

As a global leader in cybersecurity, CrowdStrike protects the people, processes and technologies that drive modern organizations. Since 2011, our mission hasn’t changed: we’re here to stop breaches, and we’ve redefined modern security with the world’s most advanced AI-native platform. We work on large-scale distributed systems, processing almost 3 trillion events per day. We have 3.44 PB of RAM deployed across our fleet of C* servers, and this traffic is growing daily. Our customers span all industries, and they count on CrowdStrike to keep their businesses running, their communities safe and their lives moving forward. We’re also a mission-driven company. We cultivate a culture that gives every CrowdStriker both the flexibility and autonomy to own their careers. We’re always looking to add talented CrowdStrikers to the team who have limitless passion, a relentless focus on innovation and a fanatical commitment to our customers, our community and each other. Ready to join a mission that matters? The future of cybersecurity starts with you.

About the Role:

We are seeking a Senior Engineering Problem Manager to lead the transformation of our Problem Management Engineering function. This strategic role will focus on embedding resilient, automated, and intelligent problem management practices into our engineering, operations, and platform ecosystems. You will be responsible for building technical integrations, leveraging AI/ML for advanced root cause analysis, and driving a culture of continuous learning and operational excellence.

You’ll lead end-to-end delivery of initiatives that reduce incident recurrence, improve service stability, and create measurable business value — with a strong focus on automation, governance, and DevOps alignment.

What You'll Do:

  • Design and implement modern problem management workflows, tightly integrated into engineering and operations toolchains.

  • Lead the governance of key problem management deliverables including post-incident action tracking, known error records, and systemic remediation.

  • Drive continuous evolution of a structured retrospective process that promotes learning and resilience engineering.

  • Partner with platform, SRE, and observability teams to automate known error workarounds, temporary fixes, and proactive health checks.

  • Utilize AIOps and ML-driven tooling to correlate events, detect patterns, and identify root causes more effectively.

  • Work closely with business units and product teams to perform business impact analysis and prioritize problem resolution based on value and risk.

  • Integrate post-incident review outcomes into continuous improvement loops, product backlogs, and technical roadmaps.

  • Maintain and evolve the tooling ecosystem supporting problem management, including dashboards, knowledge repositories, and workflows.

  • Act as a coach and change agent to promote a culture of accountability, proactive risk reduction, and shared ownership of reliability.

Key Focus Areas:

  • Retrospective Process Management: Facilitate structured reviews and systemic RCA that drive long-term improvements.

  • Automation of Known Errors & Workarounds: Reduce manual overhead through scripts, workflows, and proactive detection.

  • AI-Augmented Root Cause Analysis: Integrate ML models and historical telemetry to improve diagnostic speed and accuracy.

  • Post-Incident Governance: Ensure action items are documented, assigned, and driven to closure with cross-functional visibility.

  • Business Impact Analysis: Collaborate with stakeholders to prioritize recurring problems based on cost, customer experience, and risk.

  • Toolchain Integration: Seamlessly embed problem management into DevOps tools (e.g., Jira, ServiceNow, PagerDuty, GitHub).

What You'll Need:

  • 8+ years of experience in Engineering Operations, DevOps, Service Management, or Platform/SRE Engineering.

  • Strong understanding of ITSM, particularly Problem, Incident, and Change Management.

  • Experience managing or building post-incident processes, RCAs, and follow-through governance models.

  • Proven ability to automate operational workflows and known error processes using scripting or platform tooling.

  • Proficiency with observability platforms and AIOps tools (e.g., Datadog, Splunk, New Relic, Moogsoft, or similar).

  • Exceptional collaboration and communication skills across technical and non-technical stakeholders.

  • Data-driven mindset with the ability to perform root cause trend analysis and report on service health metrics.

  • Experience working in DevOps, cloud-native, or agile environments.

Preferred Qualifications:

  • Experience with structured problem-solving methodologies (e.g., 5 Whys, Fishbone, Fault Tree).

  • Familiarity with knowledge management systems, runbooks, and self-healing infrastructure practices.

  • Background in software engineering, platform reliability, or infrastructure automation.

  • Certifications in ITIL, SRE, Agile, or SAFe frameworks.

This role will require the candidate to periodically undergo and pass additional background and fingerprint check(s) consistent with government customer requirements.

Benefits of Working at CrowdStrike:

  • Remote-friendly and flexible work culture

  • Market leader in compensation and equity awards

  • Comprehensive physical and mental wellness programs

  • Competitive vacation and holidays to recharge

  • Paid parental and adoption leave

  • Professional development opportunities for all employees regardless of level or role

  • Employee Resource Groups, geographic neighbourhood groups and volunteer opportunities to build connections

  • Vibrant office culture with world-class amenities

  • Great Place to Work Certified™ across the globe


CrowdStrike is proud to be an equal opportunity employer. We are committed to fostering a culture of belonging where everyone is valued for who they are and empowered to succeed. We support veterans and individuals with disabilities through our affirmative action program.

CrowdStrike is committed to providing equal employment opportunity for all employees and applicants for employment. The Company does not discriminate in employment opportunities or practices on the basis of race, color, creed, ethnicity, religion, sex (including pregnancy or pregnancy-related medical conditions), sexual orientation, gender identity, marital or family status, veteran status, age, national origin, ancestry, physical disability (including HIV and AIDS), mental disability, medical condition, genetic information, membership or activity in a local human rights commission, status with regard to public assistance, or any other characteristic protected by law. We base all employment decisions--including recruitment, selection, training, compensation, benefits, discipline, promotions, transfers, lay-offs, return from lay-off, terminations and social/recreational programs--on valid job requirements.

If you need assistance accessing or reviewing the information on this website or need help submitting an application for employment or requesting an accommodation, please contact us at [email protected] for further assistance.

Find out more about your rights as an applicant.

CrowdStrike participates in the E-Verify program.

Notice of E-Verify Participation

Right to Work

CrowdStrike, Inc. is committed to equal pay for equal work in its compensation practices. The base salary range for this position in the U.S. is $155,000 - $255,000 per year + variable/incentive compensation + equity + benefits. A candidate's salary is determined by various factors including, but not limited to, relevant work experience, skills, certifications, job level, supervisory status, and location.

Top Skills

AI
Datadog
DevOps
Git
ITSM
Jira
ML
Moogsoft
New Relic
PagerDuty
ServiceNow
Splunk
