Product Offering:
Spreedly provides an open payments platform whose connectivity to payment services is designed to improve payments performance. Key products and services include:
Connect — A unified API that integrates with hundreds of payment gateways, processors, and alternative payment methods worldwide, including digital wallets. Merchants access the global payments ecosystem through one connection.
Vault — A PCI-compliant secure repository for payment methods. Merchants store card data once and reuse it across any payment service, reducing PCI scope and protecting cardholder data at scale.
Optimize — Workflow-driven routing and retry logic that directs each transaction to the best-performing gateway in real time. On average, 7.9% of failed transactions succeed immediately when retried on a secondary gateway. This is where merchants recover lost revenue and increase authorization success rates.
Protect — A flexible fraud and authentication layer, incorporating advanced fraud tools and 3DS. Following Spreedly's acquisition of Dodgeball in September 2025, fraud orchestration and payment optimization now operate within the same platform.
Resolve — Centralized management and reporting that reduces operational silos, strengthens security, and improves billing control across a merchant's entire payment operation.
Responsibilities:
- Infrastructure Operations & Reliability:
  - Operate, scale, and modernize AWS-based infrastructure supporting highly available, uptime-driven production systems.
  - Design for fault tolerance, graceful degradation, and automated recovery across EC2- and ECS-based workloads.
  - Support the organization’s roadmap toward multi-region, globally distributed infrastructure.
- Infrastructure as Code & Automation:
  - Build, maintain, and improve infrastructure using Terraform, Ansible, and related tooling to ensure repeatability, auditability, and resilience.
  - Support and evolve CI/CD pipelines (GitHub Actions, AWS tooling) with a focus on reliability, speed, and developer autonomy.
  - Reduce operational brittleness by creating reusable, well-documented infrastructure patterns.
- Observability & Incident Response:
  - Implement and maintain observability using Datadog, CloudWatch, OpenTelemetry, and related tools.
  - Define and monitor SLOs, improve alert quality, and reduce MTTD/MTTR through actionable dashboards and runbooks.
  - Participate in and help mature a 24/7 on-call rotation; confidently troubleshoot and resolve incidents under pressure.
- Security & Compliance:
  - Serve as an infrastructure security subject-matter expert, helping bridge the Infrastructure Engineering and Security teams.
  - Implement and operate security controls such as IAM policies, WAFs, DDoS protections, secrets management, and deployment safeguards.
  - Support regulated environments and compliance efforts (PCI, SOC 2, or similar).
- Collaboration, Mentorship & Delivery:
  - Proactively communicate status, risks, and tradeoffs in a distributed, async-first environment.
  - Mentor engineers and contribute to shared learning across experience levels.
  - Own small-to-medium scoped projects end-to-end: breaking down work, driving execution, and following through to completion.
Requirements:
- 5+ years of experience working with cloud infrastructure or systems engineering in a production environment.
- Deep hands-on experience operating and scaling production systems in AWS (ECS, EC2, ALB/ELB, ASG, IAM, VPC, Secrets Manager).
- Strong infrastructure-as-code experience with Terraform and configuration management tools such as Ansible.
- Experience supporting highly available, uptime-sensitive systems with on-call responsibility.
- Observability expertise using tools such as Datadog, CloudWatch, and OpenTelemetry.
- Linux systems experience (Debian- or RHEL-based distributions).
- Exposure to or experience with multi-region cloud environments that support global availability.
- Experience with DevOps and/or GitOps practices and an understanding of CI/CD methods.
- Experience with containers and container orchestration (Nomad, Docker, etc.).
- Infrastructure security experience (e.g., WAFs, DDoS mitigation, access controls).
- Experience in regulated environments (PCI, SOC 2, HIPAA, or similar).
- Proven ability to run projects end-to-end and deliver repeatable, maintainable solutions.
Additional Skills We Value:
- Familiarity with edge and CDN services
We Offer Our Canada-Based Employees:
- Competitive salary + Equity
- Group Life Insurance and Disability Coverage
- Medical, Vision, and Dental coverage
- Pension contribution
- Open Paid Time Off policy
- Monthly home working/digital lifestyle stipend, new MacBook, and one-time accessory reimbursement
- $1,000 professional development stipend
- Access to company-paid professional coaching service
- Visits to HQ in Durham, North Carolina for remote employees