Site Reliability Engineer (SRE)
About the Role
We are looking for a Senior Site Reliability Engineer (SRE) to help us scale and secure our cloud-native infrastructure. This hybrid role is ideal for someone with strong experience in Azure, AKS, Terraform, and Python, who enjoys working in a collaborative, Agile environment.
Key Responsibilities
- Design and maintain scalable, secure infrastructure on Microsoft Azure, with a focus on AKS (Azure Kubernetes Service).
- Implement and manage Infrastructure as Code (IaC) using Terraform.
- Automate operational tasks and workflows using Python.
- Monitor system performance and reliability, and respond to incidents with a focus on root cause analysis and continuous improvement.
- Collaborate with development and DevOps teams to streamline CI/CD pipelines.
- Promote and apply SRE best practices, including SLAs, SLOs, and error budgets.
Required Qualifications
- Proven experience in a similar SRE, DevOps, or Cloud Engineering role.
- Strong hands-on experience with Azure services and AKS.
- Proficiency in Terraform and Python.
- Solid understanding of cloud architecture, networking, and security.
- Experience working in Agile teams and environments.
Nice to Have
- Familiarity with CI/CD tools like GitHub Actions or Azure DevOps.
- Experience with monitoring and observability tools (e.g., Prometheus, Grafana, Azure Monitor).
- Exposure to service mesh, zero-downtime deployments, or multi-cloud environments.
What We Offer
- Hybrid work model: 2 days/week onsite in Lisbon.
- A collaborative and innovative team culture.
- Opportunities for continuous learning and career growth.
- Competitive compensation and benefits.