Lead Site Reliability Engineer – CREQ170378

Description

  • 5+ years of experience in SRE
  • Strong AWS, Terraform and GitHub skills
  • Collaborate with Cloud Services and Application teams to deliver projects
  • Deploy infrastructure as code (IaC) releases to QA, staging, and production environments
  • Provide build assistance to developers for applications where needed
  • Build the automation for any account customizations required by the application (e.g., custom roles, policies, security groups)
  • Service the L2 ServiceNow queue for the assigned line of business (LOB)
  • Resource optimization: maximize utilization of deployed resources and reduce spend using cost management tooling
  • Environment cleanup (e.g., removing old EC2 instances, EBS volume cleanup, S3 cleanup)
  • Provide L2 operational support, including off-hours incident response. The SRE is the development team's first point of contact for production support and is responsible for escalating to Cloud Support when an issue is determined to be IaaS platform related
  • Automate manual processes
  • Act as point of contact for incident and problem management; responsible for root cause write-ups
  • Analyze incoming incidents and alerts, and present opportunities to improve their prioritization
  • Follow the enterprise change management process to deploy fully tested and documented solutions/applications to a production environment; assist in runbook review
  • Ability to multi-task and manage tasks with varying priorities
  • Self-motivated, innovative, and able to work across diverse technical and non-technical teams
  • Ability to write and implement infrastructure as code and platform automation
  • Experience implementing infrastructure as code with Terraform
  • Strong public cloud provider experience (AWS SysOps or DevOps certification a plus)
  • Strong systems, network, and admin knowledge
  • Working knowledge of DevOps and delivery tools (GitHub)
  • Practical experience with modern programming and scripting languages (Python, .NET, C#, Java)

Primary Location

Colombo, Western Province, Sri Lanka

Job Type

Experienced

Skill

MS-WindowsAdminCloud.

Travel

No

Job Posting

18/09/2023


Closing Date: 07/10/2023

Post category: Virtusa