• In this role you'll lead a team of Site Reliability Engineers who are subject matter experts across infrastructure operations, capacity management, configuration management, and chaos engineering.
• Act as a servant leader, keeping the team on track to complete the epics assigned for the current PI, removing blockers, resolving issues, and pre-planning for the upcoming PI.
• Represent the team to leadership, up to and including officer-level briefings.
• Serve as the escalation point for T2-T3 support of on-prem and cloud infrastructure operations.
• Contribute to the design and implementation of on-prem and cloud solutions that are secure, scalable, resilient, monitored, auditable, and cost-optimized.
• Design and develop solutions for Azure migrations and transformation tools.
• Migrate existing platforms and applications to Azure.
• Automate cloud-based infrastructure deployments and maintenance.
• Build, manage and maintain tools for deployment, monitoring and operations.
• Evaluate and recommend tools, technologies, and processes to ensure the highest quality and performance is achieved. Focus on scalability, security and availability of all infrastructure and processes.
• Identify and address infrastructure deficiencies, availability gaps, and performance bottlenecks.
• Help determine the technical feasibility of solutions for business requirements.
• Collaborate with peer organizations, product delivery teams, and support organizations on technical issues and provide guidance.
• Lead root cause analysis sessions on incidents and, working with DevOps teams, provide corrective and preventive measures to avoid or mitigate future incidents.
• Exercise a high degree of responsibility for the processes, systems, and tools created and managed.

Shift timing (if any):
• Shifts typically fall between 6 AM and 10 PM India Standard Time. Longer hours may occasionally be required.

• Overall experience: 8+ years in DevOps engineering, with emphasis on building the overall ecosystem and on infrastructure operations supporting large-scale applications in both on-prem and cloud environments.
• Solid experience with Site Reliability Engineering.
• Solid experience working in a Linux Systems Administrator role.
• Solid experience with Azure core cloud technologies in a high traffic production setting.
• Solid experience in application migrations to cloud using native patterns.
• Solid experience with and understanding of cloud security, including preventive and retrospective controls.
• Extensive experience devising strategies and roadmaps for on-premises and cloud systems solutions that meet business objectives as well as PCI/non-PCI, Security/CSO governance, and OWASP Top 10 security objectives.
• Extensive involvement in all phases of the SDLC with a focus on Agile (SAFe)/DevSecOps methodologies, devising cloud-agnostic solutions for both on-premises and hybrid cloud environments, from inception through design to production rollout, leveraging Azure and AWS.
• To run or lead this team, must be very seasoned in assessing technologies, platform APIs, internal and external dependencies, CapEx/OpEx funding strategies, legal compliance, time-to-market and customer-driven solution evaluations, SME training, in-house resource evaluations, vendor lock-in avoidance, KPIs, and TCO/cost-saving aspects.
• Experience devising design and architecture for CI/CD, DevSecOps, auto scaling, and Site Reliability Engineering objectives through automation: Infrastructure as Code/Platform as Code using Ansible, Helm, or similar tools for configuration, and Terraform for provisioning infrastructure.
• Experience in MuleSoft or any gateway architecture solution, such as StrongLoop IBM micro-gateway, is also desired.
• Experience in streaming solutions such as Kafka, or a cloud equivalent such as Event Hub, is a big plus.
• Knowledge of CDN (Akamai), load balancers, firewalls, DNS, proxies, reverse proxies, VPN tunnels, etc., to meet the needs of Layer 3 and Layer 7 architectures, is a must-have for both on-premises and cloud-equivalent solutions.
• Experience in Cassandra or any NoSQL database is a big plus.
• Proven hands-on technical and managerial/leadership expertise leading teams of geographically dispersed employees and contractors in analyzing, defining, and proposing IT platform and infrastructure solutions for DOTCOM portfolios.
• Primarily hands-on, leading or assisting systems architects and leads in exploring the latest technologies and rolling out platform solutions, with special focus on highly scalable, self-healing, nimble, flexible infrastructure solutions for business portfolios.
• Experience in democratizing platform and systems dashboards using both real-time and synthetic monitoring solutions, with a primary focus on Site Reliability Engineering, is mandatory. Examples: Dynatrace, AppDynamics, or New Relic for APM; EFK/ELK/Splunk for log monitoring. Synthetic monitoring components such as Catchpoint would be a big plus.
• Extensively seasoned in managing, coaching/mentoring, budgeting, and vendor management.
• Experience with performance tuning in on-prem and cloud environments.
• Experience architecting, implementing, and managing monitoring solutions for production cloud environments.
• Build and manage on-prem Kubernetes (K8s) services, Nginx, application gateways, load balancers, Redis, web servers, app servers, cache engines, configuration management, CI/CD, Git, Jenkins, Docker, Nexus, Maven: 4 – Advanced
• Solid experience building and managing core Azure cloud technologies such as Azure DevOps, VMSS, VNet, Azure Load Balancer, Azure Application Gateway, Azure Private Link, Cosmos DB, Azure Monitor/Application Insights, AKS, Azure Cache, Event Hub, Azure Functions: 4 – Advanced
• Solid experience building cloud automation/orchestration solutions with technologies such as Terraform, CloudFormation, Ansible, Chef, Puppet: 4 – Advanced
• Experience designing and implementing highly available cloud/hybrid cloud network solutions: 4 – Advanced
• Experience with application performance management (APM), logging, tracing, and other monitoring tools like Dynatrace, Grafana, Prometheus, Nagios, ELK, Azure Insights: 3 – Advanced
• Knowledge of MuleSoft architecture, development, and administration: 2 – Novice
• Knowledge of and demonstrated experience in Agile methodologies and practices.
• Ability to adapt to a rapidly changing environment and technologies.
• Excellent written and verbal English communication skills to work in a global team.

Secondary / Desired skills:
• Experience in Agile, Lean Agile, and/or Scaled Agile methodologies: 2 – Novice (limited experience)
• Experience in the following technologies: Azure DevOps, VMSS, VNet, Azure Load Balancer, Azure Application Gateway, Azure Private Link, Cosmos DB, Azure Monitor/Application Insights, AKS, Azure Cache, Event Hub, Azure Functions, AWS EC2, ALB/ELB, RDS, S3, Lambda, API Gateway, CloudFront, SNS, SQS, DynamoDB, CloudWatch, ElastiCache, EKS, Ansible, Terraform, shell scripting, Kubernetes, Docker, Linux administration (RHEL/CentOS/Ubuntu), Kafka, RabbitMQ, Redis, Cassandra, MongoDB, NGINX, OpenStack, Git, Jenkins, Splunk, ELK, Dynatrace, New Relic, Grafana, Prometheus, MuleSoft

Additional information (if any): Must be willing to work in shifts. Willingness to learn is very important, as AT&T offers an excellent environment for learning digital transformation skills such as cloud, big data, AI, and full stack.

Education Qualification: Bachelor’s/Master’s degree in Computer Science or a related field



AT&T is bringing it all together for our customers, from revolutionary smartphones to next-generation TV services and sophisticated solutions for multi-national businesses.

For more than a century, we have consistently provided innovative, reliable, high-quality products and services and excellent customer care. Today, our mission is to connect people with their world, everywhere they live and work, and do it better than anyone else. We're fulfilling this vision by creating new solutions for consumers and businesses and by driving innovation in the communications and entertainment industry.

We're recognized as one of the leading worldwide providers of IP-based communications services to businesses. We also have the nation's most reliable 4G LTE network.* We also have the largest international coverage of any U.S. wireless carrier, offering the most phones that work in the most countries. AT&T operates the nation's largest Wi-Fi network** including more than 32,000 AT&T Wi-Fi Hot Spots at popular restaurants, hotels, bookstores and retailers, and provides access to nearly 1 million hotspots globally through roaming agreements.

AT&T U-verse is TV inspired by you. It's TV the way you want it, with tons of cool features and capabilities. AT&T is the only national TV service provider to offer a 100-percent IP-based television service. It's part of our "three-screen" integration strategy to deliver services across the three screens people rely on most - the mobile device, the PC and the TV.

As we continue to break new ground and deliver new solutions, we're focused on delivering the high-quality customer service that is our heritage.
