Austin, Texas

Job Description

The Oracle Cloud Infrastructure (OCI) team can provide you the opportunity to build and operate a suite of massive scale, integrated cloud services in a broadly distributed, multi-tenant cloud environment. OCI is committed to providing the best in cloud products that meet the needs of our customers, who are tackling some of the world's biggest challenges.

We offer unique opportunities for smart, hands-on engineers with the expertise and passion for solving difficult problems in distributed, highly available services and virtualized infrastructure. Our engineers have a significant technical and business impact at every level, designing and building innovative new systems to power our customers' business-critical applications.

Senior Site Reliability Engineer

Oracle Cloud Infrastructure (OCI) - OCI National Security Region Networking

Reston, VA, Seattle, WA, and Austin, TX

Are you interested in building large-scale distributed networking solutions for the cloud? Do you love the idea of working in an environment with the excitement of a start-up but the financial backing of a Fortune 500 company? You'll be joining a fast-growing venture that offers a lot of autonomy and a lot of variety. This role offers huge upside potential, high visibility, and fast career growth without the risk of a typical start-up. This is a unique opportunity to work with smart people solving complex problems in distributed systems, networking, multi-tenant Infrastructure-as-a-Service (IaaS), and Software-Defined Networking (SDN) operating at a massive scale.

Our customers always want higher availability, more bandwidth, greater network security, less network latency, and lower overall cost. We are reimagining the traditional planning, provisioning, and life cycle by creating SDN services that allow customers to easily migrate their business to OCI or connect their on-premises, data center, and/or other networks via enterprise-grade links to Oracle's cloud. At their core, our SDN services provide customers with rapid configuration, pay-as-you-go pricing, and seamless scalability.

The OCI National Security Region Networking team is looking for a Senior Site Reliability Engineer. As a Site Reliability Engineer, you will solve interesting technical challenges by defining, designing, deploying, and troubleshooting key Network Automation services focusing on scalability, security, and performance. The role involves software engineering, systems engineering, automation, network operations, and DevOps. You should be comfortable with building complex distributed systems. You will incorporate the ethos of software engineering and apply it to large-scale operational problems. Your primary goals are to create highly reliable services, platforms, and infrastructure, always thinking about reliability, security, and ultra-scalable software systems to manage operations. When not working on operations, you will be working on software engineering tasks such as designing and developing systems that increase reliability and scalability and reduce operational overhead through automation. You should value simplicity and scale, work comfortably in a collaborative, agile environment, and be excited to learn.

A great software engineer will make all the difference in delivering quality solutions to our customers. Are you passionate about designing, developing, testing, and delivering cloud services? Do you thrive in a fast-paced environment and want to be an integral part of a truly great team?

Join us!

We are looking for a Senior Site Reliability Engineer to be part of a team of engineers who will support a wide range of network automation and control plane services critical to managing and scaling our network infrastructure.

As a Senior Site Reliability Engineer, you will be responsible for:
  • Developing automation services to increase network automation deployment velocity
  • Performing deep-dive analytics into system uptime, service metrics, performance, and deployment automation
  • Developing meaningful service metrics and dashboards
  • Managing the reliability and manageability of network automation and control plane services
  • Developing service debugging tools and deployment automation solutions, and building and managing test environments for services

Mandatory Qualifications:

  • US Government TS/SCI with Polygraph
  • U.S. citizenship (Federal Government customer)
  • Bachelor's or Master's degree in CS or a related engineering field
  • 5+ years of experience in software development/operations
  • 2+ years of experience in developing/operating large scale distributed services
  • Experience with scripting and compiled languages (Java, Python, Bash) and with RESTful APIs
  • Experience managing a Linux environment, Docker, and distributed systems
  • Knowledge of Linux internals, TCP/IP, DNS, load balancing technologies, and socket programming
  • Knowledge of cloud compute technologies, network monitoring, data processing, and analytics
  • Aptitude to be a good team player and the willingness to learn and implement new Cloud technologies as needed
  • Excellent organizational, verbal, and written communication skills
  • Experience with participating in an on-call rotation and driving live site incidents to resolution
  • Experience with SQL or NoSQL technologies

The position is located in Reston, VA; Seattle, WA; or Austin, TX.

Oracle is an Equal Employment Opportunity Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, sexual orientation, gender identity, disability, and protected veteran status, or any other characteristic protected by law.

Solve complex problems related to infrastructure cloud services and build automation to prevent problem recurrence. Design, write, and deploy software to improve the availability, scalability, and efficiency of Oracle products and services. Develop designs, architectures, standards, and methods for large-scale distributed systems. Facilitate service capacity planning and demand forecasting, software performance analysis, and system tuning.

Work with the Site Reliability Engineering (SRE) team on the shared full stack ownership of a collection of services and/or technology areas. Understand the end-to-end configuration, technical dependencies, and overall behavioral characteristics of production services. Responsible for the design and delivery of the mission-critical stack, with a focus on security, resiliency, scale, and performance. Authority for end-to-end performance and operability. Partner with development teams in defining and implementing improvements in service architecture. Articulate technical characteristics of services and technology areas and guide development teams to engineer and add premier capabilities to the Oracle Cloud service portfolio. Understand and communicate the scale, capacity, security, performance attributes, and requirements of the service and technology stack. Demonstrate a clear understanding of automation and orchestration principles. Act as the ultimate escalation point for complex or critical issues that have not yet been documented as Standard Operating Procedures (SOPs). Utilize a deep understanding of service topologies and their dependencies to troubleshoot issues and define mitigations. Understand and explain the effect of product architecture decisions on distributed systems. Professional curiosity and a desire to develop a deep understanding of services and technologies.

A BS or MS in Computer Science, or equivalent. Knowledge of server hardware and software configuration, networking, standard internet services, scripting languages, cloud computing patterns, technology security, and compliance. Understanding of load balancing technologies and experience with development in programming languages, databases and big data stores, and container technologies. Work involves defining and documenting the technical architecture of complex and highly scalable products. A minimum of 5 years of experience running large-scale, customer-facing web services.

About Us

Diversity and Inclusion:

An Oracle career can span industries, roles, countries, and cultures, giving you the opportunity to flourish in new roles and innovate while balancing work and life. Oracle has thrived through 40+ years of change by innovating and operating with integrity while delivering for the top companies in almost every industry.

In order to nurture the talent that makes this happen, we are committed to an inclusive culture that celebrates and values diverse insights and perspectives, a workforce that inspires thought leadership and innovation.

Oracle offers a highly competitive suite of Employee Benefits designed on the principles of parity, consistency, and affordability. The overall package includes certain core elements such as Medical, Life Insurance, access to Retirement Planning, and much more. We also encourage our employees to engage in the culture of giving back to the communities where we live and do business.

At Oracle, we believe that innovation starts with diversity and inclusion, and that to create the future we need talent from various backgrounds, perspectives, and abilities. We ensure that individuals with disabilities are provided reasonable accommodation to successfully participate in the job application and interview process, and in potential roles to perform crucial job functions.

That's why we're committed to creating a workforce where all individuals can do their best work. It's when everyone's voice is heard and valued that we're inspired to go beyond what's been done before.


Oracle is an Equal Employment Opportunity Employer*. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, sexual orientation, gender identity, disability and protected veterans' status, or any other characteristic protected by law. Oracle will consider for employment qualified applicants with arrest and conviction records pursuant to applicable law.

* Which includes being a United States Affirmative Action Employer

For over three decades, Oracle has been the center of innovation for business software—birthplace of the first commercially available relational database, the first suite of internet-based applications, and the next-generation enterprise-computing platform, Oracle Fusion. Today, Oracle provides the world's most complete, open, and integrated business software and hardware systems, with more than 370,000 customers—including 100 of the Fortune 100—representing a variety of sizes and industries in more than 145 countries around the globe. And Oracle's 104,500 global employees—including 30,000 developers working full-time on Oracle products—are critical to that success.

Oracle recruiters are always searching for brilliant employees with an entrepreneurial spirit who are looking for a work culture where innovation is the goal, hard work is expected, and creativity is rewarded. Oracle employees enjoy competitive salaries, excellent health benefits, and a network of like-minded co-workers who drive innovation across the entire technology industry.