Portland, Oregon

The Senior System Administrator at Xandr is responsible for the effective provisioning, installation, configuration, operation, and maintenance of systems hardware, software, and related infrastructure. This individual participates in technical research and development to enable innovation within the infrastructure, and ensures that system hardware, operating systems, software systems, and related procedures adhere to organizational values.

 

About the team: 

The System Operations team at Xandr operates more than 8,000 Linux servers distributed across 6 data centers globally. The systems run distributed applications such as Kubernetes, NGINX, and Artifactory, as well as more traditional services such as DHCP, DNS, and Kickstart. Responsibilities on these systems include system administration engineering and provisioning, operations and support, maintenance, and research and development to ensure the platform meets or exceeds business needs. System Administrators on the team assist project teams with technical issues in the initiation and planning phases. These activities include defining needs, benefits, and technical strategy; research and development within the project life-cycle; technical analysis and design; and supporting operations staff in executing, testing, and rolling out solutions. Participation in projects focuses on smoothing the transition of projects from development staff to production staff by performing operations activities within the project life-cycle.

 

About the job: 

System Administration Engineering and Provisioning 

  • Engineer and provide proof-of-concept technical solutions for various project and operational needs. 
  • Manage servers and configure hardware, services, settings, storage, etc. in accordance with standards and project or operational requirements. 
  • Research and recommend innovative and, where possible, automated approaches for system administration tasks. Identify approaches that leverage our resources and provide economies of scale.
  • Identify areas of operation where automation can increase efficiency and decrease human error, and implement solutions that do so.
  • Evaluate new versions of software and technologies, and recommend and implement the changes and tasks needed to leverage them for operations or project needs.
  • Identify potential security risks and propose practical mitigation measures 
  • Assess multiple, often conflicting, constraints and make rapid decisions in a dynamic environment
  • Create, verify, and review patches to the software that runs the infrastructure, in the form of pull requests
  • Provide technical leadership in planning, development, and execution of software efforts 

Operations and Support 

  • Ensure the integrity and availability of all hardware and key services by utilizing monitoring tools, log aggregation tools, and customer reports 
  • Ensure business data integrity by supporting our storage systems and performing any maintenance tasks necessary to prevent data loss (hardware repairs, fire drills, integrity checks) 
  • Review security reports on a regular cadence to identify any possible violations
  • Provide support in response to requests from various constituencies. Investigate and troubleshoot any issues reported.
  • Repair and recover from hardware or software failures. Coordinate and communicate with impacted constituencies.
  • Provide on-call support (escalations from Level 1 Support Team) 

Maintenance 

  • Maintain operations runbooks, configuration, or other procedures. 
  • Perform periodic performance reporting to support capacity planning. 
  • Perform ongoing performance tuning, hardware upgrades, and resource optimization as required. This requires using various performance tuning tools to identify bottlenecks internal and external to the system. 
  • Provide support for datacenter maintenance and operations as needed. 

About your skills:  

  • 5+ years of Linux experience supporting Debian-based distributions such as Ubuntu
  • 5+ years of writing scalable tools using scripting languages such as Perl, Python, and shell
  • 5+ years of experience with configuration management tools such as Puppet, Ansible, and Terraform
  • 5+ years of managing storage systems running ZFS or CephFS 
  • 3+ years of deploying and administering repository managers, especially with JFrog Artifactory 
  • 3+ years of using monitoring tools such as Nagios and Sensu 
  • 1+ years of deploying and administering systems using container technologies, especially Kubernetes and Docker, as well as Helm, Spinnaker, Prometheus, Calico, Flannel, Fluentd, and InfluxDB
  • 2+ years of building and managing Debian software packages from source, including creation of Makefiles. 
  • Familiarity with Git and other source control tools is required
  • Familiarity with using AWS or Azure is preferred but not required 
  • Familiarity with configuring NGINX and Kerberos is preferred but not required  
  • Familiarity with log management tools such as Splunk or SumoLogic is preferred but not required

 

More about you: 

  • You are passionate about a culture of learning and teaching. You love challenging yourself to constantly improve, and sharing your knowledge to empower others 
  • You like to take risks when looking for novel solutions to complex problems. If faced with roadblocks, you continue to reach higher to make greatness happen 
  • You care about solving big, systemic problems. You look beyond the surface to understand root causes so that you can build long-term solutions for the whole ecosystem 
  • You believe in not only serving customers, but also empowering them by providing knowledge and tools 
  • You believe in solving problems, not fixing them 

AT&T is bringing it all together for our customers, from revolutionary smartphones to next-generation TV services and sophisticated solutions for multi-national businesses.

For more than a century, we have consistently provided innovative, reliable, high-quality products and services and excellent customer care. Today, our mission is to connect people with their world, everywhere they live and work, and do it better than anyone else. We're fulfilling this vision by creating new solutions for consumers and businesses and by driving innovation in the communications and entertainment industry.

We're recognized as one of the leading worldwide providers of IP-based communications services to businesses. We also have the nation's most reliable 4G LTE network and the largest international coverage of any U.S. wireless carrier, offering the most phones that work in the most countries. AT&T operates the nation's largest Wi-Fi network, including more than 32,000 AT&T Wi-Fi Hot Spots at popular restaurants, hotels, bookstores and retailers, and provides access to nearly 1 million hotspots globally through roaming agreements.

AT&T U-verse is TV inspired by you. It's TV the way you want it, with tons of cool features and capabilities. AT&T is the only national TV service provider to offer a 100-percent IP-based television service. It's part of our "three-screen" integration strategy to deliver services across the three screens people rely on most - the mobile device, the PC and the TV.

As we continue to break new ground and deliver new solutions, we're focused on delivering the high-quality customer service that is our heritage.