Cloud - Principal Site Reliability Engineer US

Elastic is a search company with a simple goal: to solve the world's data problems with products that delight and inspire. As the creators of the Elastic Stack, we help thousands of organizations including Cisco, eBay, Goldman Sachs, Microsoft, The Mayo Clinic, NASA, The New York Times, Wikipedia, Verizon, and many more use Elastic to power mission-critical systems. From stock quotes to Twitter streams, Apache logs to WordPress blogs, our products are extending what's possible with data, delivering on the promise that good things come from connecting the dots. We have a distributed team of Elasticians across 30+ countries (and counting), and our diverse open source community spans over 100 countries. Learn more at elastic.co

Thanks to our ongoing expansion, we have the opportunity to grow our Site Reliability team. We're part of the Elastic Cloud engineering team, focused on solving Cloud operations problems and keeping the SaaS online, and we aren’t afraid to get our hands dirty. We are the first line of consumers for Elastic's products, and our experience helps influence the direction of the stack. While most organizations may have one or a handful of Elastic Stack deployments, here you’ll be responsible for identifying, troubleshooting and reporting platform problems to product engineers (or fixing the code yourself) to ensure that the thousands of Elasticsearch clusters we manage provide a stable and reliable service. We’re looking for people who are just as passionate about troubleshooting issues with distributed systems as they are about automating, coding and collaborating to solve problems.

Responsibilities

You will report and solve problems within the Elastic Cloud infrastructure services and collaborate on issues with product engineers

You will participate in SRE software engineering, writing code that continually reduces human intervention in operational tasks and automates processes

You will monitor the Elastic Cloud platform and Cloud infrastructure, respond to incidents, correct and improve systems to prevent incidents, and plan capacity

You will manage Cloud provider infrastructure, system deployments and product releases

You will be involved in resolving Elastic Cloud customer support issues

You will demonstrate and promote best practices for teams using Cloud platforms

You will participate in 24x7x365 on-call schedules

Experience

You have 2-3 years of experience providing technical leadership and mentoring members of a Site Reliability Engineering team

You have experience establishing, monitoring, and reporting on team Service Level Objectives

You are a software-focused engineer with real interest, and ideally some experience, in Linux systems, networking, monitoring and automation.

You have at least three years of experience using a public Cloud: AWS, GCP, Azure, SoftLayer or OpenStack

You are comfortable writing software to automate API-driven tasks at scale. SREs use Python and Go regularly but are also encouraged to contribute to the product codebase in Java, Scala, and Python.

You have used Ansible, Puppet, Chef or another config management suite, know where it's broken, and are open to trying new alternatives

You preferably have extensive GovCloud experience

Key Skills

Healthy knowledge of Linux (have compiled your own kernel at some point, know how to trace syscalls, understand TCP, care about the difference between sysvinit/runit/systemd, etc.)

Relentless desire to automate and build software tools

Desire to represent work in git, driven by a GitHub workflow through issues and pull requests

Love open source development, and have contributed to some project somewhere (doesn't have to be ours), whether through mailing lists, patches, documentation, etc.

Enjoy working remotely and the communication it requires

Love a diverse environment, working with people all over the world

Additional Information - We Take Care of Our People

At Elastic, we strive to have parity of benefits across regions. While regulations differ from place to place, we believe taking care of people is the right thing to do.

Healthcare for you and your family in many locations.

Flexible location and schedule for many roles.

Generous number of vacation days each year.

Double your charitable giving — we match up to 1% of your salary.

Up to 40 hours each year to use toward volunteer projects you love.

Elastic is an Equal Employment employer committed to the principles of equal employment opportunity and affirmative action for all applicants and employees. Qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender perception or identity, national origin, age, marital status, protected veteran status, or disability status or any other basis protected by federal, state or local law, ordinance or regulation. Elastic also makes reasonable accommodations for disabled employees consistent with applicable law.

When you apply to a job on this site, the personal data contained in your application will be collected by Elasticsearch, Inc. (“Elastic”) which is located at 800 W. El Camino Real, Suite 350 Mountain View, CA 94040 USA, and can be contacted by emailing jobs@elastic.co. Your personal data will be processed for the purposes of managing Elastic’s recruitment related activities, which include setting up and conducting interviews and tests for applicants, evaluating and assessing the results thereto, and as is otherwise needed in the recruitment and hiring processes. Such processing is legally permissible under Art. 6(1)(f) of Regulation (EU) 2016/679 (General Data Protection Regulation) as necessary for the purposes of the legitimate interests pursued by Elastic, which are the solicitation, evaluation, and selection of applicants for employment. Your personal data will be shared with Greenhouse Software, Inc., a cloud services provider located in the United States of America and engaged by Elastic to help manage its recruitment and hiring process on Elastic’s behalf. Accordingly, if you are located outside of the United States, your personal data will be transferred to the United States once you submit it through this site. Because the European Union Commission has determined that United States data privacy laws do not ensure an adequate level of protection for personal data collected from EU data subjects, the transfer will be subject to appropriate additional safeguards under the standard contractual clauses. You can obtain a copy of the standard contractual clauses by contacting us at privacy@elastic.co. Elastic’s data protection officer is Daniela Duda, who can be contacted at daniela.duda@elastic.co. We plan to keep your data until our open role is filled. We cannot estimate the exact time period, but we will consider this period ended when a candidate accepts our job offer for the position for which we are considering you. 
When that period is over, we may keep your data for an additional period of no longer than 3 years in case additional opportunities present themselves for which your skills might be better suited. For additional details, please see our Elastic Privacy Statement: https://www.elastic.co/legal/privacy-statement.