Software Engineer, Site Reliability Engineer

Our mission.

As the world’s number 1 job site, our mission is to help people get jobs. We need talented, passionate people working together to make this happen. We are looking to grow our teams with people who share our energy and enthusiasm for creating the best experience for job seekers.

The team.

We are a rapidly growing and highly capable engineering team building the most popular job site on the planet. Every month, over 200 million people count on us to help them find jobs, publish their resumes, process their job applications, and connect them to qualified candidates for their job openings. With engineering hubs in Seattle, San Francisco, Austin, Tokyo, and Hyderabad, we are improving people's lives all around the world, one job at a time.

Your job.

Site Reliability Engineering (SRE) applies software engineering techniques and discipline to production operations to attack major problems and fix them for good. SRE adds nines to our already well-engineered and highly reliable software products supporting job seekers, employers, and internal customers. Every month, over 200 million people count on us to help them find jobs, publish their resumes, process their job applications, and ultimately get hired at their next job.

SRE is always on call to keep our products available and fast. The team spans Austin, Dublin, Hyderabad, Seattle, San Francisco, and Tokyo to support a follow-the-sun on-call rotation. SRE is new at Indeed, and members of this team will have the chance to shape the direction of a critical, global SRE organization. There will be ample opportunities for growth in many areas: technology skills, leadership, mentorship, design, and more.

Responsibilities

Participate in the entire software lifecycle including design, delivery, measurement, and learning.

Design, write, ship, and motivate the creation of software and systems to increase product reliability and organizational efficiency.

Support the software lifecycle through activities like reviewing designs, creating platforms and frameworks, capacity planning, and chaos testing.

Maintain service health through monitoring and follow-the-sun incident response.

Improve service reliability through root cause analysis, blameless postmortems, and code that prevents or responds to recurring problems.