MTS1 DevOps/Site Reliability Engineer
in Scottsdale at PayPal

Job Description

Fueled by a fundamental belief that having access to financial services creates opportunity, PayPal (NASDAQ: PYPL) is committed to democratizing financial services and empowering people and businesses to join and thrive in the global economy. Our open digital payments platform gives PayPal’s 254 million active account holders the confidence to connect and transact in new and powerful ways, whether they are online, on a mobile device, in an app, or in person. Through a combination of technological innovation and strategic partnerships, PayPal creates better ways to manage and move money, and offers choice and flexibility when sending payments, paying or getting paid. Available in more than 200 markets around the world, the PayPal platform, including Braintree, Venmo and Xoom, enables consumers and merchants to receive money in more than 100 currencies, withdraw funds in 56 currencies and hold balances in their PayPal accounts in 25 currencies.

The Monitoring Team at PayPal seeks a talented Site Reliability Engineer to help us improve the reliability and resiliency of site monitoring. This position is located in our Scottsdale, AZ office. Our team builds the world-class products used to collect, ingest, store, alert on, and report on critical health metrics produced by the large-scale distributed system that runs PayPal. Our monitoring platform is built using n-tiered, scalable technologies such as Flink, Kafka, Java, Go, HBase, OpenTSDB, Elasticsearch and Druid.

As a site reliability engineer on our team, you will ensure the availability of the distributed monitoring platform by managing configurations, monitoring system health, and tuning parameters to achieve optimal application performance, stability and reliability. A successful candidate will need strong programming skills, sound working knowledge of DevSecOps (guidelines, tools and process), an understanding of cloud technologies, automation systems, data centers and load balancing, and excellent communication and planning skills. In addition to the reliability work, you will be asked to contribute to our software stack to better understand the implementation gaps that can cause performance issues in production.

If you are passionate about systems design, scaling beyond 99.9% reliability and working in a highly dynamic environment with a team of smart and talented engineers, then this is the job for you.

Responsibilities:

Ensure the reliability of stream-based applications that process up to 10 million data points per second, with low operational overhead and minimal data loss.

Automate the maintenance of systems after they go live by measuring and monitoring availability, latency and overall system health.

Collaborate with other engineers on code reviews, internal infrastructure improvements and process enhancements.

Mentor team members and bring in new technologies when necessary.

Requirements:

Prior experience monitoring large-scale distributed systems. Demonstrated knowledge of automating most manual tasks in the SDLC, using techniques such as packaging with Docker, provisioning with Ansible, maintaining a reliable CI/CD pipeline to build and deploy code, and automating system restarts and alerting for all critical modules.

Ability to isolate errors by troubleshooting the full stack, from application to framework to underlying infrastructure dependencies and network.

Experience with cloud-native software development and with other monitoring tools such as Nagios or Splunk.

Experience with any of the following monitoring tools: Grafana, TSDB or Druid; or with any of the following types of monitoring: alerting, logging, tracing and time-series metrics.

Working experience supporting massively scalable, high-performance systems.

Excellent problem-solving skills.

Ability to identify performance bottlenecks and mitigate system failures.

Contribution to an open source project in operations, automation, or monitoring.

Education:

Bachelor’s or Master’s degree or equivalent in computer science or a related field, with a minimum of 5 years of directly related work experience.

We're a purpose-driven company whose beliefs are the foundation for how we conduct business every day. We hold ourselves to our One Team Behaviors, which demand that we hold the highest ethical standards, empower an open and diverse workplace, and strive to treat everyone who is touched by our business with dignity and respect. Our employees challenge the status quo, ask questions, and find solutions. We want to break down barriers to financial empowerment. Join us as we change the way the world defines financial freedom.