Democratizing Data at Go-JEK

Go-JEK has grown 6666x in the last 3 years, and the data it generates has grown exponentially along with it. As data engineers at Go-JEK, we found that our entire team was tied up handling infrastructure requests. This led us to create an internal portal that lets other teams self-provision their data. This talk is divided into two parts. In the first part, I will cover how we scaled our data engineering infrastructure to handle more than 40 million messages per day, including data consumption, aggregation, monitoring, and cold storage. In the second part, I will cover how we built our internal portal for infrastructure orchestration. Backed by Kubernetes, the portal enables teams to self-provision data infrastructure without any supervision.

MAULIK SONEJI

Go-JEK

Maulik is currently working as a Data Engineer at Go-JEK, where he provides reliable data infrastructure across all of Go-JEK’s 18+ products. He provides real-time data for use cases ranging from AI-based allocations, fraud detection, and recommendations to critical real-time business reporting and monitoring. His team has open sourced internal tools such as heimdall and loki, and plans to open source other projects as well. He is an open source contributor to Mifos and Apache Fineract, and serves as a mentor for programs like Google Summer of Code and Google Code-in. He is interested in data infrastructure, blockchain, and the application of AI and ML to real-world problems.