The Five Minute Interview – Zonar

This article is one in a series of quick-hit interviews with companies using Apache Cassandra and DataStax Enterprise for key parts of their business. For this interview, we spoke with Jesse Young, Vice President of Software Development, and Josh Hansen, System and Application Architect, at Zonar Systems.

“We needed to be able to quickly expand our storage and store this data in real-time without any bottlenecks.”

— Jesse Young
VP of Software Development, Zonar Systems

DataStax: Jesse, what does Zonar Systems do for its customers?

Jesse: At Zonar we offer a safety inspection system and telematics for heavy fleet vehicles. Ultimately our offering is GPS tracking of vehicles that weigh over 10,000 pounds or carry more than 8 passengers. We make sure users know exactly where those vehicles are going and collect a lot of engine diagnostics information, so that we know exactly what’s going on with the vehicle in real-time.

DataStax: You’re doing a ton of fleet logistics tracking, which is a great use case for something like Cassandra with DataStax Enterprise. What is your specific use case for Cassandra at Zonar Systems?

Jesse: Today we’re tracking over 350,000 vehicles across the United States and Canada; we’re quickly growing and expect to be at 500,000 devices by the end of next year. We’re a leader in our industry space. We’re over 100TB in our data stores right now. We’ve maxed out our RDBMS solution and have been looking at how we can quickly store data and retrieve it as fast as possible.

DataStax: Is the vast majority of that 100TB all GPS-type data?

Jesse: Most of it. We get very heavily into GPS data, but we’re also collecting a lot of information off of the engine computer itself, such as oil temperatures, cooling temperatures, cruise control state, fault codes, check engine lights, and stop engine light information.

DataStax: That’s awesome. How often do the devices send out information?

Josh: We collect data around every 18 seconds; if you imagine that, it stacks up pretty quickly across 300,000 vehicles. We’re running anywhere from 6 to 12 hours a day, so do the math. It’s quite a bit of data.
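
As a rough illustration of the math Josh mentions, the short Python sketch below multiplies out the figures quoted in his answer. It assumes, purely for the estimate, that every vehicle reports exactly once every 18 seconds for the whole time it is running; the constant and function names are made up for this example, and the result is a back-of-the-envelope range, not a number supplied by Zonar.

```python
# Back-of-the-envelope estimate using only the figures quoted in the interview.
# Assumption: every vehicle reports exactly once per interval while running.
VEHICLES = 300_000             # vehicles, as quoted in the answer above
REPORT_INTERVAL_S = 18         # roughly one reading every 18 seconds
HOURS_LOW, HOURS_HIGH = 6, 12  # hours of operation per vehicle per day


def readings_per_day(vehicles, hours, interval_s=REPORT_INTERVAL_S):
    """Total readings per day if every vehicle runs for `hours` hours."""
    per_vehicle = hours * 3600 // interval_s  # readings from one vehicle
    return vehicles * per_vehicle


low = readings_per_day(VEHICLES, HOURS_LOW)
high = readings_per_day(VEHICLES, HOURS_HIGH)
print(f"{low:,} to {high:,} readings per day")
```

Under those assumptions the fleet produces somewhere between 360 million and 720 million readings per day.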

DataStax: What was your original motivation for looking at alternate technologies to a relational system?

Jesse: Many different factors motivated us to start looking for a better solution. Again, we really knew that we had a lot of data that we’re potentially starting to store. We needed to be able to quickly expand our storage and store this data in real-time without any bottlenecks.

At the same time, our users require us to report on that data very quickly, and we didn’t really have the desire to have both OLTP-type databases and data warehousing, as they became very expensive. From a systems approach, we needed a system that had built-in multi-data center replication.

DataStax: What information can you share about what your current infrastructure looks like around Cassandra?

Jesse: We’re still fairly private with that, but we are running multiple data centers and leveraging virtual private cloud providers.

DataStax: Awesome. Other than GPS information and time series data, are there any other use cases inside the environment that are utilizing Cassandra?

Josh: We plan on supporting elevation data. We have a digital elevation model that we received from USGS that we store in Cassandra as well. We tried using it in the relational system, but it pretty much fell on its face. Cassandra was a perfect fit.

DataStax: When you said that it “fell on its face,” what’s the primary issue there? Is it write throughput? Help me understand that a little more.

Josh: The volume of data, on a single node, that you get in a relational system was the issue. We would have to scale it out similar to how Cassandra scales, and shard it, and all that stuff. If you’re running all that code, it makes sense to adopt a system that does all that for you.

DataStax: What’s your favorite thing about Cassandra?

Josh: My favorite part has to be the scaling aspect, to be honest. It’s so much easier working with a whole cluster of nodes that’s one big mesh. In our Postgres systems you have to work on them individually; if you want to run any jobs you have to connect to each one, one by one, run the job, wait for it to finish and then go to the next. Cassandra just gets rid of a lot of that and lets us hit the cluster and use it that way.

Jesse: The performance is amazing. One of the things that I love about it too is the community; there’s a giant field of experts out there that are willing to help people for free, whether it be on Twitter, IRC, Planet Cassandra or all the meetups happening or even the summit events.