Tuesday, 19 September 2017

Podcast Interview with Chuck Calio, IBM

Last year at GraphConnect San Francisco, we had a great announcement: one of IBM's most senior leaders, Doug Balog, talked about what IBM was doing together with Neo4j to make the graph database perform like crazy on the POWER8 hardware platform.

Doug came on stage and talked to Emil and the audience about all the hard work going on there. Now, just before GraphConnect New York, it felt like the right time to check in with our friends at IBM about their work with Neo4j and how it might affect the graph community. So we got Chuck Calio to spend some time with us on the podcast.

Here's the transcript of our conversation:

RVB: 00:03.212 Hello, everyone. My name is Rik, Rik Van Bruggen from Neo Technology. And it's been a long summer. It's been a really long summer and I've enjoyed it a lot, but it's time to get this podcast show on the road again. And so today I've invited and got a wonderful person on the other side of this Skype call from our dear friends at IBM, and that's Chuck Calio. Hi, Chuck.

CC: 00:32.966 Hey, Rik! How we doing?

RVB: 00:34.711 Good.

CC: 00:34.668 Summer's come to an end and here we go with a podcast to kick things off in September.

RVB: 00:40.038 Exactly. That's the way it is. And thank you for making the time, Chuck. I really do appreciate it. And as always, many people may know you but other people may not. So I'd like you to introduce yourself a little bit. Who are you, what do you do, and what's your relationship to the wonderful world of graphs?

CC: 00:58.212 Okay. Thanks, Rik. So I'm Chuck Calio. I work in IBM. I'm based out of Poughkeepsie, New York, which is 100 kilometers north of New York City. And I'm actually the worldwide offering manager for IBM's Neo4j on the Power Hardware solution. And so my role is to do everything from develop new Neo4j on Power Systems hardware offerings, to help sell and market the solution. And spend a lot of time working with individual clients on responding to opportunities for Neo4j on Linux and Power, to help them understand Neo4j on Linux and Power, to guide them through the process of learning about graph and how graph would complement their existing relational database environment, and/or their other NoSQL environments, many of which they've deployed like the MongoDB or Redis. And now, they're looking at the exciting growth opportunities that graph and connected data also mean and present to them. So basically, my role is to really be the individual leading advocate for Neo4j on Linux and Power, and it's very easy because there's an incredible demand for Neo4j on Linux and Power. And so that makes my job easy. It's a fascinating job. I typically, on every day, will get requests from all over the world to do just about anything. So that's what makes it fun. I drive into work in the morning and I don't know what's going to happen, and I kind of enjoy that. I'm pretty well--

RVB: 02:25.884 Super cool.

CC: 02:26.986 --[inaudible] kind of environment. So--

RVB: 02:29.614 So, Chuck, I mean, I think many people may not know exactly what we do together, right? I mean, Neo4j and IBM, we've been integrating those two environments quite a bit. And our chief scientist, Jim Webber, talked about it at GraphConnect San Francisco last year, and in London. But maybe it will be useful to kind of repeat that. What's the story there? What's the vision behind this integration between graph databases and Power?

NOTE: some of the audio in this next section was unfortunately lost in recording. We have tried to preserve as much as possible, but parts of the conversation that follows are missing.

CC: 02:59.535 Yeah, sure. So Neo4j runs on different types of hardware. And in particular, on the Power hardware, we started out with Neo4j a couple years ago, where we just basically ported Neo4j to Linux on the Power Systems hardware and sort of-- that gave a kind of a solution that would allow Neo4j to inherit the quality of service that the larger memory, more threads, faster CPUs, faster memory to CPU, and the basically better I/O of Power Systems. So that was sort of the first phase of the work. And then we did find that Neo4j and graph databases in particular respond very well to larger amounts of memory and faster bandwidth. So then we worked on further optimizations with Neo4j on the IBM POWER8 Systems [inaudible] accelerator [inaudible] also extend the memory. So we started out with just the basic [inaudible] solution, then we worked out on more optimized [inaudible] look at larger memory sizes to enable Neo4j to scale [inaudible] users and transactions and graph sizes and such. So that was sort of the second step in the process. Then the third step of the process is the actual hardware designers [inaudible] next generation of IBM Power Hardware which is called POWER9. That will come out starting in the fourth quarter of 2017, and then more in 2018. We actually had our electronic design engineering team actually start to use Neo4j to better optimize chip design and timing design. So based on that, we sort of had a next step beyond that, where we could actually do some hardware traces of the Neo4j software running on the IBM Power hardware, and now we're even identifying further enhancements to Neo4j software based on the traces that we did on the IBM POWER8 hardware, which is based on running Neo4j on IBM POWER8, which was designed with Neo4j.

CC: 05:07.774 So we have a recursive kind of thing going on here. We have some incredibly valuable use cases that we're finding between the two companies, but more importantly, creating a kind of an innovation, a one plus one equals three solution that our clients will benefit from greatly going forward in the future. And that's the most important thing to me.

RVB: 05:28.047 That's so cool. Are there any kind of indicative advantages? Like quantitative results in terms of what types of systems we can deploy on POWER8 using Neo4j? Have we done some tests there?

CC: 05:45.127 Yeah. We do. We have some performance data. So we're typically 80% better price performance than our competitors. And we enable up to 56 terabytes of either RAM and/or near RAM speed memory. So you can have very, very large graphs all on memory with Neo4j on Linux and Power. It's a very unique part of our solution that makes it very, very scalable. And very large clients appreciate that part of the solution. And so going forward, we're looking at further ways to optimize the transfer rates between memory and the CPU. And further looking at exploiting accelerator technologies. Because in general purpose hardware the advancements are being slowed a little bit due to Moore's law's limitations and such. But the big thing happening in hardware nowadays is to exploit accelerators, both GPUs and FPGAs. And we're more aligned with the CAPI technology that we have at IBM, which is essentially larger memory in an FPGA, and working closely with Neo4j's engineering team around trying to see if some of the algorithms that Neo4j uses today that are used a lot by their clients can benefit from really deeper optimization and acceleration. So the thing going on in hardware now is really all about accelerators, both GPUs and FPGAs. And you need these for the really heavy duty use cases, like AI, machine learning, deep learning, graph, other areas that benefit greatly from it. So really exciting stuff. We're finding hardware does matter in some of these newer growth solutions that really challenge the hardware much more robustly than the traditional relational database models which--

RVB: 07:29.064 It's a little bit like what Jim Webber was saying at GraphConnect. Basically, this new kind of a wave where software engineers and hardware engineers are going to have to work together much more intimately in order to get these really big workloads to perform, right? The collaboration between POWER8 and Neo4j seems to be in line with that, I guess.

CC: 07:58.083 Yeah, that's absolutely right. I think the very specialized expertise and really hardware that steps up and runs certain workloads in use cases much, much better, orders of magnitude better, than just general purpose hardware I think is a very common approach. And a lot of the growth solutions-- in particular a lot of the growth solutions in the analytics including artificial intelligence, machine learning and deep learning, that specific area where I would put graph and Neo4j into seems to benefit greatly from the latest levels of hardware and accelerated hardware. So really exciting area to work on. It's really stepping up to meet the big challenges that our clients have of us. Of course price is very important and always does matter, so that's also something we have to keep our eyes on.

RVB: 08:49.563 Totally. Hey, Chuck, and so where is this going, you think? What does the future hold, you know? Do you see this accelerating in the next couple of years, or what's in store for POWER9 and maybe beyond that and [crosstalk] databases like ours?

CC: 09:06.731 Yeah, I think we're going to continue to see the interfacing and the interplay between graph and artificial intelligence, machine learning, and deep learning. I think that's a given. I think that that's an important area that we see an expansion on. I think super advanced cyber security solutions is also an area that we're both really interested in and focusing on. So those kinds of things I think are where I see it going. The other thing I would like to mention is expansion to areas like the Asian markets, China in particular, Japan, countries like that. We're seeing a lot of big step up in interest from those areas. Quite a big recent increase in Neo4j on Linux and on Power in the Asian market. So that's another trend I'd like to mention, which I'm very, very pleased with. And then inside of IBM, I feel the direction now is a natural expansion to areas beyond my subject matter of expertise. So for example, the IBM Watson cognitive team is now using Neo4j. Like I said before, our cyber security teams, other hardware teams are looking closely at Neo4j. And I anticipate that many other areas, including potentially software development teams inside of IBM, would also look at using Neo4j and use cases around software development and graph to identify areas to review its defects, to increase productivity, to be more agile and that kind of stuff, so.

RVB: 10:29.003 Lots of stuff happening. It's a very exciting time. And I guess we'll be hearing a lot more of that at GraphConnect New York, right? I'm assuming you'll be there.

CC: 10:37.579 Yes. I will be a featured speaker. And you know what I'd like to see is anybody who wants to come on over and meet me, meet us at the IBM booth. We're a gold sponsor of GraphConnect, and we are very pleased to be at the New York City event. New York City is absolutely a wonderful town to come and visit. Please come and see us. We will be there. A number of my colleagues from IBM will be there with a diverse set of backgrounds. I think you'll find it fascinating to stop by. Meet us. Like I said, we are a gold sponsor and we'll have a booth and some feature sessions. And please stop on by in GraphConnect New York City. We'd love to see you. It's a great town as well, so. And I'm sure the conference will be absolutely fabulous in terms of a broad variety of speakers with a lot of subject matter expertise, skills, ability and experience. Yeah, really ranging from smaller startups, up to the biggest firms showcasing how Neo4j is bringing value to them.

RVB: 11:34.526 That's it. And I think that's what we're all looking forward to. I'll be there as well for a full week. So I'm looking forward to meeting you there face to face, Chuck. And I'll continue the conversation then.

CC: 11:49.546 Very good. Very good.

RVB: 11:50.550 Thank you so much for coming online. It was great talking to you and I'll see you in about a month.