Thursday, 24 December 2015

Yey! The festive season is upon us! And: here's the 50th (!!!) podcast episode in the Graphistania recordings. What a journey it has been! To celebrate, I got to talk to my friend and colleague Andreas B. Kollegger. Andreas was probably one of the first people that I talked to when I met the Neo4j team for the first time in the summer of 2012, and he always impressed me with his calm and creative mindset. To give you an example, here's a short demo that Andreas created showing how you can create the essence of a recommendation engine in Neo4j in 2 minutes:

Super sweet. Nowadays, "ABK" is part of the product management team at Neo4j, and he has plenty of interesting things to talk about when it comes to past, present and future of Neo4j. So let's listen to the episode:

Here's the transcript of our conversation:

RVB: 00:02 Hello everyone. My name is Rik, Rik Van Bruggen from Neo Technology. Tonight, I am going to record a podcast episode that I've been looking forward to for a very long time, with my dear friend in dark, rainy, beautiful Portland, Andreas Kollegger. Hi Andreas.

ABK: 00:20 Hello, Rik. Thank you for having me on the podcast today.

RVB: 00:23 Yeah, I know. It's such a joy, thanks for coming on. I am going to call you ABK for short, if you don't mind, lots of people know you as ABK. Why don't you introduce yourself, Andreas? You've been part of the Neo4j ecosystem for a very long time but maybe some people don't know you yet.

ABK: 00:42 Sure. My name is Andreas B. Kollegger. I'll let the B be a mystery for the moment. I work for Neo Technology and I've been a part of the Neo4j community for-- it feels like as long as I can remember, at least as far back as the epic 0.9 release of Neo4j.

RVB: 01:01 Oh no way.

ABK: 01:03 Yeah, it's been quite a while, and I've grown from community member all the way through to now being a product manager or product designer, depending on who you talk to and the time of day.

RVB: 01:14 [laughter] That's very funny. And you've been a great stimulator of the Neo4j community. I remember talking to people on the East Coast, West Coast, South, wherever, that said, ABK made me start up a community over here or a meetup over here. You've been part of it for such a long time, right?

ABK: 01:34 Yeah, that's right. From when we were truly just trying to get things going and get excitement through meetups and lots of events, and actually trying to continue that trend. When I was on the East Coast it was my duty and honor and pleasure to travel up and down, almost all the way from Boston down to around Washington, having meetups, meeting people, talking whenever I could about Neo4j and spreading the great love of graphs that I'd found. And I'm doing the same now here in Portland. Actually, tonight will be my very first community meetup in Portland. I'm very excited about that.

RVB: 02:11 Super cool, super cool. When you say product manager, product designer, what does that mean? What do you do as a day job?

ABK: 02:21 Yeah, that's why the sort of manager versus designer split is interesting. When I moved into this role, it was thought of more as doing product design, which I guess is certainly a bit of a vague term. If we were building physical products like phones or something, then I would very clearly be doing graphic design for the phones, like in 3D or something; sculpting physical objects would be part of the product design. But as it's involved in a software product, product design ranges a bit more, obviously starting from the front-end user experience elements of the product. But we also thought about it as paying a bit more attention to the experience of using Neo4j generally: any of the parts of Neo4j that you touch, whether it's an API or the documentation or the website. It would be good to have somebody just to try and connect all the different parts, and have them all make sense, so that when you read something on the website it reflected how that part actually behaved, and messages and things had the same voice and tone as maybe some of our blog posts. Obviously, that only really made sense when we were a smaller organization. Now that we've grown substantially, we have people who are superb at each one of these things. I've been doing less and less of that and thinking much more in terms of purely product management type of stuff, which is taking on different features of Neo4j: taking a look, I guess, at the giant list of things we would love for Neo4j to become, and what it is today, and where we think we are going, and figuring out what to do next, and how much of it to do next.

RVB: 04:03 But you're figuring in more big things, right? And yes, I mean, you're the movie star in the Neo4j trainings, right?

ABK: 04:11 [laughter] I suppose that's true. I do forget that, and every now and again, I'll still meet people and they'll look at me and they'll say, "Wait I know you. Aren’t you the guy from the videos?" "[laughter] That's right. That is me." So, if you’ve had the pleasure of using our online tutorials and watching the short video clips, that is me in the video clips.

RVB: 04:30 Exactly, in a beautiful tie and [laughter]... So why graphs, ABK? What attracted you to graphs in the first place? And what fires you up every morning to keep on working on this stuff?

ABK: 04:48 I have to say that my motivation hasn't changed since the early days of when I first went and tracked down Neo4j. I was in that generation of people who started looking for a graph before we knew we were looking for a graph. I'd been doing international non-profit work, actually with this wonderful organization doing work in sub-Saharan Africa, effectively medical informatics work, right? We were doing patient care and disease surveillance and things like that, and with so many of the data models we were working with, we had the sort of classic realization that our traditional, good old relational database models were really perfect and awesome for the reporting we needed to do, but maybe not so great for actually doing any analysis and trying to understand public health concerns like, why did this pattern of disease progress in the way that it did? And sort of collectively, I think, we had an understanding that what we were doing was a graph problem. We didn't think about it, I think, in that way. Except that I happened to be lucky enough to be living in Baltimore at the time, and one of my neighbors was heavy into ontology databases. And I was looking at his database and I thought, "Oh wow, that's brilliant. That's maybe exactly what I want." He talked to me about it, and he said, "Well, maybe, but this may be more than you actually need. There could be something in between that has the flexibility and expressiveness of an ontology database but without being entirely prescriptive, so it could be a bit more flexible for the application and easy to use." And he actually introduced me to Neo4j. He said, "Why don't you go check out this project? It looks like it might be perfect for what you're trying to get done." And I fell in love. It was exactly what I wanted. It thought about data the way that I wanted to think about data.
And so, I used it for a few projects, I tried to, you know, get involved with the community and make some contributions of my own to the code base, and that's what began my lifelong sort of journey with Neo4j and the organisation and community.

RVB: 06:56 Do you remember what was the killer feature that attracted you to sort of get started with it, what was that? Was it the domain model? Or what was it exactly that attracted you so much?

ABK: 07:07 Honestly, it was this whole-- it's simple to say a thing like that: it's all about relationships [chuckles]. It's almost trite, but that simple shift in thinking, from looking at and caring about the individual records to thinking about how these records relate-- that's where all the value was in all the data modeling I was doing, all the applications I was doing. That was so much more powerful than the individual records themselves, because that's where you see patterns and progressions of things. Neo4j elevated it and made it an actual concern you dealt with as part of normal modeling, rather than something where maybe later on you add in some foreign key constraints or something.

RVB: 07:51 Yeah, totally. Is that something that you still think is a core thing to the product? This relationship-centric view on things, is that still one of the core things?

ABK: 08:03 I do think that it really is-- I think that's really where the long-term relationship with the graph model is; that's where the power is, in the relationships. One of the near-term challenges we have, a startup challenge, is that of getting started and introducing Neo4j. I happened to have my own epiphany. I realized that this was what I wanted, and it felt perfect and it was awesome. But until you think in that way, it can seem weird, right? I feel like we're in this place where we've done a really great job of making graphs awesome, but we can actually do a little bit more to make graphs easy to use as well. As it is right now, maybe you have to do-- It's great that you think about relationships, but if you're always thinking about relationships, then with some amount of structuring, just for getting started, it's like you have to think too much. And we'd like to find a nice balance, where you have to think only a little bit. If you're doing something simple then you don't have to think too much. The simple things are really easy to do, but you don't get caught in a corner where, because we've made it too simple, it's hard to do the more expressive and richer things. So, that's the balance I think that we're trying to move towards in the next-- actually certainly in the next release as well. We're starting to put in some bits of capabilities that will make that, I think, a nicer interaction.

RVB: 09:31 You know what, you're setting yourself up for my final question [laughter]. Where is it all going in this? Where do you see the industry, but also the product, a couple of years from now? What does the future hold?

ABK: 09:48 Yeah, I think that certainly in the industry - that's the broader database industry - and we're part of the NoSQL segment of it, which I think people have finally realized isn't really a separate segment. It's just people trying to deal with lots of data and figure out what the best way is to work with all that data. And from each of our different starting points, whether it's graph databases, the column stores, the key-value stores, or anything else, we're all, of course, iterating on our world view and slowly progressing towards a common understanding that we want to be able to do all the things really well. Of course, I still think that at the end of the day graphs are going to be the best way, always, to think about everything, but as I was saying, I guess, maybe there's a little bit of extra thinking you've got to do just before you start structuring things. So there are things we can do to improve that, but I feel like within the next couple of years we'll see other databases realize that they want to do graph stuff and they'll start adding graph features, and you'll see us making it easier to do stuff that isn't strictly graph stuff. Simple things like, let's say-- my favorite is always to say, if you want to manage a list of things: it's very easy to conceive of, but you've got to do a little bit of work if what you're doing is always managing relationships, connecting and disconnecting things. That should be dead simple to do, and I think you're going to see in the next couple of years that we have an easy way of expressing that as well.
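
ABK's list example can be made a bit more concrete with a small sketch. The Python below is not Neo4j code and says nothing about how any Neo4j release handles lists. It is a hypothetical in-memory model, where a dictionary called next_of stands in for NEXT relationships between items, and it only illustrates the connecting and disconnecting that a simple insert involves when a list is stored purely as relationships.

```python
def insert_after(next_of, existing, new):
    """Insert `new` right after `existing` in a list stored as NEXT links.

    `next_of` is a hypothetical stand-in for NEXT relationships:
    it maps each item to the item that follows it.
    """
    following = next_of.pop(existing, None)  # disconnect existing -> following
    next_of[existing] = new                  # connect existing -> new
    if following is not None:
        next_of[new] = following             # connect new -> following


def walk(next_of, head):
    """Follow NEXT links from `head` and return the items in order."""
    items = [head]
    while items[-1] in next_of:
        items.append(next_of[items[-1]])
    return items


next_of = {"a": "c"}           # a -NEXT-> c
insert_after(next_of, "a", "b")
ordered = walk(next_of, "a")   # a, then b, then c
```

One insert touches three links, which is the "little bit of work" ABK describes for something that is conceptually just "add an item to a list".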

ABK: 11:36 So there are two things I'm excited about in Neo4j 3.0. One is actually just a very simple change to how we present what's currently called Neo4j Browser, our user client for accessing the database. We're taking just some practical steps there to actually separate development of that from development of the database. And coupling that with the new protocol that we have, this Bolt protocol for connecting with Neo4j, gives us the opportunity to do something that was awkward to do previously, which is that you can run the Neo4j client separate from Neo4j, and it can connect to any Neo4j database that happens to be out there. It doesn't have to be tied to the database that started it up, right?

RVB: 12:17 Yup.

ABK: 12:18 I think that's going to be brilliant. For just day-to-day use of Neo4j, it will make it much more pleasurable, and also we'll be able to deliver the client separately from the database and have more frequent updates and feature requests going in. So, that'll be pretty exciting.

RVB: 12:36 You know, I think there's so many nice things we could talk about. But as you know, I want to keep these podcasts digestible and short, so that people can listen to it on their commute, so I'm going to thank you so much for coming online Andreas. It's been a very nice conversation. I really appreciate it. Thanks again, and I look forward to seeing you soon.

Thursday, 10 December 2015

At the last GraphConnect in San Francisco, we had a wonderful Neo4j user on stage presenting their usage of Neo4j in a very modern and insightful way: to manage and automate some of their software development processes. Ashley Sun can tell you all about this in more detail, so without further ado, here's this week's Graphistania episode:

As always, here's the transcript of our conversation:

RVB: 00:01 Hello everyone, my name is Rik, Rik Van Bruggen from Neo, and here I am recording a podcast episode together with a wonderful guest on the podcast episode, all the way from California, Ashley Sun. Hi Ashley.

AS: 00:37 Okay. Hi, I'm Ashley. I work on the DevOps team at Lending Club based in San Francisco. I work a lot on deployment and release automation, and I use Neo4j to do it.

RVB: 00:54 Wow, that's great. How long have you been doing work with Neo4j, Ashley? It must have been for a long time already or--

AS: 01:01 Only a little over a year, I'd say, so my manager first introduced it to me. I think he stumbled upon graph databases on Twitter or something and he's like, "Hey, check out this new thing called Neo4j." And so, we started playing around with it, and it quickly evolved from just a side project to being a really critical part of a lot of our release and deployment automation, infrastructure mapping, app auto-discovery, and a lot of other things, actually.

RVB: 01:37 That's a great segue into what do you guys use it for exactly? Why don't you tell us a little bit more about that?

AS: 01:43 Sure. So, we use it for a lot, a lot of things, actually. So, as I was saying, I guess it was very opportunistic when we started using Neo4j. We had a lot of problems in DevOps and growth pains. So, we started with maybe like five micro-services and a couple years later, we're almost at 150, and so it was getting really difficult to manage and keep track of all these services, and so what we did is we used Neo4j to keep track of all these instances. We had them radio home-- we have this internal app called MacGyver. So, we had them radio home every minute to MacGyver, and MacGyver would save all these app instances in Neo4j, and so already, immediately, we just gained a lot of visibility into what services were out there, where they were running, a lot of info like that. And it was really low maintenance, it was easy to scale, we didn't have to do any work because these new instances would just keep reporting back to MacGyver and get saved into Neo4j.

So, from there, we were like, "Oh, this is really useful, so we're going to take this a step further." And so, at Lending Club, we use blue-green deployments. Basically, this just means that we have two pools for every app, and at any given time, only one pool is live. We didn't have a good way before to track what pool is live and what pool is dark, and so we started using Neo4j. We were already mapping our app check-ins, and so we took that and then within Neo4j, created the server nodes, which we then mapped to belong to pool nodes, which we then mapped to belong to service groups. By keeping track of what servers existed in what pool, and whether that pool was live or dark, we were able to automate our releases. Whereas before, releases were very manual. We'd have to go into this GUI and check mark all these boxes; it was just very tedious and very time-consuming. Now with Neo4j, we are keeping track of this info, and so it was just really quick. It was like a flip of a switch and we could make a pool live or dark or even really easily look up what pool is live. And also we used it to track the health of our instances and our apps. So, that was also really, really important and that's what we're using now for deployments.
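
To make the server, pool and service group structure easier to picture, here is a minimal sketch in plain Python. It is not Lending Club's code, it does not talk to Neo4j, and every name in it (Pool, ServiceGroup, "payments", the server names) is a made-up assumption for illustration. It only shows the shape of the idea: servers belong to pools, pools belong to a service group, exactly one pool is live, and a release is a flip.

```python
from dataclasses import dataclass, field


@dataclass
class Pool:
    name: str
    live: bool
    servers: list = field(default_factory=list)  # servers that belong to this pool


@dataclass
class ServiceGroup:
    name: str
    pools: list = field(default_factory=list)  # pools that belong to this group

    def live_pool(self):
        """Return the single pool currently taking traffic."""
        live = [pool for pool in self.pools if pool.live]
        if len(live) != 1:
            raise ValueError(f"{self.name}: expected exactly one live pool")
        return live[0]

    def flip(self):
        """Swap live and dark: the blue-green 'flip of a switch'."""
        for pool in self.pools:
            pool.live = not pool.live


# Hypothetical example data.
blue = Pool("blue", live=True, servers=["app-01", "app-02"])
green = Pool("green", live=False, servers=["app-03", "app-04"])
group = ServiceGroup("payments", pools=[blue, green])

before = group.live_pool().name
group.flip()
after = group.live_pool().name
```

In a graph database the same three levels would be nodes connected by "belongs to" relationships, and "which pool is live?" becomes a short lookup instead of a trip through a GUI.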

RVB: 04:18 Super cool. By the way, I love the naming. I'm a big fan of MacGyver [laughter].

AS: 04:24 Awesome. Actually, my manager came up with the idea of the name and I had no idea what MacGyver was and just-- my time of-- I'm like, "Is that like MacGruber from SNL?" So, I had to watch an episode of MacGyver to--

RVB: 04:41 I guess I'm showing my age here a little bit [laughter]. Ashley, so it's basically what you're using for dependencies between all the micro-services, is that what I'm hearing? You're basically tracking everything with these automated ping backs, but then you're mapping it onto like a model of all your micro-services, is that what I'm hearing?

AS: 05:05 Yes, that's correct. So, taking the instances and then arranging that data in a way that becomes useful, so that we know where our apps are and what's live at any time and what's dark. And also even-- so, we're mapping dependencies from services. So, we map them onto-- for example, vCenter instances and vCenter hosts, and then we take those vCenter arrays, then map those to our storage arrays and our storage volumes. So, what we get is like this huge mapping of our infrastructure. For example, if we want to find a single point of failure: say we have an app called ABC and all of its instances reside on one vCenter host, and if that host goes down, then our entire app is wiped out. That's a single point of failure. So, we use Neo4j to keep track of things like that, to avoid these-- it would be a huge disaster if that were to happen.
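
The single point of failure check Ashley describes boils down to a simple question: which apps have all of their instances on one host? Below is a rough sketch of that check in plain Python, under the assumption that the instance-to-host mapping is available as (app, host) pairs. The function name, the host names and the second app "XYZ" are hypothetical; in their setup the same question would be asked of the graph rather than of a list.

```python
from collections import defaultdict


def single_points_of_failure(placements):
    """Return {app: host} for every app whose instances all sit on one host.

    `placements` is an iterable of (app, host) pairs, one per running instance.
    """
    hosts_by_app = defaultdict(set)
    for app, host in placements:
        hosts_by_app[app].add(host)
    return {
        app: next(iter(hosts))
        for app, hosts in hosts_by_app.items()
        if len(hosts) == 1
    }


# Hypothetical example data: three ABC instances on one host, XYZ spread over two.
placements = [
    ("ABC", "host-1"),
    ("ABC", "host-1"),
    ("ABC", "host-1"),
    ("XYZ", "host-1"),
    ("XYZ", "host-2"),
]
risky = single_points_of_failure(placements)
```

Here only ABC is flagged, because losing host-1 would take out every one of its instances, while XYZ would survive on host-2.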

RVB: 06:12 Yeah. And I seem to recall from one of your talks that you also use this tooling to help you guys do more stuff with Amazon web services. Can you tell us a little bit more about that or did I get that completely wrong?

AS: 06:28 No, no. You're totally right. One of our multi-year projects is we're moving into AWS, and so we're just starting that process now, but already we are mapping a ton of AWS stuff into Neo4j. For example, like our VPCs, subnets, availability zones, our RDS instances and EC2 instances, those map to load balancers, auto-scaling groups, launch configs. You can tell already, it's like a huge, a huge map in Neo4j. So, there's all these different parts, but we are able to make sense of it by mapping relationships together in Neo4j, and also as we move into AWS, we'll start using CodeDeploy. But again, using Neo4j to automate that and put it into MacGyver, so that developers at any time can say, "Hey, I need an instance and I want to launch this app onto it." It'll just be really simple and we'll use MacGyver and Neo4j to do that.

RVB: 07:33 Super cool. So, that brings me to the question that I ask everyone on this podcast, why Neo4j? Why a graph database to do what you're doing? Was there any specific reason for that or is there anything you want to call out that you really like about it in your current environment?

AS: 07:54 Definitely. I guess, first off, the low latency, and really the ad-hoc querying is super, super useful. I think another thing that really stands out to me is how flexible and scalable Neo4j is. So, we started small just with app instances, but it's really easy to build new layers and new relationships on top of already existing ones, and so where we started with just app instances, now it's become this huge infrastructure mapping of so many different types of nodes. It's really cool how with Neo4j, your data set can really easily evolve and grow in terms of complexity or structure. It's just so easy to use and that's why we've been able to keep using it. And also, obviously, it's just really good at graphing relationships between things and mappings. That's where I really--

RVB: 08:52 Yeah, it makes total sense. Do you guys use Cypher at all? Do you do interactive querying or maybe--?

AS: 08:58 Yeah.

RVB: 08:57 Yeah, you do.

AS: 08:59 Actually, within MacGyver, we have a web interface for Neo4j and people enter Cypher queries to look up stuff.

RVB: 09:06 Super cool. Very good. So, where is it going, Ashley? Where are you guys going to take Neo4j in the future? Any perspectives on that? I'd love to know more about that.

AS: 09:17 Well, Neo4j's already a really integral part of MacGyver and we're just using it to hold everything together and-- MacGyver also has become a central point of information. As I said, as we move into AWS, we're going to keep putting all that stuff into Neo4j. We also are using it to track our asset management, and even as I was saying before, the infrastructure mapping. We could add network and database components into that and get-- just keep building the infrastructure map, keep building our AWS map. Another thing that's on the road map is to utilize those app instance check-ins to create a service registry for all of our apps. That way, we can keep track of who owns this app, or maybe if we map it to get [?] what Repo is it? What Jenkins job does this correspond to? Is this out public? So, we'll have a service registry of all our apps, where people can go and just find out info that otherwise would be difficult to pin down.

RVB: 10:22 Super cool. I think it's a great use case for Neo4j, and I'm so happy that you guys found your way to it and are getting good use out of Neo4j. So, it's really--

AS: 10:34 Yeah, me too.

RVB: 10:34 Yeah, it's really great and I think we'll wrap up the podcast for now. I really want to thank you for coming online and talking to us about it.

Friday, 4 December 2015

This week I finally got round to updating and "voice-overing" my Introductory talk about Neo4j and Graph Databases. As you would expect, it's a bit different from some of the early introductory talks, and has a lot more examples and use cases mentioned in it. I would be curious to learn what you think about it - so PRESS PLAY below, sit back and RELAX.

Thursday, 3 December 2015

Last GraphConnect in , I spent some time at the GraphClinic helping lots of interested attendees get the most out of Neo4j. I really enjoyed it, also because for a good part of the time, I shared the "clinic" with one of my colleagues, Will Lyon, who is working in our Developer Evangelism team. Will has been working on lots of cool stuff with Neo4j for the longest time, and has plenty of stuff to share and discuss. So we got on a Skype call - and ... chatted away... here's the result:

Here's the transcript of our conversation:

RVB: 00:00 Hello everyone, my name is Rik, Rik Van Bruggen from Neo Technology and here we are again recording a Neo4j graph database podcast. It's been a while since we've been doing recordings, and tonight I'm joined by Will Lyon, all the way from California. Hi Will.

WL: 00:19 Hi Rik, thanks for having me.

RVB: 00:20 Hey, good to have you on the call. I thank you for joining us. Will, I've read a bunch of your blog posts and I've seen a bunch of your work but many people may not have seen it yet, so why don't you introduce yourself to get us going?

WL: 00:36 Sure, thanks. I'm Will Lyon, I'm on the developer relations team at Neo. That means that it's my job to help encourage awareness and drive adoption of Neo4j and also graph databases in general. So, I do this by writing blog posts that talk about Neo4j and graph databases, building cool demo apps, integrating with other technologies, proving out new use cases. For example, earlier this week I was at the QCon conference in San Francisco talking to our users there. Tomorrow, I'll be giving a webinar about using Neo4j and MongoDB together.

RVB: 01:18 Wow! Super cool. And then, how long have you been working with Neo, just as a community member, Will? Quite some time, right?

WL: 01:25 Quite some time, I joined the company just in September. So, I have been with Neo Technology for about two months now. Prior to that, I was working as a software developer for a couple of start ups and always trying to work Neo into the job.

RVB: 01:43 That's very cool. Well, that also immediately begs the question, why, right? Why were you trying to work Neo into your job all the time? What attracted you to it, I suppose?

WL: 01:55 Sure. The first time I was exposed to Neo was a few years ago at a hackathon over the weekend, and the team I was working with, we needed a project. We had read a blog post about building recommender systems with Neo4j, this graph database thing. I didn't know anything about graph databases or collaborative filtering recommender systems, but I thought it sounded interesting. So, we tackled this project over the course of the hackathon and we were able to build a GitHub repository recommender system. So, it looked at your previous activity on GitHub as an open source contributor and recommended other repositories that you might be interested in. It was a really fun project to put together, and I was amazed at how sort of easy it was to get going with Neo4j and Cypher, the query language, and actually build this application. At the end of the weekend, it worked and we went on to actually win the hackathon. So, I was sort of--

RVB: 03:02 Wow! That's cool.

WL: 03:03 Yeah. I was hooked from that point on. What I really liked about Neo is the way that you think about the data model with graph data is very close to how we think about data in the real world. So you have this very close mental map. It seems very intuitive when we're thinking about our data model. For example, Rik is my co-worker, I'm at a conference, the conference is in San Francisco. These are all entity nodes and relationships, and so it [crosstalk]-- so, it seems very easy to express very complex data models. We don't have this weird transformation that we have to go through.

RVB: 03:52 Absolutely. What made it so productive then to implement that recommender system? What was it that made-- is it just the model or is it also Cypher? What made it so easy to develop with, in that particular case?

WL: 04:05 Sure, I think, really, Cypher was the biggest thing for us, and just being able to define the problem that we were trying to solve as a traversal to this graph, and being able to very clearly define that pattern in a Cypher query and get that back right away. It was actually very easy to build something that was not quite trivial.
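
To give a feel for what "defining the problem as a traversal" means for a recommender like the one Will describes, here is a small sketch in plain Python. It is not the hackathon code and it is not Cypher. The user names, repository names and the scoring rule are all assumptions made up for illustration. The traversal it imitates is: start at a user, follow contributions to repositories, follow those back to other contributors, then out to the repositories those people contributed to that the user has not touched yet.

```python
from collections import Counter


def recommend(contributions, user, limit=3):
    """Recommend repositories by walking user -> repo <- other user -> other repo.

    `contributions` maps each user to the set of repositories they contributed to.
    A candidate's score is the number of such paths that reach it.
    """
    mine = contributions.get(user, set())
    scores = Counter()
    for other, repos in contributions.items():
        shared = mine & repos
        if other == user or not shared:
            continue  # no path from `user` to `other` through a common repository
        for repo in repos - mine:
            scores[repo] += len(shared)  # one path per shared repository
    # Highest score first; ties broken alphabetically so the result is stable.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [repo for repo, _ in ranked[:limit]]


# Hypothetical example data.
contributions = {
    "alice": {"repo-a", "repo-b"},
    "bob": {"repo-a", "repo-b", "repo-c"},
    "carol": {"repo-a", "repo-d"},
    "dave": {"repo-e"},
}
suggestions = recommend(contributions, "alice")
```

For alice, repo-c comes first (bob shares two repositories with her), repo-d second (carol shares one), and repo-e is never reached because dave has nothing in common with her. In a graph database the whole function collapses into a single pattern in the query, which is the productivity gain Will is pointing at.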

RVB: 04:36 Yeah, I know. I understand. Well, I've seen some of your other hackathon work, like for example, that thing that you built to fire multiple Cypher queries, and now you're working on something really interesting to import CSVs, you told me, right?

WL: 04:51 Sure. So, on the developer relations team, one thing that we're focused on is the new user experience. So, for users seeing Neo4j for the first time, what's the first thing they want to do? Well, a lot of times that's play with their own data. And so, we are trying to make that process of importing your data into Neo4j much easier. So, one of the projects I'm working on is a web application that guides the user through the process of converting their CSV files into a graph data model, and then allows them to quickly execute those against the Neo4j instance to import their data.
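
As a rough illustration of what "converting CSV files into a graph data model" involves, here is a minimal sketch in Python using only the standard library. It has nothing to do with the web application Will mentions and does not connect to Neo4j. The function name, the column names and the WORKS_AT relationship type are assumptions for the example. The core decision is the one such a tool has to guide you through: which columns become nodes, and what relationship connects them.

```python
import csv
import io


def csv_to_graph(csv_text, source_col, target_col, rel_type):
    """Turn each CSV row into two nodes and one relationship between them.

    Returns (nodes, relationships): a set of node names and a list of
    (source, rel_type, target) triples in row order.
    """
    nodes = set()
    relationships = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        source, target = row[source_col], row[target_col]
        nodes.update((source, target))  # a value repeated across rows is one node
        relationships.append((source, rel_type, target))
    return nodes, relationships


# Hypothetical example data.
sample = "person,company\nAlice,Acme\nBob,Acme\n"
nodes, relationships = csv_to_graph(sample, "person", "company", "WORKS_AT")
```

Two rows yield three nodes, not four, because "Acme" appears in both rows but is the same thing in the graph. That deduplication is exactly where a flat file and a graph model start to differ.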

RVB: 05:29 You mean it's going to be even easier than with LOAD CSV, then?

WL: 05:33 That's right [laughter], exactly.

RVB: 05:34 That's super cool. I mean, I've been with Neo a couple of years now, and when I started it was a brutal experience [chuckles]. It's gotten so much easier, and it's going to get even more easy. So, that's great to hear. Thanks for that.

WL: 05:51 Yeah. Absolutely.

RVB: 05:52 Very cool. So, Will, one of the topics that we always cover on this podcast is, where is it going? What are the big things that you see coming up and you would love to see happen in Graphistania, as we call it sometimes [chuckles]. Where do you see this going? What's your perspective on that?

WL: 06:13 Sure. I think we're at a really interesting time now where we're seeing lots of improvements in the technology - Neo4j, graph databases in general, around performance - but also around the APIs that we're using to interact with graph data. So things like Cypher, it's becoming much cleaner, much easier to work with. And I think this investment in the technology is really indicative of a larger trend in applications in general. Users are expecting more from our applications. So, let's take e-commerce as an example. Browsing and searching and filtering are great, but users are really expecting things like personalized recommendations in their e-commerce platform, and a great way to generate those is with a graph database. Same with things like content delivery, we expect personalized content recommendations. So, I really think we're seeing the case where going forward, we're going to see graph databases used in more and more applications, used alongside more and more technologies, and it will feel very natural and easy to use Neo4j in your modern application stack.

RVB: 07:31 Does that mean things like availability of Neo4j to other development platforms as well? Not just Java, .NET, and all those types of things as well. Is that part of that?

WL: 07:46 Sure, absolutely. I think with Cypher, that's becoming much easier now. It's very easy to shoot a Cypher script to Neo4j server from a .NET environment, from a Python application. We're really seeing a standardization around the API there.

RVB: 08:06 Well, I'm really looking forward to it, as you are, I imagine [chuckles]. What I'll do is when we write up the podcast and transcribe it, we'll put a bunch of links to some of your work and all the other developer evangelists' work in the article so that people can find their way around even more easily. So, thank you so much Will for coming on the podcast, really appreciate it. I'll wrap up here and I look forward to seeing you at an event very soon.