It took Max De Marzi, Neo Technology software field engineer, one weekend to build his own Facebook graph search with Neo4j. At Big Data TechCon 2014, he explained to developers how they could do the same and, more importantly, why they should. With De Marzi’s tools and strategies -- namely, Neo4j and Cypher -- developers can build a graph search without, as De Marzi put it, querying hundreds of servers and putting in six months of work.
In the following Q&A, De Marzi explains how he painlessly built a graph search and offers use cases for why this technology would be an asset to the enterprise. This Q&A has been edited for length, clarity and editorial style.
Could you tell us about your session on building your own Facebook graph search?
Max De Marzi: Sure. So the title is just that: How to build your own Facebook graph search -- without querying thousands of servers or putting in six months of work. It’s a project I did over a weekend. It was actually inspired and requested by a meetup organizer in Boston who heard about Facebook graph search last year and wanted to replicate it in Neo4j.
How did you build your own Facebook graph search?
De Marzi: I’m a Ruby developer at heart, and one of the nice things about Ruby is that someone, somewhere, has written a library to help you out -- so I asked Google for help finding that library. I also cheated a little bit: I don’t use natural language processing in the way that Facebook does. What I do is build a grammar that looks, feels and tastes like English. So you can just type in English what you’re looking for and get that to translate into a Cypher query which, in turn, graphs your answer.
What are some use cases for this technology?
De Marzi: The technology is really meant for end users. So you’re looking at people who are not going to be writing SQL or Cypher queries. These are people who want to get to the information quickly, and they have a domain that they understand -- whether it’s people, medicine, servers, whatever it is they know. They can ask things like, 'Find me the things connected to this server, or the people who like this, or the drugs that interact with this disease.' So they can speak in their own language, and that will translate into a graph query. It will make their lives a lot easier, and you can get more insights to the end user quickly.
What skills would developers need to build their own graph search?
De Marzi: It’s written in Ruby, and there’s a version out there that somebody wrote in Java, but the idea is that you’re going to be building a parser that’ll take your language and translate it into Cypher, so there’s not a lot going on. You have to learn Cypher so you know what to ask it. You have to know Neo4j so you can save the data in the database and be able to query it back. Then just pick your favorite language for the UI, whether you’re a Python developer or a Ruby developer -- it doesn’t really matter. You’re just passing a string and showing the output in some way.
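To make the "string in, Cypher out" idea concrete, here is a minimal sketch in Python. It is not De Marzi's Ruby implementation: it swaps a real parsing library for two regular-expression rules, and the phrase patterns, the function name to_cypher, and the graph schema (Person and Thing nodes, FRIENDS and LIKES relationships) are all assumptions made up for illustration.

```python
import re

# Hypothetical phrase patterns and graph schema (Person, Thing, FRIENDS, LIKES).
# None of these names come from De Marzi's project.
RULES = [
    (
        re.compile(r"^friends of (?P<name>.+?) who like (?P<thing>.+)$", re.IGNORECASE),
        "MATCH (p:Person {name: $name})-[:FRIENDS]-(f:Person)"
        "-[:LIKES]->(t:Thing {name: $thing}) "
        "RETURN DISTINCT f.name AS name",
    ),
    (
        re.compile(r"^people who like (?P<thing>.+)$", re.IGNORECASE),
        "MATCH (f:Person)-[:LIKES]->(t:Thing {name: $thing}) "
        "RETURN DISTINCT f.name AS name",
    ),
]


def to_cypher(sentence):
    """Translate a constrained English sentence into (cypher, params).

    Raises ValueError when no rule in the toy grammar matches.
    """
    # Collapse whitespace and drop trailing punctuation before matching.
    text = " ".join(sentence.split()).rstrip("?.! ")
    for pattern, cypher in RULES:
        match = pattern.match(text)
        if match:
            # Captured words travel as query parameters, not as query text.
            return cypher, match.groupdict()
    raise ValueError(f"Sentence not covered by the grammar: {sentence!r}")


if __name__ == "__main__":
    query, params = to_cypher("Friends of Max De Marzi who like Neo4j")
    print(query)
    print(params)
```

The user's words are returned as parameters rather than spliced into the query text, so a UI in any language only has to pass the query and the parameters to the database and display the rows that come back.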
Why was Neo4j the best tool to use for this project?
De Marzi: The graph understands how things are connected and understands how some people reference their data. You’re not searching for one specific thing; you’re searching for how they’re connected. For example, you may not search for David Lee, because there may be a thousand David Lees in the world. But if you search for a David Lee who’s connected to me somehow, there may be only three of them, who happen to be friends of friends. So it’s a different way to ask for something. You’re not looking for general stuff. You’re looking for connections, and the graph makes the most sense as far as where to put your data.
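The David Lee example can be sketched as a connection-scoped search. The Python below is only an illustration under assumed names: the Cypher string uses a hypothetical schema (Person nodes with id and name properties, FRIENDS relationships), and the small in-memory graph is invented data that mimics what such a query would do, namely keep only the people with a given name who are within two friendships of the person asking.

```python
from collections import deque

# Roughly equivalent Cypher under the assumed schema (not from the interview):
CYPHER = (
    "MATCH (me:Person {id: $me})-[:FRIENDS*1..2]-(p:Person {name: $name}) "
    "WHERE p <> me RETURN DISTINCT p.id AS id"
)

# Toy data, invented for illustration: id -> name, and undirected friendships.
NAMES = {1: "Max", 2: "Ann", 3: "David Lee", 4: "David Lee", 5: "David Lee", 6: "Bob"}
FRIENDS = {1: {2, 3}, 2: {1, 4}, 3: {1}, 4: {2}, 5: {6}, 6: {5}}


def connected_by_name(start, name, max_hops=2):
    """Return ids of people called `name` within `max_hops` friendships of `start`."""
    seen = {start}
    queue = deque([(start, 0)])
    found = []
    while queue:
        person, hops = queue.popleft()
        if person != start and NAMES[person] == name:
            found.append(person)
        # Only expand further while we are inside the hop limit.
        if hops < max_hops:
            for friend in sorted(FRIENDS.get(person, ())):
                if friend not in seen:
                    seen.add(friend)
                    queue.append((friend, hops + 1))
    return sorted(found)


if __name__ == "__main__":
    everyone = sorted(i for i, n in NAMES.items() if n == "David Lee")
    print("By name only:", everyone)
    print("Connected to person 1:", connected_by_name(1, "David Lee"))
```

In this toy graph a name-only lookup returns every David Lee, while the connection-scoped search drops the one who has no path to the person asking.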