Graphs are data structures that are highly useful for understanding and representing many real-world problems in areas such as business, government, and science.
To take advantage of graph databases we don't need a master's degree in graph theory. Instead, we need to understand what a graph is and be able to build one by drawing it on paper.
Mathematically speaking, a graph is just a collection of vertices and edges. Or, if you don't like math, a set of nodes and the relationships that connect them. Graphs represent entities as nodes (vertices), and the ways those entities relate to each other are expressed as relationships (edges).
If you stop now and think about it, this structure allows us to model countless scenarios, from commercial systems to more complex problems such as optimization algorithms.
This graph model is formally known as a Property Graph. A property graph has the following characteristics:
- It contains nodes and relationships.
- Nodes contain properties (key-value pairs).
- Relationships are named and directed, and always have a start and end node.
- Relationships can also contain properties (key-value pairs).
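To make those four characteristics concrete, here is a minimal in-memory sketch in plain Ruby. It is not how Neo4j stores data internally; the names Node, Relationship, and PropertyGraph are hypothetical and exist only for this illustration.

```ruby
# A minimal in-memory property graph. All names here (Node, Relationship,
# PropertyGraph) are hypothetical and chosen only for illustration.
Node = Struct.new(:id, :properties)
Relationship = Struct.new(:type, :start_node, :end_node, :properties)

class PropertyGraph
  attr_reader :nodes, :relationships

  def initialize
    @nodes = []
    @relationships = []
  end

  # Nodes carry properties as key-value pairs.
  def add_node(properties = {})
    node = Node.new(@nodes.size + 1, properties)
    @nodes << node
    node
  end

  # Relationships are named (type), directed (start -> end), and may carry
  # properties of their own.
  def relate(start_node, type, end_node, properties = {})
    rel = Relationship.new(type, start_node, end_node, properties)
    @relationships << rel
    rel
  end

  # Follow the outgoing relationships of a given type from a node.
  def outgoing(node, type)
    @relationships
      .select { |rel| rel.start_node.equal?(node) && rel.type == type }
      .map(&:end_node)
  end
end

graph = PropertyGraph.new
alice = graph.add_node(name: "Alice")
bob   = graph.add_node(name: "Bob")
graph.relate(alice, :KNOWS, bob, since: 2010)

puts graph.outgoing(alice, :KNOWS).map { |n| n.properties[:name] }.inspect
# prints ["Bob"]
```

Because relationships are directed, asking for Bob's outgoing KNOWS relationships returns nothing, even though Alice knows Bob.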
Despite being intuitive and easy to understand, the property graph model can be used to describe almost all graph use cases.
As you are probably thinking, graph databases use the graph model to store data as a graph, with a structure consisting of vertices and edges, the two entities used to model any graph. In addition, you can use algorithms from the long history of graph theory to solve graph problems, often in less time than with relational database queries. I'll be covering some of them in my next posts.
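As a small taste of the kind of algorithm graph theory offers, here is a sketch of breadth-first search finding a shortest path (fewest relationships) between two nodes. It is a generic textbook version in plain Ruby over an adjacency list, not Neo4j's own implementation, and the sample data and the shortest_path name are made up for the example.

```ruby
# Breadth-first search for a shortest path (fewest relationships) in a
# directed graph stored as an adjacency list. The sample data is made up.
def shortest_path(adjacency, source, target)
  previous = { source => nil } # doubles as the set of visited nodes
  queue = [source]

  until queue.empty?
    current = queue.shift
    break if current == target

    adjacency.fetch(current, []).each do |neighbor|
      next if previous.key?(neighbor)

      previous[neighbor] = current
      queue << neighbor
    end
  end

  return nil unless previous.key?(target)

  # Walk back from the target to the source to rebuild the path.
  path = []
  node = target
  while node
    path.unshift(node)
    node = previous[node]
  end
  path
end

friends = {
  "Alice" => ["Bob", "Carol"],
  "Bob"   => ["Dave"],
  "Carol" => ["Dave"],
  "Dave"  => ["Erin"]
}

puts shortest_path(friends, "Alice", "Erin").inspect
# prints ["Alice", "Bob", "Dave", "Erin"]
```

The function returns nil when no path exists, for example from "Erin" back to "Alice" in this directed sample.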
Now, talking specifically about Neo4j: it is an open-source graph database supported by Neo Technology that stores data using the Property Graph model. It is reliable, with full ACID transactions; expressive, with a powerful, human-readable graph query language called Cypher; and simple, accessible through a convenient REST interface or an object-oriented Java API.
Enough theory and talking for now. Let's prepare our environment to play a little with Neo4j, and build a simple Rails application.
Installing Neo4j on a development machine is very easy. If you are on OS X and using Homebrew, go ahead and run this in a terminal window:
brew install neo4j
Or, if you prefer, follow these five steps:
- Download the Neo4j Community package.
- Unzip it into your installation folder, let's say
- Create a symbolic link named neo4j to the unzipped folder. For instance:
  ln -s ~/Applications/neo4j-community-2.1.2 ~/Applications/neo4j
- Create an environment variable named NEO4J_HOME pointing to this symbolic link.
- Change the PATH environment variable, adding the
This way, in the future, when you want to update the Neo4j database on your machine, you can just download the new version, unpack it, and point the symbolic link at the new version.
When you have it installed, open a terminal window and type:
neo4j start
This command will start the Neo4j server on your machine. Now go check it in your browser by visiting http://localhost:7474/. You'll be presented with a super nice administration panel, where you can visualize the data stored in your Neo4j instance, manipulate data using the Cypher Query Language, check the instance configuration, and more.
Neo4j is built on top of Java and the rock-solid JVM. As we want to use (MRI) Ruby on Rails here, let's connect our app using Neo4j's REST API.
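To get a feel for what talking to the REST API looks like without any gem, here is a rough sketch using only Ruby's standard library. It assumes a Neo4j 2.x server on the default http://localhost:7474 and its transactional Cypher endpoint at /db/data/transaction/commit; the cypher_request helper and the Person example are hypothetical. The code only builds the request, so it runs even when no server is up.

```ruby
require "json"
require "net/http"
require "uri"

# Build (but do not send) a request that runs one Cypher statement through
# the transactional HTTP endpoint of a Neo4j 2.x server. The base URL is an
# assumption: a default local install listening on port 7474.
def cypher_request(statement, parameters = {}, base_url = "http://localhost:7474")
  uri = URI.join(base_url, "/db/data/transaction/commit")
  request = Net::HTTP::Post.new(uri)
  request["Content-Type"] = "application/json"
  request["Accept"] = "application/json"
  request.body = JSON.generate(
    statements: [{ statement: statement, parameters: parameters }]
  )
  request
end

# Neo4j 2.x writes Cypher parameters in curly braces, e.g. {name}.
request = cypher_request(
  "CREATE (p:Person {name: {name}}) RETURN p",
  { name: "Alice" }
)

puts request.path
puts request.body

# To actually send it, with the server already started:
#   response = Net::HTTP.start("localhost", 7474) { |http| http.request(request) }
#   puts response.body
```

Passing values as parameters instead of interpolating them into the Cypher string keeps the statement reusable and avoids quoting problems.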
To make things simpler, we'll use the awesome gem (surprisingly) called neo4j, from @andreasronge. Version 2.x is the stable version, but here we will use the gem directly from the master branch, where version 3 is under active development and which enables us to use MRI Ruby, connecting to Neo4j via its REST interface. If you are into JRuby, you can even use the stable version and connect using the embedded DB (through the filesystem), which means a Neo4j instance running on the same JVM as your app.
This blog post was intended to introduce you to the world of graph databases, giving some basic theory about graphs and a practical hands-on with Neo4j and Rails.
In future posts, expect to read more about Neo4j, the Cypher Query Language, and traversal algorithms.
So, what about learning by doing? And remember: graphs are everywhere!