Starting with Graph Databases: A Quick Look at Neo4j

Blue image with neo4j written over it.

Let’s talk about Neo4j, a graph database that has recently attracted a significant number of fans.

My goal is to give you a brief overview (I promise to be quick) of how it works, with some examples to make it more tangible.

We are used to writing data in tables and relating them through primary keys, so when you look directly at the data, you only see IDs.

Graph databases were designed mainly to avoid this.

Their purpose is to let you fully understand the data just by looking at it.

Example of a simple school database diagram.

First of all, to understand the advantage of graph databases, it helps to know a few concepts that reinforce the idea of keeping software close to the business rules.

For that, it is worth reading about Domain-Driven Design, which holds that the model should be as close as possible to the business.

In Neo4j, that is done with Cypher: a declarative, SQL-inspired language for describing visual patterns in graphs using an ASCII-art syntax.

It allows us to state what we want to select, insert, update or delete from our graph data without requiring us to describe exactly how to do it.

To be more practical, consider the example below, where the same query is written in SQL and in Cypher.

 
SELECT f.* FROM students s
INNER JOIN person p
ON p.id = s.person_id
INNER JOIN friend f
ON f.friend_from_id = p.id
WHERE s.id = 45
ORDER BY p.name

 
MATCH
(student:Student)-[:FRIEND]->(person:Person)
WHERE student.id = 45
RETURN person
ORDER BY person.name

The query above searches for a specific student’s friends. In the traditional approach (SQL), we need to understand the database structure and how the tables relate to each other, which leads to more JOINs. In Cypher, the relationships are more intuitive to read.
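To see what the SQL version actually hands back, here is a minimal sketch using Python’s built-in sqlite3 module. The post only shows the query, so the schema is an assumption: the friend_to_id column and the sample rows (including the name "Ana") are hypothetical and exist only to make the query runnable.

```python
import sqlite3

# Hypothetical schema: only the columns referenced by the query come from
# the post; friend_to_id and all sample rows are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person   (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE students (id INTEGER PRIMARY KEY,
                           person_id INTEGER REFERENCES person(id));
    CREATE TABLE friend   (friend_from_id INTEGER REFERENCES person(id),
                           friend_to_id   INTEGER REFERENCES person(id));
    INSERT INTO person   VALUES (1, 'Natam'), (2, 'Natalia'), (3, 'Ana');
    INSERT INTO students VALUES (45, 1);
    INSERT INTO friend   VALUES (1, 2), (1, 3);
""")

# The same SQL query as in the post.
rows = conn.execute("""
    SELECT f.* FROM students s
    INNER JOIN person p ON p.id = s.person_id
    INNER JOIN friend f ON f.friend_from_id = p.id
    WHERE s.id = 45
    ORDER BY p.name
""").fetchall()

print(sorted(rows))  # [(1, 2), (1, 3)]
```

Under these assumptions the result is just pairs of IDs. Getting the friends’ names would take one more join against person, which is exactly the kind of indirection the Cypher version avoids.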

Graph databases are a more natural way of storing the data.

You don’t have to worry about tables and foreign keys: everything fits within two simple concepts, nodes and relationships.

Each node or relationship can have its own properties. Nodes also carry labels, and relationships carry a type, which are ways to categorize the data.
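To make the two concepts concrete, here is a small sketch in plain Python. It is not how Neo4j stores data internally; the Node and Relationship classes and the neighbours helper are hypothetical names used only to illustrate the model, and the sample data mirrors the school example used later in this post.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    labels: set                      # e.g. {"Person", "Student"}
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    type: str                        # e.g. "KNOWS"
    start: Node
    end: Node
    properties: dict = field(default_factory=dict)


natam = Node({"Person", "Student"}, {"name": "Natam"})
natalia = Node({"Person", "Teacher"}, {"name": "Natalia"})
math = Node({"Course"}, {"title": "Math"})

relationships = [
    Relationship("KNOWS", natam, natalia),
    Relationship("TEACHES", natalia, math),
    Relationship("ENROLLED", natam, math),
]


def neighbours(node, rel_type):
    """Follow outgoing relationships of one type from a node."""
    return [r.end for r in relationships
            if r.start is node and r.type == rel_type]


courses = [n.properties["title"] for n in neighbours(natam, "ENROLLED")]
print(courses)  # ['Math']
```

Notice that there are no IDs or foreign keys anywhere: a relationship points directly at the nodes it connects, so following it gives you the data itself.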

To make it easier to understand, let’s map a school in a simple way, where we have students, teachers and courses.

How would you represent this in a diagram? And in a graph database? It is actually simpler than you think! Just draw it:

Now, hands-on!

You can install Neo4j on your computer or run it with Docker (the fragment below is part of a Docker Compose service definition):

 
image: neo4j:3.4.5
ports:
  - "7474:7474"
  - "7473:7473"
  - "7687:7687"

Don’t forget to follow Neo4j’s naming rules and recommendations.

Creating our first nodes:

 
CREATE (user1:Person:Student { name: "Natam" })
CREATE (user2:Person:Teacher { name: "Natalia" })
CREATE (course:Course { title: "Math" })

Now let’s relate them:


MATCH (s:Student {name:"Natam"}), (t:Teacher {name:"Natalia"}), (c:Course {title:"Math"})
CREATE (s)-[k:KNOWS]->(t), (t)-[ts:TEACHES]->(c), (s)-[e:ENROLLED]->(c)

Now that we have our data, we need to query it. Shall we?


MATCH (n) RETURN n

This command returns every node, which gives you an overview of your graph.

You can apply more filters and conditions to your query. Check out the Neo4j Cypher Manual.

Conclusion already?

Yes! The purpose of this post is for you to get to know this incredible technology and see how simple its structure is.

If you found it interesting and want to learn more about it, try modeling applications you already know using Neo4j.

I’m sure you’ll be even more surprised!

I will leave a few links in case you want to learn more about the concepts related to this technology.

REFERENCES

About the author.

Natam Oliveira

VP of Engineering at Cheesecake Labs - AI / IoT enthusiast. Go bravely where no one has ever gone before!