Starting with Graph Databases: A Quick Look into Neo4j
Let’s talk about Neo4j, a graph database that has recently attracted a significant number of fans.
My goal is to give you a brief (I promise to be quick) picture of how it works, with some examples to make it more tangible.
We are used to storing data in tables and relating them through primary and foreign keys, so when you look directly at the data, all you see are IDs.
Graph databases were designed mainly to avoid this.
Their purpose is to let you understand the data completely just by looking at it.
First of all, to understand the advantage of graph databases, it helps to know a few concepts that reinforce the idea of keeping software close to the business rules.
For that, it is worth reading about Domain-Driven Design, which holds that the model should be as close as possible to the business.
In Neo4j, that is done with Cypher: a declarative, SQL-inspired language for describing visual patterns in graphs using ASCII-art syntax.
It allows us to state what we want to select, insert, update or delete from our graph data without requiring us to describe exactly how to do it.
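To give a first taste of that, here is a minimal sketch of the four operations in Cypher. The `Student` label, the `id` and `name` properties and the values are just assumptions for illustration; they are not part of any real dataset.

```cypher
// Insert: create a node with a label and two properties (hypothetical data)
CREATE (s:Student {id: 45, name: 'Ana'});

// Select: describe the pattern you want and what to return
MATCH (s:Student {id: 45})
RETURN s.name;

// Update: find the node, then change one of its properties
MATCH (s:Student {id: 45})
SET s.name = 'Ana Souza';

// Delete: remove the node together with any relationships attached to it
MATCH (s:Student {id: 45})
DETACH DELETE s;
```

Notice that none of these statements says how to find the node, only which node we mean. That is the declarative part.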
To be more practical, consider the example below, where the same “query” is implemented in SQL and in Cypher.
SELECT f.* FROM students s
INNER JOIN person p
ON p.id = s.person_id
INNER JOIN friend f
ON f.friend_from_id = p.id
WHERE s.id = 45
ORDER BY p.name
MATCH
(student:Student)-[:FRIEND]->(person:Person)
WHERE student.id = 45
RETURN person
ORDER BY person.name
Above is a “query” that searches for a specific student’s friends. In the traditional approach (SQL), we need to understand the database structure and how the tables relate to each other, which leads to heavier use of JOINs. In Cypher, the relationships are written directly into the pattern, which makes them more intuitive to read.
Graph databases are a more natural way of storing data.
You don’t have to worry about tables and foreign keys; everything fits within two simple concepts: nodes and relationships.
Each node or relationship can have its own attributes, and nodes carry labels (relationships carry a type) that identify and categorize the data.
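As a rough sketch of those two concepts, the statement below creates two nodes and connects them. The names `Ana` and `Bruno` and the `since` attribute are made up for this example; only the `Student`, `Person` and `FRIEND` names come from the query shown earlier.

```cypher
// Two nodes, each with a label (Student, Person) and its own attributes
CREATE (ana:Student {id: 45, name: 'Ana'})
CREATE (bruno:Person {name: 'Bruno'})
// One relationship between them, with a type (FRIEND) and its own attribute
CREATE (ana)-[:FRIEND {since: 2019}]->(bruno);
```

With this data in place, the friends query from earlier, filtered on student 45, would match Bruno.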
To make it easier to understand, let’s map a school in a simple way, where we have students, teachers and courses.
How would a diagram represent this? And how would you do it in a graph database? It is actually simpler than you think! Just draw it: