Starting with Graph Databases: A Quick Look into Neo4j
Summary
Neo4j is a graph database that uses nodes and relations instead of tables and foreign keys, aiming to make data structure more intuitive and closer to business rules.
It uses Cypher, a declarative SQL-inspired language with ASCII syntax, which simplifies queries compared to SQL by reducing the need for JOINs when navigating relationships.
The post demonstrates basic usage by modeling a school scenario with students, teachers, and courses, including installation via Docker and example commands to create nodes, establish relationships, and query data.
Let’s talk about Neo4j, a graph database that has recently attracted a significant number of fans.
My goal is to give you a brief overview (I promise to be quick) of how it works, with some examples to make it more tangible.
We are used to storing data in tables and relating them through primary and foreign keys, so when you look directly at the data, you only see IDs.
Graph databases were designed mainly to avoid this.
Their purpose is to let you understand the data completely just by looking at it.
Example of a simple school database diagram.
First of all, to understand the advantage of graph databases, it helps to know a few concepts that reinforce the idea of keeping software close to the business rules.
For that, it is worth reading about Domain-Driven Design, which holds that the model should be as close as possible to the business.
In Neo4j, that is done with Cypher: a declarative, SQL-inspired language for describing visual patterns in graphs using an ASCII-art syntax.
It allows us to state what we want to select, insert, update or delete from our graph data without requiring us to describe exactly how to do it.
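To make the four operations concrete, here is a minimal sketch of each one in Cypher. The Student label and the id and name properties follow the query used later in this post; the values themselves are made up for illustration, and each statement is meant to be run on its own (for example in the Neo4j Browser).

```cypher
// Insert: create a node with a label and some properties.
CREATE (s:Student {id: 45, name: 'Alice'});

// Select: describe the pattern you want and return what matched.
MATCH (s:Student {id: 45})
RETURN s.name;

// Update: change a property on the matched node.
MATCH (s:Student {id: 45})
SET s.name = 'Alice Smith';

// Delete: remove the node together with any relationships attached to it.
MATCH (s:Student {id: 45})
DETACH DELETE s;
```

In every case we only describe the shape of the data we are interested in; how Neo4j finds it is left to the database.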
To be more practical, consider the example below, where the same “query” is implemented with SQL and Cypher.
SELECT f.* FROM students s
INNER JOIN person p
ON p.id = s.person_id
INNER JOIN friend f
ON f.friend_from_id = p.id
WHERE s.id = 45
ORDER BY p.name
MATCH
(student:Student)-[:FRIEND]->(person:Person)
WHERE student.id = 45
RETURN person
ORDER BY person.name
The “query” above searches for a specific student’s friends. In the traditional approach (SQL), we need to understand the database structure and how the tables relate, which leads to heavier use of JOINs. In Cypher, the relations are more intuitive to read.
Graph databases are a more natural way of storing the data.
You don’t have to worry about tables and foreign keys; everything is kept within two simple concepts: nodes and relations.
Each node or relation can have its own attributes (called properties in Neo4j). Nodes also carry labels and relations carry a type, both of which are ways to categorize the data.
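As a small illustration of these two concepts, the sketch below creates two nodes and one relation between them, reusing the Student, Person and FRIEND names from the query above. The property values and the since property on the relation are assumptions added only for this example.

```cypher
// Two nodes, each with a label and its own properties.
CREATE (a:Student {id: 45, name: 'Alice'})
CREATE (b:Person {name: 'Bob'})
// One relation: it has a direction, a type (FRIEND) and, optionally, properties.
CREATE (a)-[:FRIEND {since: 2018}]->(b);
```

With this data in place, the friends query shown earlier should return the Bob node, because it is the only Person reachable from student 45 through a FRIEND relation.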
To make it easier to understand, let’s map a school in a simple way, where we have students, teachers and courses.
How would a diagram represent this? Would you do it in a graph database? It is actually simpler than you think! Just draw it:
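The drawing translates almost directly into Cypher. What follows is only a sketch: the TEACHES and ENROLLED_IN relation types, the property names and all the values are assumptions chosen for illustration, not something prescribed by Neo4j.

```cypher
// Nodes: one teacher, one course and two students (all names are made up).
CREATE (t:Teacher {name: 'Ms. Rivera'})
CREATE (c:Course {title: 'Mathematics'})
CREATE (s1:Student {name: 'Alice'})
CREATE (s2:Student {name: 'Bruno'})
// Relations: who teaches the course and who attends it.
CREATE (t)-[:TEACHES]->(c)
CREATE (s1)-[:ENROLLED_IN]->(c)
CREATE (s2)-[:ENROLLED_IN]->(c);

// Which students does each teacher reach through a course?
MATCH (t:Teacher)-[:TEACHES]->(c:Course)<-[:ENROLLED_IN]-(s:Student)
RETURN t.name, c.title, s.name
ORDER BY s.name;
```

Note how the MATCH pattern reads like the diagram itself: circles become parentheses and arrows become relations in square brackets.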
What is Neo4j?
Neo4j is a graph database that has recently attracted a significant number of fans. It is designed so that when looking at the data, you have a complete understanding of it, rather than only seeing IDs as in traditional table-based databases.
What is Cypher?
Cypher is a declarative, SQL-inspired language used in Neo4j for describing visual patterns in graphs using an ASCII-art syntax. It allows you to state what you want to select, insert, update, or delete from graph data without requiring you to describe exactly how to do it.
How does a query in Cypher compare to SQL?
In SQL, querying related data (such as a student's friends) requires understanding the database structure and using multiple JOINs. In Cypher, relations are more intuitive to read. For example, to find a student's friends: MATCH (student:Student)-[:FRIEND]->(person:Person) WHERE student.id = 45 RETURN person ORDER BY person.name
What are the core concepts in a graph database like Neo4j?
Graph databases keep everything within two simple concepts: nodes and relations. Each node or relation can have its own attributes and labels of identification, which serve as a way to categorize the data. You don't have to worry about tables and foreign keys.
How can I install Neo4j?
You can install Neo4j on your computer or use Docker with the image neo4j:3.4.5, exposing ports 7474, 7473, and 7687.