Starting with Graph Databases: A Quick Look into Neo4j

Summary
  • Neo4j is a graph database that stores data using nodes and relationships instead of traditional tables and foreign keys, making data structures more intuitive and closer to real business models.
  • It uses Cypher, a declarative query language inspired by SQL, which allows more readable and natural queries especially when navigating relationships between data.
  • Graph databases simplify data modeling by eliminating the need for complex JOIN operations, representing connections between entities in a way that mirrors how we naturally think about relationships.

Let’s talk about Neo4j, a graph database that has recently attracted a significant number of fans.

My goal is to give you a brief overview (I promise to be quick) of how it works, with some examples to make it more tangible.

We are used to storing data in tables and relating them through primary keys, so when you look directly at the data, all you see are IDs.

Graph databases were designed mainly so that this doesn’t happen.

Their purpose is to give you a complete understanding of the data just by looking at it.

Example of a simple school database diagram.

First of all, to understand the advantage of graph databases, it helps to know a few concepts that reinforce the idea of keeping software close to the business rules.

For that, it is worth reading about Domain-Driven Design, which argues that the model should be as close as possible to the business.

In Neo4j, you work with that model through Cypher: a declarative, SQL-inspired language that describes visual patterns in graphs using ASCII-art syntax.

It allows us to state what we want to select, insert, update or delete from our graph data without requiring us to describe exactly how to do it.

To be more practical, consider the example below, where the same “query” is written in SQL and then in Cypher.

 
SELECT f.* FROM students s
INNER JOIN person p
ON p.id = s.person_id
INNER JOIN friend f
ON f.friend_from_id = p.id
WHERE s.id = 45
ORDER BY p.name

 
MATCH
(student:Student)-[:FRIEND]->(person:Person)
WHERE student.id = 45
RETURN person
ORDER BY person.name

The “query” above searches for a specific student’s friends. In the traditional approach (SQL), we need to understand the database structure and how the tables relate to each other, which leads to heavier use of JOINs. In Cypher, the relationships are part of the pattern itself, which makes the query more intuitive to read.
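To make the Cypher pattern more concrete, here is a minimal Python sketch of what matching a Student node, following a FRIEND relationship and sorting by name amounts to conceptually. It is only an illustration: the toy data, the `nodes` and `relationships` structures and the `friends_of_student` function are all made up for this example, and this is not how Neo4j executes queries internally.

```python
# Toy in-memory graph: node id -> (labels, properties). All data is made up.
nodes = {
    45: ({"Person", "Student"}, {"id": 45, "name": "Natam"}),
    46: ({"Person"}, {"id": 46, "name": "Bruno"}),
    47: ({"Person", "Teacher"}, {"id": 47, "name": "Ana"}),
    48: ({"Course"}, {"id": 48, "title": "Math"}),
}

# Directed relationships as (start node id, type, end node id).
relationships = [
    (45, "FRIEND", 46),
    (45, "FRIEND", 47),
    (45, "ENROLLED", 48),
    (46, "FRIEND", 45),
]


def friends_of_student(student_id):
    """Conceptual equivalent of:
    MATCH (student:Student)-[:FRIEND]->(person:Person)
    WHERE student.id = <student_id>
    RETURN person ORDER BY person.name
    """
    matches = []
    for start, rel_type, end in relationships:
        start_labels, start_props = nodes[start]
        end_labels, end_props = nodes[end]
        if (
            "Student" in start_labels            # (student:Student)
            and start_props["id"] == student_id  # WHERE student.id = ...
            and rel_type == "FRIEND"             # -[:FRIEND]->
            and "Person" in end_labels           # (person:Person)
        ):
            matches.append(end_props)
    # ORDER BY person.name
    return sorted(matches, key=lambda props: props["name"])


friend_names = [person["name"] for person in friends_of_student(45)]
print(friend_names)  # ['Ana', 'Bruno']
```

Notice that there is no lookup through intermediate tables: the relationship itself carries the connection from one node to the other.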

Graph databases are a more natural way of storing data.

You don’t have to worry about tables and foreign keys, because everything fits within two simple concepts: nodes and relationships.

Each node or relationship can have its own properties. Nodes are categorized with labels, and each relationship has a type.
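As a rough mental model, the two concepts can be sketched in a few lines of Python. The `Node` and `Relationship` classes below are hypothetical and exist only for this illustration (they are not part of any Neo4j API), and the `since` property is an invented example value.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    labels: set                                     # categories, e.g. {"Person", "Teacher"}
    properties: dict = field(default_factory=dict)  # key/value attributes


@dataclass
class Relationship:
    type: str                                       # e.g. "TEACHES"
    start: Node                                     # relationships are directed:
    end: Node                                       # start -> end
    properties: dict = field(default_factory=dict)  # relationships carry data too


teacher = Node({"Person", "Teacher"}, {"name": "Natalia"})
course = Node({"Course"}, {"title": "Math"})
teaches = Relationship("TEACHES", teacher, course, {"since": 2018})

description = (
    f'{teaches.start.properties["name"]} -[{teaches.type}]-> '
    f'{teaches.end.properties["title"]}'
)
print(description)  # Natalia -[TEACHES]-> Math
```

A node can carry more than one label at once, which is how the same person can be both a Person and a Teacher.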

To make it easier to understand, let’s map a school in a simple way, where we have students, teachers and courses.

How would you represent this in a graph database? It is actually simpler than you think! Just draw it:

Now, hands-on!

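You can install Neo4j on your computer or run it with Docker, for example as a Docker Compose service:

 
version: "3"
services:
  neo4j:
    image: neo4j:3.4.5
    ports:
      - "7474:7474"  # HTTP (Neo4j Browser)
      - "7473:7473"  # HTTPS
      - "7687:7687"  # Bolt (drivers)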

Don’t forget to follow Cypher’s naming rules and recommendations: CamelCase for node labels and UPPER_SNAKE_CASE for relationship types.

Creating our first nodes:

 
CREATE (user1:Person:Student { name: "Natam" })
CREATE (user2:Person:Teacher { name: "Natalia" })
CREATE (course:Course { title: "Math" })

Now let’s relate them:


MATCH (s:Student {name:"Natam"}), (t:Teacher {name:"Natalia"}), (c:Course {title:"Math"})
CREATE (s)-[k:KNOWS]->(t), (t)-[ts:TEACHES]->(c), (s)-[e:ENROLLED]->(c)

Now that we have our data, we need to query it. Shall we?


MATCH (n) RETURN n

This command returns every node in the graph. In the Neo4j Browser the result is rendered visually, which gives you an overview of your data.

You can apply more filters and conditions to your “query”. Check this out: Neo4j Cypher Manual

A conclusion already?

Yes! The purpose of this post is for you to get to know this incredible technology and see how simple its structure is.

If you found it interesting and want to learn more about it, try modeling applications you already know using Neo4j.

I’m sure you’ll be even more surprised!

I will leave a few links in case you want to know more concepts related to this technology.

REFERENCES

FAQ

What is Neo4j and what makes it different from traditional databases?

Neo4j is a graph database that stores data using nodes and relationships instead of tables and foreign keys. It is designed to give users a complete and intuitive understanding of their data without relying on complex joins or IDs.

What is Cypher and how does it compare to SQL?

Cypher is a declarative, SQL-inspired language used in Neo4j to describe patterns in graphs using ASCII-art syntax. Compared to SQL, Cypher makes relationships more intuitive and readable, because connections are expressed directly in the pattern instead of through explicit joins.

What are the two core concepts in a graph database like Neo4j?

The two core concepts are nodes and relationships. Both can have their own properties; nodes are categorized with labels, and each relationship has a type.

How can you get started with Neo4j?

You can install Neo4j directly on your computer or run it using Docker with the neo4j:3.4.5 image, exposing ports 7474, 7473, and 7687. Once it is running, you can create nodes and relationships using Cypher commands.

How do you query all data in a Neo4j graph?

You can retrieve an overview of your entire graph by running the Cypher command 'MATCH (n) RETURN n', and then apply additional filters and conditions to narrow down the results.

About the author.

Natam Oliveira

VP of Engineering at Cheesecake Labs - AI / IoT enthusiast. Go bravely where no one has gone before!