Recommendation engines are one of the most practical real-world applications of graph databases. Whether it is Netflix suggesting movies, Amazon recommending products or LinkedIn proposing new connections, these systems rely heavily on traversing relationships between entities.
This is exactly where relational databases often begin to struggle. While SQL databases are exceptional for transactional systems and structured business data, relationship-heavy queries quickly become difficult to maintain and expensive to execute at scale.
In this article, we will build a simple Neo4j recommendation engine using Spring Boot, Docker and Cypher while also exploring practical engineering challenges encountered during implementation.

Why Graph Databases Matter for Recommendation Engines
For decades, relational databases have been the default choice for backend systems. Recommendation engines, however, expose one of the relational model's biggest weaknesses: recursive relationship traversal.
Imagine the following query:
“Find users who bought the same products as Alice, then recommend products those users purchased that Alice has not seen yet.”
In SQL, this often requires:
- multiple JOINs
- self-referencing many-to-many tables
- deeply nested queries
- expensive index scans
As the depth of traversal increases, the queries become increasingly difficult to read and optimize.
Graph databases approach this problem differently.
Neo4j stores relationships as first-class citizens. Instead of computing relationships at query time through JOIN operations, each node directly references its neighbors, a mechanism called index-free adjacency.
As a result, the cost of a traversal depends mainly on the part of the graph it actually visits rather than on the total size of the dataset.
In practical terms:
- relational databases optimize rows
- graph databases optimize relationships
For recommendation engines, fraud detection, knowledge graphs and social networks, this architectural difference is a significant advantage.
Why We Chose Neo4j
Neo4j is one of the most widely used graph databases and comes with several advantages for backend engineers:
- intuitive Cypher query language
- strong visualization tooling
- mature Java ecosystem
- official Spring Boot integrations
- excellent Docker support
Most importantly, Cypher queries resemble natural graph thinking.
Instead of describing complex JOIN logic, you simply describe relationships.
For example:
MATCH (u:User {name: $name})-[:PURCHASED]->(p:Product)<-[:PURCHASED]-(other:User)-[:PURCHASED]->(rec:Product)
WHERE NOT (u)-[:PURCHASED]->(rec)
RETURN DISTINCT rec.name AS recommendation
Even without deep Neo4j knowledge, the traversal logic is relatively readable.
This is one of the biggest strengths of graph databases.
Designing the Neo4j Recommendation Engine
The recommendation engine itself is intentionally simple.
We model:
- users
- products
- purchase relationships
The graph structure looks like this:
(User)-[:PURCHASED]->(Product)
The recommendation algorithm works as follows:
- Find products purchased by the target user
- Find other users who purchased the same products
- Find additional products purchased by those users
- Exclude products already purchased by the original user
This collaborative filtering pattern is one of the most common recommendation techniques.
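To make the four steps concrete, here is a minimal in-memory sketch in plain Java. It is an illustration only, not the project's code: the class name ReferenceRecommender, the map-based data layout and the sample purchases are assumptions made for this example.

```java
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

/** In-memory illustration of the four steps; not how the service queries Neo4j. */
final class ReferenceRecommender {

    /** purchases maps a user name to the set of product names that user bought. */
    static Set<String> recommend(Map<String, Set<String>> purchases, String target) {
        // Step 1: products purchased by the target user.
        Set<String> own = purchases.getOrDefault(target, Set.of());
        Set<String> recommendations = new LinkedHashSet<>();
        for (Map.Entry<String, Set<String>> entry : purchases.entrySet()) {
            if (entry.getKey().equals(target)) {
                continue;
            }
            Set<String> theirs = entry.getValue();
            // Step 2: keep only users who share at least one product with the target.
            boolean sharesProduct = theirs.stream().anyMatch(own::contains);
            if (!sharesProduct) {
                continue;
            }
            // Steps 3 and 4: collect their other products, skipping ones already owned.
            for (String product : theirs) {
                if (!own.contains(product)) {
                    recommendations.add(product);
                }
            }
        }
        return recommendations;
    }

    public static void main(String[] args) {
        // Hypothetical sample data.
        Map<String, Set<String>> purchases = Map.of(
            "Alice", Set.of("Laptop"),
            "Bob", Set.of("Laptop", "Mouse"),
            "Carol", Set.of("Desk"));
        System.out.println(recommend(purchases, "Alice")); // prints [Mouse]
    }
}
```

Bob shares a purchase with Alice, so his other product is suggested. Carol shares nothing with Alice and contributes nothing.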
The entire traversal is handled directly inside Neo4j using Cypher instead of implementing nested loops in Java.
This is an important engineering decision.
Many developers accidentally treat graph databases like relational databases by moving traversal logic into application code. This defeats much of the benefit of using a graph database in the first place.
The backend service should remain thin while the graph engine performs the heavy relationship traversal internally.
Docker-First Development Strategy
One practical challenge during development was the lack of a local Java environment.
Instead of installing:
- JDK
- Maven
- Gradle
- environment variables
we decided to fully containerize the application.
This turned out to be a much better long-term engineering decision.
The host machine only required Docker. Everything else was isolated inside containers.
We used a multi-stage Docker build:
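# Stage 1: build the application with Gradle (tests are skipped in this stage)
FROM gradle:8-jdk17 AS build
WORKDIR /home/gradle/src
COPY --chown=gradle:gradle . .
RUN gradle build --no-daemon -x test

# Stage 2: copy the built jar into a smaller JRE-only image
FROM eclipse-temurin:17-jre-focal
WORKDIR /app
# Assumes build/libs contains a single jar; COPY fails if the wildcard matches several files
COPY --from=build /home/gradle/src/build/libs/*.jar app.jar
ENTRYPOINT ["java", "-jar", "app.jar"]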
This provides several major advantages:
- identical development and production environments
- simplified onboarding
- reproducible builds
- cleaner CI/CD pipelines
- explicit dependency management
A new developer can simply run:
docker-compose up
and immediately start working.
This approach dramatically reduces the classic “works on my machine” problem.
Container Networking Challenges
One of the most common Docker mistakes occurred during setup. Initially, the Spring Boot application attempted to connect to Neo4j using:
localhost:7687
Inside Docker containers, localhost refers to the current container itself – not other containers.
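This caused the connection to fail, because no Neo4j instance listens on port 7687 inside the application container. (An UnknownHostException, by contrast, typically indicates that a service name could not be resolved, for example when the containers do not share a network.)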
The solution was to place both services inside the same Docker network and communicate using service names.
Example:
services:
  neo4j-db:
    container_name: neo4j-db
    networks:
      - graph-net
  app:
    environment:
      - SPRING_NEO4J_URI=bolt://neo4j-db:7687
    networks:
      - graph-net
Docker Compose automatically provides internal DNS resolution between services. This is one of the most important concepts in containerized backend systems.
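One way to catch this misconfiguration early is a small startup check on the configured URI. The sketch below is an assumption-laden illustration, not part of the project: the class Neo4jUriCheck and its method are hypothetical, and only the SPRING_NEO4J_URI variable and the neo4j-db service name come from the Compose file above.

```java
import java.net.URI;
import java.util.Locale;
import java.util.Set;

/** Hypothetical fail-fast check for a Neo4j URI that points at a loopback host. */
final class Neo4jUriCheck {

    private static final Set<String> LOOPBACK_HOSTS = Set.of("localhost", "127.0.0.1", "[::1]");

    /** Returns true when the URI's host is a loopback name or address. */
    static boolean pointsAtLoopback(String uri) {
        String host = URI.create(uri).getHost();
        return host != null && LOOPBACK_HOSTS.contains(host.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        // The fallback value is only an assumption for runs outside Docker Compose.
        String uri = System.getenv().getOrDefault("SPRING_NEO4J_URI", "bolt://localhost:7687");
        if (pointsAtLoopback(uri)) {
            // Inside a container, a loopback host is the container itself, not neo4j-db.
            System.err.println("Warning: " + uri + " points at a loopback host");
        }
    }
}
```

With the Compose configuration shown earlier, the URI resolves to the neo4j-db service and the check stays silent.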
Spring Boot Transaction Pitfalls
Another interesting issue appeared during integration with Spring Data Neo4j.
Spring Boot auto-configured two transaction managers:
- TransactionManager
- ReactiveTransactionManager
When using @Transactional, Spring could not determine which transaction manager to use and threw:
NoUniqueBeanDefinitionException
The fix was straightforward:
@Transactional("transactionManager")
public void seedData() {
...
}
This is an important lesson for senior backend engineers:
explicit configuration is often safer than relying entirely on framework magic.
As systems become more complex and involve multiple databases or paradigms, explicit transaction boundaries become critical for correctness and maintainability.
Why We Used Neo4jClient Instead of Repositories
Spring Data repositories are excellent for standard CRUD operations. However, recommendation engines often require highly customized graph traversals.
For this reason, we used:
- Neo4jClient
- raw Cypher queries
instead of relying entirely on Object Graph Mapping abstractions.
This provided:
- more control
- better query visibility
- easier optimization
- simpler debugging
For graph-heavy applications, this hybrid approach is often preferable.
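As a rough sketch of what such a thin service can look like, the example below keeps the Cypher text in one place and only passes parameters through. The CypherRunner interface and the RecommendationService class are hypothetical names introduced here so the example compiles without Spring on the classpath. In the real application the runner would delegate to Neo4jClient.

```java
import java.util.List;
import java.util.Map;

/** Hypothetical seam: runs a Cypher query with named parameters and returns one string column. */
interface CypherRunner {
    List<String> run(String cypher, Map<String, Object> parameters);
}

/** Thin service: holds the query text and passes parameters; the traversal happens in Neo4j. */
final class RecommendationService {

    // The recommendation query from earlier in this article.
    static final String RECOMMEND_QUERY = """
        MATCH (u:User {name: $name})-[:PURCHASED]->(p:Product)<-[:PURCHASED]-(other:User)-[:PURCHASED]->(rec:Product)
        WHERE NOT (u)-[:PURCHASED]->(rec)
        RETURN DISTINCT rec.name AS recommendation
        """;

    private final CypherRunner runner;

    RecommendationService(CypherRunner runner) {
        this.runner = runner;
    }

    List<String> recommendFor(String userName) {
        // The user name is bound as a parameter rather than concatenated into the query.
        return runner.run(RECOMMEND_QUERY, Map.of("name", userName));
    }
}

// With Spring Data Neo4j, a runner could plausibly be written as (untested assumption):
//   (cypher, params) -> List.copyOf(
//       neo4jClient.query(cypher).bindAll(params).fetchAs(String.class).all())
```

Keeping the query as a named constant makes it easy to copy into the Neo4j Browser for profiling, which is much of the visibility benefit listed above.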
Integration Testing with Real Graph Traversals
One major engineering decision was avoiding mocks for graph testing. Mocking a graph database removes most of the value of testing relationship traversal behavior.
Instead, we introduced a dedicated Dockerized test runner:
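test-runner:
  image: gradle:8-jdk17
  # Run Gradle from the mounted project directory rather than the image default
  working_dir: /home/gradle/src
  volumes:
    - .:/home/gradle/src
  environment:
    - SPRING_NEO4J_URI=bolt://neo4j-db:7687
  # Same network as neo4j-db so the service name resolves
  networks:
    - graph-net
  command: gradle test --no-daemon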
The tests performed:
- graph seeding
- real Cypher execution
- API validation
- traversal verification
This allowed us to detect:
- invalid Cypher syntax
- transaction issues
- networking problems
- incorrect traversal logic
Testing against a real graph database provides significantly higher confidence compared to mocked repositories.
When Graph Databases Make Sense
Graph databases are not replacements for relational databases. They are specialized tools optimized for highly connected data.
Neo4j makes the most sense when:
- relationships are central to the domain
- traversals become complex
- JOIN-heavy queries dominate
- recommendation systems are required
- knowledge graphs are involved
Typical use cases include:
- recommendation engines
- fraud detection
- social networks
- AI knowledge graphs
- dependency mapping
- supply chain analysis
For standard CRUD business systems, PostgreSQL or MySQL are often still the better choice.
Final Thoughts
Building a Neo4j recommendation engine demonstrates one of the clearest strengths of graph databases: expressing complex relationship traversal in a natural and efficient way.
The project also highlighted several practical backend engineering lessons:
- containerized development environments
- Docker networking
- Spring transaction management
- graph query optimization
- integration testing strategies
Most importantly, it showed how moving traversal complexity into the database engine itself can dramatically simplify backend application logic.
As AI systems, recommendation engines and knowledge graphs continue to grow in importance, graph databases are becoming increasingly valuable tools for backend engineers to understand.
You can check the project in this GitHub repository.
