How Can I Test a Graph Database?

Graph Database Compatibility with ETL and Modelling Solution

Graph Database Compatibility

You may already have chosen the graph database for your application, or you may still be deciding which one to use. Either way, we hope the information on this page helps you in your journey, and we are always happy to discuss Knowledge Graph projects and initiatives if this would assist.

The lists below show graph database vendors that the Graph.Build platform can write to. The list is constantly being added to, so if you can't see the database you are interested in, please get in touch. We aim to offer users the choice of any graph database and the freedom to switch between vendors. As such, any database that uses SPARQL, Cypher or Gremlin (Apache TinkerPop™) as a query language is supported as standard, and both RDF and property graph models are supported. Maximum graph database compatibility is our aim.
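To give a feel for how the three query languages mentioned above differ, here is a minimal sketch that expresses one question ("who are Alice's colleagues?") in each of them. The schema (Person, Company, WORKS_FOR, ex:worksFor), the example.org prefix and the helper name `query_for` are assumptions made for illustration; they are not part of the Graph.Build platform or of any particular database.

```python
# Illustrative query strings only. The schema used here is hypothetical
# and is not taken from any real dataset or product.
COLLEAGUES_OF_ALICE = {
    "sparql": """
        PREFIX ex: <http://example.org/>
        SELECT ?colleague WHERE {
          ex:alice ex:worksFor ?company .
          ?colleague ex:worksFor ?company .
          FILTER (?colleague != ex:alice)
        }
    """,
    "cypher": """
        MATCH (a:Person {name: 'Alice'})-[:WORKS_FOR]->(:Company)<-[:WORKS_FOR]-(colleague:Person)
        RETURN colleague.name
    """,
    "gremlin": """
        g.V().has('Person', 'name', 'Alice').as('a').
          out('WORKS_FOR').in('WORKS_FOR').
          where(neq('a')).values('name')
    """,
}


def query_for(language):
    """Return the example query for a language name, ignoring case."""
    key = language.lower()
    if key not in COLLEAGUES_OF_ALICE:
        raise ValueError(f"No example query for {language!r}")
    return COLLEAGUES_OF_ALICE[key].strip()
```

The same question takes a noticeably different shape in each language, which is one reason existing query-language skills often influence database selection.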

Comparison and Compatibility

All of the graph databases in the compatibility table offer a Graph DBMS (sometimes called a Graph-Oriented DBMS), but they differ in their storage models, the query languages they support and the graph model types they support. These are important considerations, which you can read more about further down this page. Additional considerations could include: quality of technical documentation; support and SLA; cloud or on-premise hosting; other supported programming languages; API availability; scalability options; licence type (e.g. open source); track record or endorsements; server operating system; the handling of schemas, indexes, typing, partitioning and replication; in-memory capabilities; other database types supported; and the look and feel or ease of use of the UI.

It is difficult to compare graph database vendors or solutions without knowing many more specifics of the use case. For this reason, it makes sense to develop a model in the Graph.Build platform, testing and optimising it within the Graph.Build Studio, before committing to a database vendor. Model creation is visual and inclusive, so you can engage all stakeholders without being constrained by what a particular graph database can do. Once the model has been tested, you can write it to the databases you wish to try out and run queries on them, as you would expect to do in your production environment. If you read on, you will find some pointers as to which databases you might want to trial.

Graph Database Vendor Lock

Vendor lock usually arises when a graph model has been built in tooling provided by a graph database vendor. This isn't always the case: some vendors offer tooling that isn't specific to their database solution, although it may be specific to some of the query languages or graph model types supported by that database. Similarly, without a sophisticated graph ETL integration, it is difficult to visualise models in a useful way during the design stage.

If you find yourself vendor-locked, a good course of action is to abstract your model design using the Graph.Build platform. You can then have the data source translations, data transformation (ETL), graph model visualisation and graph data publishing all in one pipeline that you control. At this stage you will be 'graph database ready', i.e. your model can be plugged into a suitable graph database of your choosing whenever you like.

Which Graph Database Should I Choose?

Deciding on the right graph database vendor for your application is no simple task, and the decision could determine whether or not your knowledge graph project is successful. The problem with graph databases is that, in order to be considered one, a database simply needs to be able to store data in a graph format, i.e. data comprised of nodes with edges defining the relationships between them.

This broad definition makes sense because graph technologies and graph databases are not as long or as widely established as relational databases. It therefore currently seems accepted that a graph database is any database that supports the representation of data in a graph structure and allows queries that use graph traversals to perform what would be complex queries in a relational database. This is one of the fundamental advantages of graph databases.

A fundamental reason for using graph models is that you can answer certain questions much more quickly than you can with a relational database, because you can traverse the edges of the graph data model, which can represent complex relationships. In theory, you can answer any question as long as there is a path through the data that can be traversed in the right way. Ontologists and Knowledge Graph Architects have deep expertise in designing models; much of their work is to optimise the structure of a model so that it can answer the kinds of questions expected to be asked of it.
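As a rough illustration of answering a question by traversing edges, the sketch below stores a tiny graph as an adjacency list and uses a breadth-first search to find how two nodes are connected. The data, the relationship names and the function `find_path` are all hypothetical; real graph databases use their own storage and query engines rather than anything this simple.

```python
from collections import deque

# Hypothetical toy graph: each node maps to (relationship, target) pairs.
GRAPH = {
    "Alice": [("WORKS_FOR", "Acme"), ("KNOWS", "Bob")],
    "Bob": [("WORKS_FOR", "Globex")],
    "Acme": [("SUPPLIES", "Globex")],
    "Globex": [],
}


def find_path(graph, start, goal):
    """Breadth-first search returning a shortest list of hops, or None."""
    queue = deque([(start, [])])
    visited = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for relationship, target in graph.get(node, []):
            if target not in visited:
                visited.add(target)
                queue.append((target, path + [(node, relationship, target)]))
    return None  # no path exists, so the question cannot be answered


path = find_path(GRAPH, "Alice", "Globex")
```

The question "how is Alice connected to Globex?" is answerable only because a path of edges exists. Asking the reverse question in this directed toy graph returns `None`, which is the kind of gap a model designer would look for when optimising a model for the questions expected of it.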

The importance of a model that is optimised to answer the questions demanded of it is clear and widely accepted. Selecting a graph database that works optimally with the graph model and the way it will be used is equally important, but can be glossed over. In the sections that follow, we explain some of the key differences between graph databases and show you how to test which one will be the best fit for your graph model and application.

Using an Unsuitable Graph Database

Choosing the right graph database vendor is important, but there are cases where applications are not using the optimal database solution, causing opportunity costs for the application's users.

This can be for some of the following reasons:

1. Most graph vendors claim to offer solutions that conduct complex searches at lightning speed. This is likely to be true for certain applications but not all. You may have taken the speed claims at face value without comparing how queries on your model run on different graph databases;

2. You have become 'vendor locked' because a model was built in tooling that is native and unique to one graph database vendor, and this database either was never optimal for the application or was the best option at the time but no longer is;

3. You looked at graph database vendors as a first step in considering Knowledge Graph implementation, creating a bias in the way you designed your model;

4. Existing skillsets were prioritised in database selection, so the graph query language options for the database (e.g. SPARQL, Cypher, Gremlin) ruled out some graph database options that would have provided better performance;

5. You didn't consider using multiple vendors. It is not uncommon for organisations to use different graph databases for different applications, although this isn't always obvious.

If you have found yourself in one of these scenarios, it is possible to change graph database provider if you use the Graph.Build platform as your graph tooling, because it is vendor-agnostic. You can avoid the scenarios above altogether by starting your Knowledge Graph project within the Graph.Build platform.

Why Graph Databases Perform Differently to One-Another

Given that a database is a graph database if it can store graph data, there are many different ways it could achieve this. Common ways that graph databases (or NoSQL databases) store graph data include document-oriented stores, key-value stores and even tables, the last of which resembles a relational database in that regard. You can usually discover which approach is used by looking at the primary database model type that drives the graph database solution.
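The following sketch shows, under simplifying assumptions, how the same three edges might be laid out in a key-value store, a document store and a table. The layouts, key format and function names are invented for illustration and do not describe how any specific vendor stores data.

```python
# Hypothetical edges: (source, relationship, target).
EDGES = [
    ("Alice", "KNOWS", "Bob"),
    ("Alice", "WORKS_FOR", "Acme"),
    ("Bob", "WORKS_FOR", "Acme"),
]


def as_key_value(edges):
    """Key-value layout: one key per (node, relationship), value is a target list."""
    store = {}
    for source, relationship, target in edges:
        store.setdefault(f"{source}:{relationship}", []).append(target)
    return store


def as_documents(edges):
    """Document layout: one nested document per node holding its outgoing edges."""
    docs = {}
    for source, relationship, target in edges:
        doc = docs.setdefault(source, {"_id": source, "out": {}})
        doc["out"].setdefault(relationship, []).append(target)
    return docs


def as_table(edges):
    """Tabular layout: one row per edge, like a relational edge table."""
    return [{"source": s, "relationship": r, "target": t} for s, r, t in edges]


def neighbours_kv(store, node, relationship):
    # A single key lookup.
    return store.get(f"{node}:{relationship}", [])


def neighbours_doc(docs, node, relationship):
    # Fetch one document, then read a nested field.
    return docs.get(node, {}).get("out", {}).get(relationship, [])


def neighbours_table(rows, node, relationship):
    # Without an index, this scans every row.
    return [
        row["target"]
        for row in rows
        if row["source"] == node and row["relationship"] == relationship
    ]
```

All three layouts return the same neighbours for the same question, but the work needed to get there differs (a key lookup, a document fetch or a row scan). This is one intuition for why databases with different underlying stores can perform differently on the same model.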

The types of data stores offered by different graph databases are typically Graph (RDF), Document, Key-Value and Wide Column. Some offer no stores and are simply database management systems. These are some popular graph database solutions which fall under some of these categories:

RDF Store (Graph)

AllegroGraph (Franz)
Amazon Neptune (AWS)
AnzoGraph DB (Cambridge Semantics)
Blazegraph
GraphDB (Ontotext)
IBM DB2
MarkLogic
Oracle Database - Spatial & Graph
RDFox
Stardog
TerminusDB
Virtuoso (OpenLink)

Graph DBMS Only

Dgraph
HugeGraph (Baidu)
JanusGraph
Memgraph
NebulaGraph
TigerGraph

Some graph databases are also designed for particular types of system architecture. Some are designed to be cloud-based, managed solutions, whilst others may be native graph databases that use approaches where nodes physically point to each other, allowing them to find links between nodes very quickly compared with other approaches.

Selecting a Graph Database Model Based on Model Type

Graph database vendors may be known to support one type of model better than another, for example RDF/Semantic or Labelled Property Graphs. The decision of which type of graph model to use for your application is of course a vital one, but it should not be confused with the vendor selection process. The model selected may well narrow down the choice of database providers, but the remaining databases will perform differently, sometimes significantly so, when you test them with your model. You can only really make an informed decision or cost/benefit analysis once you know how your model will run on a particular graph database solution.
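To make the difference between the two model types more concrete, here is a minimal sketch of one fact ("Alice has worked for Acme since 2019") represented first as RDF-style triples and then as a labelled property graph. The `ex:` names, node identifiers and helper functions are hypothetical. The triple version uses an intermediate "employment" node, which is one common way to attach a value to a relationship in plain RDF; other patterns exist.

```python
# RDF-style: everything is a (subject, predicate, object) triple.
# The start year hangs off an intermediate employment node.
TRIPLES = {
    ("ex:alice", "rdf:type", "ex:Person"),
    ("ex:acme", "rdf:type", "ex:Company"),
    ("ex:employment1", "rdf:type", "ex:Employment"),
    ("ex:employment1", "ex:employee", "ex:alice"),
    ("ex:employment1", "ex:employer", "ex:acme"),
    ("ex:employment1", "ex:since", 2019),
}

# Labelled property graph: nodes and relationships both carry properties.
NODES = {
    "n1": {"labels": {"Person"}, "properties": {"name": "Alice"}},
    "n2": {"labels": {"Company"}, "properties": {"name": "Acme"}},
}
RELATIONSHIPS = [
    {"start": "n1", "type": "WORKS_FOR", "end": "n2", "properties": {"since": 2019}},
]


def since_from_triples(triples, employee, employer):
    """Find employment nodes linking the pair, then read their start years."""
    employments = {s for s, p, o in triples if p == "ex:employee" and o == employee}
    employments &= {s for s, p, o in triples if p == "ex:employer" and o == employer}
    return [o for s, p, o in triples if p == "ex:since" and s in employments]


def since_from_property_graph(relationships, start, end):
    """Read the start year directly from the relationship's properties."""
    return [
        rel["properties"]["since"]
        for rel in relationships
        if rel["type"] == "WORKS_FOR" and rel["start"] == start and rel["end"] == end
    ]
```

Both representations hold the same information, but the query takes a different shape in each, so a model reworked from one type to the other can behave differently on the same question.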

Once a graph model has been created in the Graph.Build platform, it can be tested and published within the system before being written to multiple different graph databases. This means that the same model can be tested across multiple databases and you can establish which performs the best in your use case.

Graph Databases from Well-Known Relational/SQL Database Vendors

Your organisation may already have a long history with relational databases from a particular vendor, and there could be many reasons why your first step into graph databases is to consider that vendor's graph database offering. Different vendors have different specialisms, but that's not to say that taking this approach won't lead to an optimal solution. If it is a first step in a proof of concept and it helps to move things along, then choosing your existing relational database provider's graph offering as a starting point, and testing it against alternatives before committing to it in production, could be an approach that works for you.

Some popular SQL/Relational database vendors that offer graph database (DBMS) solutions are:

Apache HBase (Hgraph)
Microsoft SQL Server
Oracle DB
SAP HANA

Finally - How to Test Two Graph Database Vendors with One Model

Provided that both databases support the model you have built in the Graph.Build Studio, testing two graph database vendors with one model is as simple as running two Graph Writers, one for each database. You could take this further and make changes to the model in the tool, for example reworking it as a labelled property graph rather than an RDF graph, then run both versions of the solution in parallel to test the two conceptual models against each other with different graph vendors. The choice is yours, because the capability to test quickly is within the platform.
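Once the same model has been written to two databases, comparing them comes down to running the same queries against each and recording the results. The sketch below is one hedged way to do that in Python. It does not use any Graph.Build or vendor API: `fake_database_a` and `fake_database_b` are hypothetical stand-ins that you would replace with calls to each database's own client library, and `benchmark` and `answers_agree` are names invented here.

```python
import statistics
import time


def benchmark(backends, query, repeats=5):
    """Run `query` on each backend and record the median wall-clock time."""
    results = {}
    for name, run_query in backends.items():
        timings = []
        answer = None
        for _ in range(repeats):
            started = time.perf_counter()
            answer = run_query(query)
            timings.append(time.perf_counter() - started)
        results[name] = {
            "answer": answer,
            "median_seconds": statistics.median(timings),
        }
    return results


def answers_agree(results):
    """Check that every backend returned the same answer before comparing speed."""
    answers = [entry["answer"] for entry in results.values()]
    return all(answer == answers[0] for answer in answers)


# Hypothetical stand-ins for real database clients (assumptions, not real APIs).
def fake_database_a(query):
    return sorted(["Acme", "Globex"])


def fake_database_b(query):
    time.sleep(0.001)  # pretend this backend is slower for this query
    return sorted(["Globex", "Acme"])


results = benchmark(
    {"database_a": fake_database_a, "database_b": fake_database_b},
    "which companies are in Alice's supply chain?",
)
```

Checking that the answers agree before looking at timings matters, because a fast result from a database that interprets the model differently is not a fair comparison. In practice you would use queries and data volumes that are representative of your production workload.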
