Graph Model Ontology
An Introductory Guide
What is an Ontology?
An ontology is a formally defined representation of knowledge that sets out the concepts and relationships within a particular domain. It is like a vocabulary or a set of rules that provides a common understanding of a specific subject area. Ontologies are used to enable sharing and reuse of knowledge and facilitate communication and reasoning among people or computer systems.
Ontologies are typically represented using a graph model, where the nodes represent concepts or classes, and the edges represent relationships between them. For example, in a medical ontology, the nodes might represent diseases, symptoms, and treatments, while the edges represent the relationships between them, such as "causes," "treats," or "diagnoses."
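As a rough illustration of the node-and-edge idea, the sketch below stores a tiny medical graph in plain Python. The concept names and the `outgoing` helper are made up for this example and are not taken from any real medical ontology.

```python
# Hypothetical mini medical ontology: names are illustrative only.
nodes = {"Influenza", "Fever", "Antiviral medication", "Antipyretic"}

# Each edge is (source concept, relationship, target concept).
edges = [
    ("Influenza", "causes", "Fever"),
    ("Antiviral medication", "treats", "Influenza"),
    ("Antipyretic", "treats", "Fever"),
]

def outgoing(concept):
    """Return (relationship, target) pairs for edges leaving a concept."""
    return [(rel, dst) for src, rel, dst in edges if src == concept]

print(outgoing("Influenza"))  # [('causes', 'Fever')]
```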
One of the main benefits of ontologies is that they provide a standardised way of representing and sharing knowledge, which can improve communication and collaboration across different organisations and systems. By using a common language and set of rules, ontologies can help avoid misunderstandings and improve the accuracy and consistency of information.
Ontologies are used in many fields, including healthcare, finance, and engineering. In healthcare, ontologies are used to facilitate interoperability between different electronic health record systems and to support clinical decision-making. In finance, ontologies are used to standardise financial reporting and analysis. In engineering, ontologies are used to support product design and development.
Ontologies can also be combined with other technologies, such as artificial intelligence and machine learning, to enable more advanced applications. For example, the concepts and relationships in an ontology can be used as structured input when training machine learning models, which can improve the accuracy and interpretability of their results.
Relationship and Differences Between Graph Models and Ontologies
An ontology is not precisely the same as a graph model, but they are related concepts that can be used together to represent knowledge and relationships between entities.
As described above, an ontology sets out the concepts and relationships within a particular domain. It typically consists of a set of classes, properties, and relationships, which can be used to describe and organise information about entities in that domain. An ontology is typically written in OWL (Web Ontology Language), which builds on RDF (Resource Description Framework). RDF data can be serialised in several formats, including RDF/XML, JSON-LD and Turtle.
On the other hand, a graph model is a way of representing relationships between entities using nodes and edges. It can be used to visualise and analyse complex networks of relationships, such as social networks, supply chains, or biological systems.
While an ontology can be represented using a graph model, and a graph model can be used to represent relationships between entities in an ontology, they serve different purposes. Ontologies are used to define a formal vocabulary for a particular domain, while graph models are used to represent and analyse relationships between entities within that domain.
Ontologies are not necessary for creating or using a graph model. In fact, many graph models are based on data that does not require a formal ontology. However, ontologies can be helpful in creating a graph model in certain contexts, particularly when dealing with complex or specialised domains. Having an ontology can help ensure that the graph model is accurate, consistent, and easy to understand. Ontologies can also enable interoperability between systems using the same domain-specific knowledge.
Ontologies and Knowledge Graphs
There is a lot of vocabulary used in the field of data graphs that can be difficult to navigate for the newly initiated. For example, if someone refers to a knowledge graph as an RDF graph and someone else says that some RDF data is an ontology, then it seems logical to ask what the difference is between a knowledge graph and an ontology. The previous section on this page explained the difference between an ontology and a graph model. Because a knowledge graph is a graph model, the same section can be used to understand how a knowledge graph and an ontology are different.
RDF & OWL
RDF and OWL are names that come up a lot in this field. They are used when creating ontologies and are described here:
Resource Description Framework (RDF)
RDF is a framework for modelling and exchanging data on the web. It provides a set of specifications for representing data in the form of triples, which consist of a subject, predicate, and object. RDF can be used to create ontologies and to represent knowledge in a graph model.
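To make the triple structure concrete, here is a minimal sketch in plain Python. It does not use an RDF library; the `Triple` type, the `match` helper and the `ex:` prefix are assumptions made for illustration.

```python
from typing import NamedTuple

class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str

# Illustrative data; "ex:" is a made-up namespace shorthand.
triples = [
    Triple("ex:Alice", "ex:knows", "ex:Bob"),
    Triple("ex:Alice", "ex:worksFor", "ex:Acme"),
    Triple("ex:Bob", "ex:worksFor", "ex:Acme"),
]

def match(subject=None, predicate=None, obj=None):
    """Return triples matching the pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.obj == obj)
    ]

employees = [t.subject for t in match(predicate="ex:worksFor", obj="ex:Acme")]
print(employees)  # ['ex:Alice', 'ex:Bob']
```

Matching a pattern with wildcards is also the basic idea behind querying triples, although real query engines are far more capable than this sketch.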
Web Ontology Language (OWL)
OWL is a language for creating ontologies on the web. It is based on RDF and provides a more expressive way of describing concepts and relationships than RDF alone. OWL includes a set of constructs for representing classes, properties, and relationships between classes.
The connection between ontologies and knowledge graphs is that an ontology is used in the construction of a knowledge graph and provides the foundation for its use. Strictly speaking, based on its definition, a knowledge graph does not need an ontology. In practice, however, a knowledge graph is usually built so that users can answer questions through inference and reasoning. If the data were already organised consistently and in a well-understood way, there would be less reason to consider a knowledge graph in the first place. That said, there may also be structural reasons for using one: for example, a graph model can be more efficient than a relational model for certain types of queries, such as those that traverse many relationships.
In the real world of knowledge graph implementation, the use case usually involves heterogeneous data (different by nature) from multiple sources. Here, an ontology allows a knowledge graph implementation to provide integration, reasoning, and presentation of knowledge in a structured and meaningful way. Some examples of the foundational functionality provided by an ontology within a knowledge graph are:
Data integration
Ontologies provide a shared vocabulary of terms and concepts that can be used to align data from different sources. By mapping the entities and relationships in a knowledge graph to concepts in an ontology, it becomes easier to integrate and reason over the data.
Reasoning and inference
Ontologies can be used to add semantic meaning to the entities and relationships in a knowledge graph, allowing for more sophisticated reasoning and inference. For example, an ontology can define rules for inferring new facts based on existing knowledge, such as transitive or inverse relationships.
Validation
Ontologies can also be used to validate the structure and content of a knowledge graph. By specifying constraints on the types of entities and relationships that can be used, an ontology can help ensure that the knowledge graph is consistent and conforms to best practices.
Presentation and navigation
Ontologies can provide a structured way of organising and presenting the entities and relationships in a knowledge graph. An ontology can support faceted search by defining a hierarchy of concepts and properties, allowing users to explore the data by navigating through a series of filters.
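The inference of new facts from transitive and inverse relationships, mentioned above, can be sketched in a few lines of Python. The facts and property names here (`locatedIn`, `employs`, `worksFor`) are hypothetical, and a real reasoner would support many more rule types than these two.

```python
# Hypothetical facts and property declarations for illustration.
facts = {
    ("London", "locatedIn", "England"),
    ("England", "locatedIn", "UnitedKingdom"),
    ("acme", "employs", "consultant1"),
}
transitive_properties = {"locatedIn"}
inverse_properties = {"employs": "worksFor", "worksFor": "employs"}

def infer(facts):
    """Apply inverse and transitive rules until no new facts appear."""
    inferred = set(facts)
    changed = True
    while changed:
        new = set()
        for s, p, o in inferred:
            if p in inverse_properties:
                # (s p o) implies (o inverse(p) s)
                new.add((o, inverse_properties[p], s))
            if p in transitive_properties:
                # (s p o) and (o p o2) imply (s p o2)
                for s2, p2, o2 in inferred:
                    if p2 == p and s2 == o:
                        new.add((s, p, o2))
        changed = not new <= inferred
        inferred |= new
    return inferred

closure = infer(facts)
print(("London", "locatedIn", "UnitedKingdom") in closure)  # True
print(("consultant1", "worksFor", "acme") in closure)       # True
```

Neither of the two printed facts was stored explicitly; both follow from the declared properties.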
Common Standard Ontologies
Standard ontologies are widely used in different domains to ensure interoperability between systems that use the same domain-specific knowledge. By using a standard ontology, systems can exchange data and knowledge more easily and accurately, improving the efficiency and effectiveness of data sharing and analysis. Some common standard ontologies are listed below:
Simple Knowledge Organisation System (SKOS)
SKOS is a standard for representing knowledge organisation systems such as thesauri, taxonomies, and classification schemes. It provides a way to represent concepts and relationships in a hierarchical structure.
Gene Ontology (GO)
GO is a widely used ontology in the field of molecular biology. It provides a standardised vocabulary for describing gene products and their functions in different biological processes. Gene Ontology is an example of a domain ontology because it is specific to a particular domain of knowledge.
Medical Subject Headings (MeSH)
MeSH is a controlled vocabulary developed by the US National Library of Medicine for indexing and searching biomedical literature. It provides a standardised, hierarchically organised vocabulary for describing concepts in the medical domain. Although MeSH is strictly a thesaurus rather than a formal ontology, it is often used as another example of a domain ontology.
Defining Your Own Ontology
There are several reasons why you might choose to create your own ontology instead of using a standard one:
- A standard ontology may not cover all the concepts and relationships specific to your domain. By creating your own ontology, you can define those concepts and relationships and capture the nuances of your domain-specific knowledge.
- A standard ontology may be too general or too specific for your needs. By creating your own ontology, you can define concepts and relationships at a granularity appropriate for your use case.
- A standard ontology may not fit your specific use case or requirements. By creating your own ontology, you can tailor it to them.
- By creating your own ontology, you retain complete control over it and can modify it as needed. With a standard ontology, you may have limited control and may need to request changes or updates from the ontology owner.
- By creating your own ontology, you can design it to integrate with your other systems and data sources, improving the interoperability and consistency of your data.
Creating your own ontology can be a complex and time-consuming task. It requires a deep understanding of domain-specific knowledge and the relationships between concepts. It also requires expertise in ontology modelling and in languages such as OWL and RDF. Therefore, it is essential to weigh the benefits and costs carefully before embarking on the task. Sometimes, using a standard ontology and customising it to fit your specific needs may be more efficient.
The Difference Between Ontology and Schema
An ontology is similar to a schema because both provide a framework for organising and describing data. However, there are some crucial differences between the two concepts.
An ontology is a formally defined specification of a shared domain conceptualisation. It provides a set of concepts and relationships that can be used to describe the data within a particular domain. Ontologies are typically designed to be flexible and extensible, allowing for the addition of new concepts and relationships as needed.
A schema, on the other hand, is a specific blueprint for organising data within a database or other data management system. A schema defines the structure of the data, including tables, columns, relationships, and constraints. Schemas are typically more rigid and static than ontologies, as they are designed to ensure the integrity and consistency of the data within a particular system.
While both ontologies and schemas provide a way to organise and describe data, they are typically used in different contexts. Ontologies are commonly used in the fields of artificial intelligence, the semantic web, and knowledge management, where the goal is to enable the intelligent processing of data. Schemas, on the other hand, are typically used in database design and management, where the goal is to ensure the accuracy and consistency of the data within a particular system.
Example of an Ontology
If you are reading about graph models and ontologies, the chances are you are quite familiar with business consultants, so let's say we have a graph database about business consultants. In this example graph, nodes represent consultants and the businesses they work with, and edges represent the relationships between them. An ontology for this graph might include the following concepts:
- Consultant: The main entity in the graph, representing an individual who provides business consulting services.
- Company: A business entity that may hire consultants for various consulting services.
- Industry: A particular field or sector in which a company operates.
- Service: A specific type of consulting service that a consultant provides to a company.
- Skill: A particular skill or expertise area that a consultant possesses.
These concepts can then be connected through relationships and further organised into hierarchies or taxonomies, for example:
- A consultant may specialise in one or more industries (e.g., finance, healthcare, technology) and may provide various services within those industries (e.g., strategy consulting, operations consulting, financial consulting).
- A company may operate within one or more industries and require various consulting services (e.g., market research, human resources consulting, financial planning).
- A consultant may possess various skills (e.g., project management, data analysis, communication), which they can apply to different consulting services and industries.
Using this ontology, we could create a graph database representing the relationships between consultants, companies, industries, services, and skills. For example, we might create nodes for each consultant and company, with edges representing the consulting services provided, their industries, and their skills. This graph could then be used to power a range of applications, such as search and recommendation systems for matching consultants with companies based on their skills and expertise, or analytics tools for analysing trends in the consulting industry.
Ontology Example in RDF and LPG
In the example of the graph database about business consultants previously mentioned on this page, the ontology could be described using RDF in a semantic graph by defining a set of classes and properties corresponding to the ontology's concepts and relationships.
For example, an RDF representation of the ontology could define the classes Consultant, Company, Industry, Service, and Skill, and the properties hasIndustry, hasService, hasSkill, and worksFor. It could also define two individuals: consultant1, which is an instance of Consultant and has the properties hasSkill, hasService, and worksFor, and company1, which is an instance of Company and has the property hasIndustry.
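A minimal sketch of these definitions as triples is given below, written in plain Python rather than with an RDF library. The `ex:` prefix, the domain and range chosen for each property, and the individual values (`ex:projectManagement`, `ex:strategyConsulting`, `ex:finance`) are assumptions made for illustration.

```python
# Assumption: "ex:" is a hypothetical namespace prefix, and the domain and
# range chosen for each property are illustrative guesses.
classes = ["Consultant", "Company", "Industry", "Service", "Skill"]
properties = {
    "hasIndustry": ("Company", "Industry"),
    "hasService": ("Consultant", "Service"),
    "hasSkill": ("Consultant", "Skill"),
    "worksFor": ("Consultant", "Company"),
}

# Schema-level triples: declare each class and each property.
schema = [(f"ex:{c}", "rdf:type", "owl:Class") for c in classes]
for name, (domain, range_) in properties.items():
    schema += [
        (f"ex:{name}", "rdf:type", "owl:ObjectProperty"),
        (f"ex:{name}", "rdfs:domain", f"ex:{domain}"),
        (f"ex:{name}", "rdfs:range", f"ex:{range_}"),
    ]

# Instance-level triples: the two individuals described in the text.
individuals = [
    ("ex:consultant1", "rdf:type", "ex:Consultant"),
    ("ex:consultant1", "ex:hasSkill", "ex:projectManagement"),
    ("ex:consultant1", "ex:hasService", "ex:strategyConsulting"),
    ("ex:consultant1", "ex:worksFor", "ex:company1"),
    ("ex:company1", "rdf:type", "ex:Company"),
    ("ex:company1", "ex:hasIndustry", "ex:finance"),
]

def render(triples):
    """Render triples as Turtle-style statements (prefix declarations omitted)."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)

print(render(individuals))
```

Keeping the schema triples separate from the individuals mirrors the distinction between the ontology itself and the data described with it.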
In a Labelled Property Graph (LPG) representation, nodes and edges would be defined that correspond to the concepts and relationships in the ontology. For example, nodes could be defined for Consultant, Skill, Service, Company, and Industry, with edges for hasSkill, hasService, worksFor, and hasIndustry. Each node would also be assigned an ID that uniquely identifies it.
The edges in the graph represent the relationships between the nodes. For example, the edge (:Consultant)-[:hasSkill]->(:Skill) represents the fact that the consultant identified by id: "consultant1" has the skill identified by id: "projectManagement".
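The property graph described here can be approximated with plain Python dictionaries. This is only a sketch: the `name` properties, the `describe` helper, and all node IDs other than "consultant1" and "projectManagement" are hypothetical.

```python
# Nodes keyed by ID; each has a label and free-form properties.
nodes = {
    "consultant1": {"label": "Consultant", "properties": {"name": "Consultant One"}},
    "projectManagement": {"label": "Skill", "properties": {}},
    "strategyConsulting": {"label": "Service", "properties": {}},
    "company1": {"label": "Company", "properties": {"name": "Company One"}},
    "finance": {"label": "Industry", "properties": {}},
}

# Directed, typed edges between node IDs.
edges = [
    {"source": "consultant1", "type": "hasSkill", "target": "projectManagement"},
    {"source": "consultant1", "type": "hasService", "target": "strategyConsulting"},
    {"source": "consultant1", "type": "worksFor", "target": "company1"},
    {"source": "company1", "type": "hasIndustry", "target": "finance"},
]

def describe(edge):
    """Format an edge as a Cypher-style pattern string."""
    src_id, dst_id = edge["source"], edge["target"]
    src_label = nodes[src_id]["label"]
    dst_label = nodes[dst_id]["label"]
    return (
        f'(:{src_label} {{id: "{src_id}"}})'
        f'-[:{edge["type"]}]->'
        f'(:{dst_label} {{id: "{dst_id}"}})'
    )

print(describe(edges[0]))
# (:Consultant {id: "consultant1"})-[:hasSkill]->(:Skill {id: "projectManagement"})
```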
In both the RDF and LPG examples, the ontology representations can be used to query and manipulate the data in the graph, using different techniques in each case. These graph models can be used to answer a wide range of queries related to the data they represent.
Graph queries are generally used to extract information about relationships between entities in a dataset. For example, in the business consultant ontology, a user might want to answer questions such as:
- What skills does a particular consultant have?
- What services does a particular consultant offer?
- What companies does a particular consultant work for?
- What industries are represented by the companies a particular consultant works for?
- Which consultants have expertise in a particular skill?
- Which consultants work for companies in a particular industry?
- Which services are offered by consultants with expertise in a particular skill?
- Which skills are commonly found among consultants who work for companies in a particular industry?
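Several of the questions above can be sketched as simple traversals in plain Python. The data and helper names below are hypothetical, and a production system would use a graph query language rather than hand-written loops.

```python
# Illustrative edges: (source, relationship, target).
edges = [
    ("consultant1", "hasSkill", "projectManagement"),
    ("consultant1", "hasService", "strategyConsulting"),
    ("consultant1", "worksFor", "company1"),
    ("consultant2", "hasSkill", "dataAnalysis"),
    ("consultant2", "hasSkill", "projectManagement"),
    ("consultant2", "worksFor", "company2"),
    ("company1", "hasIndustry", "finance"),
    ("company2", "hasIndustry", "healthcare"),
]

def targets(source, relationship):
    """Follow a relationship forwards from a node."""
    return sorted(d for s, r, d in edges if s == source and r == relationship)

def sources(relationship, target):
    """Follow a relationship backwards to the nodes that point at target."""
    return sorted(s for s, r, d in edges if r == relationship and d == target)

def industries_for(consultant):
    """Two hops: consultant -worksFor-> company -hasIndustry-> industry."""
    return sorted({
        industry
        for company in targets(consultant, "worksFor")
        for industry in targets(company, "hasIndustry")
    })

def consultants_in(industry):
    """Two hops backwards: industry <-hasIndustry- company <-worksFor- consultant."""
    return sorted({
        consultant
        for company in sources("hasIndustry", industry)
        for consultant in sources("worksFor", company)
    })

print(targets("consultant1", "hasSkill"))        # ['projectManagement']
print(sources("hasSkill", "projectManagement"))  # ['consultant1', 'consultant2']
print(industries_for("consultant2"))             # ['healthcare']
print(consultants_in("finance"))                 # ['consultant1']
```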
Graph query languages can be used to answer these types of questions. Three widely used graph query languages are Cypher, Gremlin and SPARQL: Cypher and Gremlin target property graphs, while SPARQL is the query language for RDF. These query languages allow users to traverse the graph and extract the information they are interested in based on patterns of nodes and edges in the graph. In addition, graph algorithms can be used to perform more complex analyses of the graph, such as identifying clusters of highly interconnected entities or finding the shortest path between two nodes.
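As one example of a graph algorithm, the sketch below finds a shortest path with breadth-first search. It treats the edges as undirected for the purpose of path finding, which is an assumption of this example, and the data is hypothetical.

```python
from collections import deque

# Illustrative edges: (source, relationship, target).
edges = [
    ("consultant1", "hasSkill", "projectManagement"),
    ("consultant1", "worksFor", "company1"),
    ("consultant2", "hasSkill", "projectManagement"),
    ("consultant2", "worksFor", "company2"),
    ("company1", "hasIndustry", "finance"),
    ("company2", "hasIndustry", "healthcare"),
]

def neighbours(node):
    """Nodes directly connected to node, ignoring edge direction."""
    linked = set()
    for src, _rel, dst in edges:
        if src == node:
            linked.add(dst)
        elif dst == node:
            linked.add(src)
    return sorted(linked)

def shortest_path(start, goal):
    """Breadth-first search; returns a list of nodes or None if unreachable."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for nxt in neighbours(node):
            if nxt not in visited:
                visited.add(nxt)
                queue.append(path + [nxt])
    return None

print(shortest_path("consultant1", "company2"))
# ['consultant1', 'projectManagement', 'consultant2', 'company2']
```

Here the two consultants are connected only through a shared skill, which is the kind of indirect link a recommendation system might exploit.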
Graph Models Without Ontologies
It is possible to have a graph model without an ontology. In fact, many graph databases and graph-based applications do not use ontologies. In such cases, the graph model is often referred to as a schema-less or schema-free graph.
A schema-less graph allows for more flexibility in data modelling, as it does not require a predefined schema or ontology. Instead, the graph can evolve over time as new nodes and relationships are added. This capability can be advantageous in cases where the data is constantly changing or where the data model is not well understood in advance.
However, a schema-less graph also has some drawbacks. Ensuring data consistency and integrity can be more difficult without a predefined schema or ontology. It can also make it more challenging to query and analyse the data, as there may be less structure to guide the analysis.
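To illustrate what a predefined schema or ontology adds, the sketch below checks edges against a small set of allowed relationship types. The constraint table and data are hypothetical. This is a simplified closed-world check and not how an OWL reasoner interprets domain and range.

```python
# Hypothetical type assignments for the nodes in the graph.
node_types = {
    "consultant1": "Consultant",
    "company1": "Company",
    "finance": "Industry",
    "projectManagement": "Skill",
}

# Allowed relationships: name -> (source type, target type).
allowed = {
    "hasSkill": ("Consultant", "Skill"),
    "worksFor": ("Consultant", "Company"),
    "hasIndustry": ("Company", "Industry"),
}

edges = [
    ("consultant1", "hasSkill", "projectManagement"),  # valid
    ("company1", "hasSkill", "finance"),               # wrong node types
    ("consultant1", "likes", "company1"),              # unknown relationship
]

def validate(edges):
    """Return (edge, reason) pairs for edges that break the constraints."""
    problems = []
    for src, rel, dst in edges:
        if rel not in allowed:
            problems.append(((src, rel, dst), "unknown relationship"))
            continue
        expected = allowed[rel]
        actual = (node_types.get(src), node_types.get(dst))
        if actual != expected:
            problems.append(((src, rel, dst), f"expected {expected}, got {actual}"))
    return problems

for edge, reason in validate(edges):
    print(edge, "->", reason)
```

In a schema-less graph all three edges would simply be stored, and the two problematic ones would surface only later, if at all.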
Therefore, the decision to use a schema-less or schema-based graph depends on the specific requirements of the use case. A schema-based graph with an ontology may be a better choice when the data is well-understood and relatively stable. In cases where the data is highly dynamic and the structure is less well understood, a schema-less graph may be more appropriate.
Why Wouldn't a Graph Model Have an Ontology?
Take a social network as an example. It can certainly have an ontology, and some social networks do use one to help structure and organise their data. An ontology can provide a common vocabulary and set of concepts that can be used to describe the data within the social network, making it easier to share and integrate data with other systems.
That being said, some social networks may choose not to use an ontology for various reasons. For example, if the social network is relatively simple and does not involve many complex data relationships, an ontology may not be necessary. Additionally, some social networks may prioritise flexibility and agility over structure and may choose a schema-less approach to data modelling using a graph database.
However, using an ontology can significantly benefit a social network in many cases. It can help to ensure consistency and accuracy of the data within the network. It can make it easier to integrate data from different sources. It can also provide a framework for automated reasoning and machine learning, which can help to uncover new insights and improve the user experience.
Using Ontologies for Enterprise Scale Graph Models
Organisations need to consider how an ontology will be developed, maintained, and governed when implementing an enterprise knowledge graph. It may be desirable to have a single, centrally-controlled ontology within an organisation to avoid the need for complex ontology alignment.
The process of incorporating an ontology into the organisation's systems and workflow requires careful planning and implementation. Beyond the initial implementation, some functions need to be performed so that the knowledge graph remains fit for purpose.
Monitoring, Maintenance and Ontology Management
Both the ontology and the database should be monitored and maintained over time to ensure that they remain up to date and accurate. This means updating the ontology, graph model and graph database through continual improvement and optimisation, backed up by monitoring of the data quality and performance of the knowledge graph.
The following are some of the activities involved in monitoring and maintaining an ontology:
- Updating the ontology: The ontology should be updated periodically to reflect changes in the domain knowledge and requirements. This involves adding new concepts and relationships, removing outdated ones, and modifying existing ones. The updates should be based on input from domain experts and stakeholders.
- Version control: The ontology should be managed using a version control system to keep track of changes and versions over time. This keeps the history of the ontology consistent and reproducible, and allows for easy rollback in case of errors or issues.
- Quality assurance: The ontology should be monitored for quality, such as consistency, completeness, and accuracy. This can be done using automated tools and manual inspections, and should involve domain experts and stakeholders.
- Usage monitoring: The usage of the ontology should be monitored over time to evaluate its impact and effectiveness. This can be done using metrics such as query frequency, user feedback, and citation analysis.
- Community engagement: The ontology should be promoted and supported by a community of users and stakeholders who can provide feedback, use cases, and contributions. This can be done through outreach, training, and collaboration activities.
- Integration with other systems: The ontology should be integrated with other systems and workflows, such as databases, applications, and decision support systems. This ensures that the ontology is utilised effectively and efficiently, and that the data and knowledge are integrated and interoperable.
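As a small illustration of tracking ontology updates between versions, the sketch below compares two versions held as sets of triples. The triples and the `diff` helper are hypothetical. In practice the ontology files themselves would normally be kept in a version control system.

```python
# Two hypothetical versions of a small ontology, as sets of triples.
version_1 = {
    ("Consultant", "rdfs:subClassOf", "Person"),
    ("Service", "rdfs:subClassOf", "Offering"),
    ("hasSkill", "rdfs:domain", "Consultant"),
    ("hasSkill", "rdfs:range", "Skill"),
}
version_2 = {
    ("Consultant", "rdfs:subClassOf", "Person"),
    ("Certification", "rdfs:subClassOf", "Skill"),
    ("hasSkill", "rdfs:domain", "Consultant"),
    ("hasSkill", "rdfs:range", "Skill"),
}

def diff(old, new):
    """Report which triples were added and removed between two versions."""
    return {"added": sorted(new - old), "removed": sorted(old - new)}

changes = diff(version_1, version_2)
print(changes["added"])    # [('Certification', 'rdfs:subClassOf', 'Skill')]
print(changes["removed"])  # [('Service', 'rdfs:subClassOf', 'Offering')]
```

A report like this could be reviewed by domain experts before a new version of the ontology is released.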
Ontology Governance
Ontology governance is integral to enabling semantic application development in an enterprise setting. It involves establishing policies, procedures, and guidelines for ontology development, maintenance, and usage, as well as assigning roles and responsibilities to the stakeholders involved in the process.
Establishing ownership and stewardship is an important first step when implementing ontology governance. This means determining who owns the ontology and who is responsible for its maintenance and updates over time.
Ontology governance and ontology management, discussed previously, are not the same thing. Ontology management focuses on the technical aspects of managing the ontology data, such as storage, retrieval, and integration with other systems. It involves using ontology editors, version control systems, and other software tools to create, edit, store, and retrieve ontology data.
Ontology governance, on the other hand, focuses on the framework within which the management work takes place, including establishing policies and procedures for managing the ontology. It involves assigning ownership, establishing quality control, and managing the lifecycle of the ontology. Ontology governance is therefore a broader concept than ontology management: without governance, management activities may not effectively meet an organisation's objectives.