Upon completion of this lesson, you will be able to:
An ontology is a formal representation of a set of concepts within a domain and the relationships between them. It is used to provide a shared understanding of the structure of information and the meaning of terms used in a specific domain or knowledge area. Ontologies are needed because tasks such as information retrieval, data integration, and reasoning depend on a common and unambiguous representation of knowledge. In addition, ontologies can facilitate communication and collaboration among people and systems by providing a shared vocabulary and common framework for representing and organizing information.
Ontologies are an essential aspect of describing a domain prior to designing shared information spaces, object-oriented designs, or data stores.
This lesson introduces a set of structures for describing an ontology in narrative form using English. It can be used alongside a visual rendering of the ontology, such as a diagram using the Unified Modeling Language (UML).
An ontology describes a domain. In the context of ontologies, a “domain” typically refers to a specific area or subject matter that an ontology is designed to represent and describe. It is described from a perspective and requires these elements:
In an ontology, we furthermore distinguish between three types of relationships:
A term is a noun or noun phrase that has a special meaning in the domain. Terms describe concepts and are commonly expressed as entities (classes).
Example terms for the domain of car rental business:
A term is an entity in the business domain, e.g., customer, invoice, sale, vehicle. Use information discovery methods, such as interviews or document analysis, to discover terms from use cases and other narrative descriptions of business processes and system interactions. During discovery, listen for nouns in descriptions of processes.
Terms generally have properties that describe information about them such as attributes:
Attributes have an associated value domain or data type. For example, the value domain of make is the enumerated list of all makes of vehicles, such as {Volkswagen, Audi, Mercedes, Dodge, Ford, Fiat, Tesla, …}. The enumerated lists are finite, although they may be quite large.
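As a minimal sketch, an enumerated value domain can be modeled in Python with an enumeration. The names `Make`, `Vehicle`, and `parse_make` are illustrative assumptions rather than part of the lesson, and the members cover only the makes listed above.

```python
from dataclasses import dataclass
from enum import Enum


class Make(Enum):
    """Value domain of the 'make' attribute: a finite, enumerated list."""
    VOLKSWAGEN = "Volkswagen"
    AUDI = "Audi"
    MERCEDES = "Mercedes"
    DODGE = "Dodge"
    FORD = "Ford"
    FIAT = "Fiat"
    TESLA = "Tesla"


@dataclass
class Vehicle:
    """A term (class) with one attribute drawn from the value domain."""
    make: Make


def parse_make(text: str) -> Make:
    """Return the Make for a string; raises ValueError if outside the domain."""
    return Make(text)


car = Vehicle(make=parse_make("Audi"))
```

Because the value domain is an enumeration, a value outside the list (for example `parse_make("Yugo")`) is rejected with a `ValueError` instead of being stored silently.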
Terms are described by nouns, or by one or more adjectives plus a noun, drawn from a language (English, Portuguese, German, Mandarin, etc.). A term can sometimes be described by more than one word, and an ontology needs to express all synonymous words for a term. A single word can also sometimes describe different terms. Finally, a word in one language may describe a different term than the same word used in another language. If no word has yet been ascribed to a concept, then either a new word is invented or a word is borrowed from another language.
For example, the term for a personal wireless communication device using cellular technology is “Cell Phone” in US English, but in German it is referred to by the (made-up) term “Handy”. The term “Mobile Phone” is used in British English and, as a synonym, in US English. So, a concept can be described by more than one word.
Consider the term “bank” (English language), which can describe two different concepts depending on the context. In one context, “bank” refers to a financial institution where individuals and businesses can deposit money, obtain loans, manage their accounts, and conduct various financial transactions. This concept represents a place where people store and manage their money, and it’s commonly associated with services like checking and savings accounts, mortgages, and loans. In another context, “bank” can refer to a natural landform along the side of a river. In this concept, a “river bank” represents the sloped or elevated area adjacent to a river or other bodies of water. River banks are often composed of soil, sand, or other materials and play a role in controlling water flow, erosion, and providing habitats for various organisms.
As we can see, the term “bank” can have different meanings and represent distinct concepts depending on whether it is used in a financial or environmental context.
The same word can also describe different concepts in different languages. For example, the term “old timer” describes an old person in English but an antique car in German.
This illustrates how context is crucial in understanding the intended meaning of a term, and why ontologies and knowledge representation systems aim to disambiguate such terms to ensure clarity and precision in communication and data interpretation. One strategy is to define every term with a glossary definition and provide example instances.
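One way to picture the glossary strategy is a lookup keyed by both the word and its context. The sketch below is an assumption about how such a glossary might be stored; the context labels `financial` and `environmental` and the function `define` are hypothetical, and the definitions paraphrase the “bank” example.

```python
# Glossary keyed by (word, context) so that one word can carry several concepts.
GLOSSARY = {
    ("bank", "financial"): (
        "a financial institution where individuals and businesses "
        "deposit money and obtain loans"
    ),
    ("bank", "environmental"): (
        "the sloped or elevated area adjacent to a river "
        "or other body of water"
    ),
}


def define(word: str, context: str) -> str:
    """Look up the definition of a word within a given context."""
    return GLOSSARY[(word.lower(), context)]


financial_meaning = define("bank", "financial")
river_meaning = define("Bank", "environmental")
```

A lookup without a matching context raises `KeyError`, which makes a missing or ambiguous definition visible rather than guessed.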
A fact is an association (relationship) between two or more terms or a term and a property. Example facts for the domain of car rental:
Facts can be stated in more than one way. A visual data model (e.g., entity-relationship or UML class diagram) is a visual representation of facts.
A constraint (or rule) describes a policy, guideline, standard, or regulation that is present in a domain. It is a statement that defines or constrains some aspect of a domain and is intended to assert structure. It may also control or influence the behavior of people, organizations, or information systems.
Constraint and rule definitions must be atomic, i.e., they must not be further decomposable, and they may not be context-dependent.
To illustrate the notion of a constraint, consider these possible constraint rules for a rental car agency:
Constraints are initially expressed as informal narratives during ontology analysis and information discovery. Then, to remove ambiguity, they are transformed into structured narratives. The structured narratives can be augmented with decision tables, decision trees, and UML models.
Constraints are sometimes referred to as facets in ontology design.
Constraints may include the following types:
Some constraints express an assertion. An assertion is a type of constraint that specifies when an action is allowed to occur and when it is not. It includes conditions and authorizations. For example, the following are some assertions for a car rental business domain:
A derivation is a rule that infers a value from other values (generally attributes). Example derivations for a car rental business:
if rental date is during peak season, then increase the base daily rental charge by 10%
rental charge = (daily rental charge * number of rental days) +
(hourly rental charge * number of overdue hours) + sales tax + (number of gallons of fuel * fuel price) + insurance
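The two derivations above can be sketched as plain functions. This is only an illustration: the function and parameter names are assumptions, and every number in the example call is made up.

```python
def daily_rate(base_daily_charge: float, peak_season: bool) -> float:
    """Increase the base daily rental charge by 10% during peak season."""
    if peak_season:
        return base_daily_charge * 1.10
    return base_daily_charge


def rental_charge(daily_charge: float, rental_days: int,
                  hourly_charge: float, overdue_hours: int,
                  sales_tax: float, gallons_of_fuel: float,
                  fuel_price: float, insurance: float) -> float:
    """Derive the rental charge from its component values."""
    return (daily_charge * rental_days
            + hourly_charge * overdue_hours
            + sales_tax
            + gallons_of_fuel * fuel_price
            + insurance)


# Hypothetical rental: 3 peak-season days, 2 overdue hours, 5 gallons of fuel.
rate = daily_rate(50.0, peak_season=True)
total = rental_charge(rate, 3, 10.0, 2, 12.0, 5, 4.0, 15.0)
```

A production system would more likely use a decimal type for currency; floats are used here only to keep the sketch short.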
During the analysis process, identify any consequences that would arise if a constraint were violated, i.e., what would happen if a constraint were not adhered to.
In addition, it is essential to determine how a constraint might be (or should be) enforced. Enforcement could be automated through an information system, an organizational process, by an individual through checks-and-balances, or via an external agency (e.g., regulatory body, auditor).
A well-written constraint definition is:
Constraints express rules of a business: they are business rules. Information systems, processes, and policy statements enforce rules.
The sentences below (not in Structured English) express various ontology elements for the domain of book publishing.
“Rules build on facts, and facts build on concepts as expressed by terms. Terms express business concepts; facts make assertions about these concepts; rules constrain and support these facts.”
Defining the elements of an ontology using Structured (English) Narratives forces discipline on the part of the ontology designer and expresses the elements with less ambiguity. Each of the elements of an ontology has one or more corresponding narrative structures that should be used when defining an ontology instead of colloquial English.¹
Structured Narratives are a modified form of English for precisely specifying constraints and rules. They use a subset of English that is limited to:
Let’s take a look at how each element of an ontology can be defined as a Structured English expression.
Terms represent objects, transactions, events, roles, people, or abstract concepts from a business domain and are expressed in a structured narrative statement as an italicized noun.
Terms should be defined with a form of “glossary” entry using the phrase IS DEFINED AS.
A vehicle IS DEFINED AS a commercial automobile, truck, SUV, or van that is available for sale at the dealership.
As the definition is in Structured English, it is presumed that the words describing the term are drawn from the vocabulary of English and as defined in English.
Facts connect terms with a verb phrase that defines the nature of the relationship. They commonly use either a verb phrase from the business domain or a verb phrase for a specific type of relationship, e.g., “is part of” or “contains” for a partonomy, “is a kind of” or “is a type of” for generalization taxonomies, and “is linked to” or “is associated with” for associations.
A truck IS A TYPE OF vehicle.
Every vehicle HAS A vehicle identification number.
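If the ontology is later carried into an object-oriented design, these two facts might map onto a class hierarchy roughly as follows. The mapping is one possible reading, not a prescribed translation, and the VIN value is a placeholder.

```python
from dataclasses import dataclass


@dataclass
class Vehicle:
    # Every vehicle HAS A vehicle identification number (required attribute).
    vin: str


@dataclass
class Truck(Vehicle):
    """A truck IS A TYPE OF vehicle (generalization becomes inheritance)."""


truck = Truck(vin="VIN-0001")  # placeholder value, not a real VIN
```

Here the IS A TYPE OF fact becomes a subclass relationship, and the HAS A fact becomes an attribute that must be supplied when an instance is created.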
Aliases are a form of fact where an alternative term is introduced. Aliases are specified with the phrase IS ALSO KNOWN AS.
Vehicle identification number IS ALSO KNOWN AS VIN.
Customer IS ALSO KNOWN AS Client.
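In an information system, aliases are often resolved to a single canonical term before any lookup. A minimal sketch, assuming a hypothetical `canonical_term` helper and case-insensitive matching:

```python
# Alias -> canonical term, taken from the two IS ALSO KNOWN AS facts above.
ALIASES = {
    "vin": "vehicle identification number",
    "client": "customer",
}


def canonical_term(word: str) -> str:
    """Return the canonical term for a word, or the word itself if no alias exists."""
    key = word.strip().lower()
    return ALIASES.get(key, key)


resolved = [canonical_term(w) for w in ("VIN", "Client", "vehicle")]
```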
Structured narratives help to avoid (or at least reduce) ambiguity when defining constraints. The following is an example of a structured narrative definition of a constraint rule in a natural language:
It must always hold that a customer’s driver’s license has an expiration date at least one month past the end date of the vehicle rental period.
This constrains the value of an attribute within a class. Think about how this might be enforced.
Let’s take a look at specific phrases that define different types of constraints. The “keywords” are often written in all upper case to distinguish them from constraint definitions. Terms are often in mixed upper/lower case and italicized.
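One possible enforcement is an automated check at the time a rental is booked. The sketch below assumes that “one month” means the same day of the following calendar month, clamped to the end of shorter months; that interpretation and both function names are assumptions, not part of the rule itself.

```python
import calendar
from datetime import date


def add_one_month(d: date) -> date:
    """Return the same day in the next month, clamped to that month's last day."""
    year = d.year + 1 if d.month == 12 else d.year
    month = 1 if d.month == 12 else d.month + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))


def license_is_acceptable(rental_end: date, license_expiration: date) -> bool:
    """Check that the license expires at least one month after the rental ends."""
    return license_expiration >= add_one_month(rental_end)


ok = license_is_acceptable(date(2024, 6, 1), date(2024, 7, 1))
too_soon = license_is_acceptable(date(2024, 6, 1), date(2024, 6, 30))
```

Writing the check forces the ambiguity in “one month” into the open, which is exactly the kind of question a structured narrative should settle.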
Comments may be added to constraint definitions or any other structured narrative statement by enclosing them in curly braces, e.g., { … }.
Key values are attributes or combinations of attributes that uniquely identify every instance of a class, i.e., a primary key. Key values are specified with the phrase IS UNIQUELY IDENTIFIED BY.
A vehicle IS UNIQUELY IDENTIFIED BY its vehicle identification number.
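A key constraint is typically enforced by the data store, for example as a primary key. The same idea can be sketched in memory with a dictionary keyed by the identifier; `VehicleRegistry` and the sample values are hypothetical.

```python
class VehicleRegistry:
    """Stores vehicles keyed by VIN and rejects duplicate keys."""

    def __init__(self):
        self._vehicles = {}

    def add(self, vin, description):
        """Register a vehicle; the VIN must not already be present."""
        if vin in self._vehicles:
            raise ValueError(f"duplicate vehicle identification number: {vin}")
        self._vehicles[vin] = description

    def get(self, vin):
        """Return the vehicle registered under the given VIN."""
        return self._vehicles[vin]


registry = VehicleRegistry()
registry.add("VIN-0001", "blue van")  # placeholder VIN
```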
These rules define conditions on associations between classes (terms) or on property (attribute) values that must always be true.
They are expressed with IT MUST ALWAYS HOLD THAT clauses:
IT MUST ALWAYS HOLD THAT a vehicle rented to a customer is located on the lot.
IT MUST ALWAYS HOLD THAT a vehicle has mileage less than 100,000.
IT MUST ALWAYS HOLD THAT a project has no more than 12 members.
IT MUST ALWAYS HOLD THAT for any given retirement account the account owner is not the primary beneficiary.
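Each IT MUST ALWAYS HOLD THAT rule can be read as a predicate that returns true whenever the rule is satisfied. The following sketch covers three of the examples above; the function names and the choice of plain arguments instead of domain classes are assumptions made for brevity.

```python
MAX_MILEAGE = 100_000       # a vehicle has mileage less than 100,000
MAX_PROJECT_MEMBERS = 12    # a project has at most 12 members


def mileage_holds(mileage: int) -> bool:
    """True when the vehicle's mileage is strictly below the limit."""
    return mileage < MAX_MILEAGE


def project_size_holds(members: list) -> bool:
    """True when the project has no more than the allowed number of members."""
    return len(members) <= MAX_PROJECT_MEMBERS


def beneficiary_holds(account_owner: str, primary_beneficiary: str) -> bool:
    """True when the account owner is not the primary beneficiary."""
    return account_owner != primary_beneficiary


checks = [
    mileage_holds(42_000),
    project_size_holds(["member"] * 12),
    beneficiary_holds("owner-1", "owner-2"),
]
```

An application would evaluate such predicates whenever the underlying data changes and reject any update that makes one of them false.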
Conditional assertions are rules that define what conditions must be met before an activity is allowed to take place.
They are defined with WHEN and IF clauses:
WHEN the in-stock level is below the reorder threshold
THEN place order with distributor
WHEN an order is placed by a customer
IF the customer does not have a line of credit
THEN request deposit ELSE waive deposit
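Conditional assertions resemble event handlers: WHEN names the triggering event, IF the condition, and THEN/ELSE the resulting action. A minimal sketch, in which the handler names are assumptions and the action strings simply echo the rules above:

```python
def on_stock_level_changed(in_stock: int, reorder_threshold: int) -> list:
    """WHEN the in-stock level is below the reorder threshold THEN place an order."""
    if in_stock < reorder_threshold:
        return ["place order with distributor"]
    return []


def on_order_placed(customer_has_line_of_credit: bool) -> str:
    """WHEN an order is placed: request a deposit unless the customer has credit."""
    if not customer_has_line_of_credit:
        return "request deposit"
    return "waive deposit"


actions = on_stock_level_changed(in_stock=3, reorder_threshold=10)
deposit_action = on_order_placed(customer_has_line_of_credit=False)
```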
These rules describe derivations for facts that can be inferred from other facts. They are specified with IF AND ONLY IF and IF … THEN clauses:
An item is in inventory
IF AND ONLY IF
the item is carried by at least one distributor.
IF an item is no longer made
THEN the item is discounted by 20%
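An IF AND ONLY IF rule derives a fact in both directions, so it can be computed rather than stored, whereas an IF … THEN rule fires in one direction only. A rough sketch with hypothetical function names and sample data:

```python
def is_in_inventory(distributors: list) -> bool:
    """An item is in inventory IF AND ONLY IF at least one distributor carries it."""
    return len(distributors) >= 1


def discounted_price(price: float, still_made: bool) -> float:
    """IF an item is no longer made THEN it is discounted by 20%."""
    if not still_made:
        return price * 0.80
    return price


carried = is_in_inventory(["distributor A"])
not_carried = is_in_inventory([])
sale_price = discounted_price(100.0, still_made=False)
```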
These derivation rules describe processing algorithms or mathematical formulae and are specified with IS COMPUTED AS FOLLOWS clauses:
The default retail price of an item
IS COMPUTED AS FOLLOWS
retail price = cost * (1 + markup)
where default markup = 20% (0.20)
{does not include tax}
Computational derivations are most common for derived attributes.
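The computation above translates almost directly into code. The sketch uses Python's `decimal` module to avoid floating-point rounding on currency values; the function name and the sample cost are assumptions.

```python
from decimal import Decimal

DEFAULT_MARKUP = Decimal("0.20")  # default markup = 20%


def default_retail_price(cost: Decimal, markup: Decimal = DEFAULT_MARKUP) -> Decimal:
    """retail price = cost * (1 + markup); does not include tax."""
    return cost * (1 + markup)


price = default_retail_price(Decimal("50.00"))
```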
This is more of an anti-constraint in that it specifically relaxes some restriction or explicitly permits an action, relationship, multiplicity, or attribute value. It is specified with the phrase IT IS POSSIBLE THAT or IT IS ALLOWED THAT. Naturally, the negative can be indicated by adding NOT.
IT IS POSSIBLE THAT
a vehicle is sold for which the title has not been received.
IT IS NOT ALLOWED THAT
a vehicle with a Canadian title is sold in the US.
Constraint rules may be applicable:
A scope restriction may be specified before the narrative statement and should be enclosed in parentheses. The use of a different font and color is optional but can be useful to distinguish a scope restriction from the constraint definition.
(during order processing)
IT MUST ALWAYS HOLD THAT
an item that is sold to a customer is in stock
(after 7/1/87)
A person is a contractor IF AND ONLY IF
the person works for at least one other company
Although it is not necessary to specify universally applicable scope (as that is implied by the absence of a scope restriction), it may nevertheless be done for clarity.
(always)
IT MUST ALWAYS HOLD THAT
a contractor is paid an hourly wage
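A scope restriction can be thought of as a guard that decides whether a rule applies at all. The sketch below models the “(after 7/1/87)” example; reading that date as July 1, 1987 (month/day/year) is an assumption, as are the function names and the choice to return `None` when the rule is out of scope.

```python
from datetime import date

SCOPE_START = date(1987, 7, 1)  # "(after 7/1/87)", assumed to be month/day/year


def rule_in_scope(as_of: date) -> bool:
    """The rule applies only strictly after the scope start date."""
    return as_of > SCOPE_START


def is_contractor(other_companies: int, as_of: date):
    """Apply the IF AND ONLY IF rule when in scope; return None when out of scope."""
    if not rule_in_scope(as_of):
        return None
    return other_companies >= 1


in_scope_result = is_contractor(2, date(1990, 1, 1))
out_of_scope_result = is_contractor(2, date(1985, 1, 1))
```

Returning `None` keeps “the rule does not apply” distinct from “the rule applies and the person is not a contractor”.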
Follow these general guidelines when writing constraint rules as Structured Narratives:
Slide Deck: Structured Narratives
¹ or any other spoken language