
729G78 Artificial Intelligence

Lab 3: Knowledge Representation


Purpose

The purpose of this lab is to learn how concepts, properties, and relations between concepts can be modeled and represented using ontologies. In this lab you will be using Protégé (https://protege.stanford.edu/), which is one of the most popular ontology modeling tools available. Using this tool, you will get some practical experience in creating simple ontologies.

Preparation

In preparation for this lab you should:

  • Read the full lab instructions.
  • Download and rename the exercises document Lab3Exercises_LiU-ID-1_LiU-ID-2.odt to match your group's LiU-IDs.
  • Read up on OWL. You don't need to understand all the details but try to understand its role in the context of the Semantic Web. We shall revisit the necessary OWL constructs throughout the lab.
  • For those working on their own computer: download any recent version of Protégé. You also need to install the Pellet plugin: File/Check for plugins, then check and install Pellet Reasoner Plug-in (Note: Depending on the Protégé version the plugin may not be listed and may require manual installation specific to your OS).

Description

As part of this lab, you will define two ontologies using the ontology modeling tool Protégé. Protégé is an open-source platform for constructing domain models using OWL, which is the most widely used ontology language. The tool is based on Java and can be run on most common platforms. Throughout this lab we will be running version 5.2.0, but both older (within limits) and newer versions should work just fine.

Note: There is a web version of Protégé but we strongly advise against using it for this particular lab.

Introduction

Before we get started, we need to get an idea of some fundamental concepts of modeling using ontologies. The list below is by no means exhaustive; however, a relatively firm grip on the concepts listed below is required to get through this lab. There are many good resources for learning more about both OWL and Protégé. Depending on your platform and version, the visual appearance of Protégé may differ between tutorials. If any tabs are missing you can find them under Window/Tabs.

Reasoning: Reasoning (or inferencing) is a process by which domain knowledge in the ontology is used to infer new information that is somehow implicitly represented. For example, a female parent can be inferred to be a mother. It can also be used to detect some types of violations in the data.
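The mother example can be sketched in Manchester OWL syntax as follows (the class names Woman, Parent, and Mother, and the individual mary, are illustrative and not part of the lab ontologies):

    Class: Mother
        EquivalentTo: Woman and Parent

    Individual: mary
        Types: Woman, Parent

Given these axioms, a reasoner will infer that mary is also an instance of Mother, even though this was never stated explicitly.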

Tip: Full OWL is often not used in real-world domains due to the computational complexity of the language. For this reason, several subsets (or profiles) of the language have been introduced.

Classes: Classes (also referred to as concepts) are the focus of most ontologies and describe concepts in the domain. For example, we can let the class wine represent the beverage we typically refer to as wine. This class can also have subclasses that represent concepts that are more specific than the superclass, such as red, white, and rosé wines. A class can also be the subclass of more than one concept. For example, the 1947 Château Cheval Blanc is not only a red wine but also a vintage wine, since it is both old and expensive ($304,375 per bottle).
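The wine example could be sketched as the following class hierarchy in Manchester OWL syntax (the class names are made up for illustration):

    Class: Wine
    Class: RedWine
        SubClassOf: Wine
    Class: VintageWine
        SubClassOf: Wine
    Class: ChevalBlanc1947
        SubClassOf: RedWine, VintageWine

Note that ChevalBlanc1947 is declared a subclass of two classes at once.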

Instances: An instance of a class inherits the properties of the class and can be said to represent "actual" data. Instances can belong to more than a single class.

Properties: Properties (also referred to as slots or predicates) link a concept or instance to some attribute. Properties will often be restricted in terms of a domain and a range. For example, we can state that the property hasFirstName requires a person as the subject (domain) and a string as the attribute value (range). The attribute can also be another concept or instance, allowing us to link concepts together as a directed graph.
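In Manchester OWL syntax the hasFirstName example might look as follows (hasFriend is an invented property, included only to illustrate a property whose range is another concept):

    DataProperty: hasFirstName
        Domain: Person
        Range: xsd:string

    ObjectProperty: hasFriend
        Domain: Person
        Range: Person

Data properties link instances to literal values, while object properties link instances to other instances, which is what gives the ontology its graph structure.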

Tip: In general terms, the ontology is knowledge about the domain rather than the actual data. In reality, the line between a domain ontology and data is often blurred. Just remember that an ontology means something specific in this particular context.

Class expressions: Class expressions are used to define restrictions on classes. A class can be specified as a subclass of or equivalent to several such expressions. The class expression syntax for Protégé is described here.

Tip: Protégé uses the Manchester OWL syntax to define these restrictions but the UI components provide some assistance.
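As a sketch of what such a restriction can look like in the Manchester syntax (the property hasColor and the individual red are invented names):

    Class: RedWine
        EquivalentTo: Wine and (hasColor value red)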

Open-world assumption (OWA): This is in some ways the most difficult concept to grasp in this lab. The basic idea is very intuitive: just because we don't know something doesn't mean we can assume that it is false. For example, if we don't know a person's first name we shouldn't assume that the person has no first name. When we apply reasoning, however, this can have some surprising side effects. For example, assume that a Person can be married to at most one person. Now, John is married to both Judy and Judith. What does this mean? Well, if both the ontology and the data are correct, then the only possible explanation is that Judy and Judith must be the same person.
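The marriage example can be expressed in the Manchester syntax roughly as follows (marriedTo, John, Judy, and Judith are illustrative names; a functional property is one for which an individual can have at most one value):

    ObjectProperty: marriedTo
        Characteristics: Functional

    Individual: John
        Facts: marriedTo Judy, marriedTo Judith

Under the open-world assumption the reasoner does not flag this as an error. Instead it infers that Judy and Judith denote the same individual. Only if you explicitly state that they are different individuals does the ontology become inconsistent.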

Part 1

Run Protégé from somewhere in your home directory using the following terminal command:

$ /courses/729G78/Lab3/Protege-5.2.0/run.sh

Note: Pay attention to the terminal output. If something goes wrong there will usually be a hint about what caused it.

Now it's time to familiarize yourself with Protégé. We'll begin by creating a simple ontology.

  1. Open the classes tab and create the following classes, where Student should be a subclass of Person:
    • Person
    • Student
    • Course
  2. Make Person and Course disjoint.
    Complete Exercise 1.

  3. Define the following properties:
    • takesCourse (domain: Person, range: Course)
    • hasFirstName (domain: Person, range: xsd:string)
    • courseTakenBy (the inverse of takesCourse)

    Tip: Data type properties take simple values as their range. Object properties take instances as their range.

  4. Create three instances of owl:Thing (John, Mary, Judith) and give them each a first name. Create three instances of Course and define John and Mary to be taking one of these courses.

    Tip: Instances are sometimes called individuals.

  5. Run the reasoner: Reasoner/Start reasoner (select the Pellet reasoner if it's not already selected).
    Complete Exercise 2.
  6. Add a class restriction to Student. A student should be equivalent to a person who takes at least one course. Run the reasoner again.
    Complete Exercise 3.
  7. Save the ontology as Lab3_Part1_LiU-ID-1_LiU-ID-2.owl, renaming the file to match your group's LiU-IDs.
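For reference, the restriction in step 6 can be written in the Manchester syntax roughly as follows (this is one possible phrasing; check that it matches your own class and property names):

    Class: Student
        EquivalentTo: Person and (takesCourse some Course)

With this axiom in place, running the reasoner should reclassify every person with an asserted takesCourse fact as a Student.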

Part 2

The development of an ontology is usually an iterative process that roughly involves:

  • defining the classes in the ontology
  • arranging the classes in a taxonomic (subclass-superclass) hierarchy
  • defining the properties with domain and range restrictions
  • arranging the properties in a taxonomic (subproperty-superproperty) hierarchy

In this part of the lab we will develop an animal ontology. There are variations in the ways the ontology can be implemented, but before handing in the ontology make sure that it is consistent and that reasoning works as you intended.

  1. We will begin by defining the classes of the ontology. For each of the classes annotate it with a short description (use rdfs:comment).
    • Animals are divided into vertebrates and invertebrates
    • Plants are not animals
    • Insects and arachnids are invertebrates
    • Birds, mammals and reptiles are vertebrates
    • Herbivore, carnivore and omnivore are disjoint classes
    • Spider is an arachnid
    • Wasp is an insect
    • Butterfly is an insect
    • Penguin is a bird
    • Bear is a mammal
    • Crocodile is a reptile

    Tip: "Add subclasses" allows you to specify more than one class at a time.

  2. Next we determine the required properties. For each property set the domain and range and annotate it with a short description (use rdfs:comment).
    • Reptiles, insects and arachnids are not warmblooded
    • Birds and mammals are warmblooded
    • Insects have exactly 6 legs
    • Arachnids have exactly 8 legs
    • Arachnids have at least 6 eyes
    • Mammals, reptiles and birds have 2 eyes
    • Spiders eat only animals
    • Wasps sometimes eat animals
    • Bears eat both plants and animals
    • Crocodiles eat only animals
    • Butterflies eat only plants
    • Herbivores are animals who only eat plants
    • Carnivores are animals who only eat animals
    • Omnivores are animals who eat both animals and plants
  3. Note: Carefully consider whether the properties should be data properties or object properties.

  4. Specify class restrictions for each of the classes above, running the reasoner over the ontology frequently. Chances are that the results are not quite what you expected. Red markings indicate that nothing can be classified as that class (probably due to the presence of two mutually exclusive constraints). Update the model and re-run the reasoner until you are satisfied that it represents your intentions. After running the reasoner, crocodiles and spiders should be classified as carnivores, butterflies as herbivores, bears as omnivores, and there should be no obvious errors in the inferred class hierarchy.
    Complete Exercise 4.
  5. Tip: Class restrictions are made as either "equivalent to" or "subclass of". The first one states that some properties are sufficient to be classified as something. For example, a person who studies some course is a student. The second states that the properties are only required by the class. For example, all computers are powered by electricity, but not all things powered by electricity are computers.

    Tip: It is sometimes useful to give a specific value or to specify bounds for a certain property. When using the Manchester syntax it might be tempting to use exactly, min, or max for these cases, but this is not correct since these actually refer to the cardinality of the relations! Instead use value when specifying a specific value. To supply bounds to a value the format is somewhat different. For example, Person and hasAge some xsd:int [>17] would define "a person who is more than 17 years old".

    Caution: Universal restrictions can be a bit tricky. For example, the rule that "brunettes have only brown hair" would not prevent a reasoner from inferring that a bald person is a brunette (since there is no contradiction everything is fine, right?). It is therefore sometimes necessary to combine it with some existential quantifier like "brunettes have only brown hair and have some hair".

  6. Create two instances each of the following classes: Spider, Butterfly, and Bear. Use your imagination to come up with names for the new instances, and add each name as an rdfs:label on the instance. Run the reasoner.
    Complete Exercise 5.

    Note: Typically, the data will import the ontology, allowing the two to remain separated. For simplicity we here add our instances to the ontology.

  7. VG only: Add a new subclass to Invertebrate called Mysterious Bug. The Mysterious Bug has 8 legs, eats only meat, and has 8 eyes. Modify the Spider class by adding a restriction to correctly classify the new animal as a Spider. Make sure that the previous inferencing capabilities remain intact.
    Complete Exercise 6.

    Tip: Depending on how you've chosen to implement your ontology this may require some modifications. Remember that there is a difference between equivalentTo and subClassOf.
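To make the tips above concrete, here is a sketch of how a couple of the statements could be expressed in the Manchester syntax (hasLegCount and eats are invented property names; your own design may differ, for example you may prefer an object property for legs):

    Class: Insect
        SubClassOf: Invertebrate,
                    hasLegCount value 6

    Class: Carnivore
        EquivalentTo: Animal
                      and (eats some Animal)
                      and (eats only Animal)

Here hasLegCount value 6 assigns a specific value rather than a cardinality, and combining eats some with eats only avoids the universal-restriction pitfall described in the caution above.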

Hand-in

  • The ontology file for Part 1
  • The ontology file for Part 2
  • The exercises document (as PDF)
  • Upload your files to Lisam.

Page responsible: Robin Keskisärkkä