Hands-On Sessions: SHACL

Overview

SHACL is a language for validating RDF graphs against a set of conditions, referred to as shapes. The language can support a variety of use cases besides validation, including user-interface generation, data integration, data transformation, and inferencing. The language became a W3C Recommendation in July 2017, filling an important gap in the Semantic Web technology stack. Its most important contribution is a standardized way to validate RDF data, which previously required non-standardized solutions. This tutorial focuses on the features of SHACL Core and their use in validation.
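To make the idea of a shape concrete, here is a minimal sketch of a shapes graph and a small data graph in Turtle. The ex: namespace, the ex:Person class and the ex:name property are made up for illustration and are not part of the tutorial data. The two graphs are shown in one block only for brevity.

```turtle
@prefix sh:  <http://www.w3.org/ns/shacl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix ex:  <http://example.org/ns#> .   # hypothetical namespace

# --- Shapes graph ---
# A node shape that applies to every instance of ex:Person.
ex:PersonShape
    a sh:NodeShape ;
    sh:targetClass ex:Person ;
    # A property shape: each person needs at least one ex:name,
    # and every ex:name value must be an xsd:string literal.
    sh:property [
        sh:path ex:name ;
        sh:datatype xsd:string ;
        sh:minCount 1 ;
    ] .

# --- Data graph ---
ex:alice a ex:Person ; ex:name "Alice" .   # conforms
ex:bob   a ex:Person ; ex:name 42 .        # violates sh:datatype (42 is an xsd:integer)
ex:carol a ex:Person .                     # violates sh:minCount (no ex:name)
```

A validator takes the two graphs as input and produces a validation report with one result per violation, here one for ex:bob and one for ex:carol.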

Instructions for students

For each exercise, keep a copy of your modified shapes graph. Collect your solutions to all of the exercises below in a single .txt file and email it to robin.keskisarkka@liu.se.

Hands-on Exercises

The hands-on exercises will be done on an online platform prepared for the tutorial: SHACL Tutorial Playground. The platform is based on SHACL.js, which is an open-source implementation of SHACL. In the exercises, you will create new shapes and modify existing ones to validate data from a subset of the Nobel Prize dataset. Some of the exercises will require you to modify the data graph to verify that your shapes have been designed correctly. Refer to Section 2.1 of the ontology specification for an overview of the ontology of this data. The SHACL properties required for the exercises are described in the tutorial slides and the SHACL by example slides. The tasks are ordered by increasing complexity, and at the end you will have a grasp of the core features of SHACL and how they can be used for validating RDF data. Tasks marked as "advanced" add complexity to a previous task. It may be useful to have a look at the SHACL spec for the advanced exercises.

Keep a backup of the shapes you are working with in case you need to reload the web page.

Start by selecting Data 1 and Shapes 1 in the dropdowns. Run the validator and try to understand why the validation fails. Modify the shape so that no violations are generated.
Add additional property shapes to validate the datatypes for foaf:givenName, foaf:familyName and foaf:name as xsd:string. Make sure to edit the data graph to verify that your new constraints are working correctly.
(Advanced) Try to model the name constraints in the previous task with a single property shape using sh:alternativePath.
Select Data 2 and Shapes 2. Add a property constraint to ensure that all laureates have at least one dbo:affiliation to some dbo:University.
(Advanced) The shape above probably doesn't take into account other types of affiliations. Modify the shape so that only one of the affiliations is required to be a university. Use sh:qualifiedValueShape and sh:qualifiedMinCount to add this new constraint.
Add another property shape that requires dbp:dateOfBirth and foaf:birthday to have equal values. Will the shape validate if one of the properties is missing?
(Advanced) Try to model a missing property in the task above using sh:or, sh:and, and/or cardinality restrictions.
Select Data 3 and Shapes 3 and confirm that the data validates correctly against the shapes graph. Now, create property shapes that validate foaf:name, foaf:givenName and foaf:familyName with regular expressions using sh:pattern. Names should not be allowed to be shorter than two characters and must not contain underscores or numbers. (Tip: use https://regex101.com/ to develop your regular expressions)
(Extra) Take the validation to the next level by polishing your regular expression and adding your own constraints regarding legal names. Note that the regular expression is not meant to accommodate erroneous data, so if you detect an obvious error in the data during validation, feel free to update the data graph rather than incorporating the error into the validation.
Add a property constraint that limits the values of foaf:gender to either "male" or "female".
Add a constraint that requires all laureate awards to have a nobel:motivation that is a string of 10 characters or more. Allow only one motivation per language tag.
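
As a syntax reminder, the sketch below shows how several of the constraint components named in the tasks are written in Turtle. It is an illustration under assumed names, not a solution: the ex: namespace, the ex:Book and ex:Author classes and all ex: properties are hypothetical and do not occur in the Nobel Prize dataset.

```turtle
@prefix sh:  <http://www.w3.org/ns/shacl#> .
@prefix ex:  <http://example.org/ns#> .   # hypothetical namespace

ex:BookShape
    a sh:NodeShape ;
    sh:targetClass ex:Book ;

    # One property shape covering two properties via an alternative path;
    # every value of either property must match the regular expression.
    sh:property [
        sh:path [ sh:alternativePath ( ex:isbn10 ex:isbn13 ) ] ;
        sh:pattern "^[0-9-]+$" ;
    ] ;

    # Restrict values to a fixed list.
    sh:property [
        sh:path ex:format ;
        sh:in ( "hardcover" "paperback" ) ;
    ] ;

    # Minimum string length, and at most one value per language tag.
    sh:property [
        sh:path ex:title ;
        sh:minLength 1 ;
        sh:uniqueLang true ;
    ] ;

    # At least one contributor must be an ex:Author;
    # other contributors may be of any type.
    sh:property [
        sh:path ex:contributor ;
        sh:qualifiedValueShape [ sh:class ex:Author ] ;
        sh:qualifiedMinCount 1 ;
    ] ;

    # The value sets of the two properties must be identical.
    sh:property [
        sh:path ex:published ;
        sh:equals ex:firstReleased ;
    ] .
```

Note that sh:pattern follows the regular expression syntax of the SPARQL REGEX function, which may differ in details from the flavour selected in an online regex tester.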

Page responsible: Olaf Hartig
Last updated: 2020-10-23

IDA - Department of Computer and Information Science
