Master Thesis Proposal: Publishing Linked Energy Data

During the past decade, the Semantic Web has emerged as an evolution of the human-readable web towards a web of machine-processable data with formal semantics. A large part of the current Semantic Web is the Linked Data movement, which is concerned with publishing RDF data online and linking datasets so that they can be used together.

In the field of energy efficiency assessments of enterprises, a lot of data is collected about suitable measures for reducing energy consumption, and about how those measures actually turn out in terms of reduced consumption and saved money. However, these data are usually either kept in the internal databases of some organization or, if published openly, made available in proprietary formats such as Excel sheets. This makes it difficult both to combine data from different sources and to reuse data from one or more sources in an application. Publishing these sources as Linked Open Data could remedy this problem.

The aim of this master thesis is to transform (parts of) the IAC database of assessments and recommendations and the Swedish PFE database to RDF, express them using suitable vocabularies, link the two datasets, and then publish them on a server providing a SPARQL endpoint for queries over the joint dataset. The student should pay particular attention to lessons learned and difficulties that arise which could be generalized to similar projects, and which could help others to publish better data or make the publishing process easier in the future. Apart from the dataset as such and an analysis of its quality, the thesis should result in a set of recommendations, based on the lessons learned from this project, for how energy-related linked data could be published in the future and how the process could be made easier.
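To give a rough idea of what the transformation and linking steps involve, here is a minimal sketch in Python that turns one tabular record into RDF triples in the N-Triples syntax and adds a link between two datasets. Everything specific in it is an assumption made for illustration: the example.org namespaces, the column names (id, measure, savings), the sample row and the two measure category URIs are invented and do not reflect the actual structure of the IAC or PFE databases. In the thesis itself, established vocabularies and an RDF library would most likely replace the hand-written serialization.

```python
import csv
import io

# Hypothetical namespaces; the real project would choose its own URIs
# and reuse established vocabularies where possible.
BASE = "http://example.org/energy/"
VOCAB = "http://example.org/energy/vocab#"
XSD_DECIMAL = "http://www.w3.org/2001/XMLSchema#decimal"
SKOS_CLOSE_MATCH = "http://www.w3.org/2004/02/skos/core#closeMatch"


def literal(value, datatype=None):
    """Serialize a value as an N-Triples literal, escaping special characters."""
    escaped = (
        str(value)
        .replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    if datatype:
        return f'"{escaped}"^^<{datatype}>'
    return f'"{escaped}"'


def row_to_triples(dataset, row):
    """Turn one tabular record into a list of N-Triples lines."""
    subject = f"<{BASE}{dataset}/recommendation/{row['id']}>"
    return [
        f"{subject} <{VOCAB}measure> {literal(row['measure'])} .",
        f"{subject} <{VOCAB}annualSavings> "
        f"{literal(row['savings'], XSD_DECIMAL)} .",
    ]


def link(uri_a, predicate, uri_b):
    """State a relation between two resources, e.g. across datasets."""
    return f"<{uri_a}> <{predicate}> <{uri_b}> ."


# Invented sample data standing in for an exported spreadsheet.
sample_csv = "id,measure,savings\n42,Install LED lighting,1500.50\n"

triples = []
for row in csv.DictReader(io.StringIO(sample_csv)):
    triples.extend(row_to_triples("iac", row))

# Hypothetical link between measure categories in the two datasets.
triples.append(
    link(BASE + "iac/measure/lighting", SKOS_CLOSE_MATCH,
         BASE + "pfe/measure/belysning")
)

for line in triples:
    print(line)
```

The resulting lines could be loaded into a triple store that exposes a SPARQL endpoint. The sketch uses skos:closeMatch rather than owl:sameAs for the cross-dataset link because it makes a weaker claim; which linking predicate is appropriate is one of the modelling decisions the thesis would need to examine.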

Recommended skills for this thesis include some basic knowledge of logical languages (preferably, but not necessarily, previous knowledge of RDF/OWL), familiarity with web languages such as XML and with web infrastructure (e.g. URIs, web servers, REST), and at least some basic knowledge of Swedish (since one of the datasets is in Swedish).

How to express interest in this topic:

To make sure that you will be able to complete the thesis, I usually ask students to do a small "test" before we agree on starting the thesis. This is not an exam, just a way to see how quickly you can get into the field. It is also an opportunity for you to learn more about the technologies involved and see if you are really interested in them before you start!

So if you are interested in this topic, please access this page and follow the instructions (i.e. fill out the questionnaire and do the small modelling task).

Note that you are free to find any online reading material you like, and to use any tool of your choice. But please do not spend more than a day or so (spread out over one week) on this task, so that you do not waste too much time if you find in the end that it is too difficult or not interesting, and so that I can see how fast you catch on to the techniques and languages.

Don't forget to send me an e-mail expressing your interest in the thesis topic, including the solution to the modelling task, when you are done!

