Mining textual data for simplified reading

The goal of the project is to make information available to all through individually adapted text retrieval and language adaptation.

In today's schools, the students' own information retrieval plays an important role in knowledge acquisition. Students also have the right to an education based on their own level. Problem-based learning, where the students gather information on their own, can, however, be difficult for students with less fluent language ability. The documents a student finds are seldom adapted to the student's language ability. This is a problem, as students' language ability develops at different rates and can sometimes be limited. In the 10-15 age group, which is the focus of this project, the range extends from students who still have difficulty reading to students who read fluently. Finding texts at the right level is also a tedious task for teachers.

If students can find texts that are adapted to their language abilities, the information becomes more meaningful, as the texts are easier to understand. Individually adapted simplifications of the texts will further support the students' information retrieval. This stimulates reading, as the students feel that they can read. Students with reading problems can get easier texts, whereas students with no reading problems get more advanced texts. The students also get a tool for their work with source criticism, as they understand texts better when these are adapted to their reading abilities. Furthermore, since reading is part of the learning process, adapting texts to the students' reading abilities supports learning as well.

Text genre plays a major role in information retrieval. We intend to study news texts, factual texts, societal information and school texts.

We intend to develop sophisticated measures for judging a student's language abilities. We will also investigate how these measures can be transformed into values on known criteria such as vocabulary and grammatical fluency, and how these values can be used to analyse texts and, depending on genre, subject, content and readability, individually select suitable texts based on a student's language abilities.
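As a rough illustration of readability-based text selection, the sketch below filters candidate texts by a readability threshold. It uses LIX, a well-known readability index for Swedish (average sentence length plus the percentage of words longer than six letters), purely as a stand-in: the project description does not say which measures will be used. The function names `lix` and `select_texts`, the `max_lix` parameter and the simplified sentence splitting are assumptions made for this example.

```python
import re


def lix(text: str) -> float:
    """Return the LIX readability score of a text.

    LIX = (words / sentences) + 100 * (long words / words),
    where a long word has more than six letters.
    Sentence splitting on '.', '!' and '?' is a simplification.
    """
    # Runs of Unicode letters count as words (digits and underscores excluded).
    words = re.findall(r"[^\W\d_]+", text)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not words or not sentences:
        raise ValueError("text must contain at least one word")
    long_words = sum(1 for w in words if len(w) > 6)
    return len(words) / len(sentences) + 100 * long_words / len(words)


def select_texts(texts: list[str], max_lix: float) -> list[str]:
    """Keep only texts whose LIX score does not exceed max_lix.

    max_lix is a hypothetical per-student threshold; how a language
    profile would map to such a value is not specified in the source.
    """
    return [t for t in texts if lix(t) <= max_lix]


if __name__ == "__main__":
    candidates = [
        "Katten sover. Hunden leker.",
        "Informationssökning underlättar kunskapsinhämtning.",
    ]
    for text in select_texts(candidates, max_lix=40):
        print(text)
```

A single score like this captures only surface features. A profile-based selection as described above would presumably combine several criteria, such as vocabulary, grammar and genre.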

In the project we will develop digital tools for assessing language ability, where teacher and student together create a profile of the student's language ability. We will also develop tools for selecting texts based on language profile, content, readability and genre. We will further develop techniques for automatically summarising long texts to a length that is suitable for the student, as well as techniques and tools for automatic transformation into easy-to-read Swedish. No such individually adaptable tools for knowledge acquisition exist today and, even though the project focuses on students aged 10-15, the knowledge and tools developed in the project will be useful for a more general audience.

The project is financed by Wallenbergstiftelserna.

Project members:

Arne Jönsson, Linköpings universitet
Sofie Johansson Kokkinakis, Göteborgs universitet
Caroline Liberg, Uppsala universitet
Johan Falkenjack, Linköpings universitet
Åsa af Geijerstam, Uppsala universitet
Katarina Heimann Mühlenbock, Göteborgs universitet
Jenny Wiksten Folkeryd, Uppsala universitet

Page responsible: Master
Last updated: 2018-11-05

IDA - Department of Computer and Information Science