Large-Scale Web Information Integration

FDA205, 2007VT

Status: Archive
School: National Graduate School in Computer Science (CUGS)
Division: ADIT
Owner: Nahid Shahmehri

The course is full! If you are a CUGS student and would like to discuss your participation, please contact Nahid Shahmehri.

Course plan


Number of lectures: 4

Recommended for

CUGS PhD students and CUGS MSc students.

The course was last given

This is the first time the course is given.


In this class we will survey the research literature on
large-scale Web information integration. We will examine a variety of
problems and some of their initial solutions, and develop new ideas for
relevant database research.


Basic database course


As background material, we will study basic techniques and
the theoretical foundations for information integration. We will
then cover several aspects of integration that are particular to
the Web, including: source discovery, modeling and selection;
schema matching; querying and crawling; and data extraction.


This is a seminar course based on current research papers. There is
no textbook. The intent is to explore the new area of Web Information
Integration by reading current papers on it and on related topics.
Each week, we will tackle two topics and different approaches to
solving the particular problems. The objective is to study multiple
aspects of a topic by considering different perspectives. For each
topic, I will suggest the material you need to read.

The mechanics of this course are as follows. Two teams will be
assigned to each topic. One team, the "cheerleaders", will be
responsible for presenting a summary of the topic based on the
readings and for presenting the area in the best possible light. This
can largely be derived from the assigned readings, but you are
encouraged to go beyond these to discover other interesting work
within the same topic. The presentation should *not* be a linear
presentation of the sections in the papers; instead, it should give a
general overview of the problem, the challenges involved in addressing
it, existing solutions, and directions for new work in the area. The
second team, the "discussants", will present a short rebuttal to the
presenters' talk. They will also come to class prepared with
questions, counterexamples, and generally a devil's advocate attitude
toward the work. With any luck, this will set up a debate-like
atmosphere in which we can argue about the pros and cons of the basic
technologies.

The rest of the class (those who are neither presenters nor
discussants) is expected to participate actively in the debate. Also,
to ensure that you read the papers and think about the issues before
coming to class, everyone who is not a presenter or a discussant will
write a brief position paper that captures your own thoughts about
the readings. My guess is that these will need to be
about 1 page in length, but you may use whatever you feel is



Prof. Juliana Freire (University of Utah)


Juliana Freire (through Nahid Shahmehri)



3 to 6 p

Organized by

Linköping University, ADIT


