Welcome to the KitEGA web site.
During the last decade an enormous amount of biological data has
been generated, and techniques and tools to analyze this data are
being developed. Many of these tools use some form of grouping.
They organize the data according to a certain aspect or a combination of
aspects.
Grouping of data entries in one or more data sources is an operation
underlying many different data management tasks. Grouping can be
used to structure and visualize search results.
This is especially important when large data sources are studied. It may
lead to the discovery of new knowledge or make it possible to locate the
information of interest faster. The identification of similar data entries
and their grouping are also core operations for data cleaning and data
integration.
A number of aspects influence the quality of the grouping results:
the quality of the data sources, the selection of the grouping
attributes and the algorithms implementing the grouping procedure.
Many methods exist, but it is often not clear which methods perform best
for which grouping tasks. Studying the properties of the different
aspects that influence the quality of the grouping results, and
evaluating and comparing them, would give us valuable insight into
how grouping procedures can best be used. It would also lead to
recommendations on how to improve the current procedures and develop
new ones. To be able to
perform such studies and evaluations, we need environments that allow
us to compare and evaluate different grouping procedures. KitEGA is such
an environment.