Introduction to Bioinformatics
Biological databanks - Assignment
Patrick Lambrix, Vaida Jakoniene, IDA
In this assignment you can deepen your knowledge of one particular aspect of biological databanks. Select one of the tasks presented below and hand in a typed report of about 5 pages.
- Select at least three biological databanks (e.g. from NAR or NAR01 or from the course home page of TDDB77 under 'projekt') that can be integrated. Familiarize yourself with the information in the databanks. Read the Davidson et al. 1995 paper on challenges in integrating biological databanks. Discuss the topics in that paper in relation to your chosen databanks. Do the problems described in the paper still exist? Are there any other problems? Illustrate your arguments with examples from your chosen databanks.
- Compare the following retrieval systems that integrate other resources: SRS, Entrez, TAMBIS, K2. Discuss at least the goals, architectures, query languages, data models, techniques, and methods. Give examples!
- Select some papers of your own choice related to biological databanks. Get your choice approved by Patrick Lambrix. Your report should give an integrated view of the papers that you have read. It should discuss, among other things, what problem the authors try to solve, how they solve it, why their solution is interesting, and whether there are implemented systems based on those solutions.
- Propose a small implementation project related to the area of biological databanks. Get your proposal approved by Patrick Lambrix.
[Challenges] Davidson, S., Overton, C., and Buneman, P., Challenges in Integrating Biological Data Sources, Journal of Computational Biology, 2:557-572, 1995.
[K2] Davidson, S., Crabtree, J., Brunk, B., Schug, J., Tannen, V., Overton, C., and Stoeckert, C., K2/Kleisli and GUS: Experiments in integrated access to genomic data sources, IBM Systems Journal, Issue on Deep computing for the life sciences, 40(2):512-531, 2001.
- provides a list of database categories with pointers to actual databases.
[NAR01] The 2001 Database Issue of Nucleic Acids Research. http://www.nar.oupjournals.org/content/vol29/issue1/
- descriptions of the most important and newly formed biological databases.
[TAMBIS] Goble, C., Stevens, R., Ng, G., Bechhofer, S., Paton, N., Baker, P., Peim, M., and Brass, A., Transparent access to multiple bioinformatics information sources, IBM Systems Journal, Issue on Deep computing for the life sciences, 40(2):532-551, 2001.