This page contains the instructions for the project assignments. For more information about the examination of the project module, see the page on Examination.

Overview

The project’s main purpose is to allow you to identify, assess, and use NLP research literature (learning outcome 4). You will also have the opportunity to deepen the knowledge that you have acquired in the other parts of the course.

General structure

You can either do the standard project or work on a project on a self-proposed topic. The general project structure and requirements are the same for both forms.

The standard project is carried out in groups of 4 students and centres around the task of syntactic parsing. The minimal requirements for the standard project are as follows:

Simple projects will make limited-scale modifications to the baseline system. Complex projects will be more varied and either implement substantial changes (such as a different parsing algorithm) or apply the parser in the context of some other task. In any case, the focus must be on implementing methods described in the NLP research literature.

If you want to propose your own topic, pitch your project idea to the examiner in good time before deliverable D2 (Project plan).

Time requirements

The project runs throughout the entire course, but most of the work is concentrated in the two project weeks (W9–W10). When you plan your time for the project, you should budget approximately 56 hours per group member, or a total of 224 hours for a group of 4 members. Here is a suggested breakdown of this time into concrete tasks:

Deliverables

While the choice of the topic of your project is completely up to you, the form of the project is relatively rigid. In particular, throughout the project, you will have to submit six deliverables (D1–D6); these are designed to keep you on track and to give you feedback on your progress. The rest of this page contains detailed information about these deliverables.

D1: Group contract

Your first task in the project (scheduled for weeks W3–W4) is to form your project group. We encourage you to form groups that include students with different backgrounds, skills, and interests, as this can improve the quality of the project.

After formation, your group must make a group contract that will govern your collaboration. The contract should spell out the behaviours you expect of all group members, as well as procedures for resolving impasses in the group.

Specific questions to think about include the following:

Instructions: Make a group contract and have it signed by all group members (physically or electronically). Include both the name and the LiU-ID of each group member. Submit the signed contract as a PDF document through Lisam, following the Rules for hand-in assignments.

Due date: 2023-01-27

D2: Project plan

During the first few weeks of the course (W3–W8), your group should meet at least once weekly to plan and prepare the project. Towards the end of this phase, your group must hand in a project plan with the following structure:

In addition, your plan must contain a list of references to the research articles describing the methods you want to implement and evaluate in your project. The list should be formatted according to academic standards.

Instructions: Write a project plan (approximately 2 pages) per the specification above and submit it as a PDF document through Lisam, following the Rules for hand-in assignments.

Due date: 2023-02-17

Feedback: We advise you to discuss your project plan with the examiner; book an appointment to do so.

Standard project: Syntactic parsing

Syntactic parsing is the task of mapping a sentence to a formal representation of its syntactic structure. We will introduce this task in the first week, return to it several times throughout the course, and cover it in detail in Unit 5. To provide you with additional background material, we have compiled a reading list:

The starting point for the standard project is the tagger–parser pipeline that you will implement in labs 4 and 5. There are many different things that you can do to modify or apply this baseline system. Here are some ideas, roughly sorted from simple to complex. For each idea, we also list one relevant research article. You can also come up with your own ideas, and do your own literature search. Most research articles in the field of natural language processing are available for free via the ACL Anthology.

D3: Baseline

During W7–W8 the task for your group is to implement and evaluate the baseline for your project. If you are doing the standard project, this baseline is the tagger–parser pipeline that you will implement in labs 4 and 5. If you are doing a project on a self-proposed topic, the baseline is the implementation of whatever other system you will compare your work to.

Standard project: Tagger–parser pipeline

The baseline for the standard project is a simple pipeline architecture with the following components:

You will also need to write glue code to train and evaluate your system on any given Universal Dependencies treebank. Your code should report tagging accuracy and unlabelled attachment score.
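The two evaluation measures could be computed along the following lines. This is a minimal sketch, not part of the lab code: the function names and the input representation (one list of tags or head indices per sentence, with 0 standing for the root) are assumptions made for illustration, and every token is counted, including punctuation.

```python
from typing import Hashable, Sequence


def _token_accuracy(gold: Sequence[Sequence[Hashable]],
                    pred: Sequence[Sequence[Hashable]]) -> float:
    """Fraction of tokens whose predicted value equals the gold value."""
    assert len(gold) == len(pred), "same number of sentences expected"
    correct = total = 0
    for gold_sent, pred_sent in zip(gold, pred):
        assert len(gold_sent) == len(pred_sent), "sentences must be token-aligned"
        correct += sum(g == p for g, p in zip(gold_sent, pred_sent))
        total += len(gold_sent)
    return correct / total if total else 0.0


def tagging_accuracy(gold_tags: Sequence[Sequence[str]],
                     pred_tags: Sequence[Sequence[str]]) -> float:
    """Share of tokens that received the correct part-of-speech tag."""
    return _token_accuracy(gold_tags, pred_tags)


def unlabelled_attachment_score(gold_heads: Sequence[Sequence[int]],
                                pred_heads: Sequence[Sequence[int]]) -> float:
    """Share of tokens that were attached to the correct head (labels ignored)."""
    return _token_accuracy(gold_heads, pred_heads)


# Toy data: two sentences with five tokens in total.
gold_tags = [["DET", "NOUN", "VERB"], ["PRON", "VERB"]]
pred_tags = [["DET", "NOUN", "NOUN"], ["PRON", "VERB"]]
gold_heads = [[2, 3, 0], [2, 0]]
pred_heads = [[2, 3, 0], [0, 1]]

print(f"Tagging accuracy: {tagging_accuracy(gold_tags, pred_tags):.2f}")        # 0.80
print(f"UAS: {unlabelled_attachment_score(gold_heads, pred_heads):.2f}")        # 0.60
```

Both measures are computed over tokens rather than averaged per sentence, so long sentences weigh more than short ones.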

Some of the Universal Dependencies treebanks contain so-called non-projective trees, that is, trees in which some dependency arcs cross when drawn above the sentence. To train on these treebanks, you will first have to projectivise them. For this, you can use the Python script projectivize.py, which contains usage instructions. This script also contains the code that you need for your baseline system to read and output dependency trees in the CoNLL-U format.
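To get a feeling for the data format and for how many trees are affected, the following sketch reads CoNLL-U input and tests whether a tree is projective. It is an illustration only and does not reproduce projectivize.py: the function names read_conllu and is_projective are assumptions, the reader keeps just the word form, universal tag and head of each token, and the check only detects non-projective trees without repairing them.

```python
from typing import Iterable, Iterator, List, Tuple

Token = Tuple[str, str, int]  # (FORM, UPOS, HEAD)


def read_conllu(lines: Iterable[str]) -> Iterator[List[Token]]:
    """Yield one list of (form, upos, head) triples per sentence."""
    sentence: List[Token] = []
    for line in lines:
        line = line.rstrip("\n")
        if not line:                      # a blank line ends a sentence
            if sentence:
                yield sentence
                sentence = []
        elif not line.startswith("#"):    # skip comment lines
            cols = line.split("\t")
            # Keep ordinary tokens only; skip multiword tokens such as "1-2"
            # and empty nodes such as "1.1".
            if cols[0].isdigit():
                sentence.append((cols[1], cols[3], int(cols[6])))
    if sentence:
        yield sentence


def is_projective(heads: List[int]) -> bool:
    """heads[i - 1] is the head of token i; 0 denotes the artificial root.

    The tree is projective if no two arcs cross when the root is placed
    at position 0, to the left of the first token.
    """
    arcs = [(min(h, d), max(h, d)) for d, h in enumerate(heads, start=1)]
    return not any(l1 < l2 < r1 < r2 for l1, r1 in arcs for l2, r2 in arcs)


sample = [
    "# text = She sleeps",
    "1\tShe\tshe\tPRON\tPRP\t_\t2\tnsubj\t_\t_",
    "2\tsleeps\tsleep\tVERB\tVBZ\t_\t0\troot\t_\t_",
    "",
]
for sent in read_conllu(sample):
    print(sent, is_projective([head for _, _, head in sent]))
```

With a treebank file, the same loop could be run over an open file handle to count how many training trees are non-projective before and after projectivisation.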

Instructions: Send an e-mail to the examiner with a link to a GitLab repository containing your code.

The repository must contain a file README.md stating the tagging accuracy and unlabelled attachment score for your baseline system when trained on the training sections and evaluated on the development sections of the English Web Treebank (EWT).

In addition to this file, your repository must contain everything needed for the examiner to replicate your results. This must be possible by running the following commands. (Replace abcxy999 with your LiU-ID and nlp-project with the name of your repository.)

$ git clone git@gitlab.liu.se:abcxy999/nlp-project.git
$ cd nlp-project
$ python baseline.py
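Because the last command is run without arguments, baseline.py needs built-in defaults for everything it requires. One possible way to arrange this is sketched below. The file locations, the option names, and the helper functions parse_args, format_report and train_and_evaluate are all hypothetical; only the requirement to report tagging accuracy and unlabelled attachment score comes from the instructions above.

```python
import argparse
from typing import List, Optional, Tuple

# Hypothetical locations of the EWT files inside the repository.
DEFAULT_TRAIN = "data/en_ewt-ud-train.conllu"
DEFAULT_DEV = "data/en_ewt-ud-dev.conllu"


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Parse command-line options; every option has a default value."""
    parser = argparse.ArgumentParser(description="Train and evaluate the baseline.")
    parser.add_argument("--train", default=DEFAULT_TRAIN, help="training treebank")
    parser.add_argument("--dev", default=DEFAULT_DEV, help="development treebank")
    return parser.parse_args(argv)


def format_report(tagging_accuracy: float, uas: float) -> str:
    """Render the two required scores as percentages."""
    return f"Tagging accuracy: {tagging_accuracy:.2%}\nUAS: {uas:.2%}"


def train_and_evaluate(train_path: str, dev_path: str) -> Tuple[float, float]:
    """Placeholder for the group's own tagger–parser pipeline."""
    raise NotImplementedError("plug in your own training and evaluation code")


def main(argv: Optional[List[str]] = None) -> None:
    args = parse_args(argv)
    accuracy, uas = train_and_evaluate(args.train, args.dev)
    print(format_report(accuracy, uas))
```

In an actual script, main() would be called at the bottom of the file, and train_and_evaluate would be replaced by real code. Keeping the paths as options with defaults lets the same script be run on other treebanks without editing it.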

Due date: 2023-02-24

D4: Project work

During the two project weeks (W9–W10), you will extend or apply your baseline system according to your project plan. At the end of this period, you must submit a one-paragraph abstract for your project. The abstract should summarise what you have done in the project (which may be different from what you originally planned to do) and your main results. The main purpose of the abstract is to announce your presentation ahead of the course conference that will take place in W11.

Instructions: Submit your project title, abstract and repo URL via this online form.

Due date: 2023-03-10

D5: Project presentation

In the week following the project weeks (W11), your group will present your project at the course conference. This conference follows a hybrid format with some presentations given on-campus and some asynchronously online. After the conference, you will give feedback on other groups’ projects.

Present your project – on campus

For an on-campus presentation, you are allotted a 15-minute time slot: 10 minutes for your presentation and 5 minutes for questions. You are free to choose the presentation’s content and structure. Bear in mind that the presentation needs to be understandable to everybody in the course (not only the examiner).

Instructions: Present your project according to the instructions above. The language of the presentation is English.

Feedback and examination: You will receive feedback on your project and your presentation from other students during and after the course conference; this feedback will be helpful to you when preparing your post-project paper. After the conference, the examiner will assess your presentation according to the Project rubric. This assessment will contribute to your grade for the project module.

Schedule for the on-campus sessions

The on-campus sessions will take place on Wednesday, 15 March, 15:15–17:00 and Thursday, 16 March, 10:15–12:00 in Alan Turing.

Session 1: Wednesday, 15 March, 15:15–17:00

Session 2: Thursday, 16 March, 10:15–11:45

Present your project – asynchronously

An asynchronous project presentation consists of two parts:

You are free to choose the presentation’s content and structure. Remember that the presentation needs to be understandable to everybody in the course (not only the examiner).

Instructions: Record your group’s presentation as a 10-minute video (mp4 format) and share it with the examiner. Also, email the examiner a link to a Zoom room that your group will use for the interactive session. Make all group members co-hosts so that some of you can leave the room to visit other rooms.

Due date: 2023-03-13

Feedback and examination: You will receive feedback on your project and your presentation from other students during and after the course conference; this feedback will be helpful to you when preparing your post-project paper. After the conference, the course teachers will assess your presentation according to the Project rubric. This assessment will contribute to your grade for the project module.

General suggestions

In preparing your presentation, you may want to consider the following questions:

Give feedback on other presentations

Each of you will be assigned three other presentations to provide feedback on. Of course, you are welcome to attend/watch more presentations as well – have a look at the project abstracts and see what interests you!

After the conference, you will submit a Feedback form for each of the three presentations you have been assigned. The form will contain the following questions/prompts:

Instructions: Submit your feedback using the feedback form, one form for each of the presentations assigned to you.

Due date: 2023-03-17

D6: Post-project paper

The final project-related assignment is an individual reflection paper. The purpose of this assignment is to give you an opportunity to take stock of what you have learned from the project. We ask you to structure your paper into three parts as follows:

You will encounter the same type of questions in the labs, which should give you a good starting point. For more tips on how to write a good reflection paper, see the Guidelines for the post-project paper.

In addition to the paper itself, we ask you to also submit a self-assessment form. The information in this form will allow us to provide more relevant feedback, by focusing on aspects where our own assessment deviates from yours. The form also provides you with an opportunity to vet your paper against the assessment criteria.

Instructions: Write a paper according to the given specification. The length of your paper should be around 1,500 words (approximately 3 pages). Submit your paper as a PDF document through Lisam. The document should be named as follows: NLP--D6-your LiU-ID.pdf. Please also submit the self-assessment form.

Due date: (none)

Examination: The course teachers will assess your paper according to the criteria spelled out in the Guidelines for the post-project paper. This assessment will contribute to your grade for the project module.