Stylistic advice to my students for writing a thesis

by Christoph Kessler, IDA, Linköpings universitet, Sweden

This is a list of comments on some stylistic issues for master's theses and similar documents that I am asked to examine.
Most of these were compiled from a collection of frequently encountered stylistic errors in previous master's theses.
My suggestions are based on scientific writing conventions used by international publishers.

Most of these points concern both English and Swedish.

As a general reference for proper writing in English, I recommend
Heffernan, Lincoln, Atwill: Writing, A College Handbook. 5th ed., Norton, 2001
or similar books.

For final theses (exjobb) written in Swedish, there are additional suggestions on technical writing, such as the Lathund för rapportskrivning.

General issues

Literature search for a final thesis project

[New in 2009]
I require that you cite, and compare your work to, at least 5 scientific papers from peer-reviewed conferences or journals, in addition to textbooks, manuals, web pages etc.

[New in 2011]
I no longer allow citing (and thus using material from) Wikipedia and other public online encyclopedias in final thesis reports. Such material is not generally trustworthy. For instance, I recently traced an error made in an exam by several students back to an erroneous article in such an online encyclopedia.
Wikipedia may still be useful as a starting (!) point for a first search on a subject, but you should read and cite the original articles mentioned there (and of course also others that are not mentioned by that article's authors).

Don't use copyrighted material without permission

Here is a report about what happened to someone who did...

Stylistic issues

Pluralis majestatis

Never use "I" (except, maybe, in the preface where you sign with your name). Instead, use the so-called "pluralis majestatis", that is, "WE". "We" is used in the sense of "the author(s)" and sometimes also in the sense of "the reader and the author together as a team".

Examples: "We have shown in previous work [3,4] that..." - "We will see that..." - "Let us consider now ..."

If you want to mark something as your own personal opinion or experience, you should speak of yourself in the third person:
"The author thinks that ..."

Never use "you" to address the reader directly, and never command the reader by using the imperative form. Here, too, the third person is appropriate to create the necessary distance:
"The reader may have noticed that ..." - "We refer the interested reader to ..."

The reason that I do not follow this rule in this document itself is that this document is not a thesis and not a scientific paper, and that I most probably know the reader of this document personally.

Descriptive title

The title is the only piece of information that may really be read during a quick literature search. It should be in your own interest that your thesis is found by interested readers, as the number of papers citing you may be used as a measure of your scientific quality (even if this does not always make sense, but nowadays people are crazy about evaluating and measuring everything).

Hence, choose a descriptive title without abbreviations, even if the title becomes a bit longer.

Positive example: Parallelization of the interpreter in a test system for mobile telecommunication components

Orthography

Your thesis is a publication, with your name on it, that will exist forever. Hence, the thesis is not just another examination component. It is a mirror of your personal way of working in daily life. Whether as an industrial or an academic employee, you will have to write many technical documents in your professional life. Like an application letter, your thesis tells the experienced reader (such as a potential employer or a reviewer) a lot about your attitude towards quality and accuracy, for example.

In your own interest, I am fussy about orthography, grammar, and consistency of terminology, and I will reject a thesis with more than 15 typos on a page.

Please use a spell checker and, if possible, ask someone else (not your opponent) to proofread your thesis before you hand it in to me.

Active voice

If possible, prefer the active to the passive voice. You will find that the active voice forces you to be more precise, which is always good. For instance, one could write

"It was shown that this problem is NP-complete."

If you use the active voice instead, you are forced to name the author:

"Cook [23] has shown that this problem is NP-complete."

Colloquialisms

Avoid colloquial forms such as "can't", "doesn't", "let's" etc. These are acceptable in tabloid newspapers and popular magazines but not in scientific papers and theses. Use "cannot", "does not", "let us" instead.

Length

Material that is only of interest to very special readers (e.g., to your successors in the same project), such as commented listings, complete language grammars, or JavaDoc excerpts, should be moved to an appendix.

Structure

Structure your thesis into 5-10 chapters, each of which addresses a certain aspect of your work. Try to minimize cross-references between chapters. Avoid very short chapters or sections - these could be a sign that your structure is not very suitable.

An additional clustering of the chapters into multiple parts may be suitable for textbooks with more than 400 pages, but never for a thesis.

Sections and subsections are usually numbered up to a nesting depth of 3 or 4. Hence, "Section 3.1.4" still looks fine, while "Section 3.1.5.1.2.1.1" does not.
By default, the standard LaTeX document classes enforce this automatically: they number headings only down to a limited nesting depth.
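
As a small illustration of this numbering limit, the following sketch uses the standard report class and sets the secnumdepth counter explicitly. The value 2 is, to my knowledge, already the default for this class; the chapter and section titles are invented for the example.

```latex
\documentclass{report}
% Number headings down to subsections only (e.g. "1.1.1").
% In the report class: chapter = level 0, section = 1, subsection = 2.
\setcounter{secnumdepth}{2}
\begin{document}
\chapter{Background}                 % numbered 1
\section{Related work}               % numbered 1.1
\subsection{Loop transformations}    % numbered 1.1.1
\subsubsection{Tiling}               % level 3: printed without a number
\end{document}
```

Raising the counter would number deeper levels as well, which is exactly what the advice above suggests avoiding.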

Keep it simple and short.

Avoid long sentences. Do not squeeze more than one thought into one sentence.

Avoid phrases that do not really add information. Keep the quality and precision of the presentation but minimize the length.

Complete sentences

Build complete sentences! A sentence without a predicate is incomplete and a sign of bad style in a scientific publication. Your thesis is not a collection of loose thoughts.

Bad example: Just this.

This rule applies also to enumerations. Hence, avoid incomplete sentences as in

Memory types:
* Data memory. Contains...
* Program memory. Is ...

Instead, you should write

The memory types are the following:
* Data memory. The data memory contains...
* Program memory. The program memory is ...

Quotes

Avoid quotes unless they are absolutely necessary for the description of your work. If you quote, give the author's name and a citation.

Abbreviations

A human reader who is not really familiar with your project area can hardly keep more than 3 or 4 abbreviations in mind. The massive use of abbreviations, which is particularly dramatic in the IT (sorry, should read: information technology) industry, makes a thesis hard to read. Note also that, even within the same subarea of computer science, abbreviations may be ambiguous, such as ILP (integer linear programming, instruction-level parallelism).

Hence, choose a few very central abbreviations that you want to keep, introduce them with their full name properly before their first use, and spell out the others.
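
One possible way to make LaTeX do the bookkeeping is the acronym package; this is only a sketch of one option, not a requirement. The sentences are invented for the example, and, as with cross-references, a second LaTeX run may be needed before the abbreviation is resolved.

```latex
\documentclass{article}
\usepackage{acronym}
\begin{document}
% Define the abbreviation once, with its full name.
\acrodef{ILP}{integer linear programming}

% First use: \ac prints the full name followed by the abbreviation.
We formulate the scheduling problem using \ac{ILP}.

% Later uses: \ac prints only the abbreviation.
The resulting \ac{ILP} instance is then passed to a solver.
\end{document}
```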

Punctuation

Please put a blank space after the period concluding a sentence. Not like this.This looks ugly and makes the text harder to read. The same rule also holds for commas, colons, semicolons etc.

Algorithms

Compact pseudocode (see your algorithms course) is preferred to lengthy Java code.
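
As an illustration of compact pseudocode, here is a minimal sketch that uses the algorithm and algpseudocode packages, which are one common choice among several. The algorithm itself (the maximum of an array) and the label name are chosen only for this example.

```latex
\documentclass{article}
\usepackage{algorithm}      % floating environment with caption
\usepackage{algpseudocode}  % \Function, \For, \If, \State, ...
\begin{document}
\begin{algorithm}
\caption{Maximum of a nonempty array $a[1..n]$}
\label{alg:max}
\begin{algorithmic}
\Function{Max}{$a, n$}
  \State $m \gets a[1]$
  \For{$i \gets 2$ \textbf{to} $n$}
    \If{$a[i] > m$}
      \State $m \gets a[i]$
    \EndIf
  \EndFor
  \State \Return $m$
\EndFunction
\end{algorithmic}
\end{algorithm}
\end{document}
```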

Long listings in verbatim mode (typewriter font) are hard to read. I recommend using the available stylistic features to structure program code, such as boldface font for keywords, italics for variable and function names, slanted for comments, etc.
Avoid long names for variables and constants, especially if you need to reference them in mathematical formulas.
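
The font conventions above can be approximated with the listings package, for instance. This is a sketch under the assumption that listings is acceptable for your thesis template; the small Java fragment is made up for the example.

```latex
\documentclass{article}
\usepackage{listings}
\lstset{
  language=Java,
  basicstyle=\small\rmfamily,   % proportional font instead of typewriter
  keywordstyle=\bfseries,       % boldface for keywords
  identifierstyle=\itshape,     % italics for variable and function names
  commentstyle=\slshape,        % slanted for comments
  columns=fullflexible
}
\begin{document}
\begin{lstlisting}
// Sum of the first n array elements
int sum(int[] a, int n) {
    int s = 0;
    for (int i = 0; i < n; i++)
        s += a[i];
    return s;
}
\end{lstlisting}
\end{document}
```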

Avoid excessively long algorithms or program listings. Partition the algorithm into suitable parts and package the parts into figures.

If you present an algorithm that you invented,

Referencing chapters, sections, figures, tables, algorithms, etc.

"see Figure 3 for ..." - "as shown in Section 2.3"
Here, "Figure 3" is a proper name and thus the first character is capitalized.

but:
"see the third figure ..." - "see the previous section"
Here, "figure" is an ordinary noun.
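
In LaTeX, such references are usually written with \label and \ref, so that the numbers stay correct when the document changes. The following is a minimal sketch; the label names, the caption, and the placeholder box are invented for the example.

```latex
\documentclass{article}
\begin{document}
\section{Evaluation}
\label{sec:evaluation}

% Proper name: capitalized, number generated by \ref.
Figure~\ref{fig:speedup} shows the measured speedup.

\begin{figure}
  \centering
  \fbox{(placeholder for the plot)}
  \caption{Measured speedup.}
  \label{fig:speedup}
\end{figure}

\section{Discussion}

% Proper name again: "Section" is capitalized.
As shown in Section~\ref{sec:evaluation}, the speedup is limited.
% Ordinary noun: lower case, no number.
The previous section also describes the test setup.
\end{document}
```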

Referencing literature

There are various styles of citations used in standard literature. The most common ones are numeric [13], alphanumeric [Ke92,KS88,KS88a] and name/year, such as Kessler (1998).

Note that numeric citations are to be treated as comments (like footnote superscripts), not as independent text objects. I also recommend adding some text so that the sentence can be read without looking up the corresponding bibitem in the bibliography.

Hence, you may write
"Applying Brent's theorem [7] yields ..." or "see Cormen et al. [4]"

but not
"[7] yields ..." nor "see [4]"
as [7] by itself is just an optional comment, not a subject.

A minor issue is that multiple numeric references should appear in order, that is, [3,4] instead of [4,3].

Long flat lists of references are not a good citation style. For instance,
"A lot of loop transformations have been developed to increase data locality, for example [3], [5], [6], [7], [10], [11], [12], [13], [16], [17], [20], [21], [22]."
Instead, more effort should be put into reviewing and structuring the related work, even if this needs more time and space:
"Many loop transformations have been proposed in the literature of the last two decades. For instance, Kennedy et al. [10] give an in-depth treatment of loop interchange. Tiling of multidimensional loops is discussed in a seminal paper by Wolf and Lam [21]. Polychronopoulos [13] proposes cycle shrinking for loops with dependence cycles of static distances larger than one. Banerjee [5,6] gives an introduction to the theory of unimodular transformations. Ancourt and Irigoin [3] introduce the polytope model for the representation of index spaces, which is used in subsequent work by Lengauer [16], Xue [22], ... . See also Wolfe's textbook [20] for a comprehensive overview."

Please note also that there should be a blank space before each citation reference symbol, such as here [47] (in the LaTeX source: before each \cite{} command).
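
The citation rules above might look as follows in the LaTeX source. This is a sketch with hypothetical citation keys and placeholder bibliography entries. It uses a non-breaking space (~) as one common way to realize the blank before \cite, and the cite package, which sorts numeric citation lists into ascending order.

```latex
\documentclass{article}
\usepackage{cite}  % sorts (and compresses) numeric citation lists
\begin{document}
% Text first, then a space, then the citation as a "comment".
Applying Brent's theorem~\cite{brent} yields a bound on the running time.

% Keys given out of order; the cite package prints them in ascending order.
We have shown in previous work~\cite{prevB,prevA} that this bound is tight.

\begin{thebibliography}{9}
\bibitem{brent} (placeholder entry)
\bibitem{prevA} (placeholder entry)
\bibitem{prevB} (placeholder entry)
\end{thebibliography}
\end{document}
```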

Bibliography

The bibitems in the bibliography should be ordered alphabetically by author names. Alternatively, they may be listed in the order in which they are first cited in the text (that is the unsrt bibstyle in LaTeX), but that makes it more difficult to look up bibitems that are referenced multiple times.

Avoid web documents; give preference to printed books and articles.
Note that web documents are (with very few exceptions) highly volatile, subject to dynamic update without notice, and generally unrefereed; hence they are much less trustworthy than printed publications, which underwent a thorough reviewing process and can thus be regarded as correct and original work (also with very few exceptions).
Experience shows that, on average, more than 50% of all URLs cited no longer exist after 2 years. A printed book or journal article exists forever.

For each (non-web) bibitem, give the author, title, year and the publisher. The ISBN is optional. This is standard for scientific writing. Note that here I override the IDA guidelines.

For articles, also give the journal name, volume, issue, and page numbers.
For papers in conference proceedings, give the conference name and the page numbers.
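
With BibTeX, these fields map onto entry types such as @book, @article and @inproceedings. The sketch below is self-contained: it writes a small refs.bib file (a file name chosen for this example) and uses the plain bibstyle, which sorts the entries alphabetically by author. The book entry is the handbook recommended at the top of this page; the article and conference entries contain placeholder data only and do not refer to real publications.

```latex
\begin{filecontents}{refs.bib}
@book{heffernan2001,
  author    = {Heffernan and Lincoln and Atwill},
  title     = {Writing: A College Handbook},
  edition   = {5th},
  publisher = {Norton},
  year      = {2001}
}
% Placeholder data, not a real article:
@article{placeholder-article,
  author  = {Author, A. and Coauthor, B.},
  title   = {Placeholder Article Title},
  journal = {Placeholder Journal Name},
  volume  = {1},
  number  = {2},
  pages   = {3--4},
  year    = {2000}
}
% Placeholder data, not a real conference paper:
@inproceedings{placeholder-paper,
  author    = {Author, C.},
  title     = {Placeholder Paper Title},
  booktitle = {Proc. Placeholder Conference},
  pages     = {5--6},
  year      = {2000}
}
\end{filecontents}
\documentclass{article}
\begin{document}
See the handbook by Heffernan et al.~\cite{heffernan2001}, the article by
Author and Coauthor~\cite{placeholder-article}, and the paper by
Author~\cite{placeholder-paper}.

\bibliographystyle{plain}  % alphabetical by author; unsrt = order of citation
\bibliography{refs}
\end{document}
```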

If you cannot avoid citing a web document (such as online documentation), give the author, the year, and the location of the project (e.g., university or research institute).

By the way, the publisher information tells the experienced reader a lot about the value of a book publication (and thus, whether it is worth the effort to get access to it). For instance, "Morgan Kaufmann" or "Springer" books are usually regarded as high-quality publications.

LaTeX

I recommend that you learn and use LaTeX. It is more powerful and forces you to follow stylistic standards better than any WYSIWYG word processor ever can. You will find LaTeX easy to learn if you know any programming language.
Using LaTeX is even more or less mandatory if you are a graduate student.

To be continued...

Christoph Kessler, IDA