
732A94 Advanced Programming in R

Course information


Autumn 2016

Autumn 2017

This is the course page for the course in Advanced R programming.

The first course session will be in room P18 on Tuesday 2017-08-29: a lecture at 13:15-15, followed by a computer lab at 15-17.

Course Content


The course introduces general programming techniques and their practical implementation in the R language. More specifically, the course includes:

  • reading data from file, from the internet, and printing to output,
  • data structures, functions and objects,
  • iteration and conditional statements,
  • numerical linear algebra in R,
  • debugging,
  • object-oriented programming,
  • performance enhancement,
  • parallel programming,
  • literate programming,
  • development of R packages.

Course structure


The course contains 7 lectures, 2 computer labs and 5 seminars.

The first two weeks will focus on basic R programming and syntax, and students will work individually.

The last five weeks will focus on more advanced concepts, and students will then work in groups of two.

Students are encouraged to work on their own computers. See below for how to install R, git and R-Studio.

The course contains three teaching activities:

  • Lecture (FÖ) Introduction of new concepts.
  • Computer lab (DATALAB) Individual computer lab with individual help.
  • Seminar (SE) Each student group will present their current work and we will discuss any problems that have come up.

The following content will be presented each week:

Lecture Material
1 Data structures, subsetting and intro to functions
2 Program control, functions and R packages
3 Performant code: Writing R packages, roxygen, testthat, git/github
4 Linear algebra, vignettes & rmarkdown, graphics and object-orientation in R
5 Advanced Input/Output (I/O)
6 Performant code: Writing fast code, parallelism and handling big data
7 Intro to Machine learning, big data and data wrangling

Course literature


  • The art of R programming by Norman Matloff (NM1). The book can be found online here.

  • Introduction to ggplot2 by Norman Matloff (NM2). The pdf can be found here.

  • Advanced R by Hadley Wickham (HW1). An online version can be found here.

  • R Packages by Hadley Wickham (HW2). An online version can be found here.

  • Efficient R Programming by Colin Gillespie and Robin Lovelace (CGRL). An online version can be found here.

  • Pro Git by Scott Chacon and Ben Straub (SCBS). An online version can be found here.

  • Best practices for scientific computing by Greg Wilson et al. (GW). The article can be found here.

  • Semantic versioning (SV) rules found here

  • Testthat: Get started with testthat (HW3) by Hadley Wickham that can be found here.

  • httr: Quickstart (HW4) by Hadley Wickham that can be found here.

  • Databases (HW5) by Hadley Wickham that can be found here.

  • Persistent data storage (DA) by Dean Attali that can be found here.

  • HTTP: the protocol (PP) by Pavan Podila that can be found here.

  • An introduction to APIs (BC) by Brian Cooksley here

  • Best practices for writing an API package by Hadley Wickham (HW6) here.

  • rvest: Easy web scraping with R (RS1) by R Studio that can be found here.

  • Shiny tutorial (RS2) by R Studio that can be found here.

  • memoise (HW7) by Hadley Wickham that can be found here.

  • Introduction to dplyr (HW8) by Hadley Wickham that can be found here.

  • Big Oh notation (BO) can be found here

  • Tidy data (HW9) by Hadley Wickham can be found here

  • Data wrangling cheat sheet (DWCS) here

  • A short introduction to the caret package (MK) by Max Kuhn can be found here

  • The elements of statistical learning (HTF) by Trevor Hastie, Robert Tibshirani and Jerome Friedman here

Required reading

Below is the required reading for each week.

  1. Data structures, subsetting and intro to functions
  • NM1: Chap: 1, 2.2, 2.3, 3, 4.1-4.4, 4.6, 4.7, 4.9, 5.1 - 5.6, 6.1, 6.2, 6.4 - 6.6, 7, 9.1, 10.1, 10.3, 10.7
  • HW1: Chap: Introduction, Data structures, Subsetting
  2. Program control, functions, basic I/O and R packages
  • NM1: Chap: 8, 9.1 - 9.6, 11.3
  • HW1: Chap: Functions, Environment, Functional programming, Functionals
  3. Performant code I
  • GW: Whole article
  • SCBS: Ch. 1.1 - 1.3
  • HW1: Exceptions and Debugging
  • HW2: Introduction, Package structure, Package components, Code, Data, Package metadata, Object documentation, Testing, Namespaces, Checking, Git and GitHub
  • HW3: Whole page
  • SV: Whole page
  4. Linear algebra, dynamic reporting, graphics and object orientation
  • NM1: Chap: 10.4, 12 (not 12.1)
  • NM2: Full article
  • HW1: Chap: OO field guide
  • HW2: Chap: Vignettes
  5. Input and output
  • NM1: Chap: 11
  • HW4: Whole page
  • HW5: Read through, but the dplyr package will be discussed in week 6.
  • DA: Whole page
  • PP: HTTP Basics, URLs, Verbs, Status codes, Summary
  • BC: Chapter 1-3 (Ch. 4-5 can be relevant for some projects)
  • HW6: Up to Authenticating, the rest can be skipped.
  • RS1: Whole page
  • RS2: Lesson 1-7
  6. Performant code II
  • NM1: Chap: 4.10, 15, 17
  • HW1: Chap: Performance, Profiling, Memory, Rcpp, Functionals, Function operators
  • HW7: Front page
  • BO: All pages
  7. Machine learning, big data and data wrangling
  • HW8: The whole page
  • HW9: The whole article
  • DWCS: The whole sheet
  • MK: The whole article
  • HTF: Chapter 7.1, 7.2, 2.9, 3.4, 3.4.1 (scan)
Extra reading material
  • More on functionals from Data Camp can be found here.
  • More on environments can be found here
  • More on how to install git and make it work with R-Studio here and here.
  • More on HTTP and web crawling in this book.
  • More on building a Shiny application here.
  • The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!) here
Extra video material
  • Google developers R videos (GD) here
  • Introduction to R videos by Roger Peng (RP) here
  • R markdown (Rmd) here
  • R-Studio and github here and here.
  • Debugging in R here
  • Extra material on program control in R here
  • Short video on Dijkstra's algorithm here
  • Introduction to object oriented programming here and more extensive here.
  • A series of 5 videos on good coding practices here
Reference cards
  • R reference card v.2 by Matt Baggott. This reference card is the only aid allowed at the exam. It can be found here
  • R-markdown cheat sheet (Rmd cheat) here
  • R-markdown reference guide (Rmd ref) here
  • Data wrangling cheat sheet (data cheat) here
Examples
  • ggplot2 example catalogue here

Software


Students are advised to work on their own computers. The following software is needed for this course; everything is open source and free.

Information on how to install R and R-Studio is available for Windows, Mac and Linux/Ubuntu.

Assignments


The course contains 7 assignments, all of which are mandatory.

Assignments 1 and 2 will be turned in as R script files. All assignments need to be fully correct to get a pass. To test your assignment, the R package markmyassignment can be used.
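As a rough illustration of how such a check might look, the sketch below calls set_assignment(), show_tasks() and mark_my_assignment() from markmyassignment. The variable assignment_path is a hypothetical placeholder: the actual path or URL of the assignment file is given in each lab's instructions and is not stated on this page, so the calls are skipped until it is filled in.

```r
# Sketch only: assumes the markmyassignment package is installed.
# `assignment_path` is a hypothetical placeholder; replace NULL with the
# path or URL to the assignment file given in the lab instructions.
assignment_path <- NULL

# Only run the checks when a path has been supplied and the package exists.
can_mark <- !is.null(assignment_path) &&
  requireNamespace("markmyassignment", quietly = TRUE)

if (can_mark) {
  # Point the package at the assignment to be tested.
  markmyassignment::set_assignment(assignment_path)
  # List the tasks the assignment contains.
  markmyassignment::show_tasks()
  # Run the tests against the objects defined in the current session,
  # so source your script file before calling this.
  markmyassignment::mark_my_assignment()
}
```

Run your script first so that the functions and objects you have written exist in the session when the tests are executed.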

Assignments 3 to 7 will be fully implemented R packages that should be possible to install directly from github.com. To turn in the package, just send in the GitHub address of the package. Read the particular instructions in each lab for what is needed to pass.
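One way to check that a package installs directly from GitHub is devtools::install_github(), sketched below. The GitHub address is a made-up placeholder, not a real course repository, and the use of devtools is an assumption; the lab instructions may prescribe another tool. The install step is switched off by default so the snippet can be read and run without network access.

```r
# Hypothetical GitHub address of a student package (placeholder only).
github_address <- "https://github.com/some-user/somepackage/"

# install_github() expects "user/repo", so strip the host and any trailing slash.
repo_spec <- sub("^https://github\\.com/", "", github_address)
repo_spec <- sub("/$", "", repo_spec)

# Set to TRUE to actually attempt the installation (needs network access).
run_install <- FALSE

if (run_install && requireNamespace("devtools", quietly = TRUE)) {
  # Assumption: devtools is used for installation.
  devtools::install_github(repo_spec)
}
```

Trying this from a clean R session on another computer is a reasonable way to find missing dependencies before handing in the address.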

All labs and seminar assignments have a deadline. Please note that you will occasionally present your assignment at a seminar before the deadline has passed. This means that you will sometimes not be completely finished with the assignment at the seminar, but that is OK.

Assignment no. Instructions Material presented Seminar presentation Deadline
1 PDF 2017-08-29 Lab (no seminar) 2017-09-05
2 PDF 2017-08-30 Lab (no seminar) 2017-09-08
3 PDF 2017-09-04 2017-09-19 2017-09-22
4 PDF 2017-09-05 2017-09-20 2017-09-24
5 PDF 2017-09-18 2017-10-03 2017-10-06
API examples
6 PDF 2017-09-20 2017-10-04 2017-10-08
7 PDF 2017-09-29 2017-10-11 2017-10-15

Lectures


Lecture slides from 2017 can be found below:

Lecture Slides
L1 Lecture 1
L2 Lecture 2 R code for slide 31
L3 Lecture 3
L4 Lecture 4
L5 Lecture 5
L6 Lecture 6
L7 Lecture 7

Lecture slides from 2016 (which 2017 slides are based on) can be found below:

Lecture Slides
L1 Lecture 1
L2 Lecture 2
L3 Lecture 3
L4 Lecture 4
L5 Lecture 5
L6 Lecture 6
L7 Lecture 7

Lecture slides from 2015 (which 2016 slides are based on) can be found below:

Lecture Slides
L1 Slides 2015
L2 Slides 2015
L3 Slides 2015
L4 Slides 2015
L5 Slides 2015
L6 Slides 2015
L7 Slides 2015

Computer exam


The exam will be a computer exam on 2017-10-20.

The material that will be included with the exam can be downloaded here.

Previous exams can be found here.

Exams from the basic course in R programming can be found here.

Staff


  • Krzysztof Bartoszek, lecturer
  • Krzysztof Bartoszek, examiner

Page responsible: Krzysztof Bartoszek
Last updated: 2016-06-17