Instituto Gulbenkian de Ciência

30-31 May 2019

09:30 - 18:00

Instructors: Eric Persson, Phil Reed

Helpers: João Raimundo, Henrique Costa

General Information

Software Carpentry aims to help researchers get their work done in less time and with less pain by teaching them basic research computing skills. This hands-on workshop will cover basic concepts and tools, including program design, version control, data management, and task automation. Participants will be encouraged to help one another and to apply what they have learned to their own research problems.

For more information on what we teach and why, please see our paper "Best Practices for Scientific Computing".

Who: The course is aimed at graduate students and other researchers. You don't need to have any previous knowledge of the tools that will be presented at the workshop.

Where:

When: 30-31 May 2019.

Requirements: Computers running Linux are provided. Participants may instead bring a laptop with a Mac, Linux, or Windows operating system (not a tablet, Chromebook, etc.) on which they have administrative privileges. Personal laptops should have a few specific software packages installed (listed below).

Code of Conduct: Everyone who participates in Carpentries activities is required to conform to the Code of Conduct. This document also outlines how to report an incident if needed.

Accessibility: We are committed to making this workshop accessible to everybody. The workshop organizers have checked that:

Materials will be provided in advance of the workshop, and large-print handouts are available if you notify the organizers in advance. If we can help make learning easier for you (e.g. sign-language interpreters, lactation facilities), please get in touch (using the contact details below) and we will attempt to provide what you need.

Contact: Please email ericpersson1@gmail.com or phil.reed@manchester.ac.uk for more information.


Surveys

Please be sure to complete these surveys before and after the workshop.

Pre-workshop Survey

Post-workshop Survey


Schedule

Thursday, 30 May

Before Pre-workshop survey
09h30 R for Reproducible Scientific Analysis
(Set-up; Project Management with RStudio)
11h00 Morning break
11h30 R for Reproducible Scientific Analysis
(Help; Data Structures; Vectorization)
12h30 Lunch break
14h00 R for Reproducible Scientific Analysis
(Data Structures; Subsetting)
16h00 Afternoon break
16h30 R for Reproducible Scientific Analysis
(Creating Graphics with ggplot2)
18h00 END

Friday, 31 May

09h30 R for Reproducible Scientific Analysis
(Recap of day 1, more graphics and subsetting)
11h00 Morning break
11h30 R for Reproducible Scientific Analysis
(Producing Reports with knitr)
12h30 Lunch break
14h00 R for Reproducible Scientific Analysis
(Bringing it all together)
16h00 Afternoon break
16h30 RStudio and Version Control with Git
(Version Control with Git; Dataframe Manipulation with dplyr)
17h30 Wrap-up; Post-workshop Survey
18h00 END

We will use this collaborative document for chatting, taking notes, and sharing URLs and bits of code.


Syllabus

R for Reproducible Scientific Analysis

  • Project Management with RStudio
  • Data Structures (including Dataframes)
  • Creating Publication-Quality Graphics with ggplot2
  • Dataframe Manipulation with tidyr and dplyr
  • Producing Reports with knitr
  • Writing Good Software
  • Reference...

Version Control with Git

We will cover a little bit of this at the start:

  • Creating a Repository
  • Recording Changes to Files: add, commit, ...
  • Viewing Changes: status, diff, ...
  • Ignoring Files
  • Working on the Web: clone, pull, push, ...
  • Reference...
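
As a rough preview of the Git topics listed above, the following shell sketch walks through creating a repository, ignoring a file, recording a change, and viewing changes. It is only an illustration under stated assumptions: the project name `planets`, the file names, and the author name and email are hypothetical placeholders, and everything runs in a temporary directory so no real project is touched.

```shell
# Work in a throwaway directory (hypothetical example, not workshop material).
workdir="$(mktemp -d)"
cd "$workdir" || exit 1

# Creating a repository ("planets" is a placeholder project name).
git init --quiet planets
cd planets || exit 1

# Ignoring files: any file ending in .log will not be tracked.
printf '*.log\n' > .gitignore
echo "scratch output" > run.log
echo "Cold and dry" > mars.txt

# Viewing changes: run.log is not listed because it is ignored.
git status --short

# Recording changes: stage two files, then commit them.
# The name and email are placeholders passed only for this one command.
git add .gitignore mars.txt
git -c user.name="Workshop Learner" -c user.email="learner@example.org" \
    commit --quiet -m "Start notes on Mars"

# Viewing changes again after editing a tracked file.
echo "Two moons" >> mars.txt
git --no-pager diff
git status --short

# Small checks on the result.
commit_count="$(git rev-list --count HEAD)"
ignored="$(git check-ignore run.log)"
echo "commits: $commit_count, ignored file: $ignored"
```

Working on the web (clone, pull, push) needs a remote repository, so it is left out of this local sketch.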

Setup

To participate in a Software Carpentry workshop, you will need access to the software described below. In addition, you will need an up-to-date web browser.

We maintain a list of common issues that occur during installation on the Configuration Problems and Solutions wiki page. It is written as a reference for instructors, but it may be useful to participants as well.

Text Editor

When you're writing code, it's nice to have a text editor that is optimized for writing code, with features like automatic color-coding of key words. The default text editor on macOS and Linux is usually set to Vim, which is not famous for being intuitive. If you accidentally find yourself stuck in it, press the Esc key, then type :q! (colon, lower-case 'q', exclamation mark) and press Return to get back to the shell.

Windows

nano is a basic editor and the default that instructors use in the workshop. It is installed along with Git.

Other editors that you can use are Notepad++ or Sublime Text. Be aware that you must add the editor's installation directory to your system path. Please ask your instructor to help you do this.

macOS

nano is a basic editor and the default that instructors use in the workshop. See the Git installation video tutorial for an example of how to open nano. It should be pre-installed.

Other editors that you can use are BBEdit or Sublime Text.

Linux

nano is a basic editor and the default that instructors use in the workshop. It should be pre-installed.

Other editors that you can use are Gedit, Kate or Sublime Text.

R

R is a programming language that is especially powerful for data exploration, visualization, and statistical analysis. To interact with R, we use RStudio.

Windows

Video Tutorial

Install R by downloading and running the .exe installer from CRAN. Also, please install the RStudio IDE. Note that if you have separate user and admin accounts, you should run the installers as administrator (right-click on the .exe file and select "Run as administrator" instead of double-clicking). Otherwise problems may occur later, for example when installing R packages.

Linux

You can download the binary files for your distribution from CRAN, or you can use your package manager (e.g. for Debian/Ubuntu run sudo apt-get install r-base and for Fedora run sudo dnf install R). Also, please install the RStudio IDE.
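
After installing, one way to confirm that R is reachable from the shell is the short check below. This is a minimal sketch rather than an official setup step: it assumes the installer placed `Rscript` on your PATH, and the variable name `r_status` is just a placeholder for this example.

```shell
# Check whether the Rscript command is available on PATH.
if command -v Rscript >/dev/null 2>&1; then
    # Ask R itself to print its version string.
    Rscript -e 'cat(R.version.string, "\n")'
    r_status="found"
else
    echo "Rscript was not found on PATH; R may not be installed yet."
    r_status="missing"
fi
echo "R status: $r_status"
```

If the check reports that R is missing on a laptop where you have already run the installer, opening a new terminal window before trying again is a reasonable first step.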