RNA-seq data analysis workflow

Getting started with RNA-seq data analysis

This document aims to get you started with analysing your own RNA-seq data by providing an example workflow. At each of the major steps in the workflow we will examine issues that could confound your analysis and consider how different available methods address these issues.

Rather than prescribing which methods you should use, this document aims to make you aware of the biological and statistical challenges involved in analysing RNA-seq data and asks how different methods address those challenges. By the end of this document you should be familiar with the general workflow of RNA-seq analysis and be ready to ask informed questions about the specific methods you should select going forward.

This document is written in the form of a workshop which will run over two mornings from 9am to 1pm. The workshop includes a small example dataset which the workshop instructor will use to demonstrate an analysis. Attendees may choose to follow along with the analysis in real time or to focus on note-taking. The workshop includes short exercises to highlight key messages and reinforce your understanding.

Workshop timeline

This workshop runs over two mornings, from 9am to 1pm. We will take one 15-minute break at approximately 10:45, with shorter breaks in between.

Day 1

An overview of RNA-seq data analysis workflows

Quality assessment

Adaptor trimming and filtering

Mapping and counting methods

Exploratory analysis

Day 2

Concepts for identifying differentially expressed genes

Limma for differentially expressed genes

DESeq2 for differentially expressed genes

Over-representation analysis

A note on software

During the workshop, code will be run in real time to demonstrate the analysis. This will be run on the NeSI OpenOnDemand platform, which already has all the required software and R packages installed.

Where you are working will dictate what you need to install and how you go about it, so this document does not include software installation instructions. We will include notes about loading R packages, and

Attribution notice

This workshop is an updated version of an existing RNA-seq data analysis workshop by Genomics Aotearoa and NeSI. It incorporates, and takes inspiration from, various training materials produced by the Harvard Bioinformatics Core and by The Carpentries, specifically their RNA-seq analysis with Bioconductor workshop.