Workshop portfolio

Workshop structure, style and expectations

All workshops are run under The Carpentries Code of Conduct to promote a safe and inclusive working environment.

Our training workshops are interactive live-coding events. One or more Instructors (subject matter experts who have received formal teaching training) demonstrate by writing code in real time, describing what they are doing and explaining their decision-making and thought processes. Attendees work through all code at the same time as the Instructor. Short exercises, either for individuals or groups, reinforce your learning. All workshops also have one or more Helpers: experienced members of the training community or subject matter experts who are on hand to answer questions and solve issues as they arise. Helpers ensure that no attendee is left behind and that everyone is able to complete tasks in a timely manner.

Our workshops run with the help of New Zealand eScience Infrastructure (NeSI). Our partnership with NeSI means there is no software to install, no compatibility issues, and no minimum computing requirements. All you will need to complete the workshop is a computer with an internet connection and a web browser.

All workshop material, which includes written notes that cover all main points, code examples, and tasks (as well as their solutions), is provided in advance and is freely available.

Workshops are held either in person or online (Zoom), and usually run for one or two days (10am - 4pm).

Introductory workshops

Our series of introductory workshops is designed to take you from absolute beginner to self-driving learner. Each of these workshops can be done as a standalone crash course in a particular topic, and they can also be strung together to provide a strong foundation in bioinformatics principles and practices.

Introduction to Bash Scripting and HPC Job Scheduler

Get started with writing your own scripts for data analysis and working in an HPC (high-performance computing) environment. Some of the topics covered in the workshop are:

Designing a variant calling workflow.
Automating a workflow.
An introduction to HPC.
Working with a job scheduler.

View the full workshop material here: Introduction to Bash Scripting and HPC Scheduler

Introduction to the R Programming Language

Get started with R, a highly popular programming language in the fields of biology and statistics. R is world-renowned for producing high-quality, publication-ready figures and tables. Note that this workshop is a prerequisite for the RNA-seq Data Analysis workshop. Some of the topics covered in the workshop are:

An introduction to R and RStudio.
R basics: The R language, reading data into R, storing data as objects.
R packages.
Publication-quality data presentation using ggplot2.
knitr: keep track of your workflow and produce easy-to-follow reports of your work.
Where to get more help when you are ready to do more.

We assume the learner has no prior experience with the tools covered in the workshop. However, learners are expected to have some familiarity with biological concepts.

View the full workshop material here: Introduction to the R Programming Language

RNA-seq Data Analysis

Get started with analysing RNA-seq datasets, identifying differentially expressed genes and highlighting impacted biological processes. Some of the topics covered in the workshop are:

Quality assessment
Trimming and filtering
Mapping and read counts
Differential expression analysis
Over-representation analysis

View the full workshop material here: RNA-seq Data Analysis

Genomics Data Carpentry (introduction to Shell)

Learn the fundamentals of working with the Command Line Interface (CLI). The shell is a program that lets you interact with your computer through typed commands. Familiarity with the shell will allow you to access remote servers, automate tasks, and use a wide range of tools that are not available through a Graphical User Interface (GUI).

During this workshop you will learn:

The importance of the shell.
How to navigate files and directories.
How to create, view and modify files.
Pipes, redirection, and scripts, which will allow you to automate your workflow.

View the full workshop material here: Introduction to the Command Line for Genomics

Intermediate workshops

Intermediate Shell for Bioinformatics

Build on your shell skills: this workshop gives an overview of the shell, then covers downloading and verifying data, inspecting and manipulating text data with Unix tools, and automating file processing. This includes:

An overview of the Shell, UNIX and Linux.
Downloading data from a remote source and checking data integrity.
A recap of navigating files and directories, and of commands used in routine tasks.
Inspecting and manipulating data, part 1 (the head, less, grep, and sed commands).
Inspecting and manipulating data, part 2 (using awk and bioawk to process text).
Automating file processing.
Challenges: solve example molecular biology problems using shell scripts.

View the full workshop material here: Intermediate Shell for Bioinformatics

Intermediate R: advance your skills with the R programming language

Advance your skills with R! You will learn to complete R tasks with fewer lines of code, scale your analyses, and write readable code. Some of the topics covered in the workshop are:

Introduction to relational data and the join function.
Working with regular expressions and functions from the stringr package.
Writing custom functions, working with conditional statements.
‘Defensive programming’.
Iteration: for loops and map_*() functions.
The importance of data structure in R.

View the full workshop material here: Intermediate R: advancing your skills with the R programming language

Outlier Analysis

Identify regions of the genome that are under selection in this two-day workshop. During this workshop you will:

Download example genomic data (or prepare your own).
Use the PCAdapt tool to identify outlier loci within a genome.
Use VCFtools to identify outlier SNPs in population comparisons.
Use BayeScan to identify outlier SNPs based on allele frequencies.
Relate identified SNPs to phenotypic variation.
Compare the results of the different methods and discuss the results.

The focus of this workshop is on identifying signals of selection in an example genome using the outlier analysis method. Outlier analysis assumes that the majority of the genome is evolving neutrally, so loci under selection will appear as outliers relative to this neutral background.
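
As a rough illustration of the outlier idea (not the method used by PCAdapt, VCFtools or BayeScan, which each rely on their own statistics), the sketch below flags loci whose per-locus statistic sits far above the genome-wide background, using a robust z-score based on the median and the median absolute deviation. The function name, the threshold and the toy FST values are all hypothetical and chosen only for demonstration.

```python
from statistics import median


def flag_outlier_loci(stat_by_locus, threshold=3.5):
    """Return loci whose statistic lies far above the genome-wide background.

    stat_by_locus: dict mapping a locus name to a per-locus statistic
                   (for example FST). The values used below are made up.
    threshold:     robust z-score cut-off; 3.5 is an illustrative choice.
    """
    values = list(stat_by_locus.values())
    centre = median(values)  # background level, assumed to be mostly neutral loci
    mad = median([abs(v - centre) for v in values])  # robust measure of spread
    if mad == 0:
        return []  # no spread in the background, so nothing can be scored
    # 0.6745 rescales the MAD so scores are comparable to standard z-scores
    return [
        locus
        for locus, value in stat_by_locus.items()
        if 0.6745 * (value - centre) / mad > threshold
    ]


# Hypothetical toy data: most loci have low FST, one stands out.
toy_fst = {
    "locus_1": 0.02, "locus_2": 0.03, "locus_3": 0.01, "locus_4": 0.04,
    "locus_5": 0.02, "locus_6": 0.45, "locus_7": 0.03, "locus_8": 0.02,
}
print(flag_outlier_loci(toy_fst))  # prints ['locus_6']
```

The median and median absolute deviation are used here rather than the mean and standard deviation so that the outliers themselves do not inflate the estimate of the background spread.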

This lesson assumes the learner has no prior experience with the tools covered in the workshop. However, learners are expected to have some familiarity with biological concepts, including the concept of selection, as well as with both the R programming language and the basic command line (bash).

View the full workshop material here: Outlier Analysis

Advanced workshops

Reproducible Bioinformatics with Nextflow and nf-core

Reproducible research is of the utmost importance. Nextflow is workflow management software that enables the writing of scalable and reproducible scientific workflows. It can integrate various software packages and environment management systems from environment modules to Docker, Singularity, and Conda. It allows for existing pipelines written in common scripting languages to be seamlessly coupled together. Nextflow simplifies the implementation and execution of workflows on cloud or high-performance computing infrastructures.

In this workshop you will:

Be introduced to Nextflow and execute an example pipeline.
Be introduced to nf-core, an online repository of curated pipelines.
Learn how to configure and customise an existing nf-core pipeline.
Generate metrics and reports.

View the full workshop material here: Reproducible Bioinformatics Workflows with Nextflow and nf-core

Constructing Pangenome Graphs with PGGB

Learn how to construct a pangenome graph using a popular tool (PGGB), including QC, variant extraction and short-read mapping. This workshop will include:

An introduction to pangenome graphs.
Setup guide for using the tools and data in the workshop.
Overview of the PGGB toolkit.
Choosing parameters to construct a graph.
Quality control (QC).
Extracting variant data.
Mapping short reads against a pangenome graph.

View the full workshop material here: Unlock the Power of Pangenome Graphs

Scaling Gene Regulatory Networks Simulations

Simulate gene regulatory networks using R and Julia. This workshop will include:

Why simulations are valuable in systems biology.
What regulatory networks are and how they can be modelled.
How to use the sismonr R package to simulate a small regulatory network.
An introduction to high-performance computing (HPC).
HPC architectures.
Batch systems and using a Slurm scheduler.
How to scale up simulations on an HPC through profiling and optimisation.

View the full workshop material here: Scaling Gene Regulatory Networks Simulations