Howdy biologists!
Today I am beyond excited to announce our flagship course: Fundamentals of R for Biologists!
Fundamentals of R for Biologists is a self-guided crash course in how to use R, geared specifically toward those working in the biological sciences. This course will use biological datasets to teach you how to: load and save data, manipulate objects, identify potential issues in your data, fix those issues, visualize your data using ggplot2, and use many of the functions found in the tidyverse package.
This course was designed specifically for those entering the biological sciences, those who are unfamiliar with R, or those who want to learn how to analyze biological datasets. There is no requirement to even know what R is!
In fact, here’s a short overview video!
Here is an expanded overview of all the material:
Section 0 – Introduction to R and Rstudio is meant for those who may have never written a single line of code or may have never even heard of R and Rstudio before. It covers the very fundamentals of R and Rstudio. If students already have a working knowledge of some basics of R, this section can be entirely skipped!
Section 1 – Basics of R covers some of the basic elements of R. This section introduces the different classes and types of objects we will use in R. Then, we learn the fundamentals of loading and saving data into and out of R. Finally, we introduce the basics of plotting and visualizing your data.
Section 2 – Manipulating Objects and Logic largely covers subsetting and logic. Here students will learn how to extract specific elements from all types of objects in R. Then they will learn how to combine different datasets using the rbind(), cbind(), and merge() functions.
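To give a flavor of what combining datasets looks like, here is a minimal sketch in base R. The data frames, column names, and values are all made up for illustration and are not taken from the course materials.

```r
# Hypothetical toy data, invented purely for illustration
site_a <- data.frame(species = c("sp_a", "sp_b"), count = c(4, 7),
                     stringsAsFactors = FALSE)
site_b <- data.frame(species = c("sp_c", "sp_a"), count = c(2, 5),
                     stringsAsFactors = FALSE)

# rbind() stacks rows; both data frames need the same columns
all_counts <- rbind(site_a, site_b)

# cbind() adds columns side by side; the number of rows must match
with_site <- cbind(all_counts, site = c("A", "A", "B", "B"))

# merge() joins two data frames on a shared key column
traits <- data.frame(species = c("sp_a", "sp_b", "sp_c"),
                     mass_g = c(10, 18, 22),
                     stringsAsFactors = FALSE)
merged <- merge(with_site, traits, by = "species")

print(merged)
```

Note that merge() matches rows by the values in `species`, so each count picks up the mass for its species regardless of row order.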
Section 3 – Introduction to Tidyverse introduces the tidyverse package. This section starts by detailing pipes and pipelines. Then we introduce a variety of tidyverse functions that allow us to extract data (select() and pull()), change how our data appears (arrange() and relocate()), and filter our data based on some criteria (filter() and distinct()). Finally, we cover how to do simple analyses using the mutate(), count(), summarize(), and group_by() functions.
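As a rough illustration of how these functions chain together in a pipeline, here is a small sketch. It assumes the dplyr package (part of the tidyverse) is installed, and the `plants` data frame is hypothetical, invented just for this example.

```r
library(dplyr)

# Hypothetical toy data, invented purely for illustration
plants <- data.frame(
  plot = c("p1", "p1", "p2", "p2", "p3"),
  species = c("sp_a", "sp_b", "sp_a", "sp_a", "sp_b"),
  height_cm = c(12, 30, 15, 9, 28),
  stringsAsFactors = FALSE
)

summary_tbl <- plants %>%
  filter(height_cm > 10) %>%                 # keep plants taller than 10 cm
  mutate(height_m = height_cm / 100) %>%     # add a column in metres
  group_by(species) %>%                      # one group per species
  summarize(n = n(), mean_height_m = mean(height_m)) %>%
  arrange(desc(mean_height_m))               # tallest species first

# pull() extracts a single column as a plain vector
species_order <- summary_tbl %>% pull(species)

print(summary_tbl)
```

Each step receives the output of the previous one, so the pipeline reads top to bottom in the order the work happens.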
Section 4 – Cleaning data and creating pipelines is specifically aimed at cleaning datasets. Here we introduce best practices for data cleaning and use a variety of functions learned in Section 3 to fix problematic data. I also detail how to use functions such as unique(), is.na(), and glimpse() to uncover potential problems in our data. This section is likely the most important section for those in the biological sciences to learn, as cleaning our data often takes a considerable amount of time.
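Here is one way that kind of problem-hunting might look in practice. This is only a sketch: the `frogs` data frame and its typos are hypothetical, and it assumes dplyr is installed (glimpse() comes from the tidyverse, while unique() and is.na() are base R).

```r
library(dplyr)

# Hypothetical messy data, invented purely for illustration
frogs <- data.frame(
  species = c("sp_a", "Sp_a", "sp_b", "sp_b"),
  mass_g = c("5.1", "4.8", NA, "6.0"),
  stringsAsFactors = FALSE
)

glimpse(frogs)                          # mass_g shows up as character, not numeric
species_seen <- unique(frogs$species)   # reveals the inconsistent "Sp_a"
n_missing <- sum(is.na(frogs$mass_g))   # counts missing masses

cleaned <- frogs %>%
  mutate(species = tolower(species),    # fix inconsistent capitalization
         mass_g = as.numeric(mass_g)) %>%  # convert text to numbers
  filter(!is.na(mass_g))                # drop rows with no mass recorded

print(cleaned)
```

Whether dropping rows with missing values is appropriate depends on your dataset; it is shown here only to keep the example short.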
Section 5 – logic, custom functions, and apply() takes a step back and introduces a variety of operations that are commonplace in nearly all programming languages. Students will learn how to fork their code with if-else statements, run functions over entire datasets using for-loops, and create their own custom functions. In my opinion, learning and understanding this section is the absolute best thing you can do to increase your coding ability. This section provides you with the tools to solve almost any problem you may run into.
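For a taste of how these pieces fit together, here is a minimal base R sketch. The function name `classify_size`, the 50 mm cutoff, and the measurements are all hypothetical choices made for this example.

```r
# A custom function that forks with if-else (hypothetical cutoff of 50 mm)
classify_size <- function(length_mm, cutoff = 50) {
  if (length_mm >= cutoff) {
    "large"
  } else {
    "small"
  }
}

# Hypothetical measurements, invented purely for illustration
lengths_mm <- c(32, 75, 50)

# A for-loop that runs the function over every element
labels <- character(length(lengths_mm))
for (i in seq_along(lengths_mm)) {
  labels[i] <- classify_size(lengths_mm[i])
}

# The same result using sapply(), a member of the apply family
labels_apply <- sapply(lengths_mm, classify_size)

# apply() itself works over the rows (1) or columns (2) of a matrix
m <- matrix(1:6, nrow = 2)
col_means <- apply(m, 2, mean)

print(labels)
print(col_means)
```

The loop and the sapply() call produce the same labels, so which one you reach for is largely a matter of readability.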
Section 6 – Plotting with ggplot2 covers how to visualize data with the ggplot2 package. ggplot2 is one of the most widely used data visualization packages in R and is often a source of frustration. This section will break down ggplot2 into its fundamental components and get you started on making stunning visualizations!
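To show what those fundamental components look like, here is a small sketch of a ggplot2 plot built from data, an aesthetic mapping, a geom layer, and labels. It assumes ggplot2 is installed; the `beetles` data frame and its values are hypothetical.

```r
library(ggplot2)

# Hypothetical toy data, invented purely for illustration
beetles <- data.frame(
  length_mm = c(4.1, 5.3, 6.0, 4.8),
  mass_mg = c(20, 31, 40, 27),
  sex = c("f", "m", "m", "f"),
  stringsAsFactors = FALSE
)

# Data + aesthetic mapping + a geom layer + labels
p <- ggplot(data = beetles, aes(x = length_mm, y = mass_mg, color = sex)) +
  geom_point(size = 3) +
  labs(x = "Length (mm)", y = "Mass (mg)",
       title = "Hypothetical beetle measurements")

# To write the plot to disk, something like this would work:
# ggsave("beetles.png", plot = p, width = 5, height = 4)
```

Typing `p` at the console draws the plot; storing it in an object first makes it easy to add more layers or save it later.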
Students who complete this course will receive a certificate of completion.
I hope you enjoy learning more about R coding!
Full lesson overview:
- FREE Section 0 – Introduction to R and Rstudio
- FREE 0.1.1 What is R and Rstudio?
- FREE 0.1.2 The layout of Rstudio
- FREE 0.1.3 The Environment
- FREE 0.2.1 Common Files in R
- FREE 0.2.2 The Working Directory
- FREE 0.3.1 Intro to Functions
- FREE 0.3.2 Intro to Packages
- FREE 0.3.3 Comments
- Section 1 – Basics of R
- 1.1.1 Math Rules
- 1.1.2 Intro to Objects
- 1.1.3 Classes of Objects
- 1.1.4 Types of Objects
- 1.2.1 Loading Data
- 1.2.2 Saving Data
- 1.3.1 Intro to Plotting
- Bonus: Tips for Data Organization
- Section 2 – Manipulating Objects and Logic
- 2.1.1 Subsetting Vectors
- 2.1.2 Subsetting Dataframes
- 2.1.3 Subsetting Lists
- 2.2.1 Intro to Logic
- 2.2.2 Combining logical statements
- 2.3.1 Adding new columns
- 2.3.2 Combining data with rbind() and cbind()
- 2.3.3 Combining Data with merge()
- Section 3 – Introduction to Tidyverse
- 3.1.1 Intro to tidyverse
- 3.1.2 Intro to Pipes
- 3.2.1 select and pull
- 3.2.2 arrange and relocate
- 3.2.3 filter and distinct
- 3.3.1 mutate and count
- 3.3.2 summarize and group_by
- Section 4 – Cleaning data and creating pipelines
- 4.1.1 Best Practices for Data Cleaning
- 4.1.2 glimpse and simple plots
- 4.1.3 unique and NA values
- 4.2.1 changing data types
- 4.2.2 replacing data directly
- 4.3.1 Fixing typos with replace
- 4.3.2 Cleaning data with filter
- 4.3.3 putting it all together
- Section 5 – logic, custom functions, and apply()
- 5.1.1 Intro to if-else
- 5.1.2 Intro to For-Loops
- 5.1.3 Intro to custom functions
- 5.2.1 for loops on dataframes
- 5.2.2 For-loops for analyses
- 5.3.1 Apply function
- 5.3.2 Tying it all together
- Section 6 – Plotting with ggplot2
- 6.1.1 Intro to ggplot2
- 6.1.2 labeling layers and ggobjects
- 6.2.1 color and fill
- 6.2.2 Size shape and transparency
- 6.2.3 visualizing using data and aes
- 6.3.1 faceting with facet_wrap
- 6.3.2 saving visualizations using ggsave
- Bonus: Intro to ggplot themes
- Bonus: customizing themes with elements