# Chapter 2 Introduction to R

What you’ll have learned by the end of the chapter: how to read, write and explore (and optionally visualise) data.

## 2.1 Reading in data with R

Your first job is to actually get the following datasets into an R session.

First install the {rio} package (if you don’t have it already), then download the following datasets:

Also download the following 4 csv files and put them in a directory called unemployment:

Finally, download this one as well, but put it in a folder called problem:

Then take a look at chapter 3 of my other book, Modern R with the {tidyverse}, and follow along. It will teach you how to import and export data.

{rio} is a wrapper around many other packages. You can keep using {rio}, but it is also a good idea to know which packages {rio} uses under the hood. For this, you can take a look at this vignette.
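To make the import/export workflow a bit more concrete, here is a minimal sketch of a round trip with {rio}. It assumes {rio} is already installed, and it uses the built-in mtcars data and a temporary file, which are my choices for illustration and not one of the datasets mentioned above:

```r
library(rio)

# Hypothetical file location: a temporary csv file, used only for illustration
path <- tempfile(fileext = ".csv")

# export() picks the output format from the file extension
export(mtcars, path)

# import() does the same when reading the file back in
cars_back <- import(path)

# Check the dimensions of what we got back
dim(cars_back)
```

The same two functions work for other formats as well: change the file extension and {rio} calls a different package under the hood.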

If you need to import very large datasets (potentially several GBs), you might want to look at packages like {vroom} (this benchmark shows a 1.5 GB csv file getting imported in seconds by {vroom}). For even larger files, take a look at {arrow} here. This package can efficiently read very large files (csv, json, parquet and feather formats).
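As a rough sketch of what this looks like in practice, the snippet below reads the same csv file once with {vroom} and once with {arrow}. It assumes both packages are installed. The file is a small temporary one built from mtcars, purely for illustration, so it will not show any speed difference:

```r
library(vroom)
library(arrow)

# Hypothetical input: a small temporary csv file standing in for a large one
path <- tempfile(fileext = ".csv")
write.csv(mtcars, path, row.names = FALSE)

# vroom() reads delimited files and indexes them lazily
cars_vroom <- vroom(path, delim = ",")

# read_csv_arrow() is the csv reader from {arrow}
cars_arrow <- read_csv_arrow(path)

nrow(cars_vroom)
nrow(cars_arrow)
```

Both calls return a data frame that you can use as usual.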

## 2.2 A little aside on pipes

Since R version 4.1, a forward pipe |> is included in base R. It allows you to write this:

```r
4 |>
  sqrt()
## [1] 2
```

Before R version 4.1, there was already a forward pipe, introduced with the {magrittr} package (and automatically loaded by many other packages from the tidyverse, like {dplyr}):

```r
library(dplyr)

4 %>%
  sqrt()
## [1] 2
```

Both expressions above are equivalent to sqrt(4). You will see why this is useful very soon. For now, just know this exists and try to get used to it.
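To give you a first idea of why this matters, here is a small sketch comparing nested function calls with a piped version of the same computation. The vector x is made up for illustration, and the native pipe requires R 4.1 or later:

```r
# A made-up vector, only for illustration
x <- c(1, 4, 9, 16)

# Nested calls have to be read from the inside out
nested <- round(mean(sqrt(x)), 1)

# With the pipe, the steps read from top to bottom, in the order they happen
piped <- x |>
  sqrt() |>
  mean() |>
  round(1)

# Both approaches give the same result
identical(nested, piped)
## [1] TRUE
```

The pipe passes the left-hand side as the first argument of the function on the right, which is why round(1) above means "round the incoming value to 1 digit".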

## 2.3 Exploring and cleaning data with R

Take a look at chapter 4 of my other book. Ideally, you should study the entire chapter, but for our purposes you should really focus on sections 4.3, 4.4, 4.5.3, 4.5.4, (optionally 4.7) and 4.8.
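If you want a taste of what this kind of data exploration looks like before reading the chapter, here is a minimal sketch using {dplyr} and the pipe. It assumes {dplyr} is installed. The dataset (the built-in mtcars) and the threshold of 20 are my choices for illustration and are not taken from the sections listed above:

```r
library(dplyr)

# Keep the cars with more than 20 miles per gallon, then count them and
# compute their average fuel consumption for each number of cylinders
summary_tbl <- mtcars |>
  filter(mpg > 20) |>
  group_by(cyl) |>
  summarise(n = n(), mean_mpg = mean(mpg))

summary_tbl
```

Each verb does one thing to the data frame, and the pipe chains them together.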

## 2.4 Data visualization

We’re not going to focus on visualization due to lack of time. If you need to create graphs, read chapter 5 of my other book.