Presentation 1: Intro to R and Quarto

Want to code along? Go to the Data tab of the website and press the DOWNLOAD PRESENTATIONS button. This is presentation1A.

R as a calculator

Adding

1+1

[1] 2

Subtracting

2-1

[1] 1

Multiplication

3*3

[1] 9

Division

15/3

[1] 5

Exponentiation

10^2

[1] 100

Base-2 logarithm

log2(20)

[1] 4.321928

Base-10 logarithm

log10(20)

[1] 1.30103
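
R also has a general log() function. As a small sketch (the value 20 is simply reused from the examples above): without a base argument, log() gives the natural logarithm, and the base argument reproduces the results of log2() and log10().

```r
# Natural logarithm (base e) is the default of log()
log(20)

# The 'base' argument selects another base
log(20, base = 2)   # same value as log2(20)
log(20, base = 10)  # same value as log10(20)

# all.equal() compares numbers with a small tolerance
all.equal(log(20, base = 2), log2(20))
```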

Define variables

a <- 5

b <- 3

a

[1] 5

b

[1] 3

a + b

[1] 8

c <- a + b


c

[1] 8
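
A variable stores the value that was computed at the time of assignment, not the formula. The sketch below illustrates this with hypothetical variable names (x, y and total are not part of the presentation), so that the objects defined above are left untouched.

```r
# Hypothetical variable names, chosen only for this illustration
x <- 5
y <- 3
total <- x + y   # total stores the value 8, not the formula

x <- 10          # overwrite x with a new value
total            # still 8

total <- x + y   # recompute to pick up the new x
total            # now 13
```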

Check objects in environment

ls()

[1] "a"               "b"               "c"               "pandoc_dir"     
[5] "quarto_bin_path"

First Slideshow Intermezzo

The Quarto Way!

Quarto is an open-source publishing suite - a tool-suite that supports workflows for reproducible scholarly writing and publishing.

Quarto documents often begin with a YAML header demarcated by three dashes (---), which specifies settings for the document. These include which output format to render (compile) to, e.g. HTML, PDF or Word, and whether it should be published to a website project. You can also add information such as the title, author, default editor, etc.
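
To make the YAML header concrete, here is a minimal sketch that writes a small Quarto file from R. The title, author and file content are made-up placeholders; title, author, format and editor are standard Quarto header fields, but the values you need depend on your own project.

```r
# Lines of a small Quarto document; the header sits between the two '---' lines
qmd_lines <- c(
  "---",
  'title: "My first report"',   # placeholder title
  'author: "Your Name"',        # placeholder author
  "format: html",               # output format to render to
  "editor: visual",             # default editor mode
  "---",
  "",
  "Some normal text below the header."
)

# Write the document to a temporary file and show the header part
qmd_path <- tempfile(fileext = ".qmd")
writeLines(qmd_lines, qmd_path)
readLines(qmd_path)[1:6]
```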

Quarto uses a markup language

Quarto works with markup language. A markup language is a text-coding system which specifies the structure and formatting of a document and the relationships among its parts. Markup languages control how the content of a document is displayed.
Pandoc Markdown is the markup language used by Quarto. Another classic example of a markup language is LaTeX.

Let's see how Pandoc Markdown works:

This is the third largest header (Header 3)

This is the smallest header (Header 6)

Headers are marked with hashtags. More hashtags mean a smaller header.
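
In the rendered document you only see the formatted headers, not the hashtags. As a small sketch, the R code below prints what the source lines for the six header levels would look like (the text "Header 1" to "Header 6" is just a placeholder):

```r
# Build one source line per header level: 1 hashtag for the largest, 6 for the smallest
header_level <- 1:6
header_lines <- paste(strrep("#", header_level), "Header", header_level)

# Print each line as it would be typed in Source mode
cat(header_lines, sep = "\n")
```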

This is normal text. Yes, it is larger than the smallest header. A Quarto document works similarly to a Word document, where you can select tools from the toolbar to write in bold or italic and insert things like a table:

My friends	Their favorite drink	Their favorite food
Micheal	Beer	Burger
Jane	Wine	Lasagne
Robert	Water	Salad

… a picture:

Picture source

We can also make a list of things we like:

Coffee
Cake
Water
Fruit

Modes of Quarto Document

There are two modes of Quarto: Source and Visual. In the left part of the panel you can change between the two modes.
Some features can only be added when you are in Source mode. For example, blue text is written like this in the source code: `[write blue text]{style="color:blue"}`.

Code chunks and structure

Code chunks are where the code is added to the document.

Click the green +c button and a grey code chunk will appear with '{r}' at the beginning. This means that it is an R code chunk. It is also possible to insert code chunks in other programming languages.

To execute the code, press the Run button in the top right of the chunk.

Some executable code in an R code chunk:

1+3

[1] 4

Below is a code chunk with a comment. A comment is a line that starts with a hashtag. Comments can be useful in longer code chunks and will often describe the code.

# This is a comment. Here I can write whatever I want because the line starts with a hashtag.

You can add comments above or to the right of the code. This will not influence the execution of the code.

# Place a comment here 
1+3 # or place a comment here

[1] 4

Output of code chunks

Control whether code is executed.

eval=FALSE will not execute the code and eval=TRUE will execute the code.

The code is shown, but the result is not shown ({r, echo=TRUE, eval=FALSE}):

1+3

Show or hide code. echo=FALSE will hide the code and echo=TRUE will show the code. Default is TRUE.

The code is not shown, but the result is shown ({r, echo=FALSE, eval=TRUE}):

[1] 4

Control messages, warnings and errors. You may have a code chunk that you know will produce one of the three, and you often do not want to see it in the compiled document. N.B.! It is not a good idea to hide these statements (especially the errors) before you know what they are.

Warning is not printed ({r, message=FALSE, warning=FALSE, error=TRUE}):

log(-1)

[1] NaN

Warning is printed ({r, message=TRUE, warning=TRUE, error=TRUE}):

log(-1)

Warning in log(-1): NaNs produced

[1] NaN
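
Quarto also accepts chunk options written as special comments that start with #| at the top of the chunk, as an alternative to listing them inside the curly braces. A minimal sketch of what the body of such a chunk could look like (the variable name result is only for illustration):

```r
#| echo: true
#| eval: true
#| warning: false

# With 'warning: false' the "NaNs produced" warning is left out of the rendered document
result <- log(-1)
result
```

Because the option lines are ordinary R comments, the same code also runs unchanged in the console, where the warning is still shown.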

Render: Making the report

In the panel there is a blue arrow and the word Render. Click it to render the document, then open the rendered HTML file in your browser and admire your work.

Let’s get to coding!

R packages

R packages are collections of functions written by R developers and super users, and they make our lives much easier. Functions used in the same type of R analysis/pipeline are bundled and organized in packages. Each package has help pages that tell us which functions it contains and which arguments these take. In order to use a package we need to download and install it on our computer. Most R packages are stored and maintained on the CRAN (https://cran.r-project.org/mirrors.html) repository.

Install a package

# install.packages('tidyverse')

Load packages

library(tidyverse)

Query package

?tidyverse

Query function from package

?dplyr::select
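
Two related patterns can be handy here. The sketch below checks whether a package is installed before you try to load it, and uses the package::function syntax to call a function without library(). The variable names pkg and is_installed are only for illustration, and the stats package is used in the last line because it ships with every R installation.

```r
# Check whether a package is installed without attaching it
pkg <- "tidyverse"
is_installed <- requireNamespace(pkg, quietly = TRUE)

if (!is_installed) {
  message("Package '", pkg, "' is not installed. Run install.packages('", pkg, "') first.")
}

# The package::function syntax calls a function without library()
stats::median(c(1, 2, 4, 6))
```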

Second Slideshow Intermezzo

Data Types

Numeric

num1 <- 5
num1

[1] 5

class(num1)

[1] "numeric"

Character

char1 <- "Hello World!"
char1

[1] "Hello World!"

class(char1)

[1] "character"

Factor

num1 <- factor(num1)
num1

[1] 5
Levels: 5

class(num1)

[1] "factor"

char1 <- factor(char1)
char1

[1] Hello World!
Levels: Hello World!

class(char1)

[1] "factor"
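
Factors become more interesting with several values. The sketch below uses made-up example values to show how levels are stored, and a common pitfall: converting a factor of numbers directly with as.numeric() returns the internal level codes rather than the original numbers.

```r
# Hypothetical categories, only for illustration
sizes <- factor(c("small", "large", "medium", "small"))
levels(sizes)   # levels are sorted alphabetically: "large" "medium" "small"
table(sizes)    # count how often each level occurs

# Pitfall: a factor stores integer codes that point to its levels
year_factor <- factor(c(2019, 2020, 2020))
as.numeric(year_factor)                # 1 2 2 (the codes)
as.numeric(as.character(year_factor))  # 2019 2020 2020 (the original values)
```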

A vector of numeric values

vector1 <- c(1, 2, 4, 6, 8, 2, 5, 7)
vector1

[1] 1 2 4 6 8 2 5 7

Functions

Summing a vector

?sum
sum(vector1)

[1] 35

Mean of vector

?mean
mean(vector1)

[1] 4.375

mean(vector1) # mean/average

[1] 4.375

median(vector1) # median

[1] 4.5

sd(vector1) # standard deviation

[1] 2.559994

sum(vector1) # sum

[1] 35

min(vector1) # minimum value

[1] 1

max(vector1) # maximum value

[1] 8

length(vector1) # length of vector

[1] 8
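
Many of these functions take extra arguments, which are described on their help pages. As a small sketch with made-up values: if a vector contains a missing value (NA), the result is NA unless you set the na.rm argument to TRUE.

```r
# Hypothetical measurements with one missing value
with_missing <- c(1, 2, NA, 6)

mean(with_missing)                # NA, because one value is unknown
mean(with_missing, na.rm = TRUE)  # 3, the mean of 1, 2 and 6
sum(with_missing, na.rm = TRUE)   # 9
```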

Data structures

In the example below we will make two vectors into a tibble. Tibbles are the R object types you will mainly be working with in this course. We will try to convert between data types and structures using the collection of ‘as.’ functions.

A vector of characters

people <- c("Anders", "Diana", "Tugce", "Henrike", "Chelsea", "Valentina", "Thilde", "Helene")
people

[1] "Anders"    "Diana"     "Tugce"     "Henrike"   "Chelsea"   "Valentina"
[7] "Thilde"    "Helene"

A vector of numeric values

joined_year <- c(2019, 2020, 2020, 2021, 2023, 2022, 2020, 2024)
joined_year

[1] 2019 2020 2020 2021 2023 2022 2020 2024

Access data type or structure with the class() function

class(people)

[1] "character"

class(joined_year)

[1] "numeric"

Convert joined_year to character values

joined_year <- as.character(joined_year)
joined_year

[1] "2019" "2020" "2020" "2021" "2023" "2022" "2020" "2024"

class(joined_year)

[1] "character"

Convert joined_year back to numeric values

joined_year <- as.numeric(joined_year)
joined_year

[1] 2019 2020 2020 2021 2023 2022 2020 2024

Convert classes with the ‘as.’ functions

# as.numeric()
# as.integer()
# as.character()
# as.factor()
# ...
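
Conversions do not always succeed. The sketch below uses made-up input to show two things worth knowing: text that cannot be read as a number becomes NA, and as.integer() drops the decimal part instead of rounding.

```r
# Hypothetical input where one entry is not a number
raw_years <- c("2019", "2020", "unknown")

# Entries that cannot be read as numbers become NA. R normally prints a warning
# about this; suppressWarnings() silences it here.
years <- suppressWarnings(as.numeric(raw_years))
years

# as.integer() drops the decimal part instead of rounding
as.integer(3.9)
```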

Let’s make a tibble from two vectors

my_data <- tibble(name = people, 
                  joined_year = joined_year)

my_data

# A tibble: 8 × 2
  name      joined_year
  <chr>           <dbl>
1 Anders           2019
2 Diana            2020
3 Tugce            2020
4 Henrike          2021
5 Chelsea          2023
6 Valentina        2022
7 Thilde           2020
8 Helene           2024

class(my_data)

[1] "tbl_df"     "tbl"        "data.frame"

Just like you can convert between different data types, you can convert between data structures/objects.

Convert tibble to dataframe

my_data2 <- as.data.frame(my_data)
class(my_data2)

[1] "data.frame"

Convert data structures with the ‘as.’ functions

# as.data.frame()
# as.matrix()
# as.list()
# as.table()
# ...
# as_tibble()
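
Converting between structures can change how the values are stored. The sketch below is a small illustration that uses a base R data frame (small_df is a made-up, shortened version of the data) so that it runs without extra packages.

```r
small_df <- data.frame(
  name = c("Anders", "Diana", "Tugce"),
  joined_year = c(2019, 2020, 2020)
)

# A matrix holds a single data type, so the numbers become text
small_matrix <- as.matrix(small_df)
class(small_matrix[, "joined_year"])  # "character"

# A list keeps one element per column, each with its own type
small_list <- as.list(small_df)
names(small_list)
class(small_list$joined_year)         # "numeric"
```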

Fundamental operations

You can inspect an R object in different ways:

1. Simply call it and it will be printed to the console.
2. With large objects it is preferable to use `head()` or `tail()` to see only the first or last part.
3. To see the data in a tabular, Excel-style format you can use `view()`.

Remove something:

rm(a)
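
To confirm that an object is gone you can ask R whether it still exists. A minimal sketch, using a throwaway object (temp_value is a hypothetical name) so that nothing from the presentation is removed:

```r
# Hypothetical object created only for this illustration
temp_value <- 42
exists("temp_value")  # TRUE

rm(temp_value)
exists("temp_value")  # FALSE

# rm(list = ls()) would remove every object in the environment; use it with care
```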

Look at the “head” of an object:

head(my_data, n = 4)

# A tibble: 4 × 2
  name    joined_year
  <chr>         <dbl>
1 Anders         2019
2 Diana          2020
3 Tugce          2020
4 Henrike        2021

Open up tibble as a table (Excel style):

view(my_data)

dim(), short for dimensions, returns the number of rows and columns of an R object:

dim(my_data)

[1] 8 2

Look at a single column from a tibble using the ‘$’ symbol:

my_data$joined_year

[1] 2019 2020 2020 2021 2023 2022 2020 2024
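
A column extracted with ‘$’ is an ordinary vector, so all the vector functions from earlier work on it. The sketch below shows this on a base R data frame (small_df is a made-up, shortened version of the data, and joined_early is a hypothetical new column) so that it runs without extra packages.

```r
small_df <- data.frame(
  name = c("Anders", "Diana", "Tugce"),
  joined_year = c(2019, 2020, 2020)
)

# A column extracted with $ can be passed to any vector function
min(small_df$joined_year)

# [[ ]] with the column name returns the same vector
identical(small_df$joined_year, small_df[["joined_year"]])

# Assigning to a new name with $ adds a column
small_df$joined_early <- small_df$joined_year < 2020
small_df
```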