Presentation 4: Main Script

If-else statements

Define variables.

num1 <- 8
num2 <- 5

Logical: 8 is larger than 5.

num1 > num2
[1] TRUE

Logical: 8 is not smaller than 5.

num1 < num2
[1] FALSE

A logical statement is used in an if statement to define a condition.

if (num1 > num2){
  statement <- paste(num1, 'is larger than', num2)
  print(statement)
}
[1] "8 is larger than 5"

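Redefine num2. Run one assignment at a time to try the different branches below; if all three are run in order, num2 ends up as 8.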

num2 <- 10

num2 <- 3

num2 <- 8

Use else if and else statements to test multiple conditions.

if (num1 > num2){
  statement <- paste(num1, 'is larger than', num2)
} else if (num1 < num2) {
  statement <- paste(num1, 'is smaller than', num2)
} else {
  statement <- paste(num1, 'is equal to', num2)
}
print(statement)
[1] "8 is equal to 8"
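
To see all three branches without re-running the block by hand, the same if / else if / else chain can be wrapped in a small helper. This is only a sketch: compare_numbers is a hypothetical name that does not appear in the original script, and functions are introduced properly later in this document.

```r
# Hypothetical helper wrapping the if / else if / else chain shown above.
compare_numbers <- function(a, b) {
  if (a > b) {
    paste(a, 'is larger than', b)
  } else if (a < b) {
    paste(a, 'is smaller than', b)
  } else {
    paste(a, 'is equal to', b)
  }
}

num1 <- 8
print(compare_numbers(num1, 10))  # takes the else-if branch
print(compare_numbers(num1, 3))   # takes the if branch
print(compare_numbers(num1, 8))   # takes the else branch
```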


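For-loops

We first define a vector containing both numeric and character elements. Because an atomic vector can hold only one type, R converts the numbers to character strings.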

vector1 <- c(1, 2, 6, 3, 2, 'hello', 'world', 'yes', 7, 8, 12, 15)

To loop through vector1, we define a loop variable (here called element), which takes the value of each item in the vector, one at a time.

for (element in vector1) {
  print(element)
}
[1] "1"
[1] "2"
[1] "6"
[1] "3"
[1] "2"
[1] "hello"
[1] "world"
[1] "yes"
[1] "7"
[1] "8"
[1] "12"
[1] "15"
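
The quotation marks in the output show that every element is a character string. A short sketch of why, using a shortened example vector and a list for comparison (list1 is an illustrative name, not part of the original script):

```r
# c() builds an atomic vector, so mixed input is coerced to one common type.
vector1 <- c(1, 2, 6, 'hello')
print(class(vector1))  # the whole vector is of class "character"

# A list keeps each element's own type.
list1 <- list(1, 2, 6, 'hello')
for (element in list1) {
  print(class(element))
}
```

If the numeric values need to stay numeric while looping, a list is one option.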

The loop variable name is arbitrary; you can call it anything. For example, using THIS_VARIABLE gives the same result. Just avoid reusing the name of a variable that your script still needs, because the loop will overwrite it.

for (THIS_VARIABLE in vector1) {
  print(THIS_VARIABLE)
}
[1] "1"
[1] "2"
[1] "6"
[1] "3"
[1] "2"
[1] "hello"
[1] "world"
[1] "yes"
[1] "7"
[1] "8"
[1] "12"
[1] "15"

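After you loop through a vector or a list, the loop variable still exists and holds the last element. A for-loop does not create its own scope, so the loop variable lives in the environment where the loop ran (here, the global environment).

print(element)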
[1] "15"
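
A minimal sketch of what this means in practice. The names vector2, item and total are made up for illustration:

```r
# The loop variable survives the loop and keeps the last value.
vector2 <- c('a', 'b', 'c')
for (item in vector2) {
  # empty body: we only care about the loop variable afterwards
}
print(item)

# Reusing an existing name as the loop variable overwrites it.
total <- 100
for (total in 1:3) {
  # empty body
}
print(total)  # no longer 100
```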

User-defined functions

We will use BMI calculation as an example for this part.

Define variables.

weight_kg <- 70
height_m <- 1.80

Calculate BMI.

bmi <- weight_kg/height_m^2
print(bmi)
[1] 21.60494

If we plan to calculate BMI for multiple individuals, it is convenient to wrap the calculation in a function.

  • Function name: calculate_bmi.

  • Function parameters: weight_kg and height_m.

  • The return value: bmi.

The return statement specifies the value that the function will return when called.

calculate_bmi <- function(weight_kg, height_m){
  bmi <- weight_kg/height_m^2
  return(bmi)
}

We can call the function using previously defined variables.

calculate_bmi(weight_kg = weight_kg, 
              height_m = height_m)
[1] 21.60494

We can also pass numbers directly to the function.

calculate_bmi(weight_kg = 100, 
              height_m = 1.90)
[1] 27.70083

Argument Order in Function Calls

If we specify the parameter names, the order can be changed.

calculate_bmi(height_m = 1.90, 
              weight_kg = 100)
[1] 27.70083

If we do not specify the parameter names, the arguments are matched by position, so be careful: swapping them silently gives a wrong result.

calculate_bmi(1.90, 100)
[1] 0.00019
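
The matching rules can be sketched side by side. The function is redefined here so the block runs on its own; the variable names named, positional, swapped and mixed are illustrative only.

```r
calculate_bmi <- function(weight_kg, height_m) {
  bmi <- weight_kg / height_m^2
  return(bmi)
}

# Named arguments: order does not matter.
named <- calculate_bmi(height_m = 1.90, weight_kg = 100)

# Positional arguments in the order of the definition: same result.
positional <- calculate_bmi(100, 1.90)

# Positional arguments in the wrong order: runs without error, wrong value.
swapped <- calculate_bmi(1.90, 100)

# Mixed: named arguments are matched first, the rest fill the
# remaining parameters in order, so 1.90 becomes height_m.
mixed <- calculate_bmi(1.90, weight_kg = 100)

print(c(named, positional, swapped, mixed))
```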

Combining function call with if-statement

Data on a single individual.

age <- 45
weight_kg <- 85
height_m <- 1.75

BMI should only be calculated for individuals aged 18 or older.

if (age >= 18){
  calculate_bmi(weight_kg, height_m)
}
[1] 27.7551

Combining function call with for-loops

Data on 5 individuals.

df <- data.frame(row.names = 1:5, 
                 age = c(45, 16, 31, 56, 19), 
                 weight_kg = c(85, 65, 100, 45, 76), 
                 height_m = c(1.75, 1.45, 1.95, 1.51, 1.89))

Print ID, weight, and height of all individuals.

for (id in rownames(df)){
  weight <- df[id, 'weight_kg']
  height <- df[id, 'height_m']
  print(c(id, weight, height))
}
[1] "1"    "85"   "1.75"
[1] "2"    "65"   "1.45"
[1] "3"    "100"  "1.95"
[1] "4"    "45"   "1.51"
[1] "5"    "76"   "1.89"

Call function to calculate BMI for all individuals.

for (id in rownames(df)) {
  weight <- df[id, 'weight_kg']
  height <- df[id, 'height_m']
  bmi <- calculate_bmi(weight, height)
  print(c(id, bmi))
}
[1] "1"                "27.7551020408163"
[1] "2"                "30.9155766944114"
[1] "3"                "26.2984878369494"
[1] "4"                "19.7359764922591"
[1] "5"                "21.2760001119789"

Combination of function call, if-statement and for-loops.

Print BMI for individuals that are 18 years old or older.

for (id in rownames(df)) {
  if (df[id, 'age'] >= 18) {
    weight <- df[id, 'weight_kg']
    height <- df[id, 'height_m']
    bmi <- calculate_bmi(weight, height)
    print(c(id, bmi))
  } else {
    print(paste(id, 'is under 18.'))
  }
}
[1] "1"                "27.7551020408163"
[1] "2 is under 18."
[1] "3"                "26.2984878369494"
[1] "4"                "19.7359764922591"
[1] "5"                "21.2760001119789"

Add BMI to the data frame.

for (id in rownames(df)){
  if (df[id, 'age'] >= 18) {
    weight <- df[id, 'weight_kg']
    height <- df[id, 'height_m']
    bmi <- calculate_bmi(weight, height)
  } else {
    bmi <- NA
  }
  df[id, 'bmi'] <- bmi
}

Have a look at the data frame.

df
  age weight_kg height_m      bmi
1  45        85     1.75 27.75510
2  16        65     1.45       NA
3  31       100     1.95 26.29849
4  56        45     1.51 19.73598
5  19        76     1.89 21.27600
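
The same column can also be built without an explicit loop. The sketch below uses base R's ifelse(), which is an alternative to the approach above and is not taken from the original script. The data frame is rebuilt so the block runs on its own.

```r
df <- data.frame(row.names = 1:5,
                 age = c(45, 16, 31, 56, 19),
                 weight_kg = c(85, 65, 100, 45, 76),
                 height_m = c(1.75, 1.45, 1.95, 1.51, 1.89))

# ifelse() tests every row at once: BMI where age >= 18, NA otherwise.
df$bmi <- ifelse(df$age >= 18,
                 df$weight_kg / df$height_m^2,
                 NA)
print(df)
```

The loop version makes each step explicit, which is useful when learning; the vectorised version is shorter once the logic is clear.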

Outsourcing functions to a separate R script that you source

Remove calculate_bmi from the global environment.

rm(list = "calculate_bmi")

By sourcing a script, all global variables (including functions) defined in that script are loaded into the Global Environment, where they appear in RStudio's Environment pane. Here we source the functions.R script.

source('functions.R')


After we have sourced the functions script, the calculate_bmi function can be used just as if it were defined in the main script. If you work on a larger project and write multiple functions, it is best practice to keep them in a separate functions script and source it in your main script.

calculate_bmi(weight_kg = 67, 
              height_m = 1.70)
[1] 23.18339
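
To try the sourcing mechanism without a functions.R file at hand, the sketch below writes a one-function script to a temporary file and sources it. The temporary file stands in for functions.R; its contents are an assumption that mirrors the calculate_bmi definition shown earlier.

```r
# Write a small stand-in for functions.R to a temporary file.
script_path <- tempfile(fileext = '.R')
writeLines(c(
  'calculate_bmi <- function(weight_kg, height_m) {',
  '  bmi <- weight_kg / height_m^2',
  '  return(bmi)',
  '}'
), script_path)

# source() runs the file; by default its definitions land in the
# global environment.
source(script_path)

print(calculate_bmi(weight_kg = 67, height_m = 1.70))
```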

Use mapply as an alternative to calling the function in a for-loop.

mapply(FUN = calculate_bmi, 
       weight_kg = df$weight_kg, 
       height_m = df$height_m)
[1] 27.75510 30.91558 26.29849 19.73598 21.27600
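
Because calculate_bmi only uses arithmetic that already works element by element, passing whole vectors directly gives the same numbers as mapply. The sketch below checks this; weights and heights are illustrative names holding the same values as the data frame columns.

```r
calculate_bmi <- function(weight_kg, height_m) {
  bmi <- weight_kg / height_m^2
  return(bmi)
}

weights <- c(85, 65, 100, 45, 76)
heights <- c(1.75, 1.45, 1.95, 1.51, 1.89)

# mapply calls the function once per pair of elements.
via_mapply <- mapply(FUN = calculate_bmi,
                     weight_kg = weights,
                     height_m = heights)

# Division and ^ are vectorised, so one call handles all pairs.
via_vector <- calculate_bmi(weights, heights)

print(all.equal(via_mapply, via_vector))
```

mapply remains the safer choice for functions whose body is not vectorised, for example one that contains an if statement on its arguments.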

Functions with error handling.

The calculate_bmi_2 function is defined in the functions script.

The BMI function without error handling returns a meaningless BMI value if given a negative weight.

calculate_bmi(weight_kg = -50, height_m = 1.80)
[1] -15.4321

The BMI function with error handling returns an error if given a negative weight.

calculate_bmi_2(weight_kg = -50, height_m = 1.80)

The BMI function with error handling returns a warning if a BMI outside the normal range is calculated.

calculate_bmi_2(weight_kg = 25, height_m = 1.80)
Warning in calculate_bmi_2(weight_kg = 25, height_m = 1.8): The calculated BMI
is outside the normal range. Please check your input values.
[1] 7.716049
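
The definition of calculate_bmi_2 lives in the functions script and is not shown here. The block below is a rough sketch of how such a function could be written with stop() and warning(); it is not the original definition. The name calculate_bmi_sketch, the error message, and the range limits of 10 and 60 are all assumptions made for illustration. Only the warning text is copied from the output above.

```r
# Hypothetical sketch, not the calculate_bmi_2 from functions.R.
calculate_bmi_sketch <- function(weight_kg, height_m) {
  # stop() aborts the call and signals an error.
  if (weight_kg <= 0 || height_m <= 0) {
    stop('weight_kg and height_m must be positive.')  # assumed message
  }

  bmi <- weight_kg / height_m^2

  # warning() reports a problem but still lets the function return.
  # The limits 10 and 60 are assumed, not taken from the source.
  if (bmi < 10 || bmi > 60) {
    warning('The calculated BMI is outside the normal range. Please check your input values.')
  }

  return(bmi)
}

# tryCatch lets the script continue after the error.
result <- tryCatch(
  calculate_bmi_sketch(weight_kg = -50, height_m = 1.80),
  error = function(e) paste('Error caught:', conditionMessage(e))
)
print(result)
```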