Exercise 4B: Scripting in R - Functions

In this exercise you will practice creating and applying user-defined functions.

Getting started

Load libraries and the joined diabetes data set.

User defined Functions

  1. Create a function named calculate_risk_score().

It should accept the following parameters:

  • BloodPressure
  • BMI
  • Smoking
  • PhysicalActivity

Initiate the risk score with a value of 0 and calculate the final risk score with the following rules:

  • if BloodPressure is > 90, add 0.5
  • if Smoking is ‘Smoker’, add 1
  • if BMI is > 30, add 1. If BMI > 40, add 2
  • if PhysicalActivity is > 110, substract 1

The function should return the final risk score. Test it with some different inputs to verify that it works according to the rules detailed above.

  1. Add a check to your function whether the supplied parameters are numeric, except for Smoking which should be a factor or a character (you can check that too if you want to). Test that your check works correctly.

  2. In a for-loop, apply your calculate_risk_score to each row in diabetes_glucose. Remember to use the appropriate column, i.e. diabetes_glucose$BloodPressure should be the argument for BloodPressure in the function. Add the calculated risk score to diabetes_glucose as a column Risk_score.

  3. Now, run calculate_risk_score on each row in diabetes_glucose by using mapply instead. Confirm that your result is the same.

  4. Create an R script file to contain your functions. Copy your functions there and remove them from your global environment with rm(list="name_of_your_function"). Now source the function R script in your quarto document and test that the functions work.


Extra exercises

These exercises will ask you to first perform some tidyverse operations. Once you succeeded, you should abstract your code into a function.

We start by unnesting diabetes_glucose so you get back the Measurement and Glucose columns:

e1. Calculate the mean of Glucose (mmol/L) for each measuring time point across all patients. You should obtain 3 values, one for 0, one for 60 and one for 120. e2. Now we would like to stratify this mean further by a second variable, Sex. We would like to obtain 6 means: 0_female, 0_male, 60_female, 60_male, 120_female, 120_male.

e3. Now we would like to abstract the name of the second column. Imagine we would like glucose measurement means per marriage status or per workplace instead of per sex. Create a variable column <- 'Sex' and redo what you did above but using column instead of the literal name of the column (Sex). This works in the same way as when plotting with a variable instead of the literal column name. Have a look at presentation 4A if you don’t remember how to do it.

e4. Now, make your code into a function calc_mean_meta() that can calculate glucose measurement means based on different meta data columns. It should be called like so calc_mean_meta('Sex') and give you the same result as before. Try it on other metadata columns in diabetes_glucose_unnest too!