Exercise 4B: Scripting in R - Functions
In this exercise you will practice creating and applying user-defined functions.
Getting started
Load libraries and the joined diabetes data set.
User defined Functions
- Create a function named
calculate_risk_score()
.
It should accept the following parameters:
- BloodPressure
- BMI
- Smoking
- PhysicalActivity
Initiate the risk score with a value of 0 and calculate the final risk score with the following rules:
- if BloodPressure is > 90, add 0.5
- if Smoking is ‘Smoker’, add 1
- if BMI is > 30, add 1. If BMI > 40, add 2
- if PhysicalActivity is > 110, substract 1
The function should return the final risk score. Test it with some different inputs to verify that it works according to the rules detailed above.
Add a check to your function whether the supplied parameters are numeric, except for Smoking which should be a factor or a character (you can check that too if you want to). Test that your check works correctly.
In a for-loop, apply your
calculate_risk_score
to each row indiabetes_glucose
. Remember to use the appropriate column, i.e.diabetes_glucose$BloodPressure
should be the argument forBloodPressure
in the function. Add the calculated risk score to diabetes_glucose as a columnRisk_score
.Now, run
calculate_risk_score
on each row indiabetes_glucose
by usingmapply
instead. Confirm that your result is the same.Create an R script file to contain your functions. Copy your functions there and remove them from your global environment with
rm(list="name_of_your_function")
. Now source the function R script in your quarto document and test that the functions work.
Extra exercises
These exercises will ask you to first perform some tidyverse operations. Once you succeeded, you should abstract your code into a function.
We start by unnesting diabetes_glucose so you get back the Measurement and Glucose columns:
e1. Calculate the mean of Glucose (mmol/L)
for each measuring time point across all patients. You should obtain 3 values, one for 0, one for 60 and one for 120. e2. Now we would like to stratify this mean further by a second variable, Sex
. We would like to obtain 6 means: 0_female, 0_male, 60_female, 60_male, 120_female, 120_male.
e3. Now we would like to abstract the name of the second column. Imagine we would like glucose measurement means per marriage status or per workplace instead of per sex. Create a variable column <- 'Sex'
and redo what you did above but using column
instead of the literal name of the column (Sex
). This works in the same way as when plotting with a variable instead of the literal column name. Have a look at presentation 4A if you don’t remember how to do it.
e4. Now, make your code into a function calc_mean_meta()
that can calculate glucose measurement means based on different meta data columns. It should be called like so calc_mean_meta('Sex')
and give you the same result as before. Try it on other metadata columns in diabetes_glucose_unnest
too!