Exercise 4A: Scripting in R - Conditions and For-loops
In this exercise you will practice your scripting in R.
Getting started
Load the R libraries you think you will need for this exercise. You will be doing some data manipulation and plotting, so have a look at which packages we used in Presentation 2+3. No worries if you forget any, you can always load them later on.
If-else statements
In these exercises we don’t use the dataframe yet; that comes later when we get to loops. For this part, just declare variables to test your statements, e.g. bp <- 120.
Write an if-else statement that prints whether a person has high (more than 100), low (lower than 50) or normal blood pressure (between 50 and 100).
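As a point of comparison, here is a minimal sketch of one possible shape for such a statement. The variable name bp_status and the exact messages are illustrative assumptions, not part of the exercise.

```r
# Hypothetical test value; change it to exercise each branch
bp <- 120

# Check the most extreme cases first, then fall through to "normal"
if (bp > 100) {
  bp_status <- "High blood pressure"
} else if (bp < 50) {
  bp_status <- "Low blood pressure"
} else {
  bp_status <- "Normal blood pressure"
}

print(bp_status)
```

Re-running the block with values such as 40, 50, 100 and 101 is a quick way to check how the boundaries behave.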
Write an if-else statement that assigns people high, moderate or low health risk based on their smoking habits (Smoker <- 'Smoker') and BMI:
Smoker and BMI greater than 35 -> high risk
Smoker or BMI greater than 35 -> moderate risk
otherwise low risk
Smoker should be one of “Smoker”, “Former”, “Never” or “Unknown”.
Verify that your statement works for different combinations of smoking habits and BMI.
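One way to try several combinations quickly is to wrap the if-else chain in a small function. The sketch below is only one possible reading of the rules: the helper name assess_risk is hypothetical, and it assumes that only the value "Smoker" counts as smoking, so "Former", "Never" and "Unknown" are all treated as non-smokers.

```r
# Hypothetical helper; not part of the exercise material
assess_risk <- function(Smoker, BMI) {
  # Guard against values outside the four allowed categories
  stopifnot(Smoker %in% c("Smoker", "Former", "Never", "Unknown"))

  if (Smoker == "Smoker" && BMI > 35) {
    "High"        # both conditions hold
  } else if (Smoker == "Smoker" || BMI > 35) {
    "Moderate"    # exactly one condition holds
  } else {
    "Low"         # neither condition holds
  }
}

# Try a few combinations of smoking habits and BMI
print(assess_risk("Smoker", 40))
print(assess_risk("Former", 40))
print(assess_risk("Never", 22))
```

The order of the branches matters here: the "and" case has to be tested before the "or" case, otherwise high-risk people would be labelled moderate.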
Loops
Read in the joint diabetes dataset you created in Exercise 2. If you did not make it all the way through Exercise 2 you can find the dataset in ../data/exercise2_diabetes_glucose.xlsx. Name the dataset diabetes_glucose.
Print each column name in the diabetes_glucose dataframe using a for loop.
Loop over all rows of diabetes_glucose and determine whether the person’s blood pressure is high, defined as bp > 100. If a person has high blood pressure, print the string “High blood pressure”, along with the blood pressure value and the ID belonging to that person.
Loop over all rows of diabetes_glucose, extract the smoking habits and BMI for each row, and determine the health risk with the same conditions as in the health risk if-else exercise above (high, moderate or low risk). Print the smoking habits and BMI as well as the health risk level to make it easier to see whether your code works correctly.
Extract the value for the i’th row of a specific column: df$col1[i]
An easy way to print several variables is to pass a vector to print: print(c(this, and_that, and_this_too))
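The two hints can be combined as in the sketch below. It uses a small made-up data frame called toy instead of diabetes_glucose, and the column names ID and BloodPressure are assumptions, so adjust them to match your own data. The vector high_ids is only there to make the result easy to inspect.

```r
# Toy stand-in for diabetes_glucose; names and values are made up
toy <- data.frame(
  ID = c(1, 2, 3),
  BloodPressure = c(85, 110, 102)
)

# Loop over the column names and print each one
for (col_name in colnames(toy)) {
  print(col_name)
}

# Loop over the row indices and pull the i'th value from a column
high_ids <- c()
for (i in seq_len(nrow(toy))) {
  bp <- toy$BloodPressure[i]
  if (bp > 100) {
    # c() turns all elements into strings so they print together
    print(c("High blood pressure", bp, toy$ID[i]))
    high_ids <- c(high_ids, toy$ID[i])
  }
}
```

Using seq_len(nrow(toy)) rather than 1:nrow(toy) avoids a surprising loop over c(1, 0) if the data frame happens to have no rows.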
- Do the same as above but instead of printing the risk status, append it to a list. Start by initiating an empty list, e.g. risk_status <- list().
- Check the length of the list. Is it as expected?
Since we looped through all the rows in the diabetes_glucose dataframe, the list should have as many elements as there are rows in the dataframe.
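A minimal sketch of the append-and-check pattern, again on a made-up data frame toy with assumed column names Smoker and BMI, and assuming that only the value "Smoker" counts as smoking:

```r
# Toy stand-in for diabetes_glucose; names and values are made up
toy <- data.frame(
  Smoker = c("Smoker", "Never", "Former"),
  BMI = c(38, 24, 36),
  stringsAsFactors = FALSE
)

# Start from an empty list and grow it by one element per row
risk_status <- list()
for (i in seq_len(nrow(toy))) {
  smoker <- toy$Smoker[i]
  bmi <- toy$BMI[i]

  if (smoker == "Smoker" && bmi > 35) {
    risk <- "High"
  } else if (smoker == "Smoker" || bmi > 35) {
    risk <- "Moderate"
  } else {
    risk <- "Low"
  }

  risk_status <- append(risk_status, risk)
}

# The list should have one element per row
print(length(risk_status) == nrow(toy))
```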
Add the list as a new column in the diabetes_glucose data frame. Note: Before assigning it, use the unlist() function to convert the list to a flat vector. This ensures that each value aligns correctly with the rows of the data frame.
Write a for loop that creates a ggplot2 barplot for each of the categorical variables in diabetes_glucose. You could do this in multiple ways: one way would be to make a list of all the column names in diabetes_glucose that contain categorical variables and loop over these. Alternatively, instead of a list, you could use a conditional if-statement to check whether a column is categorical.
Write a for loop that creates a ggplot2 boxplot for each of the numerical variables in diabetes_glucose. You should stratify the boxplot by the risk_status variable you created in questions 7-9 above.
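The last three tasks could be approached roughly as sketched below. This is only an illustration: the data frame toy, its column names and the list risk_list are made up, and treating character or factor columns as categorical is an assumption that may need adjusting for your data. The plotting part only runs if ggplot2 is installed.

```r
# Toy stand-in for diabetes_glucose; names and values are made up
toy <- data.frame(
  Sex = c("Female", "Male", "Female", "Male"),
  Smoker = c("Smoker", "Never", "Former", "Never"),
  Age = c(34, 51, 46, 29),
  BMI = c(38, 24, 36, 27),
  stringsAsFactors = FALSE
)

# Flatten the list to a plain vector before adding it as a column
risk_list <- list("High", "Low", "Moderate", "Low")
toy$risk_status <- unlist(risk_list)

# Sort the columns into categorical and numerical with a conditional
categorical_cols <- c()
numerical_cols <- c()
for (col_name in colnames(toy)) {
  if (is.character(toy[[col_name]]) || is.factor(toy[[col_name]])) {
    categorical_cols <- c(categorical_cols, col_name)
  } else if (is.numeric(toy[[col_name]])) {
    numerical_cols <- c(numerical_cols, col_name)
  }
}

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)

  # One barplot per categorical column
  for (col_name in categorical_cols) {
    p <- ggplot(toy, aes(x = .data[[col_name]])) +
      geom_bar() +
      labs(x = col_name)
    print(p)  # inside a for loop, plots are only drawn when printed
  }

  # One boxplot per numerical column, stratified by risk_status
  for (col_name in numerical_cols) {
    p <- ggplot(toy, aes(x = risk_status, y = .data[[col_name]])) +
      geom_boxplot() +
      labs(y = col_name)
    print(p)
  }
}
```

The .data[[col_name]] form lets aes() look up a column whose name is stored as a string in the loop variable.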