library(tidyverse)
Warning: pakke 'ggplot2' blev bygget under R version 4.2.3
Warning: pakke 'tibble' blev bygget under R version 4.2.3
Warning: pakke 'dplyr' blev bygget under R version 4.2.3
In this exercise you will do some more advance tidyverse operations such as pivoting and nesting, as well as create plots to brush up on your ggplot skills.
Warning: pakke 'ggplot2' blev bygget under R version 4.2.3
Warning: pakke 'tibble' blev bygget under R version 4.2.3
Warning: pakke 'dplyr' blev bygget under R version 4.2.3
OGTT
column contains measurements from a Oral Glucose Tolerance Test where blood glucose is measured at fasting (Glucose_0), 6 hours after glucose intake (Glucose_6), and 12 hours after (Glucose_12).# A tibble: 6 × 4
Glucose_0 Glucose_60 Glucose_120 ID
<dbl> <dbl> <dbl> <dbl>
1 6.65 8.04 10.0 9046
2 4.49 5.40 6.22 51676
3 5.76 6.52 7.22 1665
4 6.13 6.94 8.09 12095
5 6.84 6.92 7.01 12175
6 6.84 7.62 8.42 8213
df_glucose_long <- df_glucose %>%
pivot_longer(cols = starts_with("Glucose"),
names_to = "Measurement",
values_to = "Glucose (mmol/L)"
)
head(df_glucose_long)
# A tibble: 6 × 3
ID Measurement `Glucose (mmol/L)`
<dbl> <chr> <dbl>
1 9046 Glucose_0 6.65
2 9046 Glucose_60 8.04
3 9046 Glucose_120 10.0
4 51676 Glucose_0 4.49
5 51676 Glucose_60 5.40
6 51676 Glucose_120 6.22
The stringr
packages is a part of tidyverse and has many functions for manipulating strings. Find a function that can split the string so you can extract the numbers on the other side of the underscore.
df_glucose_long <- df_glucose_long %>%
mutate(Measurement = str_split_i(Measurement, '_', 2) %>% as.numeric())
head(df_glucose_long)
# A tibble: 6 × 3
ID Measurement `Glucose (mmol/L)`
<dbl> <dbl> <dbl>
1 9046 0 6.65
2 9046 60 8.04
3 9046 120 10.0
4 51676 0 4.49
5 51676 60 5.40
6 51676 120 6.22
df_glucose_nested <- df_glucose_long %>%
group_by(ID) %>%
nest(OGTT = c(Measurement, `Glucose (mmol/L)`)) %>%
ungroup()
head(df_glucose_nested)
# A tibble: 6 × 2
ID OGTT
<dbl> <list>
1 9046 <tibble [3 × 2]>
2 51676 <tibble [3 × 2]>
3 1665 <tibble [3 × 2]>
4 12095 <tibble [3 × 2]>
5 12175 <tibble [3 × 2]>
6 8213 <tibble [3 × 2]>
diabetes_glucose <- diabetes_join %>%
left_join(df_glucose_nested, by = 'ID')
head(diabetes_glucose)
# A tibble: 6 × 12
ID Sex Age BloodPressure GeneticRisk BMI PhysicalActivity Smoker
<dbl> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <chr>
1 9046 Male 34 84 0.619 24.7 93 Unknown
2 51676 Male 25 74 0.591 22.5 102 Unknown
3 60182 Male 50 80 0.178 34.5 98 Unknown
4 1665 Female 27 60 0.206 26.3 82 Never
5 56669 Male 35 84 0.286 35 58 Smoker
6 53882 Female 31 78 1.22 43.3 59 Smoker
# ℹ 4 more variables: Diabetes <chr>, Married <chr>, Work <chr>, OGTT <list>
OGTT
) from your favorite ID.First filter
for your favorite ID and then pull
the nested column.
[[1]]
# A tibble: 3 × 2
Measurement `Glucose (mmol/L)`
<dbl> <dbl>
1 0 6.65
2 60 8.04
3 120 10.0
.rds
file. Have a guess at what the function is called.