R Objects: Named lists

In our last exercise, ‘3B: PCA’, we encountered the object pca_res <- prcomp(df, scale. = TRUE) (or whatever you named it). Let's take a closer look at its structure, because this is a type of object you will encounter many times while using R.

Creating a PCA object

#built-in dataset: iris
head(iris)
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          5.1         3.5          1.4         0.2  setosa
2          4.9         3.0          1.4         0.2  setosa
3          4.7         3.2          1.3         0.2  setosa
4          4.6         3.1          1.5         0.2  setosa
5          5.0         3.6          1.4         0.2  setosa
6          5.4         3.9          1.7         0.4  setosa
#create PCA object
pca_res <- prcomp(iris[1:4], scale. = TRUE)

pca_res is a named list:

typeof(pca_res)
[1] "list"

This means it has several elements inside it and they are named. You can inspect them by clicking on pca_res in the Environment pane, which shows their names, types and some example values. Lists are useful because their elements can have different data types, whereas all elements of a vector must have the same type.
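
The difference between vectors and lists can be seen in a small sketch. The values and the names mixed_vec and mixed_list are made up for illustration only: c() coerces everything to one common type, while list() keeps each element as it is.

```r
# c() builds a vector, so all values are coerced to one common type
mixed_vec <- c(1.5, "a", TRUE)
typeof(mixed_vec)

# list() keeps the type of every element
mixed_list <- list(number = 1.5, text = "a", flag = TRUE)
sapply(mixed_list, typeof)
```

The vector ends up as type character, whereas the list elements stay double, character and logical.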

We can also list the element names of a named list (or of any other named object, such as a data frame or tibble):

names(pca_res)
[1] "sdev"     "rotation" "center"   "scale"    "x"       
names(iris)
[1] "Sepal.Length" "Sepal.Width"  "Petal.Length" "Petal.Width"  "Species"     

Elements of a named list are usually accessed by name with the $ operator (double square brackets, as in pca_res[["sdev"]], work as well):

#the standard deviations of the principal components
pca_res$sdev
[1] 1.7083611 0.9560494 0.3830886 0.1439265
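
As a short sketch of the ways to reach into a list: $ and [[ ]] both return the element itself, while single square brackets return a smaller list. The variable name element is only an illustration.

```r
pca_res <- prcomp(iris[1:4], scale. = TRUE)

# $ and [[ ]] return the same element
identical(pca_res$sdev, pca_res[["sdev"]])

# [[ ]] also works when the name is stored in a variable
element <- "sdev"
pca_res[[element]]

# single brackets return a list of length one, not the element itself
typeof(pca_res["sdev"])
length(pca_res["sdev"])
```

The [[ ]] form is handy inside loops or functions, where the element name is not known in advance.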

Elements of a named list (or any list) can themselves be multi-dimensional. An example is x, a matrix holding the coordinates of each data point in PC space:

head(pca_res$x)
           PC1        PC2         PC3          PC4
[1,] -2.257141 -0.4784238  0.12727962  0.024087508
[2,] -2.074013  0.6718827  0.23382552  0.102662845
[3,] -2.356335  0.3407664 -0.04405390  0.028282305
[4,] -2.291707  0.5953999 -0.09098530 -0.065735340
[5,] -2.381863 -0.6446757 -0.01568565 -0.035802870
[6,] -2.068701 -1.4842053 -0.02687825  0.006586116
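
Because x is a matrix, it can be inspected and indexed like any other matrix once it has been pulled out of the list. A minimal sketch:

```r
pca_res <- prcomp(iris[1:4], scale. = TRUE)

# x is a matrix with one row per data point and one column per PC
is.matrix(pca_res$x)
dim(pca_res$x)

# rows by position, columns by name
pca_res$x[1:3, "PC1"]
pca_res$x[1:3, c("PC1", "PC2")]
```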

Calling summary() on many named list objects gives a compact overview:

summary(pca_res)
Importance of components:
                          PC1    PC2     PC3     PC4
Standard deviation     1.7084 0.9560 0.38309 0.14393
Proportion of Variance 0.7296 0.2285 0.03669 0.00518
Cumulative Proportion  0.7296 0.9581 0.99482 1.00000
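
The result of summary() can itself be stored and inspected. For a prcomp object it is again a named list, with an extra element, importance, holding the table printed above. A small sketch, where the name pca_summary is chosen only for illustration:

```r
pca_res <- prcomp(iris[1:4], scale. = TRUE)

# store the summary instead of only printing it
pca_summary <- summary(pca_res)
typeof(pca_summary)
names(pca_summary)

# pull one row out of the importance table
pca_summary$importance["Proportion of Variance", ]
```

This is useful when the numbers are needed later, for example to label plot axes with the variance explained.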

Some named list objects also have a class attribute. Named lists with a class attribute are a common kind of S3 object (more generally, an S3 object is any base R object that has a class attribute).

class(pca_res)
[1] "prcomp"

R uses the class to figure out how to process the object: for example, summary() looks at the class and calls the method written for prcomp objects. So the class is about what the object is, whereas the type (as in typeof()) is about the structure of the object and how you can interact with it.
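
To see how class and type relate, here is a minimal sketch that builds a small S3 object by hand. The class name measurement, its values and the method summary.measurement are hypothetical and exist only for this illustration.

```r
# a plain named list with made-up values
measurement <- list(value = 4.2, unit = "cm")

# adding a class attribute makes it an S3 object (hypothetical class name)
class(measurement) <- "measurement"

# a summary method for that class; summary() finds it through the class name
summary.measurement <- function(object, ...) {
  cat("Measurement of", object$value, object$unit, "\n")
  invisible(object)
}

typeof(measurement)    # the structure is still a list
class(measurement)     # the class says what the object is
summary(measurement)   # calls summary.measurement()
```

The type did not change when the class was added, so $ and names() work exactly as they do on any other named list.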

Bonus info: A ggplot has traditionally been an S3 object too (ggplot2 4.0.0 moved its objects to the newer S7 system). Bet you didn’t know that!