Beginning with R — The uncharted territory Part 2
For a recap of lists, vectors and matrices in R, check out Beginning with R — The uncharted territory Part 1.
Arrays
An array is an object that can hold multidimensional data. A matrix is a special case of an array: it is an array with exactly two dimensions. Along with the dim attribute, which stores the extent of each dimension, an array can also carry a dimnames attribute that labels the entries along each dimension.
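The general syntax is a <- array(data, dim = c(x, y, z, ...)), where dim gives the extent of each dimension. The data fill the array column by column, one slice after another.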
a <- array(1:24, dim = c(3,4,2)); print(a)
## , , 1
##
## [,1] [,2] [,3] [,4]
## [1,] 1 4 7 10
## [2,] 2 5 8 11
## [3,] 3 6 9 12
##
## , , 2
##
## [,1] [,2] [,3] [,4]
## [1,] 13 16 19 22
## [2,] 14 17 20 23
## [3,] 15 18 21 24
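As a quick check of the claim that a matrix is just a two-dimensional array, the sketch below inspects the dim attribute of a matrix and of a three-dimensional array. The variable names m and a3 are illustrative only and do not come from the text above.

```r
# A matrix is an array with exactly two dimensions.
m <- matrix(1:6, nrow = 2)
print(is.array(m))      # TRUE: a matrix is also an array
print(dim(m))           # 2 3

# A three-dimensional array is an array, but not a matrix.
a3 <- array(1:24, dim = c(3, 4, 2))
print(dim(a3))          # 3 4 2
print(is.matrix(a3))    # FALSE

# Data fill the array column by column, slice by slice.
print(a3[2, 3, 1])      # 8: row 2, column 3 of the first slice
```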
vec1 <- c(10,20,30,40)
vec2 <- c(12,13,14,15)
b <- array(c(vec1,vec2), dim = c(2,2,2)); print(b)
## , , 1
##
## [,1] [,2]
## [1,] 10 30
## [2,] 20 40
##
## , , 2
##
## [,1] [,2]
## [1,] 12 14
## [2,] 13 15
To label the dimensions, pass dimnames a list of character vectors, one vector per dimension.
vec1 <- c(10,20,30,40)
vec2 <- c(12,13,14,15)
b <- array(c(vec1,vec2), dim=c(2,2,2), dimnames = list(c("a", "b"),
c("d", "e"),
c("g", "h"))); print(b)
## , , g
##
## d e
## a 10 30
## b 20 40
##
## , , h
##
## d e
## a 12 14
## b 13 15
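Once an array has dimnames, its elements can be selected by label instead of by position. The following is a minimal sketch that rebuilds the same array b and indexes it by name.

```r
vec1 <- c(10, 20, 30, 40)
vec2 <- c(12, 13, 14, 15)
b <- array(c(vec1, vec2), dim = c(2, 2, 2),
           dimnames = list(c("a", "b"), c("d", "e"), c("g", "h")))

# Pick a single element by the names of its row, column and slice.
print(b["a", "e", "h"])      # 14

# An empty index keeps the whole dimension: row "b" of slice "g".
print(b["b", , "g"])         # named vector: d = 20, e = 40

# dimnames() returns the labels as a list, one element per dimension.
print(dimnames(b)[[3]])      # "g" "h"
```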
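Arrays are subset with one index per dimension, and an empty index selects everything along that dimension. The example below builds a 3 x 3 x 3 array and then takes the first two rows and columns of every slice.
arr <- array(1:27,dim=c(3,3,3)); print(arr)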
## , , 1
##
## [,1] [,2] [,3]
## [1,] 1 4 7
## [2,] 2 5 8
## [3,] 3 6 9
##
## , , 2
##
## [,1] [,2] [,3]
## [1,] 10 13 16
## [2,] 11 14 17
## [3,] 12 15 18
##
## , , 3
##
## [,1] [,2] [,3]
## [1,] 19 22 25
## [2,] 20 23 26
## [3,] 21 24 27
t <- arr[1:2,1:2,,drop=FALSE]; print(attributes(t)); print(t)
## $dim
## [1] 2 2 3
## , , 1
##
## [,1] [,2]
## [1,] 1 4
## [2,] 2 5
##
## , , 2
##
## [,1] [,2]
## [1,] 10 13
## [2,] 11 14
##
## , , 3
##
## [,1] [,2]
## [1,] 19 22
## [2,] 20 23
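In the subset above no dimension shrinks to extent 1, so drop = FALSE has no visible effect there. The sketch below, which uses illustrative names s1, s2 and s3, selects a single row instead to show what the drop argument changes.

```r
arr <- array(1:27, dim = c(3, 3, 3))

# Default (drop = TRUE): the first dimension has extent 1 and is dropped,
# so the result is a 3 x 3 matrix.
s1 <- arr[1, , ]
print(dim(s1))              # 3 3

# drop = FALSE keeps all three dimensions, the first with extent 1.
s2 <- arr[1, , , drop = FALSE]
print(dim(s2))              # 1 3 3

# Fixing two indices leaves a plain vector with no dim attribute.
s3 <- arr[1, 1, ]
print(s3)                   # 1 10 19
print(is.null(dim(s3)))     # TRUE
```

Keeping the dimensions with drop = FALSE is mostly useful inside functions, where later code expects an array of a fixed rank regardless of how many rows were selected.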
Factors
R represents categorical data with a dedicated object type called a factor. A factor is stored as a vector of integer codes, and each code maps to a label. The set of distinct labels is called the levels of the factor. A factor therefore prints like a character vector but is an integer vector underneath. Factors are also used to group or sort other variables by category.
The factor() function is used to create a factor object.
fruits <- factor(c('apple','orange','orange','apple','orange','banana','apple'))
print(attributes(fruits))
## $levels
## [1] "apple" "banana" "orange"
##
## $class
## [1] "factor"
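To see the integer codes behind the labels, the same factor can be converted with as.integer(). This is a small sketch using the fruits factor defined above.

```r
fruits <- factor(c('apple','orange','orange','apple','orange','banana','apple'))

print(levels(fruits))        # "apple" "banana" "orange"
print(typeof(fruits))        # "integer": the underlying storage type
print(as.integer(fruits))    # 1 3 3 1 3 2 1: position of each value in levels
print(as.character(fruits))  # the labels, as an ordinary character vector

# table() counts how many observations fall under each level.
print(table(fruits))
```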
By default the levels are sorted, which for character data means alphabetical order. To put them in a different order, pass the levels argument explicitly.
fruits <- factor(c('apple','orange','orange','apple','orange','banana','apple'),
levels = c('apple', 'orange', 'banana'))
print(attributes(fruits))
## $levels
## [1] "apple" "orange" "banana"
##
## $class
## [1] "factor"
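Setting levels changes the order in which the levels are stored, and therefore the integer codes, but the factor itself is still not an ordered factor. The sketch below contrasts the two cases. The sizes factor is a hypothetical example introduced only for illustration.

```r
fruits <- factor(c('apple','orange','orange','apple','orange','banana','apple'),
                 levels = c('apple', 'orange', 'banana'))
print(as.integer(fruits))     # 1 2 2 1 2 3 1: codes follow the given level order
print(is.ordered(fruits))     # FALSE: levels are arranged, but not ranked

# ordered = TRUE makes the levels ranked, so comparisons become meaningful.
sizes <- factor(c('small', 'large', 'medium'),
                levels = c('small', 'medium', 'large'),
                ordered = TRUE)
print(is.ordered(sizes))      # TRUE
print(sizes[1] < sizes[2])    # TRUE: small < large
print(sort(sizes))            # sorted by level order, not alphabetically
```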
Dataframes
Dataframes are used to store tabular data. A dataframe is a list of equal-length vectors, where each vector forms one column.
a <- data.frame(city=c('Jaipur','Jammu'), rank = c(2,3)); print(a)
## city rank
## 1 Jaipur 2
## 2 Jammu 3
Different columns can hold different types of data: one column may be character, another a factor, and so on. Within a single column, however, all values must have the same type.
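The column types can be inspected directly. The sketch below rebuilds the same dataframe under the illustrative name df and sets stringsAsFactors = FALSE explicitly, because the default for that argument changed in R 4.0.0 and older versions would otherwise turn the city column into a factor.

```r
df <- data.frame(city = c('Jaipur', 'Jammu'),
                 rank = c(2, 3),
                 stringsAsFactors = FALSE)

print(is.list(df))          # TRUE: a dataframe is a list of columns
print(dim(df))              # 2 2: rows, then columns
print(sapply(df, class))    # city is "character", rank is "numeric"

# A column can be converted to another type, for example to a factor.
df$city <- factor(df$city)
print(class(df$city))       # "factor"
```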