Working with data types of R

I discussed package management in the previous post. This post concentrates on data types, data structures and data coercion.

Data types & Data Structure

R has the following object types (data structures).

  • Vectors
  • Lists
  • Matrices
  • Arrays
  • Factors
  • Data Frames

We have the following data types in R.

  • Logical
    myvar <- TRUE
    
  • Numeric
    myvar <- 23.5
    
  • Integer
    myvar <- 2L
    
  • Complex
    myvar <- 2+5i
    
  • Character
    myvar <- "GOOD"
    
  • Raw
    myvar <- charToRaw("GOOD")
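
To check the list above, here is a minimal sketch that builds one value of each type and asks R what it is. The names vals and types are made up for this illustration.

```r
# One value of each basic type, mirroring the list above
vals <- list(
  logical   = TRUE,
  numeric   = 23.5,
  integer   = 2L,
  complex   = 2+5i,
  character = "GOOD",
  raw       = charToRaw("GOOD")
)

# class() reports the type as R presents it to the user
types <- sapply(vals, class)
print(types)

# typeof() shows the internal storage; a plain number is stored as "double"
print(typeof(vals$numeric))
```

Note the L suffix: without it, 2 would be stored as a numeric (double) rather than an integer.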
    

Let’s see them with examples below.

The c command combines values into a vector. I have already written about it in Basic functions in R language.

> id1 <- c(1,2,3,4)
> id1+10
[1] 11 12 13 14
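
The addition above works on every element at once. As a small sketch (the names shifted, doubled and summed are only for illustration), the same holds for other operators and for two vectors of equal length:

```r
id1 <- c(1, 2, 3, 4)

# The single value 10 is recycled across all four elements
shifted <- id1 + 10

# Multiplication is element-wise as well
doubled <- id1 * 2

# Two vectors of the same length are combined position by position
summed <- id1 + c(100, 200, 300, 400)

print(shifted)
print(doubled)
print(summed)
```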

The rep command repeats values.

> gender <- rep(c("male", "female"), c(6,4))
> gender
[1] "male" "male" "male" "male" "male" "male" "female" "female"
[9] "female" "female"

male is repeated 6 times, whereas female is repeated 4 times.
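
rep can also repeat in other patterns through its times and each arguments. A short sketch, with variable names chosen only for this example:

```r
# A vector of counts: one count per value, as in the example above
by_counts <- rep(c("male", "female"), times = c(6, 4))

# each = 2 repeats every value twice before moving on
each_twice <- rep(c("male", "female"), each = 2)

# A single times value repeats the whole vector
whole_twice <- rep(c("male", "female"), times = 2)

print(each_twice)
print(whole_twice)
```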

Serial numbers are generated by the seq command.

> sn <- seq(1, 10, 1)
> sn
 [1]  1  2  3  4  5  6  7  8  9 10
> # a step of 2, printed without reassigning sn because later examples use all 10 values
> seq(1, 10, 2)
[1] 1 3 5 7 9

We passed the start value, end value and interval.
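
The three arguments can also be named, and seq accepts a target length instead of an interval. A minimal sketch, with names invented for the example:

```r
# Named arguments make the call easier to read
by_step <- seq(from = 1, to = 10, by = 2)

# length.out asks for a number of evenly spaced values instead of a step
by_len <- seq(from = 1, to = 10, length.out = 4)

# seq_len(n) is a shortcut for the integers 1 to n
idx <- seq_len(10)

print(by_step)
print(by_len)
print(idx)
```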

cbind stands for column bind.

> #vector to matrix - cbind
> slno_gender <- cbind(sn, gender)
> slno_gender
      sn   gender
 [1,] "1"  "male"
 [2,] "2"  "male"
 [3,] "3"  "male"
 [4,] "4"  "male"
 [5,] "5"  "male"
 [6,] "6"  "male"
 [7,] "7"  "female"
 [8,] "8"  "female"
 [9,] "9"  "female"
[10,] "10" "female"


This gives column-wise data in a matrix. A matrix holds a single data type only, hence the serial numbers are converted to the character data type.
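
The coercion can be confirmed directly. The sketch below rebuilds the two vectors so that it runs on its own; m and sn_back are names made up for this example.

```r
sn <- seq(1, 10, 1)
gender <- rep(c("male", "female"), c(6, 4))

m <- cbind(sn, gender)

# The whole matrix has one storage type: character
print(typeof(m))
print(dim(m))

# The numbers can be recovered by converting the column back
sn_back <- as.numeric(m[, "sn"])
print(sn_back)
```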

Let’s look at a similar example using a data frame, which overcomes this data type problem.

> #data frame
> df <- data.frame(sn, gender)
> df
   sn gender
1   1   male
2   2   male
3   3   male
4   4   male
5   5   male
6   6   male
7   7 female
8   8 female
9   9 female
10 10 female
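
To see that each column keeps its own type, we can ask for the class of every column. This is a sketch under one assumption: stringsAsFactors = FALSE is set explicitly, because R versions before 4.0.0 convert character columns to factors by default. The names df_check and col_types are only for illustration.

```r
sn <- seq(1, 10, 1)
gender <- rep(c("male", "female"), c(6, 4))

# stringsAsFactors = FALSE keeps gender as character on every R version
df_check <- data.frame(sn, gender, stringsAsFactors = FALSE)

# One class per column: the numbers stay numeric
col_types <- sapply(df_check, class)
print(col_types)
```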

Let’s create another object called name.

> name <- c("Sophia","Jackson","Isabella","Lucas","Charlotte","Oliver","Amelia","Benjamin","Sarah","Julian")
> name
 [1] "Sophia"    "Jackson"   "Isabella"  "Lucas"     "Charlotte" "Oliver"
 [7] "Amelia"    "Benjamin"  "Sarah"     "Julian"

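Let’s create another object called sal, for salary. Given the number of samples (10), the mean (1000) and the standard deviation (200), the rnorm command draws random numbers from a normal distribution with those parameters. With only 10 draws, the sample mean and standard deviation need not match the requested values exactly, as the output below shows.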

> sal <- rnorm(10, 1000, 200)
> sal
 [1]  898.5731 1105.1638  757.7067  934.9025 1006.0053 1014.4837 1188.6763  611.0265
 [9]  643.5498 1121.6005
> mean(sal)
[1] 928.1688
> sd(sal)
[1] 200.4666
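
Because the numbers are random, running rnorm again gives different values. A minimal sketch of two related points, assuming an arbitrary seed of 42 (any integer works) and illustrative names a, b and big: set.seed makes the draws repeatable, and a larger sample lands closer to the requested mean and standard deviation.

```r
# Same seed, same draws
set.seed(42)
a <- rnorm(10, mean = 1000, sd = 200)
set.seed(42)
b <- rnorm(10, mean = 1000, sd = 200)
print(identical(a, b))

# With many more draws, the sample statistics approach 1000 and 200
big <- rnorm(100000, mean = 1000, sd = 200)
print(mean(big))
print(sd(big))
```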

Let’s combine the 3 objects sn, gender and sal to form a new object, using data.frame.

> mydataset <- data.frame(sn, gender, sal)
> mydataset
   sn gender       sal
1   1   male  898.5731
2   2   male 1105.1638
3   3   male  757.7067
4   4   male  934.9025
5   5   male 1006.0053
6   6   male 1014.4837
7   7 female 1188.6763
8   8 female  611.0265
9   9 female  643.5498
10 10 female 1121.6005

Having shown the data frame example above, let’s switch to lists.

> mydatalist <- list(v1=id1, v2=sn, v3=slno_gender, v4=mydataset)
> mydatalist
$v1
[1] 1 2 3 4

$v2
 [1]  1  2  3  4  5  6  7  8  9 10

$v3
      sn   gender
 [1,] "1"  "male"
 [2,] "2"  "male"
 [3,] "3"  "male"
 [4,] "4"  "male"
 [5,] "5"  "male"
 [6,] "6"  "male"
 [7,] "7"  "female"
 [8,] "8"  "female"
 [9,] "9"  "female"
[10,] "10" "female"

$v4
   sn gender       sal
1   1   male  898.5731
2   2   male 1105.1638
3   3   male  757.7067
4   4   male  934.9025
5   5   male 1006.0053
6   6   male 1014.4837
7   7 female 1188.6763
8   8 female  611.0265
9   9 female  643.5498
10 10 female 1121.6005

I put different object types together in a single collection and stored it in one object. Individual elements can be referred to by name with $, as shown below.

> mydatalist$v1
[1] 1 2 3 4

Cool, isn’t it?
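
Besides $, a list element can be reached with double or single square brackets, and the difference matters. A small sketch using a shortened, hypothetical list called lst:

```r
lst <- list(v1 = c(1, 2, 3, 4), v2 = seq(1, 10, 1))

# $ and [[ ]] both return the element itself
by_dollar  <- lst$v1
by_double  <- lst[["v1"]]

# Single brackets return a smaller list that still wraps the element
by_single <- lst["v1"]

print(by_double)
print(class(by_single))
print(names(lst))
```

Double brackets are handy when the element name is held in a variable, which $ cannot do.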

Data Coercion

Let’s look at data coercion now. The mode command helps us find out the data type (storage mode) of an object.

> mode(mydatalist$v1)
[1] "numeric"
> mode(mydatalist$v3)
[1] "character"
> mode(gender)
[1] "character"
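
mode is one of three related commands; typeof and class answer slightly different questions. The sketch below compares them with a small helper, describe, which is a made-up function for this example and not part of R.

```r
# Hypothetical helper: collect the three answers for one object
describe <- function(obj) {
  c(mode = mode(obj), typeof = typeof(obj), class = class(obj)[1])
}

d_num <- describe(c(1, 2, 3, 4))
d_int <- describe(2L)
d_df  <- describe(data.frame(x = 1:3))

print(d_num)
print(d_int)
print(d_df)
```

mode groups integers and doubles together as "numeric", typeof tells them apart, and class reports the data structure, such as "data.frame".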

Let’s convert this character vector to a factor now.

> gender1 <- as.factor(gender)

How is it possible to represent text as numbers? Let’s look at the class, which shows us the data structure.

> class(gender1)
[1] "factor"

Okay, let’s unclass gender1 now to see how it is stored.

> unclass(gender1)
 [1] 2 2 2 2 2 2 1 1 1 1
attr(,"levels")
[1] "female" "male"

So 1 represents female and 2 represents male. The levels are sorted alphabetically, which is why female comes first.
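
The codes and labels can also be read without unclass, and the order of the levels can be chosen by hand. A minimal sketch that rebuilds gender so it runs on its own; gender2 and back are names invented for this example.

```r
gender <- rep(c("male", "female"), c(6, 4))
gender1 <- as.factor(gender)

# Labels and integer codes, read separately
print(levels(gender1))
print(as.integer(gender1))

# factor() with explicit levels puts male first, so the codes swap
gender2 <- factor(gender, levels = c("male", "female"))
print(as.integer(gender2))

# Coercing back to character recovers the original text
back <- as.character(gender1)
print(identical(back, gender))
```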

See you in another interesting post.
