Content

Part1: Some useful R commands
1.1 Command Warm Up
1.2 Data objects: Vectors
1.3 Data objects: Matrices
1.4 Data objects: Data frame
1.5 Data sorting
1.6 Matrix calculations
1.7 Reading and writing data
1.8 Re-directing output and file management
Part2: Financial data analysis

Part1: Some useful R commands

1.1 Command Warm Up

To open the help pages in a web browser, type:

help.start()

To obtain help on a particular topic (e.g. ar, which fits an autoregressive time series model to the data), type:

?ar # or help(ar)

The assignment operator is <- or -> (less common). To type <- in RStudio, press Alt + - on Windows or Option + - on macOS.

x <- 10
x 
5 -> x
x

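The difference between = and <-: inside a function call, = only names an argument, whereas <- explicitly creates a variable in the calling environment. See below:

rm(x) # remove the x created above so that the difference is visible
median(x = 1:5)
x ## Error: object 'x' not found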

median(x <- 1:5)
x

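Extension: <<- is another assignment operator. It looks for an existing variable of that name in the enclosing (parent) environments and reassigns it there; if none is found, the variable is created in the global environment. It is useful in closures and in R's object-oriented programming (OOP), where a function needs to change state held in a parent environment.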
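
As a minimal sketch of <<- in a closure (make_counter is a hypothetical helper defined here for illustration, not a built-in R function), the inner function below updates a variable that lives in its enclosing environment:

```r
# make_counter() is a hypothetical helper, defined only for this illustration
make_counter <- function() {
  count <- 0                 # state lives in the environment of this call
  function() {
    count <<- count + 1      # reassigns count in the enclosing environment
    count
  }
}

counter <- make_counter()
counter()                    # first call
counter()                    # second call
third <- counter()           # third call
third
```

With <- instead of <<-, each call would create a fresh local count and the counter would never advance.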

R is an object-oriented language. It handles many types of objects. Objects are created and stored by name. To display the names of (most of) the objects currently stored within R, use the command:

objects()

Remove variables:

rm(x) # removes the object x
rm(list=ls(all=TRUE)) # removes all objects
objects()

To use commands (e.g. functions) stored in an external file, such as commands.R in the current working directory, type:

source("commands.R")
abcfunction() # here we assume abcfunction() is defined and stored in commands.R

Wrapping an assignment in parentheses also prints the assigned value:

(daisy <- "42029ef215256f8fa9fedb53542ee6553eef76027b116f8fac5346211b1e473c")

Quit R:

q() # no need to try :)

1.2 Data objects: Vectors

Use c() to create a vector:

value.num <- c(3,4,2,6,20) 
value.char <- c("math","cs","finance") 
value.logical <- c(F,F,T,T)

The rep function replicates the elements of a vector:

(value <- rep(5,6))

The seq function creates a regular sequence of values to form a vector:

seq(from=2,to=12,by=2)

The functions can be used in combination:

value <- c(1,2,5,rep(3,4),seq(from=1,to=6,by=3)); value
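
A few further variations of rep and seq, as a short sketch using only their standard arguments times, each and length.out (the variable names are chosen for illustration):

```r
whole  <- rep(c(1, 2), times = 3)              # repeats the whole vector: 1 2 1 2 1 2
each   <- rep(c(1, 2), each = 3)               # repeats each element:     1 1 1 2 2 2
spaced <- seq(from = 0, to = 1, length.out = 5) # 5 equally spaced values from 0 to 1
combo  <- c(1, 2, 5, rep(3, 4), seq(from = 1, to = 6, by = 3))
whole
each
spaced
length(combo)                                   # 3 + 4 + 2 = 9 elements
```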

The scan function is used to enter data at the terminal:

value <- scan()   # type the values, then press Enter on an empty line to finish

Vector operations

x <- runif(10) # random vector of 10 independent, uniformly distributed values on (0,1)
x
y <- 10*x + 1
y
z <- (x-mean(x))/sd(x)
z
mean(x)
sd(x)
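
The operations above work element by element, and shorter operands are recycled. The sketch below repeats them on fixed values (chosen for illustration) so that the results are reproducible:

```r
x <- c(2, 4, 6, 8)            # fixed values instead of random draws
y <- 10 * x + 1               # the scalars 10 and 1 are recycled: 21 41 61 81
z <- (x - mean(x)) / sd(x)    # standardised copy of x
y
mean(z)                       # 0 up to rounding error
sd(z)                         # 1
x + c(100, 200)               # the shorter vector is recycled: 102 204 106 208
```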

1.3 Data objects: Matrices

A matrix may be created from a vector by using dim:

value <- rnorm(10) # random vector of 10 independent, standard normal values
dim(value) <- c(2,5) # 2 x 5 matrix
value
dim(value) <- NULL # back to vector
value

It may also be created from a vector by using matrix:

value1 <- matrix(value,2,5); value1 #2,5 is the dimension of the matrix
matrix(value,2,5,byrow=T) #type ?matrix to see the difference
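
To make the effect of byrow visible, here is a small sketch on the integers 1 to 6 rather than on random values (the names by_col and by_row are chosen for illustration):

```r
v <- 1:6
by_col <- matrix(v, nrow = 2, ncol = 3)                # default: fill column by column
by_row <- matrix(v, nrow = 2, ncol = 3, byrow = TRUE)  # fill row by row
by_col
##      [,1] [,2] [,3]
## [1,]    1    3    5
## [2,]    2    4    6
by_row
##      [,1] [,2] [,3]
## [1,]    1    2    3
## [2,]    4    5    6
```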

To bind a row onto an already existing matrix, the rbind function can be used:

value2 <- rbind(value1,c(1,1,2,2,3)) # add one row

To bind a column onto an already existing matrix, the cbind function can be used:

value3 <- cbind(value2,c(1,1,2)) # add one column
value3

1.4 Data objects: Data frame

The function data.frame converts a matrix or collection of vectors into a data frame:

value3 <- data.frame(value3)
value3
value4 <- data.frame(rnorm(3),runif(3))
value4

To view the row and column names of a data frame:

names(value4)
row.names(value4)

Alternative labels can be assigned by doing the following:

names(value4) <- c("C1","C2")
row.names(value3) <- c("R1","R2","R3")

Names can also be specified within the data.frame function itself:

data.frame(C1=rnorm(3),C2=runif(3),row.names=c("R1","R2","R3"))

The following example shows how to access elements of a vector or matrix:

x <- sample(1:5, 10, replace=TRUE) # draws 10 values from 1 to 5, with replacement
x
ones <- (x == 1); ones # logical vector: TRUE where an element of x equals 1
x[ones] <- 0
x
others <- (x > 1)
y <- x[others] #stores the values greater than 1 into y.
y
which(x > 1) #finds indices of elements bigger than 1
y <- x[-(1:5)] #copies x without the first 5 elements. To exclude values, negative index vectors are used
y
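
The same indexing ideas carry over to matrices and data frames, with one index for rows and one for columns. The sketch below uses small made-up objects (m and scores are hypothetical names and values):

```r
m <- matrix(1:6, nrow = 2, dimnames = list(c("R1", "R2"), c("A", "B", "C")))
m[2, 3]            # single element in row 2, column 3: 6
m["R1", ]          # whole first row, selected by name: 1 3 5
m[, -1]            # all rows, without the first column
m[m > 3]           # a logical index returns a vector: 4 5 6

scores <- data.frame(C1 = c(10, 20, 30), C2 = c("a", "b", "c"))
scores$C1                          # one column, selected by name
big <- scores[scores$C1 > 15, ]    # rows where C1 exceeds 15
big
```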

1.5 Data sorting

The command order allows sorting with tie-breaking: it returns an index vector that arranges its first argument in increasing order. Ties are broken by the second argument, and any remaining ties are broken by the third argument.

x <- sample(1:5, 20, rep=T) 
y <- sample(1:5, 20, rep=T) 
z <- sample(1:5, 20, rep=T) 
xyz <- rbind(x, y, z) 
dimnames(xyz)[[2]] <- letters[1:20] #names the columns by the first 20 letters
xyz
o <- order(x, y, z) # index vector that orders first by x, then by y, and finally by z
xyz[, o]
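
The same idea is commonly used to sort the rows of a data frame. The following is a small sketch on made-up values (the data frame d and its columns are hypothetical):

```r
d <- data.frame(x = c(2, 1, 2, 1), y = c(3, 9, 1, 4))
o <- order(d$x, d$y)         # sort by x, break ties by y: 4 2 3 1
o
d[o, ]                       # rows rearranged by the index vector
o_desc <- order(d$x, -d$y)   # x increasing, y decreasing within ties: 2 4 1 3
o_desc
```

Negating a numeric argument is a simple way to reverse the order for that key only.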

1.6 Matrix calculations

A <- matrix(1:4, nrow=2); B <- diag(2); x <- c(1,2) # small example inputs so that the lines below run
A*B     # the matrix of element-by-element products
A %*% B # the matrix product

x %*% A %*% x # a quadratic form, since x is a vector

(mat1 <- matrix(c(1,0,1,1), nrow=2))
(mat2 <- matrix(c(1,1,0,1), nrow=2))
solve(mat1) # inverts the matrix

#Matrix operation
mat1 %*% mat2 # product
mat1 + mat2 # Matrix addition
t(mat1) # Matrix transposition
det(mat1) # Matrix determinant
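# diag() behaves differently depending on its input
(A<-diag(c(1,2)))  # vector input: diagonal matrix with these entries
(diag(A))          # matrix input: extracts the diagonal
(diag(4))          # single number input: 4 x 4 identity matrix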
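
As a quick check of these operations, the sketch below verifies the inverse, solves a linear system and evaluates a quadratic form. The vectors b and x are made-up values for illustration:

```r
mat1 <- matrix(c(1, 0, 1, 1), nrow = 2)  # same matrix as above
inv1 <- solve(mat1)
mat1 %*% inv1                            # 2 x 2 identity matrix
b <- c(3, 2)
sol <- solve(mat1, b)                    # solves mat1 %*% sol = b without forming the inverse
sol                                      # 1 2
x <- c(1, 2)
qf <- drop(x %*% mat1 %*% x)             # quadratic form; drop() turns the 1 x 1 matrix into a number
qf                                       # 7
```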

1.7 Reading and writing data

For reading and writing files, R uses the working directory, so make sure you either give the full path to the data file or put the data file in your working directory.

getwd() # check the current working directory
setwd("your working space path") # set your working directory

There are several ways to read and load data into the R workspace, depending on the data format. For simple text data, the command is read.table. For .csv files, the command is read.csv. The data file is specified in either single or double quotes; see the examples below and the R commands of Lecture 1 available on IVLE.
R treats the data as an object and refers to it by the assigned name. For both loading commands, R stores the data in a data frame, a rectangular, matrix-like structure. As such, one can use the command dim (i.e., dimension) to see the size of the data.

read.table("http://www.stats.ox.ac.uk/pub/datasets/csb/ch11b.dat") # read via http
read.table("AAPL.txt") # read from a local file; make sure AAPL.txt is in your working directory
read.csv("AAPL.csv")
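
The sketch below writes a small data frame to a temporary .csv file and reads it back, so it can be tried without any of the course files. The data frame prices and its column names are made up for illustration:

```r
# Hypothetical sample data; values and column names are invented
prices <- data.frame(day = 1:3, close = c(150.1, 151.3, 149.8))
path <- tempfile(fileext = ".csv")          # temporary file, nothing is left behind
write.csv(prices, path, row.names = FALSE)  # row.names = FALSE avoids an extra index column
loaded <- read.csv(path)                    # assign the result to keep it as an object
class(loaded)                               # "data.frame"
dim(loaded)                                 # 3 rows, 2 columns
str(loaded)                                 # column types at a glance
unlink(path)                                # delete the temporary file
```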

As datasets grow, the commands above may perform poorly when the data is larger than 500MB. We recommend using fread instead. fread is a function in the data.table package.
Extension: both fread and read.csv are implemented in C, but fread memory-maps the file and then iterates through it using pointers, whereas read.csv reads the file into a buffer via a connection.

install.packages("data.table")
require(data.table) 
system.time(fread("AAPL.csv")) # use system.time to check the performance of a function
fread("AAPL.txt")

Likewise, fwrite is faster than write.csv, so use fwrite instead. Use ?fread and ?fwrite to see more info.

1.8 Re-directing output and file management

By default, output is shown in the R console. However, you can redirect the output to a file in your current working directory. See the following example:

print("hello") 
sink("out.file") 
print('hello') 
sink() 
file.show('out.file') # shows the file
file.remove('out.file') # removes out.file 
list.files() # no out.file any more
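
A variation of the same idea, sketched with a temporary file so that the working directory stays untouched (the variable names out and captured are chosen for illustration):

```r
out <- tempfile(fileext = ".txt")   # temporary file instead of the working directory
sink(out)                           # start diverting console output to the file
print("hello")
cat("mean of 1:10 is", mean(1:10), "\n")
sink()                              # restore output to the console
captured <- readLines(out)          # read the diverted output back as text lines
captured
unlink(out)                         # delete the temporary file
```

Every sink(file) call should be paired with a plain sink(), otherwise later output keeps going to the file.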

Part2: Financial data analysis

Download the financial data of Lecture 1 from IVLE, then analyse it (mean, variance, tests, plots, etc.) as shown in the lecture, using the R commands below.

#2.1 load data
require(data.table)
rate <- fread("EURUSD.csv")
#2.2 calculation
mean(rate$rate)
var(rate$rate)
var(rate$return[-1]) # omit NA in first row
hist(rate$rate)
hist(rate$return)
summary(rate)
fBasics::basicStats(rate$rate)  ## need to install.packages("fBasics")
#2.3 plot
require(ggplot2)
qplot(rate$time, rate$rate, data = rate, 
      colour = I("darkblue"), 
      xlab = "time",
      ylab = "EU/USD Rate",
      geom = "line")
qplot(rate$time, rate$return, data = rate, 
      colour = I("darkred"), 
      xlab = "time",
      ylab = "EU/USD return",
      geom = "line")
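
How the return column of EURUSD.csv was computed is not stated here. As a sketch, assuming it holds simple period-on-period returns, such a column could be built from a price series as follows (the prices are invented for illustration and are not the EURUSD data):

```r
# Hypothetical prices, invented for illustration
price <- c(1.10, 1.12, 1.11, 1.15)
simple_ret <- c(NA, diff(price) / head(price, -1))  # first entry has no previous price
log_ret <- c(NA, diff(log(price)))                  # log returns, a common alternative
simple_ret
mean(simple_ret, na.rm = TRUE)                      # na.rm = TRUE skips the leading NA
var(simple_ret[-1])                                 # same idea as dropping the first row
all.equal(log_ret[-1], log(1 + simple_ret[-1]))     # the two definitions agree
```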

# for digital asset
require(coinmarketcapr) ## install.packages("coinmarketcapr")

require(treemap)        ## install.packages("treemap")

plot_top_5_currencies()

market_today <- get_marketcap_ticker_all()
head(market_today[,1:8])

##             id         name symbol rank     price_usd  price_btc
## 1      bitcoin      Bitcoin    BTC    1 4068.53744485        1.0
## 2     ethereum     Ethereum    ETH    2 139.334327768 0.03427599
## 3       ripple          XRP    XRP    3  0.3094146194 0.00007612
## 4          eos          EOS    EOS    4  4.2554984289 0.00104684
## 5     litecoin     Litecoin    LTC    5 60.9539009492 0.01499455
## 6 bitcoin-cash Bitcoin Cash    BCH    6 169.678170527 0.04174052
##   X24h_volume_usd market_cap_usd
## 1   9992201807.77  71670997597.0
## 2   4499761131.45  14690390321.0
## 3   759836701.765  12904620810.0
## 4   2190870174.15   3856524674.0
## 5   1867734118.86   3723986898.0
## 6   562465066.029   3003023649.0

df1 <- na.omit(market_today[,c('id','market_cap_usd')])
df1$market_cap_usd <- as.numeric(df1$market_cap_usd)
df1$formatted_market_cap <-  paste0(df1$id,'\n','$',format(df1$market_cap_usd,big.mark = ',',scientific = F, trim = T))
treemap(df1, index = 'formatted_market_cap', vSize = 'market_cap_usd', title = 'Cryptocurrency Market Cap', fontsize.labels=c(12, 8), palette='RdYlGn')

Lecture1: An introduction to R

Zhou Chao