Useful R syntax

Posted by admin on Wednesday Oct 13, 2010 Under Statistics

Reading a table from a file selected in a file browser

dat <- read.table(file.choose()) # opens a file chooser and reads the selected file into dat

nrow(dat) # number of rows
head(dat) # shows the column names and the first few rows of dat
paste("hello", "world", sep="-") # "hello-world"

source("mylibrary.R") # runs the code in mylibrary.R, making its contents available

rep(NA, 5) # NA NA NA NA NA
rep(1:4, 2) # 1 2 3 4 1 2 3 4
rep(1:4, each=2) # 1 1 2 2 3 3 4 4

Validating the assumption of multivariate normality

Univariate diagnostic plots : Histogram and QQ plot

# Standardize the data and plot a histogram
mydata.st <- scale(mydata.dat)
hist(mydata.st, main="Histogram", xlab="X values")

# QQ plot
# pch=16 selects a filled circle as the plotting symbol
qqnorm(mydata.st, main="QQ plot", pch=16, col="navy")
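
As a self-contained illustration of the two univariate checks above, here is a minimal sketch that uses simulated data in place of mydata.dat. The object names sim.dat and sim.st are made up for the example, and the qqline() call is an addition that draws a reference line.

```r
set.seed(123)                          # make the simulated data reproducible
# Hypothetical stand-in for mydata.dat: 200 draws from N(10, 2^2) in one column
sim.dat <- matrix(rnorm(200, mean = 10, sd = 2), ncol = 1)
sim.st <- scale(sim.dat)               # center to mean 0 and scale to sd 1

hist(sim.st, main = "Histogram", xlab = "Standardized values")
qqnorm(sim.st, main = "QQ plot", pch = 16, col = "navy")
qqline(sim.st)                         # reference line through the quartiles
```

If the data are roughly normal, the histogram should look bell-shaped and the QQ plot points should lie close to the reference line.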

Chi-square plot
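
One common way to draw this plot is to compare the ordered squared Mahalanobis distances of the observations with chi-square quantiles. Under multivariate normality the points should fall near a straight line. The sketch below assumes that construction and uses simulated data, so the matrix X and the other names are hypothetical.

```r
set.seed(1)
# Hypothetical data: 100 observations on 3 independent standard normal variables
X <- matrix(rnorm(300), ncol = 3)
n <- nrow(X)
p <- ncol(X)

# Squared Mahalanobis distance of each row from the sample mean vector
d2 <- mahalanobis(X, center = colMeans(X), cov = cov(X))

# Chi-square quantiles with df equal to the number of variables
q <- qchisq(ppoints(n), df = p)

plot(q, sort(d2), pch = 16, col = "navy",
     xlab = "Chi-square quantiles", ylab = "Ordered squared distances",
     main = "Chi-square plot")
abline(0, 1)                           # points near this line support normality
```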

==========
Output multiple plots in one screen (page)

## c(2,3) sets the number of rows and columns
## number of rows = 2
## number of columns = 3
par(mfrow=c(2,3))
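
A short sketch of how the layout might be used in practice: six histograms of simulated data fill the 2 x 3 grid row by row. The simulated samples and the name old.par are assumptions made for the example.

```r
set.seed(1)
old.par <- par(mfrow = c(2, 3))   # 2 rows, 3 columns; par() returns the old settings
for (i in 1:6) {
  hist(rnorm(100), main = paste("Sample", i), xlab = "x")
}
par(old.par)                      # restore the previous layout
```

Saving and restoring the old settings keeps later plots from being squeezed into the same grid.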

Parameters for graphs
pch: plotting character, i.e., the symbol to use.
Integer values 0 to 25 select the built-in symbols; a single character can also be used.
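
To see which symbol each pch value produces, a quick chart helps. The sketch below plots the built-in symbols for pch values 0 through 25, as listed in R's documentation for points(), with the code printed under each one. The variable name codes is made up for the example.

```r
# Draw each built-in symbol (pch 0 to 25) with its code underneath
codes <- 0:25
plot(codes, rep(1, length(codes)), pch = codes, ylim = c(0, 2),
     xlab = "pch value", ylab = "", yaxt = "n",
     main = "Built-in plotting symbols")
text(codes, 0.5, labels = codes, cex = 0.7)
```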

============
Random variable generator in R
# Standard normal
# n: number of values to generate
rnorm(n)

# Chi-square
# n: number of values, df: degrees of freedom
rchisq(n, df)

# Cauchy
# n: number of values
rcauchy(n)
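
A minimal runnable sketch of the three generators above. The values of n and df and the seed are arbitrary choices for illustration; set.seed() is added so that the draws can be reproduced.

```r
set.seed(42)                 # fix the seed so the draws are reproducible
n <- 5                       # number of values to generate

z <- rnorm(n)                # standard normal (mean = 0, sd = 1 by default)
x <- rchisq(n, df = 3)       # chi-square with 3 degrees of freedom
y <- rcauchy(n)              # standard Cauchy (location = 0, scale = 1 by default)

# rnorm() also accepts a mean and standard deviation
w <- rnorm(n, mean = 10, sd = 2)
```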

Create a Matrix in R

        yes  no  maybe
apple     1   4      7
orange    2   5      8
banana    3   6      9

Evac <- matrix(c(1,2,3,4,5,6,7,8,9), 3, 3, dimnames=list(fruit=c("apple", "orange", "banana"), answer=c("yes", "no", "maybe")))

Perform Fisher's exact test in R
fisher.test(Evac)
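
The sketch below repeats the matrix construction so that it runs on its own, then shows how a cell can be looked up by its row and column names and how the p-value might be pulled out of the test result. The name res is made up for the example.

```r
# matrix() fills column by column, so 1:3 becomes the "yes" column
Evac <- matrix(1:9, 3, 3,
               dimnames = list(fruit = c("apple", "orange", "banana"),
                               answer = c("yes", "no", "maybe")))

Evac["orange", "maybe"]      # 8: look up a cell by row and column name

res <- fisher.test(Evac)     # returns an object of class "htest"
res$p.value                  # p-value of the test
```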

Manipulating data frame and data
When reading a large data set, it is often better to read it in pieces with scan() than to load the whole data set at once.
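
One way this might look, as a sketch only: open a connection to the file and call scan() repeatedly with nlines, so that each call picks up where the previous one stopped. The temporary file and the chunk size of 4 lines are assumptions made for the example.

```r
# Hypothetical input: a temporary file with the numbers 1 to 10, one per line
tmp <- tempfile(fileext = ".txt")
writeLines(as.character(1:10), tmp)

con <- file(tmp, open = "r")                    # open once, read in pieces
first  <- scan(con, nlines = 4, quiet = TRUE)   # 1 2 3 4
second <- scan(con, nlines = 4, quiet = TRUE)   # 5 6 7 8
close(con)
unlink(tmp)                                     # remove the temporary file
```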

Calling Linux commands from R can be a good way to save processing time.
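
As an illustration, system() can run a shell command and, with intern = TRUE, return its output to R. The sketch below counts the lines of a file with wc instead of reading the file into R. It assumes a Unix-like system where wc is available; the temporary file is made up for the example.

```r
# Hypothetical input: a temporary file with 10 lines
tmp <- tempfile(fileext = ".txt")
writeLines(as.character(1:10), tmp)

# Run "wc -l < file" in the shell and capture its output as a character string
out <- system(paste("wc -l <", shQuote(tmp)), intern = TRUE)
n.lines <- as.integer(trimws(out))   # convert the captured text to a number

unlink(tmp)                          # remove the temporary file
```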

grep
string function that returns the indices of the elements that match a pattern
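
A small sketch of grep() on a character vector. The vector fruits is made up for the example; value = TRUE and grepl() are shown as related options that return the matching strings or a logical vector instead of indices.

```r
fruits <- c("apple", "orange", "banana", "pineapple")

idx <- grep("apple", fruits)                  # 1 4
hits <- grep("apple", fruits, value = TRUE)   # "apple" "pineapple"
has.an <- grepl("an", fruits)                 # FALSE TRUE TRUE FALSE
```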

#print working directory path
getwd()

#set working directory path
setwd("C://...")

# installing packages (the package name must be a quoted string)
install.packages("package_name")

# list files and directories in the working directory
list.files()

# Lowercase and uppercase
toupper("hello") # "HELLO"
tolower("HELLO") # "hello"
