Useful R syntax
Posted by admin on Wednesday Oct 13, 2010 Under Statistics
Reading a table from a file selected in a file browser
dat <- read.table(file.choose()) # pick a file interactively and store the table in dat
nrow (dat) # number of rows
head (dat) # shows names and first few rows of dat
paste("hello", "world", sep="-") # "hello-world"
source("mylibrary.R") # runs the code in mylibrary.R (the file name must be quoted)
rep(NA, 5) # NA NA NA NA NA
rep(1:4, 2) # 1 2 3 4 1 2 3 4
rep(1:4, each=2) # 1 1 2 2 3 3 4 4
Validating the assumption of multivariate normal data
Univariate diagnostic plots : Histogram and QQ plot
Standardize the data and plot a histogram
mydata.st <- scale(mydata.dat)
hist(mydata.st, main="Histogram", xlab="X values")
#qq plot
## pch =16 (16 is a symbol for a filled circle)
qqnorm(mydata.st, main="QQ plot", pch=16, col="navy")
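The two diagnostic plots can also be drawn one variable at a time, with a reference line added to each QQ plot. This is only a sketch: the simulated mydata.dat and its column names a and b are stand-ins, not the original data.

```r
# Simulated stand-in for mydata.dat (assumption: two numeric columns)
set.seed(1)
mydata.dat <- data.frame(a = rnorm(50, mean = 10, sd = 2),
                         b = rnorm(50, mean = 5, sd = 1))

# scale() centers each column to mean 0 and rescales it to sd 1
mydata.st <- scale(mydata.dat)

# One QQ plot per variable, with the reference line from qqline()
for (v in colnames(mydata.st)) {
  qqnorm(mydata.st[, v], main = paste("QQ plot:", v), pch = 16, col = "navy")
  qqline(mydata.st[, v])
}
```

Points that stay close to the reference line are consistent with a normal distribution for that variable.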
Chi-square plot
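No code is given for this plot, so the following is one common way to build it, not necessarily the one intended here: plot the ordered squared Mahalanobis distances against chi-square quantiles. The matrix x is simulated and only stands in for real data.

```r
# Simulated stand-in data (assumption): 100 observations, 2 variables
set.seed(1)
x <- matrix(rnorm(200), ncol = 2)

# Squared Mahalanobis distance of each row from the sample mean
d2 <- mahalanobis(x, colMeans(x), cov(x))

# Chi-square quantiles with df = number of variables
n <- nrow(x)
p <- ncol(x)
q <- qchisq(ppoints(n), df = p)

# Points close to the line y = x support multivariate normality
plot(q, sort(d2), pch = 16, col = "navy",
     main = "Chi-square plot",
     xlab = "Chi-square quantiles", ylab = "Ordered squared distances")
abline(0, 1)
```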
==========
Output multiple plots in one screen (page)
## c(2,3) determines no of rows and columns
## no of row = 2
## no of columns = 3
par(mfrow=c(2,3))
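A short sketch of the layout in use: six histograms of simulated values fill the 2 by 3 grid row by row. Saving the value returned by par() and restoring it afterwards is a convention added here, not part of the original note.

```r
old <- par(mfrow = c(2, 3))   # 2 rows, 3 columns; par() returns the old setting
for (i in 1:6) {
  hist(rnorm(100), main = paste("Sample", i), xlab = "X values")
}
par(old)                      # restore the previous layout
```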
Parameters for graphs
pch : plotting character, i.e., the symbol used to draw points
R has 26 built-in symbols, numbered 0 to 25 (see ?points).
============
Random variable generator in R
# Standard normal
# n: number of values you want to generate
rnorm(n)
# Chi-square
# n: no of values, df: degrees of freedom
rchisq(n, df)
# Cauchy
# n: no of values
rcauchy(n)
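The three generators can be tried together as below. The values n = 5 and df = 3 are arbitrary choices for illustration, and set.seed() is added only so that the draws are reproducible.

```r
set.seed(42)                 # make the draws reproducible
n <- 5                       # number of values to generate

z <- rnorm(n)                # standard normal: mean 0, sd 1
x <- rchisq(n, df = 3)       # chi-square with 3 degrees of freedom
y <- rcauchy(n)              # standard Cauchy: location 0, scale 1

length(z)                    # 5
all(x >= 0)                  # TRUE, chi-square values are never negative
```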
Create a Matrix in R
        yes  no  maybe
apple     1   4      7
orange    2   5      8
banana    3   6      9
Evac <- matrix(c(1,2,3,4,5,6,7,8,9), 3, 3, dimnames=list(fruit=c("apple", "orange", "banana"), answer=c("yes", "no", "maybe")))
Perform Fisher's Exact Test in R
fisher.test(Evac)
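The test result can be stored and inspected instead of only printed. In this sketch the matrix is rebuilt so that the block runs on its own; the name res is introduced here for illustration.

```r
# Same 3 x 3 table as above; matrix() fills column by column
Evac <- matrix(1:9, 3, 3,
               dimnames = list(fruit = c("apple", "orange", "banana"),
                               answer = c("yes", "no", "maybe")))

res <- fisher.test(Evac)   # returns an object of class "htest"
res$p.value                # p-value of the exact test
Evac["apple", "maybe"]     # 7, cells can be looked up by name
```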
Manipulating data frame and data
When reading a large data set, it is often better to read it with scan(), if needed in pieces, than to load the whole data set at once.
Calling Linux commands from R (for example with system()) can also save processing time.
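Both ideas can be sketched on a small temporary file. The file and its contents are made up for illustration, and the wc call assumes a Unix-like system, so it is skipped elsewhere.

```r
# Write a small example file so the sketch is self-contained
tmp <- tempfile(fileext = ".txt")
writeLines(as.character(1:10), tmp)

# Read only lines 4 to 6: skip the first 3 lines, then read 3 lines
chunk <- scan(tmp, what = numeric(), skip = 3, nlines = 3, quiet = TRUE)
chunk   # 4 5 6

# Count lines with the Unix tool wc instead of reading the file into R
if (.Platform$OS.type == "unix") {
  system(paste("wc -l", shQuote(tmp)))
}

unlink(tmp)   # remove the temporary file
```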
grep : string function that returns the indices of the elements that match a pattern
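A small sketch of grep in use; the vector fruits is made up for illustration. The related calls grep(..., value = TRUE) and grepl() return the matching elements and a logical vector instead of indices.

```r
fruits <- c("apple", "orange", "banana", "pineapple")

grep("apple", fruits)                 # 1 4 (indices of matching elements)
grep("apple", fruits, value = TRUE)   # "apple" "pineapple"
grepl("an", fruits)                   # FALSE TRUE TRUE FALSE
```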
#print working directory path
getwd()
#set working directory path
setwd("C://...")
# installing packages (the package name must be a quoted string)
install.packages("package_name")
# print files and dir in the working dir
list.files()
# Lower and uppercase
toupper("hello") # "HELLO"
tolower("HELLO") # "hello"