r

Plotting density in R

How to plot a density: plot(density(DATA)). Rainbow color in R: if you want a plot to use a rainbow color range, you can use the rainbow function: rcol = rainbow(length(YOURDATA)); plot(DATAX, DATAY, type="l"); points(DATAX, DATAY, pch=16, col=rcol). Simple plot: how to change the size of text in a plot? Use the cex.[attribute] arguments; examples are below: main titles [...]
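As a minimal sketch of the density and rainbow steps above, the following uses simulated data (the vector `x` is an assumption made for illustration, not data from the post). It also shows `cex.main`, `cex.lab`, and `cex.axis`, the standard graphical parameters for title, axis-label, and tick-label text size.

```r
# Simulated data, assumed purely for illustration
set.seed(1)
x <- rnorm(200)

# Kernel density estimate
d <- density(x)

# Plot the density curve; the cex.* arguments scale the text sizes
plot(d, main = "Density of x",
     cex.main = 1.5,   # main title
     cex.lab  = 1.2,   # axis labels
     cex.axis = 0.9)   # tick labels

# One rainbow colour per evaluation point of the density curve
rcol <- rainbow(length(d$x))
points(d$x, d$y, pch = 16, col = rcol, cex = 0.6)
```

The colours run along the x-axis because `rainbow` returns them in hue order and `d$x` is already sorted.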

Manipulating data frame/table

To eliminate rows that meet a condition: # eliminate rows where Age is empty dat
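Since the excerpt is cut off, here is one possible way to drop such rows, assuming "empty" means `NA`. The data frame and its column names are hypothetical.

```r
# Hypothetical data frame; Age is missing (NA) in two rows
dat <- data.frame(
  Name = c("a", "b", "c", "d"),
  Age  = c(23, NA, 31, NA)
)

# Keep only the rows where Age is not missing
dat_clean <- dat[!is.na(dat$Age), ]
dat_clean
```

If empty values were stored as empty strings instead, the condition would be `dat$Age != ""` rather than `!is.na(dat$Age)`.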

R error: FEXACT error 7

With a small sample size, Fisher's exact test is preferable to the chi-square test: fisher.test(counts, simulate.p.value=TRUE). If the table has too many rows or columns, you may get an error saying: FEXACT error 7. LDSTP is too small for this problem. Try increasing the size of [...]
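A small sketch of the `fisher.test` call above on a table larger than 2 by 2. The counts are made up for illustration. With `simulate.p.value = TRUE` the p-value is estimated by Monte Carlo simulation rather than by the exact network algorithm, and `B` sets the number of replicates.

```r
# Hypothetical 3 x 4 table of counts (filled column by column)
counts <- matrix(c(3, 1, 2,
                   4, 0, 5,
                   2, 3, 1,
                   1, 4, 2), nrow = 3)

# Fix the seed so the simulated p-value is reproducible
set.seed(42)
res <- fisher.test(counts, simulate.p.value = TRUE, B = 10000)
res$p.value
```

Because the p-value is simulated, it changes slightly from run to run unless the seed is fixed.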

Contingency Table for Categorical data and R

How to create a contingency table from categorical data in R. Example: three categorical variables x1, x2, x3 are measured on wild cats, where x1 = gender (male, female), x2 = age (young, kitten, adult), and x3 = test result (positive = 1, negative = 0). The R table function will generate two tables: 2by2 table for each [...]
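The example can be sketched with simulated data. The data frame `cats` and its values are assumptions for illustration, and this sketch cross-tabulates all three variables at once, which may differ from the exact call the post uses.

```r
# Simulated categorical data; values are made up for illustration
set.seed(7)
n <- 40
cats <- data.frame(
  x1 = factor(sample(c("male", "female"), n, replace = TRUE),
              levels = c("male", "female")),
  x2 = factor(sample(c("young", "kitten", "adult"), n, replace = TRUE),
              levels = c("young", "kitten", "adult")),
  x3 = factor(sample(c(0, 1), n, replace = TRUE), levels = c(0, 1))
)

# Three-way table: one gender-by-age slice per test result
tab <- table(gender = cats$x1, age = cats$x2, result = cats$x3)
tab

# Flattened view of the same counts
ftable(tab)

# Two-way table of gender by test result only
table(gender = cats$x1, result = cats$x3)
```

Setting the factor levels explicitly keeps a category in the table even when it has a zero count.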

Validating assumption of multivariate normal data

Univariate and multivariate diagnostics. Univariate diagnostics (histogram and QQ plot). Plot a histogram: hist(mydata.st, main="histogram", xlab="X values"). Plot a QQ plot: ## pch = 16 (16 is the symbol for a filled circle) qqnorm(mydata.st, main="QQ plot", pch=16, col="navy"). Multivariate diagnostics: chi-square plot. We will graph distance vs. chi-square # function to compute distance between X and X.bar [...]
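A rough sketch of both diagnostics on simulated data. The matrix `X` is an assumption, and the built-in `mahalanobis` function stands in for the post's own distance function, which is cut off in the excerpt.

```r
# Simulated trivariate normal data, assumed for illustration
set.seed(3)
n <- 100
X <- cbind(rnorm(n), rnorm(n), rnorm(n))
p <- ncol(X)

# Univariate checks on the first variable
hist(X[, 1], main = "histogram", xlab = "X values")
qqnorm(X[, 1], main = "QQ plot", pch = 16, col = "navy")
qqline(X[, 1])

# Squared Mahalanobis distance of each row from the sample mean
d2 <- mahalanobis(X, center = colMeans(X), cov = cov(X))

# Chi-square plot: ordered distances against chi-square quantiles
chisq_q <- qchisq(ppoints(n), df = p)
plot(chisq_q, sort(d2), pch = 16, col = "navy",
     xlab = "Chi-square quantiles", ylab = "Ordered squared distances",
     main = "Chi-square plot")
abline(0, 1)  # points near this line are consistent with multivariate normality
```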

k nearest neighbors classification (knn)

A nonparametric classification method. The idea behind knn is to measure the distance between a new value (x0) and each of the neighboring points, take the k shortest distances, and classify the new value into the group that wins the majority vote. Steps: 1. Choose k as an odd integer 2. Measure the distance between x0 [...]
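The idea can be sketched in base R as follows. The function `knn_classify`, the training points, and the choice of Euclidean distance are all assumptions made for illustration; the post's remaining steps are cut off.

```r
# Hypothetical helper: classify x0 by majority vote among its k nearest points
knn_classify <- function(train, labels, x0, k = 3) {
  # Euclidean distance from x0 to every training row
  d <- sqrt(rowSums(sweep(train, 2, x0)^2))
  # Indices of the k smallest distances
  nearest <- order(d)[seq_len(k)]
  # Count the labels among those neighbours and return the most frequent
  votes <- table(labels[nearest])
  names(votes)[which.max(votes)]
}

# Toy training set: group A near (1, 1), group B near (6, 6)
train <- rbind(c(1, 1), c(1, 2), c(2, 1),
               c(6, 6), c(7, 6), c(6, 7))
labels <- c("A", "A", "A", "B", "B", "B")

knn_classify(train, labels, x0 = c(2, 2), k = 3)  # "A"
knn_classify(train, labels, x0 = c(6, 5), k = 3)  # "B"
```

With two groups, an odd k avoids tied votes, which is why the first step asks for an odd integer.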

Principal Component Analysis (PCA)

Performing a PCA after standardizing the variables, and obtaining estimates of the principal components for the standardized variables. Reading in the athletes' data: ath.dat <- read.table("athelete.txt"). Standardizing the data: ath.dat.std <- scale(ath.dat). Correlation matrix (since the covariance of standardized data is the correlation): R = cov(ath.dat.std). Eigenvalues: lambda = eigen(R)$values. The eigenvalues are read to assess which [...]
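The same steps can be sketched on simulated data, since athelete.txt is not available here. The data frame `ath` and its column names are hypothetical, and `prcomp` is included only as a cross-check against the manual eigen decomposition.

```r
# Simulated stand-in for the athletes' data; columns are hypothetical
set.seed(11)
ath <- data.frame(run = rnorm(30), jump = rnorm(30), throw = rnorm(30))

# Standardize each variable to mean 0 and standard deviation 1
ath.std <- scale(ath)

# Covariance of standardized data equals the correlation matrix
R <- cov(ath.std)

# Eigen decomposition: values are component variances, vectors are loadings
e <- eigen(R)
lambda <- e$values

# Proportion of total variance explained by each component
prop_var <- lambda / sum(lambda)
prop_var

# Principal component scores for the standardized data
scores <- ath.std %*% e$vectors

# Cross-check with the built-in routine
pc <- prcomp(ath, scale. = TRUE)
pc$sdev^2  # should match lambda
```

The eigenvalues sum to the number of variables because each standardized variable contributes a variance of 1.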