Logistic Regression and How to Interpret It
Posted by admin on Friday Jan 21, 2011 Under Statistics
When to use Simple Logistic Regression
Logistic regression is used when the response variable Yi is binary, taking the value 0 or 1.
Meaning of the Response Function for a Binary Response Variable
Consider the simple linear model
Yi = beta_0 + beta_1 * Xi + ei
Treating Yi as a Bernoulli random variable,
P(Yi = 1) = pi (probability of success)
P(Yi = 0) = 1 - pi (probability of failure)
E(Yi) = 1(pi) + 0(1 - pi) = pi, which equals P(Yi = 1)
Therefore, the expected value of Yi is the same as the probability that Yi equals 1 (the probability of success). Assuming E(ei) = 0, the response function beta_0 + beta_1 * Xi is that probability at Xi.
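As a quick numerical check of E(Yi) = pi, here is a small R sketch that simulates Bernoulli draws. The value 0.3 for pi, the seed, and the sample size are arbitrary choices for illustration, not values from this post.

```r
# Simulate Bernoulli(p) outcomes and compare the sample mean with p.
# p_success = 0.3 is an arbitrary illustrative value (an assumption).
set.seed(1)
p_success <- 0.3
y <- rbinom(100000, size = 1, prob = p_success)

# Exact expectation from the definition: 1 * p + 0 * (1 - p)
expected_y <- 1 * p_success + 0 * (1 - p_success)

# The sample mean (the proportion of 1s) should be close to the expectation
sample_mean <- mean(y)
print(c(expected = expected_y, simulated = sample_mean))
```

The sample mean of a 0/1 variable is just the proportion of successes, which is why it estimates the probability of success.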
Problems with fitting this linear model to a binary response variable are:
1. The error term can take only two values at each Xi
2. The error variance depends on X, so it is not constant
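Both problems can be seen by working through a linear model for a 0/1 response by hand. The sketch below uses made-up values for b0, b1 and x (assumptions chosen only so that the implied probabilities stay between 0 and 1).

```r
# Hypothetical linear probability model: E(Y | x) = b0 + b1 * x
# (b0, b1 and x are made-up values chosen so that 0 < p < 1).
b0 <- 0.1
b1 <- 0.15
x  <- c(1, 2, 3, 4, 5)
p  <- b0 + b1 * x

# 1. For each x the error e = Y - p can take only two values:
#    1 - p when Y = 1, and -p when Y = 0.
err_if_y1 <- 1 - p
err_if_y0 <- -p

# 2. The error variance is p * (1 - p), so it changes with x.
err_var <- p * (1 - p)

print(data.frame(x, p, err_if_y1, err_if_y0, err_var))
```

Because p changes with x, so does p * (1 - p), which is why the constant-variance assumption of ordinary regression fails here.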
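How to run logistic regression in R
# load the data (tab-separated file with a header row)
dat1 <- read.table("dat1.txt", sep = "\t", header = TRUE)
# fit a logistic regression of result on gender
test.logr <- glm(result ~ gender, family = binomial(link = "logit"), data = dat1)
summary(test.logr)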
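Since dat1.txt is not included with this post, here is a self-contained sketch that simulates a stand-in data frame with the same column names (result and gender) and fits the same kind of model. The gender labels "F" and "M", the sample size, and the success probabilities 0.3 and 0.5 are all assumptions made for illustration.

```r
# Simulate a stand-in for dat1 (the real dat1.txt is not available here).
# The column names match the post; the labels and probabilities are assumptions.
set.seed(42)
n <- 500
gender <- factor(sample(c("F", "M"), n, replace = TRUE))
p_true <- ifelse(gender == "M", 0.5, 0.3)
result <- rbinom(n, size = 1, prob = p_true)
sim_dat <- data.frame(result, gender)

# Fit the logistic regression; "F" is the baseline level of gender
sim_fit <- glm(result ~ gender, family = binomial(link = "logit"), data = sim_dat)

# Coefficients are on the log-odds scale; exponentiate to get odds (ratios)
print(summary(sim_fit)$coefficients)
print(exp(coef(sim_fit)))
```

With a single two-level predictor, the intercept is the log odds in the baseline group and the other coefficient is the log odds ratio between the two groups.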
Let Yi = 1 denote success and Yi = 0 denote failure, and
let the probability of success, P(Yi = 1), be 0.2 and the probability of failure, P(Yi = 0), be 0.8.
The odds of success are p/(1-p) = 0.2/0.8 = 0.25, that is, 1 to 4.
The logit is simply the log of the odds,
log(p/(1-p)). It is a monotonic transformation, and it eases the problem of restricted range: p is confined to the interval from 0 to 1, while the logit can take any real value.
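The odds and logit calculation can be reproduced in a few lines of R. This is only a sketch of the arithmetic: p = 0.2 comes from the example, while the grid of probabilities at the end is an arbitrary choice to show the range of the logit. qlogis() is base R's logit function.

```r
# Probability of success from the example
p <- 0.2

# Odds of success: p / (1 - p)
odds <- p / (1 - p)

# Logit = natural log of the odds; qlogis() computes the same quantity
logit_manual  <- log(odds)
logit_builtin <- qlogis(p)

print(c(odds = odds, logit = logit_manual))

# The logit maps probabilities in (0, 1) onto the whole real line,
# and it is monotonic: a larger p always gives a larger logit.
p_grid <- c(0.01, 0.2, 0.5, 0.8, 0.99)
print(round(qlogis(p_grid), 3))
```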
So what does the logit model look like?
logit(p) = log(p/(1-p)) = b0 + b1X1 + ... + bkXk
p = exp(b0 + b1X1 + ... + bkXk)/(1 + exp(b0 + b1X1 + ... + bkXk))
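To see that the two formulas are inverses of each other, here is a short sketch with one predictor. The coefficients b0 = -1.5, b1 = 0.8 and the value x1 = 2 are hypothetical numbers picked for illustration. plogis() is base R's inverse logit.

```r
# Hypothetical coefficients and predictor value (assumptions for illustration)
b0 <- -1.5
b1 <- 0.8
x1 <- 2

# Linear predictor on the log-odds scale
eta <- b0 + b1 * x1

# Inverse logit written out, and the built-in equivalent plogis()
p_manual  <- exp(eta) / (1 + exp(eta))
p_builtin <- plogis(eta)

# Applying the logit to p recovers the linear predictor
eta_back <- log(p_manual / (1 - p_manual))

print(c(eta = eta, p = p_manual, eta_back = eta_back))
```

Whatever value the linear predictor takes, the resulting p always lies strictly between 0 and 1.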
How to Interpret Coefficients?
logit(p) = b0 + b1(school),
where school is coded public = 1 and private = 0 (baseline),
and the response is coded success = 1, failure = 0, so that p is the probability of success.
b0 is the log odds of success for private schools, since private is coded 0 (the baseline).
b1 is the log of the odds ratio of public to private, so exp(b1) is the odds ratio itself.
Let the coefficients be b0 = -1.23 and b1 = log(1.325), which is approximately 0.2814.
Exponentiating b1 gives the odds ratio, exp(b1) = 1.325, which can be interpreted as:
The odds of success for a public school are about 33% higher than the odds for a private school.
To check, compute the odds of success for public schools and for private schools, take their ratio (1.325), and log it; you will get the b1 value back.
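The check described above can be sketched in R. It uses the intercept b0 = -1.23 and the odds ratio 1.325 quoted in the text, and takes the slope to be log(1.325). The group names are kept generic (baseline for school = 0, comparison for school = 1), and all variable names are just illustrative.

```r
# Values quoted in the text: intercept b0 and the odds ratio 1.325
b0 <- -1.23
b1 <- log(1.325)   # slope on the log-odds scale implied by that odds ratio

# Odds of success in each group
odds_baseline   <- exp(b0)        # school = 0
odds_comparison <- exp(b0 + b1)   # school = 1

# The ratio of the two odds is exp(b1); its log is b1 again
odds_ratio <- odds_comparison / odds_baseline
pct_change <- (odds_ratio - 1) * 100

# Corresponding probabilities of success
p_baseline   <- plogis(b0)
p_comparison <- plogis(b0 + b1)

print(c(odds_ratio = odds_ratio, log_or = log(odds_ratio), pct_change = pct_change))
print(c(p_baseline = p_baseline, p_comparison = p_comparison))
```

Note that the percentage applies to the odds, not to the probability: the difference between the two probabilities is much smaller than a third.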
Multiple Logistic Regression Model
It is interpreted just like a simple logistic regression, except that each coefficient is interpreted assuming all other predictor variables are held constant.
From the coefficients you can compute odds ratios, which can be worded as follows:
The odds of a student being successful increase by xx percent with each additional year of tutoring (X1), for a given socioeconomic status and location.
The odds of a student being successful in area 1 are at most 7 times as great as for a student in area 2, where area 1 is coded as 1 and area 2 as 0.
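Here is a rough sketch of how such statements could be produced in R. The data are simulated, and the variable names (tutoring, ses, area1), the true coefficients, and the sample size are all assumptions for illustration. Reading "at most" as the upper limit of a confidence interval for the odds ratio is also an assumption about the wording, not something stated above.

```r
# Simulated data; variable names and true coefficients are assumptions
set.seed(7)
n <- 2000
tutoring <- sample(0:5, n, replace = TRUE)     # years of tutoring (X1)
ses      <- rnorm(n)                           # socioeconomic status score
area1    <- rbinom(n, size = 1, prob = 0.5)    # 1 = area 1, 0 = area 2
eta      <- -1 + 0.3 * tutoring + 0.5 * ses + 0.7 * area1
success  <- rbinom(n, size = 1, prob = plogis(eta))
sim_dat  <- data.frame(success, tutoring, ses, area1)

fit <- glm(success ~ tutoring + ses + area1, family = binomial, data = sim_dat)

# Odds ratios: each one holds the other predictors constant
odds_ratios  <- exp(coef(fit))
pct_per_year <- (odds_ratios["tutoring"] - 1) * 100

# Wald 95% intervals on the odds-ratio scale
or_ci <- exp(confint.default(fit))

print(round(cbind(OR = odds_ratios, or_ci), 3))
print(round(pct_per_year, 1))
```

The value of pct_per_year fills in the "xx percent" wording for the simulated data, and the upper limit in the area1 row of or_ci is the kind of number an "at most" statement could be based on.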
http://division.aomonline.org/rm/1997_forum_regression_models.html
