Logistic Regression with Iterative (re-)Weighted Least Squares
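Logistic regression (LR) models are generalized linear models that are
often used for binary responses, where each observation
$y_i \in \{0, 1\}$ is either zero or one. The logit link is used as the
canonical link to ensure that the modelled probabilities $\pi$ lie within
$(0, 1)$. For a specific observation $i \in \{1, \dots, N\}$, $\pi_i$ is
the probability that we will observe $y_i = 1$.
The model can be written as follows:
- $\log\left(\frac{\pi_i}{1 - \pi_i}\right) = \mathbf{x}_i^\top \beta$
or
$\pi_i = \frac{\exp(\mathbf{x}_i^\top \beta)}{1 + \exp(\mathbf{x}_i^\top \beta)}$.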
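$\beta$ is a vector of length $P$ of regression coefficients and
$\mathbf{X}$ a matrix of dimension $N \times P$ containing the covariates
(one row $\mathbf{x}_i^\top$ per observation). Given a set of observations
$\mathbf{y}$ and the corresponding covariates $\mathbf{X}$, the regression
coefficients $\beta$ can be estimated using, e.g., maximum likelihood. The
log-likelihood of the LR model can be written as follows:
- $\ell(\beta) = \sum_{i=1}^{N} \left( y_i \log(\pi_i) + (1 - y_i) \log(1 - \pi_i) \right)$.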
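As a minimal sketch of the model and its log-likelihood, the following standard-library Python snippet computes $\pi_i$ from the inverse logit and sums the log-likelihood terms. The function names `predict_proba` and `log_likelihood` and the toy data are illustrative assumptions, not part of the package discussed here.

```python
import math

def predict_proba(X, beta):
    """Inverse logit: pi_i = exp(x_i' beta) / (1 + exp(x_i' beta)) per row of X."""
    probs = []
    for row in X:
        eta = sum(x * b for x, b in zip(row, beta))  # linear predictor x_i' beta
        if eta >= 0:
            probs.append(1.0 / (1.0 + math.exp(-eta)))
        else:
            # Equivalent form that avoids overflow for very negative eta.
            e = math.exp(eta)
            probs.append(e / (1.0 + e))
    return probs

def log_likelihood(y, probs):
    """Sum of y_i * log(pi_i) + (1 - y_i) * log(1 - pi_i)."""
    return sum(yi * math.log(p) + (1 - yi) * math.log(1.0 - p)
               for yi, p in zip(y, probs))

# Hypothetical toy data: intercept column plus one covariate.
X = [[1.0, -1.0], [1.0, 0.0], [1.0, 2.0]]
y = [0, 1, 1]
beta = [0.0, 0.0]             # all coefficients zero

pi = predict_proba(X, beta)   # every probability equals 0.5 when beta = 0
ll = log_likelihood(y, pi)    # equals 3 * log(0.5) for this data
```

With all coefficients set to zero every observation gets probability 0.5, which is why zero is a convenient starting value for an iterative solver.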
Fitting Logistic Regression Models
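The parameters of binary LR models can be estimated using an
iterative (re-)weighted least squares (IWLS) solver. The regression
coefficients $\beta$ are iteratively updated using a Newton-Raphson
update procedure. A single Newton update (one single iteration) is:
$$\beta^{new} = \beta^{old} - \left( \frac{\partial^2 \ell(\beta)}{\partial \beta \, \partial \beta^\top} \right)^{-1} \frac{\partial \ell(\beta)}{\partial \beta},$$
where the derivatives are evaluated at $\beta^{old}$ from the previous
iteration. The first-order and second-order derivatives of the
log-likelihood are:
$$\frac{\partial \ell(\beta)}{\partial \beta} = \sum_{i=1}^{N} \mathbf{x}_i (y_i - \pi_i), \qquad \frac{\partial^2 \ell(\beta)}{\partial \beta \, \partial \beta^\top} = - \sum_{i=1}^{N} \mathbf{x}_i \mathbf{x}_i^\top \pi_i (1 - \pi_i).$$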
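The same can be written in matrix notation:
- $\frac{\partial \ell(\beta)}{\partial \beta} = \mathbf{X}^\top (\mathbf{y} - \pi)$,
$\frac{\partial^2 \ell(\beta)}{\partial \beta \, \partial \beta^\top} = - \mathbf{X}^\top \mathbf{W} \mathbf{X}$,
where $\mathbf{W}$ is an $N \times N$ diagonal matrix of weights with the
diagonal elements $\pi_i (1 - \pi_i)$ evaluated at $\beta^{old}$.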
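Thus, the Newton step in matrix notation is given as:
$$\beta^{new} = \beta^{old} + (\mathbf{X}^\top \mathbf{W} \mathbf{X})^{-1} \mathbf{X}^\top (\mathbf{y} - \pi).$$
With the adjusted response
$\mathbf{z} = \mathbf{X} \beta^{old} + \mathbf{W}^{-1} (\mathbf{y} - \pi)$
(using $\beta^{old} = (\mathbf{X}^\top \mathbf{W} \mathbf{X})^{-1} \mathbf{X}^\top \mathbf{W} \mathbf{X} \beta^{old}$)
we can write the Newton step as:
- $\beta^{new} = (\mathbf{X}^\top \mathbf{W} \mathbf{X})^{-1} \mathbf{X}^\top \mathbf{W} \mathbf{z}$.
With $\tilde{\mathbf{X}} = \mathbf{W}^{1/2} \mathbf{X}$
and $\tilde{\mathbf{z}} = \mathbf{W}^{1/2} \mathbf{z}$
the equation can be rewritten as:
$$\beta^{new} = (\tilde{\mathbf{X}}^\top \tilde{\mathbf{X}})^{-1} \tilde{\mathbf{X}}^\top \tilde{\mathbf{z}},$$
which has the same form as the ordinary least squares estimator.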
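The equivalence between the Newton step and the weighted least squares form can be checked numerically. The sketch below uses only the Python standard library; the helper names (`solve`, `working_quantities`, `newton_step`, `wls_step`), the toy data, and the starting coefficients are assumptions made for illustration. The small Gaussian-elimination solver has no singularity check and is not meant for production use.

```python
import math

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(A[i]) + [b[i]] for i in range(n)]  # augmented matrix
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def working_quantities(X, beta):
    """Linear predictor, probabilities and weights pi_i * (1 - pi_i)."""
    eta = [sum(x * b for x, b in zip(row, beta)) for row in X]
    pi = [1.0 / (1.0 + math.exp(-e)) for e in eta]
    w = [p * (1.0 - p) for p in pi]
    return eta, pi, w

def xtwx(X, w):
    """P x P matrix X' W X for diagonal weights w."""
    p = len(X[0])
    return [[sum(r[a] * wi * r[b] for r, wi in zip(X, w)) for b in range(p)]
            for a in range(p)]

def newton_step(X, y, beta_old):
    """beta_old + (X' W X)^{-1} X' (y - pi)."""
    _, pi, w = working_quantities(X, beta_old)
    p = len(beta_old)
    score = [sum(r[a] * (yi - pi_i) for r, yi, pi_i in zip(X, y, pi))
             for a in range(p)]
    delta = solve(xtwx(X, w), score)
    return [b + d for b, d in zip(beta_old, delta)]

def wls_step(X, y, beta_old):
    """(X' W X)^{-1} X' W z with adjusted response z."""
    eta, pi, w = working_quantities(X, beta_old)
    p = len(beta_old)
    z = [e + (yi - pi_i) / wi for e, yi, pi_i, wi in zip(eta, y, pi, w)]
    xtwz = [sum(r[a] * wi * zi for r, wi, zi in zip(X, w, z)) for a in range(p)]
    return solve(xtwx(X, w), xtwz)

# Hypothetical toy data: intercept plus one covariate.
X = [[1.0, -2.0], [1.0, -1.0], [1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]
y = [0, 0, 1, 0, 1]
beta_old = [0.1, 0.3]

beta_newton = newton_step(X, y, beta_old)
beta_wls = wls_step(X, y, beta_old)
```

Both functions return the same coefficients up to floating-point error, which is the identity that lets a least squares routine be reused inside the iteration.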
IWLS Algorithm
Given the equations above the iterative algorithm can be written as
follows:
Initialization
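- Initialize $\beta^{(0)} = 0$ (set all coefficients to zero)
- Initialize $\pi^{(0)} = 0.5$ (with $\beta = 0$, $\pi_i = \exp(0) / (1 + \exp(0)) = 0.5$ for all $i$)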
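Update step for iteration $j = 1, 2, \dots$:
- Update weights: $w_i^{(j)} = \pi_i^{(j-1)} (1 - \pi_i^{(j-1)})$, the diagonal elements of $\mathbf{W}^{(j)}$
- Update weighted adjusted response: $\tilde{\mathbf{z}}^{(j)} = (\mathbf{W}^{(j)})^{1/2} \mathbf{z}^{(j)}$ with $\mathbf{z}^{(j)} = \mathbf{X} \beta^{(j-1)} + (\mathbf{W}^{(j)})^{-1} (\mathbf{y} - \pi^{(j-1)})$
- Update coefficients: $\beta^{(j)} = (\tilde{\mathbf{X}}^\top \tilde{\mathbf{X}})^{-1} \tilde{\mathbf{X}}^\top \tilde{\mathbf{z}}^{(j)}$ with $\tilde{\mathbf{X}} = (\mathbf{W}^{(j)})^{1/2} \mathbf{X}$
- Calculate likelihood: $\ell^{(j)} = \ell(\beta^{(j)})$, using the updated probabilities $\pi^{(j)}$.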
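- If $j = 1$, there is no previous likelihood to compare with: proceed
with the next update step.
- For $j > 1$: if $\ell^{(j)} - \ell^{(j-1)} < \mathit{tol}$, where
$\mathit{tol}$ is a small positive tolerance, the likelihood could not be
improved in this iteration (converged or stuck): stop the IWLS algorithm
and return $\beta^{(j-1)}$.
- If $j = \mathit{maxit}$, the maximum number of iterations has been
reached: stop the algorithm and return $\beta^{(j)}$.
- Else increase $j$ by one and repeat the update step until one of the
stopping criteria is reached.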
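The steps above can be sketched in a few lines of standard-library Python. This is an illustrative sketch, not the implementation of the `iwls_logit` function: the name `iwls_logit_sketch`, the defaults `maxit=100` and `tol=1e-8`, and the toy data are assumptions. It solves the weighted normal equations directly instead of forming the square-root-weighted quantities, and it does not guard against weights of zero or perfectly separable data.

```python
import math

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(A[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def sigmoid(eta):
    return 1.0 / (1.0 + math.exp(-eta))

def dot(row, beta):
    return sum(x * b for x, b in zip(row, beta))

def loglik(y, pi):
    return sum(yi * math.log(p) + (1 - yi) * math.log(1.0 - p)
               for yi, p in zip(y, pi))

def iwls_logit_sketch(X, y, maxit=100, tol=1e-8):
    """Return (coefficients, number of iterations performed)."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p           # initialization: all coefficients zero
    pi = [0.5] * n             # probabilities implied by beta = 0
    ll_prev = None
    for j in range(1, maxit + 1):
        # Update weights and adjusted response from the previous iteration.
        w = [q * (1.0 - q) for q in pi]
        z = [dot(r, beta) + (yi - q) / wi
             for r, yi, q, wi in zip(X, y, pi, w)]
        # Update coefficients: solve (X' W X) beta = X' W z.
        A = [[sum(r[a] * wi * r[b] for r, wi in zip(X, w)) for b in range(p)]
             for a in range(p)]
        rhs = [sum(r[a] * wi * zi for r, wi, zi in zip(X, w, z))
               for a in range(p)]
        beta_new = solve(A, rhs)
        pi_new = [sigmoid(dot(r, beta_new)) for r in X]
        ll = loglik(y, pi_new)
        # For j > 1: stop if the likelihood did not improve by at least tol
        # and return the coefficients of the previous iteration.
        if ll_prev is not None and ll - ll_prev < tol:
            return beta, j
        beta, pi, ll_prev = beta_new, pi_new, ll
    return beta, maxit         # maximum number of iterations reached

# Hypothetical toy data (not separable): intercept plus one covariate.
X = [[1.0, -2.0], [1.0, -1.0], [1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y = [0, 0, 1, 0, 1, 1]
beta_hat, n_iter = iwls_logit_sketch(X, y)
```

At the returned coefficients the score $\mathbf{X}^\top (\mathbf{y} - \pi)$ should be close to zero, which is a simple way to check that the iteration stopped at the maximum of the log-likelihood rather than at the iteration limit.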
The manual page of the
iwls_logit function contains a practical example. More details about
the IWLS procedure can be found in Hastie, Tibshirani, and Friedman (2009,
Chap. 4.4.1), McCullagh and
Nelder (1999, Chap. 4.4), and many other statistical textbooks (see
References for details).