I show here how to perform simple linear regression with ordinary least squares. We implement the closed-form solution that results from
$$\min_{\alpha,\,\beta}Q(\alpha,\beta)$$
where
$$Q(\alpha,\beta) = \sum_{i=1}^n\hat{\varepsilon}_i^{\,2} = \sum_{i=1}^n (y_i - \alpha - \beta x_i)^2 $$
We thus compute
$$\frac{\partial}{\partial \beta} Q(\alpha,\beta)$$
and
$$\frac{\partial}{\partial \alpha} Q(\alpha,\beta)$$
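Written out, these derivatives are (a standard intermediate step, stated here for completeness):
$$\frac{\partial Q}{\partial \alpha} = -2\sum_{i=1}^n (y_i - \alpha - \beta x_i), \qquad
\frac{\partial Q}{\partial \beta} = -2\sum_{i=1}^n x_i\,(y_i - \alpha - \beta x_i)$$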
Setting both derivatives equal to zero and solving yields
$$\hat\beta = \frac{ \sum_{i=1}^{n} (x_{i}-\bar{x})(y_{i}-\bar{y}) }{ \sum_{i=1}^{n} (x_{i}-\bar{x})^2 }
= \frac{ \overline{xy} - \bar{x}\bar{y} }{ \overline{x^2} - \bar{x}^2 }
= \frac{ \operatorname{Cov}[x,y] }{ \operatorname{Var}[x] }
$$
$$\hat\alpha = \bar{y} - \hat\beta\,\bar{x}$$
So here is the code:
#include "stdafx.h"   /* Visual C++ precompiled header */
#include <stdlib.h>

/* Sum of the elements of x. */
double sum(double *x, int n) {
    double s = 0;
    for (int i = 0; i < n; i++)
        s += x[i];
    return s;
}

/* Dot product of x and y. */
double dot(double *x, double *y, int n) {
    double s = 0;
    for (int i = 0; i < n; i++)
        s += x[i] * y[i];
    return s;
}

/* Least-squares estimates: *beta = Cov[x,y]/Var[x], *alpha = ybar - beta*xbar. */
void ols(double *x, double *y, int n, double *beta, double *alpha) {
    double sx  = sum(x, n);     /* sum of x_i   */
    double sy  = sum(y, n);     /* sum of y_i   */
    double sx2 = dot(x, x, n);  /* sum of x_i^2 */
    double sxy = dot(x, y, n);  /* sum of x_i*y_i */
    *beta  = (sxy - sx * sy / n) / (sx2 - sx * sx / n);
    *alpha = sy / n - *beta * sx / n;
}
And you can download the Visual C++ project here.