I show here how to perform simple linear regression with ordinary least squares.

We implement the closed-form expressions that result from solving

$$\min_{\alpha,\,\beta}Q(\alpha,\beta)$$

where

$$Q(\alpha,\beta) = \sum_{i=1}^n\hat{\varepsilon}_i^{\,2} = \sum_{i=1}^n (y_i - \alpha - \beta x_i)^2 $$

We thus compute

$$\frac{\partial}{\partial \beta} Q(\alpha,\beta)$$

and

$$\frac{\partial}{\partial \alpha} Q(\alpha,\beta)$$
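
For reference, a quick computation from the definition of $Q$ shows these derivatives are

$$\frac{\partial Q}{\partial \alpha} = -2\sum_{i=1}^n (y_i - \alpha - \beta x_i), \qquad \frac{\partial Q}{\partial \beta} = -2\sum_{i=1}^n x_i\,(y_i - \alpha - \beta x_i).$$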

Setting both derivatives to zero and solving yields

$$\hat\beta = \frac{ \sum_{i=1}^{n} (x_{i}-\bar{x})(y_{i}-\bar{y}) }{ \sum_{i=1}^{n} (x_{i}-\bar{x})^2 } = \frac{ \overline{xy} - \bar{x}\bar{y} }{ \overline{x^2} - \bar{x}^2 } = \frac{ \operatorname{Cov}[x,y] }{ \operatorname{Var}[x] }$$

$$\hat\alpha = \bar{y} - \hat\beta\,\bar{x}$$

So here is the code:

```c
#include "stdafx.h"   /* Visual C++ precompiled header; remove if building elsewhere */
#include <stdlib.h>

/* Sum of the n entries of x. */
double sum(double *x, int n) {
    double s = 0;
    for (int i = 0; i < n; i++) {
        s += x[i];
    }
    return s;
}

/* Dot product of x and y. */
double dot(double *x, double *y, int n) {
    double s = 0;
    for (int i = 0; i < n; i++) {
        s += x[i] * y[i];
    }
    return s;
}

/* Ordinary least squares fit of y = alpha + beta * x. */
void ols(double *x, double *y, int n, double *beta, double *alpha) {
    double sx  = sum(x, n);
    double sy  = sum(y, n);
    double sx2 = dot(x, x, n);
    double sxy = dot(x, y, n);
    double sxx = sx * sx;
    *beta  = (sxy - sx * sy / n) / (sx2 - sxx / n);
    *alpha = sy / n - *beta * sx / n;
}
```

And you can download the Visual C++ project here.
