This is a generic linear model with one-dimensional output, so it's suited to one-dimensional regression or binary classification. It supports custom basis functions and can be trained with either gradient descent or the normal equations.

In [165]:

%pylab inline
class LinearModel(object):
    """
    A generic linear regressor. Uses nonlinear basis functions, can fit with
    either the normal equations or gradient descent
    """

    def __init__(self, basisfunc=None):
        """
        Instantiate a linear regressor. If you want to use a custom basis
        function, specify it here. It should accept an array and output an
        array. The default basis function is the identity function.
        """
        self.w = array([])
        self.basisfunc = basisfunc if basisfunc is not None else self.identity

    def identity(self, x):
        # identity basis function - for linear models in x
        return x

    def basismap(self, X):
        # return X in the new basis (the design matrix)
        Xn = zeros((X.shape[0], self.basisfunc(X[0, :]).shape[0]))
        for i, xi in enumerate(X):
            Xn[i, :] = self.basisfunc(xi)
        return Xn

    def fit_gd(self, X, y, itrs=100, learning_rate=0.1, regularization=0.1):
        """
        fit using iterative gradient descent with least squares loss
        itrs - iterations of gd
        learning_rate - learning rate for updates
        regularization - weight decay. Greater values -> more regularization
        """
        # first get a new basis by using our basis func
        Xn = self.basismap(X)
        # initial weights
        self.w = uniform(-0.1, 0.1, (Xn.shape[1], 1))
        # now optimize in this new space, using gradient descent
        print 'initial loss:', self.loss(X, y)
        for i in range(itrs):
            grad = self.grad(Xn, y, regularization)
            self.w = self.w - learning_rate * grad
        print 'final loss:', self.loss(X, y)

    def grad(self, X, y, reg):
        """
        Returns the gradient of the loss function with respect to the weights.
        Used in gradient descent training.
        """
        return -mean((y - dot(X, self.w)) * X, axis=0).reshape(self.w.shape) + reg * self.w

    def fit_normal_eqns(self, X, y, reg=1e-5):
        """
        Solves for the weights using the normal equation.
        """
        Xn = self.basismap(X)
        # self.w = dot(pinv(Xn), y)
        self.w = dot(dot(inv(eye(Xn.shape[1]) * reg + dot(Xn.T, Xn)), Xn.T), y)

    def predict(self, X):
        """
        Makes predictions on a matrix of (observations x features)
        """
        Xn = self.basismap(X)
        return dot(Xn, self.w)

    def loss(self, X, y):
        # assumes that X is the data matrix (not the design matrix)
        yh = self.predict(X)
        return mean((yh - y) ** 2)

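For reference, here is a minimal usage sketch (not part of the original notebook): it fits noisy quadratic data with a hand-rolled quadratic basis function. The quadratic_basis helper, the synthetic data, and all parameter values below are illustrative assumptions, and the code relies on the pylab namespace loaded above.

# hypothetical basis function: maps a 1-D input x to [1, x, x^2]
def quadratic_basis(x):
    return array([1.0, x[0], x[0]**2])

# illustrative synthetic data: 50 points of a noisy parabola on [-1, 1]
# (uniform and normal come from numpy.random via %pylab)
X = uniform(-1, 1, (50, 1))
y = 2*X**2 - X + 1 + normal(0, 0.1, X.shape)

# closed-form (ridge-regularized) fit via the normal equations
model = LinearModel(basisfunc=quadratic_basis)
model.fit_normal_eqns(X, y, reg=1e-5)
print 'normal equations loss:', model.loss(X, y)

# the same model fit by gradient descent
model_gd = LinearModel(basisfunc=quadratic_basis)
model_gd.fit_gd(X, y, itrs=500, learning_rate=0.1, regularization=0.01)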
So we can see that the regularization pulls the size (L2 norm) of the weight vector toward zero, which makes the decision boundary less complicated. It can also be interpreted as placing a zero-centered Gaussian prior on the weight vector. Let's look at regularization in regression.
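As a quick illustrative check (again, not from the original notebook), refitting the hypothetical quadratic data above with increasing reg in fit_normal_eqns should shrink the L2 norm of the learned weights while the training loss creeps up:

# sweep the ridge strength and watch ||w|| shrink
for reg in (1e-5, 1e-1, 10.0, 1000.0):
    m = LinearModel(basisfunc=quadratic_basis)
    m.fit_normal_eqns(X, y, reg=reg)
    w_norm = sqrt((m.w**2).sum())   # L2 norm of the weight vector
    print 'reg =', reg, ' ||w|| =', w_norm, ' train loss =', m.loss(X, y)

This is the same shrinkage fit_normal_eqns already applies: the eye(Xn.shape[1])*reg term added inside the inverse is what pulls the solution toward zero.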