Implementing Model Averaging To Reduce Variance Using Keras

Deep learning neural networks are highly capable models that can solve complex prediction problems. Yet they come with an inherent drawback: a model may not always produce the same prediction accuracy on the same machine with the same dataset. Although the model may come up with a good prediction each and every time, this variance in predictions is a real drawback of the network.

Ensemble learning enables us to use multiple algorithms, or the same algorithm multiple times, on the same problem, which helps to reduce the variance in the predictions of a single model. Model averaging is an ensemble learning technique that helps to reduce this variance in neural networks.

We referenced Jason Brownlee’s tutorial to implement model averaging on a neural network, and we made a few changes to the code mentioned in Brownlee’s blog.

Following this tutorial will require you to have:

  • Basic knowledge of Python
  • Understanding of Neural Networks

Model Averaging

Model averaging belongs to the family of ensemble learning techniques that use multiple models for the same problem and combine their predictions to produce a more reliable and consistent prediction accuracy.
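In code, the idea boils down to this: each model outputs a probability vector per sample, the vectors are averaged across the models, and the class with the highest mean probability is chosen. Below is a minimal sketch of that logic (the helper name average_predictions is illustrative, not from the tutorial):

import numpy

def average_predictions(prob_predictions):
    # prob_predictions: a list of arrays, one per model,
    # each of shape (n_samples, n_classes)
    mean_probs = numpy.mean(numpy.array(prob_predictions), axis=0)
    # pick the class with the highest averaged probability
    return numpy.argmax(mean_probs, axis=1)

The ensemble_predictions() function implemented later in this article does the same thing, except that it sums rather than averages the probability vectors; since argmax is unchanged by dividing by a constant, the two are equivalent.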

Model Averaging on a Multi-Class Classification Problem

First, we will create a sample dataset for a multi-class classification problem using the make_blobs() function from sklearn.datasets.

from sklearn.datasets import make_blobs
X, y = make_blobs(n_samples=500, centers=3, n_features=2, cluster_std=2, random_state=2)

The above function returns 500 samples of an independent variable set with 2 features and a categorical dependent variable. The clusters of data points have a standard deviation of 2 and 3 centers, meaning each point falls into one of the 3 categories.
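As a quick sanity check, we can confirm the shapes and labels that make_blobs() returned; the commented values are what the parameters above produce:

print(X.shape) # (500, 2): 500 samples with 2 features
print(y.shape) # (500,): one integer class label per sample
print(set(y)) # {0, 1, 2}: the 3 cluster labels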

Visualizing the dataset:

from matplotlib import pyplot
import pandas as pd
df = pd.DataFrame(dict(x=X[:,0], y=X[:,1], label=y))
colors = {0:'red', 1:'black', 2:'yellow'}
fig, ax = pyplot.subplots()
grouped = df.groupby('label')
for key, group in grouped:
    group.plot(ax=ax, kind='scatter', x='x', y='y', label=key, color=colors[key])
pyplot.show()

Output: a scatter plot of the 500 samples, colored red, black, and yellow by cluster label.

The Multi-Layer Perceptron Model

Now that we have our dataset, we will determine the variance of the predictions of the same model applied to the same dataset on the same machine.

The problem is a multi-class classification problem, and the model will use the softmax function on the output layer to predict which of the 3 categories, or classes, a point falls into. Thus the first step is to one-hot encode the categorical dependent variable.

from keras.utils import to_categorical
y = to_categorical(y)
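A quick check confirms the encoding; each integer label is now a one-hot vector:

print(y.shape) # (500, 3): e.g. the label 1 becomes [0., 1., 0.]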

Now we will create the training and testing samples for our dataset. We will split it so that 30% goes to the training set and 70% goes to the test set; the deliberately small training set makes the run-to-run variance easier to see.

from sklearn.model_selection import train_test_split
X_train, X_test, Y_train, Y_test = train_test_split(X,y,test_size = 0.7, random_state = 1)

Now let’s create our neural network model.

We will create a neural network with 2 input features, one hidden layer with 20 nodes, and an output layer with 3 nodes and softmax activation. The model will be compiled with categorical cross-entropy loss and the ‘adam’ optimizer.

from keras.models import Sequential
from keras.layers import Dense
model = Sequential()
model.add(Dense(20, input_dim=2, activation='relu'))
model.add(Dense(3, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
_model = model.fit(X_train, Y_train, validation_data=(X_test, Y_test), epochs=100)
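The History object returned by fit() (stored here in _model) records the per-epoch metrics, so we can optionally plot the learning curves. Note that the metric key names depend on the Keras version: recent releases use 'accuracy'/'val_accuracy', while older ones use 'acc'/'val_acc'.

# plot training vs. validation accuracy over the 100 epochs
from matplotlib import pyplot
pyplot.plot(_model.history['accuracy'], label='train')
pyplot.plot(_model.history['val_accuracy'], label='test')
pyplot.legend()
pyplot.show()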

After fitting the model on the training set, we will now evaluate its performance and compare the accuracy metrics of the training and test sets.

_, train_acc = model.evaluate(X_train, Y_train, verbose=0)
_, test_acc = model.evaluate(X_test, Y_test, verbose=0)
print('Train: %.3f, Test: %.3f' % (train_acc, test_acc))

Variance in MLP

To see the variance, we just need to fit the already defined model on the same dataset on the same machine multiple times. To simplify the process of fitting the models and evaluating them a specific number of times, we will create a function.

from statistics import mean, stdev

# build, fit and evaluate a fresh model, returning its test accuracy
def evaluate_model(trainX, trainy, testX, testy):
    model = Sequential()
    model.add(Dense(15, input_dim=2, activation='relu'))
    model.add(Dense(3, activation='softmax'))
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    model.fit(trainX, trainy, epochs=100, verbose=0)
    _, test_acc = model.evaluate(testX, testy, verbose=0)
    return test_acc

# repeat the experiment and collect the scores
n_repeats = 10
scores = list()
for _ in range(n_repeats):
    score = evaluate_model(X_train, Y_train, X_test, Y_test)
    print('> %.3f' % score)
    scores.append(score)

print('Scores Mean: %.3f, Standard Deviation: %.3f' % (mean(scores), stdev(scores)))

Output: the test accuracy of each of the 10 runs, followed by their mean and standard deviation.

Model Averaging Ensemble

Now that we have understood how model averaging works, we will implement it on our classification problem. But we still do not know how many ensemble members will give the best score. Hence we will perform a sensitivity analysis to determine the optimum number of members whose predictions should be averaged.

from sklearn.datasets import make_blobs
from keras.utils import to_categorical
from keras.models import Sequential
from keras.layers import Dense
import numpy
from numpy import array
from numpy import argmax
from sklearn.metrics import accuracy_score
from matplotlib import pyplot
from sklearn.model_selection import train_test_split

def fit_model(trainX, trainy):
    model = Sequential()
    model.add(Dense(20, input_dim=2, activation='relu'))
    model.add(Dense(3, activation='softmax'))
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    model.fit(trainX, trainy, epochs=100, verbose=0)
    return model

# make an ensemble prediction for multi-class classification
def ensemble_predictions(members, testX):
    # make predictions
    yhats = [model.predict(testX) for model in members]
    yhats = array(yhats)
    # sum across ensemble members
    summed = numpy.sum(yhats, axis=0)
    # argmax across classes
    result = argmax(summed, axis=1)
    return result

# evaluate a specific number of members in an ensemble
def evaluate_n_members(members, n_members, testX, testy):
    # select a subset of members
    subset = members[:n_members]
    print(len(subset))
    # make a prediction with the subset
    yhat = ensemble_predictions(subset, testX)
    # calculate accuracy
    return accuracy_score(testy, yhat)

X, y = make_blobs(n_samples=500, centers=3, n_features=2, cluster_std=2, random_state=2)
X_train, X_test, Y_train, Y_test = train_test_split(X,y,test_size = 0.3, random_state = 1)
Y_train = to_categorical(Y_train)

# fit all models
n_members = 20
members = [fit_model(X_train, Y_train) for _ in range(n_members)]
# evaluate different numbers of ensemble members
scores = list()

for i in range(1, n_members+1):
    score = evaluate_n_members(members, i, X_test, Y_test)
    print('> %.3f' % score)
    scores.append(score)

print("Average Accuracy Score : ", numpy.mean(scores))
# plot score vs number of ensemble members
x_axis = [i for i in range(1, n_members+1)]
pyplot.plot(x_axis, scores)
pyplot.show()

Output: the accuracy for each ensemble size from 1 to 20, followed by a line plot of accuracy versus the number of ensemble members.

We can see that the accuracy levels off at around 13 members and then fluctuates within a close range of the average. Hence we will choose 13 as the optimum number of members.

Now we can update the code to use an ensemble of 13 models.

from sklearn.datasets import make_blobs
from keras.utils import to_categorical
from keras.models import Sequential
from keras.layers import Dense
import numpy
from numpy import array
from numpy import argmax
from numpy import mean
from numpy import std
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# fit model on dataset
def fit_model(trainX, trainy):
    # define model
    model = Sequential()
    model.add(Dense(20, input_dim=2, activation='relu'))
    model.add(Dense(3, activation='softmax'))
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    # fit model
    model.fit(trainX, trainy, epochs=100, verbose=0)
    return model

# make an ensemble prediction for multi-class classification
def ensemble_predictions(members, testX):
    # make predictions
    yhats = [model.predict(testX) for model in members]
    yhats = array(yhats)
    # sum across ensemble members
    summed = numpy.sum(yhats, axis=0)
    # argmax across classes
    result = argmax(summed, axis=1)
    return result

# evaluate ensemble model
def evaluate_members(members, testX, testy):
    # make prediction
    yhat = ensemble_predictions(members, testX)
    # calculate accuracy
    return accuracy_score(testy, yhat)

X, y = make_blobs(n_samples=500, centers=3, n_features=2, cluster_std=2, random_state=2)
X_train, X_test, Y_train, Y_test = train_test_split(X,y,test_size = 0.7, random_state = 1)
Y_train = to_categorical(Y_train)

# repeated evaluation
n_repeats = 10
n_members = 13 # optimum number of members from the sensitivity analysis
scores = list()

for _ in range(n_repeats):
    # fit all models
    members = [fit_model(X_train, Y_train) for _ in range(n_members)]
    # evaluate the ensemble
    score = evaluate_members(members, X_test, Y_test)
    print('> %.3f' % score)
    scores.append(score)

# summarize the distribution of scores
print('Scores Mean: %.3f, Standard Deviation: %.3f' % (mean(scores), std(scores)))

Output: the ensemble accuracy for each of the 10 repeats, followed by their mean and standard deviation.
