Department of EEE, National Institute of Technology, Tamil Nadu, India

ABSTRACT
In a deregulated power market, generating companies (Gencos) evaluate bidding strategies to maximize their
profit. A Genco has to decide on the basis of limited information, since it does not know the actual system
Market Clearing Price (MCP) beforehand. Thus, devising an optimal bidding strategy is a challenging task for Gencos.
An accurately forecasted MCP is vital information that improves the chances of winning bids in today’s competitive
electricity markets. According to the literature, neural networks are used in most forecasting applications. This paper
proposes an electricity price forecast engine based on a near-optimal ANN architecture, using the available historical data to
forecast the MCP in the Indian Energy Exchange (IEX). A similar-day approach is used for forecasting the MCP.
Recent historical data from 1st January 2014 to 16th March 2014 is used in this research work. This paper also
investigates performance-related issues with the various ANN architecture models.

1. INTRODUCTION
Electricity price forecasting has been a decisive issue for all market participants in today’s restructured electricity
power industry. Precise price forecasting helps suppliers to set up bidding strategies, make investment
decisions and guard against risks. Consumers, in turn, can use price forecasts to devise appropriate power
purchasing strategies that maximize their utility. The electricity market clearing price (MCP) is the price that exists when
an electricity market is clear of shortage and surplus [1]. It is the final outcome of the market bidding process. Once the electricity
MCP is determined, every supplier whose offer price is below or equal to the MCP is selected to
supply electricity in that hour. These suppliers are all paid the same price, the electricity MCP, not the price they offered.
This keeps the market fair and discourages market manipulation. The accuracy of the forecast
depends on the availability of data and on other influential price drivers such as volatility in fuel prices,
load uncertainty, fluctuations in hydroelectricity production, generation uncertainties, transmission congestion and the behaviour
of market participants.
Owing to the significance and intricacy of electricity price forecasting, several methods have been proposed
by researchers for short-term price forecasting. Among these methods, two extensively used approaches are time series [2]
Impact Factor(JCC): 1.3268 - This article can be downloaded from www.impactjournals.us

Smitha Elsa Peter, I. Jacob Raglend & Sishaj P Simon

models and artificial neural networks (ANNs) [3]. Time series models such as dynamic regression and transfer function,
ARIMA [1], EGARCH (exponential GARCH) [4,5] and the WT-ARIMA model [6] have been proposed for this purpose.
However, most time series models are linear predictors, which have difficulty predicting the strongly nonlinear behaviour
of electricity prices.
ANNs have also been used by many researchers for price forecasting. Yamin et al. [7] have proposed a
comprehensive ANN model for short-term electricity price forecasting. Zhang et al. [8] have applied a cascaded
architecture of multiple ANNs to forecast the market clearing price (MCP) in New England. To improve the prediction
accuracy, other approaches based on hybrid models have been proposed. Rodriguez and Anders [9] have proposed a
combination of neural networks and fuzzy logic for MCP prediction in the Ontario electricity market. Li et al. [10] have
presented a fuzzy inference system with least-squares estimation for price forecasting. Although ANN-based forecast
engines have been developed, the network architecture and the manner in which the available historical data is used
differ between electricity markets or energy exchanges. Therefore, designing a near-optimal
ANN architecture for a given exchange from the available data is always challenging.

2. PROPOSED WORK
The proposed work forecasts the market clearing price of the Indian Energy Exchange (IEX). Few
publications are available on forecasting the MCP in the IEX. IEX is one of India's
electricity trading platforms. Over 2600 participants across utilities from 27 states and 5 Union Territories,
more than 500 private generators and more than 2300 open access consumers do business with IEX to manage their
power portfolios in a competitive and reliable way. Day-Ahead and Term-Ahead markets operate in the
IEX. The Day-Ahead Market (DAM) is a physical electricity trading market for deliveries in any, some or all of the 15-minute time
blocks in the 24 hours of the next day, starting from midnight. The prices and quantum of electricity to be traded are determined
through a double-sided closed auction bidding process. The Term-Ahead Market (TAM) provides a range of products allowing
participants to buy or sell electricity under contracts beyond the day-ahead market, in addition to intraday contracts [www.iexindia.com].
The proposed work concentrates on forecasting the hourly Week-Ahead Market Clearing Price, which is part of the
TAM, using a similar-day approach with a feed forward back propagation neural network (FFBPNN).
The activities of consumers are found to be similar on the same weekdays. So, in this case study, the MCP of
similar days is correlated when training on the historical MCP data. For example, the MCP profile on Monday of the previous
week is correlated with that on Monday of the present week. So when a test input is fed into the forecast model, a week-ahead MCP
profile is forecasted. Various FFBPNN architectures are tried out and the best one is proposed. The data is pre-processed
by normalizing it between 0.1 and 0.9 before it is used in this work.

3. HISTORICAL DATA OF IEX
The historical data reports available on the IEX website as the market snapshot are considered in the proposed
work. The market snapshot consists of the hourly Purchase Bid (MW), Sell Bid (MW), Market Clearing Volume (MW),
Cleared Volume (MW) and Market Clearing Price (MCP). The Market Clearing Volume (MCV) is determined before
transmission congestion is taken into account, whereas the Cleared Volume (CV) is determined after it. It is very important to
understand the nature of the recorded data, which may be vital or closely related to the MCP. Sometimes,

Index Copernicus Value: 3.0 - Articles can be sent to editor@impactjournals.us

An Architectural Frame Work of ANN Based Short Term Electricity Price
Forecast Engine for Indian Energy Exchange Using Similar Day Approach

the performance of the forecast varies largely with the homogeneity of the data used. It should be noted that the Market
Clearing Price is non-homogenous in nature. Therefore, understanding the shape of the historical data makes it easier to
choose the right data for the development of the proposed forecast engine. The market snapshot data for the first 75 days is
presented in Figure 1.

Figure 1: Market Snapshot of Historical Data (1st 75 Days of the Year 2014)
The total number of samples for the 75 days is 19320. The waveform profiles of Purchase Bid (PB), Sell Bid (SB),
Market Clearing Volume (MCV), Cleared Volume (CV) and Market Clearing Price (MCP) are found to be non-homogenous in
nature. Since all the data is non-homogenous, the correlation of any combination of the 5 waveforms
(PB, SB, MCV, CV and MCP) with the Market Clearing Price (MCP) needs to be explored in the forecast engine.
Hence, all possible architectures are tried out in the following section before a near-optimal ANN model is proposed
for the IEX. The training data, validation data and testing data for the FFBPNN are taken only from the
19320 samples. The source and target training data for FFBPNN training are taken from 1st Jan 2014 to 26th Feb 2014, and
from 8th Jan 2014 to 5th Mar 2014, respectively. The validation data is taken from 12th Feb 2014 to 19th Feb 2014 and is
compared with the actual data from 20th Feb 2014 to 26th Feb 2014. The testing or verification data is taken from
26th Feb 2014 to 5th Mar 2014, and is compared with the actual data from 6th Mar 2014 to 12th Mar 2014. It should be noted
that the testing data is not used in the training set, whereas the validation data is. Validation is
carried out during training to check that the network does not overtrain, so that the forecast accuracy does not deteriorate.
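
The similar-day pairing described above can be sketched as follows. This is a minimal illustration, assuming hourly samples and a fixed lag of 168 hours (7 days), which matches the one-week offset between the source and target training windows. The function name `similar_day_pairs` and the synthetic series are hypothetical and are not taken from the original work.

```python
HOURS_PER_WEEK = 7 * 24  # assumed lag: same weekday and hour, one week later


def similar_day_pairs(series, lag=HOURS_PER_WEEK):
    """Pair each hourly sample with the sample `lag` hours later."""
    if lag <= 0:
        raise ValueError("lag must be positive")
    return [(series[t], series[t + lag]) for t in range(len(series) - lag)]


# Synthetic hourly values for three weeks (placeholder numbers, not IEX data).
hourly_mcp = [float(h) for h in range(3 * HOURS_PER_WEEK)]
pairs = similar_day_pairs(hourly_mcp)
print(len(pairs), pairs[0])
```

Each pair holds a source value and the value observed at the same hour one week later, so a network trained on such pairs maps one week's profile to the next week's profile.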

4. PROPOSED METHODOLOGY
4.1 Architecture
The architecture of the feed forward back propagation neural network is given in Figure 2. This ANN model
consists of ‘M’ input nodes and ‘O’ output nodes, with ‘H’ hidden nodes in the hidden layer. The hidden layer and
output layer nodes use the log-sigmoid transfer function, whose output value lies in the range between 0 and 1.

Figure 2: Architecture of Feed Forward Back Propagation Neural Network (FFBPNN)
The historical dataset is usually not used directly in ANN process modelling because the process variables differ in
magnitude. Therefore, the data needs to be scaled to a fixed range to prevent unnecessary
domination by certain variables, and to prevent data of larger magnitude from overriding data of smaller magnitude and
prematurely impeding the learning process. The choice of range depends on the transfer function of the output nodes in the ANN: typically
[0, 1] for the sigmoid function and [-1, 1] for the hyperbolic tangent function. However, because a nonlinear transfer function has
asymptotic limits, the range of the dataset is always set slightly inside the lower and upper limits. In this work, since the
sigmoid function is adopted, the data is normalized to the range [0.1, 0.9]. That is, if $x_1$ and $x_2$ are the maximum and
minimum values of the training set, respectively, then the normalised data $N(x)$ is given by (4.1).

$$N(x) = \frac{(x - x_1) \times (0.1 - 0.9)}{(x_2 - x_1)} + 0.9 \qquad (4.1)$$

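
As a small worked sketch of the scaling in (4.1), the function below maps the training maximum to 0.9 and the training minimum to 0.1. The names `normalise` and `denormalise` are hypothetical, and the inverse mapping is an added convenience (useful for converting forecasts back to price units) that the text does not describe.

```python
def normalise(x, x_max, x_min, lo=0.1, hi=0.9):
    """Scale x to [lo, hi] using the form of (4.1); x_max plays x1, x_min plays x2."""
    if x_max == x_min:
        raise ValueError("x_max and x_min must differ")
    return (x - x_max) * (lo - hi) / (x_min - x_max) + hi


def denormalise(n, x_max, x_min, lo=0.1, hi=0.9):
    """Invert normalise() to recover a value on the original scale."""
    return (n - hi) * (x_min - x_max) / (lo - hi) + x_max


# Placeholder values, not IEX data.
sample = [2.0, 6.0, 10.0]
scaled = [normalise(x, max(sample), min(sample)) for x in sample]
print(scaled)
```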

Depending on the data fed into the forecast engine, the following architecture cases are proposed, and
their performance in terms of training error and forecast accuracy is discussed. In every architecture there is a single output
node, and the number of hidden nodes is set by trial and error. To understand the relationship of all the
5 waveforms with the MCP, the number of input nodes is varied from 1 to 5. The training, validation and
testing data sets are created using the similar-day approach.
Case-I (5-H-1 FFBPNN Architecture)
The network architecture consists of 5 input nodes and 1 output node. All the 5 input waveforms are given as
input for the training set.
Case-II (4-H-1 FFBPNN Architecture)
The network architecture consists of 4 input nodes and 1 output node. If the 5 waveforms are numbered as
(PB-1, SB-2, MCV-3, CV-4 and MCP-5), then the following 5 combinations of input data need to be evaluated for the
ANN model: 4(1)-H-1, 4(2)-H-1, 4(3)-H-1, 4(4)-H-1 and 4(5)-H-1. The number within the bracket is the
waveform that is not considered. For example, in 4(2)-H-1, the 2nd waveform (Sell Bid) is not considered.
Case-III (3-H-1 FFBPNN Architecture)
The network architecture consists of 3 input nodes and 1 output node. There will be 10 possible combinations.
They are 3(1-2)-H-1, 3(1-3)-H-1, 3(1-4)-H-1, 3(1-5)-H-1, 3(2-3)-H-1, 3(2-4)-H-1, 3(2-5)-H-1, 3(3-4)-H-1, 3(3-5)-H-1 and
3(4-5)-H-1. The numbers within the bracket are the waveforms that are not considered.
Case-IV (2-H-1 FFBPNN Architecture)
The network architecture consists of 2 input nodes and 1 output node. Nine combinations are considered.
They are 2(1-2-3)-H-1, 2(1-3-4)-H-1, 2(1-4-5)-H-1, 2(2-3-4)-H-1, 2(2-4-5)-H-1, 2(2-5-1)-H-1, 2(3-4-5)-H-1, 2(3-5-1)-H-1
and 2(3-5-2)-H-1.
Case-V (1-H-1 FFBPNN Architecture)
The network architecture consists of 1 input node and 1 output node. There will be 5 possible combinations.
They are 1(2-3-4-5)-H-1, 1(1-3-4-5)-H-1, 1(1-2-4-5)-H-1, 1(1-2-3-5)-H-1 and 1(1-2-3-4)-H-1.
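
The labelling scheme used in Cases I to V can be generated mechanically. The sketch below is illustrative only: the function name `architecture_labels` is hypothetical, and it enumerates every subset of the five waveforms, so it yields 10 two-input labels whereas the text lists 9 for Case-IV.

```python
from itertools import combinations

# Waveform numbering as used in the text.
WAVEFORMS = {1: "PB", 2: "SB", 3: "MCV", 4: "CV", 5: "MCP"}


def architecture_labels(n_inputs):
    """Return labels such as '4(2)-H-1' for every choice of n_inputs waveforms."""
    ids = sorted(WAVEFORMS)
    labels = []
    for kept in combinations(ids, n_inputs):
        # The bracket lists the waveforms that are NOT used as inputs.
        dropped = [str(i) for i in ids if i not in kept]
        tag = "(" + "-".join(dropped) + ")" if dropped else ""
        labels.append(f"{n_inputs}{tag}-H-1")
    return labels


for n in range(5, 0, -1):
    print(n, architecture_labels(n))
```

For example, the label `2(2-3-4)-H-1` corresponds to keeping waveforms 1 and 5, that is, Purchase Bid and MCP, as inputs.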
4.2 Step by Step Algorithm of FFBPNN Architecture
Nomenclature
$I$: Input training vector, $I = (i_1, \ldots, i_n, \ldots, i_M)$
$T$: Output target vector, $T = (t_1, \ldots, t_y, \ldots, t_O)$
$\delta_y$: Error correction weight adjustment for $w_{hy}$ due to an error at output unit $K_y$
$\delta_h$: Error correction weight adjustment for $v_{nh}$ due to an error at hidden unit $J_h$
$\alpha$: Learning rate
$f(\mathrm{sum}) = \dfrac{1}{1 + \exp(-\mathrm{sum})}$: Activation function or threshold function


Step 1: Set the trial number tr = 1.
Step 2: Set the epoch ep = 1.
Step 3: Initialize the weights to small random values between 0 and 1 to ensure that the network is not
saturated by large weight values. Let I and T be the normalized input and target training vectors from a
set of P training patterns.
Step 4: Choose a training pair from the training set.
Step 5: For each training pair, do Steps 6-11.
Step 6: Each input unit receives the input signal $i_n$ and broadcasts this signal to all units in the hidden layer J.
Step 7: Each hidden unit $J_h$ sums its weighted input signals; the net input to the hidden unit is given in
(4.2) and the output at the hidden layer (J) is given in (4.3). The hidden layer output signals are sent
to all units in the output layer.

$$\mathrm{sum}_{Jh} = b_J + \sum_{n=1}^{M} i_n \times v_{nh} \qquad (4.2)$$

$$f(\mathrm{sum}_{Jh}) = \frac{1}{1 + \exp(-\mathrm{sum}_{Jh})} \qquad (4.3)$$

Step 8: Each output unit Ky sums its weighted input signals and the net input to the output unit is given as in
(4.4) and the output at the output layer (K) is given as in (4.5).

$$\mathrm{sum}_{Ky} = b_K + \sum_{h=1}^{H} J_h \times w_{hy} \qquad (4.4)$$

$$f(\mathrm{sum}_{Ky}) = \frac{1}{1 + \exp(-\mathrm{sum}_{Ky})} \qquad (4.5)$$
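
The forward pass of Steps 6 to 8 can be sketched in plain Python as below. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, and one bias per hidden unit and per output unit is assumed, since the equations write the biases without a unit index.

```python
import math


def sigmoid(s):
    """Log-sigmoid activation, as in (4.3) and (4.5)."""
    return 1.0 / (1.0 + math.exp(-s))


def forward(i, v, b_j, w, b_k):
    """Return (J, K): hidden and output activations for one input vector.

    v[n][h] are input-to-hidden weights, w[h][y] are hidden-to-output weights.
    """
    n_hidden, n_out = len(b_j), len(b_k)
    # Hidden layer: net input (4.2) followed by activation (4.3)
    J = [sigmoid(b_j[h] + sum(i[n] * v[n][h] for n in range(len(i))))
         for h in range(n_hidden)]
    # Output layer: net input (4.4) followed by activation (4.5)
    K = [sigmoid(b_k[y] + sum(J[h] * w[h][y] for h in range(n_hidden)))
         for y in range(n_out)]
    return J, K


# A 2-3-1 network with all-zero weights: every unit outputs sigmoid(0).
J, K = forward([0.2, 0.7], [[0.0] * 3, [0.0] * 3], [0.0] * 3,
               [[0.0], [0.0], [0.0]], [0.0])
print(J, K)
```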

Back Propagation of Error
Step 9: Each output unit $K_y$ receives a target pattern corresponding to the input training pattern, computes its
error information term as in (4.6) and calculates its weight correction term as in (4.7), which is used to
update $w_{hy}$ later.

$$\delta_y = (t_y - K_y) \times f'(\mathrm{sum}_{Ky}) \qquad (4.6)$$

$$\Delta w_{hy} = \alpha \times \delta_y \times f(\mathrm{sum}_{Jh}) \qquad (4.7)$$

The bias correction term is given in (4.8).

$$\Delta b_K = \alpha \times \delta_y \qquad (4.8)$$

Step 10: Each hidden unit $J_h$ sums its delta inputs as in (4.9), multiplies the sum by the derivative of its activation function
to calculate its error information term as in (4.10) and calculates its weight correction term as in (4.11).

$$\mathrm{sum}_{\delta J} = \sum_{y=1}^{O} \delta_y \times w_{hy} \qquad (4.9)$$

$$\delta_h = \mathrm{sum}_{\delta J} \times f'(\mathrm{sum}_{Jh}) \qquad (4.10)$$

$$\Delta v_{nh} = \alpha \times \delta_h \times i_n \qquad (4.11)$$

The bias correction term is given in (4.12).

$$\Delta b_J = \alpha \times \delta_h \qquad (4.12)$$

Update Weights and Biases
Step 11: Each output unit Ky updates its weights and bias as in (4.13) and (4.14). Also each hidden unit Jh updates
its weights and bias as in (4.15) and (4.16).

$$w_{hy}(\mathrm{new}) = w_{hy}(\mathrm{old}) + \Delta w_{hy} \qquad (4.13)$$

$$b_K(\mathrm{new}) = b_K(\mathrm{old}) + \Delta b_K \qquad (4.14)$$

$$v_{nh}(\mathrm{new}) = v_{nh}(\mathrm{old}) + \Delta v_{nh} \qquad (4.15)$$

$$b_J(\mathrm{new}) = b_J(\mathrm{old}) + \Delta b_J \qquad (4.16)$$

Return to Step 5 until all the training pairs in the training set have been sent into the input layer I (one epoch is over);
then go to Step 12.
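
Steps 6 to 11 for a single training pair can be sketched as below. This is a minimal illustration under stated assumptions rather than the authors' code: the name `train_pattern` is hypothetical, one bias per unit is assumed, the log-sigmoid derivative is taken as f'(sum) = f(sum)(1 - f(sum)), and the momentum and slope factors mentioned in the results section are omitted. The starting weights and the training pair are placeholders.

```python
import math


def sigmoid(s):
    return 1.0 / (1.0 + math.exp(-s))


def train_pattern(i, t, v, b_j, w, b_k, alpha):
    """Forward pass plus one in-place update; returns the output before the update."""
    M, H, O = len(i), len(b_j), len(b_k)
    # Forward pass, as in (4.2)-(4.5)
    J = [sigmoid(b_j[h] + sum(i[n] * v[n][h] for n in range(M))) for h in range(H)]
    K = [sigmoid(b_k[y] + sum(J[h] * w[h][y] for h in range(H))) for y in range(O)]
    # Output error terms (4.6), using f'(sum) = f(sum) * (1 - f(sum))
    d_y = [(t[y] - K[y]) * K[y] * (1.0 - K[y]) for y in range(O)]
    # Hidden error terms (4.9)-(4.10), computed before w is modified
    d_h = [sum(d_y[y] * w[h][y] for y in range(O)) * J[h] * (1.0 - J[h])
           for h in range(H)]
    # Weight and bias updates (4.7)-(4.8), (4.11)-(4.16)
    for h in range(H):
        for y in range(O):
            w[h][y] += alpha * d_y[y] * J[h]
    for y in range(O):
        b_k[y] += alpha * d_y[y]
    for n in range(M):
        for h in range(H):
            v[n][h] += alpha * d_h[h] * i[n]
    for h in range(H):
        b_j[h] += alpha * d_h[h]
    return K


# Placeholder 2-2-1 network and a single normalized training pair.
v = [[0.1, 0.2], [0.3, 0.4]]
b_j = [0.0, 0.0]
w = [[0.5], [0.6]]
b_k = [0.0]
pattern, target = [0.2, 0.7], [0.9]
first = train_pattern(pattern, target, v, b_j, w, b_k, alpha=0.9)[0]
for _ in range(200):
    last = train_pattern(pattern, target, v, b_j, w, b_k, alpha=0.9)[0]
print(abs(target[0] - first), abs(target[0] - last))
```

Repeating the update on the same pair should move the output towards the target, which is what the two printed errors compare.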
Step 12: Repeat Step 4 to Step 8 until all the training pairs in the training set have been sent into the input layer I.
Calculate the error ($\varepsilon$), the difference between the desired output and the network output, for all the
training pairs as in (4.17), and then the average mean squared error (AMSE) as in (4.18), which is
calculated for every epoch. Update ep = ep+1.

$$\varepsilon_p^y = T_p^y - K_p^y \qquad (4.17)$$

$$\mathrm{AMSE} = \frac{1}{P} \sum_{p=1}^{P} \left( \frac{\sum_{y=1}^{O} \left( \varepsilon_p^y \right)^2}{O} \right) \qquad (4.18)$$
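
The per-epoch error measure of Step 12 can be sketched as follows. The function name `amse` is hypothetical, and the sketch assumes that each error is squared before averaging, as the name "average mean squared error" suggests.

```python
def amse(targets, outputs):
    """Average over P patterns of the mean squared error over O outputs."""
    if not targets or len(targets) != len(outputs):
        raise ValueError("targets and outputs must be non-empty and equal in length")
    total = 0.0
    for t_p, k_p in zip(targets, outputs):
        errors = [t - k for t, k in zip(t_p, k_p)]          # error per output, (4.17)
        total += sum(e * e for e in errors) / len(errors)   # mean over the O outputs
    return total / len(targets)                             # average over the P patterns


# Two single-output patterns, each off by 0.5.
print(amse([[1.0], [0.0]], [[0.5], [0.5]]))
```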

Step 13: Repeat Steps 4-12 if ep < TE (the total number of epochs); otherwise go to Step 14. The total number of epochs is
fixed by trial and error such that the AMSE obtained is the least. Record the final
weights and biases obtained for the current trial and update tr = tr+1. Also, if the validation error is
increasing and the number of validation checks exceeds the validation count (VC), stop the
training for the current trial and update tr = tr+1.
Step 14: Carry out a sufficient number of trials (TR) and record the final weights obtained in each trial.
If tr < TR, go to Step 2; otherwise stop the execution.
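
The validation-based stopping rule in Step 13 can be illustrated with the sketch below. It rests on an assumption the text does not spell out: a "validation check" is counted for each consecutive epoch in which the validation error fails to improve on the best value seen so far, and training stops once that count exceeds VC. The function name `stopping_epoch` is hypothetical.

```python
def stopping_epoch(val_errors, validation_count):
    """Return the 1-based epoch at which training stops for the given error history."""
    best = float("inf")
    checks = 0
    for ep, err in enumerate(val_errors, start=1):
        if err < best:
            best, checks = err, 0      # improvement: reset the check counter
        else:
            checks += 1                # no improvement: one more validation check
            if checks > validation_count:
                return ep              # stop the current trial early
    return len(val_errors)             # ran through all recorded epochs


# Placeholder validation errors: improving for three epochs, then rising.
print(stopping_epoch([5.0, 4.0, 3.0, 4.0, 5.0, 6.0, 7.0], validation_count=2))
```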
4.3 Performance Evaluation
The accuracy of the results in this case study is evaluated based on three error indices. They are: Mean Absolute
Percentage Error (MAPE), Normalized Mean Square Error (NMSE) and Error Variance (EV). The Mean Absolute
Percentage Error (MAPE) is defined by the following equation (4.19).

Where $P_i$ and $A_i$ are the $i$th predicted and actual values respectively, $A_{Ave}$ is the mean of the actual values and NH is
the total number of predictions.
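
Equation (4.19) itself is not reproduced in the text above. Because the symbol list includes the mean actual value, the sketch below assumes one commonly used form, in which each absolute error is divided by the mean of the actual values; this is an assumption, not a statement of the authors' exact formula, and the function name `mape` is hypothetical.

```python
def mape(predicted, actual):
    """MAPE in percent, with each absolute error normalised by the mean actual value."""
    if not actual or len(predicted) != len(actual):
        raise ValueError("predicted and actual must be non-empty and equal in length")
    a_ave = sum(actual) / len(actual)
    if a_ave == 0:
        raise ValueError("mean of actual values must be non-zero")
    total_abs_error = sum(abs(p - a) for p, a in zip(predicted, actual))
    return 100.0 * total_abs_error / (len(actual) * a_ave)


# Placeholder values, not IEX prices.
print(mape([110.0, 90.0], [100.0, 100.0]))
```

Normalising by the mean rather than by each individual actual value keeps the index stable when individual prices are close to zero.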

5. RESULTS AND DISCUSSIONS
The five types of architecture described in Section 4.1 are simulated for ten trials. A statistical analysis
is carried out using the average of the performance indices over all the trials. The parameter settings, namely
learning rate (0.9), momentum factor (0.9), slope factor (0.05) and validation count (VC=10), are kept
the same in all five architectures so that they are compared fairly on the same reference. The weights and biases are initialized
randomly between zero and one. The number of epochs is kept the same for all the architectures, at 1000. The number of nodes
in the hidden layer is set to H=20.
Tables 1-5 give the best and average values of all the performance indices for all the cases. From these results (Tables 1-5),
the five best performing architectures are grouped on the basis of the lowest average error and ranked according to their
performance in Table 6, which gives the details of these five architectures. Here, the network with
two input nodes, taking Purchase Bid and Market Clearing Price data as input, is ranked I as the best performing architecture,
with an average Training Error=4.7774E-05, Validation Error=1.1072E+01, MAPE=1.4428E+01, NMSE=1.4654E-07 and
EV=2.0406E+02.
Table 1: Case-I (5-H-1)
Architecture
5-H-1

From Table 6, it is observed that in all five best-ranked architectures, both Purchase Bid and Market Clearing Price are
present as input data, which indicates a good correlation with the Market Clearing Price as the target data in the training set.
Therefore, Purchase Bid data is found to pair well with MCP when training is carried out using the FFBPNN.
Since the architecture 2(2-3-4)-H-1 is found to be the best among all the architectures considered for performance
evaluation, its training is extended from 1000 epochs to 5000 epochs.
The resultant plots of the training error convergence and the validation error convergence are given in Figures 3 and 4,
respectively. The final values of the performance indices after 5000 epochs are Training Error=3.8409E-05, Validation
Error=8.5732E+00, MAPE=1.1853E+01, NMSE=9.4847E-08 and EV=1.3771E+02.
The forecasted MCP and the actual MCP from March 6th to March 12th, 2014 are shown in Figure 5. The forecasted
price for the coming week, obtained using the similar-day approach, will enable generating companies to participate carefully in
the bidding process in the week-ahead market.
[Plot: "Error Plot for Feed Forward Backpropagation Neural Network"; x-axis: Number of Epochs (0 to 5000); y-axis: Average Mean Squared Error (scale ×10^-4)]

Figure 3: Average Mean Square Error Convergence Plot (Training)

[Plot: "Validation Plot"]

Figure 5: Forecasted MCP and Actual MCP from March 6th to March 12th, 2014

6. CONCLUSIONS
The statistical analysis of the data available from the Indian Energy Exchange shows that Purchase
Bid data is closely related to the MCP, even with the non-homogenous nature of the data profile. Among the 30 architecture
combinations, the architecture with two input nodes, Purchase Bid and MCP, is found to be the most successful in minimizing
the Mean Absolute Error between the forecasted MCP and the actual MCP. This architecture can be used by generating
companies in deciding their bidding strategy in the highly competitive Indian Energy Exchange.