Verification and Validation

Friday, March 21, 2014

Transport planners are starting to consider how "big data" retrieved from passenger smart cards, onboard computers and mobile phones could improve the design of urban rail networks and timetables, and improve operations by predicting ridership.

Niels van Oort, assistant professor at Delft University of Technology, and consultant at Goudappel Coffeng, explains how big data was utilised to support the business case for a proposed light rail line in Utrecht.

We are currently living through the big data age. All around us technology is collating reams and reams of data about our everyday activities, including travel habits and preferences. But how can this data, and information about passengers' habits in particular, be utilised effectively to aid public transport planning?

In the last decade or so, more and more attention has been paid to service reliability. This is an important quality characteristic in public transport because it benefits passengers, by decreasing travel times and making them more predictable, and operators, by lowering costs.

My PhD research provided an overview of several measures that could improve the level of service reliability at all levels of public transport planning and operations. However, I found that in cost-benefit analyses (CBA) of specific projects this quality aspect is rarely taken into account. Instead a qualitative assessment or expert judgement is used rather than proper calculations.

One of the main reasons for ignoring service reliability impacts in CBAs is the difficulty of quantifying the service reliability effects of projects on passengers. In general, service reliability indicators focus on vehicle effects, while it is the passenger effects that matter when calculating costs and benefits. My work therefore aimed to bridge this gap by focusing on passengers and utilising new data sources to calculate and illustrate service reliability effects and determine their potential impact on a CBA for a transport network, which could inform decision making.

Service reliability is currently taken into account more in road schemes than in public transport schemes. While similarities do exist, public transport applications are more complex since a schedule is involved and a passenger trip chain consists of waiting, transferring, access and egress time in addition to the time it takes to complete the actual journey.

As well as setting up the theoretical framework for improving service reliability, we applied new big data sources such as onboard computer and smart card data to perform a case study in Utrecht, the Netherlands' fourth largest city with over 300,000 inhabitants, with the aim of improving the CBA process through a practical application. The study was particularly pertinent because the results were used to assist the Dutch government as it considered whether to support the construction of a light rail line in Utrecht between the central station and the Uithof in the east of the city, the location of the hospital and campuses of Utrecht University and the University of Professional Education Utrecht as well as other businesses.

The quality of existing public transport services between Utrecht central station and the Uithof is quite poor. Although services are operated by 23 double-articulated buses per hour per direction, which carry 23,000 passengers per day, capacity is lacking, with passengers often having to wait for two or three buses to board during peak times, while dedicated right of way is provided only on short sections of the route, which leads to conflicts with cars and cyclists. This is particularly apparent at the border of the old town, where road space is limited and delays are commonplace. Bunching of two or even three buses sometimes occurs, while the average deviation from the timetable is four minutes, exceeding the scheduled headway of about 2.5 minutes.

The city of Utrecht hopes to expand the area around the Uithof by 25% by 2020, with up to 53,000 students and 30,000 employees expected to use this area every day. The city aims to achieve this growth without building additional car parks and as a result plans to accommodate the expansion by stimulating increased use of bicycles and public transport. Subsequent demand forecasts conducted by Goudappel Coffeng in 2011 indicate growth of up to 45,000 public transport passengers per day in 2020. This will require more than 50 buses an hour per direction to provide adequate capacity, which the existing infrastructure is clearly not able to support.

To deal with this large expected increase in public transport use, a new connection was designed, proposed as a light rail line rather than a bus route in order to achieve the desired level of service. The 8km Uithof Line is expected to operate 16-20 services per hour per direction during the morning peak.

Primary benefit

In addition to lower emissions, the primary benefit of converting the bus route into a light rail line is that it is possible to operate fewer vehicles, which are less prone to interference from road traffic, decreasing the probability of bunching. However, the construction and operation costs of light rail are higher than those of bus operations, especially because this is Utrecht's inaugural project. It was therefore appropriate to conduct a CBA highlighting the pros and cons of building the line in order to make the case for investment from the Dutch Ministry of Infrastructure and Environment.

In the CBA, we calculated the service reliability benefits of transferring the existing bus operation into a light rail system. We compared five future situations for the route in 2020 including bus rapid transit alternatives, but for this article we will only focus on the reference case and the preferred alternative.

The reference case assumes that no additional infrastructure will be constructed and that the capacity of the bus service is limited to its current levels. Since ridership and the number of buses will increase, unreliability is expected to increase.

In contrast, in the light rail case the service is operated by LRVs on a dedicated right-of-way. Due to sufficient capacity on the track and at the stops, and with limited interaction with other traffic, the expected level of service reliability will be high. In addition, compared with over 50 buses required to operate a comparable service, the number of vehicles is limited, thereby reducing the probability of bunching and delay propagation.

When considering the impact of service reliability on passengers for the CBA, we analysed the actual performance in 2008, which we used as the base for the 2020 predictions. The level of service was determined by investigating onboard computer data, which offered insights on the distribution of dwell times per stop, overall journey times and delays. Smart card data was also analysed to illustrate passenger flows, with results from both data sources combined using our framework to calculate the passenger impacts of the level of service reliability.

Specifically, onboard computer data was used to calculate the effects of changes in waiting times, and the change in distribution of total travel times on passengers, while we also calculated future demand by using the Omnitrans demand model. This information was used to develop a simulation of new vehicle and passenger data, and the expected resulting trip times, dwell times, delays, and the level of bunching, which helped us to calculate the passenger effects.
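The passenger waiting-time effect of irregular headways can be sketched with the standard formula E[W] = E[H]/2 · (1 + CV²), where H is the observed headway and CV its coefficient of variation. This is a minimal illustration of the kind of calculation such a framework performs; the headway values below are made up, not the Utrecht data.

```python
# Expected waiting time for randomly arriving passengers, from observed
# headways (e.g. extracted from onboard computer logs). Illustrative data.
headways = [2.0, 1.5, 4.0, 2.5, 2.0, 3.0]  # minutes between vehicles

mean_h = sum(headways) / len(headways)
var_h = sum((h - mean_h) ** 2 for h in headways) / len(headways)
cv_squared = var_h / mean_h ** 2

expected_wait = mean_h / 2 * (1 + cv_squared)  # E[W] = E[H]/2 * (1 + CV^2)
perfect_wait = mean_h / 2                      # wait under perfectly regular service

print(f"mean headway {mean_h:.2f} min, expected wait {expected_wait:.2f} min")
print(f"unreliability adds {expected_wait - perfect_wait:.2f} min per passenger")
```

Irregularity inflates the average wait above half the headway; that increment, aggregated over all passengers, is one of the reliability effects that can be monetised in a CBA.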

Our results showed that in the reference case the level of service will be very low due to high passenger demand and insufficient bus infrastructure. In the case of the light rail line, sufficient infrastructure is provided and light rail services require fewer vehicles, thereby reducing the probability of bunching.

After the calculation of these passenger impacts, the monetary values of these effects were found using values of time and values of reliability. The total costs and benefits of the project showed the substantial contribution of improved reliability to the positive score of the CBA, which is 1.2, ie the benefits are 20% higher than the costs. The impact of reduced waiting times due to the light rail line's enhanced service reliability is €123m over the complete life of the line, and the reduced spread of travel times results in a €78m reduction in societal costs. So, service reliability and related benefits account for €201m, around 60% of the project's total benefits of €336m.
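The headline figures can be checked with a few lines of arithmetic (a sketch using only the values quoted above):

```python
# Benefit figures quoted for the Uithof line CBA (EUR millions, over the
# complete life of the line).
waiting_time_benefit = 123       # reduced waiting times from higher reliability
travel_time_spread_benefit = 78  # reduced spread of travel times
total_benefits = 336
benefit_cost_ratio = 1.2         # benefits are 20% higher than costs

reliability_benefits = waiting_time_benefit + travel_time_spread_benefit
implied_costs = total_benefits / benefit_cost_ratio

print(f"Reliability-related benefits: EUR {reliability_benefits}m")  # -> 201
print(f"Implied total costs: EUR {implied_costs:.0f}m")              # -> 280
```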

Without this method and these big data sources, it would not have been possible to calculate the benefits of enhanced service reliability, which proved to be a major part of the total benefits contributing to the CBA result of 1.2. This result was critical in convincing the Dutch Ministry of Infrastructure and Environment to provide €110m for the light rail project and in getting the line off the drawing board. Construction is now underway on the 7.5km route, which is being managed by Uithoflijn, a joint undertaking between the Utrecht region and the city municipality, and is expected to open in 2018. Utrecht's existing 21km "sneltram," which runs to Nieuwegein and IJsselstein, will be converted to low-floor operation to offer a through service to the new line.

This project shows the potential value of utilising big data in railway planning. In this case study the impacts of quantifying service reliability were substantial and made the difference between a positive or negative business case, and ultimately the construction of a light rail line which will improve the quality of transport for millions of passengers in Utrecht in the years to come.

Monday, February 4, 2013

A perfect design is an enemy of a good design. Often, designers striving for a perfect design end up with no design at all due to schedule and cost overruns. A simple design may not provide the best solution to a given problem, but it probably has the best chance of meeting the schedule and cost constraints with acceptable quality. A simple design is also easier to implement, maintain and enhance.

It is advisable not to introduce additional complexity in the name of future design hooks. More often than not, these hooks turn out to be more of a liability than an asset for future designers, who might be forced to accept a design based on them. Avoid adding design hooks for the future: they add work you don't need to do now and will be of little help to future designers.

Use a special purpose computing platform only after you have exhausted all possibilities of using a general purpose platform. For example, if your application requires signal processing capabilities, consider whether the performance goals can be met by a general purpose PC platform without using Digital Signal Processors (DSPs). General purpose processors might support specialized instructions that bring them on par with specialized platforms like DSPs. General purpose platforms also offer low-cost software and hardware development tools, it is easy to find people with skills in using them, and their much higher market volume often makes them an order of magnitude cheaper than specialized platforms.

When developing a hardware and software architecture, prefer designs that reuse already developed software and hardware modules. The future reusability of the software and hardware modules should also be a factor in choosing new architectures. Avoid the "let's start with a clean slate" approach to developing systems. New projects should build on the results of previous projects. This lowers cost by reducing the complexity of the system being developed.

Many embedded systems use home-grown protocols and operating systems, which adds cost to maintain the associated software. Using standard protocols and operating systems lowers cost and improves the stability of the product, as standard products have been subjected to rigorous testing by countless systems. Proprietary protocols and operating systems often cost a lot more due to the need to train developers.

A railway signalling system is a safety-critical system that controls traffic, including train routes, shunting moves and the movements of all other railway vehicles, in accordance with railway rules, regulations and the technological processes required for the operation of the railway system. The overall signalling system consists of microprocessor-based wayside controllers, on-board systems controlling the railway vehicle, and supervision systems to monitor vehicle movements from a centralized location. The complex nature of railway signalling rules and the operational practices adopted by different railroads make the software development of these systems a difficult task. The complexity of the software poses an even more challenging task during the Independent Verification and Validation of the system. The CENELEC set of standards is widely accepted as the governing standard for the design, development and Independent Verification and Validation (IV and V) of railway signalling systems. This paper describes the challenges faced during the different phases of IV and V of safety-critical railway signalling software, which are unique compared to other domains.

Nomenclature

IV and V = Independent Verification and Validation

ATP = Automatic Train Protection

On-Board = Embedded systems used on the train

CENELEC = European standards for railway signalling
I. Introduction

IV and V is the most important phase of any safety-critical system life cycle. The result of this phase decides the final outcome of the project and whether the product is fit for use. The IV and V of safety-critical software for railway signalling applications faces many challenges due to the complexity of the systems and the variations they have depending upon the geography and environment in which they need to operate. This paper focuses on the experiences and challenges during the different phases of IV and V in a railway signalling project. The following areas will be discussed:

1) Systematic Problems

2) Challenges during Software Analysis

3) Challenges during System Integration and Field Validation Testing

4) Challenges during Test Result Analysis

II. Systematic Problems during IV and V of Railway Signalling Software

The following systematic
problems are experienced during the IV and V of safety critical software
developed for railway signalling applications:

1) Lack of formal methods in developing the control algorithms results in poor understanding of the system by the test engineer.

2) Lack of domain knowledge in railway signalling systems creates a technological gap between the software test engineers and the domain consultants. This leads to errors in software testing, which might leave unsafe failures undetected.

3) Since the software and hardware are so complex, complete testing of the system is not possible, and most faults are revealed at the field installation stage or during normal working of the system in the field.

4) The software is often changed for every geographical location, resulting in specific code for each location. When the software structure is not in a generic form, it becomes difficult for the test engineer to develop test cases for every possible scenario.

5) The lack of standardization in railway working principles results in incomplete test cases, as test engineers are not well versed in all types of railroads.

6) Increasing complexity of the software makes testing difficult; since most railway systems are sequential machines, they are error prone and very difficult to test.

III. Challenges during Software Analysis

The following section describes the challenges faced during
Static and Dynamic analysis of safety critical software developed for railway
signalling applications:

1) Static analysis of software is the analysis of the software code without actually executing it. Railway signalling software, particularly the vehicle braking algorithms, is very complex and requires the test engineer to be well versed in the dynamics of the vehicle and to have good mathematical knowledge. These algorithms require the test engineer to envisage all the possible states of the algorithms and then create a formal model of the system. In many cases the test engineer's lack of knowledge about these algorithms results in insufficient test cases for the model, with many of the errors only revealed later during dynamic analysis.

2) Dynamic analysis of software is the analysis of the software code by actually executing it and observing the executions. The dynamic analysis of safety-critical software is an important phase of the Independent Verification and Validation of the system. The test engineer should be well versed in the domain of inputs and outputs of the system. In many cases the test engineer chooses boundary values based on the data range of the variable type. In the real world the boundary values depend on the actual working environment of the system: for example, the GPS signal boundary values received by the vehicle to determine its position vary with the geographical location of the railroad and are embedded in the vehicle database. Test cases for this part of the software should therefore change with the geographical location where the train is operating; otherwise the literal boundary values of the variable type would pass dynamic analysis, and the error would only be revealed during field validation tests.
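The boundary-value pitfall described here can be sketched as follows. The database layout, field names and limits are hypothetical illustrations, not the actual on-board format:

```python
# Sketch: boundary values for a GPS-position check should come from the
# vehicle's location database, not from the limits of the variable type.
INT16_MIN, INT16_MAX = -32768, 32767  # type-range boundaries (the wrong choice)

# Per-railroad operating envelope, as it might be stored in the vehicle
# database (hypothetical values).
location_db = {
    "railroad_A": {"min_position_m": 0, "max_position_m": 48_500},
    "railroad_B": {"min_position_m": 0, "max_position_m": 112_000},
}

def boundary_test_values(railroad):
    """Boundary values to test for a given railroad's operating envelope."""
    env = location_db[railroad]
    lo, hi = env["min_position_m"], env["max_position_m"]
    # Classic boundary-value picks: just below, on, and just above each limit.
    return [lo - 1, lo, lo + 1, hi - 1, hi, hi + 1]

print(boundary_test_values("railroad_A"))
# Type-range values like INT16_MAX would pass in the lab yet never occur in
# service, so faults at the real envelope limits would go undetected.
```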

3) Inexperienced test engineers who just follow the rule book often produce insufficient test scenarios. Testing experience and intuition, combined with knowledge of and curiosity about the system under test, may add uncategorized test cases to the designed test case set. Special values or combinations of values may be error-prone. Some interesting test cases may also be derived from inspection checklists.

4) Test engineers are generally not well versed in the concept of error seeding and do not try to measure the effectiveness of their test cases. Some known error types should be inserted in the program, and the program should be executed with the test cases under test conditions. If only some of the seeded errors are found, the test case set is not adequate. The ratio of found seeded errors to the total number of seeded errors is an estimate of the ratio of found real errors to the total number of real errors. This makes it possible to estimate the number of remaining errors and thereby the remaining test effort.
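The seeded-error estimate can be written out directly (a sketch; the counts are illustrative):

```python
def estimate_remaining_errors(seeded_total, seeded_found, real_found):
    """Estimate real errors remaining, using the seeded-error ratio.

    If the test set finds the same fraction of real errors as it finds of
    seeded ones, then:
        found_seeded / total_seeded ~= found_real / total_real
    """
    if seeded_found == 0:
        raise ValueError("no seeded errors found; test set tells us nothing")
    detection_ratio = seeded_found / seeded_total
    estimated_total_real = real_found / detection_ratio
    return estimated_total_real - real_found

# Illustrative numbers: 20 errors seeded, 15 of them found, alongside 30
# real errors found by the same test cases.
remaining = estimate_remaining_errors(seeded_total=20, seeded_found=15,
                                      real_found=30)
print(f"Estimated real errors still in the program: {remaining:.0f}")  # -> 10
```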

5) Performance testing of the system in the lab environment is often inadequate, since the simulators do not exactly replicate the field environment; this results in many errors being revealed only during the field validation phase.

6) Test engineers often follow the concept of "Equivalence Classes and Input Partition Testing" to save time and testing effort. This principle is often applied poorly due to the inexperience of the test engineer or insufficient coverage of the data classes.

7) The lack of formal methods in developing software prevents the test engineer from taking full advantage of the "Structure Based Testing" concept, where clearly defined states and modules are required for generating test cases for complete coverage of the system.

IV. Challenges during System Integration and Field Validation Testing

The following section describes the challenges faced during
System Integration testing of safety critical railway signalling systems:

1) System integration testing should ideally start after the unit/module tests have successfully passed. In reality, due to the delays typical of railway signalling projects, the system integration tests are carried out in parallel with the unit/module tests, causing many problems to be revealed only at the system level.

2) Many unexpected behaviors of the system are revealed at this stage, since the software has not completely gone through unit tests. This causes further delays in the system integration tests, and changes in the requirements are required since not all scenarios at the unit level have been accounted for.

3) Test engineers who should be devoted to finding integration issues find themselves more involved in sorting out problems that should have been caught during unit tests.

4) The integration issues found during this phase often lead to design changes, which are costly to fix and in turn increase the complexity of the system.

5) The field validation tests are executed in the actual field environment, and many of these scenarios are not accounted for in the lab. The same software therefore behaves differently in the lab and in the field, which leads to confusion and is hard for the developers to debug.

V. Challenges in Analysis of Test Results

Railway signalling systems generally involve a large number of test scenarios to be performed in the field, and much of the data collected during the tests requires offline analysis. The section below describes the challenges and problems associated with this phase.

1) In many railway projects the field engineers are recruited locally to ensure easy access to the test site; often these field engineers are new to railway signalling and have little or no training to execute the tests.

2) The complex nature of offline log analysis requires the test analyst to be fully synchronized with the field engineer who executed the test. In many cases the lack of communication between the field and offline test analysts results in a test being falsely reported as failed.

3) In many cases, due to inherent errors in the test procedure, the log file analyst reports the test as failed and the team goes through multiple cycles of test execution.

4) On-board ATP log analysis often requires checking the braking profiles of the system, which demands complete knowledge of the braking algorithms. In many cases the test engineer is not qualified to perform this analysis, resulting in poorly reported analysis.

5) The lack of co-ordination between the test lead and the field technicians often results in incomplete tests and, later, incomplete log analysis.

VI. Conclusion

Railway signalling is a very specialized and unique area where a high level of planning is required for all phases of the project lifecycle, especially for the IV and V of safety-critical software. Poor planning at the start of a project usually results in cost overruns and delays. In our experience with railway signalling projects, limited budget and time are generally allocated to the IV and V phase, which in reality takes the majority of the project budget. If the IV and V phase is planned well in advance and specific managerial responsibility is assigned for this task, projects can be completed on time and with better results, which in turn makes the job of the safety assessor easier. We suggest the following mitigation measures to ensure a successful IV and V of railway signalling systems:

1) Care should be taken to recruit test engineers who at least have basic knowledge of railway signalling and associated systems.

2) If the test engineers are new recruits, they should be put through rigorous training before being assigned critical tasks such as writing test procedures and analyzing the test data logs.

3) Regular training sessions should be conducted for the test engineers on the project to impart in-depth knowledge of the system.

4) Encourage test engineers to be innovative in their testing methods instead of just following the regular patterns; this way more errors are revealed in the system that often go undetected with traditional test methods.

5) Create an environment where test engineers regularly interact with the design team to share each other's experiences and concerns.

6) Create a dedicated managerial team to monitor all the test activities occurring at different sites and co-ordinate them. Better co-ordination between the lab and field test teams leads to better analysis of the system.

7) Never follow the approach of parallel testing activities; for example, the system integration tests should never be planned in parallel with the unit tests.

Acknowledgments

The author would like to express his gratitude to Stephen A. Jacklin of the NASA Ames Research Center for his encouragement to take up this study and present the author's experiences with IV and V in the railway signalling domain.

Saturday, January 21, 2012

Abstract—Railway points (switches) are a vital component of any railway interlocking system. Regular maintenance of points is required to keep them in operating condition. Present maintenance of points involves frequent inspection by maintenance staff and is not fool proof. Electronic monitoring systems are currently available, but they only log events and do not give any predictive analysis of the health of the points subsystem. This paper discusses a new approach for the maintenance and diagnosis of railway points that is capable of remote monitoring and is intelligent enough to give predictive maintenance reports on the points' health. This reduces the effort and the huge costs of manual monitoring and, being fool proof, helps avoid accidents. Distributed data gathering and centralized data processing methods are discussed that not only report a fault but also give predictive measures to be taken by the field staff to avoid catastrophic failures.

Introduction

Railways traverse the length and breadth of our country, covering 63,140 route kms, comprising broad gauge (45,099 kms), meter gauge (14,776 kms) and narrow gauge (3,265 kms). The most important part of the railways for carrying out operations such as the safe movement of trains and communication between different entities is signalling. Railway signalling is governed by a concept called interlocking. The main component of the interlocking is the railway points, which use DC electric motors to switch the rails to a different route. Maintaining these vast and widespread assets to meet the growing traffic needs of a developing economy is no easy task and makes Indian Railways a complex cybernetic system. The current mechanism in place to maintain the railway points is completely manual and requires a large pool of maintainers to check the validity of the point machine and the related point infrastructure regularly; this process is neither cost effective nor fool proof. With the traditional method of manual maintenance, rail operators have no prior warning for the replacement or repair of points. The discussion in this paper mainly focuses on the development of a system that not only monitors the points remotely without manual intervention, but also diagnoses problems in the points, thus saving human lives and huge manual maintenance costs. The motivation for developing a predictive maintenance system for railway points is as follows:

To use an array of sensors to monitor all relevant parameters, in order to provide advanced warning of degradation prior to railway points failure.

To provide predictive maintenance reports about the
point machines to the maintainers.

To provide continuous monitoring at both local
and centralized locations.

To provide an automated archival
record from which broad trends can be extracted from the entire
railway asset base.

To provide, in the event of a catastrophic failure,
the immediate past history to identify the cause.

Railway Points Structure

Figure 1 describes the architecture of railway points in operation.

Points, or switches as they are known, allow a rail vehicle to move from one set of rails to another. They are a 'digital output device' in that there are only two acceptable states for the point to be set in: 'normal' and 'reverse'. Movement is carried out by way of a geared motor, which actuates the stretcher bar. Location or state detection is made by a two-position, polarized, magnetic stick contactor. A signal is fed back from these switches to the signal box, where all point directions are controlled and monitored. Snap-action switches at the end of the stroke stop the machine and help brake the motor to reduce any impact at the end of travel. Two stretcher bars (Figure 1) ensure that the switch rails remain the correct distance apart; this can vary between installations depending on the curvature of the main rails and the speed limit of that section of track. There are usually two stretcher bars for each point machine. Any fault in this mechanism, such as poorly secured or loose bolts holding these stretcher bars, may lead to deadly accidents.

Proposed Predictive
Maintenance System Architecture

The proposed architecture of the Predictive Maintenance System (PMS) for railway points is discussed below using Figure 2.

Figure 2: Architecture of PMS

Sensors are used to measure the voltage, current, load and temperature of the point motor. The throwing load sensor is used to measure the stress in the operating rod of the point machine. The sensor values are read in real time by the wayside device and sent to a central location for analysis, using the GSM/GPRS network. The central station analyzes the data in real time, makes predictions about the point machines and stores them in a database. The status of any point machine can be viewed in the central station using any internet browser, and local station maintainers can view the data by logging in to the web server in the same way. Based on the current consumption, the load sensor values and the point motor temperature, predictions are made for the maintenance or replacement of the point motors. The central location uses a web server based architecture, where anyone with a web browser can log in and see the details.
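A minimal sketch of the prediction step might look as follows; the thresholds, field names and reference values are hypothetical, not those of the deployed system:

```python
# Sketch of the central-station prediction step: compare incoming wayside
# readings against stored reference characteristics and flag machines for
# maintenance. All thresholds and names are hypothetical illustrations.
REFERENCE = {"peak_current_a": 6.0, "peak_load_n": 3500, "max_temp_c": 70}

def maintenance_advice(reading):
    """Return a maintenance prediction for one point-machine operation."""
    if reading["motor_temp_c"] > REFERENCE["max_temp_c"]:
        return "urgent: motor overheating, inspect before next operation"
    # Sustained rises over the healthy reference indicate mechanical wear.
    if reading["peak_current_a"] > 1.2 * REFERENCE["peak_current_a"]:
        return "schedule maintenance: drive current trending high"
    if reading["peak_load_n"] > 1.2 * REFERENCE["peak_load_n"]:
        return "schedule maintenance: throwing load trending high"
    return "healthy"

print(maintenance_advice(
    {"motor_temp_c": 55, "peak_current_a": 7.5, "peak_load_n": 3400}))
```

In practice the comparison would use the recorded current and load curves rather than single peak values, but the flow (reference data in, per-operation reading in, advisory out) is the same.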

Data processing and
analytics

The system has a database of the current and load characteristics of railway points in good working order. This data is used as a reference for processing real-time data received from the wayside units. The following figures show the current (i) and load sensor values plotted against time during point machine operation.

Figure 3: Current Characteristics

Figure 4: Load Force Characteristics

Data Processing
Techniques

Various signal processing techniques are available for the analysis of real-time data, as described below:

1) Data cluster method – This involves recording the characteristics of a parameter of a subsystem under different simulated conditions and then using this as a reference to validate the real-time data. This method differs from template matching, since it is not entirely based on matching the plotted characteristics.

2) Template matching – Entails comparing complete data sets with pre-recorded examples of data resulting from known fault conditions. The method can be used effectively in some circumstances, provided a representation of the data that produces good discrimination between pattern classes can be found. However, this requires a substantial amount of experimentation with different transformations of the data sets, and would be a computationally intensive process.

3) Statistical
and decision theoretic methods – Matches are made based on statistical features
of the signal. For example, the mean and peak-to-peak value are evaluated for
each vector, and plotted in feature space, whereby different patterns are
distinguishable because they form clusters for each class that are located apart
from the fully functioning case.

4) Structural
or syntactic methods – Involves deconstructing a pattern or vector into
structural components, to enable comparisons to be made on more simple,
sub-segments of data rather than a complete vector. Mathematically, these
methods are similar to fractal-based compression routines.

The method of specific interest to this project was a data clustering methodology: a database is built of good measurements, together with load-sensor readings taken under various faults simulated in the laboratory on specimen railway points, and the real-time load-sensor data is then plotted against this reference. This generates distinct clusters of data points, each representing one type of fault.

By applying this technique we obtain clusters of fault data. We have found these clusters to be unique in the sense that each represents a different type of fault.
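As an illustration, the clustering step can be sketched as nearest-centroid classification on the (mean, peak-to-peak) features described above. The reference centroids, trace values and fault labels below are purely hypothetical, not measured values from the project:

```python
import math

# Hypothetical reference clusters: (mean, peak-to-peak) centroids of the
# load-force feature vectors recorded for each simulated condition.
REFERENCE_CLUSTERS = {
    "healthy":              (1.2, 0.8),
    "tight_lock_reverse":   (1.9, 1.1),
    "12mm_obstruction_toe": (2.6, 2.3),
    "back_drive_slack_lhs": (0.7, 1.7),
}

def features(trace):
    """Reduce a load-sensor trace to a (mean, peak-to-peak) feature vector."""
    mean = sum(trace) / len(trace)
    return (mean, max(trace) - min(trace))

def classify(trace):
    """Assign the trace to the nearest reference cluster (Euclidean distance)."""
    f = features(trace)
    return min(REFERENCE_CLUSTERS,
               key=lambda c: math.dist(f, REFERENCE_CLUSTERS[c]))
```

In practice the reference database would hold many feature vectors per condition rather than a single centroid, but the nearest-cluster decision is the same.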

Figure 5: Force Data Clusters

Types of faults detectable

Tight lock on reverse side (sand on bearers both
sides) – Refers to the lock which holds the point in position after it has
changed direction. This lock prevents the point from moving out of
position because of vibration.

A 12-mm obstruction at toe on normal side – Simulates a piece of ballast impeding point motion between the toe of the switch rail (the mobile section of rail) and the stock rail.

Back drive slackened off at toe end on LHS – The
drive to the midpoint of the switch rail is only loosely connected to the
stretcher bar. The stretcher bar holds the mobile rails a fixed distance
apart.

Saturday, September 24, 2011

Background: Dynamic analysis of safety-critical software is an important phase of the Independent Verification and Validation of a system. EN 50128 details the methods that shall be used for this phase of the verification life cycle. This phase is so critical to the project outcome that it demands meticulous planning and organisation. Here we discuss the dynamic analysis methods suggested by the CENELEC standards for SIL 4 software.

Boundary Value Analysis

The aim of this method is to remove software errors occurring at parameter limits or boundaries. The input domain of the program is divided into a number of input classes. The tests should cover the boundaries and extremes of the classes. The tests check that the boundaries in the input domain of the specification coincide with those in the program. The use of the value zero, in a direct as well as in an indirect translation, is often error-prone and demands special attention:

Zero divisor;

Blank ASCII characters;

Empty stack or list element;

Null matrix;

Zero table entry.

Normally the boundaries for input have a direct correspondence to the boundaries of the output range. Test cases should be written to force the output to its limiting values. Consider also whether it is possible to specify a test case which causes the output to exceed the specification boundary values. If the output is a sequence of data, for example a printed table, special attention should be paid to the first and the last elements and to lists containing 0, 1 and 2 elements.
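A minimal sketch of boundary value testing, using a hypothetical speed-supervision function (the classes and limits are invented for illustration). The test cases sit at the extremes of each input class, with the value zero included explicitly:

```python
def braking_class(speed_kmh):
    """Hypothetical mapping of a speed reading to a supervision class.
    Input classes: invalid (< 0), low (0..40), normal (41..160), overspeed (> 160)."""
    if speed_kmh < 0:
        return "invalid"
    if speed_kmh <= 40:
        return "low"
    if speed_kmh <= 160:
        return "normal"
    return "overspeed"

# Boundary-value test cases: the extremes of each input class, plus zero.
cases = {-1: "invalid", 0: "low", 40: "low",
         41: "normal", 160: "normal", 161: "overspeed"}
for value, expected in cases.items():
    assert braking_class(value) == expected, value
```

An off-by-one error in any comparison (for example `< 40` instead of `<= 40`) is caught immediately by the boundary cases, while mid-range values would miss it.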

Error Guessing

The aim of this method is to remove common programming errors. Testing experience and intuition combined with knowledge and curiosity about the system under test may add some uncategorised test cases to the designed test case set. Special values or combinations of values may be error-prone. Some interesting test cases may be derived from inspection checklists. It may also be considered whether the system is robust enough. Can the buttons be pushed on the front-panel too fast or too often? What happens if two buttons are pushed simultaneously?

Error Seeding

The aim of this method is to ascertain whether a set of test cases is adequate. Some known error types are inserted into the program, and the program is executed with the test cases under test conditions. If only some of the seeded errors are found, the test case set is not adequate. The ratio of found seeded errors to the total number of seeded errors is an estimate of the ratio of found real errors to the total number of real errors. This gives a possibility of estimating the number of remaining errors, and thereby the remaining test effort. The detection of all the seeded errors may indicate either that the test case set is adequate, or that the seeded errors were too easy to find. The limitation of the method is that, in order to obtain any usable results, the error types as well as the seeding positions must reflect the statistical distribution of real errors.
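The estimate described above can be written out directly. The counts in the usage comment are hypothetical:

```python
def estimate_remaining_errors(seeded, seeded_found, real_found):
    """Estimate real errors still in the program from the seeded-error ratio:
    seeded_found / seeded  ~  real_found / total_real."""
    if seeded_found == 0:
        raise ValueError("no seeded errors found; the test set tells us nothing")
    total_real = real_found * seeded / seeded_found
    return total_real - real_found  # estimated errors remaining

# Hypothetical figures: 20 errors seeded, 16 of them found, 40 real errors
# found so far -> estimated total 40 * 20/16 = 50, i.e. about 10 remaining.
```
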

Performance Modelling

The aim of the method is to ensure that the working capacity of the system is sufficient to meet the specified requirements. The requirements specification includes throughput and response requirements for specific functions, perhaps combined with constraints on the use of total system resources. The proposed system design is compared against the stated requirements by:

Defining a model of the system processes, and their interactions,

Identifying the use of resources by each process (for example, processor time, communications bandwidth, storage devices, etc.),

Identifying the distribution of demands placed upon the system under average and worst-case conditions,

Computing the mean and worst-case throughput and response times for the individual system functions.

For simple systems an analytic solution may be possible, whilst for more complex systems some form of simulation is required to obtain accurate results. Before detailed modelling, a simpler 'resource budget' check can be used which sums the resource requirements of all the processes. If the requirements exceed the designed system capacity, the design is infeasible. Even if the design passes this check, performance modelling may show that excessive delays and response times occur due to resource starvation. To avoid this situation, engineers often design systems to use only some fraction (e.g. 50%) of the total resources so that the probability of resource starvation is reduced.
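The simple 'resource budget' check can be sketched as follows; the process names, demand figures and capacities are invented for illustration:

```python
# Hypothetical per-process resource demands (CPU %, bandwidth kbit/s, RAM kB).
processes = {
    "interlocking_logic": {"cpu": 15, "bandwidth": 30,  "ram": 512},
    "wayside_comms":      {"cpu": 10, "bandwidth": 120, "ram": 256},
    "self_diagnostics":   {"cpu": 5,  "bandwidth": 10,  "ram": 128},
}
capacity = {"cpu": 100, "bandwidth": 500, "ram": 2048}
headroom = 0.5  # design to use only 50% of total resources

def budget_check(processes, capacity, headroom):
    """Sum per-process demands and compare with the derated system capacity."""
    totals = {r: sum(p[r] for p in processes.values()) for r in capacity}
    return {r: totals[r] <= headroom * capacity[r] for r in capacity}
```

A False entry for any resource means the design is infeasible before any detailed modelling is attempted; passing the check is necessary but, as noted above, not sufficient.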

Equivalence Classes and Input Partition Testing

The aim of this method is to test the software adequately using a minimum of test data. The test data is obtained by selecting the partitions of the input domain required to exercise the software. This testing strategy is based on the equivalence relation of the inputs, which determines a partition of the input domain.

Test cases are selected with the aim of covering all subsets of this partition. At least one test case is taken from each equivalence class. There are two basic possibilities for input partitioning which are:

Equivalence classes may be defined on the specification. The interpretation of the specification may be either input oriented (for example, the values selected are treated in the same way) or output oriented (for example, the set of values leading to the same functional result); and

Equivalence classes may be defined on the internal structure of the program. In this case the equivalence classes are determined from static analysis of the program, for example the set of values leading to the same path being executed.
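A minimal sketch of output-oriented partitioning, using a hypothetical point-command validator: the input domain is split into one equivalence class per functional result, and a single representative test case is drawn from each class:

```python
# Hypothetical specification: a point command is accepted only for the
# values "NORMAL" and "REVERSE"; every other value is rejected.
def accept_command(cmd):
    return cmd in ("NORMAL", "REVERSE")

# Output-oriented partition: one equivalence class per functional result,
# one representative test case from each class.
partitions = {
    "accepted_normal":  ("NORMAL", True),
    "accepted_reverse": ("REVERSE", True),
    "rejected":         ("LEFT", False),  # stands in for all invalid tokens
}
for name, (value, expected) in partitions.items():
    assert accept_command(value) == expected, name
```

Three test cases cover the whole (unbounded) input domain because every other input is, by the equivalence relation, treated the same way as one of the representatives.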

Structure Based Testing

The aim of this method is to apply tests which exercise certain subsets of the program structure. Based on an analysis of the program, a set of input data is chosen such that a large fraction of the selected program elements are exercised. The program elements exercised can vary depending upon the level of rigour required.

Statements: This is the least rigorous test since it is possible to execute all code statements without exercising both branches of a conditional statement.

Branches: Both sides of every branch should be checked. This may be impractical for some types of defensive code.

Compound Conditions: Every condition in a compound conditional branch (i.e. conditions linked by AND/OR) is exercised.

LCSAJ: A linear code sequence and jump is any linear sequence of code statements including conditional jumps terminated by a jump. Many potential sub-paths will be infeasible due to constraints on the input data imposed by the execution of earlier code.

Data Flow: Execution paths are selected on the basis of data usage, for example a path where the same variable is both written and read.

Call Graph: A program is composed of subroutines which may be invoked from other subroutines. The call graph is the tree of subroutine invocations in the program. Tests are designed to cover all invocations in the tree.

Entire Path: Execute all possible paths through the code. Complete testing is normally infeasible due to the very large number of potential paths.
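The gap between statement and branch coverage can be sketched with a hypothetical axle-counter check, instrumented to record which branches actually run:

```python
# Record which branches execute so a test set can be checked for coverage.
covered = set()

def check_axle_count(entered, cleared):
    if entered == cleared:      # branch 1: counts agree
        covered.add("equal")
        return "section clear"
    else:                       # branch 2: counts disagree
        covered.add("unequal")
        return "section occupied"

# A single call would satisfy statement coverage of its own path only;
# branch coverage requires both sides of the conditional to be exercised.
check_axle_count(4, 4)
check_axle_count(4, 3)
assert covered == {"equal", "unequal"}
```

Real structural-coverage measurement is done with instrumentation tools rather than hand-written sets, but the criterion being checked is the same.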

Saturday, September 17, 2011

From the desk of

Sandeep Patalay

The Computer Based Interlocking Architecture

Solid state interlocking systems for railways should ensure the following:

Fail safety

Availability

Reliability

Maintainability

Architecture and methodology

Generally, the following three types of redundancy techniques are used to achieve fail-safety in the design of signalling systems:

Hardware Redundancy – In this case, two or more hardware modules of identical design running common software carry out the safety functions, and their outputs are continuously compared. The hardware units operate in a tightly synchronised mode, with comparison of outputs in every clock cycle. Due to this tight synchronisation, it is not possible to use diverse hardware or software. Although random failures are taken care of, it is difficult to ensure detection of systematic failures because identical hardware and software are used.

Software Redundancy – This approach uses a single hardware unit with diverse software. The two software modules are developed independently and generally utilise inverted data structures to take care of common-mode failures. However, rigorous self-check procedures must be adopted to compensate for the use of a single hardware unit.

Hybrid Model – The hardware units are loosely synchronised: they operate in alternate cycles, and the outputs are compared after full operation of the two modules. It is therefore no longer required to use identical hardware and software. Although the systems installed in the field utilise identical hardware and software, the architecture permits the use of diverse hardware and software. Moreover, operation of the two units in alternate cycles permits the use of a common system bus and interface circuitry.

To meet the above requirements, the hardware and software are designed accordingly. There are various techniques, as discussed below:

Time Redundancy

The same software is executed on the same hardware during two different time intervals.

(Refer: Figure 5: Time Redundancy)

Errors caused by transients are avoided by reading at two different time intervals.

A single hardware fault leads to shutdown of the system. This method is not used on its own, since software faults are not completely found in validation, and the self-diagnostics employed for checking hardware faults are not complete.

Hardware Redundancy

The same software is executed on two identical hardware channels.

(Refer: Figure
6: Hardware Redundancy)

Hardware faults are detected, since the outputs from both channels are compared, and a single hardware fault does not lead to shutdown of the system.

Software faults are not detected, since the same software runs on the two identical hardware channels; software faults introduced at the design stage remain undetected.

Diverse Hardware

Identical software is executed on different hardware versions.

(Refer: Figure 7: Hardware Diversity)

Hardware design faults introduced at the initial stage are detected.

Software faults at the design stage are still not detected.

Diverse Software

Different software versions are executed on the same hardware during two different time intervals.

(Refer: Figure 8: Software Diversity)

Software faults at the design stage are detected.

Even though the software is diverse, both versions execute on a single hardware channel, so a single hardware fault leads to shutdown of the system.

Diverse Software on Redundant Hardware

Different software versions are executed on two identical hardware channels.

(Refer: Figure 9: Diverse software on redundant hardware)

Software faults at the design stage are detected, and a single hardware fault does not lead to system shutdown.
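The comparison logic behind these architectures can be sketched in miniature. Here both "channels" run in one process purely for illustration (on a real system they would run on separate hardware); channel B works on an inverted data structure, as mentioned for software redundancy above, and any disagreement drives the system to its safe state:

```python
# Two diverse implementations of the same safety function. Channel B
# operates on inverted (one's-complement) data so that common-mode faults
# corrupt the two channels differently and show up on comparison.
def channel_a(route_bits):
    return route_bits & 0b1111            # direct computation on 4 bits

def channel_b(route_bits):
    inverted = ~route_bits & 0b1111       # work on the inverted data
    return ~inverted & 0b1111             # re-invert before comparison

def safe_output(route_bits):
    """Compare both channels; a mismatch forces the safe (shutdown) state."""
    a, b = channel_a(route_bits), channel_b(route_bits)
    if a != b:
        raise SystemError("channel disagreement - reverting to safe state")
    return a
```

The essential property is that the output is only released when both diverse computations agree; anything else is treated as a fault and the system falls back to its restrictive safe state.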