An Engineering Approach Using MATLAB

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, except as permitted by law. Advice on how to obtain permission to reuse material from this title is available at http://www.wiley.com/go/permissions.

The right of Bangjun Lei, Dick de Ridder, David M. J. Tax, Ferdinand van der Heijden, Guangzhu Xu, Ming Feng, Yaobin Zou to be identified as the authors of this work has been asserted in accordance with law.

For details of our global editorial offices, customer services, and more information about Wiley products visit us at www.wiley.com.

Wiley also publishes its books in a variety of electronic formats and by print-on-demand. Some content that appears in standard print versions of this book may not be available in other formats.

Limit of Liability/Disclaimer of Warranty: MATLAB® is a trademark of The MathWorks, Inc. and is used with permission. The MathWorks does not warrant the accuracy of the text or exercises in this book. This work's use or discussion of MATLAB® software or related products does not constitute endorsement or sponsorship by The MathWorks of a particular pedagogical approach or particular use of the MATLAB® software. While the publisher and authors have used their best efforts in preparing this work, they make no representations or warranties with respect to the accuracy or completeness of the contents of this work and specifically disclaim all warranties, including without limitation any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives, written sales materials or promotional statements for this work. The fact that an organization, website, or product is referred to in this work as a citation and/or potential source of further information does not mean that the publisher and authors endorse the information or services the organization, website, or product may provide or recommendations it may make. This work is sold with the understanding that the publisher is not engaged in rendering professional services. The advice and strategies contained herein may not be suitable for your situation. You should consult with a specialist where appropriate. Further, readers should be aware that websites listed in this work may have changed or disappeared between when this work was written and when it is read. Neither the publisher nor authors shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.

Figure 3.4 Probability densities of the measurements shown in Figure 3.3. (a) The 3D plot of the unconditional density together with a 2D contour plot of this density on the ground plane. (b) 2D contour plots of the conditional probability densities.

Figure 4.2 Different realizations of the backscattering coefficient and its corresponding measurement.

Figure 4.3 Parameter estimation.

Figure 4.4 Probability densities for the backscattering coefficient. (a) Prior density p(x). (b) Conditional density p(z|x) with Nprobes = 8. The two axes have been scaled with x and 1/x, respectively, to obtain invariance with respect to x.

Figure 4.5 Three different Bayesian estimators.

Figure 4.6 MAP estimation, ML estimation and linear MMSE estimation.

Figure 4.7 The bias and the variance of the various estimators in the backscattering problem.

Figure 4.8 LS estimation of the diameter D and the position y0 of a blood vessel. (a) X-ray image of the blood vessel. (b) Cross-section of the image together with a fitted profile. (c) The sum of least squares errors as a function of the diameter and the position.

Figure 4.9 A robust error norm and its derivative.

Figure 4.10 LS estimation of the diameter D and the position y0. (a) Cross-section of the image together with a profile fitted with the LS norm. (b) The LS norm as a function of the diameter and the position.

Figure 4.11 Robust estimation of the diameter D and the position y0. (a) Cross-section of the image together with a profile fitted with a robust error norm. (b) The robust error norm as a function of the diameter and the position.

Figure 4.12 Determination of a calibration curve by means of polynomial regression.

Figure 4.13 A family tree of estimators.

Chapter 5

Figure 5.1 A density control system for the process industry.

Figure 5.2 An overview of online estimation.

Figure 5.3 Motion tracking: both system and measurement errors are white noises with covariance 1.

Figure 5.17 Representation of a probability density. (a) A density p(x). (b) The proposal density q(x). (c) The 40 samples of q(x). (d) Importance sampling of p(x) using the 40 samples of q(x). (e) Selected samples from (d) as an equally weighted sample representation of p(x).

Figure 5.18 Application of particle filtering to the density estimation problem. (a) Real states and measurements. (b) The particles obtained at i = 511. (c) Results.

Figure 5.19 A general GA-based state estimator.

Figure 5.20 GA-based state estimation for the jet engine compressor model, ΨJ(0) = 0.5, RJ(0) = 0.1 and ΦJ(0) = 0.8. Plot 1 corresponds to ΨJ, plot 2 corresponds to ΦJ and plot 3 corresponds to RJ. The dotted lines in the right two plots indicate the actual state, while the solid lines indicate the state estimates. The dash–dot line in plot 1 indicates the reference input for ΨJ (adopted from James et al., 2000).

Figure 6.4 Probability densities of the measurements shown in Figure 6.1. (a) The 3D plot of the Parzen estimate of the unconditional density together with a 2D contour plot of this density on the ground plane. The parameter σh was set to 0.0486. (b) The resulting decision boundaries. (c) Same as (a), but with σh set to 0.0176. (d) Same as (b), for the density estimate shown in (c).

Figure 6.12 A two-layer feedforward neural network with two input dimensions and one output (for presentation purposes, not all connections have been drawn).

Figure 6.13 Application of AdaBoost with a linear perceptron as the weak classifier. (a) The 200 base classifiers. (b) The classification result of the strong classifier obtained by combining the 200 weak classifiers.

Figure 6.14 Application of AdaBoost with the decision stump as the weak classifier. (a) The 200 base classifiers. (b) The classification result of the strong classifier obtained by combining the 200 weak classifiers.

Figure 7.3 Error bounds and the true minimum error for the Gaussian case (C1 = C2). (a) The minimum error rate with some bounds given by the Chernoff distance. In this example the bound with s = 0.5 (Bhattacharyya upper bound) is the tightest. The figure also shows the Bhattacharyya lower bound. (b) The Chernoff bound as a function of s.

Figure 7.4 A top-down tree structure for feature selection.

Figure 7.5 A bottom-up tree structure for feature selection.

Figure 7.6 Character classifications for licence plate recognition. (a) Character sets from licence plates, before and after normalization. (b) Selected features. The number of features is 18 and 50, respectively.

Figure 8.7 Kernel PCA methodology. A training set is mapped from input space I to feature space F via a non-linear function ϕ. PCA is performed in F to determine the principal directions defining the kernel PCA space (learned space): oval area. Any element of I can then be mapped to F and projected on the kernel PCA space via Plϕ, where l refers to the first l eigenvectors used to build the KPCA space.

Figure 8.8 The example of reducing dimensionality of a 2D circle by using KPCA.

Figure 8.12 The development of the cluster means during 10 update steps of the K-means algorithm.

Figure 8.13 Two results of K-means clustering applied to the ‘mechanical parts’ dataset.

Figure 8.14 Two results of the EM algorithm for the mixture of Gaussians estimation.

Figure 8.15 The development of a one-dimensional self-organizing map, trained on a two-dimensional uniform distribution: (a) initialization; (b) to (d) after 10, 25 and 100 iterations, respectively.

Figure 8.16 A SOM that visualizes the effects of a highlight. (a) RGB image of an illuminated surface with a highlight (= glossy spot). (b) Scatter diagram of RGB samples together with a one-dimensional SOM.

Figure 9.5 The classification error by the 1NN rule as a function of the number of eigenvectors.

Figure 9.6 The total Kimia dataset.

Figure 9.7 The selected classes.

Figure 9.8 The computed contours.

Figure 9.9 Scatterplot.

Figure 9.10 Scatterplots of the Boston Housing dataset. The left subplot shows features STATUS and INDUSTRY, where the discrete nature of INDUSTRY can be spotted. In the right subplot, the dataset is first scaled to unit variance, after which it is projected on to its first two principal components.

Figure 9.11 Performance of a polynomial kernel SVC (left, as a function of the degree of the polynomial) and a radial basis function kernel SVC (right, as a function of the basis function width).

Figure 9.12 Performance of neural networks with one or two hidden layers as a function of the number of units per hidden layer, trained using bpxnc (left) and neurc (right).

Figure 9.13 Setup of a sensory system for acoustic distance measurements.

Figure 9.14 A record of the nominal response h(t).

Figure 9.15 ToF measurements based on thresholding operations.

Figure 9.16 ToF estimation by fitting a one-sided parabola to the foot of the envelope.

Preface

Information processing has always been an important factor in the development of human society and its role is still increasing. The invention of advanced information devices paved the way for achievements in a diversity of fields such as trade, navigation, agriculture, industry, transportation and communication. The term ‘information device’ refers here to systems for the sensing, acquisition, processing and outputting of information from the real world. Usually, they are measurement systems. Sensing and acquisition provide us with signals that bear a direct relation to some of the physical properties of the sensed object or process. Often, the information of interest is hidden in these signals. Signal processing is needed to reveal the information and to transform it into an explicit form. Furthermore, in the past 10 years image processing (together with intelligent computer vision) has developed rapidly. There have been substantial new developments in, for example, machine learning methods (such as AdaBoost and its variants, deep learning, etc.) and parameter estimation methods such as particle filtering.

The three topics discussed in this book, classification, parameter estimation and state estimation, share a common factor in the sense that each topic provides the theory and methodology for the functional design of the signal processing part of an information device. The major distinction between the topics is the type of information that is produced. In classification problems the output is discrete: a class, a label or a category. In estimation problems, it is a real-valued scalar or vector. Since these problems occur either in a static or in a dynamic setting, four different topics can actually be distinguished. The term state estimation refers to the dynamic setting. It covers both discrete and real-valued cases (and sometimes even mixed cases).

The similarity between the topics allows one to use a generic methodology, that is, Bayesian decision theory. Our aim is to present this material concisely and efficiently through an integrated treatment of similar topics. We present an overview of the core mathematical constructs and the many resulting techniques. By doing so, we hope that the reader recognizes the connections and the similarities between these constructs, but also becomes aware of the differences. For instance, the phenomenon of overfitting is a threat that ambushes all four cases. In a static classification problem it introduces large classification errors, but in the case of dynamic state estimation it may be the cause of unstable behaviour. Furthermore, in this edition we have made some modifications to accommodate engineering needs in intelligent computer vision.

Our goal is to emphasize the engineering aspects of the matter. Instead of a purely theoretical and rigorous treatment, we aim for the acquisition of the skills needed to bring theoretical solutions into practice. The models that are needed for the application of the Bayesian framework are often not available in practice. This brings in the paradigm of statistical inference, that is, learning from examples. Matlab®1 is used as a vehicle to implement and to evaluate design concepts.

As alluded to above, the range of application areas is broad. Application fields are found within computer vision, mechanical engineering, electrical engineering, civil engineering, environmental engineering, process engineering, geo-informatics, bio-informatics, information technology, mechatronics, applied physics, and so on. The book is of interest to a range of users, from the first-year graduate-level student up to the experienced professional. The reader should have some background knowledge with respect to linear algebra, dynamic systems and probability theory. Most educational programmes offer courses on these topics as part of undergraduate education. The appendices contain reviews of the relevant material. Another target group is formed by the experienced engineers working in industrial development laboratories. The numerous examples of Matlab® code allow these engineers to quickly prototype their designs.

The book roughly consists of three parts. The first part, Chapter 2, presents an introduction to PRTools, which is used throughout this book. The second part, Chapters 3, 4 and 5, covers the theory of classification and estimation problems in both the static and the dynamic case. This part handles problems where it is assumed that accurate models describing the physical processes are available. The third part, Chapters 6 to 8, deals with the more practical situation in which these models are not, or only partly, available. Either these models must be built using experimental data or the data must be used directly to train methods for estimation and classification. The final chapter presents three worked-out problems. The selected bibliography has been kept short in order not to overwhelm the reader with an enormous list of references.

The material of the book can be covered by two one-semester courses. One possibility is to use Chapters 3, 4, 6, 7 and 8 for a one-semester course on Classification and Estimation, which deals with the static case. An additional one-semester course on the dynamic case, that is, Optimal Dynamic Estimation, would use Chapter 5. The prerequisites for Chapter 5 are mainly concentrated in Chapter 4. Therefore, it is recommended to include a review of Chapter 4 in the second course; such a review will make the second course independent of the first one.

Each chapter closes with a number of exercises. The mark at the end of each exercise indicates whether the exercise is considered easy (‘0’), moderately difficult (‘*’) or difficult (‘**’). Another way to acquire practical skills is offered by the projects that accompany the text, available at the companion website. A project is an extensive task to be undertaken by a group of students. The task is situated within a given theme, for instance, classification using supervised learning, unsupervised learning, parameter estimation, dynamic labelling or dynamic estimation. Each project consists of a set of instructions together with the data that should be used to solve the problem.

The use of Matlab® tools is an integrated part of the book. Matlab® offers a number of standard toolboxes that are useful for parameter estimation, state estimation and data analysis. The standard software for classification and unsupervised learning is not complete and not well structured. This motivated us to develop the PRTools software for all classification tasks and related items. PRTools is a Matlab® toolbox for pattern recognition. It is freely available for non-commercial purposes. The version used in the text is compatible with Matlab® Version 5 and higher. It is available from http://37steps.com.
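As an illustration of the style of the toolbox, the following minimal session (a sketch, assuming PRTools is installed and on the Matlab® path; the dataset sizes and the choice of classifier are arbitrary) generates a two-class dataset, trains a simple classifier and visualizes the result:

```matlab
% Minimal PRTools sketch (assumes PRTools is on the Matlab path).
a = gendatb([50 50]);    % two-class 'banana' dataset, 50 objects per class
w = ldc(a);              % train a linear normal-density based classifier
e = a*w*testc;           % apparent (training set) classification error
scatterd(a);             % scatter plot of the dataset
plotc(w);                % overlay the decision boundary of w
```

The dataset–mapping syntax `a*w` is characteristic of PRTools: datasets and trained classifiers are Matlab® objects that can be trained, combined and evaluated with a small set of generic operations.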

The authors keep an open mind for any suggestions and comments (which should be addressed to cpese@wiley.com). A list of errata and any other additional comments will be made available at the companion website.

Note

1Matlab® is a registered trademark of The MathWorks, Inc. (http://www.mathworks.com).

Acknowledgements

We thank everyone who has made this book possible. Special thanks are given to Dr. Robert P. W. Duin for his contribution to the first version of this book and for allowing us to use PRTools and all materials on 37steps.com throughout this book. Thanks are also extended to Dr. Ela Pekalska for the courtesy of sharing documents of 37steps.com with us.