Automatic speech recognition has been widely studied and is already being applied in everyday use. Nevertheless, recognition performance is still a bottleneck in many practical applications of large-vocabulary continuous speech recognition: either the recognition speed is not sufficient, or errors in the recognition result limit the applications. This thesis studies two aspects of speech recognition, decoding and the training of acoustic models, to improve speech recognition performance in different conditions.

User-centered computer based learning is an emerging field of interdisciplinary research.
Research in diverse areas such as psychology, computer science, neuroscience and signal
processing is making contributions to take this field to the next level. Learning
systems built using contributions from these fields could be used in actual training and education
instead of just laboratory proof-of-concept. One of the important advances in this research is the
detection and assessment of the cognitive and emotional state of the learner using such systems.
This capability moves development beyond the use of traditional user performance metrics to
include system intelligence measures that are based on current theories in neuroscience. These
advances are of paramount importance to the success and widespread adoption of automated,
intelligent learning systems.
Emotion is considered an important aspect of how learning occurs, and yet estimating it
and making adaptive adjustments are not part of most learning systems. In this research we focus
on one specific aspect of constructing an adaptive and intelligent learning system, that is,
estimation of the learner's emotion while he or she is using the automated training system. The
challenge starts with the definition of emotion and its utility in human life. The next
challenge is to measure the co-varying factors of emotion in a non-invasive way, and to find
consistent features from these measures that are valid across a wide population. In this research we
use four physiological sensors that are non-invasive, and establish a methodology of utilizing the
data from these sensors using different signal processing tools. A validated set of visual stimuli
used worldwide in emotion and attention research, called the International Affective Picture
System (IAPS), is used. A dataset is collected from the sensors in an experiment designed to
elicit emotions from these validated visual stimuli. We describe a novel wavelet method to
calculate a hemispheric asymmetry metric from electroencephalography data. This method is
tested against the commonly used power spectral density method, and we show an overall improvement
in accuracy when classifying specific emotions with the novel method. We also show distinctions
between different discrete emotions from the autonomic nervous system activity using
electrocardiography, electrodermal activity and pupil diameter changes. Findings from the different
sensor features are used to derive guidelines for using each individual sensor in
an adaptive learning environment.
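The hemispheric asymmetry idea can be sketched in code. The following is a minimal illustration, not the actual wavelet method of this work: it uses a plain Haar decomposition (the wavelet family, decomposition depth, and band selection here are assumptions) to estimate band power in a left and a right EEG channel and form the usual log-ratio asymmetry index:

```python
import numpy as np

def haar_dwt(x):
    """One level of the Haar wavelet transform: returns (approximation, detail)."""
    x = x[: len(x) // 2 * 2]               # even length
    a = (x[0::2] + x[1::2]) / np.sqrt(2)
    d = (x[0::2] - x[1::2]) / np.sqrt(2)
    return a, d

def band_power_wavelet(x, levels=3):
    """Mean power of the detail coefficients at the deepest level."""
    a = np.asarray(x, dtype=float)
    for _ in range(levels):
        a, d = haar_dwt(a)
    return np.mean(d ** 2)

def asymmetry_index(left, right, levels=3):
    """ln(right power) - ln(left power): positive means more right-hemisphere power."""
    return np.log(band_power_wavelet(right, levels)) - np.log(band_power_wavelet(left, levels))
```

In a real pipeline the channels would be chosen as homologous electrode pairs (e.g. over the two frontal hemispheres) and the level picked so the detail band covers the frequency range of interest.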

In recent years, the amount of digital data stored and transmitted for private and public use has increased considerably. To allow safe transmission and storage of data over error-prone transmission media, error correcting codes are used. A large variety of codes has been developed, and in the past decade low-density parity-check (LDPC) codes, which have excellent error correction performance, have become more and more popular. Today, LDPC codes have been adopted in several standards, and efficient decoder hardware architectures are known for the chosen structured codes. However, the existing decoder designs lack flexibility, as only a few structured codes can be decoded with one decoder chip. Consequently, different codes require a redesign of the decoder, and few solutions exist for decoding codes which are not quasi-cyclic or which are unstructured.
In this thesis, three different approaches are presented for the implementation of fully programmable LDPC decoders which can decode arbitrary LDPC codes. As a design study, the first programmable decoder, which uses a heuristic mapping algorithm, is realized on a field-programmable gate array (FPGA), and error correction curves are measured to verify correct functionality. The main contribution of this thesis lies in the development of the second and third architectures and an appropriate mapping algorithm. The proposed fully programmable decoder architectures use one-phase message passing and layered decoding and can decode arbitrary LDPC codes using an optimum mapping and scheduling algorithm. The presented programmable architectures are in fact generalized decoder architectures from which the known decoder architectures for structured LDPC codes can be derived.
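The decoders above are hardware architectures, but the message-passing computation they implement can be illustrated in software. Below is a hedged sketch of min-sum decoding for an arbitrary parity-check matrix; the actual architectures use one-phase message passing with layered scheduling, whereas this flooding-schedule version is a simplification for clarity:

```python
import numpy as np

def minsum_decode(H, llr, iterations=20):
    """Min-sum decoding of an arbitrary LDPC code.
    H: (m, n) binary parity-check matrix; llr: channel LLRs (>0 favours bit 0)."""
    m, n = H.shape
    M = np.zeros((m, n))                      # check-to-variable messages
    bits = (llr < 0).astype(int)
    for _ in range(iterations):
        total = llr + M.sum(axis=0)           # posterior LLR per bit
        V = (total - M) * H                   # variable-to-check messages
        for i in range(m):
            idx = np.flatnonzero(H[i])
            v = V[i, idx]
            sgn = np.where(v < 0, -1.0, 1.0)
            prod_sgn = sgn.prod()
            absv = np.abs(v)
            for k, j in enumerate(idx):
                # exclude the edge's own message: sign and minimum over the others
                M[i, j] = prod_sgn * sgn[k] * np.delete(absv, k).min()
        bits = ((llr + M.sum(axis=0)) < 0).astype(int)
        if not (H @ bits % 2).any():          # all parity checks satisfied
            break
    return bits
```

Because nothing here depends on the structure of H, the same routine decodes quasi-cyclic and unstructured codes alike, which is exactly the flexibility the programmable architectures aim for in hardware.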

The author was sponsored by EnTegra Ltd, a company that develops hardware and software products and services for the real-time implementation of DSP and RF systems. The field programmable gate array (FPGA) is being used increasingly in the field of DSP, because the parallel computing power of such devices is ideal for today's demanding DSP algorithms. Algorithms such as the QR-RLS update are computationally intensive and must be carried out at extremely high speeds (MHz), which means that a DSP processor is simply not an option. ASICs can be used, but the expense of developing custom logic is prohibitive. The increased use of the FPGA in DSP means that there is a significant requirement for efficient arithmetic cores that utilise the resources on such devices. This thesis presents the research and development effort that was carried out to produce fixed-point division and square root cores for use in a new Electronic Design Automation (EDA) tool for EnTegra, which is targeted at FPGA implementation of DSP systems. Further to this, a new technique for predicting the accuracy of CORDIC systems computing vector magnitudes and cosines/sines is presented. This work allows the most efficient CORDIC design for a specified level of accuracy to be found quickly and easily, without the need to run lengthy simulations as was the case before. The CORDIC algorithm is a technique that uses mainly shifts and additions to compute many arithmetic functions and is thus ideal for FPGA implementation.
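As background for the accuracy analysis, the vectoring-mode CORDIC iteration for a vector magnitude can be sketched as follows (floating-point here for clarity; a hardware version would use fixed-point shifts, which is where accuracy prediction matters):

```python
def cordic_magnitude(x, y, iterations=16):
    """Vectoring-mode CORDIC: rotate (x, y) toward the x-axis using only
    shift-and-add style operations; the final x equals K*sqrt(x^2 + y^2),
    where K is the accumulated gain prod(sqrt(1 + 2**(-2*i)))."""
    x, y = abs(x), abs(y)                # magnitude is quadrant-independent
    gain = 1.0
    for i in range(iterations):
        gain *= (1.0 + 2.0 ** (-2 * i)) ** 0.5
        d = 1.0 if y > 0 else -1.0       # choose rotation that drives y to zero
        x, y = x + d * y * 2.0 ** -i, y - d * x * 2.0 ** -i
    return x / gain                      # remove the CORDIC gain
```

Each extra iteration roughly halves the residual rotation angle, so the iteration count trades latency against accuracy, the design trade-off the prediction technique addresses.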

Vibration isolation endeavors to reduce the transmission of vibration energy from one structure
(the source) to another (the receiver), to prevent undesirable phenomena such as sound
radiation. A well-known method for achieving this is passive vibration isolation (PVI). In the
case of PVI, mounts - consisting of springs and dampers - are used to connect the vibrating
source to the receiver. The stiffness of the mount determines the fundamental resonance frequency
of the mounted system and vibrations with a frequency higher than the fundamental
resonance frequency are attenuated. However, other design requirements (such
as static stability) often impose a minimum allowable stiffness, thus limiting the achievable
vibration isolation by passive means.
A more promising method for vibration isolation is hybrid vibration isolation control. In
this approach, in addition to PVI, an active vibration isolation control (AVIC) system is used
with sensors, actuators and a control system that compensates for vibrations in the lower
frequency range. Here, the use of a special form of AVIC using statically determinate stiff
mounts is proposed. The mounts establish a statically determinate system of high stiffness
connections in the actuated directions and of low stiffness connections in the unactuated
directions. The latter ensures PVI in the unactuated directions. This approach is called
statically determinate AVIC (SD-AVIC). The aim of the control system is to produce
anti-disturbance forces that counteract the disturbance forces stemming from the source. Using
this approach, the transfer of vibration energy from the source to the receiver is blocked in the
mount by these anti-disturbance forces.
This thesis deals with the design of controllers that generate the anti-disturbance forces by applying
techniques commonly used in the field of signal processing. The control approaches
- all model-based - include both adaptive and fixed-gain designs, feedforward and feedback
oriented. The control approaches are validated using two experimental vibration isolation
setups: a single reference single actuator single error sensor (SR-SISO) setup and a single
reference input multiple actuator input multiple error sensor output (SR-MIMO) setup.
Finding a plant model can be a problem. This is solved by using a black-box modelling
strategy. The plants are identified using subspace model identification. It is shown that
accurate linear models can be found in a straightforward manner by using small batches of
recorded (sampled) time-domain data only. Based on the identified models, controllers are
designed, implemented and validated.
Due to resonance in mechanical structures, adaptive SD-AVIC systems are often hampered
by slow convergence of the controller coefficients. In general, it is desirable that the SD-AVIC
system yields fast optimum performance after it is switched on. To achieve this result and
speed up the convergence of the adaptive controller coefficients, the so-called inverse outer
factor model is included in the adaptive control scheme. The inner/outer factorization, which must be performed to obtain the inverse outer factor model, is carried out entirely in
state space to enable a numerically robust computation. The inverse outer factor model is also
incorporated in the control scheme as a state space model. It is found that fast adaptation
of the controller coefficients is possible.
Controllers are designed, implemented and validated to suppress both narrowband and
broadband disturbances. Scalar regularization is used to prevent actuator saturation and an
unstable closed loop. In order to reduce the computational load of the controllers, several
steps are taken including controller order reduction and implementation of lower order models.
It is found that in all experiments the simulation and real-time results correspond closely for
both the fixed gain and adaptive control situation. On the SR-SISO setup, reductions up to
5.0 dB are established in real time for suppressing a broadband disturbance (0-2 kHz)
using feedback control. On the SR-MIMO vibration isolation setup, feedforward control
establishes reductions of broadband disturbances (0-1 kHz) of 9.4 dB in real time. Using
feedback control, reductions of up to 3.5 dB are established in real time (0-1 kHz). In the case of
the SR-MIMO setup, the values for the reduction are obtained by averaging the reductions
obtained in all sensor outputs.
The results pave the way for the next generation of algorithms for SD-AVIC.
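As a simplified illustration of the adaptive feedforward idea underlying such systems, the sketch below implements the standard filtered-x LMS algorithm. It is not the inner/outer-factorization-based scheme developed in this thesis, and the secondary-path model, tap count, and step size are illustrative assumptions:

```python
import numpy as np

def fxlms(reference, disturbance, sec_path, n_taps=32, mu=0.01):
    """Filtered-x LMS: adapt an FIR controller w so that the actuator signal,
    after passing through the secondary path, cancels the disturbance at the
    error sensor. Returns the final weights and the error-signal history."""
    w = np.zeros(n_taps)
    # reference filtered through the (assumed known) secondary-path model
    fx = np.convolve(reference, sec_path)[: len(reference)]
    xbuf = np.zeros(n_taps)      # recent reference samples
    fxbuf = np.zeros(n_taps)     # recent filtered-reference samples
    ybuf = np.zeros(len(sec_path))
    err = np.zeros(len(reference))
    for n in range(len(reference)):
        xbuf = np.roll(xbuf, 1); xbuf[0] = reference[n]
        fxbuf = np.roll(fxbuf, 1); fxbuf[0] = fx[n]
        y = w @ xbuf                                # actuator command
        ybuf = np.roll(ybuf, 1); ybuf[0] = y
        err[n] = disturbance[n] + sec_path @ ybuf   # residual at error sensor
        w -= mu * err[n] * fxbuf                    # gradient-descent update
    return w, err
```

The slow convergence mentioned above corresponds to a poorly conditioned filtered-reference correlation in this update; preconditioning with the inverse outer factor, as proposed in the thesis, is one way to accelerate it.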

The auditory system of living creatures provides useful information about the world, such as
the location and interpretation of sound sources. For humans, this means being able to focus one's
attention on events such as a phone ringing, a vehicle honking, or a person talking. For those
who do not suffer from hearing impairments, it is hard to imagine a day without being able to
hear, especially in a very dynamic and unpredictable world. Mobile robots would also benefit
greatly from having auditory capabilities.
In this thesis, we propose an artificial auditory system that gives a robot the ability to locate
and track sounds, as well as to separate simultaneous sound sources and recognise simultaneous
speech. We demonstrate that it is possible to implement these capabilities using an array
of microphones, without trying to imitate the human auditory system. The sound source localisation
and tracking algorithm uses a steered beamformer to locate sources, which are then
tracked using a multi-source particle filter. Separation of simultaneous sound sources is achieved
using a variant of the Geometric Source Separation (GSS) algorithm, combined with a multisource
post-filter that further reduces noise, interference and reverberation. Speech recognition
is performed on separated sources, either directly or by using Missing Feature Theory (MFT) to
estimate the reliability of the speech features.
The results obtained show that it is possible to track up to four simultaneous sound sources,
even in noisy and reverberant environments. Real-time control of the robot following a sound
source is also demonstrated. The sound source separation approach we propose is able to
achieve a 13.7 dB improvement in signal-to-noise ratio compared to a single microphone when
three speakers are present. In these conditions, the system demonstrates more than 80% accuracy
on digit recognition, higher than most human listeners could obtain in our small case study
when recognising only one of these sources. All these new capabilities will allow humans to
interact more naturally with a mobile robot in real life settings.
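The steered-beamformer idea behind the localisation stage can be sketched in a few lines. This is a generic frequency-domain delay-and-sum steered response power for a far-field linear array, given only as an illustration (the system in this thesis uses a more elaborate steered beamformer plus a particle filter, and the array geometry here is an assumption):

```python
import numpy as np

def steered_power(signals, fs, mic_x, angles, c=343.0):
    """Delay-and-sum steered response power for a linear microphone array.
    signals: (n_mics, n_samples); mic_x: mic positions along one axis (m);
    angles: candidate steering angles (rad). Returns one power per angle."""
    n_mics, n = signals.shape
    freqs = np.fft.rfftfreq(n, 1 / fs)
    S = np.fft.rfft(signals, axis=1)
    powers = []
    for th in angles:
        delays = mic_x * np.cos(th) / c          # far-field plane-wave delays
        # compensate each mic's delay in the frequency domain, then sum
        phased = S * np.exp(2j * np.pi * freqs * delays[:, None])
        y = np.fft.irfft(phased.sum(axis=0), n)
        powers.append(np.mean(y ** 2))
    return np.array(powers)
```

Scanning the angles and picking the power peak gives a direction estimate; a tracker (such as the particle filter used here) then smooths these instantaneous estimates over time.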

This dissertation is concerned with the possibilities of restoring degraded film sound.
The sound quality of old films is often not acceptable: the sound is so
noisy and distorted that the listener must make a real effort to understand the dialogue
in the film. In this case the film cannot give artistic enjoyment to the listener, which is
why several old films cannot be presented in cinemas or on television.
The quality of these films can be improved by digital restoration techniques. Since we
do not have access to the original signal, only the distorted one, we cannot adjust
recording parameters or recording techniques. The only possibility is to post-compensate
the signal to produce a better estimate of the undistorted, noiseless signal. In this dissertation,
new methods are proposed for the fast and efficient restoration of nonlinear distortions
in optically recorded film soundtracks.
First, nonlinear models and nonlinear restoration techniques are surveyed, and the
ill-posedness of nonlinear post-compensation (its extreme sensitivity to noise) is explained.
The effects and sources of linear and nonlinear distortions in optical soundtracks are also
described. A new method is proposed to overcome the ill-posedness of the restoration problem
and to obtain an optimal result. The effectiveness of the algorithm is demonstrated by simulations
and by the restoration of real film-sound signals.
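The ill-posedness can be made concrete with a toy memoryless example; the tanh nonlinearity, noise level, and gain bound below are illustrative assumptions, not the actual distortion model of optical soundtracks. Near saturation the exact inverse has enormous gain, so even a little noise is amplified violently, while bounding the inverse's gain trades a small residual distortion for stability:

```python
import numpy as np

def distort(x, g=3.0):
    """Toy memoryless soft-clipping nonlinearity."""
    return np.tanh(g * x) / g

def naive_inverse(y, g=3.0):
    """Exact inverse; its local gain 1/(1 - (g*y)^2) explodes near saturation,
    so channel noise near the clipping level is amplified enormously."""
    return np.arctanh(np.clip(g * y, -0.999999, 0.999999)) / g

def regularized_inverse(y, g=3.0, max_gain=50.0):
    """Inverse with the local gain capped at max_gain: the arctanh argument
    is clipped so that 1/(1 - (g*y)^2) <= max_gain."""
    r = np.sqrt(1.0 - 1.0 / max_gain)
    return np.arctanh(np.clip(g * y, -r, r)) / g
```

On a noisy, heavily driven test tone the gain-limited inverse yields a far smaller reconstruction error than the exact inverse, which is the essence of regularizing an ill-posed post-compensation problem.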

Motivation:
A man was interested in knowing of unknown from the very beginning of the human history. Our human eyes help us to investigate our environment by reflection of light. However, wavelengths of visible light allows transparent view through only a very small kinds of materials. On the other hand, Ultra WideBand (UWB) electromagnetic waves with frequencies of few Gigahertz are able to penetrate through almost all types of materials around us. With some sophisticated methods and a piece of luck we are able to investigate what is behind opaque walls. Rescue and security of the people is one of the most promising fields for such applications.
Rescue:
Imagine how useful can be information about interior of the barricaded building with terrorists and hostages inside for a policemen. The tactics of police raid can be build up on realtime information about ground plan of the room and positions of big objects inside. How useful for the firemen can be information about current interior state of the room before they get inside? Such hazardous environment, full of smoke with zero visibility, is very dangerous and each additional information can make the difference between life and death.
Security:
Investigating objects through plastic, rubber, dress or other nonmetallic materials could be highly useful as an additional tool to the existing x-ray scanners. Especially it could be used for scanning baggage at the airport, truckloads on borders, dangerous boxes, etc.

Audio time-scale modification is an audio effect that alters the duration of an audio
signal without affecting its perceived local pitch and timbral characteristics. There
are two broad categories of time-scale modification algorithms, time-domain and
frequency-domain. The computationally efficient time-domain techniques produce
high quality results for single pitched signals such as speech, but do not cope well
with more complex signals such as polyphonic music. The less efficient frequency-domain
techniques have proven to be more robust and produce high-quality results for a variety of signals; however, they introduce a reverberant artefact into the output.
This dissertation focuses on incorporating aspects of time-domain techniques into
frequency-domain techniques in an attempt to reduce the presence of the reverberant
artefact and improve upon computational demands.
From a review of prior work it was found that there are a number of time-domain
algorithms available and that the choice of algorithm parameters varies considerably
in the literature. This finding prompted an investigation into the effects of the choice
of parameters and a comparison of the various techniques employed in terms of
computational requirements and output quality. The investigation resulted in the
derivation of an efficient and flexible parameter set for use within time-domain
implementations.
Of the available frequency-domain approaches, the phase vocoder and time-domain/subband
techniques offer an efficiency and robustness advantage over
sinusoidal modelling and iterative phase update techniques, and as such were
identified as suitable candidates for the provision of a framework for further
investigation. Following from this observation, improvements in the quality produced
by time-domain/subband techniques are realised through the use of a Bark-based
subband partitioning approach and effective subband synchronisation techniques.
In addition, computational and output quality improvements within a phase vocoder
implementation are achieved by taking advantage of a certain level of flexibility in the
choice of phase within such an implementation. The phase flexibility established is used to push or pull phase values into a phase-coherent state. Further improvements
are realised by incorporating features of time-domain algorithms into the system in
order to provide a ‘good’ initial set of phase estimates; the transition time to ‘perfect’ phase coherence is significantly reduced by this scheme, thereby improving the overall output quality. The result is a robust and efficient time-scale modification algorithm which draws upon various aspects of a number of general approaches to time-scale modification.
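For reference, the core of a basic phase vocoder time stretch can be sketched as follows. This is the plain algorithm without any of the phase-locking improvements developed in this dissertation, and the frame size, hop, and window are arbitrary choices:

```python
import numpy as np

def time_stretch(x, rate, n_fft=1024, hop=256):
    """Basic phase-vocoder time-scale modification. rate < 1 lengthens the
    signal. Analysis hop = hop*rate, synthesis hop = hop; bin phases are
    propagated using the estimated instantaneous frequency."""
    win = np.hanning(n_fft)
    hop_a = max(1, int(round(hop * rate)))
    # expected phase advance of each bin over one analysis hop
    omega = 2 * np.pi * np.arange(n_fft // 2 + 1) * hop_a / n_fft
    out = np.zeros(int(len(x) / rate) + 2 * n_fft)
    phase = prev = None
    pos = 0
    for i in range(0, len(x) - n_fft, hop_a):
        spec = np.fft.rfft(win * x[i:i + n_fft])
        if phase is None:
            phase = np.angle(spec)
        else:
            dphi = np.angle(spec) - prev - omega
            dphi -= 2 * np.pi * np.round(dphi / (2 * np.pi))   # principal value
            phase += (omega + dphi) * hop / hop_a              # rescale advance
        prev = np.angle(spec)
        frame = np.fft.irfft(np.abs(spec) * np.exp(1j * phase))
        out[pos:pos + n_fft] += win * frame                    # overlap-add
        pos += hop
    return out[:pos + n_fft]
```

Because each bin's phase evolves independently, the vertical phase coherence across bins is lost over time, which is the source of the reverberant artefact this dissertation targets.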

Computers are changing the way sound and recorded music are listened to and
used. The use of computers to play back music makes it possible to change and
adapt music to different usage situations in ways that were not possible with
analog sound equipment. In this thesis, interaction with pre-recorded music is
investigated using prototypes and user studies.
First, different interfaces for browsing music on consumer or mobile devices
were compared. It was found that the choice of input controller, mapping and
auditory feedback influences how the music was searched and how the interfaces
were perceived. Search performance was not affected by the tested interfaces.
Based on this study, several ideas for the future design of music browsing interfaces
were proposed. Indications that search time depends linearly on distance
to target were observed and examined in a related study where a movement
time model for searching in a text document using scrolling was developed.
Second, work practices of professional disc jockeys (DJs) were studied and a
new design for digital DJing was proposed and tested. Strong indications were
found that the use of beat information could reduce the DJ’s cognitive workload
while maintaining flexibility during the musical performance. A system for
automatic beat extraction was designed based on an evaluation of a number of
perceptually important parameters extracted from audio signals.
Finally, auditory feedback in pen-gesture interfaces was investigated through
a series of informal and formal experiments. The experiments point to several
general rules of auditory feedback in pen-gesture interfaces: a few simple functions
are easy to achieve, gaining further performance and learning advantage
is difficult, the gesture set and its computerized recognizer can be designed to
minimize visual dependence, and positive emotional or aesthetic response can
be achieved using musical auditory feedback.

The author was sponsored by EnTegra Ltd, a company who develop hardware and software products and services for the real time implementation of DSP and RF systems. The field programmable gate array (FPGA) is being used increasingly in the field of DSP. This is due to the fact that the parallel computing power of such devices is ideal for today’s truly demanding DSP algorithms. Algorithms such as the QR-RLS update are computationally intensive and must be carried out at extremely high speeds (MHz). This means that the DSP processor is simply not an option. ASICs can be used but the expense of developing custom logic is prohibitive. The increased use of the FPGA in DSP means that there is a significant requirement for efficient arithmetic cores that utilises the resources on such devices. This thesis presents the research and development effort that was carried out to produce fixed point division and square root cores for use in a new Electronic Design Automation (EDA) tool for EnTegra, which is targeted at FPGA implementation of DSP systems. Further to this, a new technique for predicting the accuracy of CORDIC systems computing vector magnitudes and cosines/sines is presented. This work allows the most efficient CORDIC design for a specified level of accuracy to be found quickly and easily without the need to run lengthy simulations, as was the case before. The CORDIC algorithm is a technique using mainly shifts and additions to compute many arithmetic functions and is thus ideal for FPGA implementation.

In recent years, the amount of digital data which is stored and transmitted for private and public usage has increased considerably. To allow a save transmission and storage of data despite of error-prone transmission media, error correcting codes are used. A large variety of codes has been developed, and in the past decade low-density parity-check (LDPC) codes which have an excellent error correction performance became more and more popular. Today, low-density parity-check codes have been adopted for several standards, and eﬃcient decoder hardware architectures are known for the chosen structured codes. However, the existing decoder designs lack ﬂexibility as only few structured codes can be decoded with one decoder chip. In consequence, diﬀerent codes require a redesign of the decoder, and few solutions exist for decoding of codes which are not quasi-cyclic or which are unstructured.
In this thesis, three diﬀerent approaches are presented for the implementation of fully programmable LDPC decoders which can decode arbitrary LDPC codes. As a design study, the ﬁrst programmable decoder which uses a heuristic mapping algorithm is realized on an ﬁeld-programmable gate array (FPGA), and error correction curves are measured to verify the correct functionality. The main contribution of this thesis lies in the development of the second and the third architecture and an appropriate mapping algorithm. The proposed fully programmable decoder architectures use one-phase message passing and layered decoding and can decode arbitrary LDPC codes using an optimum mapping and scheduling algorithm. The presented programmable architectures are in fact generalized decoder architectures from which the known decoders architectures for structured LDPC codes can be derived.

Computers are changing the way sound and recorded music are listened to and
used. The use of computers to playback music makes it possible to change and
adapt music to different usage situations in ways that were not possible with
analog sound equipment. In this thesis, interaction with pre-recorded music is
investigated using prototypes and user studies.
First, different interfaces for browsing music on consumer or mobile devices
were compared. It was found that the choice of input controller, mapping and
auditory feedback influences how the music was searched and how the interfaces
were perceived. Search performance was not affected by the tested interfaces.
Based on this study, several ideas for the future design of music browsing interfaces
were proposed. Indications that search time depends linearly on distance
to target were observed and examined in a related study where a movement
time model for searching in a text document using scrolling was developed.
Second, work practices of professional disc jockeys (DJs) were studied and a
new design for digital DJing was proposed and tested. Strong indications were
found that the use of beat information could reduce the DJ’s cognitive workload
while maintaining flexibility during the musical performance. A system for
automatic beat extraction was designed based on an evaluation of a number of
perceptually important parameters extracted from audio signals.
Finally, auditory feedback in pen-gesture interfaces was investigated through
a series of informal and formal experiments. The experiments point to several
general rules of auditory feedback in pen-gesture interfaces: a few simple functions
are easy to achieve, gaining further performance and learning advantage
is difficult, the gesture set and its computerized recognizer can be designed to
minimize visual dependence, and positive emotional or aesthetic response can
be achieved using musical auditory feedback.

User-centered computer based learning is an emerging field of interdisciplinary research.
Research in diverse areas such as psychology, computer science, neuroscience and signal
processing is making contributions to take this field to the next level. Learning
systems built using contributions from these fields could be used in actual training and education
instead of just laboratory proof-of-concept. One of the important advances in this research is the
detection and assessment of the cognitive and emotional state of the learner using such systems.
This capability moves development beyond the use of traditional user performance metrics to
include system intelligence measures that are based on current theories in neuroscience. These
advances are of paramount importance in the success and wide spread use of learning systems
that are automated and intelligent.
Emotion is considered an important aspect of how learning occurs, and yet estimating it
and making adaptive adjustments are not part of most learning systems. In this research we focus
on one specific aspect of constructing an adaptive and intelligent learning system, that is,
estimation of the emotion of the learner as he/she is using the automated training system. The
challenge starts with the definition of the emotion and the utility of it in human life. The next
challenge is to measure the co-varying factors of the emotions in a non-invasive way, and find
consistent features from these measures that are valid across wide population. In this research we
use four physiological sensors that are non-invasive, and establish a methodology of utilizing the
data from these sensors using different signal processing tools. A validated set of visual stimuli
used worldwide in the research of emotion and attention, called International Affective Picture
System (IAPS), is used. A dataset is collected from the sensors in an experiment designed to
elicit emotions from these validated visual stimuli. We describe a novel wavelet method to
calculate hemispheric asymmetry metric using electroencephalography data. This method is
tested against typically used power spectral density method. We show overall improvement in
accuracy in classifying specific emotions using the novel method. We also show distinctions
between different discrete emotions from the autonomic nervous system activity using
electrocardiography, electrodermal activity and pupil diameter changes. Findings from different
features from these sensors are used to give guidelines to use each of the individual sensors in
the adaptive learning environment.

Motivation:
A man was interested in knowing of unknown from the very beginning of the human history. Our human eyes help us to investigate our environment by reflection of light. However, wavelengths of visible light allows transparent view through only a very small kinds of materials. On the other hand, Ultra WideBand (UWB) electromagnetic waves with frequencies of few Gigahertz are able to penetrate through almost all types of materials around us. With some sophisticated methods and a piece of luck we are able to investigate what is behind opaque walls. Rescue and security of the people is one of the most promising fields for such applications.
Rescue:
Imagine how useful can be information about interior of the barricaded building with terrorists and hostages inside for a policemen. The tactics of police raid can be build up on realtime information about ground plan of the room and positions of big objects inside. How useful for the firemen can be information about current interior state of the room before they get inside? Such hazardous environment, full of smoke with zero visibility, is very dangerous and each additional information can make the difference between life and death.
Security:
Investigating objects through plastic, rubber, dress or other nonmetallic materials could be highly useful as an additional tool to the existing x-ray scanners. Especially it could be used for scanning baggage at the airport, truckloads on borders, dangerous boxes, etc.

Audio time-scale modification is an audio effect that alters the duration of an audio
signal without affecting its perceived local pitch and timbral characteristics. There
are two broad categories of time-scale modification algorithms, time-domain and
frequency-domain. The computationally efficient time-domain techniques produce
high quality results for single pitched signals such as speech, but do not cope well
with more complex signals such as polyphonic music. The less efficient frequencydomain
techniques have proven to be more robust and produce high quality results for a variety of signals; however they introduce a reverberant artefact into the output.
This dissertation focuses on incorporating aspects of time-domain techniques into
frequency-domain techniques in an attempt to reduce the presence of the reverberant
artefact and improve upon computational demands.
From a review of prior work it was found that there are a number of time-domain
algorithms available and that the choice of algorithm parameters varies considerably
in the literature. This finding prompted an investigation into the effects of the choice
of parameters and a comparison of the various techniques employed in terms of
computational requirements and output quality. The investigation resulted in the
derivation of an efficient and flexible parameter set for use within time-domain
implementations.
Of the available frequency-domain approaches the phase vocoder and timedomain/
subband techniques offer an efficiency and robustness advantage over
sinusoidal modelling and iterative phase update techniques, and as such were
identified as suitable candidates for the provision of a framework for further
investigation. Following from this observation, improvements in the quality produced
by time-domain/subband techniques are realised through the use of a bark based
subband partitioning approach and effective subband synchronisation techniques.
In addition, computational and output quality improvements within a phase vocoder
implementation are achieved by taking advantage of a certain level of flexibility in the
choice of phase within such an implementation. The phase flexibility established is used to push or pull phase values into a phase-coherent state. Further improvements
are realised by incorporating features of time-domain algorithms into the system in
order to provide a ‘good’ initial set of phase estimates; the time taken to reach ‘perfect’ phase coherence is significantly reduced through this scheme, thereby improving the overall output quality produced. The result is a robust and efficient time-scale modification algorithm which draws upon various aspects of a number of general approaches to time-scale modification.

Multirate systems are building blocks commonly used in digital signal processing (DSP). Their function is to alter the rate of a discrete-time signal by adding or deleting a portion of its samples. They are essential in standard signal processing tasks such as signal analysis, denoising and compression. During the last decade, however, they have increasingly found applications in new and emerging areas of signal processing, as well as in several neighboring disciplines such as digital communications.
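The two elementary rate-altering blocks referred to above, the M-fold decimator and the L-fold expander, can be sketched in a few lines (in practice a decimator is preceded by an anti-aliasing filter and an expander is followed by an interpolation filter, both omitted here for brevity):

```python
import numpy as np

def decimate(x, M):
    """M-fold decimator: keep every M-th sample (no anti-alias filter here)."""
    return x[::M]

def expand(x, L):
    """L-fold expander: insert L-1 zeros between consecutive samples."""
    y = np.zeros(len(x) * L, dtype=x.dtype)
    y[::L] = x
    return y

x = np.arange(6)            # [0, 1, 2, 3, 4, 5]
down = decimate(x, 2)       # [0, 2, 4]
up = expand(down, 2)        # [0, 0, 2, 0, 4, 0]
```

Cascades of these blocks with filters in between form the multirate structures, such as filter banks and transmultiplexers, studied in the thesis.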
The main contribution of this thesis is aimed towards a better understanding of multirate systems and their use in modern communication systems. To this end, we first study a property of linear systems appearing in certain multirate structures. This property is called biorthogonal partnership, a term introduced recently to describe this class of filters. In the thesis we especially focus on extensions of this simple idea to the case of vector signals (MIMO biorthogonal partners) and to nonintegral decimation ratios (fractional biorthogonal partners).
The main results developed here concern the properties of biorthogonal partners, e.g., the conditions for the existence of stable and finite impulse response (FIR) partners. In this context we develop the parameterization of FIR solutions, which makes the search for the best partner in a given application analytically tractable. This proves very useful in their central application, namely, channel equalization in digital communications with signal oversampling at the receiver. A good channel equalizer in this context is one that helps neutralize the distortion introduced by channel propagation without amplifying the channel noise.
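As a toy illustration of FIR equalization of an FIR channel, an equalizer can be obtained by a least-squares fit of the overall channel-equalizer response to a delayed impulse. The helper below is a hypothetical sketch for intuition only, not the FIR-partner parameterization developed in the thesis, and it ignores noise:

```python
import numpy as np

def zf_fir_equalizer(h, eq_len, delay):
    """Least-squares FIR equalizer: conv(h, g) should approximate a delayed impulse."""
    n = len(h) + eq_len - 1
    H = np.zeros((n, eq_len))
    for k in range(eq_len):                 # Toeplitz (convolution) matrix of the channel
        H[k:k + len(h), k] = h
    d = np.zeros(n)
    d[delay] = 1.0                          # desired overall response: a pure delay
    g, *_ = np.linalg.lstsq(H, d, rcond=None)
    return g

h = np.array([1.0, 0.5, 0.2])               # toy minimum-phase FIR channel
g = zf_fir_equalizer(h, eq_len=12, delay=6)
overall = np.convolve(h, g)                 # approximately a unit impulse at index 6
```

Allowing a nonzero delay generally shrinks the residual considerably compared with forcing a zero-delay inverse, which is why the delay is usually a design parameter.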
In the second part of the thesis, we focus on another class of multirate systems, used at the transmitter side in order to introduce redundancy in the data stream. This redundancy generally serves to facilitate the equalization process by forcing certain structure on the transmitted signal. We first consider the transmission systems that introduce the redundancy in the form of a cyclic prefix. The examples of such systems include the discrete multitone (DMT) and the orthogonal frequency division multiplexing (OFDM) systems. We study the signal precoding in such systems, aimed at improving the performance by minimizing the noise power at the receiver.
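The role of the cyclic prefix mentioned above can be shown with a minimal simulation: because the prefix is at least as long as the channel memory, the linear channel acts circularly on each block, and equalization reduces to one complex division per subcarrier. The block size, prefix length and channel below are toy assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
N, cp = 64, 8                               # subcarriers and cyclic-prefix length
h = np.array([1.0, 0.4 + 0.3j, 0.2j])       # FIR channel shorter than the prefix

sym = rng.choice([1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j], N)   # QPSK symbol per subcarrier
x = np.fft.ifft(sym)                        # OFDM modulation (IDFT)
tx = np.concatenate([x[-cp:], x])           # prepend cyclic prefix

rx = np.convolve(tx, h)[: cp + N]           # linear channel, no noise in this sketch
y = np.fft.fft(rx[cp:])                     # strip prefix, back to frequency domain
H = np.fft.fft(h, N)                        # channel frequency response
est = y / H                                 # one-tap (zero-forcing) equalizer per subcarrier
```

In the noiseless case the recovered symbols match the transmitted ones exactly; with noise, the division by small values of H amplifies noise on weak subcarriers, which is precisely what the precoding studied here aims to mitigate.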
We also consider a different class of communication systems with signal redundancy, namely, multiuser systems based on code division multiple access (CDMA). We specifically focus on the special class of CDMA systems called 'a mutually orthogonal usercode receiver' (AMOUR). We show how to find the best equalizer from the class of zero-forcing solutions in such systems, and then increase the size of this class by employing alternative sampling strategies at the receiver.

Wireless communications systems are evolving to be more diverse in use and more ubiquitous in nature. It is of fundamental importance that we use the resources available in such systems, i.e., bandwidth and energy, efficiently, to preserve room for more users and to prolong system lifetime. Signal processing can greatly help us achieve this. In this thesis we consider improving the utilization of the resources available in wireless communications systems. The basic obstacle for most wireless communications systems is the multipath channel, which causes intersymbol interference; channel estimation is a crucial step in recovering the transmitted symbols. Moreover, as more devices are equipped with wireless capabilities, bandwidth becomes scarce, and it is important to allow more than one device or user to share the same frequency range or channel. However, this introduces multiuser interference, which again can be eliminated only if the channel is known. Furthermore, most wireless systems are battery powered, at least at the transmitter end, so it is crucial that energy consumption is minimized to preserve the longevity of the system.
The contribution of this thesis is threefold: (i) We propose novel bandwidth-efficient blind channel estimation algorithms for single input multiple output systems and for multiuser OFDM systems. The former exploits the cyclostationarity inherent in communications signals; the latter exploits the structure introduced into the transmitted signal via precoding. We consider the design of such precoders by optimizing performance metrics such as the bit error rate and the signal to interference plus noise ratio. (ii) In the multiuser case, we propose a novel cooperative OFDM system and show that, when users face significantly different channel conditions, cooperation can improve the performance of all cooperating users. (iii) We consider energy-efficient training-based channel estimation in large MIMO systems.
The goal there is to minimize energy consumption both in transmitting training symbols and in performing computations. We show that, by using a divide-and-conquer strategy to select the active set of transmitters and receivers, it is possible to minimize energy consumption without degrading the accuracy of the channel estimate.
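A minimal single-antenna example of the training-based estimation idea: with a known training sequence, the FIR channel follows from a least-squares fit of received samples to delayed training symbols. The function name and values below are illustrative assumptions; the thesis concerns the large MIMO case:

```python
import numpy as np

def ls_channel_estimate(train, rx, chan_len):
    """Least-squares FIR channel estimate from a known training sequence."""
    n = len(train)
    T = np.zeros((n, chan_len))
    for k in range(chan_len):               # Toeplitz matrix of delayed training symbols
        T[k:, k] = train[:n - k]
    h_hat, *_ = np.linalg.lstsq(T, rx[:n], rcond=None)
    return h_hat

rng = np.random.default_rng(1)
train = rng.choice([-1.0, 1.0], 64)         # BPSK training symbols known to the receiver
h = np.array([0.9, -0.4, 0.15])             # true channel (unknown to the receiver)
rx = np.convolve(train, h)                  # noiseless received training portion
h_hat = ls_channel_estimate(train, rx, chan_len=3)
```

In a large MIMO system every extra training symbol and every entry of the least-squares problem costs transmit and computation energy, which is what motivates restricting the active set of transmitters and receivers.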

Vibration isolation endeavors to reduce the transmission of vibration energy from one structure
(the source) to another (the receiver), to prevent undesirable phenomena such as sound
radiation. A well-known method for achieving this is passive vibration isolation (PVI). In the
case of PVI, mounts consisting of springs and dampers are used to connect the vibrating
source to the receiver. The stiffness of the mount determines the fundamental resonance frequency
of the mounted system and vibrations with a frequency higher than the fundamental
resonance frequency are attenuated. Unfortunately, however, other design requirements (such
as static stability) often impose a minimum allowable stiffness, thus limiting the achievable
vibration isolation by passive means.
A more promising method for vibration isolation is hybrid vibration isolation control. This
entails that, in addition to PVI, an active vibration isolation control (AVIC) system is used
with sensors, actuators and a control system that compensates for vibrations in the lower
frequency range. Here, the use of a special form of AVIC using statically determinate stiff
mounts is proposed. The mounts establish a statically determinate system of high stiffness
connections in the actuated directions and of low stiffness connections in the unactuated
directions. The latter ensures PVI in the unactuated directions. This approach is called
statically determinate AVIC (SD-AVIC). The aim of the control system is to produce anti-disturbance
forces that counteract the disturbance forces stemming from the source. Using
this approach, the transfer of vibration energy from the source to the receiver is blocked in the
mount by the anti-disturbance forces.
This thesis deals with the design of controllers generating the anti-forces by applying
techniques that are commonly used in the field of signal processing. The model-based control
approaches include both adaptive and fixed-gain variants, in feedforward and feedback
configurations. The control approaches are validated using two experimental vibration isolation
setups: a single-reference, single-actuator, single-error-sensor (SR-SISO) setup and a
single-reference, multiple-actuator, multiple-error-sensor (SR-MIMO) setup.
Finding a plant model can be a problem; this is solved by using a black-box modelling
strategy. The plants are identified using subspace model identification. It is shown that
accurate linear models can be found in a straightforward manner by using small batches of
recorded (sampled) time-domain data only. Based on the identified models, controllers are
designed, implemented and validated.
Due to resonance in mechanical structures, adaptive SD-AVIC systems are often hampered
by slow convergence of the controller coefficients. In general, it is desirable that the SD-AVIC
system reaches optimum performance quickly after it is switched on. To achieve this and to
speed up the convergence of the adaptive controller coefficients, the so-called inverse outer
factor model is included in the adaptive control scheme. The inner/outer factorization, which has to be performed to obtain the inverse outer factor model, is carried out entirely in
state space to enable a numerically robust computation. The inverse outer factor model is also
incorporated in the control scheme as a state space model. It is found that fast adaptation
of the controller coefficients is possible.
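The adaptive feedforward principle can be sketched with a normalized-LMS canceller. This toy version assumes an identity secondary path between actuator and error sensor, a simplification that the model-based SD-AVIC schemes avoid; all names and parameter values are illustrative:

```python
import numpy as np

def nlms_cancel(ref, dist, taps=16, mu=0.5, eps=1e-6):
    """Normalized-LMS feedforward canceller (secondary path assumed identity)."""
    w = np.zeros(taps)                      # adaptive controller coefficients
    err = np.zeros(len(ref))
    buf = np.zeros(taps)                    # delay line of recent reference samples
    for n in range(len(ref)):
        buf = np.roll(buf, 1)
        buf[0] = ref[n]
        y = w @ buf                         # anti-disturbance output
        err[n] = dist[n] - y                # residual at the error sensor
        w += mu * err[n] * buf / (eps + buf @ buf)   # normalized gradient step
    return err

rng = np.random.default_rng(2)
ref = rng.standard_normal(4000)                      # reference signal from the source
dist = np.convolve(ref, [0.8, -0.3, 0.1])[:4000]     # disturbance = filtered reference
err = nlms_cancel(ref, dist)
```

Once the coefficients converge, the residual error power drops far below the disturbance power; resonant plants slow this convergence down, which is exactly what the inverse outer factor model is introduced to remedy.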
Controllers are designed, implemented and validated to suppress both narrowband and
broadband disturbances. Scalar regularization is used to prevent actuator saturation and an
unstable closed loop. In order to reduce the computational load of the controllers, several
steps are taken including controller order reduction and implementation of lower order models.
It is found that in all experiments the simulation and real-time results correspond closely for
both the fixed-gain and adaptive control situations. On the SR-SISO setup, reductions of up to
5.0 dB are established in real-time for suppressing a broadband disturbance (0-2 kHz)
using feedback control. On the SR-MIMO vibration isolation setup, reductions of broadband
disturbances (0-1 kHz) of 9.4 dB are established in real-time using feedforward control. Using
feedback control, reductions of up to 3.5 dB are established in real-time (0-1 kHz). In the case of
the SR-MIMO setup, the reduction values are obtained by averaging the reductions
obtained over all sensor outputs.
The results pave the way for the next generation of algorithms for SD-AVIC.