Author Profile

Sunil Kumar Kopparapu

Recent Posts

Update: Now Google ASR needs a key! Steps for generating a Google Speech API key (thanks to my colleague Chirag Patel). Assuming you have a Google account (if you have a Gmail account, you can safely assume you have a Google account; else create one). Make sure you are a member of chromium-dev-discussion […]

“There is a traffic jam on a particular road, hence take an alternative route; now enjoy the music” is the typical message on the FM radio in the car. It got me thinking: I hear this almost every day when I am driving in Mumbai, India, and yet it is just treated like any other […]

How to make one? While on a morning walk (in areas rich in coconut trees) you will find these small baby coconuts below the coconut trees. Pick one of them and peel off the covering on the baby coconut. This exposes the soft part of the baby coconut. Now pick up two broom dried […]

Continuous density hidden Markov models (CD-HMMs) are doubly stochastic processes which are extensively used in speech and image signal processing. In isolated spoken word recognition systems, the words are usually modeled using HMMs. While continuous density HMMs are in extensive use, to most of the speech community the HMMs remain abstract, in the sense that there has been no good way of visualizing them. In this paper, we give a visual representation for an HMM. These visuals serve two purposes: they give the beginner in the area of speech technology a feel for HMMs, and they allow the HMMs of words to be compared quickly to check whether the models of any two words are similar, which could cause confusion. There is scope for improvement in visualizing HMMs, and we believe this is just a beginning.

A one-day session on 25.05.2010 with Frans Johansson at Hotel Leela, Mumbai

May 2, 2011

Background: The focus of the workshop was to enable people to experience innovation by participating in an innovation-creating environment rather than listening to a talk on innovation. In this sense the workshop was a participant-driven session led by the speaker. Frans Johansson is the author of ‘The Medici Effect’ published […]

With the increasing adoption of technology in day-to-day activity, it has become extremely important to build interfaces which are natural and convenient to use by all strata of society. When humans communicate, whether in writing or speech, the communication is full of intended meaning, and the language rules are generally tricky and complex. Even in this tricky scenario, people can very easily unpack the many nuanced allusions and connotations in every sentence and decode what someone else is saying. While on one hand computers are good at number crunching, on the other hand, when it comes to language processing, things get hard for computers. Technologists believe that being able to understand a question posed in everyday human natural language, and respond with a precise answer, is a sort of holy grail, because it would allow machines to converse more naturally with people, letting people ask questions instead of typing or speaking keywords. In this brief paper, we look at aspects of technologies that can allow humans to interact with machines as they would interact with another human, using natural language.

Low-level image processing is an essential first step in any machine vision application. Low-level vision processing tasks need good lighting in the work environment to function robustly. Hence, good and uniform illumination from external light sources is essential for machine vision applications. In this paper we suggest a design procedure to obtain uniform illumination on the scene being imaged using several light sources. We pose the problem of determining the optimal positions of the light sources as a minimisation problem. Simulation results show the effectiveness and suitability of the proposed procedure in illuminating the scene uniformly.
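As a toy illustration of posing source placement as a minimisation problem (the inverse-square illumination model, the variance objective, and the candidate grid below are assumptions for the sketch, not the paper's formulation): choose the positions of two point sources above a unit square so that the variance of the illumination over a sampling grid is smallest.

```python
import itertools

def illumination(point, sources, power=1.0):
    """Total illumination at a scene point from point sources at height h
    (simple inverse-square falloff; an assumed model, not the paper's)."""
    x, y = point
    total = 0.0
    for sx, sy, h in sources:
        d2 = (x - sx) ** 2 + (y - sy) ** 2 + h ** 2
        total += power / d2
    return total

def nonuniformity(sources, grid):
    """Objective to minimise: variance of illumination over the sample grid."""
    vals = [illumination(p, sources) for p in grid]
    mean = sum(vals) / len(vals)
    return sum((v - mean) ** 2 for v in vals) / len(vals)

# Scene: unit square sampled on a coarse 5x5 grid; two sources at fixed height.
grid = [(x / 4, y / 4) for x in range(5) for y in range(5)]
h = 1.0
candidates = [
    ((x1 / 4, 0.5, h), (x2 / 4, 0.5, h))
    for x1, x2 in itertools.product(range(5), repeat=2)
]
best = min(candidates, key=lambda s: nonuniformity(s, grid))
print("best source positions:", best)
```

Here a brute-force search over a small candidate set stands in for the optimisation step; the paper's procedure would use a proper minimisation routine over continuous source positions.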

In general, self-help systems are being increasingly deployed by service-based industries because they are capable of delivering better customer service, and increasingly the switch is to voice-based self-help systems because they provide a natural interface for a human to interact with a machine. A speech-based self-help system ideally needs a speech recognition engine to convert speech to text and, in addition, a language processing engine to take care of any misrecognitions by the speech recognition engine. Any off-the-shelf speech recognition engine is generally a combination of acoustic processing and speech grammar. While this is the norm, we believe that a speech recognition application should ideally have, in addition to the speech recognition engine, a separate language processing engine to give the system better performance. In this paper, we discuss ways in which the speech recognition engine and the language processing engine can be combined to give a better user experience.
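One common way to combine the two engines (a sketch, not necessarily the paper's scheme) is to have the language processing engine rescore the n-best list produced by the speech recognition engine. The bigram table, vocabulary size, weight, and all data below are assumed purely for illustration:

```python
import math

# Toy bigram counts standing in for a separate language-processing engine
# trained on a domain corpus (assumed data, not real).
BIGRAMS = {("my", "check"): 5, ("check", "balance"): 9, ("check", "valance"): 0}

def lm_score(words, alpha=1.0, vocab=1000):
    """Add-one smoothed bigram log-score; a crude stand-in for the LP engine."""
    return sum(
        math.log((BIGRAMS.get((w1, w2), 0) + alpha) / (vocab + alpha * vocab))
        for w1, w2 in zip(words, words[1:])
    )

def rerank(nbest, weight=0.5):
    """Combine the ASR acoustic log-score with the LM score; pick the best."""
    return max(nbest, key=lambda h: h[1] + weight * lm_score(h[0].split()))

# n-best list as (text, acoustic log-score) pairs from a hypothetical ASR engine:
# the acoustically top-ranked "valance" is a misrecognition of "balance".
nbest = [("my check valance", -10.0), ("my check balance", -10.5)]
print(rerank(nbest)[0])  # → my check balance
```

The language engine overturns the acoustic ranking because "check balance" is far more likely than "check valance" in the assumed domain corpus, which is exactly the kind of misrecognition repair the abstract argues a separate language processing engine can provide.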