Dr. Venu Govindaraju, SUNY Distinguished Professor of Computer Science and Engineering, is the founding director of the Center for Unified Biometrics and Sensors. He received his Bachelor’s degree with honors from the Indian Institute of Technology (IIT) in 1986, and his Ph.D. from UB in 1992. His research focus is on machine learning and pattern recognition in the domains of Document Image Analysis and Biometrics.

Dr. Govindaraju has co-authored about 400 refereed scientific papers. His seminal work in handwriting recognition was at the core of the first handwritten address interpretation system used by the US Postal Service. He was also the prime technical lead responsible for technology transfer to the postal services of the US, Australia, and the UK. He has been a Principal or Co-Investigator of sponsored projects with total funding of about 65 million dollars.

Dr. Govindaraju has supervised the dissertations of 35 doctoral students. He has served on the editorial boards of premier journals such as the IEEE Transactions on Pattern Analysis and Machine Intelligence and is currently the Editor-in-Chief of the IEEE Biometrics Council Compendium.

Awards

Govindaraju is a Fellow of the ACM (Association for Computing Machinery),[5] the IEEE (Institute of Electrical and Electronics Engineers),[6] the AAAS (American Association for the Advancement of Science),[7] the IAPR (International Association for Pattern Recognition),[8] and the SPIE (International Society for Optics and Photonics).[9]

•HP Open Innovation
•IBM Faculty Research
•Google Faculty Research
•eBay Faculty Research
•Fujitsu Faculty Research

Research Career

Govindaraju is a SUNY Distinguished Professor in the Department of Computer Science and Engineering at the University at Buffalo, The State University of New York.[13] He has been the Associate Director of the Center of Excellence for Document Analysis and Recognition (CEDAR) since 1995 and the founding director of the Center for Unified Biometrics and Sensors (CUBS) since its inception in 2003.

Conferences: Govindaraju has been General (Co-)Chair of 12 conferences and workshops, including the International Conference on Document Analysis and Recognition (ICDAR), and Program (Co-)Chair of 14 conferences and workshops, including Biometrics: Theory, Applications and Systems (BTAS). He has served as program chair of numerous conferences that span both the document analysis and biometrics areas.

Speaking: Govindaraju has given over a hundred invited talks, keynotes, plenaries and seminars, at prestigious venues including influential think tanks such as the NRC Intelligence Committee Workshop of the National Academy of Sciences where he presented on the topic of “Accelerated Discovery in the Era of Scientific Information Overload”.

Technical Accomplishments

Govindaraju’s research has focused on the application of machine learning and pattern recognition techniques to domains such as Document Analysis and Recognition and Biometrics and, in particular, the development of real-time engineered systems.

He has developed principled modeling approaches for pattern classification that have resulted in robust, scalable systems in a variety of application domains, from document processing to biometrics. He has designed several algorithms for cursive handwriting recognition suitable for real-time applications that demonstrated the benefits of innovative modeling of application constraints. His language-motivated hierarchical modeling has been extended to computer vision applications such as scene understanding and classifying activities and gestures in unconstrained videos. He has also made contributions to the theoretical foundations of a general fusion architecture and a taxonomy of trained combining functions (classifiers) and their input parameters, which provides a principled guideline for choosing a particular fusion technique.

Engineered Systems

Govindaraju's work in handwriting recognition[22][23] was at the core of the first handwritten address interpretation system used by the United States Postal Service (USPS). The learning-based system that he developed as project technical lead along with his colleagues at the University at Buffalo helps save the USPS hundreds of millions of dollars by automatically processing, and barcoding for precise delivery, over 25 billion letters a year.[24]

This work was highlighted in the Computing Community Consortium's symposium on Computing Research that Changed the World in 2009 as one of the most successful applications of Machine Learning for developing a real-time engineered system.[25][26]

The Government Executive publication reported in 1999 that "USPS issued a contract to researchers at the State University of New York at Buffalo to develop the handwriting recognition technology. It was first launched in 1997 right before the Christmas holiday season. One year later, an estimated 400 million pieces of mail were automatically routed during the Christmas season alone using the handwriting recognition technology. The new technology has saved the Postal Service at least $90 million in its first year in the field."[27]

USPS Engineering Vice President William J. Dowling singled out Lockheed Martin and its suppliers, the State University of New York at Buffalo, and Parascript, LLP, for their work in improving RCR performance.[28] Edward Kuebert, manager of image and telecommunications technology at USPS also credited improvements in reader technology to the State University of New York at Buffalo, Lockheed Martin Federal Systems and the Parascript Group of Boulder, Colo.[29]

Pattern Recognition Techniques

Govindaraju developed a suite of efficient and field-tested image processing routines for handling the contour representation in handwritten word images.[30][31] Departing from the myriad of heuristic approaches, he introduced a statistical approach to binarization and noise removal by modeling the degraded document as a Markov Random Field (MRF) where the prior is learnt from a training set of high quality images, and the probabilistic density is learnt on-the-fly.[32]
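The idea of treating binarization as labeling under a Markov Random Field can be illustrated with a minimal sketch. The sketch below is not the method of the cited work: it swaps in a fixed Ising-style smoothness prior and iterated conditional modes (ICM) in place of a prior learnt from high-quality training images, and the function name `binarize_icm`, the parameter `beta`, and the assumed ink and background intensities are all hypothetical choices made for illustration.

```python
def binarize_icm(gray, beta=0.2, ink_mean=0.0, bg_mean=1.0, sweeps=5):
    """Label each pixel as ink (1) or background (0).

    gray is a list of rows of intensities in [0, 1]. The data term is the
    squared distance to an assumed class intensity; the prior term charges
    beta for every 4-neighbour that carries a different label.
    """
    rows, cols = len(gray), len(gray[0])
    # Start from a plain global threshold.
    labels = [[1 if gray[r][c] < 0.5 else 0 for c in range(cols)]
              for r in range(rows)]

    def cost(r, c, label):
        mean = ink_mean if label == 1 else bg_mean
        data = (gray[r][c] - mean) ** 2
        disagree = 0
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            rr, cc = r + dr, c + dc
            if 0 <= rr < rows and 0 <= cc < cols and labels[rr][cc] != label:
                disagree += 1
        return data + beta * disagree

    # ICM: greedily re-label pixels until nothing changes.
    for _ in range(sweeps):
        changed = False
        for r in range(rows):
            for c in range(cols):
                new = 1 if cost(r, c, 1) < cost(r, c, 0) else 0
                if new != labels[r][c]:
                    labels[r][c] = new
                    changed = True
        if not changed:
            break
    return labels


# A vertical stroke in column 1 plus one isolated grey speck at (2, 3).
page = [[0.9] * 5 for _ in range(5)]
for row in page:
    row[1] = 0.1
page[2][3] = 0.4
cleaned = binarize_icm(page)
```

In this toy page a plain threshold at 0.5 would mark the speck as ink, whereas the neighbourhood term outweighs the weak data evidence and relabels it as background while the stroke is kept.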

Govindaraju modeled “active” recognition along the lines of the A* algorithm.[33] This method provides a multi-resolution framework for adapting to factors such as the quality of the input pattern, its intrinsic similarities with patterns of other classes, and the processing time available. Finer resolution is accorded only to certain “zones” of the input pattern that are deemed information bearing given the classes being discriminated.

Govindaraju has contributed to the combination (fusion) of pattern classifiers and showed that the optimal combination algorithm for identification systems is difficult to express analytically because of the dependencies between the matching scores that a single classifier assigns to different classes.[34] He developed the first taxonomy of the complexity of classifier combination methodologies and a guideline for choosing a particular type of fusion technique.[35]

Document Recognition and Retrieval

Handwriting Recognition Models

Govindaraju developed the first handwritten word recognition module suitable for real-time applications[36] using an innovative dynamic matching algorithm to assign automatically segmented pieces of words to lexical entities. Prior methods modeled either discrete features or continuous features but not both. The early models were extended to a new stochastic framework which modeled sequences of features that combined discrete symbols and continuous attributes.[37]
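The general idea of matching automatically segmented word pieces against lexicon entries can be sketched with dynamic programming. The sketch below is an illustration under stated assumptions, not the published algorithm: the toy character models in `TEMPLATES`, the 0/1 `char_cost` function, and the limit `MAX_SPAN` on how many segments one character may cover are all hypothetical.

```python
# Hypothetical toy character models: each character maps to the tuple of
# segment labels it is expected to produce after over-segmentation.
TEMPLATES = {
    "d": ("c", "l"), "o": ("o",), "g": ("g",), "i": ("i",),
    "c": ("c",), "a": ("a",), "t": ("t",),
}
MAX_SPAN = 2  # assumed maximum number of segments per character


def char_cost(segments, ch):
    """Toy cost: 0 if the segment group matches the character model, else 1."""
    return 0.0 if TEMPLATES.get(ch) == tuple(segments) else 1.0


def match_cost(segments, word, max_span=MAX_SPAN):
    """Minimum total cost of explaining all segments with the letters of word.

    best[k][j] is the cheapest way to match the first k characters of the
    word to the first j segments, where every character consumes between
    1 and max_span consecutive segments.
    """
    n, m = len(segments), len(word)
    inf = float("inf")
    best = [[inf] * (n + 1) for _ in range(m + 1)]
    best[0][0] = 0.0
    for k in range(1, m + 1):
        for j in range(1, n + 1):
            for span in range(1, min(max_span, j) + 1):
                prev = best[k - 1][j - span]
                if prev < inf:
                    cand = prev + char_cost(segments[j - span:j], word[k - 1])
                    if cand < best[k][j]:
                        best[k][j] = cand
    return best[m][n]


def rank_lexicon(segments, lexicon):
    """Order lexicon entries from best to worst match."""
    return sorted(lexicon, key=lambda w: match_cost(segments, w))


# The letter "d" has been over-segmented into a "c"-like and an "l"-like piece.
pieces = ["c", "l", "o", "g"]
ranking = rank_lexicon(pieces, ["cat", "dig", "dog"])
```

Because the lexicon drives the search, the segmenter does not have to decide whether the first two pieces form one letter or two; the matcher tries both groupings for every candidate word and keeps the cheapest.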

Govindaraju has incorporated the theories of reading and perception developed in psychology literature in analyzing handwritten words[38] and demonstrated their uses in postal address reading.[39] He has also contributed to improvement in word recognition accuracy of unconstrained handwritten documents by applying OCR correction techniques[40] in a bootstrapping framework where innovative topic categorization techniques are used to generate smaller topic-specific lexicons.[41]

His work on multi-lingual OCR spans recognition of Arabic[42] and Devanagari script.[43] His book on the OCR of Indic Scripts[44] is the first comprehensive book on this subject. Govindaraju developed a performance model that statistically "discovers" the relation between a word recognizer and the lexicon. It uses model parameters that capture a recognizer's ability to distinguish characters and its sensitivity to lexicon size. Such a model is useful in comparing word recognizers by predicting their performance based on the lexicon presented. He described the notion of “lexicon density” as a metric to measure the expected accuracy of handwritten word recognizers.[45][46]
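The intuition behind a density measure, namely that a lexicon of mutually similar words is harder for a recognizer than an equally large lexicon of dissimilar words, can be shown with a small sketch. The formula used here (the reciprocal of the mean pairwise distance) and the use of plain edit distance as the word distance are assumptions made for illustration; a recognizer-specific distance would normally take the place of `edit_distance`, and the names `edit_distance` and `lexicon_density` are hypothetical.

```python
from itertools import combinations


def edit_distance(a, b):
    """Levenshtein distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # delete ca
                           cur[j - 1] + 1,              # insert cb
                           prev[j - 1] + (ca != cb)))   # substitute
        prev = cur
    return prev[-1]


def lexicon_density(lexicon, distance=edit_distance):
    """Reciprocal of the mean pairwise distance between distinct words.

    Higher values mean the words lie closer together and are therefore
    easier to confuse with one another. Assumes the supplied distance is
    positive for distinct words.
    """
    words = sorted(set(lexicon))
    if len(words) < 2:
        raise ValueError("need at least two distinct words")
    pairs = list(combinations(words, 2))
    mean_distance = sum(distance(a, b) for a, b in pairs) / len(pairs)
    return 1.0 / mean_distance


dense = lexicon_density(["cat", "cot", "cut"])
sparse = lexicon_density(["cat", "dog", "sun"])
```

Both lexicons contain three words, yet under these assumptions the first is three times as dense as the second, which is the kind of difference that lexicon size alone cannot express.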

Document Analysis Systems

Govindaraju authored a widely cited system-level paper describing the architecture of an end-to-end system for reading unconstrained handwritten page images.[47] Prior research in handwriting recognition treated phrases as a concatenation of the constituent words. He presented a methodology that took advantage of the spacing between the words by modeling the word breaks as a feature of the individual writing style.[48][49] He developed a bank check reading system that could leverage the recognition of the legal amounts (written in longhand) in the decision making process along with the existing and more accurate recognition of courtesy amounts (numeric strings).[50] He enabled recognition of medical forms by modeling the relationships between handwriting and medical topics. This technique showed that a few automatically recognized characters can be used to construct a linguistic model capable of representing a medical topic category.[51]

Innovative Applications

Govindaraju developed the first simulation of human-like handwriting for designing CAPTCHAs to exploit the differential in handwriting reading proficiency between humans and machines.[52] The MIT Tech Review (Jan. 2009) highlighted the ingenuity of his Spambot-Fighting Strategy[53] which is grounded in cognitive science principles.

He has developed methods that stochastically model imperfect word segmentation inherent in handwriting[54] and techniques for transcript mapping[55] useful for indexing handwritten documents.

Biometrics

Fingerprint Templates and Matching

Govindaraju has explored enhancement and binarization techniques for fingerprint images, including Fourier analysis,[56] directional median filters,[57] Harris points,[58] and fingerprint quality measures.[59] He proposed a minutia extraction algorithm utilizing the contours of minutia ridges[60] and a matching method based on features extracted from minutia k-plets, formulating the matching algorithm as a minimum cost flow problem[61] and as a coupled breadth-first search algorithm.[62] The fingerprint templates stored in biometric databases typically contain the original sensor data, or features, from which an intruder can potentially reconstruct fake biometric samples. Govindaraju invented a fingerprint hashing method[63][64] in which only hashes are transmitted and stored in the database, and the original biometric data cannot be restored from them. He proposed a novel indexing method based on minutia k-plet paths[65] whose search time remains constant even as the number of enrolled persons increases.

Face and Facial Expression Analysis

Govindaraju proposed one of the earliest model-based face recognition methods[66][67] and developed a face matching system based on semantic descriptors.[68] He proposed a hybrid facial feature localization method based on graphical models and image sampling[69] and verified the individuality of facial expressions, demonstrating that either displacements of facial features[70] or the frequencies of particular expressions[71] could be used as biometric modalities. He developed methods for automated detection of deceit in facial expressions using changes in facial geometry and texture[72] and changes in eye movements.[73]

Smart Environments

Govindaraju formulated a probabilistic framework for person identification and tracking in smart environments consisting of a set of connected rooms,[74] in which multiple biometric modalities are coupled with a probabilistic track model. He investigated the concepts of computer security and online user behavior[75][76] and introduced methods for confirming the identity of online users.[77][78]

Writer Identification

Govindaraju has proposed that, although handwriting is unique to each writer, writing style represents a shared component of individual handwriting.[79] He modeled this conceptualization explicitly via a three-level hierarchical Bayesian framework (latent Dirichlet allocation, LDA) for the purposes of writer identification and verification.[80][81] In this text-independent model, each writer's handwriting is modeled as a distribution over a limited set of writing styles that are shared amongst writers. He has shown that, analogous to accents in speech, accents in writing can be treated as distinctive quirks unique to a group of people who share a common family of scripts, with roots in cultural and genetic factors.[82][83]