This International Workshop will
provide a combination of introductory, industry-relevant, R&D,
mathematical, and philosophical perspectives on Data Mining, Soft Computing,
and Rough and Fuzzy Sets. The presenters have interests in practical applications,
research and development, as well as academic aspects. This document describes
the talks and the invited presenters.

Data Mining

Rajendra Akerkar is a
Senior Scientist at West Norway Research Institute, Norway. He is also Chairman
of the Technomathematics Research Foundation, India. His research and teaching
assignments have taken him around the world to Germany, Japan, Spain, Holland,
Norway, Austria, Canada, Vietnam, and Armenia. He received the BOYSCAST (Young
Scientist) award from the Government of India, and was a recipient of the prestigious
DAAD (Germany) fellowship, DAAD Visiting Professorship and UNESCO-TWAS
Associateship. He serves as the editor-in-chief of the International
Journal of Computer Science & Applications and the Journal of Hybrid Computing
Research, is on the editorial board of numerous other computer science
journals, and serves as a program committee member for various international
conferences on data mining, semantic systems and cognitive technologies. He has
authored more than 80 articles in various international journals and
conferences, and authored or co-authored 9 books. His areas of interest broadly
include intelligent systems, semi-structured data, data mining, and the semantic
Web.

Abstract: Data mining has been a subject of considerable interest
both in academia and industry. Data mining refers to a set of techniques that
have been designed to efficiently find interesting pieces of information or
knowledge in large amounts of data. Association rules, for instance, are a
class of patterns that tell which products tend to be purchased together.
Covering machine learning, statistics, and operations research, this technology
of knowledge discovery now represents a vital tool to assist in intelligent
decision making in the highly complex business environment. This talk gives a
brief introduction to the data mining process and explores how this interdisciplinary
field brings together techniques from databases, statistics, machine learning,
and information retrieval. The talk reviews the main data mining methods
currently used, including clustering, classification, and association rule
techniques. Some applications and trends will also be discussed.
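
As a small illustration of the association rules mentioned in the abstract, the sketch below computes the two standard measures of a rule, support and confidence, over a toy set of purchases. The transactions and the helper names `support` and `confidence` are hypothetical and chosen only for this example; they are not taken from the talk.

```python
# Hypothetical market-basket data: each set is one customer's transaction.
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "eggs"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Share of transactions containing antecedent that also contain consequent."""
    joint = support(set(antecedent) | set(consequent), transactions)
    return joint / support(antecedent, transactions)


# Rule "butter -> bread": how often the pair occurs, and how reliable the rule is.
rule_support = support({"bread", "butter"}, transactions)
rule_confidence = confidence({"butter"}, {"bread"}, transactions)
print(rule_support, rule_confidence)
```

A rule is usually reported only when both measures exceed user-chosen thresholds; the thresholds themselves depend on the application.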

Introduction to Rough and Fuzzy Sets

Pawan Lingras is a Professor
in the Department of Mathematics and Computing Science at Saint Mary’s
University, Halifax, Canada. His undergraduate education from IIT, Bombay was
followed by graduate studies at the University of Regina, Canada. He has
authored more than 140 research papers in various international journals and
conferences. He has also co‐authored a textbook and co‐edited a
collection of research papers. His areas of interest include artificial
intelligence, information retrieval, data mining, web intelligence, and
intelligent transportation systems. He has served as the review committee
chair, program committee member, and reviewer for various international
conferences on artificial intelligence and data mining.

Abstract:
Data mining techniques are based on conventional crisp logic, statistics,
and probability theory. Sometimes, the axiomatic limitations of
traditional mathematics can make it awkward to apply these techniques in
practical applications. Fuzzy sets, introduced in 1965, made it possible to allow
partial membership of an object in a set. This flexibility led to the development
of data mining techniques that make it possible to conduct supervised and
unsupervised learning from datasets, as well as to identify fuzzy patterns and
predict future trends. Rough set theory was introduced in 1982 and provides a
less descriptive, complementary alternative to fuzzy set theory. Rough sets
allow for multi-level memberships of objects in sets. Researchers have
developed rough alternatives to almost all data mining techniques. This
talk will provide a general introduction to both rough and fuzzy set theory
that will be helpful for understanding the subsequent talks.
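
The two ideas in this abstract can be sketched in a few lines. The example below is only illustrative: the linear shape of the fuzzy membership function, the height limits, and the toy partition are assumptions made for the example. A fuzzy set assigns each object a degree of membership between 0 and 1, while a rough set describes a target set through a lower approximation (objects certainly in the set) and an upper approximation (objects possibly in the set), given groups of objects that cannot be told apart.

```python
def tall_membership(height_cm):
    """Fuzzy membership in 'tall': 0 at or below 160 cm, 1 at or above 190 cm,
    linear in between (an assumed shape, chosen only for illustration)."""
    return min(1.0, max(0.0, (height_cm - 160) / 30))


def rough_approximations(equivalence_classes, target):
    """Return (lower, upper) approximations of target for a partition of the universe."""
    lower, upper = set(), set()
    for block in equivalence_classes:
        if block <= target:
            lower |= block  # every object in the block certainly belongs to target
        if block & target:
            upper |= block  # at least one object in the block belongs to target
    return lower, upper


# Objects in the same block are indistinguishable by the available attributes.
blocks = [{1, 2}, {3, 4}, {5, 6}]
target = {1, 2, 3}
lower_approx, upper_approx = rough_approximations(blocks, target)
print(tall_membership(175), lower_approx, upper_approx)
```

Objects 3 and 4 fall in the boundary region: they appear in the upper approximation but not in the lower one, because the available information cannot separate them.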

Pawan Lingras will also provide a general introduction to all the talks and presenters
at the beginning of the workshop.

Dominik Slezak received his PhD
in Computer Science in 2002 from Warsaw University, Poland. In 2005, he co‐founded
Infobright Inc., where he is currently working as chief scientist. He is also
an adjunct professor at McMaster University, York University, and University of
Regina, as well as at the Polish-Japanese Institute of Information
Technology. Dominik serves as an associate editor and reviewer for a number of
international scientific journals, and as chair of several international
scientific conferences. He has published over 50 peer-reviewed papers
for books, journals, and conference proceedings. He has delivered a number of
invited talks in Canada, China, Czech Republic, Egypt, India, Japan, Korea,
Poland, Russia, Singapore, UK, and US. His research interests are related
mainly to rough sets, data warehousing, data mining, KDD, bioinformatics, as
well as medical and multimedia data.

Abstract:
The theory of rough sets provides a powerful model for representation of
patterns and dependencies, applicable both in databases and data mining. On the
one hand, although there are numerous rough set applications to data mining and
knowledge discovery, the usage of rough sets inside database engines is
still largely uncharted territory. On the other hand, this situation
is not so exceptional given that even the most well-known paradigms of machine
learning, soft computing, artificial intelligence, and approximate reasoning
are still waiting for more recognition in database research.

Rough set-based algorithms and similar techniques can be applied to improve database
performance in several ways. We focus on the idea of using available
information to calculate rough approximations of data needed to resolve queries
and to assist the database engine in accessing relevant data. We partition data
into rough rows, each consisting of 64K original rows. We automatically
label rough rows with compact information about their values on data columns,
often involving multi-column and multi-table relationships. One may say that we
create new information systems where objects correspond to rough rows and
attributes correspond to various flavours of rough information.
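
The idea of rough rows can be sketched as follows. This is a minimal illustration, not the product's actual implementation: it assumes that the compact information stored per rough row is just the minimum and maximum of one column, and it uses a pack size of 4 instead of 64K so the example stays readable. All function names and the labels "relevant", "irrelevant", and "suspect" are hypothetical. For a query such as "count the rows with value greater than a threshold", many rough rows can be resolved from their summaries alone, and only the uncertain ones need to be read.

```python
PACK_SIZE = 4  # the talk describes 64K original rows per rough row; 4 keeps the demo small


def build_rough_rows(values, pack_size=PACK_SIZE):
    """Split a column into packs and record a (min, max) summary for each pack."""
    packs = [values[i:i + pack_size] for i in range(0, len(values), pack_size)]
    summaries = [(min(p), max(p)) for p in packs]
    return summaries, packs


def classify_pack(lo, hi, threshold):
    """Classify a pack for the predicate 'value > threshold' using only its summary."""
    if lo > threshold:
        return "relevant"    # every row satisfies the predicate
    if hi <= threshold:
        return "irrelevant"  # no row satisfies the predicate
    return "suspect"         # cannot decide without reading the rows


def count_greater(values, threshold):
    """Count values above threshold, opening only the packs whose summary is inconclusive."""
    summaries, packs = build_rough_rows(values)
    total, opened = 0, 0
    for (lo, hi), pack in zip(summaries, packs):
        kind = classify_pack(lo, hi, threshold)
        if kind == "relevant":
            total += len(pack)
        elif kind == "suspect":
            opened += 1
            total += sum(v > threshold for v in pack)
    return total, opened


column = [1, 2, 3, 4, 10, 12, 11, 15, 5, 9, 2, 20]
result = count_greater(column, 8)
print(result)  # (matching rows, number of packs actually read)
```

In rough set terms, the packs that certainly match play the role of a lower approximation of the answer, and the inconclusive packs form the boundary region that still has to be examined.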

In this talk, we show how the above ideas guided us toward implementing a fully
functional data warehouse product, with interfaces provided via integration
with MySQL and internals based on the newest database trends. Thanks to
compact, flexible rough information, we became especially competitive in the
field of analytical data warehouses, where users want to query terabytes of
data in a complex, dynamically changing way. Recently, we announced at
www.infobright.org the open source edition of our data warehouse, ready for
free usage and further extensions. In the talk, we illustrate the best
scenarios of applying our software to various aspects of data processing. We
also discuss the most promising directions for further improvement of our
technology, with special attention to the ideas based on the theory of rough
sets and corresponding techniques.

Applications of rough and fuzzy hybridizations to bioinformatics
and biomedicine

Sushmita Mitra is a
Professor at the Machine Intelligence Unit, Indian Statistical
Institute, Kolkata. Dr. Mitra received the National Talent Search
Scholarship (1978-1983) from NCERT, India, the IEEE TNN Outstanding Paper
Award in 1994 for her pioneering work in neuro-fuzzy computing, and the
CIMPA-INRIA-UNESCO Fellowship in 1996. She is the author of three books, more
than 75 research publications in refereed international journals, and
has been associated with the editing of books and journals. She is listed as one of the top
100 Women Scientists in Lilavati's Daughters: The Women Scientists of India,
published by the Indian Academy of Sciences in 2008. She has served in the capacity
of Program Chair, Tutorial Chair, Plenary Speaker, and as a member of programme
committees of many international conferences. Her current research interests
include data mining, pattern recognition, soft computing, image processing, and
bioinformatics.

Abstract: In
this talk we cover some of the hybridizations of rough sets with neural
networks, fuzzy sets and genetic algorithms, in the broader framework of soft
computing. Applications are presented for knowledge encoding, rule extraction,
dimensionality reduction, biclustering, and segmentation. Results
demonstrate the suitability of the methodologies for feature selection with
improved recognition, in diverse domains such as microarray gene expressions
for bioinformatics and face recognition. Segmentation of CT scan images of the
infarcted regions of the brain also exhibits superior results.

Rough sets applications to biological and agricultural
applications

Sonajharia Minz
is a Professor of Computer & Systems Sciences at the Jawaharlal Nehru
University in New Delhi. Dr. Minz has been working on topics related to rough
set theory since 2002, having guided 2 PhDs and 8 M.Tech projects. Her research
focuses on issues relating to granular computing for data mining, along with the
application of rough set theory with machine learning techniques. Dr. Minz has
published widely on the application of rough sets in bioinformatics.

Yiyu Yao is a Professor
of computer science in the Department of Computer Science, University of
Regina, Regina, Saskatchewan, Canada. His research interests include
information retrieval, rough sets, interval sets, granular computing, Web
intelligence, data mining and fuzzy sets. He has published over 200 journal and
conference papers. He is an area editor of the International Journal of Approximate
Reasoning and a member of the editorial boards of the Web Intelligence and Agent
Systems journal, Transactions on Rough Sets, Journal of Intelligent Information
Systems, Journal of Chongqing University of Posts and Telecommunication, The
International Journal of Cognitive Informatics & Natural Intelligence
(IJCiNi), and International Journal of Software Science and Computational
Intelligence (IJSSCI). He has served and is serving as a program co-chair of
several international conferences. He is a member of ACM and IEEE.

Abstract: Granular
computing has emerged as a new multidisciplinary study and has received much
attention in recent years. A conceptual framework is presented by extracting
shared commonalities from many fields. The framework stresses multiple views
and multiple levels of understanding in each view. It is argued that granular
computing is more about a philosophical way of thinking and a practical
methodology of problem solving. By effectively using levels of granularity,
granular computing provides a systematic, natural way to analyze, understand,
represent, and solve real-world problems. With granular computing, one aims at
structured thinking at the philosophical level, and structured problem solving
at the practical level.

Rough Clustering and Its Dynamic Extension

Georg Peters
is a Professor in the Department of Computer Sciences
and Mathematics at the University of Applied Sciences - Muenchen,
Munich, Germany. He received diploma degrees (equivalent to
master's degrees) in electrical engineering, industrial engineering and
in business administration from RWTH Aachen University. He also
obtained a PhD in the field of intelligent data analysis from the same
university. He has published more than 40 papers in the fields of
information systems and soft computing. Currently, his interests
include applications of soft computing concepts, in particular rough
sets.

Abstract:
Since its introduction by Lingras, rough clustering has gained
increasing attention. As in original rough set theory, rough
clustering uses two approximations, a lower and an upper one, to define a
cluster. In recent years it has been successfully applied to
several real-life applications. Recently, a dynamic version of rough
clustering was suggested that adapts to changing data structures.
The presentation gives an overview of rough clustering approaches
and discusses areas of application. Then dynamic rough clustering
is introduced.
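
To make the use of two approximations concrete, the sketch below shows one possible assignment step for rough clustering on one-dimensional data. It is an assumption-laden illustration rather than the specific algorithm covered in the talk: the ratio test, the default `ratio_threshold` of 1.3, and the function name `rough_assign` are all hypothetical. An object that is clearly closest to one centroid joins that cluster's lower approximation; an object that is nearly as close to another centroid joins only the upper approximations of the clusters involved.

```python
def rough_assign(points, centroids, ratio_threshold=1.3):
    """Assign 1-D points to lower and upper approximations of clusters.

    A point goes to the upper approximations of several clusters when the
    distance to another centroid is within ratio_threshold times the distance
    to its nearest centroid; otherwise it goes to the lower (and upper)
    approximation of the nearest cluster. The threshold is an assumed parameter.
    """
    lower = [set() for _ in centroids]
    upper = [set() for _ in centroids]
    for idx, x in enumerate(points):
        dists = [abs(x - c) for c in centroids]
        nearest = min(range(len(centroids)), key=dists.__getitem__)
        d_min = dists[nearest]
        close = [j for j, d in enumerate(dists)
                 if j != nearest and d <= ratio_threshold * d_min]
        upper[nearest].add(idx)
        if close:
            for j in close:
                upper[j].add(idx)   # ambiguous point: boundary of several clusters
        else:
            lower[nearest].add(idx)  # unambiguous point: certain member
    return lower, upper


points = [1.0, 2.0, 5.0, 8.0, 9.0]
centroids = [1.5, 8.5]
lower_sets, upper_sets = rough_assign(points, centroids)
print(lower_sets, upper_sets)
```

Here the point 5.0 is equally far from both centroids, so it appears in both upper approximations and in neither lower approximation. A full clustering procedure would then recompute the centroids from both approximations and repeat.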

An Evaluation of Result Merging Models in Metasearch

Dr. Vijay Raghavan
is the Distinguished Professor of Computer Science at the Center for Advanced Computer Studies and a co-director of the Laboratory for Internet Computing. His research interests are in data mining, information retrieval, machine learning and Internet computing. He has published over 170 peer-reviewed research papers, many of which appear in top-level journals and proceedings, that cumulatively accord him an h-index of 21, based on citations. He has served as major advisor for 20 doctoral students and has garnered $8 million in external funding. Dr. Raghavan brings substantial technical expertise, interdisciplinary collaboration experience, and management skills to his projects.
His service work at the university includes coordinating the Louis Stokes-Alliance for Minority Participation (LS-AMP) program. From 1997 to 2003, he worked closely with the USGS National Wetlands Research Center and with the Department of Energy's Office of Science and Technical Information on a digital library with data mining capabilities incorporated. He chaired the IEEE International Conference on Data Mining in 2005 and received the ICDM 2005 Outstanding Service Award. He is a member of the Advisory Committee of the NSF Computer and Information Science and Engineering directorate. Dr. Raghavan was honored as the Grand Marshal for the Fall-2008 Graduate School Commencement Exercises at UL Lafayette.

Abstract:
Search engines queried by a metasearch engine return results in the form of a ranked list of documents. The key issue is to combine these lists to achieve the best performance. In our work, we apply fuzzy aggregation operators to result merging. Our work is an extension of the result merging model proposed by Diaz [2], which is based on Yager's [1] fuzzy Ordered Weighted Average (OWA) operator. We propose three extensions to the OWA model for metasearch. These are the Importance Guided OWA (IGOWA), the algebraic t-norm OWA, and the algebraic t-norm IGOWA models. While the first two are based on Yager's extension of the OWA operator, the third is a combination of the first two.
The first model (IGOWA) allows weights to be applied to search engine result lists. The second model (t-norm OWA) allows for alternative t-norm functions to be used in aggregation. The third t-norm IGOWA model allows for both. In our work, for the second and third models we use the algebraic (product) t-norm. We compare and contrast our models and also compare them with existing models such as the OWA model for metasearch proposed by Diaz [2] and the Borda-Fuse model proposed by Aslam and Montague [3].
Two of our models, the algebraic t-norm IGOWA model and the IGOWA model, require search engine weights. Thus we develop a new scheme for obtaining search engine weights. We apply our scheme to the above models and observe that using our weighting scheme results in improved result merging.
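
For readers unfamiliar with the OWA operator, the following sketch shows the basic operator and one simple way it could be used to merge ranked lists. It is not the model of [2] nor any of the extensions proposed in this talk: the Borda-style positional scoring, the zero score for unranked documents, and the particular weight vector are assumptions made for the example, and the function names are hypothetical. The defining feature of OWA is that the weights are attached to sorted positions (largest score first), not to individual search engines.

```python
def owa(values, weights):
    """Ordered Weighted Average: weights apply to values sorted in descending order."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    ordered = sorted(values, reverse=True)
    return sum(w * v for w, v in zip(weights, ordered))


def positional_scores(ranked_list, universe):
    """Borda-style scores: the top document gets len(universe) points, the next one
    point fewer, and documents an engine did not return get 0 (an assumed convention)."""
    n = len(universe)
    scores = {doc: 0 for doc in universe}
    for position, doc in enumerate(ranked_list):
        scores[doc] = n - position
    return scores


def merge(result_lists, weights):
    """Fuse ranked lists by applying OWA to each document's per-engine scores."""
    universe = sorted({doc for lst in result_lists for doc in lst})
    per_engine = [positional_scores(lst, universe) for lst in result_lists]
    fused = {doc: owa([s[doc] for s in per_engine], weights) for doc in universe}
    ranking = sorted(universe, key=lambda d: (-fused[d], d))
    return ranking, fused


# Three hypothetical engines returning overlapping result lists.
engine_results = [["a", "b", "c"], ["b", "a", "d"], ["b", "c", "a"]]
ranking, fused_scores = merge(engine_results, weights=[0.5, 0.3, 0.2])
print(ranking)
```

With weights `[1, 0, 0]` the operator reduces to the maximum score and with `[0, 0, 1]` to the minimum, so the weight vector controls how optimistic the fusion is. Weighting the engines themselves, as the IGOWA model described above does, requires an additional mechanism that this sketch does not attempt.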

Participants are encouraged to
submit an extended abstract and a copy of their ten-minute presentation to rfsta09@gmail.com for review by June 30, 2009.

The session will be led by Dr.
Ashok Deshpande, who is an adjunct professor of Bioinformatics at the University of
Pune and the College of Engineering Pune (COEP). Since the early 1980s he has been involved in
fuzzy logic and its application to a variety of systems. He is co-chair of the
Berkeley Initiative in Soft Computing in Bioinformatics. His group is trying to
develop a fuzzy-logic-based infusion pump for anaesthesia control in Bio
Medical Engineering at COEP.