Tuesday, April 15, 2008

Today's post is a very special one. I have the great pleasure of presenting an exclusive interview with one of the data mining gurus, Gregory Piatetsky-Shapiro of KDnuggets. He kindly agreed to answer my questions. For the pleasure of DMR readers, here is the interview:

Data Mining Research: Can you introduce yourself in a few lines?

Gregory Piatetsky-Shapiro: I think of myself as someone who searches for truth. With my analytic and math skills and inclination to numbers, I found a great profession of searching for truth in data.

My professional activities are: I publish KDnuggets News (currently with over 11,000 subscribers), maintain KDnuggets, do data mining consulting, and also run ACM SIGKDD, which organizes the annual KDD conferences. Some people think that KDnuggets is a huge organization, but there is only me and 2 cats, so I try to be very efficient and automate as much as possible.

DMR: Why did you decide to start KDnuggets? Some ideas about its future?

GPS: I started the KDnuggets Newsletter (then called Knowledge Discovery Nuggets) in 1993 (here is the very first issue), after the KDD-93 workshop (KDD stands for Knowledge Discovery in Data), as a way to connect people interested in the field. We started with 50 subscribers and now there are over 11,000 from about 100 countries. I also started a Knowledge Discovery Mine website in 1994, when I was at GTE, to archive the old issues, and in 1997 I moved it to the KDnuggets website.

Because of spam, the email format is losing some popularity in favor of social media, so I am exploring new ways to connect with the data mining community. However, I think that the function of the editor who separates relevant stories from irrelevant ones will still be in demand, even in the new media.

For example, although there are millions of blogs, people still read the New York Times for good coverage of stories, even though they now read it online.

DMR: What do you think wikis and blogs (i.e. Web 2.0) can bring to the data mining community?

GPS: The part of Web 2.0 that I am most interested in is social networks. Besides the obvious social aspects, they offer great opportunities for data analysis and knowledge discovery. Both MySpace and Facebook have hired data miners. However, they need to be careful around the issue of privacy.

DMR: Do you have some data miner "dreams" (e.g. mining some improbable data)?

GPS: I would love to discover a cure for cancer! Actually, cancer is not one disease but many related diseases, and data mining can help to find better diagnostics and perhaps cures for some of them.

People with the same diagnosis and the same treatment frequently have different outcomes. Why? The analysis that I (and many other people) did shows that part of the variation is due to people's genetic signatures, so it is possible to come up with personalized medicine to improve patient treatment.

DMR: Thanks a lot to Gregory for giving his personal opinion about these questions.

Gregory Piatetsky-Shapiro, Ph.D. is the President of KDnuggets, which provides research and consulting services in the areas of data mining, knowledge discovery, bioinformatics, and business analytics.