The Big Data School of PAKDD 2013 aims to provide an educational platform for post-graduate students and young researchers in the areas of data mining, machine learning and analytics. It is a 2-day intensive course, hosted in Sydney on 10-11 April 2013, immediately prior to the opening of the main conference on the Gold Coast. The program covers various teaching modules (2-3 hours each) on April 10 and 11. A related discussion forum, the Big Data Summit, follows on April 12.

Analytics vendors, consultants, clients and associates of the Advanced Analytics Institute (AAI) are welcome to display their products and services in an industrial stand area at the Big Data School. If you are interested, please contact Colin Wise at Colin.Wise@uts.edu.au or on Tel. 02-9514-9267.

Large scale biomedical and healthcare data mining and applications Part I

by Professor Limsoon Wong, National University of Singapore

Limsoon Wong is a provost's chair professor of computer science and a professor of pathology at the National University of Singapore. He currently works mostly on knowledge discovery technologies and their application to biomedicine. Before that, he did significant research in database query language theory and finite model theory, as well as significant development work in broad-scale data integration systems. Limsoon has written about 150 research papers, a few of which are among the best cited in their respective fields. He serves or has served on the editorial boards of Information Systems, Journal of Bioinformatics and Computational Biology, Bioinformatics, IEEE/ACM Transactions on Computational Biology and Bioinformatics, Drug Discovery Today, and Journal of Biomedical Semantics. He co-founded and is chairman of Molecular Connections, a provider of data curation services employing over 700 curators, analysts, and engineers.

10:15am – 10:30am

Morning tea break

10:30am – 11:45am

Large scale biomedical and healthcare data mining and applications Part II

At Teradata Aster, Ross is responsible for data mining, analytics and advanced modeling projects using the Teradata Aster platform. Previously Ross ran Datamilk, an independent bespoke consultancy specialising in data mining and advanced predictive analytics. Ross is a Six Sigma Black Belt and has many years of experience in a variety of statistical roles, including Business Development Management at Minitab and SAS Analyst at New Frontier Publishing.

Ross has a Master of Applied Statistics and a first class honors degree in Pure Mathematics. He has a keen interest in a number of data mining techniques, especially social network analysis and random forests.

In his free time Ross regularly competes on Kaggle, an online forum where data scientists match their skills against their global peers including experts in statistics, mathematics, and machine learning. Ross is also an active member of the Sydney users of R Forum (SURF).

12:30pm – 1:30pm

Lunch and industrial demo & exhibition

1:30pm – 3:10pm

Mining Big Data: the state-of-the-art and beyond

by Associate Professor Kai-Ming Ting, Monash University

Kai Ming Ting is an Associate Professor in the Faculty of Information Technology at Monash University, and currently serves as the Faculty's Associate Dean Research Training. He previously held academic positions at Waikato University and Deakin University, and visiting positions at Osaka University, Japan, Nanjing University, China, and the Chinese University of Hong Kong. His research projects have been supported by grants from the Australian Research Council, the US Air Force Office of Scientific Research (AFOSR/AOARD), the Australian Institute of Sport, and Toyota InfoTechnology Center (Japan). Awards received include the Runner-up Best Paper Award at IEEE ICDM 2008 and the Best Paper Award at PAKDD 2006. He received his PhD from Sydney University.

He is the creator of a new paradigm in data mining called mass estimation. Density estimation is the current paradigm on which most existing data mining algorithms are based. The unique feature of mass estimation is that it has constant time and space complexities, ideal for solving problems with big data.

His research interests are in the areas of mass estimation and mass-based approaches, ensemble approaches and data stream mining. He is an associate editor for the Journal of Data Mining and Knowledge Discovery. He co-chaired the Pacific-Asia Conference on Knowledge Discovery and Data Mining 2008. He has served as a member of program committees for a number of international conferences, including ACM SIGKDD, IEEE ICDM and ICML.

3:10pm – 3:30pm

Afternoon tea break

3:30pm – 5:10pm

Large-scale Support Vector Machines: Current Research Trends and Future Directions

by Haimonti Dutta

Haimonti Dutta holds a joint appointment at the Center for Computational Learning Systems (CCLS), Columbia University, NY and Indraprastha Institute of Information Technology (IIIT), Delhi, India. She is an Associate Research Scientist at CCLS and Assistant Professor in the Computer Science and Engineering Department at IIIT, Delhi. She received her Ph.D. in Computer Science and Electrical Engineering (CSEE) from the University of Maryland, Baltimore County (UMBC) in 2007, with a thesis on discovering patterns and knowledge from large scale distributed systems. Her research interests include machine learning, data mining and pattern recognition; distributed optimization; data intensive computing; and distributed and parallel data mining. She has been on the program committees for many conferences, including Knowledge Discovery and Data Mining Conferences (KDD), International Conference on Data Mining (ICDM), SIAM Data Mining Conference (SDM) and European Conference on Machine Learning (ECML), and has presented or published research papers at many prestigious venues, including ICDM, SIAM Data Mining Conference, ICML, HiPC and ICMLA. Her current research is funded by the National Science Foundation, the National Endowment for the Humanities, the Epilepsy Research Foundation and industrial funding from the Consolidated Edison Company of New York. She is a recipient of the Dr B. C. Roy Scholarship for academic excellence and the UMBC Graduate Dissertation Fellowship, and was nominated for the Best Paper Award at the International Conference on Machine Learning and Applications (ICMLA) in 2008.

Geoff Webb is a Professor of Information Technology Research in the Faculty of Information Technology at Monash University, where he heads the Centre for Research in Intelligent Systems. Prior to Monash he held appointments at Griffith University and then Deakin University, where he received a personal chair. His primary research areas are machine learning, data mining, and user modelling. He is known for his contribution to the debate about the application of Occam's razor in machine learning and for the development of numerous methods, algorithms and techniques for machine learning, data mining and user modelling. His commercial data mining software, Magnum Opus, incorporates many techniques from his association discovery research. Many of his learning algorithms are included in the widely-used Weka machine learning workbench. He is editor-in-chief of Data Mining and Knowledge Discovery, co-editor of the Springer Encyclopedia of Machine Learning, a member of the advisory board of Statistical Analysis and Data Mining and a member of the editorial boards of Machine Learning and ACM Transactions on Knowledge Discovery from Data. He was co-PC Chair of the 2010 IEEE International Conference on Data Mining and co-General Chair of the 2012 IEEE International Conference on Data Mining.

Jaideep Srivastava is Professor of Computer Science & Engineering at the University of Minnesota, where he directs a laboratory focusing on research in Web Mining, Social Media Analytics, and Health Analytics. He has authored over 300 papers, and supervised 30 PhD dissertations and 59 MS theses. He is currently co-leading a multi-institutional, multi-disciplinary project in the rapidly emerging area of social computing (http://vwobservatory.com/). His research has been supported by government agencies, including NSF, NASA, ARDA, DARPA, IARPA, NIH, CDC, US Army, US Air Force, and MNDoT; and industries, including IBM, United Technologies, Eaton, Honeywell, Cargill, and Huawei Telecom. He has an active collaboration with Allina's Center for Healthcare Innovation (http://www.allina.com/ahs/aboutallina.nsf/page/health_care_innovation), where he is a Distinguished Fellow. Dr. Srivastava has significant experience in industry, in both consulting and executive roles. He has led a data mining team at Amazon.com (www.amazon.com), built a data analytics department at Yodlee (www.yodlee.com), and served as the Chief Technology Officer for Persistent Systems (www.persistentsys.com). He has provided technology and strategy advice to Cargill, United Technologies, IBM, Honeywell, KPMG, 3M, TCS, and Eaton, has served as Advisor to the State Government of Minnesota and the State Government of Maharashtra, and is presently technology adviser to the UID project of the Government of India. He has held distinguished professorships at Heilongjiang University and Wuhan University, China. Dr. Srivastava has a BTech from the Indian Institute of Technology (IIT), Kanpur, India, and an MS and PhD from the University of California, Berkeley. He is a Fellow of the Institute of Electrical and Electronics Engineers (IEEE), and has been an IEEE Distinguished Visitor. He has given over 150 invited talks in over 30 countries, including more than a dozen keynote addresses at major international conferences. Dr. Srivastava is the Co-Founder and CTO of Ninja Metrics (www.ninjametrics.com), which brings his research in social analytics to the commercial world.