Transcription

2 In the 1920s, the computing labs helped establish statistics on the American continent. Without them, even a modest study was beyond the ability of an individual statistician. At the same time, statistics labs often had the most powerful computing machines within their larger institution. They showed how organized computing could benefit science and provided a place for the earliest of computer scientists to test their ideas. -- Grier The origins of statistical computing, Amstat Online

9 Data Science is just a fancy name for statistics. Fitting simple models to messy and sometimes large data sets Combination of standard black-box fitting tools and good graphics. Doesn t require any fundamental knowledge our students don t have. Needs good computing skills, which our students can learn

11 Data Science is just a fancy name for statistics. Data Wrangling isn t statistics If you value self-consistency, you can hold at most one of these opinions. A/Prof Jenny Bryan, UBC (less than one is good)

12 Data science is statistics in the same way that epidemiology is statistics opinion polling is statistics ag. field trials are statistics

13 I did think, however, that many well-known applied statisticians attacked problems without the necessary mathematical knowledge and manipulative skill. Moreover, I believed that a principal cause of failure among medical research scientists was the lack of basic scientific knowledge in their special chosen field. H. O. Lancaster

15 Computing is easier to steal Need to teach our data science students: A bit about databases and SQL A statistical programming language (eg R) Abstractions such as tidy data, sparse, map/reduce Reproducible data analysis (eg rmarkdown)?collaborative version control (eg git/github) Force students to work with a wild-caught data set and I'm still pretty sure some of the data is Permit interested students to learn the high-tech data structures missing, and butalgorithms could still stuff. be here, in this ONE HUNDRED SHEET excel file a PhD student on Twitter

16 But we don t know this stuff! let mego glethat for you Google Search I'm Feeling Lucky The computing folks are way better at dissemination than us Unlike statistics, the computer can tell you if you get it wrong.

26 Complex Data Robustness of meaning can be hard: Suppose a Wilcoxon test shows X > Y, Y>Z What does this tell us about Means of X and Z? Medians of X and Z? Wilcoxon test of X and Z?

27 i i Messy Data Good applied statisticians know from messy data. o CM - X O and I'm still pretty sure some of the data is missing, but it could still be here, in this ONE HUNDRED SHEET excel file blooc Diastolic NnT i o r o a PhD student on Twitter 0 ao o c Age (years)

28 Badly Sampled Whom the Gods Would Destroy, They First Give Real-time Analytics [Dan McKinley, Etsy] This line of thinking is a trap. It's important to divorce the concepts of operational metrics and product analytics. Confusing how we do things with how we decide which things to do is a fatal mistake. Because non-representativeness of short time slices

What is Data Science? { Data, Databases, and the Extraction of Knowledge Renée T., @becomingdatasci, November 2014 Let s start with: What is Data? http://upload.wikimedia.org/wikipedia/commons/f/f0/darpa

Why Taking This Course? Course Introduction, Descriptive Statistics and Data Visualization GENOME 560, Spring 2012 Data are interesting because they help us understand the world Genomics: Massive Amounts

How to Make Money with Google Adwords For Cleaning Companies. H i tm a n Advertising Target Clients Profitably Google Adwords can be one of the best returns for your advertising dollar. Or, it could be

Department of Social and Political Sciences Computer Programming for the Social Sciences This two day workshop will teach beginner level, practical computer programming skills for use in social science

In this presentation, you will be introduced to data mining and the relationship with meaningful use. Data mining refers to the art and science of intelligent data analysis. It is the application of machine

The 3 questions to ask yourself about BIG DATA Do you have a big data problem? Companies looking to tackle big data problems are embarking on a journey that is full of hype, buzz, confusion, and misinformation.

Course Catalog In order to be assured that all prerequisites are met, students must acquire a permission number from the education coordinator prior to enrolling in any Biostatistics course. Courses are

WHITE PAPER GETTING AHEAD OF THE COMPETITION WITH DATA MINING Ultimately, data mining boils down to continually finding new ways to be more profitable which in today s competitive world means making better

A future career in analytics What is a career in analytics about? In the information age in which we live, almost all of us consume and produce digital data, either for business, community or private uses.

Search: Go blog.castac.org From the Committee on the Anthropology of Science, Technology, and Computing (CASTAC) About Adventures in Pedagogy Beyond the Academy Member Sound-Off News, Links, and Pointers

CHAPTER 4 Probabilities and Proportions Chapter Overview While the graphic and numeric methods of Chapters 2 and 3 provide us with tools for summarizing data, probability theory, the subject of this chapter,

Kerala School of Mathematics Course in Statistics for Scientists Introduction to Data Analysis T.Krishnan Strand Life Sciences, Bangalore What is Data Analysis Statistics is a body of methods how to use

HR STILL GETTING IT WRONG BIG DATA & PREDICTIVE ANALYTICS THE RIGHT WAY OVERVIEW Research cited by Forbes estimates that more than half of companies sampled (over 60%) are investing in big data and predictive

Big Data and Privacy Fritz Henglein Dept. of Computer Science, University of Copenhagen Finance IT Day Riga, 2015-03-26 About me Professor, Programming Languages and Systems, University of Copenhagen Director,

Statistics for BIG data Statistics for Big Data: Are Statisticians Ready? Dennis Lin Department of Statistics The Pennsylvania State University John Jordan and Dennis K.J. Lin (ICSA-Bulletine 2014) Before

Master of Science in Healthcare Informatics and Analytics Program Overview The program is a 60 credit, 100 week course of study that is designed to graduate students who: Understand and can apply the appropriate

Six Signs you are ready for BI WHITE PAPER LET S TAKE A LOOK AT THE WAY YOU MIGHT BE MONITORING AND MEASURING YOUR COMPANY About the auther You re managing information from a number of different data sources.

Correlational Research Chapter Fifteen Correlational Research Chapter Fifteen Bring folder of readings The Nature of Correlational Research Correlational Research is also known as Associational Research.

Five Reasons Spotfire Is Better than Excel for Business Data Analytics A hugely versatile application, Microsoft Excel is the Swiss Army Knife of IT, able to cope with all kinds of jobs from managing personal

2.2 Master of Science Degrees The Department of Statistics at FSU offers three different options for an MS degree. 1. The applied statistics degree is for a student preparing for a career as an applied

T O P I C 1 2 Techniques and tools for data analysis Preview Introduction In chapter 3 of Statistics In A Day different combinations of numbers and types of variables are presented. We go through these

Six Questions to Ask About Your Market Research Don t roll the dice ISR s tagline is Act with confidence because we believe that s what you re buying when you buy quality market research products and services,

data analysis data mining quality control web-based analytics What is Data Mining, and How is it Useful for Power Plant Optimization? (and How is it Different from DOE, CFD, Statistical Modeling) StatSoft

Five Tips for Presenting Data Analyses: Telling a Good Story with Data As a professional business or data analyst you have both the tools and the knowledge needed to analyze and understand data collected

T he complete guide to SaaS metrics What are the must have metrics each SaaS company should measure? And how to calculate them? World s Simplest Analytics Tool INDEX Introduction 4-5 Acquisition Dashboard

A Changing Standard for SEO Spam: Google Penguin, Link Penalties & Declining Leniency Overview If you own a small or medium-sized business, you ve likely hired an outside vendor to build external links

Top 5 Mistakes Made with Inventory Management for Online Stores For any product you sell, you have an inventory. And whether that inventory fills dozens of warehouses across the country, or is simply stacked

DATA SCIENCE CURRICULUM Before class even begins, students start an at-home pre-work phase. When they convene in class, students spend the first eight weeks doing iterative, project-centered skill acquisition.

Classroom Demonstrations of Big Data Eric A. Suess Abstract We present examples of accessing and analyzing large data sets for use in a classroom at the first year graduate level or senior undergraduate