Secondly, we only need basic, clear, straightforward information in each chapter. We are not trying to be exhaustive or complete; the value of this book is in the simple synthesis across subjects. There are other venues in which to wax eloquent on the depth and complexity of a particular subject. Please place yourself in a "beginner's mind" as you make contributions. Please also scope each chapter so that it can be taught in a one-hour class period. If the chapter requires more than an hour to teach, it is probably too detailed.

To the extent possible, please use terms and concepts in the way they are defined in Wikipedia and Wiktionary. This way students can refer to the corresponding Wikipedia / Wiktionary page to get a deeper understanding of the concept.

Thirdly, this is a cross-disciplinary book. We want to help people apply data science to all fields. Therefore, we need a wide variety of simple examples and simple exercises.

Fourthly, please adhere to the simple structure of each chapter: Summary of Main Points, Discussion, More Reading, Exercises, and References. We want the More Reading section to link to on-line resources. The References section may contain off-line resources. To start a new page, you should use the wiki markup from this prototype page.

Fifthly, as with any Wikibook, please feel free to make corrections, expand explanations, and make additions where necessary, even if it is not "your" chapter. Use the discussion page to explain changes that might be controversial.

Sixthly, some syntax rules:

Please bold key terms and phrases the student should learn.

Put function names and code snippets inside 'code' tags: <code>lm()</code>

Use in-line links [[ ]] to Wikipedia, Wiktionary, WikiCommons, Wikibooks, and other Wikimedia Foundation properties.

Use references (<ref> </ref>) to "external" sources--both on-line and off-line.

When a data scientist thinks like a scientist, they think in terms of validity and reproducibility. The task is to set up tests that eliminate alternative explanations in such a way that any observer would come to the same conclusion if they did the work themselves.

This is Project #2, which spans four chapters. Assemble into groups of 3 or 4 students. A group of three may not have the same members as the group for Project #1. A group of four may have no more than two students repeating from the same Project #1 group. This group will do the entire project together.

Conduct the experiment according to the design. Take pictures. Record your data results.

Enter the data into R. Use R to produce tables and draw plots of your data. See if you can draw the theoretical curve Galileo was trying to discover on your data plots.
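
As a starting point, the steps above might be sketched in R as follows. This is only a minimal sketch under stated assumptions, not the required method: it assumes your experiment measured distance travelled against elapsed time, and that the theoretical curve has the form distance = k × time². The numbers and the names <code>trials</code>, <code>time_s</code>, <code>distance_cm</code>, and <code>k</code> are hypothetical placeholders; replace them with your own measurements and whatever variables your design actually recorded.

```r
# Hypothetical placeholder measurements, NOT real results.
# Replace these with the data your group recorded.
time_s      <- c(1, 2, 3, 4, 5)          # elapsed time, seconds
distance_cm <- c(10, 41, 89, 162, 248)   # distance travelled, centimetres

# Put the raw data in a data frame and print it as a table.
trials <- data.frame(time_s, distance_cm)
print(trials)

# Assumption: distance is proportional to time squared.
# Fit distance = k * time^2 with no intercept term.
fit <- lm(distance_cm ~ 0 + I(time_s^2), data = trials)
k <- unname(coef(fit)[1])

# Plot the raw data, then draw the fitted curve on top of it.
plot(trials$time_s, trials$distance_cm,
     xlab = "Time (s)", ylab = "Distance (cm)",
     main = "Measured distance versus time")
curve(k * x^2, add = TRUE)
```

If the fitted curve passes close to your points, the assumed form is consistent with your data; if it does not, that is worth discussing in your presentation.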

Prepare a slide presentation that includes a description of your methods, pictures of your apparatus, a table of your raw data, a table of your analyzed results, plots of your results, and a list of several things the group learned on its own about data science during the course of this project.

Note: Your group members can specialize in particular tasks, but everyone needs to participate in all phases of the assignment. Also, the chapters covered to this point do not teach you everything you need to know to do this assignment. Please do the best you can with what you know. This assignment is not just a way to show the instructor how much of the previous chapters you have learned; it is a learning experience in and of itself. The assignment is designed for students to discover knowledge not contained in the chapters.