There is just one goal: to prepare first-year PhD students in statistics for research.

This is not an applied statistics course.

This is not an R course. Fluency in R is assumed.

If you are not a first-year PhD student in statistics, please note:

Undergraduate and MBA students require an interview with the
instructor.

The course grade is heavily based on class participation (35%).

PhD students from programs other than statistics
should not count on completing this class and should therefore sign
up for sufficient credit from other courses.

The Course Selection Period ends on Monday, 2017/09/18.

The Drop Period ends on Monday, 2017/10/09.

First day of class:

Friday, 3pm, 2017/09/01, room 1201 SH-DH

Homework Assignments:

General honor code: You may discuss the problems with each
other in general terms, but you must write your own solution. All
sources, including friends and colleagues, must be cited. It is
important to get used to a stringent code of conduct in scientific
writing. On the other hand, use common sense and attribute where
honesty requires it. Two points worth special mention, and one exception:
*** If you received an extension for a homework, do not consult
posted solutions.
*** Consulting solutions to homeworks from previous years is an
offense.
*** The exception concerns help with LaTeX and the English language:
avail yourself of as much as you need from whichever source.

Homework 1: Linear algebra (1) and LaTeX practice.
Due: Fri, 2017/09/22, 7pm
Edit the LaTeX source and submit a PDF file by e-mail.
If you have never used LaTeX, you can first install some free software:

In the manual, pay special attention to Section 1.3.2 (special
characters) and Chapter 3 for math typesetting (math symbols:
Section 3.10). To produce a PDF from LaTeX, the LEd environment
requires you to click the green and blue right arrows in the tool
bar. Feel free to check out other free software and other
documents. If you find something particularly useful, please let
me know.

Unless instructed otherwise, homeworks should be e-mailed
as attachments to stat961.at.wharton[at-sign]gmail.com.
The format should be .R, .pdf, or .doc, depending on
the assignment.
Your checked and graded solutions are returned as e-mail attachments.
Search for '#AB' to find comments.
A score such as 8/10 at the end means '8 out of 10 points'.
A deduction of 2 points does not mean you got two questions wrong; it is
only a relative measure of how far below optimal your solutions are.

IMPORTANT: If a function in the class notes does not work or is not
found in your R session, check whether the function is in one
of the R code files below. If so, download and read the file into R
one more time, even if you thought you had done so earlier.
I allow myself to update the code all the time.
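
As a minimal sketch of the re-reading step described above: the file name
"course-functions.R" and the function my.helper() are hypothetical
stand-ins, not actual course files, and writeLines() merely simulates
downloading a fresh copy of the code file.

```r
# Hypothetical names: "course-functions.R" and my.helper() stand in for
# whichever code file and function the class notes refer to.
code.file <- file.path(tempdir(), "course-functions.R")

# Stand-in for downloading the latest version of the file from the course page.
writeLines("my.helper <- function(x) x^2", code.file)

# Check whether the function is currently defined in the session.
exists("my.helper", mode = "function")

# Re-read the file; this overwrites any older definitions of the same names.
source(code.file)

# The function should now be found and usable.
exists("my.helper", mode = "function")
my.helper(3)
```

Because source() simply re-evaluates the file, running it again after each
download is harmless and picks up any updated definitions.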

Background Papers:

The paper on tree-based regression and classification
is in this PDF file.

Undergraduate students contemplating this course: If you
do not already have a solid background in statistics and linear
algebra, as well as some programming experience, you should not take
this class or, at a minimum, not rely on credit from it for
graduation. As mentioned above, the goal is to prepare students for
statistics research, and there will be only one standard of
performance for all students.

Publication-quality writing and mathematical typesetting are of
utmost importance for statistics research. To get used to the
standards of writing research papers in statistics, some homeworks
will be required to be typeset in LaTeX and submitted by e-mail
as a PDF file. You will have to learn LaTeX on your own with the help
of other graduate students, but getting started with LaTeX
will be facilitated by templates provided by the instructor, so all
you need to do is cannibalize the templates by filling in your
solutions.

The only required text for the course is a book about the art of writing:

If you need more reading about R, look up the numerous
books about R or the numerous free web documents about R.
Yet another way to find R introductions is to do a search for
"Introduction to R". If you find something particularly
useful, please let the instructor know.

Recommended texts:

For regression: Seber and Lee, "Linear Regression Analysis"
(Wiley Series in Probability and Statistics)

For linear algebra: Strang, "Linear Algebra and its Applications" (Academic Press)
Strangely, the most fundamental material is no longer in the recent edition:
"Linear Transformations, Matrices, and Change of Basis."
In older editions this used to be tucked away in the appendix.
For this reason, the material is now included in Homework 2:
You get to derive it yourself by following instructions.

R. L. Harris, "Information Graphics",
an excellent overview of useful and common data visualizations.
More recent developments can be followed in TED talks: several
by Rosling, another by McCandless.

As we go along, special topics books will be recommended.

Supplemental material suggested by previous
participants in Stat 961:

A fabulous and fun article on R by Patrick Burns:
R Inferno.
Print it and keep it by your bedside!