An interview with Brad Efron about scientific writing. I haven’t watched the whole interview, but I do know that Efron is one of my favorite writers among statisticians.

Slidify, another approach for making HTML5 slides directly from R. Two issues: (1) it is still just a little too hard to change the theme/feel of the slides; (2) the placement/insertion of images is still a little clunky. Google Docs has figured this out; if they integrated the best features of Slidify, LaTeX, etc. into that system, it would be great.

Python is great, and I think Sage will be great as well. For pure mathematics it offers lots of symbolic computation, which matters because pure mathematics is abstract and powerful: differential geometry, commutative algebra, algebraic geometry, and so on. Science, however, is ultimately experiment and computation, so we also need powerful computational software to carry out the results. Sage is your choice, since Sage claims that
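As a small sketch of the kind of symbolic computation Python offers (Sage builds on the same ecosystem), here is an example using SymPy; the particular function `f` is just an illustration:

```python
import sympy as sp

x = sp.symbols('x')

# Symbolic differentiation and integration of an illustrative function
f = sp.sin(x) * sp.exp(x)
df = sp.diff(f, x)        # derivative: exp(x)*sin(x) + exp(x)*cos(x)
F = sp.integrate(f, x)    # an antiderivative of f

# Symbolic equation solving: roots of x^2 - 2
roots = sp.solve(x**2 - 2, x)
```

Everything here stays exact and symbolic (e.g. the roots come back as `sqrt(2)` and `-sqrt(2)`, not floating-point approximations), which is what makes these tools useful for abstract mathematics.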

Recently, I have heard a lot about the disadvantages of frequentist statistics, including complaints about the p-value, which is a hot topic due to the God particle.

Professor John K. Kruschke gave a talk on Doing Bayesian Data Analysis at Michigan State University in September. He mentioned the concept of “intention”, including intended hypotheses, intended experiments, and intended sampling. Basically, he argued that many frequentist statistical procedures are intention-dependent, which is problematic as science, since the results depend on the analyst’s intentions. If you want to know more about this, please refer to the paper.

Sometimes the problem is that the frequentist criterion being used is not of applied relevance. Consider a simple problem such as estimating a proportion p, given y successes out of n trials, where n=100 and y=0. The best estimate of p will be different if I tell you that p is the probability of a rare disease, compared to if I tell you that p is the proportion of African Americans who plan to vote for Mitt Romney.
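To make that point concrete, here is a minimal sketch in Python of how a Beta prior changes the estimate of p given y = 0 successes in n = 100 trials. All of the prior parameters below are illustrative assumptions of mine, not from the original example:

```python
def beta_posterior_mean(a, b, y, n):
    """Posterior mean of p under a Beta(a, b) prior, given y successes in n trials."""
    return (a + y) / (a + b + n)

y, n = 0, 100

mle = y / n                                 # frequentist MLE: exactly 0.0
flat = beta_posterior_mean(1, 1, y, n)      # uniform prior: 1/102
rare = beta_posterior_mean(1, 9999, y, n)   # assumed prior: p is a rare-disease rate
vote = beta_posterior_mean(10, 90, y, n)    # assumed prior: p is a vote share near 10%
```

Same data, three different reasonable answers: the rare-disease prior pulls the estimate close to zero, while the vote-share prior keeps it on the order of a few percent, which is Gelman's point about the prior carrying applied relevance.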

I would like some frequentists to weigh in on this intention issue, since I think it raises a reasonable question. Any comments?

Update:

The following cartoon sparked a fight between frequentists and Bayesians:

Suppose I had a medical test with a 1/6 false positive rate and a 0% false negative rate. That is, if administered to someone without the disease it has a 1/6 chance of reporting positive. The protocol is to administer the test and, if positive, to administer it again. Assuming independence, the probability of two consecutive false positives is 1/36. Some statisticians would reject the null hypothesis (that the patient is disease free) given 2/2 positive tests. That is ridiculous for the same reason the xkcd example is ridiculous (it ignores prior or base rate information) but it is indeed the practice in some circles, I’m told.—–Phil
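Phil's point can be worked out with Bayes' rule. The base rate of 1/1000 below is an illustrative assumption of mine (he does not state one):

```python
from fractions import Fraction

base_rate = Fraction(1, 1000)  # assumed P(disease); not given in the quote
fp = Fraction(1, 6)            # P(positive | no disease)

# P(two positives | no disease) = (1/6)^2 = 1/36 -- the "p-value" logic
p_two_pos_given_healthy = fp ** 2

# P(two positives | disease) = 1, since the false negative rate is 0.
# Bayes' rule: P(disease | two positives)
num = base_rate * 1
den = num + (1 - base_rate) * p_two_pos_given_healthy
posterior = num / den          # 4/115, about 3.5%
```

So even after rejecting the null at "p = 1/36", the patient is disease free with probability of roughly 96.5% under this base rate, which is exactly the base-rate information the accept/reject procedure ignores.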

Also refer to the explanation from Andrew:

In the context of probability mathematics, textbooks carefully explain that p(A|B) != p(B|A), and how a test with a low error rate can have a high rate of errors conditional on a positive finding, if the underlying rate of positives is low, but the textbooks typically confine this problem to the probability chapters and don’t explain its relevance to accept/reject decisions in statistical hypothesis testing.