December 2015


Okay, last month’s diary was total shit. Utter shit. I simply could not keep up. But this month I will return every day, as I used to, and catalog this journey.

Today has been a fun day. I’ve been writing (well, largely rewriting) my blog post for the Psychonomic Society. It’s on confidence intervals, which are really tricky to write about because nobody understands them! Fortunately, that is largely the point of the post.

I don’t think I ever actually revealed what the super secret Bayes project was because the November diary capsized before it was done. It is an investigation of the history of the Bayes factor. E.J. came to me with this idea for a project because he wanted to write a neat little catalogue of how Harold Jeffreys developed these ideas, culminating in his Bayes factors. Well, it turns out that during his research E.J. found that Jeffreys was scooped!

In a footnote in one of Jeffreys’s last papers he mentions that J.B.S. Haldane actually computed the first Bayes factor before Jeffreys worked it out himself. Now this was a big find for E.J. It is part of the common lore in statistics that Jeffreys invented the Bayes factor. So he asked me to do some searching to see if I could find any earlier instance of Jeffreys acknowledging Haldane, because it would be incredible for Jeffreys to be scooped and not realize it until he was old. But what I found was that Jeffreys not only knew about Haldane’s derivations, he wrote a commentary on the paper! We were not expecting this at all. How could Jeffreys have known about this and never mention it or cite Haldane? So I started looking for instances where Jeffreys might have acknowledged Haldane, but I could not find any.

He doesn’t acknowledge Haldane in the papers where he develops his significance tests, a mere 3 years after Haldane’s paper. But in an obscure paper from 1936 (cited 4 times ever!) I found Jeffreys saying that he and Haldane had both been advocating these methods, and this was at the same time he published his papers on significance tests! If he knew about Haldane’s work, surely he should have cited it?

This is it for part one of the story behind this paper. I find this stuff so fascinating.


The super secret ba-  Oh right I can call it what it actually is now. The Origin of the Bayes Factor is officially done and ready to submit. We are waiting 2 more days for any comments from a big historian of statistics, and if we don’t hear back we will submit. Woot woot! I am incredibly proud of this paper, and I learned so so much in writing it.

I also finished up the very last revisions for my confidence intervals post for the Psychonomic Society blog. Fun stuff!

I went through the old issues of Statistical Science. Found a few good ones I didn’t have. A paper by Le Cam on the history of the central limit theorem with commentaries, an interview with Henry Daniels, Lehmann telling the story behind his Testing Statistical Hypotheses book, Fienberg telling the abridged history of statistics, and Sandy Zabell interviewing William Kruskal.

Funny “article” on the PNIS site. And even funnier that people think it is serious! It’s an obvious parody site, people.

Cool post by Stephen Heard, giving advice on job interviews. Well, this is more of a story of how he flubbed his interviews. Some really really valuable information in there.


My blog post for the Psychonomic Society went up today. A fun experience, but I kind of like blogging on my own. Perhaps because having an editor feels off. I like being in control 😉

Remember folks, inference by interval testing is bad.

Also read some great papers today by Lindley (1997) and Young & Pettit (1996) about Bayes factors. Lindley’s was very interesting, as his focus was on testing using improper priors. The other I still have to look at more closely, but it was about Bayes factor methods that try to get around the problems associated with improper priors and testing.
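The core problem those papers wrestle with is easy to demonstrate numerically: with an improper (or arbitrarily diffuse) prior, the Bayes factor depends on the arbitrary scale of the prior. A minimal sketch of my own (not from either paper): one observation from a unit-variance normal, with a Uniform(-L, L) prior standing in for an increasingly flat prior. The marginal likelihood under H1 has a closed form here, so no integration is needed.

```python
from scipy import stats

x = 1.5  # a single observation from N(theta, 1)

def bf10(L):
    """BF for H1: theta ~ Uniform(-L, L) against H0: theta = 0 (sigma = 1 known)."""
    # marginal likelihood under H1: average of the likelihood over the prior
    m1 = (stats.norm.cdf(x + L) - stats.norm.cdf(x - L)) / (2 * L)
    m0 = stats.norm.pdf(x)  # likelihood at the point null
    return m1 / m0

for L in (1, 10, 100, 1000):
    print(L, bf10(L))  # the evidence for H1 sinks toward 0 as the prior flattens
```

As L grows without bound the Bayes factor goes to zero no matter what the data say, which is exactly why improper priors and testing don’t mix.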


Did a share of the CI blog with a little context.

Also reading a memoir of Jeffreys, written by Alan Cook right after Jeffreys died. It’s a really long memoir, so I skipped a lot of the mathy/physicsy stuff.

Micah Allen shared an interesting post of his, where he looked at how many clicks he got on paper shares on Twitter.

Working on the PB&R manuscript for about 4 hours this morning. It is looking good, I think.


I created a profile on Cross Validated today and started answering stuff. Mainly just basic Bayes questions. I spent a lot of time today reading through threads in the Bayesian, hypothesis testing, and other categories. There’s a lot of fun to be had on there I think.

This was an interesting post. How to calculate Bayesian “power”? Christian Robert answers.

This one, asking for clarification about prevalence of type 1 errors in underpowered studies.

Another, asking if it is kosher to refer to results as “nearly significant” or “somewhat significant.”


Doing some more Cross Validated today. Answered this question, about confidence intervals vs. credible intervals. I took a look at some other Stack Exchange forums but none of them are that interesting. Perhaps I’ll just peruse them occasionally.

Downloaded a few papers today. O’Hagan’s 1995 paper on fractional Bayes factors. Berger and Pericchi’s 1996 intrinsic Bayes factor paper. Weiss’s 1997 sample size planning with Bayes paper.

Also reading some of Jeffreys’s book, but oddly enough I find it harder to parse than his papers, which some have characterized as scratch notes that he used to compile the book.

Interesting post from “Bayes Laplace” again. I am not familiar with the methods he mentions in this post; frankly, I know nothing of physics so his example doesn’t help me at all 😛  But the idea of using an easier distribution to approximate the answers from a more complicated distribution reminds me of importance sampling.
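That “easier distribution standing in for a harder one” idea is the mechanism of importance sampling itself. A minimal sketch of my own (not from his post): estimating a normal tail probability by drawing from an easy proposal centered in the tail and reweighting each draw by target density over proposal density.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n = 100_000

# Target quantity: P(X > 3) for X ~ N(0, 1). Naive Monte Carlo almost never
# lands out there, so draw from a proposal centered at 3 and reweight.
z = rng.normal(loc=3.0, size=n)
weights = stats.norm.pdf(z) / stats.norm.pdf(z, loc=3.0)  # target / proposal
estimate = np.mean((z > 3) * weights)

print(estimate, 1 - stats.norm.cdf(3))  # the two should nearly match
```

The reweighting corrects for having sampled from the wrong distribution, and because the proposal puts its mass where the action is, the estimate is far more precise than brute-force simulation at the same sample size.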


Here’s a question on Cross Validated that I am considering answering. Not quite sure if I want to go for this one though since I’d probably have to write quite a bit to cover the things I’d want to say.

The comments section on Gelman’s blog is better than the blog itself. This comment on objective and subjective Bayes labels is so good.

This was a good zinger from Bayes Laplace on Twitter: “A Bayesian is one who, catching a glimpse of a donkey, says ‘hello Frequentist’”

And here is an example from Cross Validated of the most bone-headed statistical analysis ever.


Doing some rewriting. We found out our RPP reanalysis got rejected before Thanksgiving, so we were thinking about what to do next. We got rejected because (ironically) the reviewers didn’t find the results novel enough. We did get some good feedback to incorporate, along with some pretty stupid feedback from one reviewer, but that’s how it goes. It’s surely partially our fault in that we were not crystal clear on a few things. The plan is to tidy up the manuscript and resubmit somewhere else that isn’t so hell-bent on novel results (it’s a reanalysis, how can it be expected to be novel?). The editor did say they would reconsider our paper if we majorly rewrote it with some other, newer focus, but rather than rewrite the whole paper we will just submit this one elsewhere and write a new paper with a different focus.

Our Bayes factor history paper got desk-rejected 😦 so we resubmitted somewhere else. The editor said it wasn’t a good fit for the journal. Whatever! At least he didn’t indict the quality of the manuscript!

Also reading the book compiled for Dennis Lindley’s 90th birthday. Interesting to read about how influential he was. Everyone knows he was incredibly influential on the field of Bayesian statistics broadly, but the book contains some great stories of how he influenced so many individuals.


Reading the rest of O’Hagan (started yesterday) and the commentaries. Lindley’s is fantastic. His argument against improper priors is on point. Adrian Smith’s was great too. He used a phrase I might steal for the paper Joachim and I will write next, “(through ignorance or indolence)”, which I think is just great. This is a lovely quote from Lavine and Wolpert’s reply, “When comparing models the investigator simply cannot shirk the responsibility of specifying the prior distribution (and hence the alternative hypothesis) in more detail.”

Really cool applied blog post from Gelman today.


A really neat graphic from the New York Times today. I love graphics that you can interact with that change and morph in real time.

A timely post from Stephen Heard on paper rejections. Given I’ve just had two rejections, it was a good chance for me to reflect. I think, based on his reasonable arguments, it is right for us not to appeal either rejection.

Allen Downey posts on how he breaks the rules in statistics. He comments, “If we were really not allowed to say anything about H, significance testing would be completely useless” and I say, yes, that’s the case. He also says, “a small p-value indicates that the observed data are unlikely under the null hypothesis. Assuming that they are more likely under H (which is almost always the case), you can conclude that the data are evidence in favor of H and against H0.” That’s a bit of an assumption, isn’t it? Nonetheless, when you claim p-values work because they approximate what a Bayes factor does, that isn’t an argument for their use but for the Bayes factor’s.

An exciting update from the JASP team: Annotations are in! OSF integration is in! Binomial example is in! I doubt all that many people will watch the video to the end to see the sad news so I’ll print it here: Jonathon is leaving the team 😦 And Damo 😦 I already knew from being with them in Amsterdam last month, but still sad 😦

Interesting article from the RSS “StatsLife” news people, about the need for better use of statistics in biomed research. “ctrl+F  ‘bayes’ = no results” makes me sad.

Interesting piece from Miriam Posner (with Deb Verhoeven) about teaching technical skills to a group. Lots of good stuff in here. I love the post-it notes idea!


Read a few old papers today. One was Jeffreys from 1942, writing about the theory of probability as it relates to quantum probability. Admittedly, I don’t understand a lot of the commentary because I have approximately zero physics education. But the parts I read were good, and again I find that any time I read Jeffreys I come away with something. He also suggests using the terms “Skewordinates” and “Skewmenta” instead of coordinates and momenta. A shame those didn’t catch on.


Doing some writing today on the RPP paper. Nothing too exciting, and not reading much. Went to the UT basketball game though 🙂


(a little) more writing! And also reading over the chapter on model comparison that Joachim, Dora and EJ wrote. It’s a great piece and I’m glad we decided to include it in our PB&R paper.

John Kruschke wrote a new post about how the conclusions we draw shouldn’t be based solely on the Bayes factor. I’ve written similar stuff. He also goes on about modeling the uncertainty in prior model probabilities, which is something to think about I guess.


This was a weird post. What to bring when traveling as an academic? Admittedly a timely post for me, but a little TMI in there.

Joachim and I are finishing up our paper revisions (at least it feels like we are almost finished). We’ve added a bit of substance to the paper, which makes it feel like it has some more body. So I guess the comments we got from the reviewers did help us improve the paper, even if a few comments didn’t make any sense.

Nature Communications is implementing a really really cool policy starting next month. They are going to start publishing the peer review and editorial correspondence alongside accepted submissions. Of course, authors can opt out but I hope setting it as a default makes people more willing to let these things go out with the paper. I’m pretty psyched to see this happen at such a big journal.

I think I want to get more into bayesian model averaging. It’s super cool.


The manybabies project has been announced! An initiative desperately needed in a field which has tremendous difficulty recruiting participants. Babies are notoriously hard to get into the lab, and even when they come in they are often too fussy to do the study. Props to Michael Frank and whoever joins the project.

No word yet on our origin BF resubmission. Hopefully it goes out for review and isn’t desk rejected again!


Stephen Heard posting on peer review again, this time discussing post-publication peer review. I like his discussions on peer review because they’ve challenged me to stop being so black and white about it. I think he has done a good job convincing me that there is merit to the peer review process even if sometimes there are a few bad apples.

An interesting discussion on Jeremy Fox’s blog about the utility of the R² metric. Is it a zombie statistic, or does it have its use? He summarizes some of Cosma Shalizi’s arguments against R² and concludes that he isn’t ready to throw it in the garbage just yet. I don’t think about R² very much, so I don’t have a very informed opinion about it. One of the complaints raised is that one cannot compare R² from the same model across different datasets. Another is that R² says nothing about the precision of the model’s predictions. Interesting discussion in the comments too!
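One way to see the “can’t compare across datasets” complaint concretely: R² depends on the spread of the predictor, not just on how well the model fits. A quick sketch of my own (same true slope, same noise level, only the range of x differs):

```python
import numpy as np

rng = np.random.default_rng(0)

def r_squared(x):
    y = 2 * x + rng.normal(scale=1.0, size=x.size)  # identical model and noise
    yhat = np.polyval(np.polyfit(x, y, 1), x)       # ordinary least squares fit
    return 1 - np.sum((y - yhat) ** 2) / np.sum((y - y.mean()) ** 2)

narrow = r_squared(rng.uniform(0, 1, 500))   # predictor confined to a narrow range
wide = r_squared(rng.uniform(0, 10, 500))    # same model, much wider range
print(narrow, wide)  # wildly different R-squared for the exact same model
```

The model and the residual noise are identical in both runs; only the variance of x changes, yet one R² looks mediocre and the other looks superb.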


I came across a very interesting paper from Jan Sprenger (twitter discussion). A lot of the paper is stuff I already knew, but section 4 was pretty insightful. Here is the gist of it: when you sample until a certain BF is achieved, you can bias your effect size estimates. When you stop while your sample is small, the bias is toward larger estimates, and when you stop when the sample is large, the bias is toward smaller estimates. Now, this I knew already. What I didn’t really see before is that when you stop in a small sample, your credible interval will be wide, indicating much uncertainty in the estimate. So if you take the posterior mode as your estimate, it will be systematically larger in the small samples in which you stop at BF>X. But when you take into account all of the information in the posterior, you will see clearly that the estimate comes with a large amount of uncertainty.
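The section-4 point is easy to reproduce in simulation. A rough sketch using my own simplified setup, not anything from the paper itself: normal data with known variance, point null versus a standard normal prior (so the BF has a closed form), peeking every 10 observations and stopping at BF > 10 or BF < 1/10. Among the runs that stop in favor of H1, the early stops carry systematically inflated estimates.

```python
import numpy as np

rng = np.random.default_rng(7)

def bf10(xbar, n):
    # data N(theta, 1); H0: theta = 0 vs H1: theta ~ N(0, 1)
    v0, v1 = 1 / n, 1 + 1 / n  # sampling variance of xbar under H0 and H1
    return np.sqrt(v0 / v1) * np.exp(xbar**2 / 2 * (1 / v0 - 1 / v1))

theta = 0.3  # the true effect
stopped = []
for _ in range(1000):
    x = rng.normal(theta, 1.0, size=500)
    for n in range(10, 501, 10):       # peek every 10 observations
        xbar = x[:n].mean()
        b = bf10(xbar, n)
        if b > 10 or b < 0.1:
            break
    if b > 10:                          # keep only runs stopping in favor of H1
        stopped.append((n, n * xbar / (n + 1)))  # posterior mean of theta

stopped = np.array(stopped)
cut = np.median(stopped[:, 0])
early = stopped[stopped[:, 0] <= cut, 1].mean()
late = stopped[stopped[:, 0] > cut, 1].mean()
print(early, late)  # early stops overshoot theta = 0.3; late stops sit near it
```

The intuition matches the paper’s: to clear a BF threshold with only a handful of observations, the sample mean has to be extreme, so conditioning on an early stop selects for overestimates.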


Doing some more reading on model averaging. There isn’t much out there on this that is related specifically to psychology problems. Perhaps something to make happen 🙂

Also reading a few papers related to planning sample sizes using estimates from earlier studies. Korn has a couple of papers from 1990 (“Projection from previous studies” and “Projecting power from a previous study”) that are on this topic, and there is also a paper by Kraemer and colleagues from 2006 (“Caution regarding the use of pilot studies”).


Started writing a blog post but stopped half way through since it felt like it wasn’t really going anywhere. Perhaps I will come back to it in a few weeks and will be able to find some direction.

Reading over Robert’s commentary on Alexander Ly’s Harold Jeffreys Bayes factor paper. Robert proposes that the Bayes factor is dead and advocates a mixture estimation approach instead.


Holiday scrambling, not much stats going on for me at the moment.

This really cool post explaining probability distributions from the Cloudera Blog.


Doing some linear algebra study today. That little green book I picked up at psychonomics turned out to be pretty good.


Break for Christmas holidays 🙂


Found a neat summary of empirical Bayes by Deely and Lindley from 1981. It is oddly titled “Bayes Empirical Bayes”. Double Bayes? Anyway. They show that empirical Bayes is not really Bayesian in most cases, and only corresponds to certain real Bayesian procedures in special circumstances. On Google Scholar this paper has many, many cites, so I guess it’s fairly well-known, but I hadn’t seen it before.

Also, this piece on using Bayesian analysis to reduce necessary sample sizes for clinical research is interesting. I for one did not know that there was an OpenBUGS interface through Microsoft Excel, and am sad to hear that people use Excel to run BUGS.


Here is a paper on conjugate distributions by George, Makov, and Smith from 1993. This is a pretty nice outline of what conjugate distributions are, how they work, and it gives a neat example starting on page 154 (Section 4). Conjugate distributions are mostly a crutch from the past, since now one can use any distribution and just run MCMC to get the results. But they are still very useful for illustrative purposes (blogs, tutorials, etc.). They are a very natural way to explain Bayesian updating, since all one needs to do is some basic algebra to change the prior parameter values into posterior values.
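That updating algebra in the simplest conjugate case, a Beta prior with Binomial data (my own toy numbers, not the paper’s example):

```python
# Conjugacy: Beta(a, b) prior on a rate + Binomial data
#            -> Beta(a + successes, b + failures) posterior.
a, b = 1, 1                          # Beta(1, 1): a uniform prior on the rate
successes, failures = 7, 3           # observed data
a_post, b_post = a + successes, b + failures  # the entire "computation"
posterior_mean = a_post / (a_post + b_post)
print(a_post, b_post, posterior_mean)  # Beta(8, 4), mean 2/3
```

No integration, no MCMC; the prior and posterior live in the same family, so updating is literally just adding the counts to the prior parameters.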

An interesting preprint that purports to show how one can determine model priors in linear regression based in part on the number of covariates included in the model and the loss associated with excluding a model from the set under consideration. They claim to be able to outperform uniform priors on models, which is no surprise since in general we can bring information to a problem based on how the problem is formulated.


“Bayes Laplace” has another snarky blog post, this time titled, “Statistics is in a sorry state”. He comes to this conclusion based on an exchange on Deborah Mayo’s blog about measurements of the speed of light. If you can manage to get past the snark you might find an interesting post underneath.

Aaron Clauset’s year in review is certified insane. How does one do that much? I wonder if I’ll ever be half that busy.


Here is a long, long list of discussions of the frequentist properties and performance (i.e., “consistency”) of Bayesian procedures. It isn’t clear to me who compiled it. There is also this huge list of similar notebooks. For example, here is one on model selection broadly defined.

I tweeted my blog year in review. Overall I am pretty happy with what I’ve done on here this year. I grew my audience substantially, I’ve tackled practical posts, theoretical posts, and I think I have some good ideas for going forward.


Happy Holidays 🙂

“Bayes Laplace” writes today about how science can permanently fail. The post is written as a parable, intended to illustrate how frequentist statisticians have created a tower of adhockery around their methods that must be broken down lest it forever poison the minds of scientists. A bit melodramatic but a fun post to read. I think it does capture some of the spirit of modern frequentist statistical apologists, who create all these different “corrections” in order to find a sensible (i.e., approximately Bayesian) answer to their problem. Better to just use the Bayesian framework outright so that you don’t have to fool yourself anymore.