# 17.10: Summary

The first half of this chapter was focused primarily on the theoretical underpinnings of Bayesian statistics. I introduced the mathematics for how Bayesian inference works (Section 17.1), and gave a very basic overview of how Bayesian hypothesis testing is typically done (Section 17.2). Finally, I devoted some space to talking about why I think Bayesian methods are worth using (Section 17.3.

The second half of the chapter was a lot more practical, and focused on tools provided by the BayesFactor package. Specifically, I talked about using the contingencyTableBF() function to do Bayesian analogs of chi-square tests (Section 17.6, the ttestBF() function to do Bayesian t-tests, (Section 17.7), the regressionBF() function to do Bayesian regressions, and finally the anovaBF() function for Bayesian ANOVA.

If you’re interested in learning more about the Bayesian approach, there are many good books you could look into. John Kruschke’s book Doing Bayesian Data Analysis is a pretty good place to start (Kruschke 2011), and is a nice mix of theory and practice. His approach is a little different to the “Bayes factor” approach that I’ve discussed here, so you won’t be covering the same ground. If you’re a cognitive psychologist, you might want to check out Michael Lee and E.J. Wagenmakers’ book Bayesian Cognitive Modeling (Lee and Wagenmakers 2014). I picked these two because I think they’re especially useful for people in my discipline, but there’s a lot of good books out there, so look around!

References

Jeffreys, Harold. 1961. The Theory of Probability. 3rd ed. Oxford.

Kass, Robert E., and Adrian E. Raftery. 1995. “Bayes Factors.” Journal of the American Statistical Association 90: 773–95.

Fisher, R. 1925. Statistical Methods for Research Workers. Edinburgh, UK: Oliver; Boyd.

Johnson, Valen E. 2013. “Revised Standards for Statistical Evidence.” Proceedings of the National Academy of Sciences, no. 48: 19313–7.

Morey, Richard D., and Jeffrey N. Rouder. 2015. BayesFactor: Computation of Bayes Factors for Common Designs. http://CRAN.R-project.org/package=BayesFactor.

Gunel, Erdogan, and James Dickey. 1974. “Bayes Factors for Independence in Contingency Tables.” Biometrika, 545–57.

Kruschke, J. K. 2011. Doing Bayesian Data Analysis: A Tutorial with R and BUGS. Burlington, MA: Academic Press.

Lee, Michael D, and Eric-Jan Wagenmakers. 2014. Bayesian Cognitive Modeling: A Practical Course. Cambridge University Press.

1. http://en.wikiquote.org/wiki/David_Hume
3. It’s a leap of faith, I know, but let’s run with it okay?
4. Um. I hate to bring this up, but some statisticians would object to me using the word “likelihood” here. The problem is that the word “likelihood” has a very specific meaning in frequentist statistics, and it’s not quite the same as what it means in Bayesian statistics. As far as I can tell, Bayesians didn’t originally have any agreed upon name for the likelihood, and so it became common practice for people to use the frequentist terminology. This wouldn’t have been a problem, except for the fact that the way that Bayesians use the word turns out to be quite different to the way frequentists do. This isn’t the place for yet another lengthy history lesson, but to put it crudely: when a Bayesian says “a likelihood function” they’re usually referring one of the rows of the table. When a frequentist says the same thing, they’re referring to the same table, but to them “a likelihood function” almost always refers to one of the columns. This distinction matters in some contexts, but it’s not important for our purposes.
5. If we were being a bit more sophisticated, we could extend the example to accommodate the possibility that I’m lying about the umbrella. But let’s keep things simple, shall we?
6. You might notice that this equation is actually a restatement of the same basic rule I listed at the start of the last section. If you multiply both sides of the equation by P(d), then you get P(d)P(h|d)=P(d,h), which is the rule for how joint probabilities are calculated. So I’m not actually introducing any “new” rules here, I’m just using the same rule in a different way.
7. Obviously, this is a highly simplified story. All the complexity of real life Bayesian hypothesis testing comes down to how you calculate the likelihood P(d|h) when the hypothesis h is a complex and vague thing. I’m not going to talk about those complexities in this book, but I do want to highlight that although this simple story is true as far as it goes, real life is messier than I’m able to cover in an introductory stats textbook.
8. http://www.imdb.com/title/tt0093779/quotes. I should note in passing that I’m not the first person to use this quote to complain about frequentist methods. Rich Morey and colleagues had the idea first. I’m shamelessly stealing it because it’s such an awesome pull quote to use in this context and I refuse to miss any opportunity to quote The Princess Bride.
16. If you’re desperate to know, you can find all the gory details in Gunel and Dickey (1974). However, that’s a pretty technical paper. The help documentation to the contingencyTableBF() gives this explanation: “the argument priorConcentration indexes the expected deviation from the null hypothesis under the alternative, and corresponds to Gunel and Dickey’s (1974) a parameter.” As I write this I’m about halfway through the Gunel and Dickey paper, and I agree that setting a=1 is a pretty sensible default choice, since it corresponds to an assumption that you have very little a priori knowledge about the contingency table.
17. In some of the later examples, you’ll see that this number is not always 0%. This is because the BayesFactor package often has to run some simulations to compute approximate Bayes factors. So the answers you get won’t always be identical when you run the command a second time. That’s why the output of these functions tells you what the margin for error is.
20. Again, in case you care … the null hypothesis here specifies an effect size of 0, since the two means are identical. The alternative hypothesis states that there is an effect, but it doesn’t specify exactly how big the effect will be. The r value here relates to how big the effect is expected to be according to the alternative. You can type ?ttestBF to get more details.
22. I don’t even disagree with them: it’s not at all obvious why a Bayesian ANOVA should reproduce (say) the same set of model comparisons that the Type II testing strategy uses. It’s precisely because of the fact that I haven’t really come to any strong conclusions that I haven’t added anything to the lsr package to make Bayesian Type II tests easier to produce.