Okay, that was… long. And even that listing is massively incomplete. There really are a lot of big ideas in statistics that I haven’t covered in this book. It can seem pretty depressing to finish a 600-page textbook only to be told that this only the beginning, especially when you start to suspect that half of the stuff you’ve been taught is wrong. For instance, there are a lot of people in the field who would strongly argue against the use of the classical ANOVA model, yet I’ve devote two whole chapters to it! Standard ANOVA can be attacked from a Bayesian perspective, or from a robust statistics perspective, or even from a “it’s just plain wrong” perspective (people very frequently use ANOVA when they should actually be using mixed models). So why learn it at all?
As I see it, there are two key arguments. Firstly, there’s the pure pragmatism argument. Rightly or wrongly, ANOVA is widely used. If you want to understand the scientific literature, you need to understand ANOVA. And secondly, there’s the “incremental knowledge” argument. In the same way that it was handy to have seen one-way ANOVA before trying to learn factorial ANOVA, understanding ANOVA is helpful for understanding more advanced tools, because a lot of those tools extend on or modify the basic ANOVA setup in some way. For instance, although mixed models are way more useful than ANOVA and regression, I’ve never heard of anyone learning how mixed models work without first having worked through ANOVA and regression. You have to learn to crawl before you can climb a mountain.
Actually, I want to push this point a bit further. One thing that I’ve done a lot of in this book is talk about fundamentals. I spent a lot of time on probability theory. I talked about the theory of estimation and hypothesis tests in more detail than I needed to. When talking about R, I spent a lot of time talking about how the language works, and talking about things like writing your own scripts, functions and programs. I didn’t just teach you how to draw a histogram using
hist(), I tried to give a basic overview of how the graphics system works. Why did I do all this? Looking back, you might ask whether I really needed to spend all that time talking about what a probability distribution is, or why there was even a section on probability density. If the goal of the book was to teach you how to run a t-test or an ANOVA, was all that really necessary? Or, come to think of it, why bother with R at all? There are lots of free alternatives out there: PSPP, for instance, is an SPSS-like clone that is totally free, has simple “point and click” menus, and can (I think) do every single analysis that I’ve talked about in this book. And you can learn PSPP in about 5 minutes. Was this all just a huge waste of everyone’s time???
The answer, I hope you’ll agree, is no. The goal of an introductory stats is not to teach ANOVA. It’s not to teach t-tests, or regressions, or histograms, or p-values. The goal is to start you on the path towards becoming a skilled data analyst. And in order for you to become a skilled data analyst, you need to be able to do more than ANOVA, more than t-tests, regressions and histograms. You need to be able to think properly about data. You need to be able to learn the more advanced statistical models that I talked about in the last section, and to understand the theory upon which they are based. And you need to have access to software that will let you use those advanced tools. And this is where – in my opinion at least – all that extra time I’ve spent on the fundamentals pays off. If you understand the graphics system in R, then you can draw the plots that you want, not just the canned plots that someone else has built into R for you. If you understand probability theory, you’ll find it much easier to switch from frequentist analyses to Bayesian ones. If you understand the core mechanics of R, you’ll find it much easier to generalise from linear regressions using
lm() to using generalised linear models with
glm() or linear mixed effects models using
lmer(). You’ll even find that a basic knowledge of R will go a long way towards teaching you how to use other statistical programming languages that are based on it. Bayesians frequently rely on tools like WinBUGS and JAGS, which have a number of similarities to R, and can in fact be called from within R. In fact, because R is the “lingua franca of statistics”, what you’ll find is that most ideas in the statistics literature has been implemented somewhere as a package that you can download from CRAN. The same cannot be said for PSPP, or even SPSS.
In short, I think that the big payoff for learning statistics this way is extensibility. For a book that only covers the very basics of data analysis, this book has a massive overhead in terms of learning R, probability theory and so on. There’s a whole lot of other things that it pushes you to learn besides the specific analyses that the book covers. So if your goal had been to learn how to run an ANOVA in the minimum possible time, well, this book wasn’t a good choice. But as I say, I don’t think that is your goal. I think you want to learn how to do data analysis. And if that really is your goal, you want to make sure that the skills you learn in your introductory stats class are naturally and cleanly extensible to the more complicated models that you need in real world data analysis. You want to make sure that you learn to use the same tools that real data analysts use, so that you can learn to do what they do. And so yeah, okay, you’re a beginner right now (or you were when you started this book), but that doesn’t mean you should be given a dumbed-down story, a story in which I don’t tell you about probability density, or a story where I don’t tell you about the nightmare that is factorial ANOVA with unbalanced designs. And it doesn’t mean that you should be given baby toys instead of proper data analysis tools. Beginners aren’t dumb; they just lack knowledge. What you need is not to have the complexities of real world data analysis hidden from from you. What you need are the skills and tools that will let you handle those complexities when they inevitably ambush you in the real world.
And what I hope is that this book – or the finished book that this will one day turn into – is able to help you with that.
Adair, G. 1984. “The Hawthorne Effect: A Reconsideration of the Methodological Artifact.” Journal of Applied Psychology 69: 334–45.
Agresti, A. 1996. An Introduction to Categorical Data Analysis. Hoboken, NJ: Wiley.
———. 2002. Categorical Data Analysis. 2nd ed. Hoboken, NJ: Wiley.
Akaike, H. 1974. “A New Look at the Statistical Model Identification.” IEEE Transactions on Automatic Control 19: 716–23.
Bickel, P. J., E. A. Hammel, and J. W. O’Connell. 1975. “Sex Bias in Graduate Admissions: Data from Berkeley.” Science 187: 398–404.
Box, J. F. 1987. “Guinness, Gosset, Fisher, and Small Samples.” Statistical Science 2: 45–52.
Braun, John, and Duncan J Murdoch. 2007. A First Course in Statistical Programming with R. Cambridge University Press Cambridge.
Brown, M. B., and A. B. Forsythe. 1974. “Robust Tests for Equality of Variances.” Journal of the American Statistical Association 69: 364–67.
Campbell, D. T., and J. C. Stanley. 1963. Experimental and Quasi-Experimental Designs for Research. Boston, MA: Houghton Mifflin.
Cohen, J. 1988. Statistical Power Analysis for the Behavioral Sciences. 2nd ed. Lawrence Erlbaum.
Cook, R. D., and S. Weisberg. 1983. “Diagnostics for Heteroscedasticity in Regression.” Biometrika 70: 1–10.
Cramér, H. 1946. Mathematical Methods of Statistics. Princeton: Princeton University Press.
Dunn, O.J. 1961. “Multiple Comparisons Among Means.” Journal of the American Statistical Association 56: 52–64.
Ellis, P. D. 2010. The Essential Guide to Effect Sizes: Statistical Power, Meta-Analysis, and the Interpretation of Research Results. Cambridge, UK: Cambridge University Press.
Ellman, Michael. 2002. “Soviet Repression Statistics: Some Comments.” Europe-Asia Studies 54 (7). Taylor & Francis: 1151–72.
Evans, J. St. B. T., J. L. Barston, and P. Pollard. 1983. “On the Conflict Between Logic and Belief in Syllogistic Reasoning.” Memory and Cognition 11: 295–306.
Evans, M., N. Hastings, and B. Peacock. 2011. Statistical Distributions (3rd Ed). Wiley.
Fisher, R. A. 1922a. “On the Interpretation of χ2 from Contingency Tables, and the Calculation of p.” Journal of the Royal Statistical Society 84: 87–94.
———. 1922b. “On the Mathematical Foundation of Theoretical Statistics.” Philosophical Transactions of the Royal Society A 222: 309–68.
———. 1925. Statistical Methods for Research Workers. Edinburgh, UK: Oliver; Boyd.
Fox, J., and S. Weisberg. 2011. An R Companion to Applied Regression. 2nd ed. Los Angeles: Sage.
Gelman, A., and H. Stern. 2006. “The Difference Between ‘Significant’ and ‘Not Significant’ Is Not Itself Statistically Significant.” The American Statistician 60: 328–31.
Gunel, Erdogan, and James Dickey. 1974. “Bayes Factors for Independence in Contingency Tables.” Biometrika, 545–57.
Hays, W. L. 1994. Statistics. 5th ed. Fort Worth, TX: Harcourt Brace.
Hedges, L. V. 1981. “Distribution Theory for Glass’s Estimator of Effect Size and Related Estimators.” Journal of Educational Statistics 6: 107–28.
Hedges, L. V., and I. Olkin. 1985. Statistical Methods for Meta-Analysis. New York: Academic Press.
Hogg, R. V., J. V. McKean, and A. T. Craig. 2005. Introduction to Mathematical Statistics. 6th ed. Upper Saddle River, NJ: Pearson.
Holm, S. 1979. “A Simple Sequentially Rejective Multiple Test Procedure.” Scandinavian Journal of Statistics 6: 65–70.
Hothersall, D. 2004. History of Psychology. McGraw-Hill.
Hsu, J. C. 1996. Multiple Comparisons: Theory and Methods. London, UK: Chapman; Hall.
Ioannidis, John P. A. 2005. “Why Most Published Research Findings Are False.” PLoS Med 2 (8). Public Library of Science: 697–701.
Jeffreys, Harold. 1961. The Theory of Probability. 3rd ed. Oxford.
Johnson, Valen E. 2013. “Revised Standards for Statistical Evidence.” Proceedings of the National Academy of Sciences, no. 48: 19313–7.
Kahneman, D., and A. Tversky. 1973. “On the Psychology of Prediction.” Psychological Review 80: 237–51.
Kass, Robert E., and Adrian E. Raftery. 1995. “Bayes Factors.” Journal of the American Statistical Association 90: 773–95.
Keynes, John Maynard. 1923. A Tract on Monetary Reform. London: Macmillan; Company.
Kruschke, J. K. 2011. Doing Bayesian Data Analysis: A Tutorial with R and BUGS. Burlington, MA: Academic Press.
Kruskal, W. H., and W. A. Wallis. 1952. “Use of Ranks in One-Criterion Variance Analysis.” Journal of the American Statistical Association 47: 583–621.
Kühberger, A, A Fritz, and T. Scherndl. 2014. “Publication Bias in Psychology: A Diagnosis Based on the Correlation Between Effect Size and Sample Size.” Public Library of Science One 9: 1–8.
Lee, Michael D, and Eric-Jan Wagenmakers. 2014. Bayesian Cognitive Modeling: A Practical Course. Cambridge University Press.
Lehmann, Erich L. 2011. Fisher, Neyman, and the Creation of Classical Statistics. Springer.
Levene, H. 1960. “Robust Tests for Equality of Variances.” In Contributions to Probability and Statistics: Essays in Honor of Harold Hotelling, edited by I. Olkin et al, 278–92. Palo Alto, CA: Stanford University Press.
Long, J.S., and L.H. Ervin. 2000. “Using Heteroscedasticity Consistent Standard Errors in Thee Linear Regression Model.” The American Statistician 54: 217–24.
Matloff, Norman, and Norman S Matloff. 2011. The Art of R Programming: A Tour of Statistical Software Design. No Starch Press.
McGrath, R. E., and G. J. Meyer. 2006. “When Effect Sizes Disagree: The Case of r and d.” Psychological Methods 11: 386–401.
McNemar, Q. 1947. “Note on the Sampling Error of the Difference Between Correlated Proportions or Percentages.” Psychometrika 12: 153–57.
Meehl, P. H. 1967. “Theory Testing in Psychology and Physics: A Methodological Paradox.” Philosophy of Science 34: 103–15.
Morey, Richard D., and Jeffrey N. Rouder. 2015. BayesFactor: Computation of Bayes Factors for Common Designs. http://CRAN.R-project.org/package=BayesFactor.
Pearson, K. 1900. “On the Criterion That a Given System of Deviations from the Probable in the Case of a Correlated System of Variables Is Such That It Can Be Reasonably Supposed to Have Arisen from Random Sampling.” Philosophical Magazine 50: 157–75.
Pfungst, O. 1911. Clever Hans (the Horse of Mr. von Osten): A Contribution to Experimental Animal and Human Psychology. Translated by C. L. Rahn. New York: Henry Holt.
R Core Team. 2013. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing.
Rosenthal, R. 1966. Experimenter Effects in Behavioral Research. New York: Appleton.
Rouder, J. N., P. L. Speckman, D. Sun, R. D. Morey, and G. Iverson. 2009. “Bayesian T-Tests for Accepting and Rejecting the Null Hypothesis.” Psychonomic Bulletin & Review 16: 225–37.
Sahai, H., and M. I. Ageel. 2000. The Analysis of Variance: Fixed, Random and Mixed Models. Boston: Birkhauser.
Shaffer, J. P. 1995. “Multiple Hypothesis Testing.” Annual Review of Psychology 46: 561–84.
Shapiro, S. S., and M. B. Wilk. 1965. “An Analysis of Variance Test for Normality (Complete Samples).” Biometrika 52: 591–611.
Spector, P. 2008. Data Manipulation with R. New York, NY: Springer.
Stevens, S. S. 1946. “On the Theory of Scales of Measurement.” Science 103: 677–80.
Stigler, S. M. 1986. The History of Statistics. Cambridge, MA: Harvard University Press.
Student, A. 1908. “The Probable Error of a Mean.” Biometrika 6: 1–2.
Teetor, P. 2011. R Cookbook. Sebastopol, CA: O’Reilly.
Welch, B. L. 1947. “The Generalization of ‘Student’s’ Problem When Several Different Population Variances Are Involved.” Biometrika 34: 28–35.
———. 1951. “On the Comparison of Several Mean Values: An Alternative Approach.” Biometrika 38: 330–36.
White, H. 1980. “A Heteroskedasticity-Consistent Covariance Matrix Estimator and a Direct Test for Heteroskedasticity.” Econometrika 48: 817–38.
Yates, F. 1934. “Contingency Tables Involving Small Numbers and the χ2 Test.” Supplement to the Journal of the Royal Statistical Society 1: 217–35.