This document serves as an annotated bibliography of papers and other resources that I’ve found to be useful as an author and reviewer. The entries are loosely organized by topic.
I have also included selected slide decks for methods talks that I’ve given since September 1, 2020, and a link to Comptrasts, a web application for conducting custom contrast tests.
This started as a personal resource used to respond quickly to common stats questions. Please note that this list is not meant to be exhaustive in any way. If you have any additional suggestions for inclusion, please send them my way. I hope you find this helpful. Lastly, any errors are my own.
January 2023 - AAA MAS PhD Colloquium - Theory Testing and Process Evidence: Methods for Understanding How, Why, and When Causes Lead to Effects
March 2022 - Georgia Tech PhD Seminar - What’s going on with your data? Common Pitfalls and a Few Solutions
December 2021 - Junior Accounting Scholars Organization (JASO) - How and Why? Using Experiments to Test Process Theories
December 2021 - UMass Lowell Manning School of Business - How and Why? Using Experiments to Test Process Theories
October 2020 - AAA ABO PhD Consortium - Research Methods for the 2020s
This is a great reference for basic graduate-level experimental statistics. Covers most first-year stats topics in an approachable way. Includes material on deriving Expected Mean Squares, which is not always found in texts like this.
Whenever I don’t know where to find something, I turn to this book first. This is a great reference for many analysis questions that don’t come up often, but need to be considered.
This text is a reference for linear modeling that is fairly heavy on linear algebra, but does a great job of helping you understand how linear modeling works from an “under the hood” perspective. Keep in mind that if you aren’t comfortable with linear algebra and/or matrix notation, you may want to review that first.
This book is positioned as a reference for ANCOVA, which it does well, but it also has a lot of good information about ANOVA, effect sizes, comparisons between the Fisher and GLM approaches to covariance analysis, and even longitudinal designs. It seems to be frequently cited as a reference for papers in accounting that use an EE-based approach.
I recommend that anybody who designs experiments read (and reference) this. This is a very detailed look at experimental and quasi-experimental designs and the threats to validity that come from these design choices.
This text also provides excellent checklists of threats to validity that should be considered when designing experiments.
An interesting discussion about whether manipulation checks do more harm than good. A good argument for not including manipulation checks when they aren’t really needed.
Classic papers related to experimental design in accounting. The early 1980s articles / book chapters are more difficult to track down, but well worth the effort - especially to obtain a high-level understanding of “early work” in experimental accounting research. Focused on financial and auditing research, but applicable to other topical areas.
This discusses the strengths and weaknesses of different types of research paradigms and provides advice for approaching the data gathering process.
A pretty scathing assessment of null hypothesis significance testing (NHST) and how its prevalence has affected research in many fields, especially social psychology. Interesting, but somewhat depressing, read.
A working paper that discusses the prevalence of process theory testing in accounting. The paper discusses how research design and analysis choices can affect the ability of researchers to draw strong causal inferences. Moderation, mediation, and multiple experiments-style approaches are discussed.
Paper on common methods bias that suggests some ways to control for these sources of bias. Certainly relevant to accounting research, as we commonly shy away from multi-method work.
This is another paper that should be a must-read. Does an excellent job of discussing what makes something theory, and which of the things we call theory may not be.
Gives a good overview of the pros and cons of between- vs. within-subjects designs. The paper is written from an econ perspective and is very approachable.
Whether you’re an R user or not, this text is a great reference for establishing a data science workflow. Having a data science workflow that is reproducible is useful for saving time and preserving accuracy when describing experimental results.
This paper makes a strong case for the importance of descriptive work and data visualization, in general.
Good focused text on ANOVA and ANCOVA from a GLM approach. Lots of good examples that can be worked through.
This paper makes the case for why ANOVA is incredibly important for experimental analysis. Gelman also describes how ANOVA and regression are not equivalent, contrary to what is taught in many stats courses. Finally, Gelman makes the case for thinking about ANOVA in a hierarchical modeling framework.
This paper provides a technique for reproducing an ANOVA table from summary statistics. This can be implemented easily in Excel and can be really useful when doing reviews. Note that there does appear to be a typo in the paper that makes it hard to double-check your work. Happy to help if you run into that issue.
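For a sense of the arithmetic involved, here is a minimal one-way sketch in Python (rather than Excel); the cell means, SDs, and ns are invented for illustration.

```python
# Rebuild a one-way ANOVA table from reported summary statistics.
# All cell statistics below are hypothetical.
import numpy as np
from scipy import stats

means = np.array([4.2, 5.1, 5.8])  # reported cell means (invented)
sds   = np.array([1.1, 1.3, 1.2])  # reported cell SDs (invented)
ns    = np.array([30, 28, 31])     # reported cell sizes (invented)

N, k = ns.sum(), len(means)
grand_mean = (ns * means).sum() / N

ss_between = (ns * (means - grand_mean) ** 2).sum()
ss_within = ((ns - 1) * sds ** 2).sum()
df_between, df_within = k - 1, N - k

F = (ss_between / df_between) / (ss_within / df_within)
p = stats.f.sf(F, df_between, df_within)
print(f"F({df_between}, {df_within}) = {F:.2f}, p = {p:.4f}")
```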
These papers all address methodological concerns related to ANCOVA. Missteps in ANCOVA can have serious effects on Type I error with the types of designs that are common in experimental accounting research. If you’re considering using an ANCOVA, this is something you’ll want to take a close look at.
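One concrete example of the kind of misstep these papers worry about is fitting an ANCOVA without checking the homogeneity-of-slopes assumption. A minimal sketch of that check on simulated data, using statsmodels:

```python
# Check homogeneity of regression slopes before trusting an ANCOVA by
# testing the treatment-by-covariate interaction. Data are simulated.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

rng = np.random.default_rng(9)
n = 200
df = pd.DataFrame({"treat": rng.integers(0, 2, size=n),
                   "cov": rng.normal(size=n)})
df["y"] = 0.5 * df["treat"] + 0.4 * df["cov"] + rng.normal(size=n)

# A significant C(treat):cov term would flag heterogeneous slopes,
# which undermines the usual ANCOVA interpretation.
slopes_check = smf.ols("y ~ C(treat) * cov", data=df).fit()
print(anova_lm(slopes_check, typ=2))

# If slopes look homogeneous, fit the standard ANCOVA.
ancova = smf.ols("y ~ C(treat) + cov", data=df).fit()
print(anova_lm(ancova, typ=2))
```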
These are all resources for information about contrast testing. The Shieh paper talks about contrasts in ANCOVA, but is restricted to 1-way ANCOVA designs.
This is one of my favorite papers. It’s so tempting for us to heuristically compare two p-values and conclude more than we should. This is exactly why we can’t use simple effects to demonstrate a significant interaction.
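The point is easy to see numerically. In the sketch below (all estimates and standard errors invented), one simple effect is “significant” and the other is not, yet the test of their difference, which is what an interaction claim actually requires, is nowhere near significant.

```python
# Comparing two p-values is not a test of their difference.
import numpy as np
from scipy import stats

eff_a, se_a = 25.0, 10.0  # simple effect in condition A (invented)
eff_b, se_b = 10.0, 10.0  # simple effect in condition B (invented)

p_a = 2 * stats.norm.sf(abs(eff_a / se_a))  # ~.012 -> "significant"
p_b = 2 * stats.norm.sf(abs(eff_b / se_b))  # ~.317 -> "not significant"

# The interaction is a test of the *difference* between the effects.
diff = eff_a - eff_b
se_diff = np.sqrt(se_a ** 2 + se_b ** 2)
p_diff = 2 * stats.norm.sf(abs(diff / se_diff))  # ~.29 -> n.s.

print(f"p_A = {p_a:.3f}, p_B = {p_b:.3f}, p_difference = {p_diff:.3f}")
```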
These papers are all about t-tests and other simple comparisons. The main takeaway is that we should probably all be using Welch’s t instead of the standard Student’s t-test. There are also some papers here that discuss other considerations for what we think of as really simple tests.
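In SciPy the switch to Welch’s t is a single argument; a minimal sketch with simulated unequal-variance groups:

```python
# Student's t vs. Welch's t in SciPy; data simulated for illustration.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
g1 = rng.normal(0.0, 1.0, size=40)  # smaller variance, larger n
g2 = rng.normal(0.5, 3.0, size=25)  # larger variance, smaller n

print(stats.ttest_ind(g1, g2))                   # Student's t (assumes equal variances)
print(stats.ttest_ind(g1, g2, equal_var=False))  # Welch's t
```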
These papers make the argument that Linear Mixed Effects Models (LMeMs) are more appropriate than Repeated Measures ANOVA for hypothesis testing. I don’t disagree with the sentiment, but also note that many of the designs that are common to experimental accounting research would generate mathematically equivalent results. (Thanks to Amanda Carlson for pointing these out to me.)
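That equivalence point can be checked directly. A minimal sketch, assuming a simple balanced one-within-factor design with simulated data, fitting the same data as a repeated-measures ANOVA and as an LMeM with a random subject intercept in statsmodels:

```python
# RM-ANOVA vs. a random-intercept LMeM on the same simulated data.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import AnovaRM

rng = np.random.default_rng(7)
rows = []
for s in range(30):
    subj_eff = rng.normal(0, 1.0)  # random subject intercept
    for cond in ["low", "high"]:
        y = (0.5 if cond == "high" else 0.0) + subj_eff + rng.normal()
        rows.append({"subject": s, "cond": cond, "y": y})
df = pd.DataFrame(rows)

rm = AnovaRM(df, depvar="y", subject="subject", within=["cond"]).fit()
print(rm.anova_table)

lmm = smf.mixedlm("y ~ cond", data=df, groups=df["subject"]).fit()
print(lmm.summary())
```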
This paper lays out some very common missteps made when using covariates - including why just “throwing” a covariate into an ANOVA doesn’t really answer the question that most people want to know when asking if a researcher has “controlled for X”. As of the time of this writing, the paper is forthcoming in AJPT.
This paper shows analytically and through simulation that OLS regression is more appropriate than logistic regression for a binary outcome variable when fixed-effects IVs are being used. This is the case much of the time in accounting, so it’s very applicable!
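A minimal sketch of the two specifications side by side (a linear probability model via OLS vs. a logit), with simulated data and invented effect sizes:

```python
# OLS (linear probability model) vs. logistic regression for a binary DV
# with categorical IVs. Data and effects are simulated.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(3)
df = pd.DataFrame({"treat": rng.integers(0, 2, size=400),
                   "frame": rng.integers(0, 2, size=400)})
df["y"] = rng.binomial(1, 0.3 + 0.2 * df["treat"] + 0.1 * df["frame"])

lpm = smf.ols("y ~ C(treat) * C(frame)", data=df).fit(cov_type="HC1")
logit = smf.logit("y ~ C(treat) * C(frame)", data=df).fit()
print(lpm.params)    # differences in cell probabilities, directly readable
print(logit.params)  # log-odds, which need transforming to compare
```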
This paper is from the political econ literature and is mostly concerned with observational data, but does a great job of explaining the effects of model misspecification.
These papers examine the performance of rank-transformed vs. parametric analyses in the context of ANOVA.
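As context for what those papers evaluate, here is a minimal sketch of the rank-transform approach itself (rank the DV across the full sample, then run the ordinary factorial ANOVA on the ranks), on simulated skewed data:

```python
# Rank-transform analysis: ANOVA on ranks of the DV. Data are simulated.
import numpy as np
import pandas as pd
from scipy.stats import rankdata
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

rng = np.random.default_rng(8)
df = pd.DataFrame({"a": rng.integers(0, 2, size=120),
                   "b": rng.integers(0, 2, size=120)})
df["y"] = 0.4 * df["a"] + rng.exponential(size=120)  # skewed errors
df["y_rank"] = rankdata(df["y"])                     # rank over full sample

print(anova_lm(smf.ols("y_rank ~ C(a) * C(b)", data=df).fit(), typ=2))
```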
A reference for the Jonckheere-Terpstra test for ordered cell means.
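To my knowledge SciPy does not ship a Jonckheere-Terpstra test, so here is a minimal implementation sketch using the usual normal approximation (no tie correction); the cell data are invented.

```python
# Jonckheere-Terpstra test for an ordered alternative across k groups.
# Groups must be supplied in their hypothesized (increasing) order.
import numpy as np
from scipy import stats

def jonckheere_terpstra(*groups):
    """Return (J, z, one-sided p) using the normal approximation."""
    # J is the sum of Mann-Whitney counts over all ordered group pairs.
    J = 0.0
    for i in range(len(groups)):
        for j in range(i + 1, len(groups)):
            for x in groups[i]:
                J += np.sum(groups[j] > x) + 0.5 * np.sum(groups[j] == x)
    ns = np.array([len(g) for g in groups])
    N = ns.sum()
    mean_J = (N ** 2 - np.sum(ns ** 2)) / 4.0
    var_J = (N ** 2 * (2 * N + 3) - np.sum(ns ** 2 * (2 * ns + 3))) / 72.0
    z = (J - mean_J) / np.sqrt(var_J)
    return J, z, stats.norm.sf(z)

# Hypothetical cell data ordered low < medium < high:
low = np.array([3.1, 2.8, 3.5, 2.9])
med = np.array([3.4, 3.9, 3.2, 4.1])
high = np.array([4.0, 4.4, 3.8, 4.6])
print(jonckheere_terpstra(low, med, high))
```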
This is the main reference for PROCESS analysis and covers everything from simple mediation to more complex conditional process analysis designs. Chapter 14 answers many very common questions and provides a good framework for approaching these analyses.
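PROCESS itself runs as a macro in SPSS, SAS, and R, but the core of its simplest case (Model 4) is easy to see in a minimal percentile-bootstrap sketch of the indirect effect a×b; the data, effect sizes, and bootstrap settings below are invented.

```python
# A percentile-bootstrap indirect effect for simple mediation (the
# design PROCESS calls Model 4). Data and effect sizes are invented.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(11)
n = 200
x = rng.integers(0, 2, size=n).astype(float)  # manipulated IV
m = 0.5 * x + rng.normal(size=n)              # mediator
y = 0.4 * m + 0.1 * x + rng.normal(size=n)    # DV

def indirect(x, m, y):
    """a*b: (X -> M path) times (M -> Y path, controlling for X)."""
    a = sm.OLS(m, sm.add_constant(x)).fit().params[1]
    b = sm.OLS(y, sm.add_constant(np.column_stack([x, m]))).fit().params[2]
    return a * b

boot = []
for _ in range(5000):
    idx = rng.integers(0, n, size=n)  # resample cases with replacement
    boot.append(indirect(x[idx], m[idx], y[idx]))
lo, hi = np.percentile(boot, [2.5, 97.5])
print(f"indirect = {indirect(x, m, y):.3f}, 95% bootstrap CI = [{lo:.3f}, {hi:.3f}]")
```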
This paper discusses the choice of using SEM vs. PROCESS analysis, especially with respect to models without latent variable measurement. The takeaway is that for many of the models we look at in accounting research (simple mediation, no measurement model), SEM and PROCESS lead to very similar results.
This paper extends PROCESS to account for multi-categorical IVs and continuous moderators using Johnson-Neyman regression. This paper is important when using PROCESS with more complex designs.
This paper addresses the problem that occurs when manifest (not latent) variables are used in path analysis. More specifically, even small amounts of measurement error can lead to problems with both Type I and Type II error, as well as biased path coefficients.
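A quick simulation makes the problem concrete: in the sketch below (all effect sizes and the error level invented), measurement error on the mediator attenuates the b path and produces a spurious direct effect of X even though the true direct effect is zero.

```python
# Measurement error on a manifest mediator biases both estimated paths.
# All parameters are invented for illustration.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(6)
n, reps, b_true = 300, 2000, 0.5
est_direct, est_b = [], []
for _ in range(reps):
    x = rng.normal(size=n)
    m = 0.5 * x + rng.normal(size=n)         # true mediator
    y = b_true * m + rng.normal(size=n)      # NO direct effect of x
    m_obs = m + rng.normal(0, 0.7, size=n)   # mediator measured with error
    fit = sm.OLS(y, sm.add_constant(np.column_stack([x, m_obs]))).fit()
    est_direct.append(fit.params[1])         # should be ~0, but is not
    est_b.append(fit.params[2])              # attenuated toward zero

print(f"mean direct-path estimate: {np.mean(est_direct):.3f} (true 0)")
print(f"mean b-path estimate: {np.mean(est_b):.3f} (true {b_true})")
```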
This paper discusses mediation, confounding, and suppression - all of which may occur when a researcher is looking to examine processes underlying causal relationships. The paper is premised on a Baron and Kenny (1986) sequential regression approach, but the concepts here are still important to understand.
This paper calls into question Hayes’ claim that the index approach (versus the components approach) of demonstrating mediation is superior. Yzerbyt argues that common implementations of the index approach increase Type I error.
These papers all discuss fit indices. The Hu and Bentler articles are commonly cited “cutoffs” for good fit in SEM models. However, Marsh et al. (2004) argues that these are mostly arbitrary and some of the assumptions they were built from are tenuous at best.
Classic articles that discuss Sobel tests (which have largely been abandoned in our literature).
This paper is Amanda Montoya’s dissertation. The paper goes into more detail about expanding Johnson-Neyman to categorical IVs. Should be looked at in concert with Hayes and Montoya (2017).
Large systematic review of the use of Bayesian analysis in psychology.
This pair of papers was instrumental in bringing the NHST vs. BDA debate into mainstream cognitive psychology. The Bem (2011) paper presents evidence that future events may retroactively affect responses (Psi). Wagenmakers et al. (2011) argue that Bem’s (2011) results are more evidence that our analysis techniques need to change than evidence that pre-cognition exists.
This paper presents an overview of the relationships between different univariate distributions. Knowledge of these relationships can be very useful when conducting BDA.
This is a very approachable, but useful, textbook on conducting BDA from a psych methods perspective. The book comes with code examples in BUGS, JAGS, and STAN that can be adapted to suit many experimental designs. If you read a couple of the introductory BDA articles and want to know more, this is a good place to look.
This is a classic reference and textbook on BDA. The good news is that it is detailed and covers many nuances of BDA. The bad news is that it may or may not be approachable for those who are not very mathematically inclined. Highly recommend, but know that it may be a difficult read depending on your background.
Very approachable article that discusses how to use BDA to shed more light on (unexpected) null results. Applicable to our work as we sometimes see papers that argue that a null is due to a lack of power or that a null is due to the absence of an effect. This paper discusses both of those scenarios as well as the detection of additivity vs. interaction.
Discussion of ROPE (Region of Practical Equivalence) analysis with respect to Bayesian parameter estimation. Straightforward discussion of the ROPE decision rule process.
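A minimal sketch of the decision rule given posterior draws of a parameter; the ROPE bounds and the mock “posterior” below are invented.

```python
# ROPE decision rule: compare the 95% HDI of the posterior to a
# pre-declared region of practical equivalence.
import numpy as np

def hdi(samples, mass=0.95):
    """Narrowest interval containing `mass` of the posterior draws."""
    s = np.sort(samples)
    n_in = int(np.floor(mass * len(s)))
    widths = s[n_in:] - s[:len(s) - n_in]
    start = np.argmin(widths)
    return s[start], s[start + n_in]

rope = (-0.1, 0.1)  # hypothetical region of practical equivalence
draws = np.random.default_rng(5).normal(0.35, 0.1, size=20_000)  # mock posterior

lo, hi = hdi(draws)
if hi < rope[0] or lo > rope[1]:
    print("HDI entirely outside ROPE: reject the null value")
elif lo >= rope[0] and hi <= rope[1]:
    print("HDI entirely inside ROPE: accept practical equivalence")
else:
    print("HDI overlaps ROPE: withhold a decision")
```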
Excellent intro to BDA techniques for those who haven’t seen much of it.
This article talks about hypothesis testing and confidence (credible) intervals from both a NHST and BDA perspective. This is good for learning how to think about the benefits that BDA can give relative to NHST.
This paper is an introduction to the two paradigms of explicit tests of null hypotheses under BDA (Bayesian parameter estimation and Bayes Factor analysis) and contrasts the two approaches.
This paper provides accounting researchers with an approach to BDA that uses both model comparison and Bayes Factor analysis. The analysis is conducted using JASP and default priors, making it fairly painless for researchers less familiar with coding and syntax-based stats software.
These two papers provide a “user-friendly” introduction to conducting BDAs with the JASP software package.
Provides reporting guidelines for authors presenting BDAs (in psychology).
These papers are all related to Bayes Factor analysis, which provides information about the relative likelihood of two models (one of which can be a null effects model, allowing for explicit null hypothesis testing).
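Dedicated tools (e.g., JASP or the BayesFactor R package) are the usual route, but as a rough illustration of the idea, here is a sketch using Wagenmakers’ (2007) BIC approximation, BF01 ≈ exp((BIC1 − BIC0)/2), on simulated data:

```python
# Approximate Bayes factor for a null vs. effect model via the BIC
# approximation. Data are simulated; effect size is invented.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(2)
df = pd.DataFrame({"cond": rng.integers(0, 2, size=120)})
df["y"] = 0.3 * df["cond"] + rng.normal(size=120)

null_model = smf.ols("y ~ 1", data=df).fit()        # no effect
alt_model = smf.ols("y ~ C(cond)", data=df).fit()   # condition effect

bf01 = np.exp((alt_model.bic - null_model.bic) / 2)  # evidence for the null
print(f"BF01 ~ {bf01:.3f}  (BF10 ~ {1 / bf01:.3f})")
```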
Information on the effects of small data sets on inferences drawn from BDA.
Even more so than the other sections, this section is far from exhaustive. These are papers that I have found useful when thinking about archival analyses (especially as they relate to research questions I am interested in). This is definitely a section where additional suggestions are welcomed.
Discussion of how winsorizing and other techniques used to deal with outliers and other influential observations can affect inference. Advocates for the use of robust regression in these circumstances.
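A minimal sketch contrasting the two approaches on simulated heavy-tailed data (the 1% winsorization level and the Huber weight choice are arbitrary):

```python
# OLS on winsorized data vs. robust (Huber M-estimation) regression on
# the raw data. Data are simulated with heavy-tailed errors.
import numpy as np
import statsmodels.api as sm
from scipy.stats.mstats import winsorize

rng = np.random.default_rng(4)
n = 500
x = rng.normal(size=n)
y = 1.0 + 0.5 * x + rng.standard_t(df=2, size=n)  # heavy-tailed errors

X = sm.add_constant(x)
ols_wins = sm.OLS(np.asarray(winsorize(y, limits=(0.01, 0.01))), X).fit()
robust = sm.RLM(y, X, M=sm.robust.norms.HuberT()).fit()
print(ols_wins.params, robust.params)
```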
Paper that discusses some of the issues that arise when using propensity score matching (PSM) in accounting settings, the rise of this method, and the sensitivity of results to small design choices.
This is a paper that discusses the “fundamental role of theory in drawing causal inferences from empirical evidence.” I don’t know enough about the underlying paper and archival methods to say which conclusion I favor, but the discussion of the role of theory is excellent.