What is a Bayes Factor Design Analysis?

Well-designed experimental setups maximize efficiency and informativeness of the experiment. This means that they yield conclusive evidence while being efficient in terms of sample size.

In a classical frequentist design, these goals are usually targeted with an a-priori power analysis by controlling long-term error rates and determining the minimum sample size for these error rates. However, this method cannot be applied to Bayesian analyses and falls short in measuring the actual evidential strength that can be expected from a design.

The Bayes Factor Design Analysis (BFDA) was developed by Schönbrodt and Wagenmakers (2017) to tackle these shortfalls. It aims at answering the following two questions:

  1. Which evidence strength can I expect for a specific research design?
  2. Which long-term rates of misleading evidence can I expect for a specific research design?

In a Bayesian framework, these questions can be answered using Monte-Carlo simulations, i.e. assume a population with certain properties, repeatedly (in our case 10,000 times) draw random samples from this population, and compute the analysis for each of the samples. The Bayes factor serves as a measure of evidence strength (you can find a short non-mathematical explanation of the interpretation of Bayes factors here).

As evidence is measured on a continuous scale, evidence thresholds (boundaries) have to be introduced if a binary decision analogous to null hypothesis significance testing should be made. If a Bayes factor is larger than the upper boundary (e.g., larger than 10), it is regarded as compelling evidence for the alternative hypothesis; if a Bayes factor is smaller than the lower boundary (e.g., smaller than 1/10), it is regardedd as compelling evidence for the null hypothesis. Misleading evidence is defined as obtaining a Bayes factor which exceeds the wrong boundary, i.e. is smaller than the lower boundary when H1 is correct or larger than the upper boundary when H0 is correct. The second question can therefore be be reframed as: How often do I obtain BF which exceeds the wrong boundary?

This app is designed to facilitate a Bayes Factor Design Analysis for a simple, but very common case: An independent-group t-test with directional hypotheses.

BFDA with Informed and Default Priors

Bayes factors are the ratio of two marginal likelihoods. You can imagine a marginal likelihood as a weighted sum of probability densities that you obtain for your data under the assumption of different population parameters. The weighting is determined by the prior distributions on parameters.

How do you determine prior distributions? Ask yourself the following question: Which parameter values would I expect if the respective hypothesis is true before conducting the study? For the null hypothesis, there is not much discussion about this issue. Null hypothesis means null effect, so you assign all prior mass to zero (resulting in a point prior on zero).

For the alternative hypothesis, matters become a little more difficult. There is an ongoing debate about how much information should be incorporated in this prior. So-called “objective” Bayesians prefer to incorporate as little information as possible in the prior, which results in the use of very broad prior distributions (Berger, 2006) . A typical prior distribution objective Bayesians use for Bayesian t-tests is a zero-centered Cauchy distribution with a scale of \(\sqrt{2}/2\). In contrast, so-called “subjective” Bayesians prefer to incorporate as much prior knowledge as possible in prior distributions (Fienberg, 2006) . This can mean including knowledge from previous studies or knowledge about restricting conditions of the experimental setup (“I use a 5-point Likert scale therefore the maximum difference between two results cannot exceed 4”). For a more detailed account of the debate between objective and subjective Bayesians, see Dienes (2008) .

In this app, users can choose between two different prior distributions. We will refer to them as the “default” prior and the “informed” prior. As suggested by the name, the “informed” prior follows the subjective Bayesian approach, while the “default” prior is a non-informative prior following the objective Bayesian approach. It is called “default”, because it corresponds to the default settings in the influential BayesFactor R-package. The subjective prior used in this app is an exemplary but in our opinion typical subjective prior developed by Gronau, Ly, and Wagenmakers (2017) for the context of (social) psychology using a prior elicitation technique.

Conducting a Fixed-N Design BFDA

The type of Bayes Factor Design Analysis resembling closest to the typical freqentist power analysis is a fixed-n design BFDA. Here, the sample size is determined before the experiment takes place. Questions you can ask in this kind of design analysis include:

  1. Given a specific sample size and an expected population effect size: What Bayes factors can I expect?
  2. What sample size do I need to have to obtain true positive or true negative evidence with a certain probability?

The part of this app covering the BFDA for the fixed-n design is built along these two questions. To answer the first question, it provides you with distributions of Bayes factors for sample sizes from 10 to 200 and true population effect sizes ranging from \(\delta\) = 0.2 to \(\delta\) = 1.2. You can either access the full distribution or certain characteristics of it (median, characteristic quantiles). You can also get information about the expected rates of false positive and false negative evidence for a certain sample size, (symmetric) boundary, and true population effect. To answer the second question, we provide you with a verbal output which gives you information about the minimum sample size you need per group to obtain a certain evidence strength with a certain probability.

Conducting a Sequential Design BFDA

In a Bayesian setup, it is possible to use designs with optional stopping without tampering the the rates of misleading evidence. These Bayesian designs are called “Sequential Designs” (Sch?nbrodt, Wagenmakers, Zehetleitner, & Perugini, 2015). The idea is to collect data until a certain upper or lower boundary of Bayes factors is reached. Questions you can ask in a design analysis for this kind of design include:

  1. What sample sizes can I expect given a desired level of evidence and the expected effect size?
  2. What is the probability of misleading evidence for certain boundaries?

We answer both questions in the Sequential Design part of the app. First, we give a graphical summary of the sample size distributions for symmetric boundaries from 3 to 30 (equivalent to lower boundaries from 1/3 to 1/30 and upper boundaries from 3 to 30). Like in a boxplot, we use medians as a statistic for the central tendency of the distribution, and the user can choose to fade-in the 25%-75% and the 5%-95% quantiles of the distribution. When users have gained a broad understanding of the relation between sample size and choice of boundary, they can proceed with the detailed analyses in the bottom part of the tab. Here, they can access the exact values of distinct features of the distribution (median, characteristic quantiles) as well as graphic overviews of the design features (violinplots, boxplots, histograms).

Also, the second question of the BFDA sequential designs is answered in this panel. Users can access information on the expected rates of misleading evidence for boundaries ranging from 3 to 30.

Details of the App

The app facilitates a BFDA for an independent-group t-test with directional hypotheses. It relies on the results of Monte Carlo simulations with 10,000 iterations. We computed Bayes factors based on a t-statistic. The t-statistic was obtained using the t.test function from the stats package in R. The default Bayes factor was computed using the ttest.tstat function of the BayesFactor R package. The null interval was set to [0;Inf] and the r scale to “medium”. The informed Bayes factors were computed using the functions created by Gronau, Ly, and Wagenmakers (2017). The informed prior was set to a t-distribution with a non-centrality parameter of 0.35, 3 degrees of freedom, and a scale parameter of 0.102.

Tutorial Paper

For more information on the Bayes Factor Design Analysis, have a look at our tutorial paper.

BFDA R-Package

If the functionality of the BFDA-app does not fit your needs, take a look at our BFDA R-package. You can download it from GitHub on


General Settings for the Fixed N Design

Part 1: What Bayes factors can I expect?

Part 2: What sample size do I need to obtain true positive or true negative evidence with a certain probability?

Step 1: Expected N Given Different Boundaries

Step 2: Selected boundaries and rates of misleading evidence

Select boundary and prior(s)

Download Report for
Sequential Design

Select information to display