Package 'sprtt' reference manual

Title:	Sequential Probability Ratio Tests Toolbox
Description:	A toolbox for Sequential Probability Ratio Tests (SPRT), Wald (1945) <doi:10.2134/agronj1947.00021962003900070011x>. SPRTs are applied to the data during the sampling process, ideally after each observation. At any stage, the test returns a decision to either continue sampling or terminate and accept one of the specified hypotheses. The seq_ttest() function performs one-sample, two-sample, and paired t-tests for testing one- and two-sided hypotheses (Schnuerch & Erdfelder (2019) <doi:10.1037/met0000234>). The seq_anova() function performs a sequential one-way fixed effects ANOVA (Steinhilber et al. (2023) <doi:10.31234/osf.io/m64ne>). Learn more about the package in the vignettes, browseVignettes(package = "sprtt"), or on the website <https://meikesteinhilber.github.io/sprtt/>.
Authors:	Meike Steinhilber [aut, cre], Martin Schnuerch [aut, ths], Anna-Lena Schubert [aut, ths]
Maintainer:	Meike Steinhilber <[email protected]>
License:	AGPL (>= 3)
Version:	0.2.0
Built:	2025-04-02 04:22:29 UTC
Source:	https://github.com/meikesteinhilber/sprtt

Test data to run the examples

Description

A dataset that includes 120 individuals.

Usage

df_cancer

Format

A data frame with 2 variables:

treatment_group
control_group

Test data to run the examples

Description

A dataset that includes 120 individuals with their sex and monthly income.

Usage

df_income

Format

A data frame with 2 variables:

monthly_income
sex

Test data to run the examples

Description

A dataset that includes 120 individuals.

Usage

df_stress

Format

A data frame with 2 variables:

baseline_stress
one_year_stress

Draw Samples from a Gaussian Mixture Distribution

Description

Draws exemplary samples with a certain effect size for the sequential one-way ANOVA or the sequential t-test, see Steinhilber et al. (2023) doi:10.31234/osf.io/m64ne.

Usage

draw_sample_mixture(k_groups, f, max_n, counter_n = 100, verbose = FALSE)

Arguments

`k_groups`	number of groups (levels of factor_A)
`f`	Cohen's f. The simulated effect size.
`max_n`	sample size for the groups (total sample size = max_n*k_groups)
`counter_n`	number of times the function tries to find a possible parameter combination for the distribution. Default value is set to 100.
`verbose`	`TRUE` or `FALSE`. If `TRUE`, prints more information about the internal process of sampling the parameters (the internal counter that was reached, some additional hints, and the drawn parameters for the Gaussian mixture distributions).
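
For readers unfamiliar with the term, the following base R sketch shows what drawing from a two-component Gaussian mixture means in general. It is an illustration only: the component weights, means, and standard deviations are made-up values, and the code is not the algorithm that draw_sample_mixture() uses internally.

```r
# Generic illustration of a Gaussian mixture (assumed values, not sprtt internals)
set.seed(1)
n <- 1000

# Step 1: pick a mixture component for every observation
weights   <- c(0.7, 0.3)   # assumed mixing proportions
component <- sample(1:2, size = n, replace = TRUE, prob = weights)

# Step 2: draw each observation from the normal distribution of its component
means <- c(0, 3)           # assumed component means
sds   <- c(1, 0.5)         # assumed component standard deviations
y <- rnorm(n, mean = means[component], sd = sds[component])

# Share of observations drawn from each component
table(component) / n
```

The resulting vector y is typically non-normal (here bimodal), which is the kind of data such a function can be used to simulate.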

Value

returns a data.frame with the columns y (observations) and x (factor_A).

Examples

set.seed(333)

data <- sprtt::draw_sample_mixture(
  k_groups = 2,
  f = 0.40,
  max_n = 2
)
data

data <- sprtt::draw_sample_mixture(
  k_groups = 4,
  f = 1.2, # very large effect size
  max_n = 4,
  counter_n = 1000, # increase of counter is necessary
  verbose = TRUE # prints more information to the console
)
data

Draw Samples from a Normal Distribution

Description

Draws exemplary samples with a certain effect size for the sequential one-way ANOVA or the sequential t-test, see Steinhilber et al. (2023) doi:10.31234/osf.io/m64ne.

Usage

draw_sample_normal(k_groups, f, max_n, sd = NULL, sample_ratio = NULL)

Arguments

`k_groups`	number of groups (levels of factor_A)
`f`	Cohen's f. The simulated effect size.
`max_n`	sample size for the groups (total sample size = max_n*k_groups)
`sd`	vector of standard deviations of the groups. Default value is 1 for each group.
`sample_ratio`	vector of sample ratios between the groups. Default value is 1 for each group.
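
As a reminder of how the effect size `f` relates to group means, here is a small base R sketch of the textbook definition of Cohen's f (standard deviation of the group means divided by the common within-group standard deviation). The helper cohens_f() is hypothetical and not part of sprtt; how the package handles unequal standard deviations is not covered here.

```r
# Hypothetical helper (not an sprtt function): Cohen's f from group means,
# a common within-group SD, and optional group weights (e.g. sample ratios).
cohens_f <- function(means, sd_within, weights = rep(1, length(means))) {
  p <- weights / sum(weights)                        # group proportions
  grand_mean <- sum(p * means)                       # weighted grand mean
  sigma_m <- sqrt(sum(p * (means - grand_mean)^2))   # SD of the group means
  sigma_m / sd_within
}

# Two equally sized groups whose means differ by 0.8 SD (Cohen's d = 0.8)
f_two_groups <- cohens_f(means = c(0, 0.8), sd_within = 1)
f_two_groups  # 0.4, i.e. f = d / 2 for two balanced groups
```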

Value

returns a data.frame with the columns y (observations) and x (factor_A).

Examples

set.seed(333)

data <- sprtt::draw_sample_normal(
  k_groups = 2,
  f = 0.20,
  max_n = 2
)
data

data <- sprtt::draw_sample_normal(
  k_groups = 4,
  f = 0,
  max_n = 2,
  sd = c(1, 2, 1, 8)
)
data

data <- sprtt::draw_sample_normal(
  k_groups = 3,
  f = 0.40,
  max_n = 2,
  sd = c(1, 0.8, 1),
  sample_ratio = c(1, 2, 3)
)
data

Plot Sequential ANOVA Results

Description

Creates plots for the results of the seq_anova() function.

Usage

plot_anova(
  anova_results,
  labels = TRUE,
  position_labels_x = 0.15,
  position_labels_y = 0.075,
  position_lr_x = 0.05,
  position_lr_y = NULL,
  font_size = 25,
  line_size = 1.5,
  highlight_color = "#CD2626"
)

Arguments

`anova_results`	result object of the seq_anova() function (argument must be of class `seq_anova_results`).
`labels`	show labels in the plot.
`position_labels_x`	position of the boundary labels on the x-axis. 0 positions the center on the 0 of the x-axis.
`position_labels_y`	position of the boundary labels on the y-axis. 0 positions the labels on the dotted lines.
`position_lr_x`	scales the position of the LR label on the x-axis. 0 positions the label directly under the last calculated LR.
`position_lr_y`	scales the position of the LR label on the y-axis. 0 positions the label on the 0 of the y-axis.
`font_size`	font size of the plot.
`line_size`	line size of the plot.
`highlight_color`	highlighting color, default is "#CD2626" (red).

Value

returns a plot

Examples

# simulate data for the example ------------------------------------------------
set.seed(333)
data <- sprtt::draw_sample_normal(3, f = 0.25, max_n = 30)

# calculate the SPRT -----------------------------------------------------------
anova_results <- sprtt::seq_anova(y~x, f = 0.25, data = data, plot = TRUE)

# plot the results -------------------------------------------------------------
sprtt::plot_anova(anova_results)

sprtt::plot_anova(anova_results,
                 labels = TRUE,
                 position_labels_x = 0.5,
                 position_labels_y = 0.1,
                 position_lr_x = -0.5,
                 font_size = 25,
                 line_size = 2,
                 highlight_color = "green"
                 )

sprtt::plot_anova(anova_results,
                 labels = FALSE
                 )

# further information ----------------------------------------------------------
# run this code:
vignette("one_way_anova", package = "sprtt")

Sequential Analysis of Variance

Description

Performs a sequential one-way fixed effects ANOVA; see Steinhilber et al. (2023) doi:10.31234/osf.io/m64ne for more information. Repeated-measures ANOVA is not yet implemented in this function. For details, see the vignette: vignette("one_way_anova", package = "sprtt").

Usage

seq_anova(
  formula,
  f,
  alpha = 0.05,
  power = 0.95,
  data,
  verbose = TRUE,
  plot = FALSE,
  seq_steps = "single"
)

Arguments

`formula`	A formula specifying the model.
`f`	Cohen's f (expected minimal effect size or effect size of interest).
`alpha`	the type I error probability. A number between 0 and 1.
`power`	1 - beta (beta is the type II error probability). A number between 0 and 1.
`data`	A data frame in which the variables specified in the formula will be found.
`verbose`	a logical value indicating whether verbose output should be printed.
`plot`	a logical value. If `TRUE`, calculates the ANOVA sequentially on the data and saves the results in the slot called plot. This calculation is necessary for the plot_anova() function.
`seq_steps`	Defines the sequential steps for the sequential calculation if `plot = TRUE`. Accepts either a vector of numbers or one of the strings `single` or `balanced`. A vector of numbers specifies the sample sizes at which the ANOVA is calculated. `single` specifies that the test statistic is calculated after each single data point (step size = 1). `balanced` specifies that the step size is equal to the number of groups. Attention: for both `single` and `balanced`, the calculation starts at the number of groups times two. If the data do not fit this scheme, you have to specify the sequential steps yourself as a vector of numbers.
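
The step schemes described for `seq_steps` can be pictured with the following base R sketch. It only mirrors the description above (start at two times the number of groups, then step by 1 or by the number of groups). The variable names are made up for illustration, and the package's internal implementation may differ.

```r
# Illustration only: variable names are hypothetical, not sprtt internals.
k_groups <- 3     # number of groups
n_total  <- 12    # total number of observations in the data

start <- 2 * k_groups                                  # first analysed sample size
steps_single   <- seq(start, n_total, by = 1)          # "single": step size 1
steps_balanced <- seq(start, n_total, by = k_groups)   # "balanced": step size = number of groups

steps_single    # 6 7 8 9 10 11 12
steps_balanced  # 6 9 12
```

A numeric vector of this kind is what one would pass to `seq_steps` when neither predefined scheme fits the data.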

Value

An object of the S4 class seq_anova_results. See the class documentation for the full description of the slots. To access the slots of the object, use the @-operator or []-brackets instead of $. See the examples below.

Examples

# simulate data ----------------------------------------------------------------
set.seed(333)
data <- sprtt::draw_sample_normal(k_groups = 3,
                    f = 0.25,
                    sd = c(1, 1, 1),
                    max_n = 50)


# calculate sequential ANOVA ---------------------------------------------------
results <- sprtt::seq_anova(y ~ x, f = 0.25, data = data)
# test decision
results@decision
# test results
results

# calculate sequential ANOVA ---------------------------------------------------
results <- sprtt::seq_anova(y ~ x,
                            f = 0.25,
                            data = data,
                            alpha = 0.01,
                            power = .80,
                            verbose = TRUE)
results

# calculate sequential ANOVA ---------------------------------------------------
results <- sprtt::seq_anova(y ~ x,
                            f = 0.15,
                            data = data,
                            alpha = 0.05,
                            power = .80,
                            verbose = FALSE)
results

Sequential Probability Ratio Test using t-statistic

Description

Performs one- and two-sample sequential t-tests on vectors of data. For more information on the sequential t-test, see Schnuerch & Erdfelder (2019) doi:10.1037/met0000234.

Usage

seq_ttest(
  x,
  y = NULL,
  data = NULL,
  mu = 0,
  d,
  alpha = 0.05,
  power = 0.95,
  alternative = "two.sided",
  paired = FALSE,
  na.rm = TRUE,
  verbose = TRUE
)

Arguments

`x`	Works with two classes: `numeric` and `formula`, so you can write `"x"` or `"x~y"`. `"numeric input"`: a (non-empty) numeric vector of data values. `"formula input"`: a formula of the form lhs ~ rhs, where lhs is a numeric variable giving the data values and rhs is either 1 for a one-sample test or a factor with two levels giving the corresponding groups.
`y`	an optional (non-empty) numeric vector of data values.
`data`	an optional `data.frame`, which you can use only in combination with a `"formula input"` in argument `x`.
`mu`	a number indicating the true value of the mean (or difference in means if you are performing a two sample test).
`d`	a number indicating the specified effect size (Cohen's d)
`alpha`	the type I error probability. A number between 0 and 1.
`power`	1 - beta (beta is the type II error probability). A number between 0 and 1.
`alternative`	a character string specifying the alternative hypothesis, must be one of `two.sided` (default), `greater` or `less`. You can specify just the initial letter.
`paired`	a logical indicating whether you want a paired t-test.
`na.rm`	a logical value indicating whether `NA` values should be stripped before the computation proceeds.
`verbose`	a logical value indicating whether verbose output should be printed.
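
To show how `alpha` and `power` translate into a sequential decision, the sketch below computes Wald's classical approximate SPRT boundaries and applies them to a likelihood ratio. The helper sprt_decision() and the variable names are hypothetical; this is the textbook rule, given here as an assumption about the kind of rule the package applies rather than a copy of its code.

```r
# Wald's approximate SPRT decision boundaries (hypothetical names, base R only)
alpha <- 0.05
power <- 0.95
beta  <- 1 - power            # type II error probability

upper <- (1 - beta) / alpha   # accept H1 when the likelihood ratio >= upper
lower <- beta / (1 - alpha)   # accept H0 when the likelihood ratio <= lower

sprt_decision <- function(lr, lower, upper) {
  if (lr >= upper) {
    "accept H1"
  } else if (lr <= lower) {
    "accept H0"
  } else {
    "continue sampling"
  }
}

sprt_decision(25,   lower, upper)  # "accept H1"
sprt_decision(1.3,  lower, upper)  # "continue sampling"
sprt_decision(0.01, lower, upper)  # "accept H0"
```

With the default values, the upper boundary is 19 and the lower boundary is about 0.053, so sampling continues as long as the likelihood ratio stays between these two numbers.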

Value

An object of the S4 class seq_ttest_results. See the class documentation for the full description of the slots. To access the slots of the object, use the @-operator or []-brackets instead of $. See the examples below.

Examples

# set seed --------------------------------------------------------------------
set.seed(333)

# load library ----------------------------------------------------------------
library(sprtt)

# one sample: numeric input ---------------------------------------------------
treatment_group <- rnorm(20, mean = 0, sd = 1)
results <- seq_ttest(treatment_group, mu = 1, d = 0.8)

# get access to the slots -----------------------------------------------------
# @ Operator
results@likelihood_ratio

# [] Operator
results["likelihood_ratio"]

# two sample: numeric input----------------------------------------------------
treatment_group <- stats::rnorm(20, mean = 0, sd = 1)
control_group <- stats::rnorm(20, mean = 1, sd = 1)
seq_ttest(treatment_group, control_group, d = 0.8)

# two sample: formula input ---------------------------------------------------
stress_level <- stats::rnorm(20, mean = 0, sd = 1)
sex <- as.factor(c(rep(1, 10), rep(2, 10)))
seq_ttest(stress_level ~ sex, d = 0.8)

# NA in the data --------------------------------------------------------------
stress_level <- c(NA, stats::rnorm(20, mean = 0, sd = 2), NA)
sex <- as.factor(c(rep(1, 11), rep(2, 11)))
seq_ttest(stress_level ~ sex, d = 0.8, na.rm = TRUE)

# work with dataset (data are included in the package) ------------------------
seq_ttest(monthly_income ~ sex, data = df_income, d = 0.8)
