Setup

First, you have to install the package.

# install.packages("devtools")
devtools::install_github("marton-balazs-kovacs/tppr")

Second, you have to load the package.

library(tppr)

Setting the analysis parameters

The analysis depends on preset parameters that are saved as a list of values in the analysis_params datafile. This datafile is part of the package. If you want to know more about the default analysis parameters, check ?tppr::analysis_params.

After installing and loading the package, the analysis parameters datafile can be called directly. You can check the names of the parameters in the following way.

names(tppr::analysis_params)
#>  [1] "inference_threshold_bf_high"                 
#>  [2] "inference_threshold_bf_low"                  
#>  [3] "inference_threshold_nhst"                    
#>  [4] "inference_threshold_robustness_bayes_par_est"
#>  [5] "inference_threshold_robustness_nhst"         
#>  [6] "m0_prob"                                     
#>  [7] "minimum_effect_threshold_bayes_par_est"      
#>  [8] "minimum_effect_threshold_nhst"               
#>  [9] "n_prior"                                     
#> [10] "p_equiv_test"                                
#> [11] "rope"                                        
#> [12] "scale"                                       
#> [13] "sim_null_participant_num"                    
#> [14] "success_proportions_theoretical"             
#> [15] "trial_size_per_participant"                  
#> [16] "when_to_check"                               
#> [17] "y_prior"

You can also modify the parameters after loading.

Getting the probability of M0:

tppr::analysis_params$m0_prob
#> [1] 0.5

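Changing the value. R cannot assign through the :: operator, so copy the list into a local object first and modify the copy (the defaults stored in the package stay unchanged):

analysis_params <- tppr::analysis_params
analysis_params$m0_prob <- 0.53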

Import data

You can run the analysis on several different data types:

  • Example dataset where M0 is simulated to be true
  • Example dataset where M1 is simulated to be true
  • Example data generated by yourself
  • The collected pilot dataset
  • The collected test dataset
  • The collected live dataset

In the following we show you how to load each type of data.

Example datasets

The example datasets are included in the package. If you want to know more about the datasets, you can read their documentation with ?tppr::example_m0 and ?tppr::example_m1.

# To load the dataset where M0 is simulated to be true
tppr::example_m0

# To load the dataset where M1 is simulated to be true
tppr::example_m1

Generate example dataset

It is also possible to generate your own example dataset. Generating the example dataset can take a few minutes. If you call the function without parameters it will use the preset parameters that correspond to the analysis parameters of the paper.

my_tppr_data <- generate_example_data()

Collected datasets

You can download and read the collected data from the project's GitHub repository. There are three types of collected data:

  • test: The data are collected by testing the data collection program.
  • pilot: The data are collected during the pilot study.
  • live: The data are collected during live testing. These data will go in the analysis and the results will be presented in the paper.

# Read live data
tpp_raw_data <- tppr::read_data(type = "live")
#> There are 1463 participants who started the experiment.

# For testing purposes we will use the example_m0 dataset from now on
tpp_raw_data <- tppr::example_m0

Cleaning the dataset

The raw data contain all the trials. During data cleaning, the empty trials are excluded and only the erotic trials are kept.

This step is already included in the primary confirmatory analysis function tppr::analysis_confirmatory, so you do not have to run it separately.

tpp_processed_data <- tppr::clean_data(raw_data = tpp_raw_data)
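
As a rough illustration of what this cleaning step amounts to, the sketch below filters a toy data frame in base R. The column names (participant_ID, reward_type, sides_match) and the toy values are assumptions made for this example. They are not taken from the package, and tppr::clean_data may work differently.

```r
# Toy raw data: hypothetical columns, not the real tppr data structure
toy_raw <- data.frame(
  participant_ID = c(1, 1, 1, 2, 2, 2),
  reward_type    = c("erotic", "neutral", "erotic", "erotic", NA, "erotic"),
  sides_match    = c(TRUE, FALSE, NA, FALSE, NA, TRUE),
  stringsAsFactors = FALSE
)

# Drop empty trials (no recorded outcome) and keep only erotic trials
is_empty  <- is.na(toy_raw$sides_match)
is_erotic <- !is.na(toy_raw$reward_type) & toy_raw$reward_type == "erotic"
toy_clean <- toy_raw[!is_empty & is_erotic, ]

# Number of trials that survive the cleaning
nrow(toy_clean)
```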

Calculate study descriptives

All the descriptive results of the study can be calculated with the ?tppr::sample_descriptives function. The function returns a list of the descriptive results that can be later used to populate the research paper.

The function runs at the latest passed checkpoint, unless a different checkpoint is specified in the which_checkpoint argument. If none of the checkpoints are passed, the function throws an error.

tppr::sample_descriptives(raw_data = tpp_raw_data, which_checkpoint = NA_integer_)

If you want to see the descriptive results for all the collected trials regardless of a checkpoint, you should use the following function.

tppr::sample_descriptives_current(processed_data = tpp_processed_data)

Checking the current and the next checkpoint

To get information about the currently passed checkpoint based on the number of trials (either all trials or only the erotic trials), the ?tppr::tell_checkpoint function can be used.

checkpoint <- tppr::tell_checkpoint(df = tpp_processed_data)

The number of valid erotic trials:

checkpoint$total_n
#> [1] 51457

The closest checkpoint that is passed:

checkpoint$current_checkpoint
#> [1] 1

Next closest checkpoint:

checkpoint$next_checkpoint
#> [1] 2
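
The relationship between the trial count and the checkpoint indices can be sketched in base R as follows. This is only an illustration under stated assumptions: the thresholds in when_to_check are made-up values, and tell_checkpoint_sketch is a hypothetical helper, not the implementation of tppr::tell_checkpoint.

```r
# Hypothetical checkpoint thresholds (number of valid erotic trials)
when_to_check <- c(100, 200, 300)

tell_checkpoint_sketch <- function(total_n, when_to_check) {
  # Index of the highest threshold that total_n has reached (0 if none)
  current <- findInterval(total_n, when_to_check)
  list(
    total_n            = total_n,
    current_checkpoint = if (current == 0) NA_integer_ else current,
    next_checkpoint    = if (current == length(when_to_check)) NA_integer_ else current + 1L
  )
}

res <- tell_checkpoint_sketch(total_n = 150, when_to_check = when_to_check)
res$current_checkpoint
res$next_checkpoint
```

In this sketch, NA marks the two edge cases: no checkpoint passed yet, or no further checkpoint left.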

Run the primary confirmatory analysis

The primary confirmatory analysis consists of a mixed-effects logistic regression and Bayes factors with three different priors.

These calculations can be run separately:

  • Mixed-effects logistic regression: ?tppr::confirmatory_mixed_effect
  • Bayes factor with three different priors: ?tppr::confirmatory_bayes_factor

Based on the results of the four primary analyses, the tppr::inference_confirmatory_combined function makes an inference:

  • M1: If all four analyses support the M1 model
  • M0: If all four analyses support the M0 model
  • Ongoing: If the four analyses do not support the same model and the last checkpoint is not reached
  • Inconclusive: If the four analyses do not support the same model and the last checkpoint is reached

All of the functionality described above is included in the tppr::analysis_confirmatory function. This function will run all these subfunctions and return the result as a list.
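
The four-way decision rule can be written down as a short base R sketch. The function name combine_inference_sketch and the coding of each analysis result as "M0", "M1", or "none" are assumptions made for this example. The actual inputs and outputs of tppr::inference_confirmatory_combined may differ.

```r
combine_inference_sketch <- function(supported_models, last_checkpoint_reached) {
  # supported_models: character vector with one entry per analysis,
  # each "M0", "M1", or "none" (hypothetical coding for this sketch)
  stopifnot(length(supported_models) == 4)
  if (all(supported_models == "M1")) {
    "M1"
  } else if (all(supported_models == "M0")) {
    "M0"
  } else if (last_checkpoint_reached) {
    "Inconclusive"
  } else {
    "Ongoing"
  }
}

# All four analyses agree on M0
combine_inference_sketch(c("M0", "M0", "M0", "M0"), last_checkpoint_reached = FALSE)

# The analyses disagree and more data can still be collected
combine_inference_sketch(c("M0", "M1", "none", "M0"), last_checkpoint_reached = FALSE)
```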

confirmatory_results <- tppr::analysis_confirmatory(df = tpp_raw_data)

Note: To run the primary analysis, the number of processed trials has to pass at least the first checkpoint.

Running the confirmatory analysis at multiple checkpoints

If multiple checkpoints are passed and you want to run the primary confirmatory analysis iteratively for each checkpoint, you first have to prepare the data with the tppr::split_data function.

This function splits the data at each passed checkpoint and saves the results into a list of dataframes. The function also calculates the necessary summary statistics for the mixed-effects logistic regression and the Bayes factor analyses.

tpp_split_data <- tppr::split_data(df = tpp_raw_data)

Second, you can iterate the confirmatory analysis over the list of dataframes to get the results and the final inference at each checkpoint.

Note: If the primary confirmatory analysis is calculated at multiple checkpoints, this calculation can take a few minutes.

# The %>% pipe comes from magrittr (re-exported by dplyr), so one of them has to be attached
tpp_split_data %>% 
  dplyr::mutate(confirmatory_res = purrr::map(splitted_data, tppr::analysis_confirmatory))
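
The split-then-iterate pattern can be illustrated in base R with toy data. The sketch assumes that each split holds the first n trials up to a passed checkpoint (cumulative splits); the thresholds, the toy data, and the placeholder analysis (a plain success rate) are all made up for this example and do not reproduce tppr::split_data or tppr::analysis_confirmatory.

```r
# Toy processed data: one row per valid trial (hypothetical structure)
toy_trials <- data.frame(
  trial   = 1:250,
  success = rep(c(1, 0), length.out = 250)
)
when_to_check <- c(100, 200, 300)  # hypothetical checkpoints

# Checkpoints that the toy data have already passed
passed <- when_to_check[when_to_check <= nrow(toy_trials)]

# One cumulative data frame per passed checkpoint
split_list <- lapply(passed, function(n) toy_trials[seq_len(n), ])
names(split_list) <- paste0("checkpoint_", seq_along(passed))

# Apply a placeholder analysis to each split; here just the success rate
success_rates <- sapply(split_list, function(d) mean(d$success))
success_rates
```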

Calculate the cumulative Bayes factor

For the primary analysis plot we have to calculate the cumulative Bayes factors with the three priors for every trial.

We can calculate the number of successful guesses for each trial separately with the ?tppr::cumulative_success function.

tppr::cumulative_success(df = tpp_raw_data)
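
Conceptually, a running count of successful guesses is a cumulative sum over the trial outcomes. The base R sketch below shows this on a toy outcome vector; the column names are assumptions for this example and need not match the output of tppr::cumulative_success.

```r
# Toy outcome vector: 1 = successful guess, 0 = unsuccessful guess
success <- c(1, 0, 1, 1, 0, 1)

cumulative <- data.frame(
  trial_number       = seq_along(success),
  cumulative_success = cumsum(success)
)
cumulative
```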

However, if we provide the raw dataset to the tppr::cumulative_bayes_factor function, it automatically cleans the dataset and calculates the number of successful guesses. In addition, it calculates the Bayes factors for each pair of number of trials and number of successes.

Note: If there are a lot of trials, calculating the Bayes factors can take a few minutes.

cumulative_results <- tppr::cumulative_bayes_factor(df = tpp_raw_data)
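
To give an idea of what a Bayes factor per (number of trials, number of successes) pair looks like, the sketch below uses a simplified stand-in: a point null p = p0 against a Beta(a, b) prior on the success probability. This is not one of the three priors used by the package, and bf01_beta_binomial is a hypothetical helper written only for this illustration. The default p0 = 0.5 mirrors the m0_prob value shown earlier.

```r
# BF01 for k successes in n trials:
# M0: p = p0 (point null) versus M1: p ~ Beta(a, b)
bf01_beta_binomial <- function(k, n, p0 = 0.5, a = 1, b = 1) {
  # Log-likelihood of the data under the point null
  log_m0 <- dbinom(k, n, p0, log = TRUE)
  # Log marginal likelihood under M1 (beta-binomial)
  log_m1 <- lchoose(n, k) + lbeta(k + a, n - k + b) - lbeta(a, b)
  exp(log_m0 - log_m1)
}

# Cumulative successes after each of six toy trials
n <- 1:6
k <- c(1, 1, 2, 3, 3, 4)
bf01 <- bf01_beta_binomial(k, n)
bf01
```

Values above 1 favor M0 and values below 1 favor M1 in this simplified setting. The computation is done on the log scale to stay numerically stable for large trial counts.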

Visualise the result

To visualize the results of the primary confirmatory analysis, the ?tppr::plot_confirmatory function can be used. You can decide whether to output an interactive plot made with the plotly package, or a simple ggplot2 plot.

tppr::plot_confirmatory(cumulative_results = cumulative_results, animated = TRUE)
#> Warning: Transformation introduced infinite values in continuous y-axis

Run the robustness analysis

If the primary confirmatory analysis did not reach a conclusion (either support for M0 or for M1), the robustness analysis inference will be NA.

In our example case the analysis supports M0 at the first checkpoint. If multiple checkpoints were reached we could run the robustness analysis for multiple checkpoints, or we could extract the results of the confirmatory analysis for the last checkpoint with the following code.

By default, the analysis_robustness and the analysis_exploratory functions extract the last row of the primary confirmatory analysis results output.

We could run the robustness analysis with the following code on the last row of the primary analysis results:

robustness_results <- tppr::analysis_robustness(confirmatory_results = confirmatory_results)

Visualise the result

When visualizing the results of the robustness analysis you can decide whether to include the confidence interval of the mixed logistic regression or not. Using include_nhst = TRUE means that the confidence interval will be present on the plot.

tppr::plot_robustness(posterior_density = robustness_results$robustness_bayes_res$posterior_density,
                      hdi_mode = robustness_results$robustness_bayes_res$hdi_mode,
                      hdi_l = robustness_results$robustness_bayes_res$hdi_l,
                      hdi_u = robustness_results$robustness_bayes_res$hdi_u,
                      mixed_ci_width = robustness_results$confirmatory_mixed_ci_width,
                      mixed_ci_l = robustness_results$confirmatory_mixed_ci_l,
                      mixed_ci_u = robustness_results$confirmatory_mixed_ci_u,
                      include_nhst = TRUE)

Run the exploratory analysis

exploratory_results <- tppr::analysis_exploratory(df = tpp_raw_data)

Visualize the results

tppr::plot_exploratory(success_rates_theoretical_prop = exploratory_results$success_rates_theoretical_prop,
                       success_rates_empirical_prop = exploratory_results$success_rates_empirical_prop,
                       possible_success_rates = exploratory_results$possible_success_rates)