ValidateOutput Class for Contributors Table Validation
Details
The ValidateOutput class runs both column-based validation (ensuring required columns exist)
and data-based validation (checking correctness of values) for a contributors table.
It integrates two validation classes:
ColumnValidator: Ensures required columns are present.
Validator: Runs content-based validation checks on contributor data.
This validation process is configured via a YAML file. The inst/config/ package directory contains predefined YAML configuration files
for each of the six output types.
Column Validation
The ColumnValidator ensures that required columns exist before running data-based checks.
If a required column is missing, validation stops immediately with an error.
Example YAML Configuration (inst/config/title_validation.yaml):
column_config:
  rules:
    minimal:
      operator: "AND"
      columns:
        - Firstname
        - Middle name
        - Surname
        - Order in publication
      severity: "error"
    affiliation:
      operator: "OR"
      columns:
        - Primary affiliation
        - Secondary affiliation
      regex: "^Affiliation [0-9]+$"
      severity: "error"
    title:
      operator: "AND"
      columns:
        - Corresponding author?
        - Email address
      severity: "warning"
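To make the rule semantics concrete, here is a minimal sketch in base R of how a single rule might be evaluated against a table's column names. The helper check_column_rule is hypothetical and is not part of the package. It assumes that "AND" requires every listed column, that "OR" requires at least one, and that an optional regex lets pattern-named columns (such as "Affiliation 1") satisfy an "OR" rule. The actual ColumnValidator may combine these differently.

```r
# Hypothetical helper (not the package's API): evaluate one column rule.
check_column_rule <- function(rule, column_names) {
  present <- rule$columns %in% column_names
  # Assumption: a regex match on any column name also satisfies an "OR" rule.
  regex_hit <- !is.null(rule$regex) && any(grepl(rule$regex, column_names))
  passed <- switch(
    rule$operator,
    "AND" = all(present),
    "OR"  = any(present) || regex_hit,
    stop("Unknown operator: ", rule$operator)
  )
  list(
    passed = passed,
    missing = rule$columns[!present],
    severity = rule$severity
  )
}

# Rules mirroring the "minimal" and "affiliation" entries in the YAML above.
minimal <- list(
  operator = "AND",
  columns = c("Firstname", "Middle name", "Surname", "Order in publication"),
  severity = "error"
)
affiliation <- list(
  operator = "OR",
  columns = c("Primary affiliation", "Secondary affiliation"),
  regex = "^Affiliation [0-9]+$",
  severity = "error"
)

# Example column names: "Middle name" is absent, affiliations are pattern-named.
cols <- c("Firstname", "Surname", "Order in publication", "Affiliation 1")
minimal_result <- check_column_rule(minimal, cols)
affiliation_result <- check_column_rule(affiliation, cols)
```

In this sketch the "minimal" rule fails because one required column is missing, while the "affiliation" rule passes through the regex alone.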
General Data Validation
The Validator runs content-based validation checks after column validation passes.
Example Validation Configuration (inst/config/title_validation.yaml):
validation_config:
  validations:
    - name: check_missing_order
    - name: check_duplicate_order
    - name: check_missing_surname
    - name: check_missing_firstname
    - name: check_duplicate_initials
    - name: check_missing_corresponding
      dependencies:
        - '"Corresponding author?"
    - name: check_missing_email
      dependencies:
        - '"Corresponding author?"
        - 'self$results[["check_missing_corresponding"]]$type == "success"'
        - '"Email address"
    - name: check_duplicate_names
    - name: check_affiliation
    - name: check_affiliation_consistency
Dependencies:
Some validation checks only run if other conditions are met.
Example:
check_missing_email only runs if:
- "Corresponding author?" exists.
- check_missing_corresponding has passed.
- "Email address" is in the dataset.
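One plausible way to apply such dependencies is to treat each entry as an R expression string and run the check only when every expression evaluates to TRUE. The sketch below assumes exactly that. The helper dependencies_met, the column_names variable, and the two %in% expressions are invented for illustration. Only the self$results expression is taken from the configuration above.

```r
# Hypothetical helper: TRUE only when every dependency expression is TRUE.
dependencies_met <- function(dependencies, env) {
  for (dep in dependencies) {
    ok <- eval(parse(text = dep), envir = env)
    if (!isTRUE(ok)) {
      return(FALSE)
    }
  }
  TRUE
}

# Stand-in evaluation context; the real class keeps its own state in `self`.
env <- new.env()
env$self <- list(
  results = list(check_missing_corresponding = list(type = "success"))
)
env$column_names <- c("Corresponding author?", "Email address")

# Assumed expressions for the three conditions on check_missing_email.
deps <- c(
  '"Corresponding author?" %in% column_names',
  'self$results[["check_missing_corresponding"]]$type == "success"',
  '"Email address" %in% column_names'
)

# All three conditions hold, so the email check would run.
email_check_runs <- dependencies_met(deps, env)

# If the corresponding-author check did not succeed, the email check is skipped.
env$self$results$check_missing_corresponding$type <- "error"
email_check_skipped <- !dependencies_met(deps, env)
```

Evaluating strings with eval(parse()) keeps the YAML file expressive, but it also means the configuration must come from a trusted source.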
Integration
The class runs in the following order:
1. Column validation (via ColumnValidator).
2. If columns are valid → Run content validation (via Validator).
3. If column validation fails → Stop and return column validation errors.
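The order above can be sketched as a small base R function. Everything here is a hypothetical stand-in: run_two_stage, the two stub validators, and the "error" value of the type field are assumptions for illustration, not the class's actual methods or result format.

```r
# Hypothetical two-stage flow: content checks run only if column checks pass.
run_two_stage <- function(table, validate_columns, validate_content) {
  column_results <- validate_columns(table)
  failed <- any(vapply(
    column_results,
    function(r) identical(r$type, "error"),
    logical(1)
  ))
  if (failed) {
    return(column_results)  # stop early and report the column errors
  }
  validate_content(table)
}

# Stub column validator: requires two columns.
validate_columns <- function(table) {
  missing <- setdiff(c("Firstname", "Surname"), names(table))
  if (length(missing) > 0) {
    list(minimal = list(type = "error", missing = missing))
  } else {
    list(minimal = list(type = "success", missing = character(0)))
  }
}

# Stub content validator: flags empty or missing surnames.
validate_content <- function(table) {
  has_gap <- any(is.na(table$Surname) | table$Surname == "")
  list(check_missing_surname = list(type = if (has_gap) "error" else "success"))
}

incomplete <- data.frame(Firstname = "Ada")
complete <- data.frame(Firstname = "Ada", Surname = "Lovelace")

stopped <- run_two_stage(incomplete, validate_columns, validate_content)
passed <- run_two_stage(complete, validate_columns, validate_content)
```

With the incomplete table only the column results come back. With the complete table the content results are returned instead.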
Usage
# Load a validation configuration file
config_path <- "inst/config/title_validation.yaml"
# Create a ValidateOutput instance
validate_output <- ValidateOutput$new(config_path = config_path)
# Run validation on the contributors table
results <- validate_output$run_validations(contributors_table)
print(results)
Public fields
validator: Instance of the Validator class for data validation.
column_validator: Instance of the ColumnValidator class for column validation.
config: Stores the combined YAML validation configuration.
