ValidateOutput Class for Contributors Table Validation
Details
The ValidateOutput class runs both column-based validation (ensuring required columns exist)
and data-based validation (checking correctness of values) for a contributors table.
It integrates two validation classes:
ColumnValidator: Ensures required columns are present.
Validator: Runs content-based validation checks on contributor data.
This validation process is configured via a YAML file. The inst/config/ package directory
contains predefined YAML configuration files for each of the six output types.
Column Validation
The ColumnValidator ensures that required columns exist before running data-based checks.
If a required column is missing, validation stops immediately with an error.
Example YAML Configuration (inst/config/title_validation.yaml):
column_config:
  rules:
    minimal:
      operator: "AND"
      columns:
        - Firstname
        - Middle name
        - Surname
        - Order in publication
      severity: "error"
    affiliation:
      operator: "OR"
      columns:
        - Primary affiliation
        - Secondary affiliation
      regex: "^Affiliation [0-9]+$"
      severity: "error"
    title:
      operator: "AND"
      columns:
        - Corresponding author?
        - Email address
      severity: "warning"
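To make the operator semantics concrete, here is a minimal sketch in base R of how a single rule could be evaluated against a table's column names. The helper check_column_rule and its return shape are hypothetical and are not taken from ColumnValidator. The sketch assumes that "AND" requires every listed column, that "OR" requires at least one, and that a column name matching the rule's regex also satisfies the rule.

```r
# Hypothetical helper (not part of the package): evaluate one column rule.
check_column_rule <- function(rule, col_names) {
  found <- rule$columns %in% col_names
  # Assumption: "AND" needs all listed columns, any other operator needs one.
  ok <- if (identical(rule$operator, "AND")) all(found) else any(found)
  # Assumption: a column name matching the regex also satisfies the rule.
  if (!ok && !is.null(rule$regex)) {
    ok <- any(grepl(rule$regex, col_names))
  }
  list(ok = ok, severity = rule$severity, missing = rule$columns[!found])
}

# Two of the rules from the configuration above, written as R lists.
rules <- list(
  minimal = list(
    operator = "AND",
    columns = c("Firstname", "Middle name", "Surname", "Order in publication"),
    severity = "error"
  ),
  affiliation = list(
    operator = "OR",
    columns = c("Primary affiliation", "Secondary affiliation"),
    regex = "^Affiliation [0-9]+$",
    severity = "error"
  )
)

# Illustrative column names: "Middle name" is absent, and the affiliation
# column uses the numbered naming pattern instead of the listed names.
col_names <- c("Firstname", "Surname", "Order in publication", "Affiliation 1")
column_results <- lapply(rules, check_column_rule, col_names = col_names)
```

In this sketch the minimal rule fails because "Middle name" is absent, while the affiliation rule passes through the regex match on "Affiliation 1".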
General Data Validation
The Validator runs content-based validation checks after column validation passes.
Example Validation Configuration (inst/config/title_validation.yaml):
validation_config:
  validations:
    - name: check_missing_order
    - name: check_duplicate_order
    - name: check_missing_surname
    - name: check_missing_firstname
    - name: check_duplicate_initials
    - name: check_missing_corresponding
      dependencies:
        - '"Corresponding author?"
    - name: check_missing_email
      dependencies:
        - '"Corresponding author?"
        - 'self$results[["check_missing_corresponding"]]$type == "success"'
        - '"Email address"
    - name: check_duplicate_names
    - name: check_affiliation
    - name: check_affiliation_consistency
Dependencies:
Some validation checks only run if other conditions are met.
Example: check_missing_email only runs if:
"Corresponding author?" exists.
check_missing_corresponding has passed.
"Email address" is in the dataset.
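The dependency mechanism can be sketched as follows. This is an illustration under stated assumptions, not the Validator implementation: run_checks, the fn field, the "skipped" result type, and the exact dependency strings are all hypothetical. The sketch assumes each dependency is a string of R code that must evaluate to TRUE, with access to the table and to the results collected so far.

```r
# Hypothetical runner (not part of the package): run checks in order and
# skip any check whose dependency expressions are not all TRUE.
run_checks <- function(checks, contributors_table) {
  results <- list()
  for (check in checks) {
    # Dependencies can see the table and the results gathered so far.
    env <- list2env(
      list(contributors_table = contributors_table, results = results),
      parent = baseenv()
    )
    deps_met <- all(vapply(
      check$dependencies,
      function(dep) isTRUE(eval(parse(text = dep), envir = env)),
      logical(1)
    ))
    results[[check$name]] <- if (deps_met) {
      check$fn(contributors_table)
    } else {
      list(type = "skipped")  # assumption: unmet dependencies mark the check as skipped
    }
  }
  results
}

# Illustrative checks; the dependency strings are assumptions for this sketch.
checks <- list(
  list(
    name = "check_missing_corresponding",
    dependencies = '"Corresponding author?" %in% colnames(contributors_table)',
    fn = function(tbl) {
      has_one <- any(tbl[["Corresponding author?"]])
      list(type = if (has_one) "success" else "warning")
    }
  ),
  list(
    name = "check_missing_email",
    dependencies = c(
      '"Corresponding author?" %in% colnames(contributors_table)',
      'results[["check_missing_corresponding"]]$type == "success"',
      '"Email address" %in% colnames(contributors_table)'
    ),
    fn = function(tbl) {
      corr <- tbl[["Corresponding author?"]]
      email <- tbl[["Email address"]]
      missing_email <- any(corr & (is.na(email) | email == ""))
      list(type = if (missing_email) "error" else "success")
    }
  )
)

# A table with a corresponding author but no "Email address" column.
contributors_table <- data.frame(
  Firstname = c("Ada", "Grace"),
  `Corresponding author?` = c(TRUE, FALSE),
  check.names = FALSE
)

check_results <- run_checks(checks, contributors_table)
```

Here check_missing_corresponding runs and succeeds, while check_missing_email is skipped because the table has no "Email address" column.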
Integration
The class runs in the following order:
1. Column validation (via ColumnValidator).
2. If columns are valid → Run content validation (via Validator).
3. If column validation fails → Stop and return column validation errors.
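The ordering above amounts to a two-stage pipeline with an early return. The sketch below shows that control flow in base R. The function run_two_stage and the two stage functions are hypothetical stand-ins, and the result shape (a named list of entries with a type field) is an assumption rather than the documented return value of run_validations.

```r
# Hypothetical sketch of the control flow: column stage first, content stage
# only if no column result has type "error".
run_two_stage <- function(tbl, column_stage, content_stage) {
  column_results <- column_stage(tbl)
  failed <- any(vapply(
    column_results,
    function(r) identical(r$type, "error"),
    logical(1)
  ))
  if (failed) {
    return(column_results)  # stop here: content checks never run
  }
  content_stage(tbl)
}

# Stand-in for the column stage: requires two columns.
column_stage <- function(tbl) {
  missing <- setdiff(c("Firstname", "Surname"), colnames(tbl))
  list(minimal = list(
    type = if (length(missing) == 0) "success" else "error",
    missing = missing
  ))
}

# Stand-in for the content stage: flags missing surnames.
content_stage <- function(tbl) {
  list(check_missing_surname = list(
    type = if (anyNA(tbl$Surname)) "error" else "success"
  ))
}

incomplete <- data.frame(Firstname = "Ada")
complete <- data.frame(Firstname = "Ada", Surname = "Lovelace")

stage_one_only <- run_two_stage(incomplete, column_stage, content_stage)
both_stages <- run_two_stage(complete, column_stage, content_stage)
```

For the incomplete table only the column results come back. For the complete table the content results are returned instead.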
Usage
# Load a validation configuration file
config_path <- "inst/config/title_validation.yaml"
# Create a ValidateOutput instance
validate_output <- ValidateOutput$new(config_path = config_path)
# Run validation on the contributors table
results <- validate_output$run_validations(contributors_table)
print(results)
Public fields
validator: Instance of the Validator class for data validation.
column_validator: Instance of the ColumnValidator class for column validation.
config: Stores the combined YAML validation configuration.