Synthetic Tabular Data Generation with Gaussian Copulas



Documentation for package ‘rsdv’ version 0.2.0

Help Pages

add_constraint                       Add a constraint to metadata
adult_income                         Adult Income dataset (500-row sample)
attribute_disclosure_risk            Attribute disclosure risk
autoplot.rsdv_diagnostic_report      Plot a diagnostic report
autoplot.rsdv_privacy_report         Plot a privacy report
autoplot.rsdv_quality_report         Plot a quality report
check_constraint                     Check a single constraint against each row of a data frame
check_constraints                    Check all constraints in metadata against a data frame
contingency_similarity               Contingency similarity between real and synthetic categorical column pairs
correlation_similarity               Correlation similarity between real and synthetic numerical column pairs
custom_constraint                    Constraint: arbitrary row-wise predicate
diagnostic_report                    Generate a diagnostic (validity) report for synthetic data
equality_constraint                  Constraint: two columns must be equal row-wise
fixed_combinations_constraint        Constraint: only observed column combinations are valid
gaussian_copula_synthesizer          Create a Gaussian Copula synthesizer
inequality_constraint                Constraint: col_a must be less than / greater than col_b
is_fitted                            Check whether a synthesizer has been fitted
ks_similarity                        Kolmogorov-Smirnov similarity score per numerical column
load_metadata                        Load metadata from a JSON file
metadata                             Create a metadata object describing a dataset's column types
metadata_from_json                   Deserialize metadata from a JSON string
metadata_to_json                     Serialize metadata to a JSON string
ml_efficacy                          ML efficacy: train-on-synthetic / test-on-real accuracy ratio (TSTR)
nndr                                 Nearest-Neighbor Distance Ratio privacy score
print.custom_constraint              Print method for a custom_constraint
print.equality_constraint            Print method for an equality_constraint
print.fixed_combinations_constraint  Print method for a fixed_combinations_constraint
print.inequality_constraint          Print method for an inequality_constraint
print.rsdv_diagnostic_report         Print method for rsdv_diagnostic_report
print.rsdv_metadata                  Print method for rsdv_metadata
print.rsdv_privacy_report            Print method for rsdv_privacy_report
print.rsdv_quality_report            Print method for rsdv_quality_report
privacy_report                       Generate a privacy report comparing real and synthetic data
quality_report                       Generate a quality report comparing real and synthetic data
sample                               Sample synthetic rows from a fitted synthesizer
sample_conditions                    Sample synthetic rows that match fixed column values (conditional sampling)
save_metadata                        Save metadata to a JSON file
set_column_type                      Set the type of a column in metadata
set_primary_key                      Set the primary key column of the metadata
tvd_similarity                       Total variation distance similarity score per categorical column
validate_data                        Validate that a data frame is compatible with metadata