Validation Data

Prospective benchmarks. Reproducible protocols. Open data.

Every performance metric on this page was computed on held-out prospective test sets — compounds not in the training data that were evaluated after model training was complete. Test set SMILES and evaluation code are published with each benchmark paper.

r² = 0.82 (FEP correlation vs. experiment)
14 clinical targets benchmarked
312 congeneric pairs in FEP test set

[Figure: scatter plot of predicted vs. experimental binding free energy values, with r² = 0.82 correlation line]

FEP Benchmark

Predicted vs. experimental ΔΔG across 14 targets.

The DrugSynq FEP benchmark uses publicly available congeneric series from ChEMBL and literature sources. Each target was tested prospectively — the model had no access to the test series during training or parameterization.

Target      Class               n pairs   r²     RMSE (kcal/mol)   Source
CDK2        Kinase              28        0.87   0.89              Schindler 2020
p38α MAPK   Kinase              25        0.85   0.94              Wang 2015
Thrombin    Protease            22        0.84   1.02              Wang 2015
Tyk2        Kinase              16        0.91   0.78              Patel 2025
BACE        Aspartyl protease   31        0.79   1.15              Schindler 2020
MCL1        BCL-2 family        24        0.76   1.28              Patel 2025
Showing 6 of 14 benchmarked targets. Full table in published paper. RMSE in kcal/mol. r² = Pearson correlation squared on ΔΔG pairs.
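
As a rough illustration of the two statistics defined in the table note, the sketch below computes the squared Pearson correlation and the RMSE for one target's ΔΔG pairs. The function names and the sample values are assumptions made for this example; they are not taken from the published evaluation code or the benchmark data.

```python
import math

def pearson_r2(pred, exp):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(pred)
    if n != len(exp) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_p = sum(pred) / n
    mean_e = sum(exp) / n
    cov = sum((p - mean_p) * (e - mean_e) for p, e in zip(pred, exp))
    var_p = sum((p - mean_p) ** 2 for p in pred)
    var_e = sum((e - mean_e) ** 2 for e in exp)
    if var_p == 0 or var_e == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov * cov / (var_p * var_e)

def rmse(pred, exp):
    """Root-mean-square error, in the units of the inputs (kcal/mol here)."""
    if len(pred) != len(exp) or not pred:
        raise ValueError("need two non-empty sequences of equal length")
    return math.sqrt(sum((p - e) ** 2 for p, e in zip(pred, exp)) / len(pred))

# Made-up ΔΔG values (kcal/mol) for illustration only; not benchmark data.
predicted = [-1.2, 0.4, 0.9, -0.3, 1.8]
experimental = [-0.9, 0.1, 1.4, -0.6, 1.5]

print(f"r2   = {pearson_r2(predicted, experimental):.2f}")
print(f"RMSE = {rmse(predicted, experimental):.2f} kcal/mol")
```

The two numbers answer different questions: r² measures how well the predictions track the experimental trend, while RMSE measures the absolute size of the errors, so a series can score well on one and poorly on the other.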

Retrospective Benchmark

312 pairs. One scatter plot. No cherry-picking.

The scatter plot shows all 312 congeneric pairs across 14 targets, plotted as predicted ΔΔG (x-axis) vs. experimental ΔΔG (y-axis). Teal dots lie within 1.5 kcal/mol of the diagonal; amber dots lie outside that band and are counted as mispredicted outliers (13.1% of total).

Outliers cluster around scaffold hops and compounds with unusual binding modes — both expected failure cases for perturbation-based FEP. The protocol document (published) lists each outlier with structural rationale.
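
The 1.5 kcal/mol band described above can be sketched as a simple threshold on the prediction error. This example assumes that "within 1.5 kcal/mol of the diagonal" means an absolute error |predicted - experimental| of at most 1.5 kcal/mol; the function name and the sample values are hypothetical and do not come from the published protocol.

```python
THRESHOLD_KCAL = 1.5  # tolerance stated on this page

def split_outliers(pred, exp, threshold=THRESHOLD_KCAL):
    """Return (within, outliers) as lists of pair indices.

    Assumption: a pair is "within" when |pred - exp| <= threshold.
    """
    within, outliers = [], []
    for i, (p, e) in enumerate(zip(pred, exp)):
        if abs(p - e) <= threshold:
            within.append(i)
        else:
            outliers.append(i)
    return within, outliers

# Made-up ΔΔG values (kcal/mol) for illustration only; not benchmark data.
predicted = [0.2, -1.0, 2.5, 0.0]
experimental = [0.5, -0.8, 0.4, 1.5]

within, outliers = split_outliers(predicted, experimental)
outlier_pct = 100.0 * len(outliers) / len(predicted)
print(f"within: {within}, outliers: {outliers} ({outlier_pct:.1f}%)")

# For scale: 13.1% of 312 pairs corresponds to about 41 pairs.
print(round(0.131 * 312))
```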

[Figure: scatter plot of predicted ΔΔG (kcal/mol) vs. experimental ΔΔG (kcal/mol); r² = 0.82, n = 312 pairs. Legend: within 1.5 kcal/mol; outliers (13.1%).]

ADMET Benchmarks

Prospective ADMET model performance.

Endpoint                    AUROC   MCC    Sensitivity   Specificity
hERG Inhibition             0.91    0.73   0.84          0.89
CYP3A4 Inhibition           0.88    0.68   0.81          0.87
Metabolic Stability (HLM)   0.85    0.61   0.77          0.84
Aqueous Solubility          0.87    0.66   0.80          0.86
Caco-2 Permeability         0.89    0.71   0.83          0.90
Prospective test sets. Full benchmark in published paper. See Publications.
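
For readers less familiar with the four columns in the ADMET table, the sketch below computes each of them for a binary endpoint using the standard definitions. The labels, scores, and the 0.5 decision threshold are assumptions for illustration; this page does not state how model scores were binarized, and none of the helper names come from the published code.

```python
import math

def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels coded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn

def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; 0.0 when the denominator vanishes."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

def auroc(y_true, scores):
    """Chance that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Made-up labels and model scores for illustration only; not benchmark data.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 cutoff

tp, tn, fp, fn = confusion_counts(y_true, y_pred)
sens = tp / (tp + fn)  # fraction of true positives recovered
spec = tn / (tn + fp)  # fraction of true negatives recovered
print(f"AUROC={auroc(y_true, scores):.3f} MCC={mcc(tp, tn, fp, fn):.2f} "
      f"sensitivity={sens:.2f} specificity={spec:.2f}")
```

AUROC is computed from the raw scores and does not depend on the cutoff, whereas MCC, sensitivity, and specificity all change if the threshold moves.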

Dataset Details

Where the data comes from.

Source                      Data Type                         Records
ChEMBL 33                   Binding affinity (Ki, Kd, IC50)   1.2M
PubChem BioAssay            ADMET in vitro panels             247K
Literature (curated)        FEP congeneric series             312 pairs
QM reference calculations   ML potential correction terms     18K conformers

Reproducibility

All benchmarks are reproducible from published code.

Every benchmark on this page corresponds to a published paper with data splits, evaluation code, and test set SMILES. Links to code repositories and supporting information are provided in each publication.

Open Evaluation Code

Benchmark evaluation scripts are published to a public repository. Clone, run, verify. If your run does not reproduce our reported numbers, please report the discrepancy as an issue.
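
One way to compare a local run against the reported numbers is a simple tolerance check, sketched below. The reported values are the CDK2 and Tyk2 rows from the FEP table on this page; the reproduced values, the 0.01 tolerance, and the function name are hypothetical choices for this example, not part of the published scripts.

```python
# Reported values copied from the FEP table on this page (RMSE in kcal/mol).
reported = {
    "CDK2": {"r2": 0.87, "rmse": 0.89},
    "Tyk2": {"r2": 0.91, "rmse": 0.78},
}

# Hypothetical output of a local reproducibility run.
reproduced = {
    "CDK2": {"r2": 0.871, "rmse": 0.893},
    "Tyk2": {"r2": 0.88, "rmse": 0.78},
}

def find_discrepancies(reported, reproduced, tolerance=0.01):
    """List (target, metric, reported, reproduced) where values differ by more than tolerance."""
    found = []
    for target, metrics in reported.items():
        for name, ref_value in metrics.items():
            new_value = reproduced[target][name]
            if abs(ref_value - new_value) > tolerance:
                found.append((target, name, ref_value, new_value))
    return found

discrepancies = find_discrepancies(reported, reproduced)
for target, name, ref_value, new_value in discrepancies:
    print(f"{target} {name}: reported {ref_value}, reproduced {new_value}")
```

Because the table reports two decimal places, a tolerance much tighter than 0.01 would flag ordinary rounding differences rather than real discrepancies.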

Prospective Not Retrospective

Test compounds were evaluated after models were fully trained. We explicitly publish the train/test split date boundary so you can verify no future compounds entered the training set.
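
A check against the published split boundary could look like the sketch below, which flags any training record dated on or after the boundary. The record layout, the field name `first_reported`, the compound IDs, and the boundary date are all hypothetical; the real split date and data format are defined in each publication.

```python
from datetime import date

def leaked_compounds(train_records, split_date):
    """Return IDs of training records dated on or after the split boundary."""
    return [r["id"] for r in train_records if r["first_reported"] >= split_date]

# Hypothetical boundary and records, for illustration only.
split_date = date(2023, 1, 1)
train_records = [
    {"id": "cmpd-001", "first_reported": date(2021, 5, 4)},
    {"id": "cmpd-002", "first_reported": date(2022, 12, 31)},
    {"id": "cmpd-003", "first_reported": date(2023, 3, 15)},
]

leaks = leaked_compounds(train_records, split_date)
if leaks:
    print("Training records on or after the split date:", leaks)
else:
    print("No training records fall on or after the split date.")
```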

Independent Review

Benchmark methodology was reviewed by independent computational chemists before publication. Reviewer comments and author responses are included in the published supporting information.

Numbers you can trust. Science you can verify.

Schedule a methodology review with our team to evaluate accuracy on your specific target class before committing to a subscription.