BIOPHYSICS · DISORDER METRICS
Physics First: Explainable Protein Disorder with EWCL
Protein disorder isn't binary: it's a continuum driven by charge, hydropathy, flexibility, and curvature. Unlike black-box AI models, Entropy-Weighted Collapse Likelihood (EWCL) delivers a transparent, physics-grounded score in [0,1] for each residue, inspired by Uversky's charge-hydropathy framework. The score tracks shifts from stability to collapse, supporting cross-protein comparisons, evolutionary insights, and triage of flexible regions. It turns vague notions of disorder into precise, reproducible metrics for studying protein function, disease, and design.
Methods summary
EWCL (Entropy-Weighted Collapse Likelihood) estimates residue-level collapse/disorder on [0,1] by combining hydropathy and charge entropies, mobility proxies (B-factors or pLDDT-derived weights), and curvature penalties. A bounded transform with per-protein z-normalization preserves magnitude and enables cross-protein comparisons.
Working score: EWCL_i = α·H_hydro(i) + β·H_charge(i) + γ·Flex_i − δ·Curv_i
Coloring: 5 EWCL bins, from stable to collapse-prone.
Hover: shows residue, EWCL, and AF2 pLDDT.
Guardrail: a high-confidence residue (pLDDT ≥ 90 and EWCL ≤ 0.4) is never flagged.
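The binning and guardrail described above can be sketched roughly as follows. The page states only that there are five bins and gives the guardrail thresholds. The equal-width bin edges and the function names `ewcl_bin` and `is_guarded` are assumptions made for illustration, not the tool's actual implementation.

```python
def ewcl_bin(score: float) -> int:
    """Map an EWCL score in [0, 1] to a bin index 0..4.

    Assumption: five equal-width bins; the real bin edges are not documented here.
    Bin 0 is the most stable, bin 4 the most collapse-prone.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("EWCL score must lie in [0, 1]")
    return min(int(score * 5), 4)  # a score of exactly 1.0 goes into the last bin


def is_guarded(ewcl: float, plddt: float) -> bool:
    """Guardrail from the text: pLDDT >= 90 and EWCL <= 0.4 is never flagged."""
    return plddt >= 90.0 and ewcl <= 0.4


if __name__ == "__main__":
    # Toy (residue, EWCL, pLDDT) values, invented for illustration.
    for residue, ewcl, plddt in [("M1", 0.10, 96.0), ("G57", 0.85, 41.0)]:
        print(residue, "bin", ewcl_bin(ewcl), "guarded", is_guarded(ewcl, plddt))
```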
Why EWCL?
The Scientific Basis of EWCL Disorder Prediction
EWCL advances beyond black-box modeling by integrating a physics-based framework, yielding a continuous [0,1] score per residue derived from charge–hydropathy dispersion, mobility, and curvature. This approach ensures a consistent, interpretable metric across sequence analyses, PDB structural validations, and AlphaFold hallucination audits, with no need for label recalibration.
Per-protein z-normalization enables score comparability across datasets, supporting evolutionary analyses, precise identification of flexible or collapse-prone regions, and reliable disorder assessments. With a validated 0.922 ROC-AUC on the DisProt dataset, EWCL outperforms conventional methods, providing reproducible, actionable insights for researchers and bioinformaticians.
Core Methodology
Understanding EWCL: The Core Methodology
Entropy-Weighted Collapse Likelihood (EWCL) quantifies protein disorder through a physics-grounded framework. It integrates hydropathy and charge entropies, flexibility proxies (B-factors or pLDDT-derived weights), and curvature penalties into a continuous [0,1] per-residue score. A bounded transform with per-protein z-normalization ensures values remain interpretable and comparable across proteins, enabling systematic study of disorder and collapse.
EWCL_i = α·H_hydro(i) + β·H_charge(i) + γ·Flex_i − δ·Curv_i
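As a minimal sketch of how this weighted sum could be turned into a bounded per-residue score: the text does not specify the bounded transform, so a logistic sigmoid applied to per-protein z-scores is assumed here. The default weights of 1.0 and the function names `raw_ewcl` and `bounded_ewcl` are placeholders, not published values.

```python
import math
from statistics import mean, pstdev
from typing import List, Sequence


def raw_ewcl(h_hydro: Sequence[float], h_charge: Sequence[float],
             flex: Sequence[float], curv: Sequence[float],
             alpha: float = 1.0, beta: float = 1.0,
             gamma: float = 1.0, delta: float = 1.0) -> List[float]:
    """Per-residue weighted sum: alpha*H_hydro + beta*H_charge + gamma*Flex - delta*Curv."""
    if not (len(h_hydro) == len(h_charge) == len(flex) == len(curv)):
        raise ValueError("all per-residue inputs must have the same length")
    return [alpha * hh + beta * hc + gamma * f - delta * c
            for hh, hc, f, c in zip(h_hydro, h_charge, flex, curv)]


def _sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def bounded_ewcl(raw: Sequence[float]) -> List[float]:
    """Z-normalize within one protein, then squash into (0, 1).

    Assumption: the sigmoid stands in for the unspecified bounded transform.
    """
    mu, sd = mean(raw), pstdev(raw)
    if sd == 0.0:
        return [0.5] * len(raw)  # no variation within the protein
    return [_sigmoid((x - mu) / sd) for x in raw]
```

With this choice, a residue at the protein's mean raw score maps to 0.5, and the ordering of residues is preserved.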
Applications:
- Calibrated thresholds & triage: Preserves magnitude for ranked evaluation of regions.
- Unified trace: Powers Sequencer, PDB overlays, and AlphaFold2 audits from a single consistent metric.
- Hallucination rule: H = σ(λ·(EWCL − (1 − pLDDT/100))) flags collapse/disorder when H ≥ τ and pLDDT ≥ pLDDT_min.
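The hallucination rule in the last item can be written out directly. In this sketch, the values chosen for λ, τ, and the minimum pLDDT (`lam=10.0`, `tau=0.5`, `plddt_min=70.0`) are hypothetical defaults picked for illustration; the page does not state the values actually used.

```python
import math


def hallucination_score(ewcl: float, plddt: float, lam: float = 10.0) -> float:
    """H = sigmoid(lam * (EWCL - (1 - pLDDT/100))).

    H is high when EWCL indicates disorder but pLDDT reports high confidence.
    """
    z = lam * (ewcl - (1.0 - plddt / 100.0))
    return 1.0 / (1.0 + math.exp(-z))


def is_flagged(ewcl: float, plddt: float, lam: float = 10.0,
               tau: float = 0.5, plddt_min: float = 70.0) -> bool:
    """Flag a residue when H >= tau and pLDDT >= plddt_min (thresholds assumed)."""
    return hallucination_score(ewcl, plddt, lam) >= tau and plddt >= plddt_min


if __name__ == "__main__":
    # Toy inputs: a confident-looking but collapse-prone residue, and a low-pLDDT one.
    print(is_flagged(0.9, 95.0))  # EWCL and pLDDT disagree strongly
    print(is_flagged(0.9, 50.0))  # pLDDT is below the minimum, so no flag
```

The pLDDT floor means the rule only fires where the structure prediction claims confidence, which is where a disagreement with EWCL is informative.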
Hydropathy / Charge Entropy
Captures solvent exposure and baseline disorder tendencies.
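The page does not define how these entropies are computed. One plausible reading, sketched here as an assumption, is a Shannon entropy over residue classes within a sliding window. The three-way charge classification (K/R positive, D/E negative, everything else neutral, including histidine) and the window size are illustrative choices.

```python
import math
from collections import Counter
from typing import Callable, List, Sequence


def charge_class(aa: str) -> str:
    """Coarse charge class for a one-letter residue code (histidine treated as neutral)."""
    if aa in ("K", "R"):
        return "+"
    if aa in ("D", "E"):
        return "-"
    return "0"


def shannon_entropy(symbols: Sequence[str]) -> float:
    """Shannon entropy in bits of the empirical symbol distribution."""
    n = len(symbols)
    counts = Counter(symbols)
    return sum(-(c / n) * math.log2(c / n) for c in counts.values())


def windowed_entropy(seq: str, classify: Callable[[str], str],
                     window: int = 9) -> List[float]:
    """Entropy of residue classes in a centered window, truncated at the termini."""
    half = window // 2
    out = []
    for i in range(len(seq)):
        lo, hi = max(0, i - half), min(len(seq), i + half + 1)
        out.append(shannon_entropy([classify(a) for a in seq[lo:hi]]))
    return out
```

A hydropathy entropy could reuse `windowed_entropy` with a classifier that bins residues by a hydropathy scale instead of charge.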
Flexibility Proxies
Identify motion-prone regions via B-factors or pLDDT.
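A rough sketch of two such proxies follows. The pLDDT mapping mirrors the (1 − pLDDT/100) term in the hallucination rule; the per-protein min-max scaling of B-factors is an assumption made here, since the page does not say how B-factors are normalized.

```python
from typing import List, Sequence


def flex_from_plddt(plddt: Sequence[float]) -> List[float]:
    """Flexibility proxy in [0, 1]: low pLDDT maps to high flexibility."""
    for p in plddt:
        if not 0.0 <= p <= 100.0:
            raise ValueError("pLDDT values must lie in [0, 100]")
    return [1.0 - p / 100.0 for p in plddt]


def flex_from_bfactors(bfactors: Sequence[float]) -> List[float]:
    """Flexibility proxy in [0, 1] via per-protein min-max scaling (assumed)."""
    lo, hi = min(bfactors), max(bfactors)
    if hi == lo:
        return [0.0] * len(bfactors)  # no variation, so no residue stands out
    return [(b - lo) / (hi - lo) for b in bfactors]
```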
Curvature Penalties
Highlight loop kinks and collapse-prone bends.
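The curvature term is not defined on this page. As one hedged illustration, backbone bending can be approximated by the turning angle between consecutive Cα-to-Cα vectors. The function name `turning_angles` and this particular definition are assumptions, not EWCL's documented method.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def turning_angles(ca_coords: Sequence[Point]) -> List[float]:
    """Angle in radians between successive C-alpha trace segments at each residue.

    Terminal residues have no neighbor on one side and are assigned 0.0.
    """
    n = len(ca_coords)
    angles = [0.0] * n
    for i in range(1, n - 1):
        u = [b - a for a, b in zip(ca_coords[i - 1], ca_coords[i])]
        v = [b - a for a, b in zip(ca_coords[i], ca_coords[i + 1])]
        nu, nv = math.hypot(*u), math.hypot(*v)
        if nu == 0.0 or nv == 0.0:
            continue  # duplicate coordinates: leave the angle at 0.0
        cos_t = sum(x * y for x, y in zip(u, v)) / (nu * nv)
        angles[i] = math.acos(max(-1.0, min(1.0, cos_t)))  # clamp rounding error
    return angles
```

A straight trace gives angles of zero, while a sharp kink approaches π.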
Why EWCL? — Evidence across curated disorder and structure sets
Metrics below are pooled over residues against MobiDB ground truths (DisProt, Merge, IDEAL). Sequencer uses sequence-only scores; PDB Physics uses structure-aware overlays. PR-AUC is prevalence-aware; counts shown per set.
EWCLv1, sequence-only analysis across 3,740 proteins (2.29M residues).
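For readers who want to reproduce pooled residue-level metrics on their own data, here is a small stdlib-only sketch. ROC-AUC is computed from ranks (the Mann-Whitney formulation) and PR-AUC is approximated by average precision, which may differ from the page's exact procedure. The labels and scores in the usage example are toy values, not EWCL results.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Rank-based ROC-AUC with average ranks for tied scores."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative label")
    pairs = sorted(zip(scores, labels))
    rank_sum, i = 0.0, 0
    while i < len(pairs):
        j = i
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of 1-based ranks i+1 .. j
        rank_sum += avg_rank * sum(lab for _, lab in pairs[i:j])
        i = j
    return (rank_sum - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Mean precision at each positive, scanning by descending score.

    Tied scores are processed in input order, a simplification.
    """
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("need at least one positive label")
    ranked = sorted(zip(scores, labels), key=lambda p: p[0], reverse=True)
    hits, total = 0, 0.0
    for rank, (_, lab) in enumerate(ranked, start=1):
        if lab:
            hits += 1
            total += hits / rank
    return total / n_pos


if __name__ == "__main__":
    y = [0, 0, 1, 1]              # toy disorder labels pooled over residues
    s = [0.1, 0.4, 0.35, 0.8]     # toy per-residue scores
    print(roc_auc(y, s), average_precision(y, s))
```

A random scorer has an expected average precision equal to the fraction of disordered residues, which is why PR-AUC values should be read against each set's prevalence.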
DisProt disorder
Merge disorder
IDEAL disorder
Acknowledgements
We thank data providers including UniProt, AlphaFold DB, DisProt, IDEAL, and MobiDB. The development of EWCL was shaped by conversations with Vladimir Uversky, whose pioneering insights into intrinsic disorder and transitions between order and flexibility inspired both the scope and presentation of this work.
Ready when you are
Run EWCL Across Your Pipelines Without Losing Context
EWCL delivers a unified residue-level trace on [0,1] that remains consistent across workflows. Upload FASTA sequences, stream PDB structures, or connect via API to apply the same physics-grounded scoring to triage, structure overlays, and AlphaFold audits. By preserving context across inputs, EWCL enables reproducible comparisons and seamless integration into research pipelines.