The Entropy-Weighted Collapse Likelihood (EWCL) model combines structural physics, hydropathy and charge entropy, and geometric penalties to predict collapse and flexibility in proteins. EWCL is benchmarked against three reference sources: X-ray B-factors, AlphaFold pLDDT confidence scores, and DisProt disorder annotations.
$$\mathrm{EWCL}_i = \alpha\, H_{\mathrm{hydro}}(i) + \beta\, H_{\mathrm{charge}}(i) + \gamma\, \mathrm{Flex}_i - \delta\, \mathrm{Curv}_i$$
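As a minimal sketch of how the per-residue score above could be evaluated, the Python snippet below combines four precomputed per-residue terms into one EWCL value per residue. The function name `ewcl_scores`, the argument names, and the default weights of 1.0 are illustrative assumptions; the source does not state how the individual terms are computed or what values α, β, γ, and δ take.

```python
from typing import List, Sequence


def ewcl_scores(
    h_hydro: Sequence[float],
    h_charge: Sequence[float],
    flex: Sequence[float],
    curv: Sequence[float],
    alpha: float = 1.0,  # placeholder weight, not a published value
    beta: float = 1.0,   # placeholder weight, not a published value
    gamma: float = 1.0,  # placeholder weight, not a published value
    delta: float = 1.0,  # placeholder weight, not a published value
) -> List[float]:
    """Combine per-residue terms as alpha*Hhydro + beta*Hcharge + gamma*Flex - delta*Curv.

    Each input holds one precomputed value per residue, in sequence order.
    How those values are derived is outside the scope of this sketch.
    """
    n = len(h_hydro)
    if not (len(h_charge) == len(flex) == len(curv) == n):
        raise ValueError("all per-residue inputs must have the same length")
    return [
        alpha * hh + beta * hc + gamma * f - delta * c
        for hh, hc, f, c in zip(h_hydro, h_charge, flex, curv)
    ]


# Toy two-residue example with made-up term values.
scores = ewcl_scores([1.0, 0.5], [0.5, 0.5], [0.25, 0.0], [0.25, 1.0])
```

The curvature term enters with a negative sign, so a larger geometric penalty lowers the score for that residue while the entropy and flexibility terms raise it.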
Benchmarked on 29,000+ proteins across AlphaFold, DisProt, and high-resolution X-ray datasets.
β-galactoside-binding lectin
The EWCL (Entropy-Weighted Collapse Likelihood) method enables residue-level prediction of disorder beyond traditional annotation, revealing "cryptic" disordered regions that are not captured in DisProt or UniProt disorder databases.
Key findings: Cryptic residues (orange markers) represent putative disordered or highly flexible regions that are missed by current annotation but are supported by low complexity or compositional bias signals (magenta). This approach highlights "hidden" disorder, especially in low-complexity or bias-prone regions, which may be functionally relevant for regulation, phase separation, or protein-protein interaction.
Methodology: We combine entropy-aware collapse likelihood (reverse EWCL scores > 0.7), structural confidence and flexibility measures (pLDDT, B-factor), and compositional bias overlays to systematically identify residues and segments that show high predicted disorder but have no existing disorder annotation.
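The core selection step described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' pipeline: it takes reverse EWCL scores as a precomputed list, treats existing disorder annotation as a set of 1-based residue positions, and leaves out the pLDDT, B-factor, and compositional bias overlays, since the source does not say how they are combined. The names `cryptic_residues` and `to_segments` are hypothetical.

```python
from typing import Iterable, List, Sequence, Tuple


def cryptic_residues(
    rev_ewcl: Sequence[float],
    annotated: Iterable[int],
    threshold: float = 0.7,  # cutoff quoted in the text
) -> List[int]:
    """Return 1-based positions scoring above the threshold with no disorder annotation."""
    annotated_set = set(annotated)
    return [
        pos
        for pos, score in enumerate(rev_ewcl, start=1)
        if score > threshold and pos not in annotated_set
    ]


def to_segments(positions: Sequence[int]) -> List[Tuple[int, int]]:
    """Merge consecutive positions into inclusive (start, end) segments."""
    segments: List[Tuple[int, int]] = []
    for pos in sorted(positions):
        if segments and pos == segments[-1][1] + 1:
            # Extend the current segment by one residue.
            segments[-1] = (segments[-1][0], pos)
        else:
            # Start a new segment at this residue.
            segments.append((pos, pos))
    return segments


# Toy example: seven residues, with residue 1 already annotated as disordered.
scores = [0.9, 0.8, 0.2, 0.75, 0.71, 0.7, 0.95]
hits = cryptic_residues(scores, annotated={1})
segments = to_segments(hits)
```

The comparison is strict, so a residue scoring exactly 0.7 is not flagged. Grouping hits into segments is an added convenience for reporting residues and segments together.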
Applications: 3D structure overlays allow interactive visualization of cryptic disorder, compositional bias, and all underlying scores per residue. This comprehensive view helps address annotation gaps and provides a basis for experimental validation of novel disordered regions.