Help
1. Introduction
We have developed a sequence-based tool for predicting protein-protein interaction (PPI) sites. After systematically evaluating multiple pre-trained models and optimizing hyperparameters, we selected the best-performing model and made it publicly available on Hugging Face. The predictor requires only the target protein sequence as input (no binding partner information is needed) and performed well in benchmark tests.
Using this predictor, we conducted a comprehensive analysis of 20,841 human protein sequences at the whole-proteome scale. All prediction results have been standardized and integrated into our web-accessible database. We have also incorporated variant data from three authoritative databases: ClinVar, gnomAD, and UK Biobank.
2. Mutation
Variants were annotated using Variant Effect Predictor (VEP) to obtain corresponding amino acid change information, enabling assessment of their potential impact on PPI sites. The dataset includes these key fields:
- Core Information: POS (genomic position), Uploaded_variation (variant ID), REF/ALT (nucleotide change)
- Clinical Annotation: CLNREVSTAT (ClinVar-specific review status), clin_sig (clinical significance), Consequence (variant effect)
- Protein Impact: Protein (Swiss-Prot), Protein_position (amino acid position), Amino_acids (residue change)
- Functional Predictions: SIFT/PolyPhen scores and prediction types
- PPI: interaction site prediction
- Structural Mapping (3Dmapper_result):
  - UniProt sequences were mapped to PDB structures through sequence alignment
  - Thresholds were set for sequence identity (Pident) and coverage
  - The PDB structure with the highest sequence identity was selected
  - Mapped positions were classified as: Structure (structural region), Interface (interaction interface), Unmap (unmapped), Noncoding (non-coding region)
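The structure-selection step described above can be sketched as follows. This is only an illustration, not the pipeline's actual code: the `Hit` class, the `select_structure` function, the threshold values (50% identity, 0.8 coverage), and the PDB IDs are all hypothetical, since this page does not state the thresholds that were used.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Hit:
    """One alignment of a UniProt sequence against a PDB chain (hypothetical record)."""
    pdb_id: str
    pident: float    # percent sequence identity of the alignment
    coverage: float  # fraction of the UniProt sequence covered by the alignment


def select_structure(hits: List[Hit],
                     min_pident: float = 50.0,    # assumed threshold, not from this page
                     min_coverage: float = 0.8    # assumed threshold, not from this page
                     ) -> Optional[Hit]:
    """Keep hits that pass both thresholds, then return the one with highest identity."""
    passing = [h for h in hits
               if h.pident >= min_pident and h.coverage >= min_coverage]
    if not passing:
        return None  # no usable structure; positions would be reported as Unmap
    return max(passing, key=lambda h: h.pident)


# Placeholder hits: the second has the highest identity but fails the coverage cut.
hits = [Hit("1ABC", 98.5, 0.95), Hit("2XYZ", 99.0, 0.40), Hit("3DEF", 72.0, 0.90)]
best = select_structure(hits)
print(best.pdb_id if best else "no structure")
```

Filtering before ranking means a short, high-identity fragment cannot displace a structure that covers most of the protein.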
3. Proteome
We provide PPI site predictions for all 20,841 human proteins.
4. Download
Bulk downloads are available in CSV format.
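As a rough sketch of how a downloaded file might be filtered, the example below lists variants that change the amino acid at a predicted PPI site. It makes several assumptions that should be checked against the actual file: that the CSV header uses the field names listed in the Mutation section, that the PPI column is named `PPI` and holds `1` for a predicted site, and that `Amino_acids` follows the VEP `ref/alt` form (for example `R/H`). The rows in `SAMPLE` are invented placeholders, not real data.

```python
import csv
import io

# Invented placeholder rows standing in for a downloaded CSV file.
SAMPLE = """Uploaded_variation,Protein,Protein_position,Amino_acids,Consequence,PPI,3Dmapper_result
var1,PROT_A,175,R/H,missense_variant,1,Interface
var2,PROT_A,12,G/G,synonymous_variant,0,Structure
var3,PROT_B,88,L/P,missense_variant,1,Unmap
"""


def residue_changes_at_ppi_sites(handle):
    """Return (variant ID, protein, position, ref, alt) for residue changes at predicted PPI sites."""
    selected = []
    for row in csv.DictReader(handle):
        # Split "R/H" into reference and alternate residues; alt is "" if there is no "/".
        ref, _, alt = row["Amino_acids"].partition("/")
        changes_residue = bool(alt) and ref != alt
        at_ppi_site = row["PPI"] == "1"  # assumed encoding of the PPI prediction
        if changes_residue and at_ppi_site:
            selected.append((row["Uploaded_variation"], row["Protein"],
                             int(row["Protein_position"]), ref, alt))
    return selected


variants = residue_changes_at_ppi_sites(io.StringIO(SAMPLE))
for variant in variants:
    print(variant)
```

To run this on a real download, replace `io.StringIO(SAMPLE)` with an open file handle, for example `open(path, newline="")`.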
Contact Us
For technical support or inquiries, please contact us at:
Email: houqingzhen@sdu.edu.cn
Address: The Center for Integrative Bioinformatics, National Institute of Health Data Science of China, Shandong University, Shandong, P. R. China.