Protein Interaction Network Analysis - Claude MCP Skill
Analyze protein-protein interaction networks using STRING, BioGRID, and SASBDB databases. Maps protein identifiers, retrieves interaction networks with confidence scores, performs functional enrichment analysis (GO/KEGG/Reactome), and optionally includes structural data. No API key required for core functionality (STRING). Use when analyzing protein networks, discovering interaction partners, identifying functional modules, or studying protein complexes.
# Protein Interaction Network Analysis
Comprehensive protein interaction network analysis using ToolUniverse tools. Analyzes protein networks through a 4-phase workflow: identifier mapping, network retrieval, enrichment analysis, and optional structural data.
## Features
- ✅ **Identifier Mapping** - Convert protein names to database IDs (STRING, UniProt, Ensembl)
- ✅ **Network Retrieval** - Get interaction networks with confidence scores (0-1.0)
- ✅ **Functional Enrichment** - GO terms, KEGG pathways, Reactome pathways
- ✅ **PPI Enrichment** - Test if proteins form functional modules
- ✅ **Structural Data** - Optional SAXS/SANS solution structures (SASBDB)
- ✅ **Fallback Strategy** - STRING primary (no API key) → BioGRID secondary (if key available)
## Databases Used
| Database | Coverage | API Key | Purpose |
|----------|----------|---------|---------|
| **STRING** | 14M+ proteins, 5,000+ organisms | ❌ Not required | Primary interaction source |
| **BioGRID** | 2.3M+ interactions, 80+ organisms | ✅ Required | Fallback, curated data |
| **SASBDB** | 2,000+ SAXS/SANS entries | ❌ Not required | Solution structures |
## Quick Start
### Basic Usage
```python
from tooluniverse import ToolUniverse
from python_implementation import analyze_protein_network
# Initialize ToolUniverse
tu = ToolUniverse()
# Analyze protein network
result = analyze_protein_network(
    tu=tu,
    proteins=["TP53", "MDM2", "ATM", "CHEK2"],
    species=9606,          # Human
    confidence_score=0.7,  # High confidence
)
# Access results
print(f"Mapped: {len(result.mapped_proteins)} proteins")
print(f"Network: {result.total_interactions} interactions")
print(f"Enrichment: {len(result.enriched_terms)} GO terms")
print(f"PPI p-value: {result.ppi_enrichment.get('p_value', 1.0):.2e}")
```
### Expected Output
```
🔍 Phase 1: Mapping 4 protein identifiers...
✅ Mapped 4/4 proteins (100.0%)
🕸️ Phase 2: Retrieving interaction network...
✅ STRING: Retrieved 6 interactions
🧬 Phase 3: Performing enrichment analysis...
✅ Found 245 enriched GO terms (FDR < 0.05)
✅ PPI enrichment significant (p=3.45e-05)
✅ Analysis complete!
```
## Use Cases
### 1. Single Protein Analysis
Discover interaction partners for a protein of interest:
```python
result = analyze_protein_network(
    tu=tu,
    proteins=["TP53"],  # Single protein
    species=9606,
    confidence_score=0.7,
)
# Print the first five edges of the returned network
# (sort by edge["score"] first if you need the strongest partners)
for edge in result.network_edges[:5]:
    print(f"{edge['preferredName_A']} ↔ {edge['preferredName_B']} "
          f"(score: {edge['score']})")
```
### 2. Protein Complex Validation
Test whether a set of proteins interacts more than expected by chance, which suggests a functional module or complex:
```python
# DNA damage response proteins
proteins = ["TP53", "ATM", "CHEK2", "BRCA1", "BRCA2"]
result = analyze_protein_network(tu=tu, proteins=proteins)
# Check PPI enrichment
if result.ppi_enrichment.get("p_value", 1.0) < 0.05:
    print("✅ Proteins form a functional module!")
    print(f"   Expected edges: {result.ppi_enrichment['expected_number_of_edges']:.1f}")
    print(f"   Observed edges: {result.ppi_enrichment['number_of_edges']}")
else:
    print("⚠️ Proteins may be unrelated")
```
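The ratio of observed to expected edges gives a rough effect size to read alongside the p-value. The following is a minimal sketch, assuming `ppi_enrichment` carries the `number_of_edges`, `expected_number_of_edges`, and `p_value` keys used above; `summarize_ppi_enrichment` is a hypothetical helper, not part of the skill, and the numbers are illustrative rather than real STRING output.

```python
def summarize_ppi_enrichment(ppi, alpha=0.05):
    """Return (is_significant, observed/expected edge ratio) for a PPI enrichment dict."""
    observed = ppi.get("number_of_edges", 0)
    expected = ppi.get("expected_number_of_edges", 0.0)
    p_value = ppi.get("p_value", 1.0)

    # Guard against division by zero when no edges are expected.
    if expected > 0:
        ratio = observed / expected
    elif observed > 0:
        ratio = float("inf")
    else:
        ratio = 0.0
    return p_value < alpha, ratio


# Illustrative values only (hypothetical, not taken from a real query).
example = {"number_of_edges": 10, "expected_number_of_edges": 2.0, "p_value": 1e-4}
significant, ratio = summarize_ppi_enrichment(example)
print(f"significant={significant}, observed/expected={ratio:.1f}")
```

A missing or empty dictionary falls through to "not significant" with a ratio of 0.0, which mirrors the `.get("p_value", 1.0)` default used in the example above.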
### 3. Pathway Discovery
Find enriched pathways for a protein set:
```python
result = analyze_protein_network(
    tu=tu,
    proteins=["MAPK1", "MAPK3", "RAF1", "MAP2K1"],  # MAPK pathway
    confidence_score=0.7,
)
# Show top enriched processes
print("\nTop Enriched Pathways:")
for term in result.enriched_terms[:10]:
    print(f"  {term['term']}: p={term['p_value']:.2e}, FDR={term['fdr']:.2e}")
```
### 4. Multi-Protein Network Analysis
Build a complete interaction network for multiple proteins:
```python
import pandas as pd

# Apoptosis regulators
proteins = ["TP53", "BCL2", "BAX", "CASP3", "CASP9"]
result = analyze_protein_network(
    tu=tu,
    proteins=proteins,
    confidence_score=0.7,
)
# Export the edge list as TSV for import into Cytoscape
df = pd.DataFrame(result.network_edges)
df.to_csv("apoptosis_network.tsv", sep="\t", index=False)
```
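Before opening the export in Cytoscape, a quick degree count can show which proteins act as hubs. This is a sketch using only the standard library; it assumes edges use the `preferredName_A` / `preferredName_B` keys and that each protein pair appears once in the list. `rank_by_degree` is a hypothetical helper and the sample edges are made up for illustration.

```python
from collections import Counter


def rank_by_degree(edges):
    """Count how many edges touch each protein, most connected first."""
    degree = Counter()
    for edge in edges:
        degree[edge["preferredName_A"]] += 1
        degree[edge["preferredName_B"]] += 1
    return degree.most_common()


# Hypothetical edges for illustration; real ones come from result.network_edges.
sample_edges = [
    {"preferredName_A": "TP53", "preferredName_B": "BCL2"},
    {"preferredName_A": "TP53", "preferredName_B": "BAX"},
    {"preferredName_A": "BAX", "preferredName_B": "BCL2"},
    {"preferredName_A": "TP53", "preferredName_B": "CASP3"},
]

for protein, degree in rank_by_degree(sample_edges):
    print(f"{protein}: {degree}")
```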
### 5. With BioGRID Validation
Use BioGRID for experimentally validated interactions:
```python
# Requires BIOGRID_API_KEY in environment
result = analyze_protein_network(
    tu=tu,
    proteins=["TP53", "MDM2"],
    include_biogrid=True,  # Enable BioGRID fallback
)
print(f"Primary source: {result.primary_source}") # "STRING" or "BioGRID"
```
### 6. Including Structural Data
Add SAXS/SANS solution structures:
```python
result = analyze_protein_network(
    tu=tu,
    proteins=["TP53"],
    include_structure=True,  # Query SASBDB
)
if result.structural_data:
    print(f"\nFound {len(result.structural_data)} SAXS/SANS entries:")
    for entry in result.structural_data:
        print(f"  {entry.get('sasbdb_id')}: {entry.get('title')}")
```
## Parameters
### `analyze_protein_network()` Parameters
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `tu` | ToolUniverse | Required | ToolUniverse instance |
| `proteins` | list[str] | Required | Protein identifiers (gene symbols, UniProt IDs) |
| `species` | int | 9606 | NCBI taxonomy ID (9606=human, 10090=mouse) |
| `confidence_score` | float | 0.7 | Min interaction confidence (0-1). 0.4=low, 0.7=high, 0.9=very high |
| `include_biogrid` | bool | False | Use BioGRID if STRING fails (requires API key) |
| `include_structure` | bool | False | Include SASBDB structural data (slower) |
| `suppress_warnings` | bool | True | Suppress ToolUniverse loading warnings |
### Species IDs (Common)
- `9606` - Homo sapiens (human)
- `10090` - Mus musculus (mouse)
- `10116` - Rattus norvegicus (rat)
- `7227` - Drosophila melanogaster (fruit fly)
- `6239` - Caenorhabditis elegans (worm)
- `7955` - Danio rerio (zebrafish)
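- `4932` - Saccharomyces cerevisiae (yeast; STRING indexes yeast under this species-level ID, while `559292` is the S288C strain ID used by UniProt)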
### Confidence Score Guidelines
| Score | Level | Description | Use Case |
|-------|-------|-------------|----------|
| 0.15 | Very low | All evidence | Exploratory, hypothesis generation |
| 0.4 | Low | Medium evidence | Default STRING threshold |
| 0.7 | High | Strong evidence | **Recommended** - reliable interactions |
| 0.9 | Very high | Strongest evidence | Core interactions only |
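
STRING's public REST API expresses this threshold on a 0-1000 scale through its `required_score` parameter, so a 0-1 `confidence_score` has to be rescaled somewhere before the request is made. How the skill or ToolUniverse does this internally is not shown here; the sketch below only illustrates the conversion by building a request URL for STRING's `network` endpoint. `build_network_url` is a hypothetical helper and performs no network call.

```python
from urllib.parse import urlencode

STRING_API_BASE = "https://string-db.org/api"


def build_network_url(proteins, species=9606, confidence_score=0.7):
    """Build a STRING 'network' request URL from a 0-1 confidence score."""
    if not 0.0 <= confidence_score <= 1.0:
        raise ValueError("confidence_score must be between 0 and 1")
    params = {
        # STRING expects identifiers separated by carriage returns (%0D once encoded).
        "identifiers": "\r".join(proteins),
        "species": species,
        # Rescale 0-1 to STRING's 0-1000 integer scale.
        "required_score": round(confidence_score * 1000),
    }
    return f"{STRING_API_BASE}/json/network?{urlencode(params)}"


print(build_network_url(["TP53", "MDM2"], confidence_score=0.7))
```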
## Results Structure
### `ProteinNetworkResult` Object
```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class ProteinNetworkResult:
    # Phase 1: Identifier mapping
    mapped_proteins: List[Dict[str, Any]]
    mapping_success_rate: float
    # Phase 2: Network retrieval
    network_edges: List[Dict[str, Any]]
    total_interactions: int
    # Phase 3: Enrichment analysis
    enriched_terms: List[Dict[str, Any]]
    ppi_enrichment: Dict[str, Any]
    # Phase 4: Structural data (optional)
    structural_data: Optional[List[Dict[str, Any]]]
    # Metadata
    primary_source: str  # "STRING" or "BioGRID"
    warnings: List[str]
```
### Network Edge Format (STRING)
```python
{
    "stringId_A": "9606.ENSP00000269305",  # Protein A STRING ID
    "stringId_B": "9606.ENSP00000258149",  # Protein B STRING ID
    "preferredName_A": "TP53",             # Protein A name
    "preferredName_B": "MDM2",             # Protein B name
    "ncbiTaxonId": 9606,                   # Species
    "score": 0.999,                        # Combined confidence (0-1)
    "nscore": 0.0,                         # Neighborhood score
    "fscore": 0.0,                         # Gene fusion score
    "pscore": 0.0,                         # Phylogenetic profile score
    "ascore": 0.947,                       # Coexpression score
    "escore": 0.951,                       # Experimental score
    "dscore": 0.9,                         # Database score
    "tscore": 0.994                        # Text mining score
}
```
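Because each edge carries per-channel scores, a network can be narrowed to interactions backed by a particular kind of evidence, for example experiments rather than text mining. A minimal sketch, assuming edges follow the format above; `filter_by_channel` is a hypothetical helper, and the second sample edge (`GENE_X`) is invented to show a text-mining-only interaction.

```python
def filter_by_channel(edges, channel="escore", minimum=0.4):
    """Keep edges whose score in one evidence channel meets the minimum."""
    return [edge for edge in edges if float(edge.get(channel, 0.0)) >= minimum]


# Illustrative edges; the second one is hypothetical.
sample_edges = [
    {"preferredName_A": "TP53", "preferredName_B": "MDM2",
     "score": 0.999, "escore": 0.951, "tscore": 0.994},
    {"preferredName_A": "TP53", "preferredName_B": "GENE_X",
     "score": 0.720, "escore": 0.0, "tscore": 0.720},
]

experimental = filter_by_channel(sample_edges, channel="escore", minimum=0.4)
for edge in experimental:
    print(edge["preferredName_A"], edge["preferredName_B"], edge["escore"])
```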
### Enrichment Term Format
```python
{
    "category": "Process",                  # Annotation category (here: GO Biological Process)
    "term": "GO:0006915",                   # Term ID
    "description": "apoptotic process",     # Term description
    "number_of_genes": 4,                   # Genes from your set annotated with the term
    "number_of_genes_in_background": 1234,  # Genes annotated with the term in the background
    "p_value": 1.23e-05,                    # Enrichment p-value
    "fdr": 0.0012,                          # FDR-corrected p-value
    "inputGenes": "TP53,MDM2,BAX,CASP3"     # Matching genes
}
```
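Enrichment results often run to hundreds of terms, so it helps to filter by category and FDR before reading them. The sketch below assumes terms follow the format above and that categories are labelled as in STRING's enrichment output (for example `Process` or `KEGG`); `top_terms` is a hypothetical helper and the FDR values are illustrative.

```python
def top_terms(terms, category=None, max_fdr=0.05, limit=10):
    """Return up to `limit` terms under the FDR cutoff, most significant first."""
    selected = [
        term for term in terms
        if term["fdr"] <= max_fdr
        and (category is None or term["category"] == category)
    ]
    return sorted(selected, key=lambda term: term["fdr"])[:limit]


# Illustrative terms; FDR values are made up.
sample_terms = [
    {"category": "Process", "term": "GO:0006915",
     "description": "apoptotic process", "fdr": 0.0012},
    {"category": "KEGG", "term": "hsa04115",
     "description": "p53 signaling pathway", "fdr": 0.0003},
    {"category": "Process", "term": "GO:0008150",
     "description": "biological_process", "fdr": 0.2},
]

for term in top_terms(sample_terms):
    print(f"{term['term']}  {term['description']}  FDR={term['fdr']:.1e}")
```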
## Workflow Details
### 4-Phase Analysis Pipeline
```
┌─────────────────────────────────────────────────────────────┐
│ Phase 1: Identifier Mapping                                 │
│ ─────────────────────────────────────────────────────────── │
│ STRING_map_identifiers()                                    │
│   • Validates protein names exist in database               │
│   • Converts to STRING IDs for consistency                  │
│   • Returns mapping success rate                            │
└─────────────────────────────────────────────────────────────┘
                               ↓
┌─────────────────────────────────────────────────────────────┐
│ Phase 2: Network Retrieval                                  │
│ ─────────────────────────────────────────────────────────── │
│ PRIMARY: STRING_get_network() (no API key needed)           │
│   • Retrieves all pairwise interactions                     │
│   • Returns confidence scores by evidence type              │
│                                                             │
│ FALLBACK: BioGRID_get_interactions() (if enabled)           │
│   • Used if STRING fails or for validation                  │
│   • Requires BIOGRID_API_KEY                                │
└─────────────────────────────────────────────────────────────┘
                               ↓
┌─────────────────────────────────────────────────────────────┐
│ Phase 3: Enrichment Analysis                                │
│ ─────────────────────────────────────────────────────────── │
│ STRING_functional_enrichment()                              │
│   • GO terms (Process, Component, Function)                 │
│   • KEGG pathways                                           │
│   • Reactome pathways                                       │
│   • FDR-corrected p-values                                  │
│                                                             │
│ STRING_ppi_enrichment()                                     │
│   • Tests if proteins interact more than random             │
│   • Returns p-value for functional coherence                │
└─────────────────────────────────────────────────────────────┘
                               ↓
┌─────────────────────────────────────────────────────────────┐
│ Phase 4: Structural Data (Optional)                         │
│ ─────────────────────────────────────────────────────────── │
│ SASBDB_search_entries()                                     │
│   • SAXS/SANS solution structures                           │
│   • Protein flexibility and conformations                   │
│   • Complements crystal/cryo-EM data                        │
└─────────────────────────────────────────────────────────────┘
```
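The primary/fallback logic in Phase 2 can be pictured as a small wrapper around two retrieval callables. This is a sketch of the pattern only, not the skill's actual code: `retrieve_with_fallback` and its arguments are hypothetical, and it assumes "STRING fails" means the primary call raises an exception.

```python
def retrieve_with_fallback(primary, fallback=None):
    """Try the primary source; on failure, try the fallback if one is given.

    Both arguments are zero-argument callables returning a list of edges.
    Returns (edges, source_label, warnings); source_label is None if nothing worked.
    """
    warnings = []
    try:
        return primary(), "STRING", warnings
    except Exception as exc:  # broad on purpose: any failure triggers the fallback
        warnings.append(f"STRING failed: {exc}")

    if fallback is None:
        return [], None, warnings

    try:
        return fallback(), "BioGRID", warnings
    except Exception as exc:
        warnings.append(f"BioGRID failed: {exc}")
        return [], None, warnings


def failing_string_call():
    # Stand-in for a STRING request that times out.
    raise RuntimeError("timeout")


edges, source, warnings = retrieve_with_fallback(
    failing_string_call,
    fallback=lambda: [{"preferredName_A": "TP53", "preferredName_B": "MDM2"}],
)
print(source, len(edges), warnings)
```

Collecting failures in a `warnings` list instead of raising keeps the later phases runnable and matches the `warnings` field on `ProteinNetworkResult`.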
## Installation & Setup
### Prerequisites
```bash
# Install ToolUniverse (if not already installed)
pip install tooluniverse
# Or with extras
pip install "tooluniverse[all]"  # quoted so shells such as zsh do not expand the brackets
```
### Optional: BioGRID API Key
For BioGRID fallback functionality:
1. Register for a free API key: https://webservice.thebiogrid.org/
2. Add to `.env` file:
```bash
BIOGRID_API_KEY=your_key_here
```
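
A script can check for the key before enabling the fallback, so a missing key does not surface later as a failed BioGRID call. This sketch assumes the key is visible in the process environment; whether ToolUniverse loads `.env` on its own is not stated here, so export the variable or load the file yourself if needed. `biogrid_available` is a hypothetical helper.

```python
import os


def biogrid_available(environ=None):
    """Return True if a non-empty BIOGRID_API_KEY is present in the environment."""
    env = os.environ if environ is None else environ
    return bool(env.get("BIOGRID_API_KEY", "").strip())


print("BioGRID fallback enabled:", biogrid_available())
```

The result can be passed straight through as `include_biogrid=biogrid_available()`.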
### Skill Files
```
tooluniverse-protein-interactions/
├── SKILL.md                    # This file
├── python_implementation.py    # Main implementation
├── QUICK_START.md              # Quick reference
├── DOMAIN_ANALYSIS.md          # Design rationale
└── KNOWN_ISSUES.md             # ToolUniverse limitations
```
## Known Limitations
### 1. ToolUniverse Verbose Output
**Issue**: ToolUniverse prints 40+ warning messages during analysis.
**Workaround**: Filter output when running:
```bash
python your_script.py 2>&1 | grep -v "Error loading tools"
```
See `KNOWN_ISSUES.md` for details.
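The same filtering can be attempted inside Python by redirecting output around the call. This is a sketch under an assumption: it only captures text written through Python's `sys.stdout` and `sys.stderr`, so messages emitted by logging handlers that hold their own stream reference, or by native code, will still appear. It also hides the skill's own progress lines. `call_quietly` is a hypothetical helper.

```python
import contextlib
import io


def call_quietly(func, *args, **kwargs):
    """Run func while capturing stdout/stderr; return (result, captured_text)."""
    captured_out, captured_err = io.StringIO(), io.StringIO()
    with contextlib.redirect_stdout(captured_out), contextlib.redirect_stderr(captured_err):
        result = func(*args, **kwargs)
    return result, captured_out.getvalue() + captured_err.getvalue()


def noisy_function():
    # Stand-in for a call that prints loader warnings.
    print("Error loading tools: example warning")
    return 42


value, noise = call_quietly(noisy_function)
print(value)
```

In practice the wrapped call would be something like `call_quietly(analyze_protein_network, tu=tu, proteins=["TP53"])`, with `noise` kept for debugging.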
### 2. BioGRID Requires API Key
The BioGRID fallback requires a free API key. STRING works without one.
### 3. SASBDB May Have API Issues
SASBDB endpoints occasionally return errors. Structural data is optional.
## Performance
### Typical Execution Times
| Operation | Time | Notes |
|-----------|------|-------|
| Identifier mapping | 1-2 sec | For 5 proteins |
| Network retrieval | 2-3 sec | Depends on network size |
| Enrichment analysis | 3-5 sec | For 374 terms |
| Full 4-phase analysis | 6-10 sec | Excluding ToolUniverse overhead |
**Note**: Add 4-8 seconds per tool call for ToolUniverse loading (framework limitation).
### Optimization Tips
1. **Disable structural data** if not needed: `include_structure=False`
2. **Use higher confidence scores** to reduce network size: `confidence_score=0.9`
3. **Filter output** to avoid processing warning messages
4. **Reuse ToolUniverse instance** across multiple analyses
## Troubleshooting
### "Error: 'protein_ids' is a required property"
✅ **Fixed in this skill** - All parameter names verified in Phase 2 testing.
### No interactions found
- Check protein names are correct (case-sensitive)
- Try lower confidence score: `confidence_score=0.4`
- Verify species ID is correct
- Check if proteins actually interact (not all proteins have known interactions)
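
Lowering the threshold step by step can be automated. The sketch below takes any callable that maps a confidence score to a result with a `total_interactions` attribute, so it can be tried without network access; `relax_until_nonempty` and `fake_analyze` are hypothetical names. Each step is a full analysis run, so expect the per-call overhead described under Performance.

```python
from types import SimpleNamespace


def relax_until_nonempty(analyze, thresholds=(0.9, 0.7, 0.4, 0.15)):
    """Call analyze(score) from strictest to loosest until interactions are found."""
    for score in thresholds:
        result = analyze(score)
        if result.total_interactions > 0:
            return score, result
    return None, None


def fake_analyze(score):
    # Stand-in that pretends edges only appear at 0.4 or below.
    return SimpleNamespace(total_interactions=3 if score <= 0.4 else 0)


score, result = relax_until_nonempty(fake_analyze)
print(score, result.total_interactions)
```

With the real function, `analyze` could be `lambda s: analyze_protein_network(tu=tu, proteins=["TP53", "MDM2"], confidence_score=s)`.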
### BioGRID not working
- Ensure `BIOGRID_API_KEY` is set in environment
- Check API key is valid at https://webservice.thebiogrid.org/
- BioGRID is optional - STRING works without it
### Slow performance
- This is expected (see KNOWN_ISSUES.md)
- ToolUniverse framework reloads tools on every call
- Use output filtering to reduce processing time
## Examples
See `python_implementation.py` for:
- `example_tp53_analysis()` - Complete TP53 network analysis
- `analyze_protein_network()` - Main function with all options
- `ProteinNetworkResult` - Result data structure
## References
- **STRING**: https://string-db.org/ (14M+ proteins, 5,000+ organisms)
- **BioGRID**: https://thebiogrid.org/ (2.3M+ interactions, experimentally validated)
- **SASBDB**: https://www.sasbdb.org/ (2,000+ SAXS/SANS entries)
- **ToolUniverse**: https://github.com/mims-harvard/ToolUniverse
## Support
For issues with:
- **This skill**: Check KNOWN_ISSUES.md and troubleshooting section
- **ToolUniverse framework**: See TOOLUNIVERSE_BUG_REPORT.md
- **API errors**: Check database status pages (STRING, BioGRID, SASBDB)
## License
Same as ToolUniverse framework license.
Information
- Repository: mims-harvard/ToolUniverse
- Author: mims-harvard
- Last Sync: 3/12/2026
- Repo Updated: 3/12/2026
- Created: 2/19/2026