# GWAS Trait-to-Gene Discovery

**Discover genes associated with diseases and traits using genome-wide association studies (GWAS)**

## Overview

This skill enables systematic discovery of genes linked to diseases/traits by analyzing GWAS data from two major resources:
- **GWAS Catalog** (EBI/NHGRI): Curated catalog of published GWAS with >500,000 associations
- **Open Targets Genetics**: Fine-mapped GWAS signals with locus-to-gene (L2G) predictions

## Use Cases

**Clinical Research**
- "What genes are associated with type 2 diabetes?"
- "Find genetic risk factors for coronary artery disease"
- "Which genes contribute to Alzheimer's disease susceptibility?"

**Drug Target Discovery**
- Identify genes with strong genetic evidence for disease causation
- Prioritize targets based on L2G scores and replication across studies
- Find genes with genome-wide significant associations (p < 5e-8)

**Functional Genomics**
- Map disease-associated variants to candidate genes
- Analyze genetic architecture of complex traits
- Understand polygenic disease mechanisms

## Workflow

```
1. Trait Search → Search GWAS Catalog by disease/trait name
       ↓
2. SNP Aggregation → Collect genome-wide significant SNPs (p < 5e-8)
       ↓
3. Gene Mapping → Extract mapped genes from associations
       ↓
4. Evidence Ranking → Score by p-value, replication, fine-mapping
       ↓
5. Annotation (Optional) → Add L2G predictions from Open Targets
```
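
Steps 2 to 4 of the workflow can be sketched in plain Python over association records that have already been fetched. This is a minimal illustration, not the skill's actual implementation: the record keys (`gene`, `p_value`, `study`, `rsid`) and the helper name `rank_genes` are hypothetical, and real tool responses may be shaped differently.

```python
from collections import defaultdict


def rank_genes(associations, p_value_threshold=5e-8):
    """Aggregate significant associations per gene and rank by best p-value.

    `associations` is an iterable of dicts with the hypothetical keys
    'gene', 'p_value', 'study', and 'rsid'.
    """
    per_gene = defaultdict(
        lambda: {"min_p_value": 1.0, "studies": set(), "snps": set()}
    )
    for assoc in associations:
        # Step 2: keep only genome-wide significant associations.
        if assoc["p_value"] >= p_value_threshold:
            continue
        # Step 3: group the association under its mapped gene.
        entry = per_gene[assoc["gene"]]
        entry["min_p_value"] = min(entry["min_p_value"], assoc["p_value"])
        entry["studies"].add(assoc["study"])
        entry["snps"].add(assoc["rsid"])

    # Step 4: rank by most significant p-value, then by number of studies.
    ranked = sorted(
        per_gene.items(),
        key=lambda kv: (kv[1]["min_p_value"], -len(kv[1]["studies"])),
    )
    return [
        {
            "symbol": gene,
            "min_p_value": entry["min_p_value"],
            "evidence_count": len(entry["studies"]),
            "snps": sorted(entry["snps"]),
            "studies": sorted(entry["studies"]),
        }
        for gene, entry in ranked
    ]
```

Sets are used for studies and SNPs so that a gene reported several times by the same study is counted once toward its evidence count.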

## Key Concepts

**Genome-wide Significance**
- Standard threshold: p < 5×10⁻⁸
- Accounts for multiple testing burden across ~1M common variants
- Higher confidence: p < 5×10⁻¹⁰ or replicated across studies
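
The standard threshold follows from a Bonferroni correction: a family-wise error rate divided by the number of tests. The sketch below assumes a family-wise error rate of 0.05, the conventional choice, which the text above does not state explicitly; the function name is illustrative.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


# 0.05 spread across roughly one million common variants.
threshold = bonferroni_threshold(0.05, 1_000_000)
print(f"{threshold:.0e}")  # 5e-08
```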

**Gene Mapping Methods**
- **Positional**: Nearest gene to lead SNP
- **Fine-mapping**: Statistical refinement to credible variants
- **Locus-to-Gene (L2G)**: Integrative score combining multiple evidence types

**Evidence Confidence Levels**
- **High**: L2G score > 0.5 OR multiple studies with p < 5e-10
- **Medium**: 2+ studies with p < 5e-8
- **Low**: Single study or marginal significance
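
These rules can be expressed as a small classifier. This is a sketch under two assumptions that the rules above leave open: "multiple studies" is read as two or more, and each entry in `study_p_values` is the best p-value from one independent study. The function name is hypothetical.

```python
from typing import Optional, Sequence


def classify_confidence(
    l2g_score: Optional[float], study_p_values: Sequence[float]
) -> str:
    """Apply the High/Medium/Low rules to a single gene."""
    # Count studies passing the stricter and the standard thresholds.
    strong = sum(1 for p in study_p_values if p < 5e-10)
    significant = sum(1 for p in study_p_values if p < 5e-8)

    # High: L2G score > 0.5 OR multiple studies with p < 5e-10.
    if (l2g_score is not None and l2g_score > 0.5) or strong >= 2:
        return "High"
    # Medium: 2+ studies with p < 5e-8.
    if significant >= 2:
        return "Medium"
    # Low: single study or marginal significance.
    return "Low"
```

Because the first rule is an OR, a gene with an L2G score above 0.5 is labeled "High" even when only one study supports it.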

## Required ToolUniverse Tools

### GWAS Catalog (11 tools)
- `gwas_get_associations_for_trait` - Get all associations for a trait (sorted by p-value)
- `gwas_search_snps` - Search SNPs by gene mapping
- `gwas_get_snp_by_id` - Get SNP details (MAF, consequence, location)
- `gwas_get_study_by_id` - Get study metadata
- `gwas_search_associations` - Search associations with filters
- `gwas_search_studies` - Search studies by trait/cohort
- `gwas_get_associations_for_snp` - Get all associations for a SNP
- `gwas_get_variants_for_trait` - Get variants for a trait
- `gwas_get_studies_for_trait` - Get studies for a trait
- `gwas_get_snps_for_gene` - Get SNPs mapped to a gene
- `gwas_get_associations_for_study` - Get associations from a study

### Open Targets Genetics (6 tools)
- `OpenTargets_search_gwas_studies_by_disease` - Search studies by disease ontology
- `OpenTargets_get_study_credible_sets` - Get fine-mapped loci for a study
- `OpenTargets_get_variant_credible_sets` - Get credible sets for a variant
- `OpenTargets_get_variant_info` - Get variant annotation (frequencies, consequences)
- `OpenTargets_get_gwas_study` - Get study metadata
- `OpenTargets_get_credible_set_detail` - Get detailed credible set information

## Parameters

**Required**
- `trait` - Disease/trait name (e.g., "type 2 diabetes", "coronary artery disease")

**Optional**
- `p_value_threshold` - Significance threshold (default: 5e-8)
- `min_evidence_count` - Minimum number of studies (default: 1)
- `max_results` - Maximum genes to return (default: 100)
- `use_fine_mapping` - Include L2G predictions (default: true)
- `disease_ontology_id` - Disease ontology ID for Open Targets (e.g., "MONDO_0005148")
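
One way to carry these parameters and their defaults through a pipeline is a frozen dataclass. The class name `DiscoveryParams` and the validation rules are assumptions made for illustration, as is the `None` default for `disease_ontology_id`, for which no default is documented.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DiscoveryParams:
    """Parameters for a trait-to-gene query, with the documented defaults."""

    trait: str
    p_value_threshold: float = 5e-8
    min_evidence_count: int = 1
    max_results: int = 100
    use_fine_mapping: bool = True
    disease_ontology_id: Optional[str] = None  # assumed default

    def __post_init__(self) -> None:
        # Illustrative sanity checks; not taken from the skill itself.
        if not self.trait.strip():
            raise ValueError("trait is required")
        if not 0 < self.p_value_threshold <= 1:
            raise ValueError("p_value_threshold must be in (0, 1]")
        if self.min_evidence_count < 1:
            raise ValueError("min_evidence_count must be at least 1")
        if self.max_results < 1:
            raise ValueError("max_results must be at least 1")


params = DiscoveryParams(trait="type 2 diabetes", disease_ontology_id="MONDO_0005148")
```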

## Output Schema

```python
{
  "genes": [
    {
      "symbol": str,              # Gene symbol (e.g., "TCF7L2")
      "min_p_value": float,       # Most significant p-value
      "evidence_count": int,      # Number of independent studies
      "snps": [str],              # Associated SNP rs IDs
      "studies": [str],           # GWAS study accessions
      "l2g_score": float | None,  # Locus-to-gene score (0-1)
      "credible_sets": int,       # Number of credible sets
      "confidence_level": str     # "High", "Medium", or "Low"
    }
  ],
  "summary": {
    "trait": str,
    "total_associations": int,
    "significant_genes": int,
    "data_sources": ["GWAS Catalog", "Open Targets"]
  }
}
```
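
For typed code that consumes this output, the gene record can be mirrored as a `TypedDict`. The type name `GeneRecord` and the helper `high_confidence_symbols` are hypothetical; the sketch only assumes that a result follows the schema above.

```python
from typing import List, Optional, TypedDict


class GeneRecord(TypedDict):
    symbol: str
    min_p_value: float
    evidence_count: int
    snps: List[str]
    studies: List[str]
    l2g_score: Optional[float]
    credible_sets: int
    confidence_level: str


def high_confidence_symbols(result: dict) -> List[str]:
    """Return symbols of genes labeled "High", most significant first."""
    genes: List[GeneRecord] = result["genes"]
    high = [g for g in genes if g["confidence_level"] == "High"]
    return [g["symbol"] for g in sorted(high, key=lambda g: g["min_p_value"])]
```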

## Example Results

**Type 2 Diabetes**
```
TCF7L2:  p=1.2e-98, 15 studies, L2G=0.82 → High confidence
KCNJ11:  p=3.4e-67, 12 studies, L2G=0.76 → High confidence
PPARG:   p=2.1e-45, 8 studies,  L2G=0.71 → High confidence
FTO:     p=5.6e-42, 10 studies, L2G=0.68 → High confidence
IRS1:    p=8.9e-38, 6 studies,  L2G=0.54 → High confidence
```

**Alzheimer's Disease**
```
APOE:    p=1.0e-450, 25 studies, L2G=0.95 → High confidence
BIN1:    p=2.3e-89,  18 studies, L2G=0.88 → High confidence
CLU:     p=4.5e-67,  16 studies, L2G=0.82 → High confidence
ABCA7:   p=6.7e-54,  14 studies, L2G=0.79 → High confidence
CR1:     p=8.9e-52,  13 studies, L2G=0.75 → High confidence
```

## Best Practices

**1. Use Disease Ontology IDs for Precision**
```
# Instead of:
discover_gwas_genes("diabetes")  # Ambiguous

# Use:
discover_gwas_genes(
    "type 2 diabetes",
    disease_ontology_id="MONDO_0005148"  # Specific
)
```

**2. Filter by Evidence Strength**
```
# For drug targets, require strong evidence:
discover_gwas_genes(
    "coronary artery disease",
    p_value_threshold=5e-10,    # Stricter than GWAS threshold
    min_evidence_count=3,       # Multiple independent studies
    use_fine_mapping=True       # Include L2G predictions
)
```

**3. Interpret Results Carefully**
- **Association ≠ Causation**: GWAS identifies correlated variants, not necessarily causal genes
- **Linkage Disequilibrium**: Lead SNP may tag the true causal variant in a nearby gene
- **Fine-mapping**: L2G scores provide better causal gene evidence than positional mapping
- **Functional Evidence**: Validate with orthogonal data (eQTLs, knockout models, etc.)

## Limitations

1. **Gene Mapping Uncertainty**
   - Positional mapping assigns each SNP to the nearest gene, which may not be the causal one
   - Fine-mapping is available for only a subset of studies
   - Intergenic variants are difficult to map

2. **Population Bias**
   - Most GWAS have been conducted in European-ancestry populations
   - Effect sizes may differ across ancestries
   - Rare variants often under-represented

3. **Sample Size Dependence**
   - Larger studies detect more associations
   - Older small studies may have false negatives
   - p-values alone don't indicate effect size

4. **Validation Bug**
   - Some ToolUniverse tools have `oneOf` schema validation issues
   - Pass `validate=False` if a call fails validation for this reason
   - The Python implementation handles this automatically
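
The validation workaround in item 4 could be wrapped as a retry. This sketch deliberately avoids any real ToolUniverse call: `run_tool` is a hypothetical callable standing in for however a tool is invoked, and it is assumed to accept a `validate` keyword and to raise `ValueError` when schema validation fails. Both assumptions should be checked against the actual library.

```python
from typing import Any, Callable, Dict


def call_with_validation_fallback(
    run_tool: Callable[..., Any], name: str, arguments: Dict[str, Any]
) -> Any:
    """Call a tool, retrying once with validate=False if validation fails."""
    try:
        return run_tool(name, arguments, validate=True)
    except ValueError:
        # Assumed failure mode: the oneOf check rejected otherwise valid input.
        return run_tool(name, arguments, validate=False)
```

Retrying only after a failure keeps validation active for tools whose schemas work, so genuinely malformed arguments are still caught on those tools.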

## Related Skills

- **Variant-to-Disease Association**: Look up specific SNPs (e.g., rs7903146 → T2D)
- **Gene-to-Disease Links**: Find diseases associated with known genes
- **Drug Target Prioritization**: Rank targets by genetic evidence
- **Population Genetics Analysis**: Compare allele frequencies across populations

## Data Sources

**GWAS Catalog**
- Curator: EBI and NHGRI
- URL: https://www.ebi.ac.uk/gwas/
- Coverage: 100,000+ publications, 500,000+ associations
- Update Frequency: Weekly

**Open Targets Genetics**
- Curator: Open Targets consortium
- URL: https://genetics.opentargets.org/
- Coverage: Fine-mapped GWAS, L2G predictions, QTL colocalization
- Update Frequency: Quarterly

## Citation

If you use this skill in research, please cite:

```
Buniello A, et al. (2019) The NHGRI-EBI GWAS Catalog of published genome-wide
association studies. Nucleic Acids Research, 47(D1):D1005-D1012.

Mountjoy E, et al. (2021) An open approach to systematically prioritize causal
variants and genes at all published human GWAS trait-associated loci.
Nature Genetics, 53:1527-1533.
```

## Support

For issues with:
- **Skill functionality**: Open an issue at tooluniverse/skills
- **GWAS data**: Contact GWAS Catalog or Open Targets support
- **Tool errors**: Check ToolUniverse tool status


Information

Repository
mims-harvard/ToolUniverse
Author
mims-harvard
Last Sync
2/20/2026
Repo Updated
2/20/2026
Created
2/19/2026
