Development

tooluniverse-noncoding-rna - Claude MCP Skill

Analyze non-coding RNAs (miRNAs, lncRNAs, circRNAs) using miRBase, LNCipedia, RNAcentral, Rfam, and target prediction databases. Covers ncRNA identification, target prediction, disease associations, expression profiling, and functional annotation. Use when asked about microRNAs, long non-coding RNAs, RNA interference, miRNA targets, lncRNA function, or ncRNA-disease associations.

SEO Guide: Enhance your AI agent with the tooluniverse-noncoding-rna tool. This Model Context Protocol (MCP) server allows Claude Desktop and other LLMs to analyze non-coding rnas (mirnas, lncrnas, circrnas) using mirbase, lncipedia, rnacentral, rfam, and ... Download and configure this skill to unlock new capabilities for your AI workflow.

🌟11 stars • 205 forks
📥0 downloads

Documentation

SKILL.md
# Non-Coding RNA Analysis

Pipeline for identifying, annotating, and interpreting non-coding RNAs and their biological roles. Covers microRNAs (miRNAs), long non-coding RNAs (lncRNAs), and other ncRNA classes.

**Key principles**:
1. **Class determines function** — miRNAs repress mRNA translation; lncRNAs have diverse mechanisms (scaffolds, guides, decoys, enhancers); rRNAs/tRNAs are structural
2. **Targets matter more than the ncRNA itself** — for miRNAs, the regulated mRNA targets determine the phenotype
3. **Expression context is critical** — ncRNAs are highly tissue/cell-type specific
4. **Conservation indicates function** — deeply conserved ncRNAs (miR-let-7, MALAT1) have well-established roles
5. **Evidence grading** — T1: validated targets (reporter assay, CLIP-seq), T2: high-confidence computational prediction, T3: expression correlation, T4: sequence-based prediction only

**Type-based reasoning — look up, don't guess**:
Non-coding RNA function depends on type: miRNA silences target mRNAs (look up targets in miRTarBase/TargetScan), lncRNA has diverse functions (scaffolding, guiding, decoying — check literature for the specific lncRNA), circRNA may sponge miRNAs.

For any ncRNA query: first identify the class from the name/sequence, then select the appropriate evidence source. Do not assume function based on name alone — a gene named "LINC" may have a characterized mechanism, or none at all. Always search PubMed for the specific ncRNA before interpreting. For miRNAs, validated targets (T1) from miRTarBase outweigh any computational prediction — a predicted target with no experimental support is a hypothesis, not a finding. For lncRNAs, mechanism is almost always determined by experimental studies; use `PubMed_search_articles` with the lncRNA name + "mechanism" or "function" to find relevant evidence. For circRNAs, miRNA sponging is the most common proposed mechanism but is frequently over-claimed — look for CLIP-seq or reporter assay evidence before asserting it.

---

## When to Use

- "What are the targets of miR-21?"
- "Find lncRNAs associated with breast cancer"
- "Is this lncRNA conserved across species?"
- "What miRNAs regulate TP53?"
- "Annotate these non-coding RNA IDs"
- "Which miRNAs are biomarkers for [disease]?"

**Not this skill**: For mRNA expression analysis, use `tooluniverse-rnaseq-deseq2`. For CRISPR screens, use `tooluniverse-crispr-screen-analysis`.

---

## Core Tools

| Tool | Use For |
|------|---------|
| `miRBase_search_mirna` | Search miRNAs by name, accession, or sequence |
| `miRBase_get_mirna` | Detailed miRNA info (sequence, genomic location, family) |
| `miRBase_get_mature_mirna` | Mature miRNA sequences and annotations |
| `PubMed_search_articles` | Search for validated miRNA targets in literature (e.g., "miR-21 target validation") |
| `LNCipedia_search_lncrna` | Search lncRNAs by name, gene symbol, or transcript ID |
| `LNCipedia_get_lncrna` | Detailed lncRNA transcript info (sequence, structure, conservation) |
| `LNCipedia_get_lncrna_xrefs` | lncRNA gene info with all transcript variants |
| `LNCipedia_search_ncrna_by_type` | List all transcripts for a lncRNA gene |
| `LNCipedia_get_lncrna_publications` | lncRNA sequence (FASTA format) |
| `RNAcentral_search` | Search all ncRNA types across databases |
| `RNAcentral_get_rna` | Detailed ncRNA annotations from 40+ databases |
| `Rfam_get_family` | RNA family details (structure, alignment, species distribution) |
| `Rfam_search` | Search RNA families by keyword |
| `DisGeNET_search_gene` | ncRNA-disease associations |
| `PubMed_search_articles` | ncRNA literature |
| `GTEx_get_median_gene_expression` | Tissue expression of ncRNA genes |

---

## Workflow

```
Phase 0: ncRNA Identity & Classification
  Name/ID → miRBase/LNCipedia/RNAcentral → class, sequence, genomic location
    |
Phase 1: Target & Interaction Analysis
  miRNA → target mRNAs; lncRNA → interacting proteins/RNAs/chromatin
    |
Phase 2: Expression & Tissue Specificity
  GTEx/GEO → where is it expressed? Tissue-specific or ubiquitous?
    |
Phase 3: Disease Associations
  DisGeNET/PubMed/CTD → ncRNA-disease links with evidence
    |
Phase 4: Functional Interpretation
  Pathway enrichment of targets → biological role → clinical significance
```

### Phase 0: ncRNA Identity & Classification

ncRNA classes by size and database:
- **miRNA** (~22 nt, miRBase): Post-transcriptional silencing via 3'UTR binding
- **lncRNA** (>200 nt, LNCipedia): Diverse — chromatin remodeling, transcription regulation, miRNA sponges
- **rRNA** (120-5000 nt, RNAcentral/Rfam): Ribosome components
- **tRNA** (~76 nt, RNAcentral): Amino acid delivery
- **snoRNA** (60-300 nt, Rfam): rRNA modification (methylation, pseudouridylation)
- **snRNA** (~150 nt, Rfam): Spliceosome components
- **piRNA** (26-31 nt, RNAcentral): Transposon silencing in germline
- **circRNA** (variable, RNAcentral): miRNA sponges, protein scaffolds (experimental evidence required)

**Identification workflow**:
- Name starts with `miR-` or `hsa-mir-` → search miRBase
- Name starts with `LINC`, `MALAT`, `HOTAIR`, `XIST`, or ends in `-AS1` → search LNCipedia
- Any ncRNA type → search RNAcentral (aggregates all databases)
- RNA family question → search Rfam

### Phase 1: Target & Interaction Analysis

**For miRNAs** — the targets determine the biology:

**NOTE**: There is no dedicated miRNA target lookup tool in ToolUniverse. To find miRNA targets:

1. **Literature search** (most reliable): `PubMed_search_articles(query="miR-21 target validation luciferase")`
2. **Cross-references**: `miRBase_get_mirna_xrefs(accession="MIMAT0000076")` — may link to external target databases
3. **Known targets for well-studied miRNAs**: Use the reference table below, then validate via STRING/Reactome
4. **For novel miRNAs**: Search PubMed for "[miRNA] target" and extract validated targets from papers

Well-studied miRNA targets (for common oncomiRs/tumor suppressors):
- **miR-21**: PTEN, PDCD4, TPM1, RECK, SPRY1, SPRY2, BTG2
- **miR-155**: SOCS1, SHIP1, AID, TP53INP1
- **miR-122**: SLC7A1, ADAM17 (also HCV IRES cofactor)
- **let-7**: RAS, HMGA2, MYC, LIN28

**Target interpretation framework**:
- **Validated** (T1): Luciferase reporter, CLIP-seq, degradome-seq — base conclusions on these
- **High-confidence prediction** (T2): TargetScan conserved sites, DIANA-microT score > 0.9 — support validated findings
- **Prediction only** (T3-T4): miRanda, PicTar, RNA22 — hypothesis generation only; do not report as findings

**For lncRNAs** — the mechanism varies:

| lncRNA Mechanism | Example | How to Investigate |
|---|---|---|
| **Chromatin modifier** | HOTAIR, XIST | Check interacting proteins (PRC2, LSD1) via PubMed |
| **Transcription regulator** | NEAT1, MEG3 | Check nearby genes (cis-regulation) via genomic location |
| **miRNA sponge** | MALAT1, circRNAs | Search for miRNA binding sites |
| **Scaffold** | NKILA, BCAR4 | Check protein interactions |
| **Enhancer RNA** | eRNAs | Check ENCODE enhancer annotations |

### Phase 2: Expression & Tissue Specificity

```python
GTEx_get_median_gene_expression(gene_symbol="MIR21")  # miRNA host gene expression
# Note: GTEx measures RNA-seq; miRNA expression may need miRNA-seq data from GEO
```

**Interpretation**: Tissue-restricted ncRNAs are often functionally important in that tissue. Ubiquitous ncRNAs (like MALAT1) tend to have housekeeping roles.

### Phase 3: Disease Associations

```python
DisGeNET_search_gene(query="MIR21")  # miR-21 disease associations
PubMed_search_articles(query="miR-21 biomarker cancer")
```

**Key ncRNA-disease associations** (well-established T1 examples — always verify via DisGeNET or PubMed for the specific ncRNA):
- miR-21: OncomiR in multiple cancers; targets PTEN, PDCD4, TPM1 (hundreds of T1 studies)
- miR-155: B-cell lymphoma, inflammation — immune regulation
- miR-122: Hepatitis C liver disease — HCV replication cofactor; therapeutic target (miravirsen)
- let-7 family: Lung cancer, stem cell differentiation — tumor suppressor targeting RAS, HMGA2
- HOTAIR: Breast/colorectal cancer — recruits PRC2, promotes metastasis
- MALAT1: Lung cancer/metastasis — splicing regulation
- XIST: X-inactivation, cancer — chromatin silencing
- H19: Beckwith-Wiedemann syndrome, cancer — imprinted lncRNA, miR-675 host
- ANRIL: CVD, diabetes, cancer — CDKN2A/B locus regulation (GWAS-validated)

### Phase 4: Functional Interpretation

After identifying miRNA targets (Phase 1), run pathway enrichment:

```python
# Collect validated target gene symbols
targets = ["PTEN", "PDCD4", "TPM1", "RECK", "SPRY1"]  # miR-21 targets

# Pathway enrichment
ReactomeAnalysis_pathway_enrichment(identifiers="PTEN PDCD4 TPM1 RECK SPRY1")
STRING_get_network(identifiers="PTEN\rPDCD4\rTPM1\rRECK\rSPRY1", species=9606)
```

**Interpretation**: If miR-21 targets are enriched in apoptosis and PI3K-AKT signaling → miR-21 is an oncomiR that promotes survival by simultaneously suppressing multiple tumor suppressors.

**Report structure**:
1. **ncRNA Identity** — class, sequence, genomic location, conservation
2. **Targets/Interactions** — validated targets with evidence grades
3. **Expression Profile** — tissue specificity, disease-specific expression changes
4. **Disease Associations** — evidence-graded disease links
5. **Pathway Analysis** — enriched pathways among targets
6. **Mechanistic Model** — how this ncRNA contributes to disease biology
7. **Clinical Potential** — biomarker utility, therapeutic target potential (antagomirs, ASOs)

---

## Limitations

### Computational Procedure: TargetScan Predicted Targets (Download-and-Process)

TargetScan provides the best computational miRNA target predictions but has no REST API. Download and process locally:

```python
# Step 1: Download TargetScan predicted targets (one-time, ~10MB zipped)
# URL: https://www.targetscan.org/vert_80/vert_80_data_download/Summary_Counts.default_predictions.txt.zip
import pandas as pd
import zipfile, io, requests

url = "https://www.targetscan.org/vert_80/vert_80_data_download/Summary_Counts.default_predictions.txt.zip"
resp = requests.get(url, timeout=60)
with zipfile.ZipFile(io.BytesIO(resp.content)) as z:
    fname = z.namelist()[0]
    df = pd.read_csv(z.open(fname), sep='\t')

# Step 2: Query for a specific miRNA family
mirna = "miR-21-5p"  # or "miR-21/590-5p" (TargetScan uses family names)
targets = df[df['miRNA Family'].str.contains("miR-21", case=False, na=False)]

# Step 3: Rank by cumulative weighted context++ score
targets_ranked = targets.sort_values('Cumulative weighted context++ score', ascending=True)
print(f"Top 20 predicted targets of {mirna}:")
for _, row in targets_ranked.head(20).iterrows():
    print(f"  {row['Target Gene']:10s} score={row['Cumulative weighted context++ score']:.3f}  "
          f"sites={row['Total num conserved sites']}")
```

**Interpretation**: More negative context++ score = stronger predicted repression. Conserved sites (>1) are higher confidence.

### Computational Procedure: miRTarBase Validated Targets (Download-and-Process)

miRTarBase has Cloudflare protection blocking programmatic access. Use the R/Bioconductor data package or bulk download:

```python
# Option 1: Download from miRTarBase bulk export (requires browser download first)
# Go to: https://mirtarbase.cuhk.edu.cn/~miRTarBase/miRTarBase_2025/
# Download: hsa_MTI.xlsx (human miRNA-target interactions)

# Option 2: Use the GitHub data dump
# https://github.com/jorainer/mirtarbase — R package with cached data

# Once you have the file:
import pandas as pd
mti = pd.read_excel("hsa_MTI.xlsx")  # or read_csv if TSV

# Filter for your miRNA
mir21_targets = mti[mti['miRNA'].str.contains('hsa-miR-21', case=False, na=False)]
print(f"miR-21 validated targets: {len(mir21_targets)}")

# Filter by evidence strength
strong = mir21_targets[mir21_targets['Support Type'].str.contains(
    'Luciferase|Reporter|Western|CLIP', case=False, na=False
)]
print(f"  Strong evidence (reporter/CLIP): {len(strong)}")
for _, row in strong.head(10).iterrows():
    print(f"    {row['Target Gene']:10s} — {row['Support Type']}")
```

**When download is not available**: Use the built-in reference table in Phase 1 for well-studied miRNAs, or search PubMed for validated targets.

---

## Limitations

- **miRNA target prediction is noisy** — even the best algorithms have >50% false positive rates; always prioritize experimentally validated targets
- **lncRNA function is poorly characterized** — only ~5% of annotated lncRNAs have known functions
- **Expression measurement varies** — miRNA-seq, RNA-seq, and microarray capture different ncRNA classes; check the assay type
- **Species differences** — miRNAs are often conserved but lncRNAs are frequently species-specific; cross-species lncRNA comparisons are unreliable

Signals

Avg rating0.0
Reviews0
Favorites0

Information

Repository
mims-harvard/ToolUniverse
Author
mims-harvard
Last Sync
5/10/2026
Repo Updated
5/10/2026
Created
3/27/2026

Reviews (0)

No reviews yet. Be the first to review this skill!