mem0-test-integration - Claude MCP Skill
Verify a Mem0 integration produced by /mem0-integrate. Runs in the same workspace on the same branch (loose coupling) — installs dependencies, runs the repo's native test suite, then exercises a real end-to-end smoke flow against the user's API key. Produces a scorecard. TRIGGER when: user has just run /mem0-integrate and says "verify", "test the integration", "run /mem0-test-integration", or when a .mem0-integration/ directory exists and tests have not been run yet on the current branch. DO NOT TRIGGER when: the user wants to run general project tests (defer to the repo's native test command), or when no prior /mem0-integrate run exists in the current branch (ask them to run /mem0-integrate first). This skill ONLY catches compile and runtime bugs by design. Logical integration errors — wrong data stored, wrong time retrieved, wrong user scoping — are on the human reviewer.
# mem0-test-integration
Verifies what `/mem0-integrate` produced. Runs in the same workspace,
on the same feature branch. Loose coupling — fast, catches compile and
runtime bugs, does not catch logical errors.
## Canonical sources (use these, not ambient knowledge)
All static checks and smoke-test shapes validate against these URLs.
`WebFetch` each before running step 3.
- Scope-tagged docs index: https://docs.mem0.ai/llms.txt
- OpenAPI (Platform REST): https://docs.mem0.ai/openapi.json
- Published SDK skill (canonical call patterns): https://raw.githubusercontent.com/mem0ai/mem0/main/skills/mem0/SKILL.md
- Vercel AI SDK skill (if the target repo uses `@ai-sdk/*`): https://raw.githubusercontent.com/mem0ai/mem0/main/skills/mem0-vercel-ai-sdk/SKILL.md
- SDK source (cross-check version against frontmatter `mem0_tested_versions`):
  - Repo root: https://github.com/mem0ai/mem0
  - Python: https://github.com/mem0ai/mem0/tree/main/mem0
  - TypeScript: https://github.com/mem0ai/mem0/tree/main/mem0-ts
Read the `Delegated skill:` field in `.mem0-integration/plan.md` — if it
names a skill URL, fetch that skill and use its example blocks as the
reference for both static checks (step 3) and the smoke test (step 5).
## Non-invasiveness contract
Every check in this skill assumes the integration is **additive and
feature-flagged** (see `/mem0-integrate` "Integration principles").
Specifically:
- `product.json` must contain a `feature_flag` field.
- Steps 4–6 run in two passes:
  - **Pass A (flag unset).** All pre-existing tests must pass; smoke and E2E
    are skipped. The repo must behave like `main`. Any failure here is a
    **hard fail**: do not let the self-heal loop attempt a patch.
  - **Pass B (flag set).** New tests must pass; smoke and E2E run.
- If Pass A fails, the scorecard marks `non_invasive: false` and sets
  `overall: fail` with a distinct reason code that the integrator's heal
  loop refuses to touch.
## Preconditions
Refuse to start unless ALL of the following are true:
- `.mem0-integration/` directory exists in the repo root.
- `.mem0-integration/product.json`, `goal.md`, and `plan.md` are readable
and internally consistent (JSON parses, docs non-empty).
- Current branch name begins with `mem0-integrate/` (set by the companion
skill). Prevents accidental runs on unrelated branches.
- Working tree is clean. The skill never modifies source files; any dirty
state means the integration is mid-edit and not ready to verify.
- The same API key the integration used is available in the environment
(`MEM0_API_KEY` for Platform, `OPENAI_API_KEY` for OSS — read which from
`product.json`). Interactive mode asks if missing; CI mode exits 2.
Exit with a written rationale on any precondition failure. Never attempt
to "fix up" state.
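The file, branch, and clean-tree checks above can be sketched as a single function that collects refusal reasons. This is a minimal illustration, not the skill's actual implementation: the function name `precondition_failures` is hypothetical, the branch name and tree state are passed in by the caller (for example from `git rev-parse --abbrev-ref HEAD` and `git status --porcelain`), and the API-key check is left out.

```python
import json
from pathlib import Path

REQUIRED_FILES = ("product.json", "goal.md", "plan.md")
BRANCH_PREFIX = "mem0-integrate/"


def precondition_failures(repo_root: Path, branch: str, tree_is_clean: bool) -> list[str]:
    """Return human-readable reasons to refuse; an empty list means proceed."""
    meta = repo_root / ".mem0-integration"
    if not meta.is_dir():
        return [".mem0-integration/ directory is missing"]

    failures: list[str] = []
    for name in REQUIRED_FILES:
        path = meta / name
        # "Readable and non-empty" is approximated by a whitespace-stripped read.
        if not path.is_file() or not path.read_text(encoding="utf-8").strip():
            failures.append(f"{name} is missing or empty")

    product = meta / "product.json"
    if product.is_file():
        try:
            json.loads(product.read_text(encoding="utf-8"))
        except json.JSONDecodeError as exc:
            failures.append(f"product.json does not parse: {exc.msg}")

    if not branch.startswith(BRANCH_PREFIX):
        failures.append(f"branch {branch!r} does not start with {BRANCH_PREFIX!r}")
    if not tree_is_clean:
        failures.append("working tree is dirty")
    return failures
```

Returning every failure at once, rather than stopping at the first, fits the requirement to exit with a written rationale: the user sees the full list in one run.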
## Pipeline
### 1. Read the contract
Load:
- `product.json` → which language, which product (Platform vs OSS), which
mem0 version, `write_site`, `read_site`.
- `plan.md` → the mechanical contract (write pattern, read pattern,
preserved behavior).
- `goal.md` → the intent (displayed in the scorecard only; not tested).
### 2. Install dependencies
Route by language from `product.json`:
| Language | Command |
|---|---|
| Python | `pip install -e .` if editable, else `pip install -r requirements.txt`. Then `pip install mem0ai` if not already present at the pinned version. |
| TypeScript / JavaScript | `npm install` (or `pnpm install` / `yarn install` if detected by lockfile). |
If install fails → exit code 2 with stderr tail. Never move to testing
if dependencies don't resolve.
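As a rough sketch of the routing in the table above, the install command can be chosen from files in the repo root. Two things here are assumptions rather than part of the skill: the order in which lockfiles are checked, and treating the presence of `pyproject.toml` or `setup.py` as the meaning of "editable". The function names are hypothetical.

```python
from pathlib import Path

# Lockfile name -> install command, checked in this (assumed) order.
LOCKFILES = (
    ("pnpm-lock.yaml", ["pnpm", "install"]),
    ("yarn.lock", ["yarn", "install"]),
    ("package-lock.json", ["npm", "install"]),
)


def js_install_command(repo_root: Path) -> list[str]:
    """Pick the package manager whose lockfile is present, defaulting to npm."""
    for lockfile, command in LOCKFILES:
        if (repo_root / lockfile).is_file():
            return command
    return ["npm", "install"]


def python_install_command(repo_root: Path) -> list[str]:
    """Prefer an editable install when the repo looks like an installable package."""
    # Assumption: pyproject.toml or setup.py signals an editable-installable project.
    if (repo_root / "pyproject.toml").is_file() or (repo_root / "setup.py").is_file():
        return ["pip", "install", "-e", "."]
    return ["pip", "install", "-r", "requirements.txt"]
```

The returned list can be handed to `subprocess.run(..., check=False)`, with a non-zero return code mapped to exit code 2 as described above.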
### 3. Static sanity checks (fast, local, no API calls)
- **Import check**: does the write-site file import the expected Mem0
  surface? The authoritative list comes from `## Identify the User's Setup`
  in `https://docs.mem0.ai/llms.txt`:
  - Platform Python → `from mem0 import MemoryClient`
  - Platform TS → `import MemoryClient from "mem0ai"`
  - OSS Python → `from mem0 import Memory`
  - OSS TS → `import { Memory } from "mem0ai/oss"`

  If `plan.md` names a delegated skill (e.g., Vercel AI), use *that*
  skill's import signature instead of the list above. Mismatch → fail
  with line number.
- **Version check**: installed `mem0ai` version falls in the range from
this skill's `mem0_tested_versions`. Out of range → warn but continue.
- **Type check** (TS tracks only): run `tsc --noEmit` or `tsup --dts`.
Non-zero → fail.
- **Lint** (if the repo has a linter configured): run the repo's own
lint command. Lint failures from this skill's changes → fail; pre-existing
lint failures → surface as a warning.
- **Eager-init check**: grep the `write_site` and `read_site` files (paths
from `product.json`) for `MemoryClient(` or `Memory(` at module scope —
i.e., not inside a function, method, or class body. `MemoryClient()`
validates the API key in `__init__` (network call) and OSS `Memory()`
can eagerly initialize embedding/LLM providers — module-level
instantiation hits the wire on import and breaks Pass A's test
collection whenever the key is unset. Hit → fail with `file:line` and
the lazy-init guidance from `/mem0-integrate` step 8 constraint #7.
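A plain grep cannot tell whether a match sits inside a function body, so for Python targets the eager-init check could be approximated with the standard `ast` module instead. The sketch below is one possible approach under that assumption, not the skill's own method; `module_scope_inits` is a hypothetical name. It follows the definition above by skipping function, method, and class bodies.

```python
import ast

TARGETS = {"MemoryClient", "Memory"}
# Bodies that the check above treats as "not module scope".
SCOPED = (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef, ast.Lambda)


def module_scope_inits(source: str) -> list[int]:
    """Return line numbers of MemoryClient(...) / Memory(...) calls at module scope."""
    hits: list[int] = []

    def visit(node: ast.AST) -> None:
        for child in ast.iter_child_nodes(node):
            if isinstance(child, SCOPED):
                continue  # do not descend into function, method, or class bodies
            if isinstance(child, ast.Call):
                func = child.func
                # Handles both `Memory()` and `mem0.Memory()` call forms.
                name = func.id if isinstance(func, ast.Name) else getattr(func, "attr", None)
                if name in TARGETS:
                    hits.append(child.lineno)
            visit(child)

    visit(ast.parse(source))
    return sorted(hits)
```

Each returned line number can be paired with the file path to produce the `file:line` failure message. Calls nested in module-level `if` or `try` blocks are still reported, since those also run on import.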
### 4. Run the repo's native test suite (two passes)
| Language | Test command (in priority order) |
|---|---|
| Python | `pytest` with the test files from step 5 of the companion skill, else `python -m unittest discover`. |
| TypeScript / JavaScript | `npm test` if defined in package.json; else auto-detect `vitest` or `jest`. |
**Pass A — `feature_flag` unset.** Run the *entire* pre-existing suite
(excluding the new `test_mem0_*` files). **Must be 100% green.** Any
failure here marks `non_invasive: false` in the scorecard and is
a **hard fail** — the integrator's self-heal loop refuses to touch it.
**Pass B — `feature_flag` set** (value from `product.json`). Run the
full suite including the new tests. All must pass.
Isolate integration-introduced failures using `git diff main..HEAD
--name-only`. A test file that exists on `main` and fails only under
the integration branch (flag set *or* unset) counts against the
scorecard regardless of pass. A test file that already failed on `main`
is surfaced as `pre_existing_unrelated` and does not count — but is
still reported so the user can clean it up.
Capture output to `.mem0-integration/test-stdout-flag-off.log` and
`.mem0-integration/test-stdout-flag-on.log`. Scorecard reports pass/fail
per pass.
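The isolation rule above amounts to set arithmetic over failing test files. The sketch below assumes the three failure sets have already been collected (how the `main` failures are obtained is outside this example), and it reads the Pass A gate as "no failure that is new relative to `main`". `PassVerdict` and `classify` are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class PassVerdict:
    non_invasive: bool
    integration_failures: set[str] = field(default_factory=set)
    pre_existing_unrelated: set[str] = field(default_factory=set)


def classify(
    failed_on_main: set[str],
    failed_flag_off: set[str],
    failed_flag_on: set[str],
) -> PassVerdict:
    """Split failing test files into integration-introduced and pre-existing."""
    seen = failed_flag_off | failed_flag_on
    # Already failing on main: reported, but not counted against the scorecard.
    pre_existing = seen & failed_on_main
    # Failing only on the integration branch, in either pass: counted.
    introduced = seen - failed_on_main
    # Pass A is the hard gate: a new failure with the flag unset is a violation.
    non_invasive = not (failed_flag_off - failed_on_main)
    return PassVerdict(non_invasive, introduced, pre_existing)
```

For example, a file that fails on `main` and in both passes lands in `pre_existing_unrelated`, while a new `test_mem0_*` file that fails only in Pass B lands in `integration_failures` without flipping `non_invasive`.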
### 5. Smoke test (real API call, shortest round-trip)
Scripted end-to-end flow tailored to `product.json`. The call shapes
below are the minimal ones; if `plan.md` names a delegated skill, use
*that skill's* minimal example verbatim instead — it is the canonical
shape for the detected stack.
**Platform (Python):**
```python
import os

from mem0 import MemoryClient

c = MemoryClient()  # reads MEM0_API_KEY from the environment
# Disposable user_id so a failed cleanup cannot touch real data
uid = f"mem0-test-integration-{os.urandom(4).hex()}"
c.add([{"role": "user", "content": "I prefer aisle seats"}], user_id=uid)
hits = c.search("seat preference", user_id=uid)
assert any("aisle" in h.get("memory", "") for h in hits), hits
c.delete_all(user_id=uid)  # clean up
```
**Platform (TS):** same shape with `MemoryClient` from `"mem0ai"`.
**OSS (Python / TS):** uses `Memory()` / `new Memory()` with default config
(OpenAI LLM via `OPENAI_API_KEY`, local Qdrant). If the repo ships a
`docker-compose.yml` with a Qdrant service, the skill starts it first and
tears it down after. If no backing store is reachable → fail with a
clear message naming the fix.
The smoke test always uses a **disposable random user_id** prefixed with
`mem0-test-integration-` so a failed cleanup doesn't pollute the user's
real data. A background tidy step deletes any prefix-matching entries
older than 24 hours on the next run.
Capture output to `.mem0-integration/smoke-stdout.log`.
### 6. E2E integration test (run the app, exercise the flow)
Unit tests + smoke prove the SDK works in isolation. This step is the
real signal: **does memory actually appear in the app's user-visible
output when the integration runs end-to-end?**
Requires `plan.md` to contain an `E2E recipe:` section (authored by
`/mem0-integrate` step 5). If absent → status `skipped` (not `fail`),
note in scorecard that the repo has no runnable entry point.
Recipe fields the skill reads:
- `start` — shell command to launch the app using `$PORT` for any network
port. Run in background with stdout/stderr teed to
`.mem0-integration/e2e-app.log`.
- `ready_probe` — how to detect readiness. `url=... status=...` polls an
HTTP endpoint; `log="..."` waits for a substring in `e2e-app.log`;
`sleep=N` waits N seconds (last resort). 60-second hard timeout.
- `compose_services` — optional. If set, bring them up via
`docker compose up -d <services>` before `start`, tear them down with
`docker compose down` at the end.
- `write_call` — triggers the Mem0 write path exactly once. Output is
captured and surfaced on failure. 60-second hard timeout.
- `write_async_wait_ms` — pause after `write_call` to let async memory
flushes land. Default 0.
- `read_call` — triggers the Mem0 read path. Typically a fresh session
or new request that should surface the stored memory.
- `read_assert` — substring, `regex=...`, or `jsonpath=<expr>=<value>`
that must appear in `read_call`'s stdout. This is the E2E pass gate.
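The three `read_assert` forms can be illustrated with a small evaluator. This is a sketch under stated assumptions: the function name `read_assert_matches` is hypothetical, the `jsonpath=` branch handles only simple dotted paths such as `$.reply.text` (a real JSONPath library would be needed for anything richer), and the matched value is compared as a string.

```python
import json
import re


def read_assert_matches(spec: str, stdout: str) -> bool:
    """Evaluate a read_assert spec against read_call's stdout."""
    if spec.startswith("regex="):
        return re.search(spec[len("regex="):], stdout) is not None

    if spec.startswith("jsonpath="):
        # Assumption: only dotted paths are supported; list items use numeric keys.
        expr, _, expected = spec[len("jsonpath="):].partition("=")
        try:
            node = json.loads(stdout)
            for key in expr.removeprefix("$").strip(".").split("."):
                node = node[int(key)] if isinstance(node, list) else node[key]
        except (ValueError, KeyError, IndexError, TypeError):
            return False  # not JSON, or the path does not resolve
        return str(node) == expected

    return spec in stdout  # plain substring
```

A recipe line such as `read_assert: jsonpath=$.reply.text=aisle` would then pass only if stdout is JSON whose `reply.text` value is exactly `aisle`.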
Execution order:
1. Allocate an ephemeral TCP port; export as `PORT`.
2. Set `MEM0_USER_ID` to a disposable `mem0-test-integration-<rand>` value
and export it, so the app can use the same scoping the smoke test does
if the recipe wants cleanup.
3. Bring up `compose_services` if named.
4. Run `start` in the background.
5. Poll `ready_probe` until success or 60s timeout. Timeout → fail.
6. Run `write_call`. Non-zero exit → fail (but continue to cleanup).
7. Sleep `write_async_wait_ms`.
8. Run `read_call`.
9. Evaluate `read_assert` against `read_call`'s stdout. Miss → fail.
10. Cleanup (always, even on failure): SIGTERM the app, SIGKILL after
5s, `docker compose down` if services were started, `delete_all`
memories matching `mem0-test-integration-*` on Platform scenarios.
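Steps 1 and 10 of the execution order map onto standard-library calls. The sketch below shows one common way to do each; `allocate_port` and `stop_app` are hypothetical names, and the skill may implement these steps differently.

```python
import socket
import subprocess
import sys


def allocate_port() -> int:
    """Ask the OS for a free TCP port by binding to port 0."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind(("127.0.0.1", 0))
        return sock.getsockname()[1]


def stop_app(proc: subprocess.Popen, grace_seconds: float = 5.0) -> int:
    """Terminate the app, escalating to kill if it outlives the grace period."""
    proc.terminate()  # SIGTERM on POSIX
    try:
        return proc.wait(timeout=grace_seconds)
    except subprocess.TimeoutExpired:
        proc.kill()  # SIGKILL on POSIX
        return proc.wait()
```

The port is released before the app starts, so another process could in principle claim it in between; for a short-lived local test run that race is usually acceptable. Calling `stop_app` from a `finally` block gives the "always, even on failure" behavior.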
On any failure, the scorecard includes:
- Last 40 lines of `e2e-app.log`
- Full `write_call` output
- Full `read_call` output
- The expected vs actual for `read_assert`
### 7. Scorecard
Write `.mem0-integration/scorecard.md` and `.mem0-integration/scorecard.json`:
```json
{
  "timestamp": "2026-04-20T14:03:11Z",
  "branch": "mem0-integrate/remember-user-preferences",
  "product": "platform",
  "language": "python",
  "mem0_version": "2.0.0",
  "non_invasive": true,
  "feature_flag": "MEM0_ENABLED",
  "results": {
    "install": {"status": "pass", "duration_ms": 12043},
    "static_checks": {"status": "pass", "duration_ms": 812},
    "unit_tests_flag_off": {"status": "pass", "duration_ms": 3920, "count": 47,
                            "reason": "all pre-existing tests green with flag unset"},
    "unit_tests_flag_on": {"status": "pass", "duration_ms": 4321, "count": 49},
    "smoke_test": {"status": "pass", "duration_ms": 2890, "memory_id": "mem_..."},
    "e2e_test": {"status": "pass", "duration_ms": 14200,
                 "ready_probe_ms": 3100, "write_exit": 0,
                 "read_assert_matched": true}
  },
  "friction": {
    "dependency_install_retries": 0,
    "pre_existing_test_failures": 0,
    "warnings": ["mem0ai 2.0.0 pinned; consider 2.0.1 for fix X"]
  },
  "overall": "pass"
}
```
The markdown version is human-readable and includes:
- Goal doc + plan doc reprinted at top (so reviewers don't have to hunt).
- Each check with pass/fail + log excerpt.
- Friction summary.
- Verbatim warnings from mem0 SDK (if any — e.g., deprecated field usage).
- **Explicit "NOT checked" section** listing what loose coupling misses:
"Whether the stored data is what the user wants stored. Whether search
runs at the right moment. Whether user_id matches the actual session
scope. Human review required."
### 8. Report + exit
- Print the scorecard path + overall pass/fail to stdout.
- **Do not commit the scorecard files.** They live in `.mem0-integration/`,
which is gitignored. The user can inspect and optionally pin.
- On fail: print the first failing step's log tail (last 40 lines) and
stop. Do not attempt to fix anything.
## Artifacts (all under `.mem0-integration/`)
| File | Purpose | Retention |
|---|---|---|
| `scorecard.md` | Human-readable verdict. | Overwritten per run. |
| `scorecard.json` | Machine-readable verdict. Consumed by the CI scorecard workflow later. | Overwritten per run. |
| `test-stdout-flag-off.log` | Step 4 Pass A (pre-existing suite, flag unset). | Overwritten per run. |
| `test-stdout-flag-on.log` | Step 4 Pass B (full suite, flag set). | Overwritten per run. |
| `smoke-stdout.log` | Full output from step 5. | Overwritten per run. |
| `e2e-app.log` | Background app stdout/stderr from step 6. | Overwritten per run. |
| `e2e-calls.log` | write_call + read_call invocations and outputs. | Overwritten per run. |
## Modes
| Mode | Trigger | Behavior |
|---|---|---|
| Interactive (default) | TTY present, `MEM0_TEST_CI` unset | Asks for missing keys, prints friendly summaries. |
| CI | `MEM0_TEST_CI=1` | Keys must be in env, no prompts, non-zero exit on any fail. The JSON scorecard is printed at the end of stdout for workflow parsing. |
## Invocation
```
/mem0-test-integration               # interactive, all steps
/mem0-test-integration --ci          # non-interactive
/mem0-test-integration --skip-smoke  # no API calls, no E2E
/mem0-test-integration --skip-e2e    # unit + smoke only (faster CI)
/mem0-test-integration --only-smoke  # just smoke
/mem0-test-integration --only-e2e    # just E2E (assumes deps installed)
```
Composition: `--skip-*` can stack (`--skip-smoke --skip-e2e` = static +
unit only, zero API cost). `--only-*` is mutually exclusive with all
other flags.
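If the flags were parsed by a Python wrapper, the composition rule could be enforced roughly as follows. This is purely illustrative: the skill is invoked as a slash command, `parse_flags` is a hypothetical name, and treating `--ci` as one of the "other flags" that `--only-*` excludes is a literal reading of the rule above.

```python
import argparse

ONLY_FLAGS = ("only_smoke", "only_e2e")


def parse_flags(argv: list[str]) -> argparse.Namespace:
    """Parse the invocation flags and reject invalid --only-* combinations."""
    parser = argparse.ArgumentParser(prog="mem0-test-integration")
    for name in ("--ci", "--skip-smoke", "--skip-e2e", "--only-smoke", "--only-e2e"):
        parser.add_argument(name, action="store_true")
    args = parser.parse_args(argv)

    chosen = [flag for flag, enabled in vars(args).items() if enabled]
    only = [flag for flag in chosen if flag in ONLY_FLAGS]
    # --only-* may not be combined with anything else, including another --only-*.
    if only and len(chosen) > 1:
        parser.error("--only-* is mutually exclusive with all other flags")
    return args
```

`--skip-smoke --skip-e2e` parses cleanly, while `--only-smoke --ci` makes `parser.error` print a usage message and raise `SystemExit`.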
## Exit codes
| Code | Meaning |
|---|---|
| 0 | All checks passed. |
| 1 | Precondition failed (no `.mem0-integration/`, wrong branch, dirty tree). |
| 2 | Missing env key (CI mode) or dependency install failure. |
| 3 | Static sanity check failed (wrong import, type error). |
| 4 | Unit tests failed (Pass B — integration itself broken). |
| 5 | Smoke test failed. |
| 6 | E2E test failed (ready_probe timeout, write/read call failed, or read_assert miss). |
| 7 | Non-invasiveness violation: Pass A failed (pre-existing tests broke). Integrator's heal loop refuses to touch this. |
| 8 | Internal error (skill bug — report it). |
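A CI workflow that consumes `scorecard.json` could derive the exit code from the table above along these lines. The result keys come from the scorecard example in step 7; the `"fail"` status string, the "first failing step wins" precedence, and the name `exit_code_for` are assumptions of this sketch. Codes 1 and 8 are not derivable from the scorecard and are omitted.

```python
# Results key -> exit code, listed in pipeline order.
STEP_EXIT_CODES = (
    ("install", 2),
    ("static_checks", 3),
    ("unit_tests_flag_off", 7),
    ("unit_tests_flag_on", 4),
    ("smoke_test", 5),
    ("e2e_test", 6),
)


def exit_code_for(scorecard: dict) -> int:
    """Map a parsed scorecard.json to an exit code; the first failing step wins."""
    # A non-invasiveness violation takes priority over every other failure.
    if scorecard.get("non_invasive") is False:
        return 7
    results = scorecard.get("results", {})
    for step, code in STEP_EXIT_CODES:
        if results.get(step, {}).get("status") == "fail":
            return code
    return 0  # "pass" and "skipped" steps do not affect the exit code
```

Checking `non_invasive` first reflects the role of exit code 7 as the signal that the heal loop must stop, regardless of what else failed.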
## Explicitly out of scope
- **Modifying source files.** The skill is read-only against the repo.
If verification exposes a bug, re-run `/mem0-integrate` on the same
goal + plan; do not hand-patch.
- **Fixing broken tests.** Failing unit tests are a signal that the
integration is wrong, not that the tests are wrong. The skill does
not "try a different test."
- **Deep logical correctness.** The E2E step proves "something the user
said earlier comes back later," which is a useful but shallow signal.
It does NOT prove the integration picks the *right* facts to store,
scopes `user_id` correctly across real users, or handles conflict
resolution well. That's human review territory.
- **Self-healing.** This skill never modifies source files. The paired
`/mem0-integrate` skill in its default `--heal` mode consumes the
scorecard produced here and drives its own remediation loop. Exit
code 7 (non-invasiveness violation) is the explicit signal the heal
loop must stop and surface to the user.
- **Cross-branch comparisons.** Beyond the failure isolation against
`main` in step 4, no baseline diffing is performed. The scorecard
reflects this branch only.
- **Running against production data.** Every smoke test uses a disposable
random user_id and cleans up after. Never touches any other user's data.
## Information
- Repository: mem0ai/mem0
- Author: mem0ai
- Last Sync: 5/10/2026
- Repo Updated: 5/9/2026
- Created: 5/6/2026