Best Eval AI Skills & MCP Servers
59 curated Eval skills and MCP servers — install any of them into Claude, Cursor, ChatGPT, n8n, or any AI stack with one command.
Merch Connector
MCP server that gives merchandising agents eyes on any storefront — scrape, audit, compare, roundtable analysis, and eval tracking via 11 tools.
Paper Search Agent
MCP server for paper-search-agent: academic paper discovery, access planning, and full-text retrieval via campus network
Server
A Model Context Protocol (MCP) server for Ragie
AI Agent Guidelines
MCP server exposing public instruction workflows as tools, backed by hidden AI agent skills for requirements, orchestration, quality, research, evaluation, governance, resilience, and physics-inspired analysis
Enquire
MCP server giving AI agents (Claude Code, Claude Desktop, Cursor, ChatGPT, Codex, OpenClaw) persistent long-term memory backed by your local Obsidian markdown vault. Hybrid retrieval (BM25 + ML embeddings + BGE reranker, RRF-fused), HNSW + int8 quantizati
Pdf Reader
MCP server for efficient PDF text extraction, search, and metadata retrieval for Claude Code
Mnemo
Structured fact memory MCP server — SQLite + FTS5, trust scoring, entity graph, bilingual retrieval for Claude Code & Codex
Superlocalmemory
Information-geometric agent memory with mathematical guarantees. 4-channel retrieval, Fisher-Rao similarity, zero-LLM mode, EU AI Act compliant. Works with Claude, Cursor, Windsurf, and 17+ AI tools.
Prism
Prism Coder — Cognitive memory + tool-calling intelligence for AI agents. Mind Palace persistent memory (BFCL Gold Certified, 100% Tool-Call Accuracy, 54 Agent Skills, Zero-Search HDC/HRR retrieval, HIPAA-hardened local-first storage, SLERP-optimized GRPO
Judges
45 specialized judges that evaluate AI-generated code for security, cost, and quality.
Clawmem
On-device memory layer for AI agents, including Claude Code, OpenClaw, and Hermes. Hooks + MCP server + hybrid RAG search.
Calculator
Evaluate, simplify, and differentiate mathematical expressions via MCP. STDIO or Streamable HTTP.
Recourse Cli
MCP server for AI agents to evaluate consequences before destructive actions. Analyzes Terraform plans, shell commands, and MCP tool calls.
Tuningengines Cli
Tuning Engines CLI, MCP server, and Python agent runtime adapters for governed model, agent, skill, and MCP workflows. Fine-tune open-source LLMs, run inference, manage datasets/evaluations, and connect LangGraph or Temporal while Tuning Engines handles p
Server
The agent eval standard for MCP. Score every agent output for quality, safety, and cost.
Cogmemai
CogmemAi: Autonomous Cognitive Memory for Any AI System. 95.10% on LongMemEval (top published score on the field's hardest long-term memory benchmark) and 91% on LoCoMo (above human performance). Autonomous memory capture: your AI's work is saved even wh
Md Feedback
MCP server for markdown plan review — companion to the MD Feedback VS Code extension. AI agents read annotations, mark tasks done, evaluate quality gates, and generate session handoffs. 27 tools for Claude Code, Cursor, and other MCP-compatible clients.
Sigil
Persistent memory for AI coding agents. Local-first knowledge engine with atomic facts, entity graph, and hybrid retrieval. Auto-integrated with Claude Code via hooks; MCP-native for Cursor, Continue, Cline, Windsurf, and any other MCP client.
Skar
Skar turns a captured AI agent trace into a committed pytest regression test. MCP server + CLI. Use when a tool-using agent run fails and you want to lock the failure as an executable test.
Lightrag
Model Context Protocol (MCP) server for LightRAG - 30 fully working tools with complete RAG and Knowledge Graph integration
Formulon
MCP server for Formulon Excel-compatible formula and workbook evaluation
Ori Memory
Cognitive architecture for persistent AI agent memory. Knowledge graph with learning retrieval, ACT-R decay, and spreading activation. Markdown-native, local-first, zero cloud. MCP server + CLI.
Memory Lancedb
MCP server for LanceDB-backed long-term memory with hybrid retrieval (Vector + BM25), cross-encoder rerank, multi-scope isolation, and memory lifecycle management
MCP
Model Context Protocol server for digitalcalculator.info financial calculators. v0.3.0 ships 9 calculator tools (mortgage monthlyPayment, compound-interest futureValue, retirement401k projection, Social Security estimatedBenefit, paycheck netPay, IRA cont
About Eval skills on iClaude
iClaude is the universal install layer for AI skills. Every Eval skill on this page can be installed into Claude Code, Claude Desktop, Cursor, ChatGPT, n8n, Codex, and more — using a single copy-paste command. No config drift, no per-stack adapters, no manual MCP wiring.