Best Evals AI Skills & MCP Servers
8 curated Evals skills and MCP servers — install any of them into Claude, Cursor, ChatGPT, n8n, or any AI stack with one command.
Evals
GitHub Action for evaluating MCP server tool calls using LLM-based scoring
Sdk
MCP server unit testing, end-to-end (e2e) testing, and server evals
Server Tester
Playwright-based testing and evaluation framework for MCP servers
Confused Ai
Fast TypeScript AI agent framework — per-request agents, 30+ model providers, 100+ integrations, 20+ vector DBs, 10+ databases, sessions, memory, knowledge, tracing, evals, HITL, teams, and workflows.
Server
An MCP server that serves information from Braintrust documentation as well as your evals and logs.
Mcplab
MCPLab: test and evaluate MCP servers with LLMs. Run evals, compare agents, and launch the MCPLab web app.
Skar
Skar turns a captured AI agent trace into a committed pytest regression test. MCP server + CLI. Use when a tool-using agent run fails and you want to lock the failure as an executable test.
Clinical Trials Data Server
Provide structured access to ClinicalTrials.gov data for searching, retrieving, and analyzing clinical trial information. Enable multi-parameter searches, detailed trial retrievals, and statistical analyses to support medical research and healthcare decision-making. Deliver robus
About Evals skills on iClaude
iClaude is the universal install layer for AI skills. Every Evals skill on this page can be installed into Claude Code, Claude Desktop, Cursor, ChatGPT, n8n, Codex, and more — using a single copy-paste command. No config drift, no per-stack adapters, no manual MCP wiring.