Best Evaluation AI Skills & MCP Servers

13 curated Evaluation skills and MCP servers — install any of them into Claude, Cursor, ChatGPT, n8n, or any AI stack with one command.

Tuning Engines CLI

MCP Registry

Tuning Engines CLI, MCP server, and Python agent runtime adapters for governed model, agent, skill, and MCP workflows. Fine-tune open-source LLMs, run inference, manage datasets/evaluations, and connect LangGraph or Temporal while Tuning Engines handles p

MCP Registry · ★ 5.0 · free

Server

MCP Registry

The agent eval standard for MCP. Score every agent output for quality, safety, and cost.

MCP Registry · ★ 5.0 · free

Formulon

MCP Registry

MCP server for Formulon Excel-compatible formula and workbook evaluation

MCP Registry · ★ 5.0 · free

MCPLab

MCP Registry

MCP server that exposes MCPLab evaluation tools — query runs, results, and traces via the Model Context Protocol

MCP Registry · ★ 5.0 · free

Server Tester

MCP Registry

Playwright-based testing and evaluation framework for MCP servers

MCP Registry · ★ 5.0 · free

AI Agent Guidelines

MCP Registry

MCP server exposing public instruction workflows as tools, backed by hidden AI agent skills for requirements, orchestration, quality, research, evaluation, governance, resilience, and physics-inspired analysis

MCP Registry · ★ 5.0 · free

Nia Web Eval Agent

MCP Registry

NIA AI Web Evaluation Agent MCP Server - Autonomous browser testing and debugging

MCP Registry · ★ 5.0 · free

MCPLab Core

MCP Registry

Core evaluation engine for MCPLab — agent adapters, MCP client, scenario runner, and result types

MCP Registry · ★ 5.0 · free

MCPLab

MCP Registry

MCPLab - Test and evaluate MCP servers with LLMs — run evals, compare agents and launch the MCPLab web app

MCP Registry · ★ 5.0 · free

Mcp

MCP Registry

Official MCP server for Autousers — UX evaluation, calibrated AI personas, side-by-side design review.

MCP Registry · ★ 5.0 · free

MCPLab Reporting

MCP Registry

HTML report generation for MCPLab evaluation runs

MCP Registry · ★ 5.0 · free

Evals

MCP Registry

GitHub Action for evaluating MCP server tool calls using LLM-based scoring

MCP Registry · ★ 5.0 · free

xCOMET

MCP Registry

MCP Server for xCOMET translation quality evaluation

MCP Registry · ★ 5.0 · free

About Evaluation skills on iClaude

iClaude is the universal install layer for AI skills. Every Evaluation skill on this page can be installed into Claude Code, Claude Desktop, Cursor, ChatGPT, n8n, Codex, and more — using a single copy-paste command. No config drift, no per-stack adapters, no manual MCP wiring.
