Leaderboard/codeitall

MCP ServerScored via MCP protocol probing: initialize handshake, tools/list conformance, and ping + tool invocation performance.

codeitall

api.codeitall.dev/for-ai-agents.html↗|v0.1.0

Behavioral oracle for Rust crate APIs: runtime behavior, yank/advisory, deprecation, sig search

89/100

Operational Score

Score Breakdown

Availability30/30

Conformance30/30

Performance29/40

Key Metrics

Uptime 30d

100.0%

P95 Latency

300.1ms

Conformance

Pass

Trend

What's Being Tested

Availability

HTTP health check to the service endpoint

Responded with HTTP 405 in 121ms

Conformance

MCP initialize handshake + tools/list

Valid MCP server info returned, tools/list responded

Performance

MCP ping + zero-arg tool invocation benchmarking

P95 latency: 300ms, task completion: 100%

Skills

answer_api_question

Use this when an agent asks open-ended Rust API questions like 'how do I parse a duration string', 'which crate gives me HMAC verification', 'is this function deprecated', 'what's the current way to do JWT auth', or 'compare base64 crates'. Hand off free-text questions verbatim; the substrate routes them deterministically via rule-based intent detection (no LLM in the request path) and dispatches into signature_search, behavior_lookup, compare_implementations, or find_modern_equivalent as appropriate, falling back to bge-m3 cosine retrieval if no rule matches. Returns a structured verdict with routed_via, the primary recommendation, evidence, and caveats.

signature_search

Use when the agent already knows the signature shape it wants — e.g. `fn(&str) -> Result<Vec<u8>, _>`, `fn(&[u8]) -> Result<String, std::str::Utf8Error>`, or `impl Iterator<Item=Result<...>>`. Returns ranked candidate functions with the crate they live in, downloads, yank status, and a compact behavior summary if probed. Pass the exact signature verbatim; the substrate normalises whitespace + identifiers on its end.

behavior_lookup

Use when the agent has a specific (crate, fn_name) pair and wants to know what inputs it actually accepts at runtime — e.g. `(crate='jiff', fn_name='Timestamp::from_str')` for methods, `(crate='ascii85', fn_name='decode')` for free functions. `fn_name` accepts BOTH qualified (`Type::method` / `module::fn`) and bare (`method` / `fn`) forms — the matcher tries the exact input first, then the alternate form; the `matched_fn_name` response field records the substitution when one happened. Returns the probe observation table verbatim from the substrate: each row is (input, outcome=ok|err|panic, value or error variant). Pass an optional `inputs` array to filter to specific input strings. On a zero-hit the response carries a `diagnostics` block (`received_crate`, `received_fn_name`, `closest_crates`, `closest_fns_in_crate`, `hint`) so the agent can self-correct without a dead-end round-trip. The substrate's discrimination findings live here — runtime behaviour the docs are silent or wrong about.

compare_implementations

Use when the agent asks about a task category — e.g. 'how do I parse JSON in rust', 'compare base64 crates', 'which datetime library handles RFC 3339 timezones right' — and wants the cross-implementation behavior table. The substrate returns side-by-side observations on the canonical input set: for each implementation (crate, fn_name), each input in the family's input set is paired with the observed (outcome, value_or_error_variant). Optional `crates` / `fns` arrays restrict the returned set; optional `summary=true` replaces per-input `observations` with an `n_observations` count for index-only listings (bounds response size by family-member count, not member × input count). Optional `subfamily` narrows to a registered sub-tag (e.g. `task='base64', subfamily='base64'` returns only canonical base64 crates, not ascii85 / base58 / hex / …) — call `list_families` to see available tags. Non-core family members (per a small hand-authored allowlist) carry an advisory `caveat` field warning that the entry was probed on shared inputs that may not reflect its typical usage. On a zero-hit family (`n_attempted=0`) the response carries a `diagnostics` block (`received_family`, `available_families`, `closest_families`, `hint`) so the agent can recover without guessing. Discrimination signal lives here — the docs-silent runtime behaviour pattern Guiding Principle #8 names.

find_modern_equivalent

Use when the agent suspects the code it's about to write uses a deprecated API — e.g. `tokio::Runtime::new()` (now `tokio::runtime::Runtime::new()`), `chrono::DateTime::from_str` (now feature-gated), or any path that points to a yanked or RustSec-advisory'd crate version. The substrate returns the current canonical alternative with a behavior summary if probed, and surfaces yank-status / advisory rows verbatim. For TypeScript/Next.js pass language='ts' with a deprecated Next.js API — e.g. `next/router` (now `next/navigation`), `getServerSideProps` (now an async Server Component), `@next/font` (now `next/font`), `next/legacy/image` (now `next/image`) — and the substrate returns the modern equivalent with the official upgrade-guide citation and how prevalent the deprecated vs modern form is across the live Next.js corpus.

list_families

Use when the agent wants to enumerate the probe families codeitall has covered — typically before constructing a `compare_implementations(task=…)` call, or to recover from a typo flagged by `compare_implementations`'s near-miss diagnostics. Returns the canonical family list with one-line descriptions, per-family counts, and the registered sub-family tags (Phase 1.12 W5; pass any as `subfamily` to `compare_implementations`): `{ families: [{ name, description, n_implementations, n_observations, sub_families }, …] }`. The family names returned here are exactly the strings accepted by `compare_implementations`. No required parameters.

pattern_consensus

Use when the agent wants the cross-project CONSENSUS — 'how do most current Next.js projects actually do X?' — rather than a single example. Returns precomputed consensus records ('X% of projects matching predicate P do Y') for a category: `auth-library` (which auth library projects use), `data-fetching-style` (legacy page-level data fetching vs Server Actions vs Server Components), or `routing-hooks` (modern next/navigation vs legacy next/router). Each record carries its honest denominator N, the full outcome distribution with percentages, and the exact SQL predicate behind every number. Pass `category` to filter, or omit it to get all. Next.js corpus — pass language='ts'.

package_trust

Use before recommending or installing an npm package — to check whether it is safe and current. Pass the package name (and optionally a version, e.g. `4.17.0`) with language='ts'. The substrate returns every OSV/GHSA security advisory affecting it (with severity, CVE aliases, the exact affected version range, and the advisory URL) and, if you pass a version, whether THAT version is affected; plus npm registry signals — whether the package is deprecated (and the maintainer's deprecation message), its latest version and last-publish date, and a maintained flag; plus how many real Next.js projects in the corpus depend on it and which declared version ranges the advisories actually bite. This is the cross-referenced security + maintenance signal a stale model cannot reliably know.

Tools

8 tools verified via live probe

verified 2h ago

Server: codeitall-serverVersion: phase-2.C/v0.4.0Protocol: 2024-11-05

Recent Probe Results

Timestamp	Status	Latency	Conformance	Details
Jun 10, 2026	success	121.4ms	Pass	-
Jun 9, 2026	success	103.3ms	Pass	-
Jun 5, 2026	success	252.7ms	Pass	-
Jun 5, 2026	success	300.1ms	Pass	-
Jun 4, 2026	success	253.1ms	Pass	-
Jun 3, 2026	success	109.7ms	Pass	-
May 30, 2026	success	161.7ms	Pass	-
May 29, 2026	success	172.8ms	Pass	-
May 29, 2026	success	156.4ms	Pass	-

Source Registries

mcp-registry

First Seen

May 27, 2026

Last Seen

Jun 9, 2026

Last Probed

Jun 10, 2026