QueueSim
Run M/M/c queue simulations and four preset scenarios (single server, coffee shop, grocery checkout, call center).
Run a generic M/M/c queue simulation. Provide an arrival rate (λ, arrivals/hour), a service rate per server (μ, customers/hour each server can finish), and a server count (c). Optional: distribution shapes, service coefficient of variation, run length. Returns per-hour metrics and an overall summary (avg wait, queue length, offered load, throughput). This is the primary tool for 'how many servers do I need?' / 'what's my average wait?' style questions. ALSO preferred over simulate_scenario for what-if questions about scheduled scenarios (Coffee Shop) when the user wants flat uniform numbers — pull the peak params from describe_scenario and run them here. That usually matches user intent better than collapsing a schedule. ANTI-FABRICATION: the returned numbers come from a real discrete-event simulation run. Quote them VERBATIM in your reply. Do not round, estimate, or compute derived figures from training-data recall. If the user asks a follow-up about the same configuration, re-call this tool rather than recalling numbers from earlier in the conversation.
List the four pre-built QueueSim scenarios. Returns key, title, and one-line description for each (Single Server, Coffee Shop, Grocery Checkout, Call Center). Call this when the user's problem matches one of the preset shapes — use describe_scenario for more detail and simulate_scenario to run one.
Return full details for one preset scenario: title, description, teaching note, peak parameters, and per-hour arrival + staffing arrays. Use this before simulate_scenario to understand the default shape and what overrides make sense.
Run one of the four preset scenarios (single, coffee, grocery, callcenter) with optional overrides. Overrides apply UNIFORMLY across open hours — e.g. setting servers=5 on 'coffee' replaces the 4/6/4 staffing pattern with a flat 5 during open hours (closed hours stay at zero). Use this for (a) faithful reproduction of a scenario's defaults, or (b) uniform scaling (everywhere it was open, use these new numbers). Do NOT use this when the user wants to keep a scheduled scenario's shape but tweak just one part — there's no per-hour override here, and collapsing a 4/6/4 pattern to 5 often isn't what the user meant. For flat what-if analysis on scheduled scenarios, prefer simulate_mmc using peak params from describe_scenario. ANTI-FABRICATION: returned numbers come from a real discrete-event simulation run. Quote them VERBATIM in your reply. Do not round, estimate, or compute derived figures from training-data recall. If the user asks a follow-up about the same scenario+overrides, re-call this tool rather than recalling numbers from earlier in the conversation.
Run a queueing simulation against an arbitrary 24-hour staffing schedule. Use this when the user describes a custom day shape that doesn't match a preset (e.g., 'my coffee shop is open 6am–10pm with 4 baristas off-peak, 7 at the 8am rush, 5 at the 4pm rush'). Inputs: `arrivalRates` (24-element array, customers/hr for each hour-of-day) and `staffing` (24-element array, servers on duty for each hour-of-day); optional uniform `serviceTimeMinutes`. Use 0 in both arrays for closed hours (terminating system). Returns the same per-hour metrics + summary shape as simulate_mmc / simulate_scenario. Stronger fit than simulate_scenario when the user's shape doesn't match the four presets; stronger fit than simulate_mmc when they need per-hour variation. ANTI-FABRICATION: numbers come from a real DES run. Quote them VERBATIM. Do not round, estimate, or derive from training-data recall.
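As an illustration of the two 24-element arrays, here is a minimal Python sketch that encodes the coffee-shop example from the description above. The staffing numbers come from that example; the arrival rates, the 3-minute service time, and the choice to treat each rush as a single hour are placeholder assumptions, and the helper names (`flat_schedule`, `check_schedule`) are hypothetical, not part of QueueSim.

```python
HOURS = 24
OPEN_HOUR, CLOSE_HOUR = 6, 22  # open 6am-10pm, so hours 6..21 are staffed


def flat_schedule(open_value: int) -> list[int]:
    """24 entries: open_value during open hours, 0 while closed."""
    return [open_value if OPEN_HOUR <= h < CLOSE_HOUR else 0 for h in range(HOURS)]


def check_schedule(arrival_rates: list[int], staffing: list[int]) -> None:
    """Basic shape checks implied by the tool description."""
    if len(arrival_rates) != HOURS or len(staffing) != HOURS:
        raise ValueError("both arrays need exactly 24 entries")
    for h, (lam, c) in enumerate(zip(arrival_rates, staffing)):
        if lam > 0 and c == 0:
            raise ValueError(f"hour {h}: arrivals but no servers on duty")


# Staffing from the example: 4 off-peak, 7 at the 8am rush, 5 at the 4pm rush.
staffing = flat_schedule(4)
staffing[8] = 7
staffing[16] = 5

# Arrival rates are illustrative placeholders, not measured data.
arrival_rates = flat_schedule(40)
arrival_rates[8] = 90
arrival_rates[16] = 60

check_schedule(arrival_rates, staffing)

# Argument object using the parameter names given in the description.
arguments = {
    "arrivalRates": arrival_rates,
    "staffing": staffing,
    "serviceTimeMinutes": 3,  # assumed value
}
```

Closed hours carry 0 in both arrays, which matches the terminating-system convention the description asks for.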
Run the classic operations-research teaching demo: pooled queueing (one shared queue, c servers) vs separate queues (c independent queues, one server each, λ/c traffic to each). Both runs have identical total capacity (c × μ) and identical total arrivals (λ), so the offered load ρ is the same; the only structural difference is whether arrivals share a queue or split into c isolated streams. The pooled configuration ALWAYS produces shorter waits — that's the whole teaching point. Use this when the user asks 'should we pool our resources?' / 'should we cross-train?' / 'why do banks have one line instead of c?' / 'what's the cost of siloing my call center into specialist queues?'. Returns both runs side by side with the pooled-vs-separate wait delta. ANTI-FABRICATION: numbers come from two real DES runs. Quote them VERBATIM.
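The pooling effect can also be checked analytically for steady-state M/M/c. The sketch below uses the closed-form Erlang-C formula as a stand-in for the two simulation runs; it is not QueueSim's engine, and the rates (λ = 48/hr, μ = 20/hr per server, c = 3) are hypothetical inputs chosen only for illustration.

```python
from math import factorial


def erlang_c_wait_hours(lam: float, mu: float, c: int) -> float:
    """Mean time in queue (hours) for a stable M/M/c system via Erlang C."""
    a = lam / mu   # offered load in Erlangs
    rho = a / c    # per-server utilization
    if rho >= 1:
        raise ValueError("unstable system: need lam < c * mu")
    tail = a**c / factorial(c) / (1 - rho)
    head = sum(a**k / factorial(k) for k in range(c))
    p_wait = tail / (head + tail)      # probability an arrival has to wait
    return p_wait / (c * mu - lam)     # mean wait in queue


lam, mu, c = 48.0, 20.0, 3  # hypothetical inputs

pooled_min = 60 * erlang_c_wait_hours(lam, mu, c)        # one shared queue
separate_min = 60 * erlang_c_wait_hours(lam / c, mu, 1)  # c isolated M/M/1 queues

print(f"pooled:   {pooled_min:.2f} min")
print(f"separate: {separate_min:.2f} min")
print(f"delta:    {separate_min - pooled_min:.2f} min")
```

Both configurations run at the same utilization (0.8 here), so the gap in waits comes entirely from the queue structure.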
Return a ~500-word educational explainer of M/M/c queueing theory: Little's Law, utilization, why averages mislead, how simulation relates to Erlang-C. No inputs. Use this when the user asks a conceptual 'why' or 'how does this work' question rather than asking for a number.
Return a textbook-level description of six queueing complexity patterns beyond basic M/M/c: abandonment/reneging, priority tiers, overflow routing, skills-based routing, compound service, and server outages. Use this when the user describes real-world complexity (customers hanging up, VIP queues, specialist escalation, agent breaks, transfers) that plain M/M/c doesn't model. The tool frames each pattern conceptually and points users at ChiAha for custom modeling.
INVERSE of simulate_mmc — given an arrival rate, service rate, and a target average wait time, returns the SMALLEST number of servers needed to meet the target. Use this when the user asks 'how many servers do I need?' / 'what staffing keeps wait under N minutes?'. The tool runs a binary search over candidate server counts (up to maxServers, default 50), invoking the simulator for each candidate. Saves Claude from iterating simulate_mmc 3-5 times by hand. If even maxServers servers can't meet the target, the recommendation is null and the response includes the achieved wait so Claude can explain that the target is infeasible at the given load. ANTI-FABRICATION: `recommendedServers` and `achievedAvgWaitMinutes` come from real DES runs. Quote them VERBATIM. Do not propose a different number you think 'feels right'; this tool already binary-searches for the minimum that meets the target. If the user asks 'what if c=N?' for a specific N, call simulate_mmc with that c.
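The search described above can be sketched as follows. This is a minimal illustration, not the tool's implementation: `min_servers` is a hypothetical helper, and the closed-form Erlang-C wait stands in for the simulator. Binary search assumes the wait never increases as servers are added, which holds for the formula but only approximately for noisy simulation output.

```python
from typing import Callable, Optional


def erlang_c_wait_minutes(lam: float, mu: float, c: int) -> float:
    """Stand-in for a simulation run: steady-state M/M/c mean wait in minutes."""
    a = lam / mu
    if a >= c:
        return float("inf")  # unstable, the queue grows without bound
    b = 1.0  # Erlang B with zero servers
    for k in range(1, c + 1):
        b = a * b / (k + a * b)  # standard Erlang B recursion
    rho = a / c
    p_wait = b / (1 - rho * (1 - b))  # convert Erlang B to Erlang C
    return 60.0 * p_wait / (c * mu - lam)


def min_servers(
    wait_fn: Callable[[int], float],
    target_minutes: float,
    max_servers: int = 50,
) -> Optional[int]:
    """Smallest c in [1, max_servers] with wait_fn(c) <= target, else None."""
    if wait_fn(max_servers) > target_minutes:
        return None  # infeasible even at the cap
    lo, hi = 1, max_servers
    while lo < hi:
        mid = (lo + hi) // 2
        if wait_fn(mid) <= target_minutes:
            hi = mid       # mid meets the target, try fewer servers
        else:
            lo = mid + 1   # mid misses the target, need more servers
    return lo


lam, mu = 48.0, 20.0  # hypothetical arrival and per-server service rates (per hour)
recommended = min_servers(lambda c: erlang_c_wait_minutes(lam, mu, c), target_minutes=1.0)
print(recommended)
```

With a cap of 50 the loop evaluates at most six candidates after the initial feasibility check, compared with stepping through server counts one at a time.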
Given an M/M/c configuration (arrivalRate, serviceRate, servers) and optionally an observed average wait, returns a queueing-theory framed interpretation: where you sit on the utilization curve, what ρ means in plain language, what one more or fewer server would qualitatively do, and which complexity factors (priority, abandonment, skills routing) might be hiding in real data the M/M/c model can't see. Use this to TEACH while answering — when the user wants context around a number, not just the number itself. Pure text computation, no simulation, no RNG — deterministic output.
Run the same M/M/c configuration through BOTH the closed-form Erlang-C formula AND the discrete-event simulator, returning a side-by-side comparison with deltas. Use this when the user is validating QueueSim's engine against textbook values, learning queueing theory by watching simulation converge on the formula, or auditing a result that 'feels off'; agreement within ~5% is the canonical sanity check for an M/M/c run. Pure-Exponential M/M/c only; the closed-form Erlang-C is undefined for other service distributions. Large deltas usually mean the simulation run was too short for steady-state, so raise simulationDays. ANTI-FABRICATION: both sides come from real computation: closed-form is deterministic, simulation is stochastic but engine-backed. Quote both verbatim. Do not synthesize an 'average of the two' or recompute the formula from training-data recall.
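For reference, the closed-form side of the comparison is the standard Erlang-C result. The notation below is an assumption (the listing does not show the tool's own symbols): $a = \lambda/\mu$ is the offered load in Erlangs and $\rho = \lambda/(c\mu) < 1$ is the per-server utilization.

```latex
\[
C(c, a) \;=\;
\frac{\dfrac{a^{c}}{c!}\,\dfrac{1}{1-\rho}}
     {\displaystyle\sum_{k=0}^{c-1}\frac{a^{k}}{k!} \;+\; \frac{a^{c}}{c!}\,\frac{1}{1-\rho}},
\qquad
W_q \;=\; \frac{C(c, a)}{c\mu - \lambda}
\]
```

Here $C(c, a)$ is the probability that an arrival has to wait and $W_q$ is the mean time in queue. As a worked case with hypothetical inputs $\lambda = 48$/hr, $\mu = 20$/hr, $c = 3$: $a = 2.4$, $\rho = 0.8$, $C \approx 0.647$, and $W_q \approx 0.0539$ hr, or about 3.24 minutes.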
| Timestamp | Status | Latency | Conformance |
|---|---|---|---|
| Jun 12, 2026 | success | 28.1ms | Pass |
| Jun 11, 2026 | success | 5.7ms | Pass |
| Jun 11, 2026 | success | 6.6ms | Pass |
| Jun 10, 2026 | success | 5.2ms | Pass |
| Jun 9, 2026 | success | 13.1ms | Pass |
| Jun 5, 2026 | success | 7.1ms | Pass |
| Jun 5, 2026 | success | 6.2ms | Pass |
| Jun 4, 2026 | success | 15.6ms | Pass |
| Jun 3, 2026 | success | 11ms | Pass |
| May 30, 2026 | success | 11.9ms | Pass |