Last verified · v1.0

Queueing Theory Calculator (M/M/S)

Calculate M/M/S queue metrics: average wait time, queue length, server utilization, and idle probability for multi-server Poisson arrival systems.

The formula

How the result is computed.

M/M/S Queueing Theory: The Erlang C Model Explained

The M/M/S queueing model, also called the Erlang C model, describes a waiting-line system in which customers arrive according to a Poisson process at mean rate λ, are served by s identical parallel servers whose service times are exponentially distributed with rate μ, and wait in a single shared queue when all servers are busy. Operations researchers, network engineers, hospital administrators, and manufacturing planners use this model to size service capacity, reduce customer wait times, and balance staffing costs against congestion penalties. The queueing theory calculator automates these computations and returns the key steady-state metrics: idle probability, queue length, wait time, and server utilization.

Core Variables and Notation

λ (lambda) — Mean customer arrival rate per unit time; arrivals follow a Poisson process.
μ (mu) — Mean service rate per server per unit time; individual service durations are exponentially distributed with mean 1/μ.
s — Number of identical parallel servers sharing one common queue.
a = λ/μ — Offered load in Erlangs: total work arriving per unit time, independent of server count.
ρ = λ/(sμ) — Traffic intensity (server utilization): the long-run fraction of time each server is busy.

Stability Requirement

The M/M/S system reaches a valid steady state only when ρ < 1, meaning λ < sμ. When arrivals equal or exceed combined service capacity, the queue grows without bound and none of the steady-state metrics are meaningful. As established in Chapter 8: Queueing Theory (Eastern Michigan University), verifying this stability condition is the mandatory first step before applying any Erlang C formula. For example, a call center receiving 20 calls/hr with 4 agents each handling 6 calls/hr gives ρ = 20/(4×6) ≈ 0.833 — stable, with servers occupied 83.3% of the time.
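The stability check can be sketched in a few lines of Python. This is a minimal illustration and not the calculator's actual code; the function name `traffic_summary` is hypothetical, and the inputs are the call center figures from the paragraph above.

```python
def traffic_summary(lam: float, mu: float, s: int) -> tuple[float, float]:
    """Return (offered load a, utilization rho) after checking stability.

    Hypothetical helper for illustration; lam and mu must share a time unit.
    """
    if lam <= 0 or mu <= 0 or s < 1:
        raise ValueError("lam and mu must be positive and s must be at least 1")
    a = lam / mu            # offered load in Erlangs
    rho = lam / (s * mu)    # long-run fraction of time each server is busy
    if rho >= 1:
        raise ValueError(f"unstable: rho = {rho:.3f} >= 1, so no steady state exists")
    return a, rho


a, rho = traffic_summary(20, 6, 4)
print(f"a = {a:.3f} Erlangs, rho = {rho:.3f}")  # a = 3.333 Erlangs, rho = 0.833
```

Raising an error for ρ ≥ 1 is one possible design choice; it keeps later formulas from silently returning negative or infinite values.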

Step-by-Step Formula Derivation

Step 1 — Idle-System Probability P₀

P₀ is derived by solving the global balance equations of the birth-death Markov chain underlying the M/M/S system:

P₀ = [ Σ(k=0 to s−1)(a^k / k!) + a^s / (s! (1−ρ)) ]⁻¹

The finite sum accounts for states with fewer than s busy servers; the second term aggregates all states where all servers are occupied and customers queue. In the call center example (λ=20, μ=6, s=4): a=10/3, ρ≈0.833, yielding P₀≈0.0213 — the system is completely idle only about 2.1% of the time.
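As a rough sketch, the P₀ formula translates directly into Python. The function name `idle_probability` and the variable names are assumptions made for this example; only the formula itself comes from the text above.

```python
from math import factorial


def idle_probability(lam: float, mu: float, s: int) -> float:
    """P0 for an M/M/s queue; requires rho = lam / (s * mu) < 1."""
    a = lam / mu
    rho = a / s
    if rho >= 1:
        raise ValueError("unstable system: rho must be below 1")
    # States with fewer than s busy servers (k = 0 .. s-1).
    below_capacity = sum(a**k / factorial(k) for k in range(s))
    # Aggregate of all states with every server busy (k >= s).
    all_busy = a**s / (factorial(s) * (1 - rho))
    return 1 / (below_capacity + all_busy)


p0 = idle_probability(20, 6, 4)
print(f"P0 = {p0:.4f}")  # P0 = 0.0213
```

For s = 1 the expression reduces to 1 − ρ, the familiar single-server result, which makes a convenient sanity check.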

Step 2 — Mean Queue Length L_q

L_q = P₀ · a^s · ρ / (s! · (1−ρ)²)

This quantity rises sharply as ρ approaches 1. In the 4-server example above, L_q≈3.29 customers wait on average. Adding a fifth agent (s=5, ρ≈0.667) drops L_q to roughly 0.65 — a fivefold reduction from a single additional server. This nonlinear sensitivity near saturation is the defining characteristic of M/M/S systems and the primary motivation for using the Erlang C calculator before making staffing decisions.
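The four-server versus five-server comparison can be reproduced with a short script. This is an illustrative sketch under the same assumptions as the formulas above; `mean_queue_length` is a hypothetical name, not part of any library.

```python
from math import factorial


def mean_queue_length(lam: float, mu: float, s: int) -> float:
    """L_q for an M/M/s queue; requires rho = lam / (s * mu) < 1."""
    a = lam / mu
    rho = a / s
    if rho >= 1:
        raise ValueError("unstable system: rho must be below 1")
    p0 = 1 / (
        sum(a**k / factorial(k) for k in range(s))
        + a**s / (factorial(s) * (1 - rho))
    )
    return p0 * a**s * rho / (factorial(s) * (1 - rho) ** 2)


lq4 = mean_queue_length(20, 6, 4)
lq5 = mean_queue_length(20, 6, 5)
print(f"s=4: L_q = {lq4:.2f}")  # s=4: L_q = 3.29
print(f"s=5: L_q = {lq5:.2f}")  # s=5: L_q = 0.65
```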

Step 3 — Mean Wait Time in Queue W_q

By Little's Law (UCSD ECE 158A), the average time a customer waits in the queue before starting service is:

W_q = L_q / λ

In the example: W_q = 3.29 / 20 ≈ 0.165 hours ≈ 9.9 minutes. This is the figure most relevant to customer-facing service-level agreements, since it represents pure waiting delay attributable to congestion rather than service complexity.

Step 4 — Mean Sojourn Time W and Mean System Population L

Total time in system adds the service duration: W = W_q + 1/μ. With μ=6 calls/hr, one service averages 10 minutes, so W≈19.9 minutes (about 0.331 hours). Applying Little's Law again yields the average number of customers present system-wide at any moment: L = λW ≈ 20 × 0.331 ≈ 6.62 customers. This equals L_q + λ/μ = 3.29 + 3.33 = 6.62, providing a useful cross-check.
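Steps 1 through 4 can be chained into a single function, sketched here in Python. The `QueueMetrics` container and `mms_metrics` function are hypothetical names chosen for this example; times come out in the same unit as 1/λ and 1/μ (hours here).

```python
from dataclasses import dataclass
from math import factorial


@dataclass
class QueueMetrics:
    rho: float    # server utilization
    p0: float     # probability the system is empty
    lq: float     # mean number waiting in queue
    wq: float     # mean wait in queue
    w: float      # mean time in system (sojourn time)
    l_sys: float  # mean number in system


def mms_metrics(lam: float, mu: float, s: int) -> QueueMetrics:
    a = lam / mu
    rho = a / s
    if rho >= 1:
        raise ValueError("unstable system: rho must be below 1")
    p0 = 1 / (
        sum(a**k / factorial(k) for k in range(s))
        + a**s / (factorial(s) * (1 - rho))
    )
    lq = p0 * a**s * rho / (factorial(s) * (1 - rho) ** 2)
    wq = lq / lam        # Little's Law applied to the queue
    w = wq + 1 / mu      # add the mean service time
    l_sys = lam * w      # Little's Law applied to the whole system
    return QueueMetrics(rho, p0, lq, wq, w, l_sys)


m = mms_metrics(20, 6, 4)
print(f"W_q = {m.wq * 60:.1f} min, W = {m.w * 60:.1f} min, L = {m.l_sys:.2f}")
# W_q = 9.9 min, W = 19.9 min, L = 6.62
```

The cross-check from the text holds in code as well: `m.l_sys` equals `m.lq + lam / mu` up to floating-point rounding.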

Practical Applications

Call centers: Determine the minimum agent count keeping average wait below a 2-minute SLA target at 85% utilization.
Emergency departments: Model patient arrivals and triage staffing to reduce boarding times and diversion events.
Bank branches: Compare the performance of a pooled multi-teller queue against dedicated per-teller lines.
Network routers: Estimate packet queuing delay at a switching node under variable offered load.
Manufacturing cells: Dimension work-in-process buffers and machine staffing to meet throughput targets.
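For the call center case, a staffing search might look like the sketch below. It assumes the SLA is expressed as a cap on mean queue wait W_q and treats the 85% figure as an optional ceiling on utilization; both interpretations, along with the names `mean_wait` and `min_servers`, are assumptions made for illustration.

```python
from math import factorial


def mean_wait(lam: float, mu: float, s: int) -> float:
    """Mean wait in queue W_q; requires rho = lam / (s * mu) < 1."""
    a = lam / mu
    rho = a / s
    p0 = 1 / (
        sum(a**k / factorial(k) for k in range(s))
        + a**s / (factorial(s) * (1 - rho))
    )
    lq = p0 * a**s * rho / (factorial(s) * (1 - rho) ** 2)
    return lq / lam


def min_servers(lam: float, mu: float, max_wait: float,
                max_rho: float = 1.0, s_limit: int = 1000) -> int:
    """Smallest s with rho < 1, rho <= max_rho, and W_q <= max_wait."""
    s = int(lam // mu) + 1  # first server count that can be stable
    while s <= s_limit:
        rho = lam / (s * mu)
        if rho < 1 and rho <= max_rho and mean_wait(lam, mu, s) <= max_wait:
            return s
        s += 1
    raise ValueError("no server count up to s_limit meets the targets")


# 20 calls/hr, 6 calls/hr per agent, 2-minute target expressed in hours.
agents = min_servers(20, 6, max_wait=2 / 60, max_rho=0.85)
print(agents)  # 5
```

A linear search is enough here because W_q falls as s increases, so the first server count that satisfies the targets is the minimum.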

Interpreting Results

Utilization above 0.90 is a warning sign: L_q grows super-linearly in this region, so minor traffic spikes cause disproportionate congestion. The decision framework is direct — compare the per-period cost of one additional server against the economic cost of the predicted average wait (lost revenue, SLA penalties, customer churn). The Essentials of Stochastic Processes (Duke University, Durrett 2021) provides a rigorous proof of steady-state existence and convergence for M/M/s systems, confirming that these formulas are exact under the Poisson-exponential assumptions when ρ < 1.
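The steep growth of L_q near saturation can be seen by sweeping utilization for a fixed server count. The sketch below assumes s = 4 and μ = 6 from the running example and picks the utilization levels arbitrarily; `mean_queue_length` is a hypothetical helper name.

```python
from math import factorial


def mean_queue_length(lam: float, mu: float, s: int) -> float:
    """L_q for an M/M/s queue; requires rho = lam / (s * mu) < 1."""
    a = lam / mu
    rho = a / s
    if rho >= 1:
        raise ValueError("unstable system: rho must be below 1")
    p0 = 1 / (
        sum(a**k / factorial(k) for k in range(s))
        + a**s / (factorial(s) * (1 - rho))
    )
    return p0 * a**s * rho / (factorial(s) * (1 - rho) ** 2)


s, mu = 4, 6
levels = [0.70, 0.80, 0.90, 0.95]  # illustrative utilization targets
queue_lengths = [mean_queue_length(rho * s * mu, mu, s) for rho in levels]

for rho, lq in zip(levels, queue_lengths):
    print(f"rho = {rho:.2f}: L_q = {lq:.2f}")
```

In this sweep, moving from 90% to 95% utilization more than doubles the mean queue length, which is the kind of disproportionate response the warning above refers to.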

Reference

Frequently asked questions

What does the M/M/S notation mean in queueing theory?

The M/M/S label follows Kendall's notation: the first M means Markovian (Poisson) arrivals, the second M means Markovian (exponential) service times, and S is the number of parallel servers, written s in the formulas. The model assumes one shared queue feeding all s servers, infinite waiting room, and a first-come first-served discipline. It is one of the most widely used multi-server models in operations research, applicable to call centers, banks, clinics, and network nodes.

Why must server utilization ρ be strictly less than 1 for the M/M/S calculator to produce valid results?

When ρ = λ/(sμ) reaches or exceeds 1, the total arrival rate matches or surpasses combined service capacity, so the queue grows indefinitely and never stabilizes. All Erlang C formulas — P₀, L_q, W_q — assume the system has reached statistical equilibrium, which only exists when ρ < 1. For instance, if λ=25 calls/hr and sμ=24 calls/hr, the queue balloons without limit and no finite average wait time exists. Reducing λ or adding a server restores stability.

How does adding one more server change the average wait time in an M/M/S queue?

Adding one server reduces wait time nonlinearly, with the largest gains occurring near saturation. In a 4-server system with λ=20 and μ=6 (ρ≈0.833), the mean queue length L_q≈3.29, giving a wait of about 9.9 minutes. Adding a fifth server drops ρ to 0.667 and L_q to approximately 0.65 — a fivefold reduction in queue length. This demonstrates why the M/M/S calculator is essential: intuition underestimates the dramatic benefit of one extra server near full utilization.

What is Little's Law and how does the queueing theory calculator apply it?

Little's Law states that in any stable queuing system, the long-run average number of customers L equals the arrival rate λ multiplied by the average time W each customer spends in the system: L = λW. This relationship holds regardless of arrival distribution or service discipline. The M/M/S calculator applies it twice: first to convert queue length L_q into waiting time W_q = L_q/λ, and second to compute total system population L = λW from total sojourn time W = W_q + 1/μ, providing a consistent and verifiable set of steady-state metrics.

What is the difference between W_q and W in the M/M/S queueing model?

W_q is the average time a customer spends waiting in queue before a server becomes free, while W is the total average time in the system including both queueing and service. The relationship is W = W_q + 1/μ, where 1/μ is mean service duration. For example, with W_q≈9.9 minutes and service averaging 10 minutes (μ=6/hr), W≈19.9 minutes total. Service-level agreements typically target W_q specifically, since that component represents avoidable delay caused directly by insufficient server capacity.

When does the M/M/S model not apply, and what alternative models should be considered?

The M/M/S model requires Poisson arrivals, exponential service times, unlimited queue capacity, and homogeneous servers. It breaks down when service times follow a different distribution (use M/G/s or G/G/s), when the queue has a hard capacity limit (use M/M/s/K), when customers abandon after waiting too long (use M/M/s with balking or reneging extensions), or when arrivals are bursty and correlated (use batch-arrival or Markov-modulated models). For interconnected systems such as hospital patient pathways, Jackson network theory or discrete-event simulation tools provide more accurate predictions.