ServerScore

Master chapter — Hipparchus (v.H.1.x)

Every sub-test catalogued. No synthetic single number.

A node's ServerScore is the weighted sum of 18 sub-tests across CPU, RAM, Network, and Disk, plus a logarithmic Commitment contribution. Each sub-test carries its own weight, its own 100% baseline (where it earns full slice contribution), and — for Network — a chainweb-minimum threshold below which it contributes 0. There is no opaque score blob: every bench step renders its raw value, both baselines, normalised score, and contribution in the operator UI.

Dimension weights

Five top-level dimensions partition the ServerScore. CPU, RAM, and Network each split further into per-subtest slices; Disk is equal-thirds across 4Kr / 4Kw / 1Mseq; Commitment is a single logarithmic contribution against committed storage. Weights sum to 100% when every sub-test meets its 100% baseline.

Dimension	Sub-test	Weight	100% baseline	Chainweb-min
CPU		22%	Sum of 8 sub-tests below
↳	sysbench-ST	4%	5,000 events/sec	—
↳	sysbench-MT	4%	5,000 events/sec	—
↳	IPC	2%	1.5 instructions/cycle	—
↳	Blake2s	2%	600 MB/s	—
↳	secp256k1	3%	30,000 verify ops/sec	—
↳	Ed25519	2%	25,000 verify ops/sec	—
↳	pact-proxy	3%	1,000 ops/sec	—
↳	hash-verify	2%	4,000 ops/sec	—
RAM		19%	Sum of 2 sub-tests below
↳	RAM quantity	15%	8 GB committed	—
↳	RAM speed	4%	8,000 MB/s random-access	—
Network		20%	Sum of 5 sub-tests below
↳	download	8%	500 Mbps per region	50 Mbps
↳	latency	7%	30 ms per region	150 ms
↳	upload	3%	250 Mbps to hub sink	50 Mbps
↳	jitter	1%	5 ms	50 ms
↳	loss	1%	0 %	2 %
Disk		21%	Equal-thirds across 3 sub-tests (unchanged from prior formula)
↳	4Kr (4K random read)	7%	fio reference (unchanged from prior formula)	—
↳	4Kw (4K random write)	7%	fio reference (unchanged from prior formula)	—
↳	1Mseq (1M sequential)	7%	fio reference (unchanged from prior formula)	—
Commitment		18%	Logarithmic contribution; uncapped above baseline
Total		100%	at 100% baselines (Commitment uncapped)

Source of truth: lib/scoring-formula.ts exports the per-subtest WEIGHT_* and BASELINE_* constants used above. Recalibration of any baseline lands as a const edit there plus an update to this page in the same commit.
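As a rough illustration of how the table partitions the score, the weights can be written down and summed per dimension. This is only a sketch: the object name and shape below are hypothetical and are not the actual WEIGHT_* exports of lib/scoring-formula.ts, whose exact names are not listed on this page. Only the numbers come from the table.

```typescript
// Hypothetical layout: weights (in percent) are copied from the table above;
// the object name and shape are illustrative, not the real WEIGHT_* constants.
const WEIGHTS_PCT: Record<string, Record<string, number>> = {
  cpu: {
    "sysbench-ST": 4, "sysbench-MT": 4, IPC: 2, Blake2s: 2,
    secp256k1: 3, Ed25519: 2, "pact-proxy": 3, "hash-verify": 2,
  },
  ram: { quantity: 15, speed: 4 },
  network: { download: 8, latency: 7, upload: 3, jitter: 1, loss: 1 },
  disk: { "4Kr": 7, "4Kw": 7, "1Mseq": 7 },
  commitment: { commitment: 18 },
};

// Sum the sub-test weights of one dimension.
function dimensionWeight(dimension: string): number {
  return Object.values(WEIGHTS_PCT[dimension]).reduce((a, b) => a + b, 0);
}

// Sum every dimension; this is 100 when the table is internally consistent.
function totalWeight(): number {
  return Object.keys(WEIGHTS_PCT).reduce((sum, d) => sum + dimensionWeight(d), 0);
}
```

A check of this kind could sit next to a baseline recalibration so that a weight edit which breaks the 100% total is caught in the same commit.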

Chainweb-minimum baselines

Network sub-tests carry a paired chainweb-minimum threshold alongside their 100% baseline. The chainweb-min is the operate-but-degraded floor: a node that meets it can still join chainweb p2p reliably, but the sub-test contributes 0 to the ServerScore at exactly that value. Sub-test contribution scales linearly between chainweb-min (0% contribution) and 100% baseline (full slice weight).

Download: chainweb-minimum 50 Mbps per region (vs. 500 Mbps at 100%).
Latency: chainweb-minimum 150 ms per region (vs. 30 ms at 100%; lower is better).
Upload: chainweb-minimum 50 Mbps to hub sink (vs. 250 Mbps at 100%).
Jitter: chainweb-minimum 50 ms (vs. 5 ms at 100%; lower is better).
Loss: chainweb-minimum 2 % (vs. 0 % at 100%; lower is better).
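The linear scaling between chainweb-min and the 100% baseline can be sketched as below. The function name and signature are hypothetical. Clamping results beyond the 100% baseline to the full slice weight is an assumption, since this page only states the behaviour between the two thresholds and at the chainweb-min itself.

```typescript
// Linear scaling between the chainweb-min (0 contribution) and the 100%
// baseline (full slice weight). The same expression covers lower-is-better
// metrics, because (baseline - chainwebMin) is negative for them.
// Assumption: values beyond the baseline are clamped to the full weight.
function networkContribution(
  raw: number,
  chainwebMin: number,
  baseline: number,
  weightPct: number,
): number {
  const fraction = (raw - chainwebMin) / (baseline - chainwebMin);
  const clamped = Math.min(1, Math.max(0, fraction));
  return weightPct * clamped;
}

// Download at 275 Mbps is halfway between 50 and 500 Mbps: half of the 8% slice.
const downloadExample = networkContribution(275, 50, 500, 8);
// Latency at 90 ms is halfway between 150 and 30 ms: half of the 7% slice.
const latencyExample = networkContribution(90, 150, 30, 7);
```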

CPU and RAM 100% baselines (8 GB RAM, 8,000 MB/s RAM-random, 5,000 sysbench events/sec, 1.5 IPC, 600 MB/s Blake2s, 30,000 secp256k1 ops/sec, 25,000 Ed25519 ops/sec, 1,000 pact-proxy ops/sec, 4,000 hash-verify ops/sec) live in the same registry; CPU and RAM do not yet have chainweb-min thresholds because the hub enforces an integer-floor minimum (≥ 3 vCPU, ≥ 6 GB RAM) at install time rather than via scoring.

Reuse matrix — Prime vs Segregated

Each dimension can be measured fresh (the Prime path — a full-host bench on a standalone master or hypervisor) or inherited from a sibling/parent (the Segregated slice path — a container or VM sharing hardware with siblings). The matrix below names the source for each dimension under each path. Choosing that source is now a real decision made every time a bench is triggered: a gold-and-fitting upstream is reused — its bench skipped, not re-run — while a red, missing, or stale upstream is re-benched fresh (see the next section).

Dimension	Prime	Segregated slice
CPU	Fresh full-host sysbench + crypto bench.	Ghost-CPU cache lookup by (host, cpu_model, vcpu) triple. When a trigger fires it is a real bench-time decision: a cache entry that is gold (its stamped version equals the live greenlit version) and fits the slice triple is reused, so its bench is skipped, not re-run. A red, missing, or stale-version entry enqueues a fresh ghost-CPU bench in a stoa-bench container sized to the slice.
RAM	Host RAM total (committed_gb).	Slice RAM allocation × parent host's measured RAM speed — inherited per slice.
Network	Fresh per-bencher RTT-classified bench across the 8 reference regions.	Inherited from parent's bench. Network path is shared by every slice on the host; result is attributed, not re-measured.
Disk	Disk subscore = the drive entity score. A Prime container owns its whole drive, so this is the N=1 case: the full disk-weight-scaled, uncapped score for the drive, read from the one drive entity at read time. When a bench is triggered, a gold drive entity for that same physical drive is reused (its bench is skipped, not re-run); a red, missing, or stale-version drive is re-benched fresh.	Disk subscore = the drive entity score ÷ N, where N is the number of containers on that drive. Same drive entity, same ÷N divisor as before; only the source changed: every slice now reads the one drive entity at read time instead of a copied value. Reuse is an executable bench-time decision: a gold drive that fits (the same physical drive) is reused and its bench skipped; a red, missing, or stale-version drive is re-benched fresh. Skipping a re-run never copies a value.
Commitment	Log-curve from committed_gb against the per-host minimum.	Log-curve from committed_gb against the per-slice minimum.

Disk as a benchmark entity

A drive is a first-class benchmark entity, surfaced the same way a container entity is: it carries its own Score, its own Stamped version, a Last-benched timestamp, a Time-since value, and a Status. The disk math itself is unchanged: the equal-thirds 4Kr / 4Kw / 1Mseq weighting in the table above still applies. What changed is where the disk figure comes from: there is now one drive entity, and every disk-subscore reader reads it at the moment it renders rather than from a value copied into a container’s saved breakdown.

One source — headline equals detail. The disk number shown in a container’s headline score and the disk number shown in its “details” breakdown are always the same number for the same drive. This removes by construction the prior class of divergence where one panel showed 1.738291 for a drive while another showed 0.366113 for the very same drive. There is no resync step and no operator action — both figures read the one drive entity, so they cannot disagree.
N = 1 Prime / ÷ N slice equivalence. A Prime container that owns its whole drive gets exactly the drive entity’s score (the N = 1 case). A Segregated slice gets that score divided by the number of containers on the drive. It is one formula whether the drive is dedicated or shared: the shared-drive ÷ N split is the same arithmetic as before; only the source (one drive entity) changed.
Greenlight & rebench-drift discipline. A drive bench carries a bench-version stamp on exactly the same terms as a container. A bench run under a greenlit version stamps that version and grades gold; an absent or non-live stamp grades red and is retestable; an inherited or old stamp also grades red; an in-between version grades amber. Pre-existing drive benchmarks that predate the stamp read as red/retestable — there is no historical backfill. A rebench of any container on the drive, or a direct disk bench, refreshes and re-stamps the drive entity, and every reader then sees the refreshed value. The Status follows the same classifier containers use, with no disk-specific rule.
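The one-formula point above can be shown in a few lines. This is a minimal sketch with hypothetical names; the only rule taken from this page is that a container's disk subscore is the drive entity's score divided by the number of containers on that drive.

```typescript
// Disk subscore for one container: the drive entity's score divided by the
// number of containers on that drive. N = 1 is the Prime (whole-drive) case.
// Function and parameter names are hypothetical.
function diskSubscore(driveEntityScore: number, containersOnDrive: number): number {
  if (!Number.isInteger(containersOnDrive) || containersOnDrive < 1) {
    throw new RangeError("containersOnDrive must be a positive integer");
  }
  return driveEntityScore / containersOnDrive;
}

// Prime: the container gets the drive entity's score unchanged.
const primeDisk = diskSubscore(1.738291, 1);
// Shared drive with three containers: each reads the same entity and divides by 3.
const sharedDisk = diskSubscore(1.5, 3);
```

Because both the headline and the details panel would call the same function on the same drive entity, they cannot show different disk figures for one drive.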

See the Demeter release notes for the full operator-facing diff of the disk-as-an-entity change.

Greenlight-gated bench reuse & bulk orchestration

Reuse is now a real decision made the moment a bench is triggered, not just a read-time accounting note. When a trigger fires, the upstream it would inherit from — a drive entity for disk, a ghost-CPU cache entry for CPU — is checked: if it is gold (its stamped bench-version equals the live greenlit version) and it fits the consumer, that upstream is reused — its bench is skipped, not re-run. If it is red, missing, or stamped under a stale version, it is re-benched fresh. “Fits” means the same physical drive for a disk, and the same host × CPU model × vCPU-count ghost-CPU triple for a CPU. This is the same “red ⇒ must rebench” discipline disks and slices already followed at read time, now also applied at bench-execution time. Reuse is a skip of execution only — no upstream value is ever copied into a consumer; every reader still reads the shared drive entity or cache entry at read time, so the single-source guarantee is unchanged.
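The gold-and-fits check described above might look roughly like the sketch below. The type, field, and function names are hypothetical, and the fit key is modelled as an opaque string (a physical drive identifier for disk, or the host × CPU model × vCPU-count triple for CPU). The version strings in the example are placeholders, not real bench versions.

```typescript
// Hypothetical shape of an upstream: a drive entity or a ghost-CPU cache entry.
type Upstream = {
  stampedVersion: string | null; // null models an absent stamp
  fitKey: string;                // drive id, or "host|cpu_model|vcpu" for CPU
};

// Returns true when the upstream bench can be skipped (reused), false when a
// fresh bench must run. Only execution is skipped; no value is copied.
function shouldReuse(
  upstream: Upstream | undefined,
  liveGreenlitVersion: string,
  consumerFitKey: string,
): boolean {
  if (upstream === undefined) return false; // missing upstream: re-bench
  const gold = upstream.stampedVersion === liveGreenlitVersion;
  const fits = upstream.fitKey === consumerFitKey;
  return gold && fits; // stale, absent, or non-fitting: re-bench
}

// Placeholder versions "v2" (live) and "v1" (stale).
const reuseGold = shouldReuse({ stampedVersion: "v2", fitKey: "drive-a" }, "v2", "drive-a");
const reuseStale = shouldReuse({ stampedVersion: "v1", fitKey: "drive-a" }, "v2", "drive-a");
```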

Reuse-aware by default. Every bench trigger — single-entity and bulk — now defaults to reuse-aware: a gold-and-fitting upstream is skipped automatically. A single explicit “force fresh even if gold” toggle, present on every trigger and defaulting off, overrides the gold-skip and re-benches regardless of gold status. The old 30-day recency window is now only a secondary safety; the primary gate is version-gold, not age.
Four “Bench all” buttons. Bench all Disks, Bench all Virtual Container Entities, Bench all Real Container Entities, and a master Bench all Entities. Every bulk run is wave-ordered Disk → Virtual → Real so a downstream wave reuses the gold results the upstream wave just freshened in the same run. Each wave skips gold entities and benches only red or missing ones (unless that button’s force-fresh toggle is on, which benches every entity of the target kind). The per-kind buttons auto-cascade only the upstream waves they actually need — Bench all Disks runs the Disk wave only; Bench all Virtual runs Disk (red deps only) then Virtual; Bench all Real runs Disk then Virtual (both red deps only) then Real; the master is the full reconcile of all three.
Bulk is Ancient-admin only. The bulk “bench all” buttons are restricted to Ancient-admin operators and audit-logged, with per-kind and master runs distinguishable in the trail. The per-entity triggers are unchanged except that the universal force-fresh toggle now defaults off (reuse-aware) instead of always forcing a fresh re-measure.
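The wave ordering of the four buttons can be sketched as a small planner. All names here are hypothetical. It also assumes that a button's force-fresh toggle applies only to the target kind, with cascaded upstream waves staying limited to red dependencies, which is one reading of the description above.

```typescript
type Kind = "disk" | "virtual" | "real";
type Button = "disks" | "virtual" | "real" | "all";
type Wave = { kind: Kind; scope: "red-or-missing" | "red-deps-only" | "every-entity" };

// Build the ordered waves for one bulk button. Waves always run
// Disk -> Virtual -> Real and stop at the button's target kind.
function planWaves(button: Button, forceFresh = false): Wave[] {
  const order: Kind[] = ["disk", "virtual", "real"];
  // null means the master button: every kind is a target.
  const target: Kind | null =
    button === "all" ? null : button === "disks" ? "disk" : button;
  const last = target === null ? order.length - 1 : order.indexOf(target);
  return order.slice(0, last + 1).map((kind): Wave => {
    const isTarget = target === null || kind === target;
    // Cascaded upstream waves only touch red dependencies (assumption: this
    // holds even when force-fresh is on for the target kind).
    if (!isTarget) return { kind, scope: "red-deps-only" };
    return { kind, scope: forceFresh ? "every-entity" : "red-or-missing" };
  });
}

// "Bench all Real" runs Disk and Virtual for red deps only, then Real.
const realPlan = planWaves("real");
```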

See the Triptolemus release notes for the full operator-facing diff of greenlight-gated reuse and the bulk orchestration buttons.

Red → no-stoicism economic gate

A red score is not just a colour. While a node is red it accrues exactly zero stoicism: both the per-tick scoring accrual path and the lower-level tip-accrual path refuse to add stoicism for a red node, and a red Tunnelee zeroes the whole accrual including the Tunneler’s fee slice. This is the same “red ⇒ you are not earning until you re-bench” discipline the greenlight model always implied, now enforced economically and not just shown cosmetically. A red node that is skipped for accrual emits a dedicated accrual-skip event so the operator can see they stopped earning because the score is red and the remedy is to re-bench.

One single-source red determination. A node is red if any contributing sub-component is red — a visible bench-version mismatch (minor, major, or unknown), a red disk entity, or a red/failed/stale CPU, RAM, or network contribution. A missing or pre-versioning stamp is treated as red (worst-case not-current until a re-bench proves otherwise), and a never-benched node resolves to red — not a separate fourth state. That one verdict is consumed by every surface — the economic gate, the ServerScoreCard headline, the Nodes page, the Stoicism page, and the Argus and Triton surfaces — so they always reach the same red/gold answer for the same node at the same moment.
Gold requires every sub-component current: no partial gold. Every contributing sub-component must be current/greenlit, and any single red component is decisive for the whole node. A mostly-gold node with one red component is red and earns zero; there is no “close enough” accrual.
Forward-only — no clawback. The gate is strictly forward-only. When a node turns red, future accrual stops, but stoicism already accrued (pending or current) and already redeemed is never reduced, reversed, recomputed, or clawed back. A node that returns to gold (re-benched, current again) resumes accrual on the very next tick with no manual step beyond the re-bench; the red period is a permanent gap with no retroactive backfill.
Red is shown red, and a red Earn Score is 0.0. The ServerScoreCard headline renders red when any sub-component is red, the Nodes-page ServerScore is gold/red by the same determination, a red node’s Earn Score is forced to 0.0 in its usual yellow tone, and the Stoicism leaderboard shows red for a red node. The displayed historical totals do not change in value — the red signal is a status, not a restatement of accrued or redeemed history.
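The single red determination and the forward-only gate can be illustrated as follows. This is a sketch under stated assumptions: the names are hypothetical, each sub-component is reduced to a gold/red status with `undefined` standing in for a missing or pre-versioning stamp, and any status other than gold is treated as red for the node verdict.

```typescript
type Status = "gold" | "red";

// One verdict for the whole node: gold only if every contributing
// sub-component is gold. An empty list models a never-benched node.
function nodeVerdict(components: Array<Status | undefined>): Status {
  if (components.length === 0) return "red";
  return components.every((s) => s === "gold") ? "gold" : "red";
}

// Forward-only gate: a red tick adds nothing, and the existing balance is
// never reduced. A gold tick adds the tick amount.
function accrueTick(balance: number, tickAmount: number, verdict: Status): number {
  return verdict === "red" ? balance : balance + tickAmount;
}

// A mostly-gold node with one red component is red and earns zero this tick.
const mostlyGold = nodeVerdict(["gold", "gold", "red", "gold"]);
const afterRedTick = accrueTick(10, 2, mostlyGold);
```

Every surface would consume the result of the same verdict function rather than recomputing its own, which is what keeps the headline, the Nodes page, and the economic gate in agreement.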

See the Nemesis release notes for the full operator-facing diff of the red→no-stoicism economic gate and the red-visibility surfaces.

Ghost-CPU cache rule

Segregated slices on the same physical host that share the same CPU model and vCPU count get the same CPU subscore. The cache key is the triple (host_node_id, cpu_model, vcpu_count). Cache behaviour:

Cache hit — the slice immediately renders an inherited per-step row referencing the prior ghost-CPU bench result. No fresh container is launched.
Cache miss — the slice enqueues a fresh stoa-bench container sized to the slice’s vCPU count. The result stamps a new cache row keyed by the triple; future slices on the same triple read it.
forceFullRebench invalidation — the “Force-fresh fleet rebench” action on /hub/fleet-maintenance drops every cache row before re-running the wave. Every slice on every host re-benches under the new formula.
CPU model auto-invalidation — when a host’s reported cpu_model changes (e.g. after a hardware swap), the cache rows keyed by the old model are no longer matched. The new triple misses; a fresh ghost-CPU bench runs automatically.

The cache is per-host: two physical hosts running identical CPUs do not share rows because the host_node_id column differs. This is intentional — thermal envelope, BIOS settings, and background load on the host can shift the ghost-CPU bench result even on nominally identical silicon.
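The cache rule can be sketched with a map keyed by the triple. The class and method names are hypothetical and the sketch is in-memory only; the key fields mirror (host_node_id, cpu_model, vcpu_count) from the text above.

```typescript
type Triple = { hostNodeId: string; cpuModel: string; vcpuCount: number };

// Minimal in-memory sketch of the ghost-CPU cache. Names are hypothetical.
class GhostCpuCache {
  private rows = new Map<string, number>();

  // Serialise the triple so that all three fields take part in the match.
  private static key(t: Triple): string {
    return JSON.stringify([t.hostNodeId, t.cpuModel, t.vcpuCount]);
  }

  // Cache hit returns the stored CPU subscore; a miss returns undefined,
  // in which case the caller would enqueue a fresh bench.
  lookup(t: Triple): number | undefined {
    return this.rows.get(GhostCpuCache.key(t));
  }

  // Stamp a row after a fresh bench so later slices on the same triple hit.
  store(t: Triple, cpuSubscore: number): void {
    this.rows.set(GhostCpuCache.key(t), cpuSubscore);
  }

  // Models the force-fresh fleet rebench: drop every row.
  dropAll(): void {
    this.rows.clear();
  }
}

const cache = new GhostCpuCache();
cache.store({ hostNodeId: "host-1", cpuModel: "model-x", vcpuCount: 4 }, 0.9);
```

Because the CPU model is part of the key, a hardware swap needs no explicit invalidation in this sketch: the new triple simply misses.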

Pre-flight distance classification

Network sub-tests run against eight reference Linode regions. Before the full download bench fires, a lightweight TCP-connect RTT pass classifies each region into one of four distance buckets. The bucket determines what download payload (if any) is fetched.

Bucket	RTT range	Download behaviour
near	≤ 80 ms	Full 100 MB payload fetched.
medium	> 80 – 200 ms	Full 100 MB payload fetched.
distant	> 200 ms	25 MiB byte-range fetch only.
unreachable	timeout	Download skipped. Per-region download contribution scores 0 for that slice.

The pre-flight RTT pass adds ~3-4 s to total bench wall-time but keeps the download bench representative — a slow link to a single far region cannot dominate the per-region average, and an unreachable region is honestly recorded as a zero rather than artificially zeroed by a timeout mid-fetch.
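The bucket table can be expressed as a small classifier. The function names and the representation of a timeout as `null` are assumptions for illustration. Thresholds come from the table: an RTT of exactly 80 ms is near, and exactly 200 ms is medium because distant starts above 200 ms.

```typescript
type Bucket = "near" | "medium" | "distant" | "unreachable";
type DownloadPlan = "full-payload" | "byte-range" | "skip";

// Classify one region from its TCP-connect RTT in milliseconds.
// null models a timeout (hypothetical representation).
function classifyRtt(rttMs: number | null): Bucket {
  if (rttMs === null) return "unreachable";
  if (rttMs <= 80) return "near";
  if (rttMs <= 200) return "medium";
  return "distant";
}

// Map a bucket to the download behaviour listed in the table.
function downloadPlan(bucket: Bucket): DownloadPlan {
  switch (bucket) {
    case "near":
    case "medium":
      return "full-payload"; // full payload fetched
    case "distant":
      return "byte-range";   // reduced byte-range fetch only
    case "unreachable":
      return "skip";         // download skipped, region scores 0
  }
}

const farRegionPlan = downloadPlan(classifyRtt(240));
```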