ServerScore
Master chapter — Hipparchus (v.H.1.x)
Every sub-test catalogued. No synthetic single number.
A node's ServerScore is the weighted sum of 18 sub-tests across CPU, RAM, Network, and Disk, plus a logarithmic Commitment contribution. Each sub-test carries its own weight, its own 100% baseline (where it earns full slice contribution), and — for Network — a chainweb-minimum threshold below which it contributes 0. There is no opaque score blob: every bench step renders its raw value, both baselines, normalised score, and contribution in the operator UI.
Dimension weights
Five top-level dimensions partition the ServerScore. CPU, RAM, and Network each split further into per-subtest slices; Disk is equal-thirds across 4Kr / 4Kw / 1Mseq; Commitment is a single logarithmic contribution against committed storage. Weights sum to 100% when every sub-test meets its 100% baseline.
| Dimension | Sub-test | Weight | 100% baseline | Chainweb-min |
|---|---|---|---|---|
| CPU | | 22% | Sum of 8 sub-tests below | |
| ↳ | sysbench-ST | 4% | 5,000 events/sec | — |
| ↳ | sysbench-MT | 4% | 5,000 events/sec | — |
| ↳ | IPC | 2% | 1.5 instructions/cycle | — |
| ↳ | Blake2s | 2% | 600 MB/s | — |
| ↳ | secp256k1 | 3% | 30,000 verify ops/sec | — |
| ↳ | Ed25519 | 2% | 25,000 verify ops/sec | — |
| ↳ | pact-proxy | 3% | 1,000 ops/sec | — |
| ↳ | hash-verify | 2% | 4,000 ops/sec | — |
| RAM | | 19% | Sum of 2 sub-tests below | |
| ↳ | RAM quantity | 15% | 8 GB committed | — |
| ↳ | RAM speed | 4% | 8,000 MB/s random-access | — |
| Network | | 20% | Sum of 5 sub-tests below | |
| ↳ | download | 8% | 500 Mbps per region | 50 Mbps |
| ↳ | latency | 7% | 30 ms per region | 150 ms |
| ↳ | upload | 3% | 250 Mbps to hub sink | 50 Mbps |
| ↳ | jitter | 1% | 5 ms | 50 ms |
| ↳ | loss | 1% | 0 % | 2 % |
| Disk | | 21% | Equal-thirds across 3 sub-tests (unchanged from prior formula) | |
| ↳ | 4Kr (4K random read) | 7% | fio reference (unchanged from prior formula) | — |
| ↳ | 4Kw (4K random write) | 7% | fio reference (unchanged from prior formula) | — |
| ↳ | 1Mseq (1M sequential) | 7% | fio reference (unchanged from prior formula) | — |
| Commitment | | 18% | Logarithmic contribution; uncapped above baseline | |
| Total | | 100% | at 100% baselines (Commitment uncapped) | |
Source of truth: lib/scoring-formula.ts exports the per-subtest WEIGHT_* and BASELINE_* constants used above. Recalibration of any baseline lands as a const edit there plus an update to this page in the same commit.
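As a quick sanity check on the table, the slice weights can be summed per dimension and overall. The sketch below is illustrative only: the `WEIGHTS` object and its key names are hypothetical and do not reproduce the actual WEIGHT_* exports in lib/scoring-formula.ts; only the numeric values are taken from the table.

```typescript
// Hypothetical mirror of the weight table (percentage points).
// The key names are assumptions; the real constants live in lib/scoring-formula.ts.
const WEIGHTS: Record<string, Record<string, number>> = {
  cpu: {
    sysbenchST: 4, sysbenchMT: 4, ipc: 2, blake2s: 2,
    secp256k1: 3, ed25519: 2, pactProxy: 3, hashVerify: 2,
  },
  ram: { quantity: 15, speed: 4 },
  network: { download: 8, latency: 7, upload: 3, jitter: 1, loss: 1 },
  disk: { read4k: 7, write4k: 7, seq1m: 7 },
  commitment: { commitment: 18 },
};

// Sum of the slice weights inside one dimension.
function dimensionWeight(dimension: string): number {
  const slices = WEIGHTS[dimension];
  let sum = 0;
  for (const name of Object.keys(slices)) {
    sum += slices[name];
  }
  return sum;
}

// Sum across all five dimensions; expected to be 100 at the 100% baselines.
function totalWeight(): number {
  let sum = 0;
  for (const dimension of Object.keys(WEIGHTS)) {
    sum += dimensionWeight(dimension);
  }
  return sum;
}
```

A check of this shape could run alongside any recalibration commit so that a weight edit that breaks the 100% total is caught early.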
Chainweb-minimum baselines
Network sub-tests carry a paired chainweb-minimum threshold alongside their 100% baseline. The chainweb-min is the operate-but-degraded floor: a node that meets it can still join chainweb p2p reliably, but the sub-test contributes 0 to the ServerScore at exactly that value. Sub-test contribution scales linearly between chainweb-min (0% contribution) and 100% baseline (full slice weight).
- Download: chainweb-minimum 50 Mbps per region (vs. 500 Mbps at 100%).
- Latency: chainweb-minimum 150 ms per region (vs. 30 ms at 100%; lower is better).
- Upload: chainweb-minimum 50 Mbps to hub sink (vs. 250 Mbps at 100%).
- Jitter: chainweb-minimum 50 ms (vs. 5 ms at 100%; lower is better).
- Loss: chainweb-minimum 2 % (vs. 0 % at 100%; lower is better).
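The linear scaling between the two thresholds can be sketched as below. This is a minimal illustration, not the hub's implementation: the `NETWORK` object and `networkContribution` function are hypothetical names, and clamping at the full slice weight for values better than the 100% baseline is an assumption (the text above only specifies the range between the two thresholds and the zero floor).

```typescript
interface NetworkSubtest {
  weight: number;      // slice weight in percentage points
  baseline: number;    // value that earns the full slice weight
  chainwebMin: number; // value at (or worse than) which the slice earns 0
}

// Thresholds copied from the table; the object itself is a hypothetical shape.
const NETWORK: Record<string, NetworkSubtest> = {
  download: { weight: 8, baseline: 500, chainwebMin: 50 },
  latency: { weight: 7, baseline: 30, chainwebMin: 150 },
  upload: { weight: 3, baseline: 250, chainwebMin: 50 },
  jitter: { weight: 1, baseline: 5, chainwebMin: 50 },
  loss: { weight: 1, baseline: 0, chainwebMin: 2 },
};

function networkContribution(name: string, raw: number): number {
  const t = NETWORK[name];
  // One formula covers both directions: for lower-is-better sub-tests the
  // baseline is below the chainweb-min, so numerator and denominator both
  // change sign and the fraction still runs from 0 (floor) to 1 (baseline).
  const fraction = (raw - t.chainwebMin) / (t.baseline - t.chainwebMin);
  // Assumption: values beyond the 100% baseline are capped at the full slice.
  const clamped = Math.min(1, Math.max(0, fraction));
  return clamped * t.weight;
}
```

Under these assumptions, a 275 Mbps download sits halfway between 50 and 500 Mbps and earns 4 of its 8 points, and a 90 ms latency sits halfway between 150 and 30 ms and earns 3.5 of its 7 points.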
CPU and RAM 100% baselines (8 GB RAM, 8,000 MB/s RAM-random, 5,000 sysbench events/sec, 1.5 IPC, 600 MB/s Blake2s, 30,000 secp256k1 ops/sec, 25,000 Ed25519 ops/sec, 1,000 pact-proxy ops/sec, 4,000 hash-verify ops/sec) live in the same registry; CPU and RAM do not yet have chainweb-min thresholds because the hub enforces an integer-floor minimum (≥ 3 vCPU, ≥ 6 GB RAM) at install time rather than via scoring.
Reuse matrix — Prime vs Segregated
Each dimension can be measured fresh (the Prime path — a full-host bench on a standalone master or hypervisor) or inherited from a sibling/parent (the Segregated slice path — a container or VM sharing hardware with siblings). The matrix below names the source for each dimension under each path. Choosing that source is now a real decision made every time a bench is triggered: a gold-and-fitting upstream is reused — its bench skipped, not re-run — while a red, missing, or stale upstream is re-benched fresh (see the next section).
| Dimension | Prime | Segregated slice |
|---|---|---|
| CPU | Fresh full-host sysbench + crypto bench. | Ghost-CPU cache lookup by (host, cpu_model, vcpu) triple. When a trigger fires it is a real bench-time decision: a cache entry that is gold (its stamped version equals the live greenlit version) and fits the slice triple is reused — its bench is skipped, not re-run. A red, missing, or stale-version entry enqueues a fresh ghost-CPU bench in a stoa-bench container sized to the slice. |
| RAM | Host RAM total (committed_gb). | Slice RAM allocation × parent host's measured RAM speed — inherited per slice. |
| Network | Fresh per-bencher RTT-classified bench across the 8 reference regions. | Inherited from parent's bench. Network path is shared by every slice on the host; result is attributed, not re-measured. |
| Disk | Disk subscore = the drive entity score. A Prime container owns its whole drive, so this is the N=1 case: the full disk-weight-scaled, uncapped score for the drive, read from the one drive entity at read time. When a bench is triggered, a gold drive on that same physical drive is reused — its bench is skipped, not re-run; a red, missing, or stale-version drive is re-benched fresh. | Disk subscore = the drive entity score ÷ N, where N is the number of containers on that drive. Same drive entity, same ÷N divisor as before — only the source changed: every slice now reads the one drive entity at read time, no copied value. The reuse is an executable bench-time decision now: a gold drive that fits (the same physical drive) is reused and its bench skipped; a red, missing, or stale-version drive is re-benched fresh. Skipping a re-run never copies a value — every slice still reads the one drive entity at read time. |
| Commitment | Log-curve from committed_gb against the per-host minimum. | Log-curve from committed_gb against the per-slice minimum. |
Disk as a benchmark entity
A drive is a first-class benchmark entity, surfaced the same way a container entity is: it carries its own Score, its own Stamped version, a Last-benched timestamp, a Time-since value, and a Status. The disk math itself is unchanged — the equal-thirds 4Kr / 4Kw / 1Mseq weighting in the table above still applies. What changed is where the disk figure comes from: there is now one drive entity, and every disk-subscore reader reads it at the moment it renders rather than from a value copied into a container’s saved breakdown.
- One source — headline equals detail. The disk number shown in a container’s headline score and the disk number shown in its “details” breakdown are always the same number for the same drive. This removes by construction the prior class of divergence where one panel showed 1.738291 for a drive while another showed 0.366113 for the very same drive. There is no resync step and no operator action — both figures read the one drive entity, so they cannot disagree.
- N = 1 Prime / ÷ N slave equivalence. A Prime container that owns its whole drive gets exactly the drive entity’s score (the N = 1 case). A slave container gets that score divided by the number of containers on the drive. It is one formula whether the drive is dedicated or shared — the shared-drive ÷ N split is the same arithmetic as before; only the source (one drive entity) changed.
- Greenlight & rebench-drift discipline. A drive bench carries a bench-version stamp on exactly the same terms as a container. A bench run under a greenlit version stamps that version and grades gold; an absent or non-live stamp grades red and is retestable; an inherited or old stamp also grades red; an in-between version grades amber. Pre-existing drive benchmarks that predate the stamp read as red/retestable — there is no historical backfill. A rebench of any container on the drive, or a direct disk bench, refreshes and re-stamps the drive entity, and every reader then sees the refreshed value. The Status follows the same classifier containers use, with no disk-specific rule.
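The single-source read and the ÷ N split can be illustrated with a small sketch. Everything here is hypothetical (the `drives` registry, the `DriveEntity` shape, the reader function names, and the example score); it only shows the idea that both panels derive the disk figure from the one drive entity at read time instead of storing a copy.

```typescript
interface DriveEntity {
  score: number;          // drive entity score (illustrative value below)
  containerIds: string[]; // containers placed on this physical drive
}

// Hypothetical in-memory registry standing in for the drive entity store.
const drives: Record<string, DriveEntity> = {
  "drive-a": { score: 2.0, containerIds: ["c1", "c2", "c3", "c4"] },
  "drive-b": { score: 2.0, containerIds: ["prime-1"] },
};

// Disk subscore = drive entity score / N, computed at read time.
// N = 1 (a Prime container owning its drive) falls out of the same formula.
function diskSubscore(driveId: string): number {
  const drive = drives[driveId];
  if (!drive || drive.containerIds.length === 0) {
    throw new Error("unknown or empty drive: " + driveId);
  }
  return drive.score / drive.containerIds.length;
}

// Both surfaces call the same reader, so they cannot show different numbers.
function headlineDisk(driveId: string): number {
  return diskSubscore(driveId);
}

function detailDisk(driveId: string): number {
  return diskSubscore(driveId);
}
```

Because nothing is copied, a rebench that updates the drive's `score` is visible to every reader on the next render without a resync step.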
See the Demeter release notes for the full operator-facing diff of the disk-as-an-entity change.
Greenlight-gated bench reuse & bulk orchestration
Reuse is now a real decision made the moment a bench is triggered, not just a read-time accounting note. When a trigger fires, the upstream it would inherit from — a drive entity for disk, a ghost-CPU cache entry for CPU — is checked: if it is gold (its stamped bench-version equals the live greenlit version) and it fits the consumer, that upstream is reused — its bench is skipped, not re-run. If it is red, missing, or stamped under a stale version, it is re-benched fresh. “Fits” means the same physical drive for a disk, and the same host × CPU model × vCPU-count ghost-CPU triple for a CPU. This is the same “red ⇒ must rebench” discipline disks and slices already followed at read time, now also applied at bench-execution time. Reuse is a skip of execution only — no upstream value is ever copied into a consumer; every reader still reads the shared drive entity or cache entry at read time, so the single-source guarantee is unchanged.
- Reuse-aware by default. Every bench trigger — single-entity and bulk — now defaults to reuse-aware: a gold-and-fitting upstream is skipped automatically. A single explicit “force fresh even if gold” toggle, present on every trigger and defaulting off, overrides the gold-skip and re-benches regardless of gold status. The old 30-day recency window is now only a secondary safety; the primary gate is version-gold, not age.
- Four “Bench all” buttons. Bench all Disks, Bench all Virtual Container Entities, Bench all Real Container Entities, and a master Bench all Entities. Every bulk run is wave-ordered Disk → Virtual → Real so a downstream wave reuses the gold results the upstream wave just freshened in the same run. Each wave skips gold entities and benches only red or missing ones (unless that button’s force-fresh toggle is on, which benches every entity of the target kind). The per-kind buttons auto-cascade only the upstream waves they actually need — Bench all Disks runs the Disk wave only; Bench all Virtual runs Disk (red deps only) then Virtual; Bench all Real runs Disk then Virtual (both red deps only) then Real; the master is the full reconcile of all three.
- Bulk is Ancient-admin only. The bulk “bench all” buttons are restricted to Ancient-admin operators and audit-logged, with per-kind and master runs distinguishable in the trail. The per-entity triggers are unchanged except that the universal force-fresh toggle now defaults off (reuse-aware) instead of always forcing a fresh re-measure.
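The trigger-time decision and the wave cascade described above can be sketched as two small functions. This is an illustration under stated assumptions: `Upstream`, `shouldBench`, `Wave`, and `planWaves` are hypothetical names, and the 30-day recency safety is deliberately left out.

```typescript
type Kind = "disk" | "virtual" | "real";
type Target = Kind | "all";

interface Upstream {
  stampedVersion: string | null; // bench-version stamp, null if never stamped
  fits: boolean;                 // same physical drive, or same ghost-CPU triple
}

// Returns true when a fresh bench must run, false when the upstream is reused.
function shouldBench(
  upstream: Upstream | undefined,
  liveVersion: string,
  forceFresh: boolean = false, // the "force fresh even if gold" toggle, default off
): boolean {
  if (forceFresh) return true;
  if (!upstream) return true; // missing upstream
  const gold = upstream.stampedVersion === liveVersion;
  return !(gold && upstream.fits); // red, stale, or non-fitting: re-bench
}

interface Wave {
  kind: Kind;
  redDepsOnly: boolean; // true for upstream waves cascaded only to refresh red deps
}

// Waves always run Disk, then Virtual, then Real; a per-kind button only
// cascades the upstream waves it needs, the master button runs all three.
function planWaves(target: Target): Wave[] {
  const order: Kind[] = ["disk", "virtual", "real"];
  if (target === "all") {
    return order.map((kind) => ({ kind: kind, redDepsOnly: false }));
  }
  const last = order.indexOf(target);
  return order
    .slice(0, last + 1)
    .map((kind, i) => ({ kind: kind, redDepsOnly: i < last }));
}
```

For example, `planWaves("virtual")` yields a Disk wave limited to red dependencies followed by the Virtual wave, matching the cascade for Bench all Virtual.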
See the Triptolemus release notes for the full operator-facing diff of greenlight-gated reuse and the bulk orchestration buttons.
Red → no-stoicism economic gate
A red score is not just a colour. While a node is red it accrues exactly zero stoicism: both the per-tick scoring accrual path and the lower-level tip-accrual path refuse to add stoicism for a red node, and a red Tunnelee zeroes the whole accrual including the Tunneler’s fee slice. This is the same “red ⇒ you are not earning until you re-bench” discipline the greenlight model always implied, now enforced economically and not just shown cosmetically. A red node that is skipped for accrual emits a dedicated accrual-skip event so the operator can see they stopped earning because the score is red and the remedy is to re-bench.
- One single-source red determination. A node is red if any contributing sub-component is red — a visible bench-version mismatch (minor, major, or unknown), a red disk entity, or a red/failed/stale CPU, RAM, or network contribution. A missing or pre-versioning stamp is treated as red (worst-case not-current until a re-bench proves otherwise), and a never-benched node resolves to red — not a separate fourth state. That one verdict is consumed by every surface — the economic gate, the ServerScoreCard headline, the Nodes page, the Stoicism page, and the Argus and Triton surfaces — so they always reach the same red/gold answer for the same node at the same moment.
- Gold requires every sub-component current — no partial gold. Gold requires every contributing sub-component to be current/greenlit, and any single red component is decisive for the whole node. A mostly-gold node with one red component is red and earns zero — there is no “close enough” accrual.
- Forward-only — no clawback. The gate is strictly forward-only. When a node turns red, future accrual stops, but stoicism already accrued (pending or current) and already redeemed is never reduced, reversed, recomputed, or clawed back. A node that returns to gold (re-benched, current again) resumes accrual on the very next tick with no manual step beyond the re-bench; the red period is a permanent gap with no retroactive backfill.
- Red is shown red, and a red Earn Score is 0.0. The ServerScoreCard headline renders red when any sub-component is red, the Nodes-page ServerScore is gold/red by the same determination, a red node’s Earn Score is forced to 0.0 in its usual yellow tone, and the Stoicism leaderboard shows red for a red node. The displayed historical totals do not change in value — the red signal is a status, not a restatement of accrued or redeemed history.
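The rules in this list reduce to one verdict function plus a forward-only accrual step. The sketch below is hypothetical throughout (`NodeComponents`, `nodeStatus`, `Ledger`, and `accrueTick` are illustrative names, and each component is simplified to a gold/red flag); it is meant to show the shape of the gate, not the hub's accrual code.

```typescript
type Status = "gold" | "red";

// Simplified view of the contributing sub-components of one node.
interface NodeComponents {
  benchVersion: Status;
  disk: Status;
  cpu: Status;
  ram: Status;
  network: Status;
}

// One verdict for every surface: any red component makes the node red,
// and a never-benched node (no components at all) also resolves to red.
function nodeStatus(c: NodeComponents | undefined): Status {
  if (!c) return "red";
  const parts: Status[] = [c.benchVersion, c.disk, c.cpu, c.ram, c.network];
  return parts.every((p) => p === "gold") ? "gold" : "red";
}

interface Ledger {
  accrued: number; // stoicism accrued so far
}

// Forward-only gate: a red node skips this tick, but the existing total is
// never reduced; a gold node accrues as usual.
function accrueTick(
  ledger: Ledger,
  c: NodeComponents | undefined,
  amount: number,
): { ledger: Ledger; skipped: boolean } {
  if (nodeStatus(c) === "red") {
    return { ledger: { accrued: ledger.accrued }, skipped: true };
  }
  return { ledger: { accrued: ledger.accrued + amount }, skipped: false };
}
```

In a fuller version the `skipped` flag is where the accrual-skip event would be emitted, so the operator can see that the remedy is a re-bench.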
See the Nemesis release notes for the full operator-facing diff of the red→no-stoicism economic gate and the red-visibility surfaces.
Ghost-CPU cache rule
Segregated slices on the same physical host that share the same CPU model and vCPU count get the same CPU subscore. The cache key is the triple (host_node_id, cpu_model, vcpu_count). Cache behaviour:
- Cache hit — the slice immediately renders an inherited per-step row referencing the prior ghost-CPU bench result. No fresh container is launched.
- Cache miss — the slice enqueues a fresh stoa-bench container sized to the slice’s vCPU count. The result stamps a new cache row keyed by the triple; future slices on the same triple read it.
- forceFullRebench invalidation — the “Force-fresh fleet rebench” action on /hub/fleet-maintenance drops every cache row before re-running the wave. Every slice on every host re-benches under the new formula.
- CPU model auto-invalidation — when a host’s reported cpu_model changes (e.g. after a hardware swap), the cache rows keyed by the old model are no longer matched. The new triple misses; a fresh ghost-CPU bench runs automatically.
The cache is per-host: two physical hosts running identical CPUs do not share rows because the host_node_id column differs. This is intentional — thermal envelope, BIOS settings, and background load on the host can shift the ghost-CPU bench result even on nominally identical silicon.
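A minimal sketch of the triple-keyed lookup follows. The class and function names (`GhostCpuCache`, `cpuSubscoreFor`) and the row shape are assumptions for illustration, and the version-gold check described earlier is omitted so that only the keying behaviour is shown.

```typescript
interface Triple {
  hostNodeId: string;
  cpuModel: string;
  vcpuCount: number;
}

interface CacheRow {
  cpuSubscore: number;
}

class GhostCpuCache {
  private rows: Record<string, CacheRow> = {};

  // Serialise the triple into one key; JSON avoids delimiter collisions.
  private static key(t: Triple): string {
    return JSON.stringify([t.hostNodeId, t.cpuModel, t.vcpuCount]);
  }

  lookup(t: Triple): CacheRow | undefined {
    return this.rows[GhostCpuCache.key(t)];
  }

  store(t: Triple, row: CacheRow): void {
    this.rows[GhostCpuCache.key(t)] = row;
  }

  // Counterpart of the fleet-wide force-fresh action: drop every row.
  clear(): void {
    this.rows = {};
  }
}

// Hit: return the inherited result. Miss: run the bench callback (standing in
// for a stoa-bench container) and stamp a new row for future slices.
function cpuSubscoreFor(
  cache: GhostCpuCache,
  t: Triple,
  runBench: (t: Triple) => number,
): { score: number; source: "inherited" | "fresh" } {
  const hit = cache.lookup(t);
  if (hit) return { score: hit.cpuSubscore, source: "inherited" };
  const score = runBench(t);
  cache.store(t, { cpuSubscore: score });
  return { score: score, source: "fresh" };
}
```

Because the host id and CPU model are part of the key, a second host or a changed cpu_model simply misses; no explicit invalidation step is needed for those two cases.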
Pre-flight distance classification
Network sub-tests run against eight reference Linode regions. Before the full download bench fires, a lightweight TCP-connect RTT pass classifies each region into one of four distance buckets. The bucket determines what download payload (if any) is fetched.
| Bucket | RTT range | Download behaviour |
|---|---|---|
| near | ≤ 80 ms | Full 100 MB payload fetched. |
| medium | > 80 – 200 ms | Full 100 MB payload fetched. |
| distant | > 200 ms | 25 MiB byte-range fetch only. |
| unreachable | timeout | Download skipped. Per-region download contribution scores 0 for that slice. |
The pre-flight RTT pass adds ~3-4 s to total bench wall-time but keeps the download bench representative — a slow link to a single far region cannot dominate the per-region average, and an unreachable region is honestly recorded as a zero rather than artificially zeroed by a timeout mid-fetch.
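The bucket table can be read as a simple classifier. In the sketch below the function names are hypothetical, a `null` RTT stands in for a timed-out TCP connect, and the boundary handling (exactly 80 ms counts as near, exactly 200 ms as medium) is one reading of the table rather than a confirmed rule.

```typescript
type Bucket = "near" | "medium" | "distant" | "unreachable";

// Classify one region from its pre-flight TCP-connect RTT in milliseconds.
// null represents a timeout during the RTT pass.
function classifyRegion(rttMs: number | null): Bucket {
  if (rttMs === null) return "unreachable";
  if (rttMs <= 80) return "near";
  if (rttMs <= 200) return "medium";
  return "distant";
}

// Map a bucket to the download behaviour listed in the table.
function downloadPlan(bucket: Bucket): string {
  switch (bucket) {
    case "near":
    case "medium":
      return "full 100 MB payload";
    case "distant":
      return "25 MiB byte-range fetch";
    default:
      return "skip (contribution 0)";
  }
}
```

For instance, a region measured at 240 ms classifies as distant and is assigned the 25 MiB byte-range fetch, while a region that times out is skipped and scores 0 for its download slice.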
Further reading
- Network subscore deep-dive — five-subtest table, the eight Linode reference regions, the hub upload-sink endpoint, and the multi-hub forward-compat note.
- Hipparchus migration runbook — when and how to run the Force-fresh fleet rebench, the mixed-formula display window, and the manual stoa-bench build + push commands.
- Hipparchus release notes — operator-facing summary of what v.H.1.0 changed.