RandomX is Monero's proof-of-work algorithm, activated November 30, 2019, replacing CryptoNight. It's the answer to a question other coins didn't seriously try to solve: "What if instead of fighting ASICs, we designed a PoW where a general-purpose CPU is genuinely the optimal hardware?" The trick is brutal in its simplicity: generate a different program for every hash, then run it through a virtual machine that exercises everything modern CPUs are good at — branch prediction, floating-point, cache hierarchy, out-of-order execution. An ASIC would have to recreate a CPU to compete. So it doesn't.
Every PoW chain eventually faces the ASIC question. Bitcoin embraced them. Monero refused. Monero launched on CryptoNight in 2014, and once CryptoNight ASICs appeared in 2018 it played whack-a-mole with the CryptoNight family, tweaking the algorithm roughly every six months until 2019. It was exhausting, partially effective, and clearly not sustainable. RandomX is the strategic pivot: instead of hiding from ASICs, make the algorithm so hostile to specialization that no ASIC can beat a commodity CPU.
Custom silicon optimized for one hash function (SHA-256, scrypt, etc.) outperforms general-purpose hardware by 100,000× or more. Result: mining centralizes around the few who can afford the silicon.
CryptoNight tried to be memory-hard. ASICs eventually got built anyway (Bitmain Antminer X3, 2018). The Monero community responded with emergency forks that bricked the ASICs — but the cycle was unwinnable.
Tevador, SChernykh, hyc, and others spent ~9 months designing RandomX. The premise: don't try to dodge ASICs — design a PoW where a CPU has every structural advantage and silicon would have to recreate a CPU to compete.
CryptoNight was an obstacle course hoping ASICs couldn't navigate it. RandomX is an art exhibit where the only valid medium happens to be commodity silicon. Anything less general — an ASIC, an FPGA, a GPU — necessarily produces an inferior copy.
Classical PoW (Bitcoin's SHA-256d, scrypt, etc.) runs the same function on different inputs. RandomX runs different functions on the same input. Every hash attempt generates a fresh ~256-instruction program, JIT-compiles it to native machine code, and executes it on a virtual machine. The space of possible programs is ~2^512, far too many for an ASIC to pre-bake.
An ASIC achieves its speed by hard-coding the operations it performs. If the operations change every hash, the ASIC's hard-coded circuitry is useless — it would have to implement an instruction decoder, register file, ALU, FPU, branch predictor, and memory subsystem. At which point it has reinvented a CPU — and a less efficient one than what Intel and AMD already ship.
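The idea of "a different program for every hash" can be illustrated with a toy model. The sketch below is not RandomX: the four-opcode table, the 16-instruction program length, and the names `generate_program`, `run`, and `toy_hash` are all hypothetical, chosen only to show how the input itself decides which operations get executed.

```python
import hashlib

MASK64 = (1 << 64) - 1

def rotr64(x, n):
    """Rotate a 64-bit value right by n bits (n taken modulo 64)."""
    n %= 64
    return ((x >> n) | (x << (64 - n))) & MASK64

# Toy opcode table (hypothetical; far smaller than the real instruction set).
OPS = [
    lambda d, s: (d + s) & MASK64,   # add
    lambda d, s: (d * s) & MASK64,   # multiply
    lambda d, s: d ^ s,              # xor
    lambda d, s: rotr64(d, s),       # rotate right
]

def generate_program(seed: bytes, length: int = 16):
    """Expand a seed into `length` (opcode, dst, src) triples."""
    stream = hashlib.blake2b(seed, digest_size=64).digest()
    while len(stream) < 3 * length:
        stream += hashlib.blake2b(stream[-64:], digest_size=64).digest()
    return [
        (stream[3 * i] % len(OPS), stream[3 * i + 1] % 8, stream[3 * i + 2] % 8)
        for i in range(length)
    ]

def run(program, regs):
    """Execute the program on eight 64-bit registers and return the result."""
    regs = list(regs)
    for op, dst, src in program:
        regs[dst] = OPS[op](regs[dst], regs[src])
    return regs

def toy_hash(data: bytes) -> bytes:
    """Hash `data` by running a program that `data` itself selects."""
    program = generate_program(data)
    regs = run(program, range(1, 9))
    raw = b"".join(r.to_bytes(8, "little") for r in regs)
    return hashlib.blake2b(raw, digest_size=32).digest()
```

Two different inputs yield two different instruction sequences, so hardware that hard-wires any single sequence gains nothing on the next attempt.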
RandomX juggles five distinct memory regions, each sized to a different level of the modern memory hierarchy. The layering is deliberate: it forces an implementation to have working DRAM, L3, L2, and L1 caches all firing at once. CPUs have all of these; ASICs don't have them in the same configuration.
Key (K): the seed for everything else. Derived from a "key block", the most recent block whose height is divisible by 2048. Rotates ~every 2.8 days. When K rotates, everyone has to rebuild the cache and dataset.
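The rotation arithmetic can be checked with a short sketch. It implements only the simplified rule stated above (round the height down to a multiple of 2048) and assumes Monero's 120-second target block time; a production node may apply additional offsets that are not modeled here. The name `key_block_height` is hypothetical.

```python
EPOCH_BLOCKS = 2048        # key changes every 2048 blocks (from the text)
BLOCK_TIME_SECONDS = 120   # Monero's target block time (assumption of this sketch)

def key_block_height(height: int) -> int:
    """Most recent height at or below `height` that is divisible by 2048."""
    return height - (height % EPOCH_BLOCKS)

# How often the key rotates, in days.
rotation_days = EPOCH_BLOCKS * BLOCK_TIME_SECONDS / 86400
```

With these numbers `rotation_days` comes out at about 2.84, which matches the "~2.8 days" figure.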
Cache: generated from K via Argon2d (a memory-hard KDF). Used to derive the much larger Dataset. Light-mode miners and verifiers keep just this 256 MB and regenerate Dataset entries on-demand.
Dataset: built from the Cache by running 8 SuperscalarHash functions per entry. Read-only during hashing. Forces DRAM traffic: each hash reads ~16,384 entries (64 bytes each) over its full computation. Fast-mode only.
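The 16,384 figure follows from numbers given elsewhere in this article: 8 programs per hash and 2048 loop iterations per program, with one Dataset read per iteration. A small sketch of the resulting DRAM traffic, where the 10,000 H/s hashrate in the usage note is purely a hypothetical input:

```python
PROGRAMS_PER_HASH = 8          # chained programs per hash (from the text)
ITERATIONS_PER_PROGRAM = 2048  # VM loop iterations per program (from the text)
ITEM_BYTES = 64                # size of one Dataset entry (from the text)

# One Dataset read per loop iteration.
READS_PER_HASH = PROGRAMS_PER_HASH * ITERATIONS_PER_PROGRAM
BYTES_PER_HASH = READS_PER_HASH * ITEM_BYTES

def dram_bytes_per_second(hashrate: int) -> int:
    """Random-access Dataset traffic generated at a given hashrate."""
    return hashrate * BYTES_PER_HASH
```

Each hash therefore pulls 1 MiB from the Dataset in 64-byte random reads; at a hypothetical 10,000 H/s that is roughly 10.5 GB/s of random DRAM access.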
Scratchpad: the VM's working memory. Read-write. Sized to fit in L3 cache. Initialized per-hash from input H via Blake2b → AesGenerator. Split into L1/L2/L3 regions mimicking CPU cache hierarchy.
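One way to picture the L1/L2/L3 split is as three address masks over the same buffer. The sketch below assumes region sizes of 16 KiB, 256 KiB, and 2 MiB (the reference defaults as far as I know; the text above only states that the Scratchpad fits in L3) and assumes 8-byte-aligned accesses. The function name is hypothetical.

```python
# Assumed region sizes in bytes; treat these as illustrative defaults.
REGION_SIZES = {
    "L1": 16 * 1024,
    "L2": 256 * 1024,
    "L3": 2 * 1024 * 1024,
}

# Keep the address inside the region and clear the low 3 bits (8-byte alignment).
REGION_MASKS = {name: (size - 1) & ~7 for name, size in REGION_SIZES.items()}

def scratchpad_address(raw: int, level: str) -> int:
    """Map an arbitrary integer to an aligned offset within the chosen region."""
    return raw & REGION_MASKS[level]
```

Because every region starts at offset zero, an "L1" access always touches the first 16 KiB of the buffer, which is what keeps that part hot in the real L1 cache.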
Registers: 8 integer (r0-r7), 4 "f" float (f0-f3), 4 "e" float (e0-e3), 4 read-only "a" float (a0-a3), plus tracking registers. The fast working set, small enough to fit comfortably in L1.
An ASIC that wanted to compete would need 2 GB of fast SRAM (or DRAM with a wide bus), 2 MB of fast L3-equivalent storage, AND all the CPU instructions covered in §5. The economics never close.
One RandomX hash is not one program — it's eight, chained together. Each program reads/writes the scratchpad, then the next program's bytecode is generated from a hash of the previous VM register state. At the end, the entire scratchpad is fingerprinted with AES and combined with the final register file into a Blake2b 256-bit output. That is the value the miner compares to the network difficulty target.
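The control flow of the chain can be sketched as follows. This is a structural toy only: `run_program` is a hypothetical stand-in for VM execution, the Scratchpad is 256 bytes instead of megabytes, and Blake2b stands in for the AES-based fill and fingerprint steps.

```python
import hashlib

PROGRAM_COUNT = 8  # programs chained per hash (from the text)

def run_program(program_seed: bytes, regs: bytes, scratchpad: bytearray) -> bytes:
    """Hypothetical stand-in for the VM: mixes state and mutates the scratchpad."""
    digest = hashlib.blake2b(
        program_seed + regs + bytes(scratchpad), digest_size=64
    ).digest()
    for i, byte in enumerate(digest):
        scratchpad[i % len(scratchpad)] ^= byte
    return digest  # plays the role of the final register file

def toy_chained_hash(data: bytes, scratchpad_size: int = 256) -> bytes:
    seed = hashlib.blake2b(data, digest_size=64).digest()
    # Fill the scratchpad from the seed (stand-in for the AES generator).
    scratchpad = bytearray((seed * (scratchpad_size // 64 + 1))[:scratchpad_size])
    regs = bytes(64)
    for _ in range(PROGRAM_COUNT):
        regs = run_program(seed, regs, scratchpad)
        # The next program is derived from a hash of the register state.
        seed = hashlib.blake2b(regs, digest_size=64).digest()
    # Fingerprint the scratchpad (stand-in for the AES fingerprint),
    # then combine it with the registers into a 256-bit result.
    fingerprint = hashlib.blake2b(bytes(scratchpad), digest_size=64).digest()
    return hashlib.blake2b(fingerprint + regs, digest_size=32).digest()
```

The point of the structure is the data dependency: program N+1 cannot be generated until program N has finished, so the eight stages cannot run in parallel.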
The 8-program chain is what gives RandomX its scale. Each program is ~256 instructions × 2048 iterations of the VM loop, and there are 8 programs, so one hash executes ~4.2 million instructions before the final fingerprint. That's why even modern CPUs manage only on the order of a thousand hashes per second per core, and why the algorithm is so hard to accelerate.
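The instruction count is simple multiplication, written out here so the figures can be varied. The 1,000 H/s per-core rate used in the usage note is a hypothetical round number, not a benchmark.

```python
INSTRUCTIONS_PER_PROGRAM = 256  # approximate program length (from the text)
ITERATIONS_PER_PROGRAM = 2048   # VM loop iterations (from the text)
PROGRAMS_PER_HASH = 8           # chained programs (from the text)

INSTRUCTIONS_PER_HASH = (
    INSTRUCTIONS_PER_PROGRAM * ITERATIONS_PER_PROGRAM * PROGRAMS_PER_HASH
)

def vm_instructions_per_second(hashes_per_second: int) -> int:
    """VM instructions a core must retire to sustain the given hashrate."""
    return hashes_per_second * INSTRUCTIONS_PER_HASH
```

At a hypothetical 1,000 H/s, a core would need to retire about 4.2 billion VM instructions per second, which is in the same range as the clock rate of a current desktop CPU.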
RandomX's virtual machine is designed to be a caricature of a modern CPU — it exercises exactly the features that distinguish general-purpose silicon from specialized accelerators. The instruction set is tiny (~30 opcodes), but each opcode is chosen to be something a CPU is already great at and an ASIC would have to expensively replicate.
| Class | Examples | What it forces hardware to have |
|---|---|---|
| Integer math | IADD, IMUL, IXOR, IROR | A 64-bit ALU with mult/shift/rotate — every CPU has it; ASICs would need to add it. |
| Floating point | FADD, FSUB, FMUL, FDIV, FSQRT | IEEE 754 double-precision with correct rounding modes. Killer for GPUs (which favor single-precision) and WebAssembly (no directed rounding). |
| Memory R/W | ISTORE, IADD_M, FADD_M | A working cache hierarchy. Reads at L1/L2/L3 latency. Writes that update cached state. |
| Branches | CBRANCH | A branch predictor. The jump is taken with low probability (about 1/256 with the default 8-bit condition mask), so misprediction is rare but not zero, which still exercises the prediction logic. |
| Reciprocal | IMUL_RCP | Multiplication by precomputed reciprocal (avoids slow integer divide). Loads a 64-bit literal into a register. |
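The reciprocal row is the least self-explanatory, so here is a sketch of the idea. It assumes the reciprocal is defined as floor(2^x / d) for the largest x that keeps the result below 2^64, which is my reading of the specification and should be treated as an assumption; the helper names are hypothetical. Zero and powers of two are rejected here for simplicity.

```python
MASK64 = (1 << 64) - 1

def reciprocal(divisor: int) -> int:
    """64-bit fixed-point reciprocal: floor(2**x / divisor) with the largest
    x such that the result still fits in 64 bits (assumed definition)."""
    if divisor == 0 or divisor & (divisor - 1) == 0:
        raise ValueError("zero and powers of two are excluded in this sketch")
    shift = 63 + divisor.bit_length()
    return (1 << shift) // divisor

def imul_rcp(dst: int, divisor: int) -> int:
    """Multiply a 64-bit register by the precomputed reciprocal, modulo 2**64."""
    return (dst * reciprocal(divisor)) & MASK64
```

The reciprocal depends only on the constant embedded in the instruction, so it can be computed once when the program is compiled; at run time the operation costs a single 64-bit multiply.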
RandomX bytecode is not interpreted: each generated program is JIT-compiled to native machine code for the host CPU before it runs. The reference implementation includes JIT compilers for x86-64, ARM64, and (recently) RISC-V. The compiled code runs at full speed; the interpreter path is the fallback for platforms without a JIT and is roughly 10× slower. This is why hashing, whether for mining or for verification, is so much faster on platforms with a JIT, and why the algorithm needs a fairly recent CPU to be efficient.
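The difference between the two execution paths can be shown in miniature. In this toy, generating Python source and passing it to `exec` stands in for emitting native machine code; the three-opcode instruction format and the function names are hypothetical, and no speed claim is implied by the example.

```python
MASK64 = (1 << 64) - 1
OPERATORS = {"add": "+", "mul": "*", "xor": "^"}

def interpret(program, regs):
    """Interpreter path: decode and dispatch every instruction on every run."""
    regs = list(regs)
    for op, dst, src in program:
        if op == "add":
            regs[dst] = (regs[dst] + regs[src]) & MASK64
        elif op == "mul":
            regs[dst] = (regs[dst] * regs[src]) & MASK64
        elif op == "xor":
            regs[dst] = (regs[dst] ^ regs[src]) & MASK64
    return regs

def compile_program(program):
    """'JIT' path: translate the program once into straight-line code."""
    lines = ["def compiled(regs):", "    r = list(regs)"]
    for op, dst, src in program:
        lines.append(f"    r[{dst}] = (r[{dst}] {OPERATORS[op]} r[{src}]) & {MASK64}")
    lines.append("    return r")
    namespace = {}
    exec("\n".join(lines), namespace)
    return namespace["compiled"]

program = [("add", 0, 1), ("mul", 2, 0), ("xor", 1, 2)]
compiled = compile_program(program)
```

Both paths produce the same registers; the compiled one pays the decoding cost once, which matters when the same program body runs for 2048 loop iterations.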
A subtle but elegant property of RandomX: verification doesn't need to be as expensive as mining. The 2 GB Dataset is only required for fast mode. In light mode, the verifier keeps only the 256 MB Cache and computes Dataset entries on-demand as the VM requests them. Same answer, much less memory — at the cost of being too slow to mine competitively.
| Mode | RAM | Use case |
|---|---|---|
| Fast mode | ~2.08 GB | Mining. The entire Dataset is precomputed and held in RAM. Each hash reads ~16,384 entries directly. ~4-6× faster than light mode. |
| Light mode | ~256 MB | Verification. Only the Cache is held. Dataset entries are regenerated on-demand by running 8 SuperscalarHash invocations from the Cache. Slow, but lets any node verify a block without 2 GB of RAM. |
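The two modes differ only in when Dataset entries are computed, which a small sketch makes concrete. Everything here is a stand-in: SHAKE-256 replaces Argon2d, a single Blake2b call replaces the 8 SuperscalarHash invocations, and the sizes are tiny. The class and function names are hypothetical.

```python
import hashlib

def build_cache(key: bytes, size: int = 1024) -> bytes:
    """Stand-in for the Argon2d cache fill."""
    return hashlib.shake_256(key).digest(size)

def dataset_item(cache: bytes, index: int) -> bytes:
    """Stand-in for deriving one 64-byte Dataset entry from the Cache."""
    return hashlib.blake2b(
        cache + index.to_bytes(8, "little"), digest_size=64
    ).digest()

class FastMode:
    """Precompute every entry up front: more memory, cheap reads."""
    def __init__(self, cache: bytes, item_count: int):
        self.items = [dataset_item(cache, i) for i in range(item_count)]

    def read(self, index: int) -> bytes:
        return self.items[index]

class LightMode:
    """Keep only the Cache: less memory, each read recomputes the entry."""
    def __init__(self, cache: bytes, item_count: int):
        self.cache = cache
        self.item_count = item_count

    def read(self, index: int) -> bytes:
        return dataset_item(self.cache, index)
```

Since both classes return identical bytes for every index, a VM built on either one produces the same hash; only the cost per read changes.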
If fast mode required only 256 MB, embedded devices and old CPUs could mine — but ASICs would also have a much easier target. If light mode required 2 GB, every monerod verifying blocks would need 2 GB just for PoW — pricing out lightweight nodes. Splitting the cost keeps mining hard and verification accessible.
RandomX has been running on Monero mainnet since November 30, 2019. The empirical record:
Six years in, no confirmed Monero RandomX ASIC has ever appeared on the market. Compared to the cycle of attacks against CryptoNight every 6-9 months, this is a stark difference.
GPUs underperform CPUs at RandomX by ~3-5×. The combination of integer arithmetic, FP64 with directed rounding, and random branching is exactly the GPU's weak spot. GPU mining software developers have largely stopped trying to close the gap.
Trail of Bits, X41 D-Sec, Kudelski Security, and Quarkslab each independently audited the algorithm and its reference implementation. No critical findings.
Because efficient mining needs ≥2 GB of RAM, RandomX is, somewhat ironically, easier to detect as cryptojacking malware than smaller-memory algorithms. Tools like "RandomX Sniffer" use the 2 GB allocation as a tell.
Browser sandboxes don't support FP64 directed rounding, and 2 GB allocations are blocked. WebAssembly mining (the Coinhive era) is structurally dead — by design.
RandomX requires 64-bit integer mults and 2+ GB virtual address space. Old hardware, embedded chips, and IoT devices are effectively excluded from mining — which the Monero community considers acceptable.