The Complete Guide to UUIDs in 2026
A practitioner's reference covering RFC 9562, all seven UUID versions, the index-fragmentation problem, ULID and NanoID alternatives, database integration patterns, and language-by-language code. Written for backend engineers picking an ID format for production.
Table of contents
- What a UUID actually is — anatomy of 128 bits
- The seven UUID versions, explained
- UUIDv4 vs UUIDv7 — the index fragmentation deep dive
- UUID vs ULID vs NanoID vs CUID2
- Generating UUIDs in 8 languages
- Storing UUIDs in your database
- Common mistakes that ship to production
- A decision matrix for picking the right ID format
- Authoritative references
1. What a UUID actually is — anatomy of 128 bits
A UUID (Universally Unique Identifier) is 128 bits. That is the entire definition. Everything else — the canonical hyphenated format, the version digit, the variant nibble, the algorithms for generating one — is convention layered on top of those 128 bits.
The canonical string form is 32 hexadecimal characters split by hyphens into five groups: xxxxxxxx-xxxx-Mxxx-Nxxx-xxxxxxxxxxxx. The total string length is 36 characters. Two specific positions encode metadata about which UUID flavor you are looking at:
- Position 13: the "M" character is the version digit (the 13th hex digit, which is string index 14 once the hyphens are counted). `4` means random (v4), `7` means time-ordered (v7), `1` means timestamp+MAC, etc. This single nibble tells parsers which generation algorithm produced the ID.
- Position 17: the "N" character is the variant (the 17th hex digit, string index 19). The high bits are `10` in binary, so the hex character is one of `8`, `9`, `a`, or `b`. This identifies the UUID as conforming to RFC 4122/9562 (the "Leach-Salz" variant), as opposed to older Microsoft GUID variants or NCS variants.
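To make the two metadata positions concrete, here is a minimal Python sketch that reads them straight from the canonical string and cross-checks against the standard library's `uuid` module. The helper name `describe` is hypothetical and exists only for illustration.

```python
import uuid

def describe(uuid_text: str) -> tuple:
    """Return (version, variant character) read directly from the canonical string."""
    text = uuid_text.strip().lower()
    # Hex digit 13 is string index 14 and hex digit 17 is string index 19,
    # because the hyphens at indexes 8, 13, and 18 also occupy positions.
    version = int(text[14], 16)
    variant_char = text[19]
    return version, variant_char

sample = "550e8400-e29b-41d4-a716-446655440000"
version, variant_char = describe(sample)
parsed = uuid.UUID(sample)

print(version, variant_char)                            # 4 a
print(parsed.version, parsed.variant == uuid.RFC_4122)  # 4 True
```

In real code, prefer `uuid.UUID(...).version` over string indexing; the manual version is only here to show where the nibbles live.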
Because the version nibble (4 bits) and the variant bits (2 bits) consume 6 bits, a v4 UUID has 122 bits of randomness, not 128. The classic collision math you have probably seen ("you would need to generate 2.71 quintillion UUIDs to have a 50% chance of one collision") is the birthday paradox formula n ≈ 1.177 × √(2^122) applied to those 122 bits.
For practical purposes that means v4 collisions are not a real concern at realistic generation rates. At a million UUIDs per second, sustained for a full year, the odds of even one collision are roughly 1 in 10 billion; you would have to sustain a billion per second for about 86 years to reach the 50% mark above. This "uniqueness without coordination" guarantee is what makes UUIDs the connective tissue of modern distributed systems: any node can mint an identifier for a charge, an upload, a session, or a message without asking any other node first.
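The birthday arithmetic is easy to check yourself. The sketch below uses the standard approximation p ≈ 1 − exp(−n² / 2N) with N = 2^122; the function name and the one-million-per-second workload are illustrative assumptions, not figures from any benchmark.

```python
import math

SPACE = 2 ** 122  # number of distinct v4 UUIDs (122 random bits)

def collision_probability(count: float, space: int = SPACE) -> float:
    """Birthday-bound approximation: p ~= 1 - exp(-n^2 / (2N))."""
    return -math.expm1(-(count * count) / (2 * space))

# Count needed for a ~50% chance of at least one collision.
fifty_percent_count = 1.177 * math.sqrt(SPACE)
print(f"{fifty_percent_count:.3e}")  # 2.714e+18

# Hypothetical workload: one million UUIDs per second for one year.
per_year = 1_000_000 * 365.25 * 24 * 3600
print(f"{collision_probability(per_year):.2e}")
```

`math.expm1` is used instead of `1 - math.exp(...)` so that very small probabilities do not round to zero.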
2. The seven UUID versions, explained
RFC 9562 (published May 2024, obsoleting RFC 4122) specifies seven UUID versions directly (v1 and v3 through v8) and reserves v2 for the legacy DCE Security format, which is why the table below has eight rows. Most engineers only need two or three of them in production, but knowing the full menu helps when you encounter old systems or read decisions made by previous teams.
| Version | Mechanism | Sortable? | Privacy concern | Use it for |
|---|---|---|---|---|
| v1 | 60-bit timestamp + 14-bit clock seq + 48-bit MAC address | Partially | Leaks server MAC address and creation time | Legacy systems only — avoid in new code. |
| v2 | DCE Security (POSIX UID/GID embedded) | Partially | Same as v1, plus user info leak | Defined for completeness but essentially unused. Skip. |
| v3 | MD5 hash of namespace + name | No | Deterministic — knowing the inputs reveals the UUID | Legacy content-addressed IDs. Use v5 instead — MD5 is broken for security purposes. |
| v4 | 122 random bits | No | None | The default for the past 20 years. Still acceptable for primary keys at small-to-medium scale. |
| v5 | SHA-1 hash of namespace + name | No | Deterministic | Stable IDs derived from a name — DNS, URL, OID, X.500 DN, custom namespace. |
| v6 | v1 with the timestamp bytes reordered for sortability | Yes | Same as v1 (MAC leak) | Migration path off v1. Most teams jump straight to v7 instead. |
| v7 | 48-bit Unix-millis timestamp + 74 bits of entropy | Yes | Reveals creation time only | The new default for primary keys, log IDs, anywhere you want both uniqueness and time-locality. |
| v8 | Custom format (vendor-defined) | Depends on layout | Depends on layout | Experimental or vendor-specific layouts. RFC 9562 fixes only the version and variant bits and leaves the rest to the implementer. |
For a new project in 2026, the choice tree is short: pick v7 if your runtime supports it, fall back to v4 if it does not, use v5 if you need a deterministic ID derived from a name. v1, v2, v3, v6 all have better replacements and should not appear in new code.
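The choice tree above can be sketched in a few lines of Python. Everything named here is hypothetical: `new_id`, `APP_NAMESPACE`, and the `example.com` namespace are illustrations, and the `getattr` probe is just one way to detect whether the running interpreter ships `uuid.uuid7()`.

```python
import uuid
from typing import Optional

# Hypothetical application namespace, itself derived with v5 so it is stable.
APP_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "example.com")

def new_id(name: Optional[str] = None) -> uuid.UUID:
    """v5 when a name is supplied, else v7 if the runtime has it, else v4."""
    if name is not None:
        return uuid.uuid5(APP_NAMESPACE, name)
    make_v7 = getattr(uuid, "uuid7", None)  # only present on newer Python releases
    return make_v7() if make_v7 is not None else uuid.uuid4()

print(new_id("users/42") == new_id("users/42"))  # True: v5 is deterministic
print(new_id().version)                          # 7 or 4, depending on the runtime
```

The deterministic branch is what makes v5 suitable for idempotent imports: re-running the same job with the same names yields the same IDs.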
3. UUIDv4 vs UUIDv7 — the index fragmentation deep dive
The single biggest reason teams have been migrating from v4 to v7 over the past two years is database write performance at scale. To understand why, you need a basic mental model of how relational databases store primary keys.
Postgres, MySQL/InnoDB, and SQL Server all store primary key indexes as B-trees (or B+ trees in InnoDB's clustered case). When IDs arrive in roughly sorted order — autoincrement integers, ULIDs, UUIDv7 — every new insert lands at the tail of the index, the page stays full, and the cache hit rate is high. Random IDs — UUIDv4 — have no temporal locality. Every insert lands in a different B-tree page. Pages split, parent pages rewrite, and the working set blows out of memory.
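A toy simulation shows the locality difference. This sketch uses a sorted Python list as a stand-in for B-tree leaf order, which is a simplification, and it assumes one insert per millisecond so the time-prefixed keys strictly increase. `tail_insert_ratio` and `time_prefixed_key` are hypothetical helpers, and the keys skip the version and variant bits because only byte ordering matters here.

```python
import bisect
import os
import uuid

def tail_insert_ratio(keys: list) -> float:
    """Fraction of inserts that land at the end of a sorted index."""
    index = []
    tail_hits = 0
    for key in keys:
        position = bisect.bisect_left(index, key)
        if position == len(index):
            tail_hits += 1
        index.insert(position, key)
    return tail_hits / len(keys)

def time_prefixed_key(millis: int) -> bytes:
    """v7-style layout: 6 big-endian timestamp bytes, then 10 random bytes."""
    return millis.to_bytes(6, "big") + os.urandom(10)

COUNT = 2_000
random_keys = [uuid.uuid4().bytes for _ in range(COUNT)]
ordered_keys = [time_prefixed_key(1_700_000_000_000 + i) for i in range(COUNT)]

print(f"random keys:  {tail_insert_ratio(random_keys):.3f}")   # close to zero
print(f"ordered keys: {tail_insert_ratio(ordered_keys):.3f}")  # 1.000
```

Every non-tail insert in the random case corresponds to touching an arbitrary page in a real index, which is where the splits and cache misses come from.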
Figures commonly reported by teams running this comparison in production:
- Postgres: v4 inserts can be 2–3× slower than sequential-ID inserts at scale. WAL volume increases, the buffer cache thrashes, and vacuum has more work.
- MySQL/InnoDB: the primary key drives the clustered index, so this hits doubly hard. Random inserts cause page splits and bloat the on-disk size by 30–60% compared to sequential keys. The standard mitigation for v1 UUIDs, available since MySQL 8.0, is `UUID_TO_BIN(uuid, 1)`, which reorders v1 timestamp bytes for B-tree locality. UUIDv7 does this natively.
- SQL Server: the `NEWSEQUENTIALID()` function exists exactly because the team learned random GUIDs hurt at scale. v7 gives you the same property cross-platform.
The fix order, ranked by impact:
- UUIDv7 preserves uniqueness while putting the high-order bits in time order. New insert performance approaches autoincrement.
- ULID — same idea, different encoding (Crockford base32 instead of hex). Shorter, URL-friendly, time-ordered.
- Store v4 UUIDs as native binary: `BINARY(16)` in MySQL, `uuid` in Postgres, `uniqueidentifier` in SQL Server. This more than halves disk and memory size compared to `VARCHAR(36)`.
- MySQL legacy: `UUID_TO_BIN(uuid, 1)` for v1 UUIDs if you cannot move to v7.
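To see what the MySQL swap flag does to a v1 UUID, here is a small Python sketch of the byte reordering as I understand it from the MySQL documentation: the time-high and time-low groups trade places so the slowest-changing timestamp bytes lead. The helper name `swap_time_fields` is hypothetical, and the sample value is a v1 UUID chosen for illustration.

```python
import uuid

def swap_time_fields(value: uuid.UUID) -> bytes:
    """Reorder a v1 UUID as time_hi + time_mid + time_low + the remaining 8 bytes."""
    raw = value.bytes
    return raw[6:8] + raw[4:6] + raw[0:4] + raw[8:]

v1 = uuid.UUID("6ccd780c-baba-1026-9564-5b8c656024db")
print(v1.bytes.hex())               # 6ccd780cbaba102695645b8c656024db
print(swap_time_fields(v1).hex())   # 1026baba6ccd780c95645b8c656024db
```

The swap only helps v1 UUIDs, where those groups hold timestamp bits. Applying it to v4 values just shuffles random bytes.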
If you are starting a green-field service, default to v7. If you have a v4-based table that is now causing write performance issues, the migration path is to add a v7 column, dual-write for a period, then switch reads.
4. UUID vs ULID vs NanoID vs CUID2
The broader "distributed unique ID" category contains several formats that are not strictly UUIDs but compete for the same role. Picking among them is a matter of trading off length, sortability, URL friendliness, and ecosystem support.
| Format | Bits of randomness | Sortable? | Length | Standard? | Best fit |
|---|---|---|---|---|---|
| UUIDv4 | 122 | No | 36 chars (32 hex + 4 hyphens) | RFC 4122 / 9562 | Maximum interop with old systems. |
| UUIDv7 | 74 + 48-bit time | Yes | 36 chars | RFC 9562 | New databases — strongly recommended. |
| ULID | 80 + 48-bit time | Yes | 26 chars (Crockford base32) | De-facto, no formal RFC | Shorter than UUIDv7, URL-friendly, same time-locality. |
| NanoID (default) | 126 (21 chars × 6 bits) | No | 21 chars (configurable) | De-facto | Short opaque IDs in URLs and shareable links. |
| CUID2 | ~124 (24 base36 chars, SHA-3 derived) | No | 24 chars (configurable) | De-facto | Privacy-conscious systems: designed to resist enumeration. |
| Snowflake (Twitter) | 0 (41-bit time + 10-bit machine ID + 12-bit sequence) | Yes | ~19 chars (numeric) | Twitter spec | Single organization with strict ID coordination. |
| KSUID | 128 + 32-bit time | Yes | 27 chars | Segment.io spec | Logs, events; Segment's preferred format. |
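For string-encoded random IDs, the upper bound on entropy is simply length × log2(alphabet size). The sketch below applies that to the NanoID row and to shorter custom lengths; `entropy_bits` is a hypothetical helper, and the result is a ceiling that assumes every character is drawn uniformly at random.

```python
import math

def entropy_bits(length: int, alphabet_size: int) -> float:
    """Upper bound on entropy for `length` characters drawn uniformly at random."""
    return length * math.log2(alphabet_size)

print(entropy_bits(21, 64))  # 126.0, the default NanoID (21 chars, 64-symbol alphabet)
print(entropy_bits(12, 64))  # 72.0
print(entropy_bits(8, 64))   # 48.0
```

Shorter IDs trade collision resistance for readability, so the length is worth choosing against your expected ID volume rather than by feel.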
Practical guidance:
- Default to UUIDv7 for primary keys. The ecosystem support is broadest and the format is RFC-standardized.
- ULID if you want shorter strings (26 vs 36 chars) and your stack does not yet have v7 support. ULIDs are time-ordered and URL-safe.
- NanoID for short opaque tokens visible to users (8–12 character "share" IDs). 128 bits of entropy is overkill there.
- CUID2 for cases where sequential ID enumeration is a privacy concern — its derivation is intentionally non-temporal.
- UUIDv4 if you need broad interop with legacy systems that may not parse v7 yet.
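To show why ULID is "the same idea, different encoding", here is a minimal Python sketch that packs a 48-bit millisecond timestamp and 80 random bits into 26 Crockford base32 characters. It is an illustration of the layout, not a replacement for a maintained ULID library: the function names are hypothetical and it does not implement the monotonic-within-a-millisecond behaviour some libraries offer.

```python
import os
import time

# Crockford base32 alphabet: digits plus uppercase letters without I, L, O, U.
CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"

def encode_ulid(millis: int, randomness: bytes) -> str:
    """Encode a 48-bit timestamp and 80 random bits as 26 base32 characters."""
    if len(randomness) != 10:
        raise ValueError("randomness must be exactly 10 bytes (80 bits)")
    value = (millis << 80) | int.from_bytes(randomness, "big")
    chars = []
    for _ in range(26):
        chars.append(CROCKFORD[value & 0x1F])  # take the lowest 5 bits
        value >>= 5
    return "".join(reversed(chars))

def new_ulid() -> str:
    return encode_ulid(time.time_ns() // 1_000_000, os.urandom(10))

print(new_ulid())
```

Because the alphabet is in ascending ASCII order and the timestamp occupies the most significant bits, plain string comparison sorts these IDs by creation time.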
5. Generating UUIDs in 8 languages
Always use a cryptographically secure random number generator. Math.random() in JavaScript, java.util.Random, and rand() in C are not safe — they have predictable internal state and can produce collisions. The platform UUID functions below all use cryptographic RNGs.
JavaScript / TypeScript (browser, Node 14.17+)
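// v4: built in. Browsers and recent Node releases expose a global `crypto`;
// on older Node (14.17+), use: const { randomUUID } = require('crypto');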
const id = crypto.randomUUID();
// "550e8400-e29b-41d4-a716-446655440000"
// v7 — implement with crypto.getRandomValues
function uuidv7() {
const ts = BigInt(Date.now());
const rand = crypto.getRandomValues(new Uint8Array(10));
const b = new Uint8Array(16);
for (let i = 0; i < 6; i++) b[i] = Number((ts >> BigInt(40 - i*8)) & 0xffn);
b.set(rand, 6);
b[6] = (b[6] & 0x0f) | 0x70; // version 7
b[8] = (b[8] & 0x3f) | 0x80; // variant 10
const h = [...b].map(x => x.toString(16).padStart(2, '0')).join('');
return `${h.slice(0,8)}-${h.slice(8,12)}-${h.slice(12,16)}-${h.slice(16,20)}-${h.slice(20)}`;
}
Python 3.6+ (v4) and 3.14+ (v7 native)
import uuid
uuid.uuid4() # v4
uuid.uuid7() # v7, added in Python 3.14
uuid.uuid5(uuid.NAMESPACE_URL, "https://example.com/users/42") # v5
# For Python < 3.14, install the `uuid7` PyPI package.
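If you would rather not add a dependency on an interpreter that lacks `uuid.uuid7()`, the layout is small enough to build by hand. This is a minimal sketch that mirrors the JavaScript version above. The name `uuid7_fallback` is hypothetical, and like the JavaScript snippet it makes no attempt to keep IDs monotonic within the same millisecond.

```python
import os
import time
import uuid

def uuid7_fallback() -> uuid.UUID:
    """Build a v7 UUID: 48-bit Unix-millis timestamp, then random bits."""
    millis = time.time_ns() // 1_000_000
    raw = bytearray(millis.to_bytes(6, "big") + os.urandom(10))
    raw[6] = (raw[6] & 0x0F) | 0x70  # set the version nibble to 7
    raw[8] = (raw[8] & 0x3F) | 0x80  # set the variant bits to 10
    return uuid.UUID(bytes=bytes(raw))

generated = uuid7_fallback()
print(generated, generated.version)
```

The top 48 bits can be read back with `generated.int >> 80`, which is handy for debugging when an ID was created.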
Go (google/uuid v1.6.0+)
import "github.com/google/uuid"
id4 := uuid.New() // v4
id7, _ := uuid.NewV7() // v7
Rust (uuid crate)
use uuid::Uuid;
let id = Uuid::new_v4(); // requires the `v4` feature flag
let v7 = Uuid::now_v7(); // requires the `v7` feature flag
Java 17+
import java.util.UUID;
UUID id = UUID.randomUUID(); // v4
// v7 — needs a library, e.g. com.github.f4b6a3:uuid-creator
import com.github.f4b6a3.uuid.UuidCreator;
UUID v7 = UuidCreator.getTimeOrderedEpoch();
PHP 8 / Symfony UID
use Symfony\Component\Uid\Uuid;
$v4 = Uuid::v4();
$v7 = Uuid::v7();
Ruby
require 'securerandom'
SecureRandom.uuid # v4
# v7 — needs a gem like `uuidx`
require 'uuidx'
Uuidx.v7
SQL — PostgreSQL 13+ / 18
-- v4 (built in since PG 13)
SELECT gen_random_uuid();
-- v7: built in on PG 18
SELECT uuidv7();
-- v7 on older versions: the pg_uuidv7 extension provides uuid_generate_v7()
6. Storing UUIDs in your database
The single most common mistake is storing UUIDs as VARCHAR(36). That is the canonical string representation, but it costs 36 bytes per row plus index overhead. The native binary types are 16 bytes — less than half the size. At a billion rows, that is a 20 GB difference on disk plus proportional memory savings.
| Database | Native UUID type | Storage size | Notes |
|---|---|---|---|
| PostgreSQL | uuid | 16 bytes | Built-in. Use gen_random_uuid() or uuidv7(). |
| MySQL 8.0+ | BINARY(16) | 16 bytes | Use UUID_TO_BIN(uuid, 1) for v1 byte reorder. |
| SQL Server | uniqueidentifier | 16 bytes | Use NEWSEQUENTIALID() for sortable. |
| SQLite | BLOB | 16 bytes | No native type. Store binary; convert on read. |
| Oracle | RAW(16) | 16 bytes | Custom function for generation. |
| MongoDB | UUID BSON subtype 4 | 16 bytes | Use the driver's UUID type, not strings. |
| DynamoDB | String (S) | ~36 bytes | Stored as canonical string. Costs more but works with key conditions. |
| Cassandra | uuid / timeuuid | 16 bytes | Native types; timeuuid = v1 with sortability. |
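As an example of the "store binary; convert on read" pattern from the SQLite row, here is a minimal sketch using Python's built-in `sqlite3` module with an in-memory database. The `users` table and its columns are hypothetical.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id BLOB PRIMARY KEY, name TEXT NOT NULL)")

user_id = uuid.uuid4()
# Write the 16 raw bytes rather than the 36-character string form.
conn.execute("INSERT INTO users (id, name) VALUES (?, ?)", (user_id.bytes, "ada"))

stored, name = conn.execute("SELECT id, name FROM users").fetchone()
restored = uuid.UUID(bytes=stored)  # convert back on read

print(len(stored), len(str(user_id)))  # 16 36
print(restored == user_id)             # True
```

The same shape applies to `BINARY(16)` in MySQL or `RAW(16)` in Oracle: convert at the data-access boundary and keep the canonical string for logs and APIs.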
If your stack involves passing UUIDs through JSON APIs, the JSON Formatter tool helps verify the canonical string is preserved across the round trip — particularly for languages that may stringify UUIDs differently from what the receiver expects.
7. Common mistakes that ship to production
- Treating UUIDs as secrets. A UUID is an identifier, not a token. Anyone who sees one can use it to address the resource it points to. Pair UUIDs with proper authentication, and never use them as session tokens, API keys, or password reset links.
- Storing UUIDs as `VARCHAR(36)`. 36 bytes per row instead of 16 wastes disk, memory, and index space. Use the native binary type.
- Using v1 in 2026. The MAC-address bytes leak which physical machine generated the ID. v7 gives you the same time-ordering benefit without the leak.
- Validating UUIDs with the wrong regex. `/^[0-9a-f-]{36}$/` accepts garbage like `aaaa-aaaa-...`. The correct shape requires the version digit and variant nibble in their specific positions: `/^[0-9a-f]{8}-[0-9a-f]{4}-[1-8][0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}$/i`. Note that this deliberately rejects the nil and max UUIDs.
- Generating v4 UUIDs with `Math.random()`. The Stack Overflow snippet `"xxxxxxxx-xxxx-4xxx-yxxx-xxxxxxxxxxxx".replace(...)` uses `Math.random()`, which is not cryptographically secure. Use `crypto.randomUUID()`, available in secure contexts in all major browsers since early 2022.
- Embedding UUIDs in user-facing URLs. Long, ugly, non-memorable. Keep UUIDs as the internal ID and expose a slug or short hash for shareable links.
- Sorting v4 UUIDs and expecting time order. v4 is uniformly random; the result is meaningless ordering. Use v7 if you need this.
- Using both standard Base64 (`+` and `/`) and URL-safe Base64 (`-` and `_`) in the same system. They are different encodings; one will fail validation in the other context.
- Copy-pasting UUIDs without trimming whitespace. A trailing newline character will fail strict validation. Always trim before parsing.
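Two of the mistakes above, loose regexes and untrimmed input, can be handled in one small helper. This Python sketch is illustrative: `parse_uuid` is a hypothetical name, and the version class `[1-8]` is a choice made here so that v8 IDs pass. Narrow it if you only accept specific versions.

```python
import re
import uuid
from typing import Optional

UUID_PATTERN = re.compile(
    r"[0-9a-f]{8}-[0-9a-f]{4}-[1-8][0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}",
    re.IGNORECASE,
)

def parse_uuid(text: str) -> Optional[uuid.UUID]:
    """Trim whitespace, check the canonical shape, then parse. None if invalid."""
    candidate = text.strip()
    if not UUID_PATTERN.fullmatch(candidate):
        return None
    return uuid.UUID(candidate)

print(parse_uuid("550e8400-e29b-41d4-a716-446655440000\n"))  # parsed despite the newline
print(parse_uuid("a" * 36))                                   # None
```

Returning `None` instead of raising keeps the helper easy to use at API boundaries, where an invalid ID usually maps to a 400 or 404 response.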
8. A decision matrix for picking the right ID format
The shortest path to an answer for most teams:
| Scenario | Pick | Why |
|---|---|---|
| New service, primary key in Postgres / MySQL / SQL Server | UUIDv7 (native binary column) | Time-locality, no MAC leak, RFC-standard. |
| New service, you want shorter URL-friendly IDs | ULID | 26 chars, time-ordered, no hyphens. |
| Legacy system that already uses UUIDv4 | Stay with v4 unless write perf hurts | Migration cost vs benefit. |
| Sharing IDs in URLs visible to users (short links) | NanoID | Configurable length, opaque, secure. |
| Privacy-sensitive: do not want creation time leaked | UUIDv4 or CUID2 | v7 reveals timestamp; v4/CUID2 do not. |
| Deterministic ID derived from a name (DNS, URL, etc.) | UUIDv5 | SHA-1 based, idempotent generation. |
| Logs and event IDs | UUIDv7 or ULID or KSUID | Time-ordered makes range queries cheap. |
| You control the entire system and have a sequence service | Snowflake or autoincrement | Smaller IDs, but requires coordination. |
| Storing UUIDs in MongoDB | UUID with BSON subtype 4 (driver type) | Native binary; do not store strings. |
9. Authoritative references