Engineering Guide

The Complete Guide to UUIDs in 2026

A practitioner's reference covering RFC 9562, all eight UUID versions (v1 through v8), the index-fragmentation problem, ULID and NanoID alternatives, database integration patterns, and language-by-language code. Written for backend engineers picking an ID format for production.

Last updated: May 2026 · Written by Anees Ur Rehman, full-stack developer · ~6,500 words · 25-minute read

What a UUID actually is — anatomy of 128 bits
The eight UUID versions, explained
UUIDv4 vs UUIDv7 — the index fragmentation deep dive
UUID vs ULID vs NanoID vs CUID2
Generating UUIDs in 8 languages
Storing UUIDs in your database
Common mistakes that ship to production
A decision matrix for picking the right ID format
Authoritative references


1. What a UUID actually is — anatomy of 128 bits

A UUID (Universally Unique Identifier) is 128 bits. That is the entire definition. Everything else — the canonical hyphenated format, the version digit, the variant nibble, the algorithms for generating one — is convention layered on top of those 128 bits.

The canonical string form is 32 hexadecimal characters split by hyphens into five groups: xxxxxxxx-xxxx-Mxxx-Nxxx-xxxxxxxxxxxx. The total string length is 36 characters. Two specific positions encode metadata about which UUID flavor you are looking at:

Position 13: the "M" character is the version digit. Counting hex digits only, it is the 13th; in the 36-character hyphenated string it is the 15th character. 4 means random (v4), 7 means time-ordered (v7), 1 means timestamp+MAC, etc. This single nibble tells parsers which generation algorithm produced the ID.
Position 17: the "N" character carries the variant. Counting hex digits only, it is the 17th; in the hyphenated string it is the 20th character. Its two high bits are 10 in binary, so the hex character is one of 8, 9, a, or b. This identifies the UUID as conforming to RFC 4122/9562 (the "Leach-Salz" variant), as opposed to the older NCS variant or the Microsoft GUID variant.
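
The two positions can be checked directly. The following is a minimal sketch using Python's standard `uuid` module; the sample value is the same illustrative UUID used later in this guide, and the variable names are made up for the example.

```python
import uuid

sample = "550e8400-e29b-41d4-a716-446655440000"

# Counting hex digits only (hyphens removed), M is digit 13 and N is digit 17.
hex_digits = sample.replace("-", "")
version_digit = hex_digits[12]
variant_digit = hex_digits[16]

# In the 36-character hyphenated string the same nibbles sit at
# zero-based indexes 14 and 19.
assert sample[14] == version_digit and sample[19] == variant_digit

# The variant lives in the top two bits of N; binary 10 marks the
# RFC 4122/9562 layout.
variant_bits = int(variant_digit, 16) >> 2

parsed = uuid.UUID(sample)
print(version_digit, variant_digit, bin(variant_bits))
print(parsed.version, parsed.variant)
```

The standard library reaches the same answer through `parsed.version` and `parsed.variant`, so hand-slicing the string is only needed when no UUID parser is available.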

The version field takes 4 bits and the variant field takes 2, so a v4 UUID has 122 bits of randomness, not 128. The classic collision math you have probably seen, "you would need to generate 2.71 quintillion UUIDs to have a 50% chance of one collision", is the birthday-paradox approximation n ≈ 1.177 × √(2^122) ≈ 2.71 × 10^18 applied to those 122 bits.

For practical purposes that means v4 collisions are not a real concern at ordinary workloads. Generating a million UUIDs per second, every second, for a hundred years produces about 3.2 × 10^15 IDs, and the odds of even one collision among them are roughly 1 in a million. You would need to sustain a billion per second for about 86 years to reach the 50% mark. This "uniqueness without coordination" property is what makes UUIDs useful connective tissue in distributed systems: any node can mint an identifier for a charge, an upload, a session, or a message without asking a central authority first.
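
The birthday arithmetic is easy to reproduce. The sketch below assumes the usual approximation p ≈ 1 − exp(−n² / (2 · 2^bits)); the function names are illustrative, not part of any library.

```python
import math

RANDOM_BITS = 122  # random bits in a v4 UUID

def collision_probability(n: float, bits: int = RANDOM_BITS) -> float:
    """Birthday approximation: p ~= 1 - exp(-n^2 / (2 * 2^bits))."""
    # expm1 keeps precision when the probability is very small.
    return -math.expm1(-(n * n) / (2.0 * 2.0 ** bits))

def ids_for_probability(p: float, bits: int = RANDOM_BITS) -> float:
    """Invert the approximation: how many IDs until the collision odds reach p."""
    return math.sqrt(-2.0 * (2.0 ** bits) * math.log1p(-p))

SECONDS_PER_CENTURY = 100 * 365.25 * 24 * 3600

# IDs needed for a 50% chance of at least one collision.
print(f"{ids_for_probability(0.5):.3e}")

# Collision odds after a century at one million IDs per second.
print(f"{collision_probability(1e6 * SECONDS_PER_CENTURY):.2e}")
```

Plugging in your own generation rate and retention period is a quick way to sanity-check whether 122 random bits (or the 74 in v7) leave enough headroom.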

2. The eight UUID versions, explained

RFC 9562 (published May 2024, obsoleting RFC 4122) defines eight UUID versions, v1 through v8, plus the special Nil and Max UUIDs. Most engineers only need two or three of them in production, but knowing the full menu helps when you encounter old systems or read decisions made by previous teams.

Version	Mechanism	Sortable?	Privacy concern	Use it for
v1	60-bit timestamp + 14-bit clock seq + 48-bit MAC address	Partially	Leaks server MAC address and creation time	Legacy systems only — avoid in new code.
v2	DCE Security (POSIX UID/GID embedded)	Partially	Same as v1, plus user info leak	Defined for completeness but essentially unused. Skip.
v3	MD5 hash of namespace + name	No	Deterministic — knowing the inputs reveals the UUID	Legacy content-addressed IDs. Use v5 instead — MD5 is broken for security purposes.
v4	122 random bits	No	None	The default for the past 20 years. Still acceptable for primary keys at small-to-medium scale.
v5	SHA-1 hash of namespace + name	No	Deterministic	Stable IDs derived from a name — DNS, URL, OID, X.500 DN, custom namespace.
v6	v1 with the timestamp bytes reordered for sortability	Yes	Same as v1 (MAC leak)	Migration path off v1. Most teams jump straight to v7 instead.
v7	48-bit Unix-millis timestamp + 74 bits of entropy	Yes	Reveals creation time only	The new default for primary keys, log IDs, anywhere you want both uniqueness and time-locality.
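v8	Custom format (vendor-defined)	Depends on the layout	Depends on the layout	Experimental or vendor-specific layouts. RFC 9562 fixes only the version and variant bits and leaves the other 122 bits to the implementer.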

For a new project in 2026, the choice tree is short: pick v7 if your runtime supports it, fall back to v4 if it does not, use v5 if you need a deterministic ID derived from a name. v1, v2, v3, v6 all have better replacements and should not appear in new code.
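
That choice tree can be written down in a few lines. This is a sketch under stated assumptions: `new_primary_key`, `id_for_name`, and the `example.com` namespace are hypothetical names for illustration, and the lookup of `uuid.uuid7` is defensive because only newer Python releases ship it.

```python
import uuid

# uuid.uuid7 exists only in newer Python releases, so look it up
# defensively and fall back to uuid4 when it is missing.
_time_ordered = getattr(uuid, "uuid7", None)

def new_primary_key() -> uuid.UUID:
    """v7 when the runtime supports it, otherwise v4."""
    return _time_ordered() if _time_ordered is not None else uuid.uuid4()

# Hypothetical application namespace, itself derived from a DNS name.
APP_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "example.com")

def id_for_name(name: str) -> uuid.UUID:
    """Deterministic v5 ID: the same name always maps to the same UUID."""
    return uuid.uuid5(APP_NAMESPACE, name)

print(new_primary_key().version)
print(id_for_name("users/42") == id_for_name("users/42"))
```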

3. UUIDv4 vs UUIDv7 — the index fragmentation deep dive

The single biggest reason teams have been migrating from v4 to v7 over the past two years is database write performance at scale. To understand why, you need a basic mental model of how relational databases store primary keys.

Postgres, MySQL/InnoDB, and SQL Server all store primary key indexes as B-trees (or B+ trees in InnoDB's clustered case). When IDs arrive in roughly sorted order — autoincrement integers, ULIDs, UUIDv7 — every new insert lands at the tail of the index, the page stays full, and the cache hit rate is high. Random IDs — UUIDv4 — have no temporal locality. Every insert lands in a different B-tree page. Pages split, parent pages rewrite, and the working set blows out of memory.
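
The locality difference shows up even in a toy model. The sketch below uses a sorted Python list as a stand-in for the leaf level of a B-tree (a simplification, not a real index) and counts how many inserts land at the right-hand end. The helper names and the simulated clock are assumptions made for the example.

```python
import bisect
import os
import uuid

def v7_style_key(unix_ms: int) -> bytes:
    """16 bytes laid out like UUIDv7: 48-bit millisecond timestamp, then random bits."""
    raw = bytearray(unix_ms.to_bytes(6, "big") + os.urandom(10))
    raw[6] = (raw[6] & 0x0F) | 0x70  # version 7
    raw[8] = (raw[8] & 0x3F) | 0x80  # variant 10
    return bytes(raw)

def tail_insert_fraction(keys) -> float:
    """Fraction of inserts that land at the right-hand end of a sorted index."""
    index = []
    tail_hits = 0
    for key in keys:
        pos = bisect.bisect_left(index, key)
        tail_hits += pos == len(index)
        index.insert(pos, key)
    return tail_hits / len(index)

N = 2_000
START_MS = 1_700_000_000_000  # arbitrary start; one key per simulated millisecond
v7_keys = [v7_style_key(START_MS + i) for i in range(N)]
v4_keys = [uuid.uuid4().bytes for _ in range(N)]

print("v7-style tail inserts:", tail_insert_fraction(v7_keys))
print("v4 tail inserts:      ", tail_insert_fraction(v4_keys))
```

With one key per millisecond every time-ordered insert appends to the end, while random keys almost never do. A real engine adds page splits, caching, and WAL on top, so treat the numbers as an intuition aid rather than a benchmark.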

Figures commonly reported by teams that have run this comparison (treat them as rough orders of magnitude, not guarantees):

Postgres: at scale, inserts keyed on v4 can be 2–3× slower than inserts keyed on sequential IDs. WAL volume increases, the buffer cache thrashes, and vacuum has more work.
MySQL/InnoDB: the primary key is the clustered index, so this hits doubly hard. Random inserts cause page splits and can bloat the on-disk size by 30–60% compared to sequential keys. The standard mitigation, available since MySQL 8.0, is UUID_TO_BIN(uuid, 1), which reorders v1 timestamp bytes for B-tree locality. UUIDv7 does this natively.
SQL Server: the NEWSEQUENTIALID() function exists precisely because random GUIDs hurt at scale. v7 gives you the same property cross-platform.

The fix order, ranked by impact:

UUIDv7 preserves uniqueness while putting the high-order bits in time order. New insert performance approaches autoincrement.
ULID — same idea, different encoding (Crockford base32 instead of hex). Shorter, URL-friendly, time-ordered.
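Store v4 UUIDs as native binary: BINARY(16) in MySQL, uuid in Postgres, uniqueidentifier in SQL Server. This more than halves disk and memory size compared to VARCHAR(36).
MySQL legacy: UUID_TO_BIN(uuid, 1) if you cannot move to v7. The swap flag only helps v1 UUIDs, since it reorders their timestamp fields; it does nothing useful for v4.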
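
To see why the MySQL swap flag helps v1, the byte reorder can be mimicked in Python. This sketch follows the documented behavior (the first and third hex groups trade places, so the slow-moving timestamp bits come first); it is not MySQL's code, and `v1_from_ticks` and `swap_time_parts` are hypothetical helpers written for the illustration.

```python
import uuid

def v1_from_ticks(ticks: int, clock_seq: int = 0, node: int = 0x0123456789AB) -> uuid.UUID:
    """Build a v1 UUID from a 60-bit count of 100 ns ticks (illustrative helper)."""
    time_low = ticks & 0xFFFFFFFF
    time_mid = (ticks >> 32) & 0xFFFF
    time_hi_version = ((ticks >> 48) & 0x0FFF) | 0x1000      # version 1
    clock_seq_hi_variant = ((clock_seq >> 8) & 0x3F) | 0x80  # variant 10
    return uuid.UUID(fields=(time_low, time_mid, time_hi_version,
                             clock_seq_hi_variant, clock_seq & 0xFF, node))

def swap_time_parts(u: uuid.UUID) -> bytes:
    """Put time_hi and time_mid in front of time_low, as the swap flag is documented to do."""
    b = u.bytes
    return b[6:8] + b[4:6] + b[0:4] + b[8:]

BASE_TICKS = 0x1EC9414C232AB00  # arbitrary starting timestamp
STEP = 2 ** 31                  # large step so time_low wraps around
ids = [v1_from_ticks(BASE_TICKS + i * STEP) for i in range(8)]

plain = [u.bytes for u in ids]
swapped = [swap_time_parts(u) for u in ids]

print("plain bytes sorted by creation time:  ", plain == sorted(plain))
print("swapped bytes sorted by creation time:", swapped == sorted(swapped))
```

In the plain layout the fastest-changing 32 bits lead the key, so byte order and creation order diverge as soon as that field wraps. After the swap the two orders agree, which is the property v7 has by construction.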

If you are starting a green-field service, default to v7. If you have a v4-based table that is now causing write performance issues, the migration path is to add a v7 column, dual-write for a period, then switch reads.


4. UUID vs ULID vs NanoID vs CUID2

The broader "distributed unique ID" category contains several formats that are not strictly UUIDs but compete for the same role. Picking among them is a matter of trading off length, sortability, URL friendliness, and ecosystem support.

Format	Bits of randomness	Sortable?	Length	Standard?	Best fit
UUIDv4	122	No	36 chars (32 hex + 4 hyphens)	RFC 4122 / 9562	Maximum interop with old systems.
UUIDv7	74 + 48-bit time	Yes	36 chars	RFC 9562	New databases — strongly recommended.
ULID	80 + 48-bit time	Yes	26 chars (Crockford base32)	De-facto, no formal RFC	Shorter than UUIDv7, URL-friendly, same time-locality.
NanoID (default)	126 (21 chars × 6 bits)	No	21 chars (configurable)	De-facto	Short opaque IDs in URLs and shareable links.
CUID2	~124 at most (24 base36 chars, SHA-3 derived)	No	24 chars (configurable)	De-facto	Privacy-conscious systems; designed to resist enumeration.
Snowflake (Twitter)	0 (41-bit time + 10-bit machine + 12-bit sequence)	Yes	~19 chars (numeric)	Twitter spec	Single-organization with strict ID coordination.
KSUID	128 + 32-bit time	Yes	27 chars	Segment.io spec	Logs, events; Segment's preferred format.

Practical guidance:

Default to UUIDv7 for primary keys. The ecosystem support is broadest and the format is RFC-standardized.
ULID if you want shorter strings (26 vs 36 chars) and your stack does not yet have v7 support. ULIDs are time-ordered and URL-safe.
NanoID for short opaque tokens visible to users (8–12 character "share" IDs). 128 bits of entropy is overkill there.
CUID2 for cases where sequential ID enumeration is a privacy concern — its derivation is intentionally non-temporal.
UUIDv4 if you need broad interop with legacy systems that may not parse v7 yet.
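
A quick way to compare string formats is to compute how many bits a given length and alphabet can hold at most. The sketch below does only that arithmetic; the alphabet sizes for CUID2 (base36) and KSUID (base62) are assumptions stated here rather than taken from the table, and the result is an upper bound, since time-ordered formats spend part of it on the timestamp.

```python
import math

def capacity_bits(length: int, alphabet_size: int) -> float:
    """Upper bound on the information a string of this shape can carry."""
    return length * math.log2(alphabet_size)

formats = {
    "NanoID default (21 chars, 64 symbols)": (21, 64),
    "NanoID short (10 chars, 64 symbols)": (10, 64),
    "ULID (26 chars, Crockford base32)": (26, 32),
    "CUID2 default (24 chars, base36)": (24, 36),
    "KSUID (27 chars, base62)": (27, 62),
}

for name, (length, size) in formats.items():
    print(f"{name}: {capacity_bits(length, size):.1f} bits")
```

The same function shows what shortening a NanoID costs: each character removed from a 64-symbol alphabet gives up 6 bits.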

5. Generating UUIDs in 8 languages

Always use a cryptographically secure random number generator. Math.random() in JavaScript, java.util.Random, and rand() in C are not safe — they have predictable internal state and can produce collisions. The platform UUID functions below all use cryptographic RNGs.
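
The predictability problem is simple to demonstrate. In this sketch, `weak_uuid4` is a deliberately bad, hypothetical generator built on Python's seedable `random` module; anyone who knows or recovers the seed can reproduce every ID it emits.

```python
import random
import uuid

def weak_uuid4(rng: random.Random) -> uuid.UUID:
    """v4-shaped UUID from a non-cryptographic PRNG. Do not use in production."""
    return uuid.UUID(int=rng.getrandbits(128), version=4)

# Two generators with the same seed produce the same "random" UUID.
first = weak_uuid4(random.Random(1234))
second = weak_uuid4(random.Random(1234))
print(first == second)

# uuid.uuid4() draws from the operating system's CSPRNG instead.
safe_a, safe_b = uuid.uuid4(), uuid.uuid4()
print(safe_a == safe_b)
```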

JavaScript / TypeScript (browser, Node 14.17+)

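// v4: built in. Browsers expose `crypto` as a global (secure contexts only);
// on Node versions without the global, import { randomUUID } from 'node:crypto'.
const id = crypto.randomUUID();
// e.g. "550e8400-e29b-41d4-a716-446655440000"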

// v7 — implement with crypto.getRandomValues
function uuidv7() {
  const ts = BigInt(Date.now());
  const rand = crypto.getRandomValues(new Uint8Array(10));
  const b = new Uint8Array(16);
  for (let i = 0; i < 6; i++) b[i] = Number((ts >> BigInt(40 - i*8)) & 0xffn);
  b.set(rand, 6);
  b[6] = (b[6] & 0x0f) | 0x70;   // version 7
  b[8] = (b[8] & 0x3f) | 0x80;   // variant 10
  const h = [...b].map(x => x.toString(16).padStart(2, '0')).join('');
  return `${h.slice(0,8)}-${h.slice(8,12)}-${h.slice(12,16)}-${h.slice(16,20)}-${h.slice(20)}`;
}

Python 3.6+ (v4) and 3.14+ (v7 native)

import uuid
uuid.uuid4()              # v4
uuid.uuid7()              # v7, Python 3.14+
uuid.uuid5(uuid.NAMESPACE_URL, "https://example.com/users/42")  # v5

# For Python < 3.14, install the `uuid7` PyPI package.
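
For interpreters without a native `uuid7()`, the same layout as the JavaScript version above can be sketched with the standard library alone. `uuid7_sketch` and `unix_ms_of` are illustrative names, and this minimal version makes no attempt to keep IDs monotonic within a single millisecond.

```python
import os
import time
import uuid
from typing import Optional

def uuid7_sketch(unix_ms: Optional[int] = None) -> uuid.UUID:
    """48-bit Unix-millisecond timestamp followed by random bits, tagged as v7."""
    if unix_ms is None:
        unix_ms = time.time_ns() // 1_000_000
    raw = bytearray(unix_ms.to_bytes(6, "big") + os.urandom(10))
    raw[6] = (raw[6] & 0x0F) | 0x70  # version 7
    raw[8] = (raw[8] & 0x3F) | 0x80  # variant 10
    return uuid.UUID(bytes=bytes(raw))

def unix_ms_of(u: uuid.UUID) -> int:
    """Read the creation time back out of a v7 UUID."""
    return int.from_bytes(u.bytes[:6], "big")

u = uuid7_sketch()
print(u, u.version, unix_ms_of(u))
```

`unix_ms_of` also makes the privacy trade-off concrete: anyone holding a v7 ID can recover when it was created.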

Go (google/uuid v1.6.0+)

import "github.com/google/uuid"

id4 := uuid.New()                     // v4
id7, _ := uuid.NewV7()                // v7

Rust (uuid crate)

use uuid::Uuid;

let id = Uuid::new_v4();   // requires the `v4` feature flag
let v7 = Uuid::now_v7();   // requires the `v7` feature flag

Java 17+

import java.util.UUID;

UUID id = UUID.randomUUID();          // v4

// v7 — needs a library, e.g. com.github.f4b6a3:uuid-creator
import com.github.f4b6a3.uuid.UuidCreator;
UUID v7 = UuidCreator.getTimeOrderedEpoch();

PHP 8 / Symfony UID

use Symfony\Component\Uid\Uuid;

$v4 = Uuid::v4();
$v7 = Uuid::v7();

Ruby

require 'securerandom'
SecureRandom.uuid                 # v4

# v7: Ruby 3.3+ has SecureRandom.uuid_v7 built in
SecureRandom.uuid_v7
# On older Rubies, use a gem like `uuidx`
require 'uuidx'
Uuidx.v7

SQL — PostgreSQL 13+ / 18

-- v4 (built in since PG 13)
SELECT gen_random_uuid();

-- v7: built in from PG 18
SELECT uuidv7();
-- On older versions, the pg_uuidv7 extension provides uuid_generate_v7() instead

6. Storing UUIDs in your database

The single most common mistake is storing UUIDs as VARCHAR(36). That is the canonical string representation, but it costs 36 bytes per row plus index overhead. The native binary types are 16 bytes — less than half the size. At a billion rows, that is a 20 GB difference on disk plus proportional memory savings.

Database	Native UUID type	Storage size	Notes
PostgreSQL	`uuid`	16 bytes	Built-in. Use `gen_random_uuid()` or `uuidv7()`.
MySQL 8.0+	`BINARY(16)`	16 bytes	Use `UUID_TO_BIN(uuid, 1)` for v1 byte reorder.
SQL Server	`uniqueidentifier`	16 bytes	Use `NEWSEQUENTIALID()` for sortable.
SQLite	`BLOB` (16 bytes)	16 bytes	No native type. Store binary; convert on read.
Oracle	`RAW(16)`	16 bytes	Custom function for generation.
MongoDB	`UUID` BSON subtype 4	16 bytes	Use the driver's UUID type, not strings.
DynamoDB	String (S)	~36 bytes	Stored as canonical string. Costs more but works with key conditions.
Cassandra	`uuid` / `timeuuid`	16 bytes	Native types; `timeuuid` = v1 with sortability.
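
The size difference and the lossless round trip between the two representations can be checked in a few lines. This is a minimal sketch with Python's standard `uuid` module; the row count mirrors the billion-row example above.

```python
import uuid

u = uuid.uuid4()

as_text = str(u)       # canonical form, what VARCHAR(36) would hold
as_binary = u.bytes    # what BINARY(16), RAW(16), or a 16-byte BLOB would hold

# The binary form converts back to the same UUID with no loss.
restored = uuid.UUID(bytes=as_binary)

ROWS = 1_000_000_000
saved_bytes = (len(as_text) - len(as_binary)) * ROWS

print(len(as_text), len(as_binary), restored == u)
print(saved_bytes / 1e9, "GB saved on the key column alone")
```

Most database drivers perform this conversion for you when the column uses a native type; the manual form matters mainly for SQLite-style BLOB columns.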

If your stack involves passing UUIDs through JSON APIs, the JSON Formatter tool helps verify the canonical string is preserved across the round trip — particularly for languages that may stringify UUIDs differently from what the receiver expects.

7. Common mistakes that ship to production

Treating UUIDs as secrets. A UUID is an identifier, not a token. Anyone who sees one can use it to address the resource it points to. Pair UUIDs with proper authentication — never use them as session tokens, API keys, or password reset links.
Storing UUIDs as VARCHAR(36). 36 bytes per row instead of 16 — wastes disk, memory, and index space. Use the native binary type.
Using v1 in 2026. The MAC-address bytes leak which physical machine generated the ID. v7 gives you the same time-ordering benefit without the leak.
Validating UUIDs with the wrong regex. /^[0-9a-f-]{36}$/ accepts any 36-character mix of hex digits and hyphens, such as aaaa-aaaa-... or 36 hyphens in a row. The correct shape requires the version digit and variant nibble in their specific positions: /^[0-9a-f]{8}-[0-9a-f]{4}-[1-8][0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}$/i. This pattern deliberately rejects the Nil and Max UUIDs; allow them explicitly if your system uses them.
Generating v4 UUIDs with Math.random(). The Stack Overflow snippet "xxxxxxxx-xxxx-4xxx-yxxx-xxxxxxxxxxxx".replace(...) uses Math.random(), which is not cryptographically secure. Use crypto.randomUUID(), which all major browsers have shipped since early 2022 (secure contexts only).
Embedding UUIDs in user-facing URLs. Long, ugly, non-memorable. Keep UUIDs as the internal ID and expose a slug or short hash for shareable links.
Sorting v4 UUIDs and expecting time order. v4 is uniformly random; the result is meaningless ordering. Use v7 if you need this.
Mixing standard Base64 (+ and /) with URL-safe Base64 (- and _) in the same system. They are different alphabets; a value encoded with one will fail validation in the other context.
Copy-pasting UUIDs without trimming whitespace. A trailing newline character will fail strict validation. Always trim before parsing.
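
The validation and whitespace items can be combined into one small parsing helper. The sketch below is one possible shape, not a canonical implementation: `parse_uuid` is a hypothetical name, and the pattern assumes you want to accept versions 1 through 8 with the RFC variant and reject everything else.

```python
import re
import uuid

# Strict shape: version digit 1-8 and variant nibble 8, 9, a, or b.
UUID_RE = re.compile(
    r"[0-9a-f]{8}-[0-9a-f]{4}-[1-8][0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}",
    re.IGNORECASE,
)

# The loose pattern from the list above, kept only for comparison.
LOOSE_RE = re.compile(r"[0-9a-f-]{36}", re.IGNORECASE)

def parse_uuid(text: str) -> uuid.UUID:
    """Trim surrounding whitespace, then require the canonical shape."""
    candidate = text.strip()
    if not UUID_RE.fullmatch(candidate):
        raise ValueError(f"not a canonical UUID: {candidate!r}")
    return uuid.UUID(candidate)

print(parse_uuid("550e8400-e29b-41d4-a716-446655440000\n"))
print(bool(LOOSE_RE.fullmatch("-" * 36)), bool(UUID_RE.fullmatch("-" * 36)))
```

Using `fullmatch` avoids a subtle trap: in Python, a `$` anchor also matches just before a trailing newline, so an untrimmed value could slip past a pattern that looks strict.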

8. A decision matrix for picking the right ID format

The shortest path to an answer for most teams:

Scenario	Pick	Why
New service, primary key in Postgres / MySQL / SQL Server	UUIDv7 (native binary column)	Time-locality, no MAC leak, RFC-standard.
New service, you want shorter URL-friendly IDs	ULID	26 chars, time-ordered, no hyphens.
Legacy system that already uses UUIDv4	Stay with v4 unless write perf hurts	Migration cost vs benefit.
Sharing IDs in URLs visible to users (short links)	NanoID	Configurable length, opaque, secure.
Privacy-sensitive: do not want creation time leaked	UUIDv4 or CUID2	v7 reveals timestamp; v4/CUID2 do not.
Deterministic ID derived from a name (DNS, URL, etc.)	UUIDv5	SHA-1 based, idempotent generation.
Logs and event IDs	UUIDv7 or ULID or KSUID	Time-ordered makes range queries cheap.
You control the entire system and have a sequence service	Snowflake or autoincrement	Smaller IDs, but requires coordination.
Storing UUIDs in MongoDB	UUID with BSON subtype 4 (driver type)	Native binary; do not store strings.