What Is a UUID? How v4 Works, Collision Probability, and When to Use It

A UUID (Universally Unique Identifier) is an identifier that can be generated independently in many places, without any central ID-issuing server, and still not collide in practice. It is a 128-bit value written as 36 characters in the 8-4-4-4-12 form, and it is widely used for primary keys in distributed systems, correlation IDs in logs, and more. This article covers UUID notation, the differences between versions, the structure of v4, collision probability, and where to use UUIDs.

The bottom line first: for a general-purpose unique identifier with no particular constraints, random-based v4 is the default; when you want values ordered by creation time, such as a database primary key, time-ordered v7 is the basic choice. Both rely on being generated with a cryptographically secure random source (CSPRNG).

1. What a UUID is — 128 bits and the 8-4-4-4-12 notation

A UUID is a 128-bit (16-byte) value. To make it easy for people to read and write, the 32 hexadecimal digits are separated by hyphens into five blocks of 8-4-4-4-12, for 36 characters in total.

Example: 550e8400-e29b-41d4-a716-446655440000

Each character is one hexadecimal digit (4 bits = 1 nibble).
The hyphens are separators for readability, not part of the data itself.
The original standard is RFC 4122; its successor, RFC 9562 (2024), obsoletes it and adds v6, v7, and v8.
In Microsoft technologies the same concept is called a GUID (sometimes written with curly braces).

Of the 128 bits, those at specific positions are reserved to indicate the version and the variant. The rest carry the actual data (random bits in the case of v4).
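
As a quick check of the notation, the sketch below parses the example value with Python's standard uuid module and prints its sizes and reserved fields. Python is used here only for illustration; the variable names are arbitrary.

```python
import uuid

# The example value from this section, parsed with the standard library.
u = uuid.UUID("550e8400-e29b-41d4-a716-446655440000")

print(len(str(u)))    # characters in the hyphenated form
print(len(u.hex))     # hexadecimal digits without hyphens
print(len(u.bytes))   # bytes in the raw value
print(u.version)      # read from the first digit of the third block
print(u.variant == uuid.RFC_4122)  # read from the top bits of the fourth block

# The GUID spelling with curly braces (and uppercase digits) parses to the same value.
guid = uuid.UUID("{550E8400-E29B-41D4-A716-446655440000}")
print(guid == u)
```

The hyphens and braces are dropped during parsing, which matches the point that they are presentation rather than data.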

2. The different UUID versions

UUIDs come in several versions, which differ in what the value is derived from. Here are the representative ones.

Version	Derived from	Characteristics / notes
v1	Time + node (MAC address, etc.)	Time-ordered, but embeds the MAC, raising privacy concerns
v3	Namespace + name (MD5)	Always produces the same UUID from the same input (deterministic)
v4	Random numbers	The most common. 122 bits are random. Simple to implement and unbiased
v5	Namespace + name (SHA-1)	The SHA-1 counterpart of v3. Suited to deterministic generation
v7	Time (milliseconds) + random	Added in RFC 9562. Sorts by creation time at millisecond granularity; good for DB keys

If in doubt, it is enough to remember: use v4 for a general identifier with no particular constraints, and v7 when you want ordering, such as for a database primary key.
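
The difference between the name-based and random versions can be seen in a few lines. This is a minimal sketch using Python's standard uuid module; the name "example.com" is an arbitrary illustrative input.

```python
import uuid

name = "example.com"  # illustrative input only

# Name-based versions: the same namespace and name always give the same UUID.
v3_a = uuid.uuid3(uuid.NAMESPACE_DNS, name)  # MD5-based
v5_a = uuid.uuid5(uuid.NAMESPACE_DNS, name)  # SHA-1-based
v5_b = uuid.uuid5(uuid.NAMESPACE_DNS, name)

# Random version: two calls give different values with overwhelming probability.
v4_a = uuid.uuid4()
v4_b = uuid.uuid4()

print(v3_a.version, v5_a.version, v4_a.version)
print(v5_a == v5_b)  # deterministic
print(v4_a == v4_b)  # independent random values
```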

3. The structure of v4 — 122 random bits, version, variant

In v4, 6 of the 128 bits are fixed and the remaining 122 are random. The fixed bits come in two groups.

Version nibble: the first character of the third block is always 4 (0100 in binary). It indicates that this is v4.
Variant bits: the first character of the fourth block is one of 8, 9, a, or b (the top two bits are 10). It indicates the RFC 4122 variant.

In other words, among the 36 characters, a 4 and one of 8/9/a/b always appear at specific positions. Let us confirm the positions with an example.

Example: f47ac10b-58cc-4372-a567-0e02b2c3d479

The third block 4372 begins with 4 → version = 4.
The fourth block a567 begins with a → variant = RFC 4122 (top two bits 10).

Conversely, a UUID whose third block does not begin with 4 is not v4. The version nibble (4 bits) and the variant bits (2 bits) account for 6 fixed bits in total, leaving the remaining 122 bits random.
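
To make the bit layout concrete, here is a minimal sketch that builds a v4 value by hand: draw 16 random bytes from a CSPRNG, then overwrite the version nibble and the variant bits. The function name make_uuid4 is hypothetical; in practice the standard library's uuid.uuid4() already does this.

```python
import secrets
import uuid

def make_uuid4() -> uuid.UUID:
    """Hand-rolled v4 for illustration: 16 CSPRNG bytes with 6 bits overwritten."""
    b = bytearray(secrets.token_bytes(16))
    b[6] = (b[6] & 0x0F) | 0x40  # version nibble: high 4 bits of byte 6 become 0100
    b[8] = (b[8] & 0x3F) | 0x80  # variant bits: high 2 bits of byte 8 become 10
    return uuid.UUID(bytes=bytes(b))

u4 = make_uuid4()
text = str(u4)
print(text)
print(text[14])  # first character of the third block
print(text[19])  # first character of the fourth block
```

Byte 6 is the start of the third block and byte 8 is the start of the fourth, which is why the 4 and the 8/9/a/b digit land at string positions 14 and 19.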

4. Collision probability — reasoning with the birthday problem

"If it is random, won't the same value eventually come up?" is a natural question. We can estimate this within the framework of the birthday problem (the probability that a pair sharing a birthday appears).

The random space of v4 has 2^122 (about 5.3×10³⁶) possibilities. The birthday-problem approximation gives the collision probability for n values drawn from a space of size N as p ≈ 1 - e^(-n²/(2N)). Collisions therefore become realistically probable roughly when n reaches the square root of the space, that is, around 2^61 (about 2.3×10¹⁸, or 2.3 quintillion); at that point p is about 39%.

Put differently, even at one billion UUIDs per second, it would take about 73 years to generate 2^61 of them.
For the number of IDs an ordinary application handles (millions to billions), the collision probability is negligible in practice: for one billion UUIDs it is on the order of 10⁻¹⁹.
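
These figures can be reproduced with the standard birthday approximation p ≈ 1 - exp(-n²/(2N)), where N = 2^122. The sketch below is illustrative; the function names are hypothetical, and the approximation assumes uniformly random values.

```python
import math

SPACE = 2 ** 122  # number of distinct v4 values (122 random bits)

def collision_probability(n: int, space: int = SPACE) -> float:
    """Birthday approximation: p ≈ 1 - exp(-n^2 / (2 * space))."""
    # expm1 keeps precision when the probability is extremely small.
    return -math.expm1(-(n * n) / (2 * space))

def years_to_generate(n: int, per_second: float) -> float:
    """Time needed to generate n values at a constant rate, in years."""
    return n / per_second / (365.25 * 24 * 3600)

print(collision_probability(10 ** 9))    # one billion UUIDs
print(collision_probability(2 ** 61))    # square root of the space
print(years_to_generate(2 ** 61, 1e9))   # at one billion UUIDs per second
```

Using expm1 rather than 1 - exp(x) matters here: for a billion UUIDs the exponent is so close to zero that the naive form would round to exactly 0.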

There is a prerequisite, however. This estimate assumes the random numbers are uniformly distributed and unpredictable. With a poor random source, the values become biased and collide far sooner than estimated. That is why the secure random source (CSPRNG) described next is critical.

Do not create UUIDs with Math.random(). JavaScript's Math.random() is not intended for cryptographic use; it is predictable and can be biased. In the browser, use crypto.randomUUID(), or crypto.getRandomValues() where that is unavailable; in Node.js, use the crypto module.
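
The same caution applies outside JavaScript. As a Python analogue (an illustration, not part of the JavaScript advice above), the sketch below shows why a general-purpose PRNG is a poor source: anyone who knows or recovers the seed can reproduce the "random" UUID. The function weak_uuid4 is a hypothetical anti-example.

```python
import random
import uuid

def weak_uuid4(seed: int) -> uuid.UUID:
    """Anti-example: a v4-shaped UUID from a seeded, non-cryptographic PRNG."""
    rng = random.Random(seed)
    # version=4 makes the constructor set the version and variant bits.
    return uuid.UUID(int=rng.getrandbits(128), version=4)

# Same seed, same UUID: the value is fully predictable.
print(weak_uuid4(1234) == weak_uuid4(1234))

# Preferred: uuid.uuid4(), which CPython implements on top of os.urandom().
safe = uuid.uuid4()
print(safe)
```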

5. When to use it — distributed systems, correlation IDs, file names

The strength of a UUID is that it lets you create a unique value independently, without centralized issuance. Here are concrete use cases.

Primary keys in distributed systems: because multiple nodes or the client side can decide the ID up front, you can create records without waiting for a server round trip.
Correlation IDs (trace IDs): assign a UUID to a single request and trace logs across microservices end to end.
File names / object keys: convenient for avoiding collisions in uploaded file names. Note, however, that they are merely hard to guess and are not secret values.
Be cautious about API-key-like uses: a UUID is an identifier, not a credential. Access tokens should use dedicated random values with sufficient entropy, and you should avoid basing authorization on a UUID.
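
The last two points can be sketched side by side: a UUID for naming an uploaded object, and a separate high-entropy value for anything that acts as a credential. The function names and the "uploads/" prefix are hypothetical, chosen only for this example.

```python
import secrets
import uuid
from pathlib import PurePosixPath

def object_key(original_name: str) -> str:
    """Collision-resistant storage key that keeps the original file extension."""
    suffix = PurePosixPath(original_name).suffix.lower()
    return f"uploads/{uuid.uuid4()}{suffix}"

def new_access_token() -> str:
    """Credential-style value: 32 bytes from the OS CSPRNG, URL-safe encoded."""
    return secrets.token_urlsafe(32)

print(object_key("Holiday Photo.PNG"))
print(new_access_token())
```

Keeping the two helpers separate reflects the distinction in the text: the key only needs to be unique, while the token needs to be secret and unguessable.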

6. v4 vs v7 — index locality

When you use a UUID as a database primary key, the difference between v4 and v7 affects performance. Because v4 is completely random, each insert lands at an unrelated position, so B-Tree index pages across the whole index are touched (causing fragmentation and reduced cache efficiency). Because v7 leads with a millisecond timestamp, its values sort by creation time, so new rows cluster near the tail of the index, which gives it high locality.

Aspect	v4 (random)	v7 (time + random)
Ordering	Irregular (random)	Increasing with creation time (millisecond granularity)
Index locality	Low (inserts scattered)	High (concentrated near the tail)
Leakage of creation time	Does not leak	Can be inferred from the prefix
Main use	General-purpose unique identifier	DB primary keys / time-series keys
Standard	RFC 4122 / 9562	RFC 9562

If it is fine for the ordering to be externally visible and you want to prioritize insert performance, choose v7; if you want to hide even the creation time, or you simply want values spread uniformly, choose v4. Pick by use case.


Frequently Asked Questions (FAQ)

Will UUID v4 values really never collide?

There is no absolute guarantee against duplicates, but because v4 carries 122 random bits, collisions are effectively negligible in practice. Even reasoning with the birthday problem, you would need to generate on the order of 2^61 (about 2.3 quintillion) UUIDs before collisions become realistically probable, which ordinary systems never reach. The key prerequisite is generating them with a cryptographically secure random source (CSPRNG).

Should I use v4 or v7?

It depends on the use case. For a general-purpose unique identifier, random-based v4 is a safe default. However, when you want values to be ordered by creation time, such as a database primary key, v7 is better because it places a millisecond timestamp at the front. Because v7 values sort by creation time, inserts cluster at the tail of a B-Tree index, which improves insert performance and reduces page splits.

Are UUIDs created with a secure random source?

It depends on the implementation. The quality of v4 is directly tied to the quality of the random source, so always use a cryptographically secure pseudo-random number generator (CSPRNG). In the browser that means crypto.randomUUID() or crypto.getRandomValues(); in Node.js it is the crypto module. Math.random() is predictable and unsuitable for cryptographic purposes, so it must not be used to generate UUIDs.
