@whitequark Most applications that use UUIDs these days tend to use version 7 UUIDs, which use milliseconds since the Unix epoch as the most significant bits. This embeds the creation timestamp in the ID and allows for sorting, while also adding in sufficient randomness so they're not incremental and multiple systems can generate them with low probability of collision.
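A minimal Python sketch of the version 7 layout described above, assuming the RFC 9562 field order: a 48-bit Unix millisecond timestamp, the version nibble, 12 random bits, the two variant bits, then 62 more random bits. The names `uuid7_sketch` and `timestamp_ms` are made up for illustration and are not part of any library.

```python
import os
import time
import uuid


def uuid7_sketch(unix_ms=None):
    """Build a version 7 UUID: 48-bit Unix millisecond timestamp, then random bits."""
    if unix_ms is None:
        unix_ms = time.time_ns() // 1_000_000
    rand = int.from_bytes(os.urandom(10), "big")  # 80 random bits, 74 are used
    rand_a = (rand >> 62) & 0xFFF                 # 12 bits after the version nibble
    rand_b = rand & ((1 << 62) - 1)               # 62 bits after the variant bits
    value = (unix_ms & ((1 << 48) - 1)) << 80     # timestamp in the top 48 bits
    value |= 0x7 << 76                            # version nibble = 7
    value |= rand_a << 64
    value |= 0b10 << 62                           # RFC variant bits
    value |= rand_b
    return uuid.UUID(int=value)


def timestamp_ms(u):
    """Recover the embedded Unix millisecond timestamp from a v7 UUID."""
    return u.int >> 80
```

Because the timestamp occupies the most significant bits, IDs created in different milliseconds sort by creation time, both as integers and as strings. Two IDs from the same millisecond are ordered only by their random bits in this sketch.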
@ramsey oh, I didn't know this!
@whitequark @ramsey I've only ever seen version 4 (unstructured random) in production that i can recall
@whitequark @ramsey (this is the first I've ever heard of v7)
@azonenberg @whitequark @ramsey
V7 is fairly new, standardised in 2024 (RFC 9562). It's seen a bit more adoption in databases over the last year.
@intrbiz @azonenberg @whitequark @ramsey Yeah, postgres 18 added support for generating them IIRC, though UUIDs are inherently "mostly backwards compatible" unless you're trying to parse them for some godforsaken reason, so older versions support it just fine if the client generates them.
They make for much happier indexes and sharding vs the ones with leading-random, because most workloads don't have truly random access patterns...
Indeed I did a talk at POSETTE last year talking about encoding information into UUIDs and some of the index issues.
IMHO you can have more fun than just encoding generation time into them.
@intrbiz @becomethewaifu Version 8 is probably more suited to those use-cases, though.
Indeed, my code was generating UUIDs marked as V8. The version is just a nibble that's been standardised. And it's handy to have a standardised version number for custom generation schemes.
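A rough sketch of what "marked as V8" means in practice, assuming the RFC 9562 version 8 layout: only the version nibble and the two variant bits are fixed, and the remaining 122 bits (48 + 12 + 62) are up to the application. The day-bucket-plus-shard scheme in `bucketed_id` is purely hypothetical and is not the scheme from the talk.

```python
import os
import uuid


def uuid8_sketch(custom_a, custom_b, custom_c):
    """Pack three caller-defined fields around the fixed version and variant bits."""
    if not 0 <= custom_a < (1 << 48):
        raise ValueError("custom_a must fit in 48 bits")
    if not 0 <= custom_b < (1 << 12):
        raise ValueError("custom_b must fit in 12 bits")
    if not 0 <= custom_c < (1 << 62):
        raise ValueError("custom_c must fit in 62 bits")
    value = custom_a << 80       # 48 application-defined bits
    value |= 0x8 << 76           # version nibble = 8
    value |= custom_b << 64      # 12 application-defined bits
    value |= 0b10 << 62          # RFC variant bits
    value |= custom_c            # 62 application-defined bits
    return uuid.UUID(int=value)


def bucketed_id(day_bucket, shard):
    """Hypothetical scheme: coarse day bucket, shard number, then random bits."""
    random_bits = int.from_bytes(os.urandom(8), "big") >> 2  # keep 62 bits
    return uuid8_sketch(day_bucket, shard, random_bits)
```

Putting the coarse bucket in the leading bits keeps IDs from the same period close together in an index without exposing a precise creation time.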
@kw217 @intrbiz One reason not to use version 1 is that it leaks details about the system (i.e., the MAC address). Another reason is that the values aren’t sortable. Version 6 was introduced to solve this. It’s also based on 100-nanosecond intervals since the Gregorian epoch, but it’s sortable and uses random bytes following the timestamp, rather than the MAC address.
But, for most purposes, version 7 is the right solution, unless you need to create UUIDs for dates earlier than 1970.
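To illustrate why version 1 doesn't sort and version 6 does: both carry the same 60-bit count of 100-nanosecond intervals since the Gregorian epoch, but version 1 stores the low bits first, while version 6 stores the high bits first. A small sketch of the reordering, assuming the RFC 9562 layouts (`v1_to_v6` is an illustrative name, not a library function). Note that it carries the clock sequence and node bits over unchanged, whereas a fresh version 6 generator would typically put random bits there.

```python
import uuid


def v1_to_v6(u1):
    """Reorder a v1 UUID's timestamp so the most significant bits come first."""
    if u1.version != 1:
        raise ValueError("expected a version 1 UUID")
    # Reassemble the 60-bit timestamp from the three v1 fields.
    ts = ((u1.time_hi_version & 0x0FFF) << 48) | (u1.time_mid << 32) | u1.time_low
    value = (ts >> 12) << 80              # top 48 bits of the timestamp
    value |= 0x6 << 76                    # version nibble = 6
    value |= (ts & 0xFFF) << 64           # low 12 bits of the timestamp
    value |= u1.int & ((1 << 64) - 1)     # variant, clock sequence, node unchanged
    return uuid.UUID(int=value)


v1 = uuid.UUID("c232ab00-9414-11ec-b3c8-9f6bdeced846")
print(v1_to_v6(v1))  # 1ec9414c-232a-6b00-b3c8-9f6bdeced846
```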
Valid concern in some domains for sure. I tend to keep the time buckets pretty big. There are block-based approaches which remove the time-related issues but still reduce issues with things like indexes.
@intrbiz @ramsey @kw217 yeah, fascinatingly, if you allocate IDs at a global scale via sharding, the shards wind up forming a weak proxy for geographic location
this stuff is really hard to get right