Type 3 and Type 5 UUIDs are just a technique of stuffing a hash into a UUID:
- Type 1: stuffs MAC address+datetime into 128 bits
- Type 3: stuffs an MD5 hash into 128 bits
- Type 4: stuffs random data into 128 bits
- Type 5: stuffs an SHA1 hash into 128 bits
- Type 6: unofficial idea for sequential UUIDs
Edit: The formerly unofficial type 6 now has an official RFC (RFC 9562).
An SHA1 hash outputs 160 bits (20 bytes); the result of the hash is converted into a UUID.
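To see why a type 5 UUID has to throw part of its hash away while a type 3 UUID does not, compare the digest sizes. This is only a small sketch using Python's standard hashlib; the variable names are made up for illustration.

```python
import hashlib

UUID_SIZE = 16  # a UUID is 128 bits = 16 bytes

md5_size = hashlib.md5().digest_size    # 16 bytes: exactly the size of a UUID
sha1_size = hashlib.sha1().digest_size  # 20 bytes: 4 bytes too many

# Number of trailing digest bytes a type 5 UUID has to discard
discarded = sha1_size - UUID_SIZE
print(md5_size, sha1_size, discarded)
```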
With a 20-byte digest from SHA1:
SHA1 Digest: 74738ff5 5367 e958 1aee 98fffdcd1876 94028007
UUID (v5): 74738ff5-5367-5958-9aee-98fffdcd1876
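- the first hex digit of the third group (e958 becomes 5958) is the high nibble of byte 6; it is set to 5, to indicate type 5
- the first hex digit of the fourth group (1aee becomes 9aee) holds the first two bits of byte 8; they are set to 1 and 0, respectively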
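The conversion of that example digest can be reproduced in a few lines. This is a sketch in Python; digest_to_uuid5 is a made-up helper name, and the digest is simply the example value from above, not the hash of any particular input.

```python
import uuid

def digest_to_uuid5(digest: bytes) -> uuid.UUID:
    """Turn a 20-byte SHA1 digest into a type 5 UUID (hypothetical helper)."""
    raw = bytearray(digest[:16])     # keep only the first 128 bits
    raw[6] = (raw[6] & 0x0F) | 0x50  # high nibble of byte 6 becomes 5
    raw[8] = (raw[8] & 0x3F) | 0x80  # top two bits of byte 8 become 1, 0
    return uuid.UUID(bytes=bytes(raw))

digest = bytes.fromhex("74738ff55367e9581aee98fffdcd187694028007")
example = digest_to_uuid5(digest)
print(example)  # 74738ff5-5367-5958-9aee-98fffdcd1876
```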
What do I hash?
You're probably wondering what it is that you're supposed to hash. Basically, you hash the concatenation of a namespace UUID and your string:
sha1(NamespaceUUID+AnyString);
You prefix your string with a so-called namespace to prevent name conflicts.
The UUID RFC pre-defines four namespaces for you:
- NameSpace_DNS: {6ba7b810-9dad-11d1-80b4-00c04fd430c8}
- NameSpace_URL: {6ba7b811-9dad-11d1-80b4-00c04fd430c8}
- NameSpace_OID: {6ba7b812-9dad-11d1-80b4-00c04fd430c8}
- NameSpace_X500: {6ba7b814-9dad-11d1-80b4-00c04fd430c8}
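Most UUID libraries ship these four constants, so you rarely have to type them in yourself. For instance, Python's standard uuid module exposes them as NAMESPACE_DNS, NAMESPACE_URL, NAMESPACE_OID and NAMESPACE_X500. The sketch below just checks them against the values listed above; the predefined dictionary is only there for illustration.

```python
import uuid

# RFC name -> (library constant, value as listed in the RFC)
predefined = {
    "NameSpace_DNS": (uuid.NAMESPACE_DNS, "6ba7b810-9dad-11d1-80b4-00c04fd430c8"),
    "NameSpace_URL": (uuid.NAMESPACE_URL, "6ba7b811-9dad-11d1-80b4-00c04fd430c8"),
    "NameSpace_OID": (uuid.NAMESPACE_OID, "6ba7b812-9dad-11d1-80b4-00c04fd430c8"),
    "NameSpace_X500": (uuid.NAMESPACE_X500, "6ba7b814-9dad-11d1-80b4-00c04fd430c8"),
}

for name, (constant, expected) in predefined.items():
    print(name, constant, str(constant) == expected)
```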
So, you could hash together:
StackOverflowDnsUUID = sha1(Namespace_DNS + "stackoverflow.com");
StackOverflowUrlUUID = sha1(Namespace_URL + "stackoverflow.com");
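In practice the + here means concatenating bytes: the 16 raw bytes of the namespace UUID (not its 36-character text form), followed by the bytes of the name. A sketch in Python, assuming the name is encoded as UTF-8 (which is what Python's own uuid.uuid5 does for str names); hash_name is a hypothetical helper:

```python
import hashlib
import uuid

def hash_name(namespace: uuid.UUID, name: str) -> bytes:
    """SHA1 over namespace bytes followed by name bytes (hypothetical helper)."""
    return hashlib.sha1(namespace.bytes + name.encode("utf-8")).digest()

dns_digest = hash_name(uuid.NAMESPACE_DNS, "stackoverflow.com")
url_digest = hash_name(uuid.NAMESPACE_URL, "stackoverflow.com")

# Same name, different namespace: two unrelated 20-byte digests
print(dns_digest.hex())
print(url_digest.hex())
```

Note that these are still raw 160-bit digests, not UUIDs yet.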
The RFC then defines how to:
- take the 160 bits from SHA1
- and convert it into 128 bits of a UUID
The basic gist is to only take the first 128 bits, stuff a 5 into the version nibble (the high four bits of the time_hi_and_version field), and then set the first two bits of the clock_seq_hi_and_reserved field to 1 and 0, respectively.
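If you want to check that a UUID has those bits set correctly, you can read them back. A sketch using Python's uuid.UUID, whose time_hi_version and clock_seq_hi_variant attributes correspond to the RFC's time_hi_and_version and clock_seq_hi_and_reserved fields; the UUID is the example from earlier.

```python
import uuid

u = uuid.UUID("74738ff5-5367-5958-9aee-98fffdcd1876")

version_nibble = u.time_hi_version >> 12    # top 4 bits of the 16-bit field
variant_bits = u.clock_seq_hi_variant >> 6  # top 2 bits of the 8-bit field

print(version_nibble)               # 5
print(format(variant_bits, "02b"))  # 10
print(u.version, u.variant == uuid.RFC_4122)
```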
More examples
Now that you know how a so-called Name is turned into a UUID, you can write the function (in pseudo-code):
UUID NameToUUID(UUID NamespaceUUID, String Name)
{
    //Note: All code on stackoverflow is public domain - no attribution required.
    Byte[] hash = sha1(NamespaceUUID.ToBytes() + Name.ToBytes());
    UUID result;
    //Copy first 16 bytes of the hash into our UUID result
    Copy(hash, result, 16);
    //set high nibble of byte 6 to 5 to indicate type 5
    result[6] &= 0x0F;
    result[6] |= 0x50;
    //set upper two bits of byte 8 to "10"
    result[8] &= 0x3F;
    result[8] |= 0x80;
    return result;
}
(Note: if your UUID type stores its fields as native integers rather than as a plain byte array, the endianness of your system can affect the indices of the above bytes)
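To see what that note means in practice: the indices 6 and 8 are right for the standard big-endian (network order) layout. If the first three fields are stored as little-endian integers, as in the Windows GUID struct, the byte carrying the version nibble ends up at index 7 instead. Python's uuid.UUID exposes both layouts as bytes and bytes_le, which makes for a quick illustration (the index variables are just for this sketch):

```python
import uuid

u = uuid.UUID("74738ff5-5367-5958-9aee-98fffdcd1876")

big = u.bytes        # all fields in network (big-endian) order
little = u.bytes_le  # first three fields byte-swapped, the rest unchanged

print(big.hex())
print(little.hex())

# The byte holding the version nibble moves; the variant byte does not.
version_index_big = 6
version_index_little = 7
print(hex(big[version_index_big]), hex(little[version_index_little]))
print(hex(big[8]), hex(little[8]))
```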
Now you can have calls:
uuid = NameToUUID(Namespace_DNS, 'www.stackoverflow.com');
uuid = NameToUUID(Namespace_DNS, 'www.google.com');
uuid = NameToUUID(Namespace_URL, 'http://www.stackoverflow.com');
uuid = NameToUUID(Namespace_URL, 'http://www.google.com/search?q=rfc+4122');
uuid = NameToUUID(Namespace_URL, 'http://stackoverflow.com/questions/5515880/test-vectors-for-uuid-version-5-converting-hash-into-guid-algorithm');
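For reference, here is roughly what the pseudo-code above looks like as runnable Python. Treat it as a sketch rather than the canonical implementation: name_to_uuid is just an illustrative name, names are assumed to be UTF-8 text, and each result is compared against the standard library's uuid.uuid5 as a sanity check.

```python
import hashlib
import uuid

def name_to_uuid(namespace: uuid.UUID, name: str) -> uuid.UUID:
    """Build a type 5 UUID by hand (illustrative; uuid.uuid5 does the same)."""
    # Hash the 16 namespace bytes followed by the UTF-8 bytes of the name
    digest = hashlib.sha1(namespace.bytes + name.encode("utf-8")).digest()
    raw = bytearray(digest[:16])     # keep the first 16 of the 20 bytes
    raw[6] = (raw[6] & 0x0F) | 0x50  # version nibble = 5
    raw[8] = (raw[8] & 0x3F) | 0x80  # variant bits = 10
    return uuid.UUID(bytes=bytes(raw))

calls = [
    (uuid.NAMESPACE_DNS, "www.stackoverflow.com"),
    (uuid.NAMESPACE_DNS, "www.google.com"),
    (uuid.NAMESPACE_URL, "http://www.stackoverflow.com"),
]
for namespace, name in calls:
    result = name_to_uuid(namespace, name)
    print(result, result == uuid.uuid5(namespace, name))
```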
Now back to your question
For version 3 and version 5 UUIDs the additional command line arguments namespace and name have to be given. The namespace is either a UUID in string representation or an identifier for internally pre-defined namespace UUIDs (currently known are "ns:DNS", "ns:URL", "ns:OID", and "ns:X500"). The name is a string of arbitrary length.
The namespace is whatever UUID you like. It can be one of the pre-defined ones, or you can make up your own, e.g.:
UUID Namespace_RectalForeignExtractedObject = '8e884ace-bee4-11e4-8dfc-aa07a5b093db'
The name is a string of arbitrary length. It is just the text you want appended to the namespace, then hashed and stuffed into a UUID:
uuid = NameToUUID('8e884ace-bee4-11e4-8dfc-aa07a5b093db', 'screwdriver');
uuid = NameToUUID('8e884ace-bee4-11e4-8dfc-aa07a5b093db', 'toothbrush');
uuid = NameToUUID('8e884ace-bee4-11e4-8dfc-aa07a5b093db', 'broomstick');
uuid = NameToUUID('8e884ace-bee4-11e4-8dfc-aa07a5b093db', 'orange');
uuid = NameToUUID('8e884ace-bee4-11e4-8dfc-aa07a5b093db', 'axe handle');
uuid = NameToUUID('8e884ace-bee4-11e4-8dfc-aa07a5b093db', 'impulse body spray');
uuid = NameToUUID('8e884ace-bee4-11e4-8dfc-aa07a5b093db', 'iPod Touch');
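The same thing with a library call: a sketch using Python's uuid.uuid5 and the made-up namespace from above. The point is that the mapping is deterministic. The same namespace and name always give the same UUID, and different names give different ones.

```python
import uuid

namespace = uuid.UUID("8e884ace-bee4-11e4-8dfc-aa07a5b093db")
names = ["screwdriver", "toothbrush", "broomstick", "orange"]

ids = {name: uuid.uuid5(namespace, name) for name in names}
for name, value in ids.items():
    print(name, value)

# Re-deriving a UUID later needs nothing but the namespace and the name
again = uuid.uuid5(namespace, "screwdriver")
print(again == ids["screwdriver"])
```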