Git uses the following information to generate the SHA-1 of a commit:
- The source tree of the commit (which unravels to all the subtrees and blobs)
- The SHA-1 of the parent commit(s)
- The author info (with timestamp)
- The committer info (yes, those are different; also with timestamp)
- The commit message
(For the complete explanation, look here.)
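To make the list above concrete, here is a minimal Python sketch of how those fields feed into the hash. It assumes a plain commit with no extra headers (no `gpgsig`, no `encoding`) and a UTF-8 message. The function name `commit_sha1` and the sample values are made up for illustration. Git hashes the commit object's text prefixed with a `commit <size>\0` header.

```python
import hashlib


def commit_sha1(tree, parents, author, committer, message):
    """Hash a simple commit object the way Git does (illustrative sketch).

    tree      -- 40-char hex SHA-1 of the root tree
    parents   -- list of parent commit SHA-1s (empty for a root commit)
    author    -- e.g. "Jane Doe <jane@example.com> 1700000000 +0000"
    committer -- same format as author, with its own timestamp
    message   -- commit message, normally ending in a newline
    """
    lines = ["tree " + tree]
    lines += ["parent " + p for p in parents]
    lines.append("author " + author)
    lines.append("committer " + committer)
    # A blank line separates the headers from the message.
    body = ("\n".join(lines) + "\n\n" + message).encode("utf-8")
    # Every Git object is hashed as "<type> <size>\0<content>".
    header = b"commit " + str(len(body)).encode("ascii") + b"\0"
    return hashlib.sha1(header + body).hexdigest()


# Hypothetical sample values: the well-known empty-tree id and made-up identities.
EMPTY_TREE = "4b825dc642cb6eb9a060e54bf8d69288fbee4904"
WHO = "Jane Doe <jane@example.com>"
first = commit_sha1(EMPTY_TREE, [], WHO + " 1700000000 +0000",
                    WHO + " 1700000000 +0000", "Initial commit\n")
# Changing only the committer timestamp yields a different commit id.
second = commit_sha1(EMPTY_TREE, [], WHO + " 1700000000 +0000",
                     WHO + " 1700000001 +0000", "Initial commit\n")
```

This is why amending or rebasing a commit always produces a new SHA-1 even when the content is identical: at least the committer timestamp changes.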
Git does NOT guarantee that the first 4 characters will be unique. In chapter 7 of the Pro Git Book it is written:
Git can figure out a short, unique abbreviation for your SHA-1 values.
If you pass --abbrev-commit to the git log command, the output will
use shorter values but keep them unique; it defaults to using seven
characters but makes them longer if necessary to keep the SHA-1
unambiguous:
So Git just makes the abbreviation as long as necessary to remain unique. They even note that:
Generally, eight to ten characters are more than enough to be unique
within a project.
As an example, the Linux kernel, which is a pretty large project with
over 450k commits and 3.6 million objects, has no two objects whose
SHA-1s overlap more than the first 11 characters.
So in fact Git just relies on the great improbability of two objects having the exact same SHA-1 (or the same first X characters of one).
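The idea of "as long as necessary to remain unique" can be sketched in Python. This is not Git's actual implementation (Git works out the abbreviation per object against its object database); it is a simplified illustration that finds one prefix length that keeps every hash in a given set distinct. The function name `shortest_unique_length` and the sample hashes are hypothetical.

```python
def shortest_unique_length(hashes, minimum=7):
    """Return the smallest prefix length (at least `minimum`) at which
    all of the given distinct hex hashes remain distinguishable."""
    ordered = sorted(set(hashes))
    longest_shared = 0
    # After sorting, the pair sharing the longest common prefix is adjacent.
    for a, b in zip(ordered, ordered[1:]):
        shared = 0
        for x, y in zip(a, b):
            if x != y:
                break
            shared += 1
        longest_shared = max(longest_shared, shared)
    # One character past the longest shared prefix separates every pair.
    return max(minimum, longest_shared + 1)


# Made-up hashes: the first two share their first 8 characters.
sample = [
    "abcdef12aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
    "abcdef12bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb",
    "0123456789012345678901234567890123456789",
]
needed = shortest_unique_length(sample)
```

With these sample values the default of seven characters is not enough, so the result grows to nine, which mirrors how the abbreviation gets longer only when a collision forces it to.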