TL;DR: as mentioned at Git Merge 2019:
git config --global core.commitGraph true
git config --global gc.writeCommitGraph true
cd /path/to/repo
git commit-graph write
Actually (see the end of this answer), the first two config settings are not needed with Git 2.24+ (Q4 2019): they are true by default.
As T4cC0re mentions in the comments:
If you are on git version 2.29 or above you should rather run:
git commit-graph write --reachable --changed-paths
This will pre-compute file paths, so that git log commands that are scoped to files also benefit from this cache.
Git 2.18 (Q2 2018) improves git log performance:
See commit 902f5a2 (24 Mar 2018) by René Scharfe (rscharfe).
See commit 0aaf05b, commit 3d475f4 (22 Mar 2018) by Derrick Stolee (derrickstolee).
See commit 626fd98 (22 Mar 2018) by brian m. carlson (bk2204).
(Merged by Junio C Hamano -- gitster -- in commit 51f813c, 10 Apr 2018)
sha1_name: use bsearch_pack() for abbreviations
When computing abbreviation lengths for an object ID against a single packfile, the method find_abbrev_len_for_pack() currently implements binary search.
This is one of several implementations.
One issue with this implementation is that it ignores the fanout table in the pack-index.
Translate this binary search to use the existing bsearch_pack() method that correctly uses a fanout table.
Due to the use of the fanout table, the abbreviation computation is
slightly faster than before.
For a fully-repacked copy of the Linux repo, the following 'git log' commands improved:
* git log --oneline --parents --raw
Before: 59.2s
After: 56.9s
Rel %: -3.8%
* git log --oneline --parents
Before: 6.48s
After: 5.91s
Rel %: -8.9%
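The fanout idea from the commit message above can be sketched in a few lines of Python. This is only an illustration, not Git's C implementation: the names build_fanout and fanout_search are made up for this example, and object IDs are shortened to two bytes. The fanout table records, for each possible first byte, how many sorted IDs start with that byte or a smaller one, so the binary search only has to examine one narrow slice.

```python
import bisect


def build_fanout(sorted_oids):
    """Return a 256-entry table: fanout[b] = number of IDs whose first byte is <= b."""
    counts = [0] * 256
    for oid in sorted_oids:
        counts[oid[0]] += 1
    fanout, total = [], 0
    for count in counts:
        total += count
        fanout.append(total)
    return fanout


def fanout_search(sorted_oids, fanout, target):
    """Binary-search only the slice of IDs that share target's first byte."""
    first = target[0]
    lo = fanout[first - 1] if first > 0 else 0
    hi = fanout[first]
    pos = bisect.bisect_left(sorted_oids, target, lo, hi)
    found = pos < hi and sorted_oids[pos] == target
    return found, pos


# Toy data: two-byte "object IDs" (an assumption made for brevity).
oids = sorted(bytes.fromhex(h) for h in ["00a1", "00ff", "1b20", "1b21", "fe00"])
fanout = build_fanout(oids)
print(fanout_search(oids, fanout, bytes.fromhex("1b21")))  # (True, 3)
```

When the ID is absent, the returned position is where it would be inserted, which is the information an abbreviation-length computation needs in order to compare against the neighbouring IDs.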
The same Git 2.18 adds a commit graph: it precomputes and stores the information necessary for ancestry traversal in a separate file, to optimize graph walking.
See commit 7547b95, commit 3d5df01, commit 049d51a, commit 177722b, commit 4f2542b, commit 1b70dfd, commit 2a2e32b (10 Apr 2018), and commit f237c8b, commit 08fd81c, commit 4ce58ee, commit ae30d7b, commit b84f767, commit cfe8321, commit f2af9f5 (02 Apr 2018) by Derrick Stolee (derrickstolee).
(Merged by Junio C Hamano -- gitster -- in commit b10edb2, 08 May 2018)
commit: integrate commit graph with commit parsing
Teach Git to inspect a commit graph file to supply the contents of a struct commit when calling parse_commit_gently().
This implementation satisfies all post-conditions on the struct commit, including loading parents, the root tree, and the commit date.
If core.commitGraph is false, then do not check graph files.
In test script t5318-commit-graph.sh, add output-matching conditions on read-only graph operations.
By loading commits from the graph instead of parsing commit buffers, we
save a lot of time on long commit walks.
Here are some performance results for a copy of the Linux repository where 'master' has 678,653 reachable commits and is behind 'origin/master' by 59,929 commits.
| Command | Before | After | Rel % |
|----------------------------------|--------|--------|-------|
| log --oneline --topo-order -1000 | 8.31s | 0.94s | -88% |
| branch -vv | 1.02s | 0.14s | -86% |
| rev-list --all | 5.89s | 1.07s | -81% |
| rev-list --all --objects | 66.15s | 58.45s | -11% |
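To make the source of that speed-up more concrete, here is a rough Python sketch of the two paths described in the commit message. It is a simplification under stated assumptions: the real commit-graph file is a binary format, whereas this example uses a plain dict, and parse_commit_buffer, load_commit and read_object are hypothetical names chosen for the illustration. The use_graph flag plays the role of core.commitGraph.

```python
def parse_commit_buffer(raw):
    """Slow path: parse the header of a textual commit object."""
    tree, parents, date = None, [], None
    for line in raw.split("\n"):
        if line == "":
            break  # a blank line ends the header
        key, _, value = line.partition(" ")
        if key == "tree":
            tree = value
        elif key == "parent":
            parents.append(value)
        elif key == "committer":
            # value looks like: "Name <email> <unix-timestamp> <timezone>"
            date = int(value.rsplit(" ", 2)[1])
    return {"tree": tree, "parents": parents, "date": date}


def load_commit(oid, graph, read_object, use_graph=True):
    """Fast path: take tree, parents and date from the precomputed graph if present."""
    if use_graph and oid in graph:
        return graph[oid]
    return parse_commit_buffer(read_object(oid))


# Toy object store and precomputed graph (illustrative data only).
raw_objects = {
    "c2": "tree t2\nparent c1\nauthor A <a@example.com> 1522000000 +0000\n"
          "committer A <a@example.com> 1522000000 +0000\n\nsecond commit\n",
}
graph = {"c2": {"tree": "t2", "parents": ["c1"], "date": 1522000000}}

fast = load_commit("c2", graph, raw_objects.__getitem__)
slow = load_commit("c2", graph, raw_objects.__getitem__, use_graph=False)
print(fast == slow)  # True
```

Both paths yield the same three fields, but the fast path never reads or parses the commit object, which is the saving that adds up over a long commit walk.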
To learn more about the commit graph, see "How does 'git log --graph' work?".
The same Git 2.18 (Q2 2018) adds lazy loading of trees.
The code has been taught to use the duplicated information stored
in the commit-graph file to learn the tree object name for a commit
to avoid opening and parsing the commit object when it makes sense
to do so.
See commit 279ffad (30 Apr 2018) by SZEDER Gábor (szeder).
See commit 7b8a21d, commit 2e27bd7, commit 5bb03de, commit 891435d (06 Apr 2018) by Derrick Stolee (derrickstolee).
(Merged by Junio C Hamano -- gitster -- in commit c89b6e1, 23 May 2018)
commit-graph: lazy-load trees for commits
The commit-graph file provides quick access to commit data, including
the OID of the root tree for each commit in the graph. When performing
a deep commit-graph walk, we may not need to load most of the trees
for these commits.
Delay loading the tree object for a commit loaded from the graph until requested via get_commit_tree().
Do not lazy-load trees for commits not in the graph, since that requires duplicate parsing and the relative performance improvement when trees are not needed is small.
On the Linux repository, performance tests were run for the following
command:
git log --graph --oneline -1000
Before: 0.92s
After: 0.66s
Rel %: -28.3%
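The lazy-loading pattern behind that improvement might look roughly like this in Python. It is a sketch, not Git's code: GraphCommit, walk_oids and the load_tree callback are hypothetical names, and only the accessor name get_commit_tree is borrowed from the commit message. The commit remembers the tree's OID from the graph and fetches the tree object only when someone asks for it.

```python
class GraphCommit:
    """A commit loaded from a commit-graph-like index (illustrative only)."""

    def __init__(self, oid, tree_oid, parents, load_tree):
        self.oid = oid
        self.parents = parents
        self._tree_oid = tree_oid    # known up front from the graph
        self._load_tree = load_tree  # hypothetical callback: tree OID -> tree object
        self._tree = None            # not loaded yet

    def get_commit_tree(self):
        """Load the tree on first request, then reuse it."""
        if self._tree is None:
            self._tree = self._load_tree(self._tree_oid)
        return self._tree


def walk_oids(commits, start):
    """Follow first parents from start without ever touching a tree."""
    oid = start
    while oid is not None:
        yield oid
        parents = commits[oid].parents
        oid = parents[0] if parents else None
```

A history walk such as git log --graph --oneline only needs parents and dates, so in this model it never triggers load_tree at all; a command that shows diffs would call get_commit_tree() and pay the cost only for the commits it actually displays.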
Git 2.21 (Q1 2019) improves the loose object cache.
See commit 8be88db (07 Jan 2019), and commit 4cea1ce, commit d4e19e5, commit 0000d65 (06 Jan 2019) by René Scharfe (rscharfe).
(Merged by Junio C Hamano -- gitster -- in commit eb8638a, 18 Jan 2019)
object-store: use one oid_array per subdirectory for loose cache
The loose objects cache is filled one subdirectory at a time as needed.
It is stored in an oid_array, which has to be resorted after each add operation.
So when querying a wide range of objects, the partially filled array needs to be resorted up to 255 times, which takes over 100 times longer than sorting once.
Use one oid_array for each subdirectory.
This ensures that entries have to only be sorted a single time. It also avoids eight binary search steps for each cache lookup as a small bonus.
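As a rough model of that change, the following Python sketch keeps one sorted list per first byte instead of a single shared array. It is an assumption-laden illustration rather than Git's oid_array code: LooseCache and the list_subdir callback are hypothetical, and IDs are plain bytes objects. Each bucket is filled and sorted once, on first use, and because the first byte (8 bits) already selects the bucket, each lookup searches a list that is about 256 times smaller, which is where the "eight binary search steps" saving comes from.

```python
import bisect


class LooseCache:
    """One lazily filled, sorted list of object IDs per first byte."""

    def __init__(self, list_subdir):
        # Hypothetical callback: first byte (0..255) -> iterable of object IDs.
        self._list_subdir = list_subdir
        self._buckets = [None] * 256

    def _bucket(self, first_byte):
        bucket = self._buckets[first_byte]
        if bucket is None:
            # Filled and sorted exactly once; other buckets are never resorted.
            bucket = sorted(self._list_subdir(first_byte))
            self._buckets[first_byte] = bucket
        return bucket

    def contains(self, oid):
        bucket = self._bucket(oid[0])
        i = bisect.bisect_left(bucket, oid)
        return i < len(bucket) and bucket[i] == oid
```

With a single shared array, adding the contents of a new subdirectory invalidates the ordering of everything already cached; with per-subdirectory buckets, new entries never disturb the ones that are already sorted.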
The cache is used for collision checks for the log placeholders %h, %t and %p, and we can see the change speeding them up in a repository with ca. 100 objects per subdirectory:
$ git count-objects
26733 objects, 68808 kilobytes
Test HEAD^