My aim is to be able to both build recent versions of, and contribute to, a project that has a long and voluminous history - and to do this without using local storage to duplicate lots of historic branches and history going back a decade and more (which I can always look up in the web UI of the project's central repository anyway, if I ever need to, which I probably won't).
I seemed to get lucky with my first try:
git clone --depth 40 -b master http://github.com/who/what.git/ what
That gave me a tidy local clone of 'what' that only had the 'master
' branch, and enough commits to cover the most recent two tagged releases.
I could then do 'git checkout latest-release-tag
' and build the latest release. Yippee!
As I sort of expected, I needed to make a patch. Everything went swimmingly: 'git checkout -b my-patch-branch
', make my changes, commit, and I was able to push my-patch-branch back to a clone on github so the project could pull it. Easy!
I think I got lucky there because from what I read, e.g., here, I would not have been able to do that before git 1.9.
But the installed version turned out to be 1.9, so I got away with it.
Now the next obvious thing I'd like to do is fetch from the remote and pick up the most recent activity on master (including the upstream merge of my patch, so I won't need that branch any more). I tried 'git fetch --dry-run upstream' and watched in horror as it ticked off endless megabytes of download and then gave me a list of new tags going back to the age of mastodons. I'm glad I said --dry-run
!
I was really hoping it would just pick up the dozen or so new commits on 'master
' since the HEAD of my clone, and then maybe I'd have a depth 52 clone instead of 40, but that's kind of what I want ... start with a useful amount of recent history before I became involved, then just track and grow from that point, and be able to build, branch, and push patches. It seems so close.
Is there any simple way to make git do what I'm trying to do? Is what I'm trying to do unreasonable?
Edit: a bit more information.
(1) the upsteam is actually ahead of me by closer to a hundred commits, my estimate of a dozen was pulled out of the air.
(2) it turns out the original 40 commits that I got with my clone were all single-parent commits. A bunch of the later ones that I'm trying to fetch are merge commits with a second parent in some branch my clone didn't include. Could those be causing git to pull in all their ancient history because the earliest commit in my clone isn't a common ancestor?
Is there a way to tell it I don't want that?
More new information:
(1) it occurred to me that I was using the http protocol earlier, which doesn't actually interact with a git process on the server so it has no opportunity to tailor the download size.
However, when I retried using git-over-ssh, I still got a huge fetch.
Then
(2) manually, like an animal, I clicked through the merge commits shown in github's 'newtork' display, found the ones involving branches that began before my shallow cut, and added their parent-2 SHAs to my .git/shallow
file, and then tried 'git fetch' over ssh again. That worked great and downloaded a tiny pack file that could just fast-forward my local master
branch. I think this is exactly the operation I'd like git to be able to do automatically, but I haven't found a way to do it. Manually, it's pretty tedious. :)
See Question&Answers more detail:
os