Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
655 views
in Technique[技术] by (71.8m points)

github - git pull multiple remotes in parallel

I have a repo with thousands of remotes, and I'd like to pull from thousands of remotes at the same time, ideally I can specify a maximum number to do at the same time.

I wasn't able to find anything related to this in the manpages, google, or git-scm online.

To be perfectly clear: I do not want to run one command over multiple repos, I have one repo with thousands of remotes.

This has nothing to do with submodules, don't talk about submodules. Submodules are unrelated to git remotes.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

I'm pretty sure you have to write your own code to do this.

As CodeWizard says in a comment, Git needs to lock parts of the repository. Some of these locks are bound to collide at times, if you simply run multiple git fetch processes in parallel within a single repository.

You might also want some kind of remote-ordering strategy since, e.g., collecting from remoteA, remoteB, and remoteC in parallel may discover 10000 common objects on remoteB as compared to the other two if remoteB is generally (but not always) a superset of remoteA and remoteC.1 While this also applies to sequential git fetch operations, it becomes considerably less important. Suppose, for example, that there are 5000 objects—some commits, some trees, and some blobs—on A that you do not yet have, 5000 others on C, and all 10000 on B. If you fetch sequentially, in any order, you pick up either 5k, then 5k, then 0; or 10k, then 0, then 0; because by the time you move to the next remote, you have collected and stored the 5k or 10k incoming objects. But if you do all three in parallel, you will bring 5k, 5k, and 10k objects in, and only then discover that you have doubled your workload.


1If B is always a superset, simply go to B first (sequentially), then go to A and C in parallel solely for their references, which will point to objects you now have.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...