Remove Large Files in Git
TL;DR: BFG is your friend. For example:
java -jar bfg.jar -b 50M myrepo-bfg.git
The other day I committed a rather large file by mistake (a generated movie of the git commit history). This was a bit annoying because all the other developers would suddenly have to check out a 1G file :( with no LFS support...
While I was taking the opportunity to clean up the repository, I also noticed that the size of the cloned repo was rather huge! So, I was presented with several options:
- Create a new repo with the cleaned version (i.e. lose the history)
- Try to migrate commits selectively
- Remove the large file (and other large-ish files)
Initially, I wanted to start with the second variant, but making the cuts and patching things around was difficult (well, active development, a bunch of commits in the meantime...). So, the third option looked promising. For this, you need the excellent BFG tool.
My workflow was something like this:
git clone --mirror http://.../myrepo.git
git clone --mirror myrepo.git myrepo-bfg.git
java -jar bfg.jar -b 50M myrepo-bfg.git
cd myrepo-bfg.git
git reflog expire --expire=now --all && git gc --prune=now --aggressive
git push origin
i.e.,
- Clone the original repo as a mirror
- Make a copy (in case I mess up)
- Run bfg to remove all large files
- Expire the reflog and do a garbage collection to prune the tree
- Push the result back
This way, I got rid of all big files. Cool!
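To double-check that the big files are really gone, it helps to list the largest blobs still reachable in the history. Here is a minimal sketch using plain git plumbing. The function name largest_blobs and its two arguments are made up for illustration; they are not part of git or BFG.

```shell
#!/bin/sh
# Hypothetical helper: print the largest blobs reachable from any ref,
# biggest first, as "size path" lines.
#   $1 = path to the repository (bare or not)
#   $2 = how many entries to show (defaults to 10)
largest_blobs() (
    repo=$1
    count=${2:-10}
    # rev-list emits "sha path"; cat-file adds type and size for each object.
    git -C "$repo" rev-list --objects --all |
        git -C "$repo" cat-file --batch-check='%(objecttype) %(objectsize) %(rest)' |
        # Keep only blobs and drop the leading type column.
        awk '$1 == "blob" { sub(/^blob /, ""); print }' |
        sort -rn |
        head -n "$count"
)
```

For example, largest_blobs myrepo-bfg.git 5 shows the five biggest blobs, and git count-objects -vH run inside the repo reports the overall pack size. If a large file still shows up, it may live in the latest commit, which BFG leaves untouched by default.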
Important note: This process will permanently delete stuff from git, as if it had never been there. USE IT WITH CARE (aka I'm not responsible if you wipe out your company's code base!)
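Since there is no undo, an extra safety net on top of the mirror copy does not hurt. One option (a sketch, not something from my original workflow) is to snapshot every ref into a single git bundle file before running bfg. The helper name backup_repo and the file names below are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: save all refs of a repo into one bundle file
# before rewriting history.
#   $1 = path to the repository
#   $2 = path of the bundle file to create
backup_repo() (
    repo=$1
    bundle=$2
    # git -C changes directory, so make the bundle path absolute first.
    case $bundle in
        /*) ;;
        *) bundle=$PWD/$bundle ;;
    esac
    # Write the bundle, then let git check that it is complete and readable.
    git -C "$repo" bundle create "$bundle" --all &&
        git -C "$repo" bundle verify "$bundle"
)
```

Something like backup_repo myrepo.git myrepo-backup.bundle would be run before the bfg step; if things go wrong, git clone myrepo-backup.bundle myrepo-restored gets the old history back.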
HTH,