Remove Large Files in Git
TL;DR: BFG is your friend.
For example: java -jar bfg.jar -b 50M myrepo-bfg.git
The other day I committed a rather large file by mistake (a generated movie of the git commit history). This was a bit annoying because all the other developers would suddenly have to check out a 1 GB file :( with no LFS support…
While I was cleaning up the repository anyway, I noticed that the cloned repo was rather huge! So, I was presented with several options:
- Create a new repo with the cleaned version (i.e. lose the history)
- Try to migrate commits selectively
- Remove the large file (and other largish files)
Initially, I wanted to start with the second variant, but making the cuts and patching things around was difficult (well, active development, a bunch of commits in the meantime…). So, the third option looked promising. For this, you need the excellent BFG tool.
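Before picking a size threshold for BFG, it can help to see which blobs in the history are actually big. Here is a minimal sketch using plain git; largest_blobs is a made-up helper name (not part of git or BFG), and it has to be run from inside the repository:

```shell
#!/bin/sh
# largest_blobs is a hypothetical helper, not a git or BFG command.
# It prints "<size in bytes> <path>" for the N largest blobs reachable
# from any ref of the repository in the current directory.
largest_blobs() {
    count="${1:-10}"   # how many entries to show (default: 10)
    # rev-list emits "<sha> <path>" for every object; cat-file adds type and
    # size, and %(rest) carries the path through unchanged.
    git rev-list --objects --all |
        git cat-file --batch-check='%(objecttype) %(objectsize) %(rest)' |
        sed -n 's/^blob //p' |
        sort -n -r |
        head -n "$count"
}
```

Calling largest_blobs 5 lists the five biggest blobs, largest first, which gives a rough idea of whether something like 50M is a sensible cut-off.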
My workflow was something like this:
git clone --mirror http://.../myrepo.git
git clone --mirror myrepo.git myrepo-bfg.git
java -jar bfg.jar -b 50M myrepo-bfg.git
cd myrepo-bfg.git
git reflog expire --expire=now --all && git gc --prune=now --aggressive
git push
- Clone the original repo as a mirror
- Make a copy (in case I mess up)
- Run bfg to remove all large files
- Expire the reflog and do a garbage collection to prune the tree
- Push the cleaned history back
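The steps above can be sketched as a small script. This is only an illustration under a few assumptions: clean_large_files, run, BFG_JAR and the DRY_RUN switch are names I made up, and the repository URL is whatever you pass in. The final push is deliberately left out so it stays a manual, conscious step. Keep in mind that a clone pushes to wherever it was cloned from, so myrepo-bfg.git has the local myrepo.git as its origin.

```shell
#!/bin/sh
# Sketch of the cleanup workflow. All helper and variable names here
# (run, clean_large_files, BFG_JAR, DRY_RUN) are assumptions for illustration.

# Print a command; execute it only when DRY_RUN is not set to 1.
run() {
    echo "+ $*"
    [ "${DRY_RUN:-0}" = "1" ] || "$@"
}

clean_large_files() {
    repo_url="$1"                   # URL of the repository to clean
    limit="${2:-50M}"               # size threshold handed to bfg -b
    bfg_jar="${BFG_JAR:-bfg.jar}"   # path to the BFG jar

    # 1. mirror clone, 2. safety copy, 3. strip big blobs,
    # 4. expire the reflog, 5. garbage collect
    run git clone --mirror "$repo_url" myrepo.git &&
    run git clone --mirror myrepo.git myrepo-bfg.git &&
    run java -jar "$bfg_jar" -b "$limit" myrepo-bfg.git &&
    run git -C myrepo-bfg.git reflog expire --expire=now --all &&
    run git -C myrepo-bfg.git gc --prune=now --aggressive
}
```

With DRY_RUN=1 exported, clean_large_files only prints the commands it would run, which is a cheap way to double-check everything before touching a real repository.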
This way, I got rid of all big files. Cool!
Important note: This process will permanently delete stuff from git, as if it had never been there. USE IT WITH CARE (aka I’m not responsible if you wipe out your company’s code base!)