How To Purge a Large File from a Git Repo

Use BFG, following this and this:

First make a raw backup, e.g.:

git clone --mirror git@darkmatter.ps.uci.edu:baofit.git baofit.backup

Install the Java development kit (JDK) if necessary (it is no longer bundled with OS X), then download the latest version of the BFG jarfile.

Remove the large file from the head of the master branch and commit that change (BFG protects the HEAD commit and will not alter its contents), then run BFG from within the (normal, non-raw) repo folder:

java -jar ~/Downloads/bfg-1.11.10.jar --delete-files BOSSDR11LyaF.cov
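Before the BFG command above is run, the file has to be gone from the tip of master. A minimal sketch of that step follows. The helper name remove_from_head is hypothetical, and it assumes you want to stop tracking the file while keeping your local copy on disk (use a plain git rm instead if the local copy should go too).

```shell
# Sketch: stop tracking a large file at the tip of the current branch.
# Assumption: the local copy should stay on disk, hence --cached.
remove_from_head() {
    git rm --cached --quiet -- "$1" &&
    git commit --quiet -m "Remove $1 from the repository"
}

# Usage, from inside the normal (non-raw) repo folder:
#   remove_from_head BOSSDR11LyaF.cov
```

After this commit the file is absent from HEAD, so BFG's protection of the HEAD commit no longer keeps it alive.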

The BFG run produces output like this:

Using repo : /Users/david/Cosmo/LyAlpha/code/baofit/.git

Found 183 objects to protect
Found 1 tag-pointing refs : refs/tags/1.0.0
Found 24 commit-pointing refs : HEAD, refs/heads/Add_README, refs/heads/Implement_simultaneous_fitting, ...

Protected commits

These are your protected commits, and so their contents will NOT be altered:

 * commit 7154d629 (protected by 'HEAD')


Found 811 commits
Cleaning commits:       100% (811/811)
Cleaning commits completed in 237 ms.

Updating 2 Refs

        Ref                          Before     After   
        refs/heads/master          | 7154d629 | 50f1c6bb
        refs/remotes/origin/master | d320e6b2 | 7d52bdd1

Updating references:    100% (2/2)
...Ref update completed in 21 ms.

Commit Tree-Dirt History

        Earliest                                              Latest
        |                                                          |

        D = dirty commits (file tree fixed)
        m = modified commits (commit message or parents changed)
        . = clean commits (no changes to file tree)

                                Before     After   
        First modified commit | 5db736f8 | 325a603b
        Last dirty commit     | d320e6b2 | 7d52bdd1

In total, 12 object ids were changed - a record of these will be written to:


BFG run is complete! When ready, run: git reflog expire --expire=now --all && git gc --prune=now --aggressive

Finally, clean up with:

git reflog expire --expire=now --all && git gc --prune=now --aggressive

and push the changes:

git push origin --force --all
git push github --force --all
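A quick check that the purge took effect, best run after the cleanup command and before the forced push, is to ask Git for any commit that still touches the file. This is only a sketch; the helper name commits_touching is hypothetical.

```shell
# Sketch: list every commit reachable from any ref that touches the given path.
# Empty output means the path no longer appears in the reachable history.
commits_touching() {
    git log --all --oneline -- "$1"
}

# Usage, from inside the cleaned repo:
#   commits_touching BOSSDR11LyaF.cov
# The packed size of the repo can be inspected with:
#   git count-objects -vH
```

If the helper still prints commits, some ref other than the rewritten ones is likely holding on to the old history.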

Anyone who has already fetched any of the old commits now needs to resync their repo. A plain git pull does not do the right thing, since it merges the big-file commits back into the history. The docs above mention that git rebase works, but it is probably simpler to re-clone the repo and abandon the original clone.
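The re-clone route can be sketched as below. The helper name reclone and the .old suffix are hypothetical choices; the old clone is renamed rather than deleted so that any unpushed local work can still be recovered from it by hand.

```shell
# Sketch: set a stale clone aside and replace it with a fresh one.
# $1 = repository URL, $2 = local folder name of the existing clone.
reclone() {
    mv -- "$2" "$2.old" &&
    git clone --quiet "$1" "$2"
}

# Usage, from the folder that contains the existing clone:
#   reclone git@darkmatter.ps.uci.edu:baofit.git baofit
```

Do not push from the renamed clone, since that would reintroduce the purged commits.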