Bioinformatics day-by-day

Tag Archives: large

The GitHub repository of the #NGSchool website has grown to over 5 GB. I wanted to reduce the size and simplify this repository, but this task turned out to be quite complicated. Instead, I have decided to leave the current repo as is (and probably remove it soon) and start a new repo for the existing version. I could do that because I don’t care about versions earlier than the one I’m currently using. This is a short how-to:

Push all changes and remove .git folder

git push origin master
rm -rI .git

Rename existing repo

Settings > Repository name > RENAME

Start new repository using old repo name

You don’t need to create any files, as they all already exist locally.

Init your local repo and add new remote

git init
git remote add origin git@github.com:USER/REPO

Commit changes and push

git add --all . && git commit -m "fresh" && git push origin master

After doing so, my new repo is below 1 GB, which is much better than the previous 5 GB.
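The local part of the steps above can be sketched as one small shell function. This is only a sketch under assumptions: `fresh_start` is a made-up helper name, it uses `rm -rf` instead of the interactive `rm -rI` so it can run unattended, and it assumes the branch is called `master` as in the commands above. Renaming the old repo and creating the new empty one on GitHub still has to be done by hand first. Since it deletes the whole local history, try it on a throwaway copy before using it for real.

```shell
#!/bin/sh
# Sketch: drop the local history and push the current files as a single commit.
# fresh_start is a hypothetical helper; both arguments are placeholders.
#   $1 = path to the working tree
#   $2 = URL (or path) of the new, empty remote repository
fresh_start() {
    dir=$1
    remote=$2
    (
        cd "$dir" &&
        rm -rf .git &&                               # removes ALL local history
        git init -q &&
        git symbolic-ref HEAD refs/heads/master &&   # make sure the branch is named master
        git remote add origin "$remote" &&
        git add --all . &&
        git commit -q -m "fresh" &&
        git push -q origin master
    )
}
```

It would be called as `fresh_start path/to/website git@github.com:USER/REPO`, where the path is a placeholder for your own working tree.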

Lately, I have had lots of problems with pushing large files to GitHub. I maintain a compilation of materials and software deposited by other people, so I cannot control the size of the files… and this often makes the push fail.

Git is great, there is no doubt about that. Being able to revert any changes and recover lost data is simply priceless. But recently, I have started to be concerned about the size of some of my repositories. Some, especially those containing changing binary files, were really large!!!
You can check the size of your repository with a simple command:

git count-objects -vH
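`git count-objects` only reports totals. To see which files are responsible, one option is to list the largest blobs in the whole history. The function below is a rough sketch of that: `largest_blobs` is a name I made up, and the pipeline relies only on standard `git rev-list`, `git cat-file --batch-check`, `awk`, `sort` and `head`. Sizes are uncompressed object sizes in bytes, so they will not add up exactly to the packed size reported above.

```shell
#!/bin/sh
# Sketch: print the N largest blobs (size in bytes, then path) found anywhere
# in the history of the repository in the current directory.
# largest_blobs is a hypothetical helper name; N defaults to 10.
largest_blobs() {
    count=${1:-10}
    git rev-list --objects --all |
        git cat-file --batch-check='%(objecttype) %(objectsize) %(rest)' |
        # keep blobs only; remember the size, then strip the two leading fields
        # so that paths containing spaces survive intact
        awk '$1 == "blob" { size = $2; sub(/^blob [0-9]+ /, ""); print size "\t" $0 }' |
        sort -rn |
        head -n "$count"
}
```

Running `largest_blobs 20` inside a repository prints the twenty biggest files ever committed, including ones that were later deleted.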

Here, Git Large File Storage (LFS) comes into action. Below, I’ll describe how to install it and mark large binary files, so that their contents are kept outside the regular Git history and only small pointer files are stored in the repository itself.
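As a rough illustration of what marking files looks like, here is a minimal sketch. It assumes the `git-lfs` extension is already installed through your package manager, `track_with_lfs` is a hypothetical helper name, and the `*.bam` pattern in the usage line is only an example. `git lfs install --local` enables the LFS hooks for the current repository only, and `git lfs track` records the pattern in `.gitattributes`.

```shell
#!/bin/sh
# Sketch: mark files matching a pattern as LFS-tracked in the current repository.
# track_with_lfs is a hypothetical helper; $1 is a file pattern such as "*.bam".
track_with_lfs() {
    pattern=$1
    # bail out early if the git-lfs extension is missing
    if ! git lfs version >/dev/null 2>&1; then
        echo "git-lfs is not installed" >&2
        return 1
    fi
    git lfs install --local &&     # set up LFS hooks and filters for this repo only
    git lfs track "$pattern" &&    # writes the pattern to .gitattributes
    git add .gitattributes         # the attributes file must be committed too
}
```

After `track_with_lfs "*.bam"`, newly added files matching the pattern are handled by LFS when you `git add`, commit and push as usual. Files that are already in the history are not converted by this.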