Normally, the content of files in the annex is prevented from being modified. (Unless your repository is using direct mode.)

That's a good thing: it might be the only copy, and you wouldn't want to lose it to a fumble-fingered mistake.

# echo oops > my_cool_big_file
bash: my_cool_big_file: Permission denied

To modify a file, first unlock it.

# git annex unlock my_cool_big_file
unlock my_cool_big_file (copying...) ok

That replaces the symlink that normally points at its content with a copy of the content. You can then modify the file like any regular file. Because it is a regular file.
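
To make the "symlink replaced by a copy" step concrete, here is a minimal sketch that imitates the effect with plain POSIX tools. It does not call git-annex, and the objects directory and the name key1 are made up for illustration; git-annex's real storage layout differs.

```shell
#!/bin/sh
# Conceptual sketch only: imitates the *effect* of "git annex unlock".
# The layout ("objects/key1") is hypothetical, not git-annex's real format.
set -eu

demo=$(mktemp -d)
trap 'rm -rf "$demo"' EXIT

# A read-only stored object, plus a working-tree symlink pointing at it.
mkdir "$demo/objects"
echo "original content" > "$demo/objects/key1"
chmod a-w "$demo/objects/key1"
ln -s objects/key1 "$demo/my_cool_big_file"

# "Unlock": swap the symlink for a writable copy of what it points at.
cp "$demo/my_cool_big_file" "$demo/my_cool_big_file.tmp"     # cp follows the symlink
chmod u+w "$demo/my_cool_big_file.tmp"
mv -f "$demo/my_cool_big_file.tmp" "$demo/my_cool_big_file"  # replaces the link itself

# The working-tree file is now a regular file and can be edited freely,
# while the stored object still holds the old content.
echo "edited content" > "$demo/my_cool_big_file"
cat "$demo/objects/key1"    # prints: original content
```

The sketch also shows why unlocking costs time and disk space for big files: until the change is committed, two full copies exist side by side.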

(If you decide you don't need to modify the file after all, or want to discard modifications, just use git annex lock.)

When you git commit, git-annex's pre-commit hook will automatically notice that you are committing an unlocked file, and add its new content to the annex. The file will be replaced with a symlink to the new content, and this symlink is what gets committed to git in the end.
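
The reverse step can be sketched the same way. This is a simplified illustration using plain POSIX tools, not the hook's actual code: the objects directory is hypothetical, and cksum stands in for the cryptographic hash that git-annex would normally use to name the content.

```shell
#!/bin/sh
# Conceptual sketch only: imitates what happens to a modified, unlocked file
# at commit time. The "objects" directory and the bare-checksum naming are
# hypothetical simplifications, not git-annex's real on-disk format.
set -eu

demo=$(mktemp -d)
trap 'rm -rf "$demo"' EXIT
cd "$demo"
mkdir objects

# An unlocked (regular, writable) file with new content.
echo "now smaller, but even cooler" > my_cool_big_file

# Name the stored object after a checksum of its content.
key=$(cksum < my_cool_big_file | cut -d ' ' -f 1)

# Move the content into the store, make it read-only, leave a symlink behind.
mv my_cool_big_file "objects/$key"
chmod a-w "objects/$key"
ln -s "objects/$key" my_cool_big_file

# Only the small symlink would be committed to git; the content stays in the store.
cat my_cool_big_file    # prints: now smaller, but even cooler
```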

# echo "now smaller, but even cooler" > my_cool_big_file
# git commit my_cool_big_file -m "changed an annexed file"
add my_cool_big_file ok
[master 64cda67] changed an annexed file
 1 files changed, 1 insertions(+), 1 deletions(-)

There is one problem with using git commit like this: Git wants to first stage the entire contents of the file in its index. That can be slow for big files (sorta why git-annex exists in the first place). So, the automatic handling on commit is a nice safety feature, since it prevents the file content being accidentally committed into git. But when working with big files, it's faster to explicitly add them to the annex yourself before committing.

# echo "now smaller, but even cooler yet" > my_cool_big_file
# git annex add my_cool_big_file
add my_cool_big_file ok
# git commit my_cool_big_file -m "changed an annexed file"

"Git wants to first stage the entire contents of the file in its index. That can be slow for big files (sorta why git-annex exists in the first place)."

I think git-annex is useful for more than avoiding the overhead of Git's index. I like the idea because it helps track copies in "special remotes", which are not Git repositories because

  • they are not under my control (e.g., web URLs),
  • or it is inconvenient to keep a Git repo there (a human can easily browse an external disk or DVD that holds plain files, but imposing a Git repo structure on it would either at least double the space consumed, for the commit history plus the working directory, or, with a bare repo, make it "unreadable" for a human),
  • or it is nearly impossible to put a Git repo on storage like peer networks without special tools.
Comment by imz [lj.rossia.org] Tue Sep 25 18:04:01 2012

At the moment, unlock copies the original file for modification, so the original copy stays in the annex and a second, to-be-edited copy is created. But when I unlock a file, I may not care about the previous copy and may want to keep only a single (possibly edited) new copy. What if there were a mode where the actual file is simply moved into the "unlocked" location for editing, effectively dropping the content from git-annex? That would avoid lengthy copying and wasted local space. If I later needed the previous copy, I would just "get" it again.

Comment by site-myopenid Thu May 30 16:26:52 2013