git » git-arr » commit 58037e5

Repo.blob: respect reported blob size

author Eric Sunshine
2015-01-13 09:57:12 UTC
committer Alberto Bertogli
2015-01-13 19:51:44 UTC
parent 50c004f8a5f26ad27f03597b050f9b1a1910cc45

By default, batch output of git-cat-file has the form:

    <sha1> SP <type> SP <size> LF <contents> LF

It unconditionally includes a trailing line-feed, which Repo.blob()
incorrectly returns as part of the blob content. For textual blobs, this
extra character is often benign; for binary blobs, however, it can
easily change the meaning of the data in unexpected or disastrous ways.
Fix this by respecting the blob size reported by git-cat-file.

(The alternate approach of unconditionally dropping the final LF also
works; however, respecting the reported size is perhaps a bit more
robust and "correct".)

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Alberto Bertogli <albertito@blitiri.com.ar>
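
The record layout described in the commit message can be illustrated with a
minimal sketch. It is not the code from git.py: the helper name
split_batch_record, the all-zero object name, and the sample payload are
hypothetical, and the sketch is written for Python 3, whereas the patched file
appears to target Python 2. It assumes one complete record in the default
batch format is already held in memory as bytes.

```python
from typing import Tuple


def split_batch_record(record: bytes) -> Tuple[str, str, bytes]:
    """Split one default-format batch record into (sha1, type, contents).

    Assumed layout: <sha1> SP <type> SP <size> LF <contents> LF
    """
    # Everything up to the first LF is the header line.
    header, _, rest = record.partition(b"\n")
    sha1, obj_type, size = header.decode("ascii").split(" ")
    # Keep exactly <size> bytes so the record's trailing LF is excluded.
    contents = rest[:int(size)]
    return sha1, obj_type, contents


# Hypothetical binary blob whose own last byte happens to be a line-feed.
payload = b"\x00\x01\x02\n"
record = (
    b"0" * 40  # placeholder object name, not a real hash
    + b" blob "
    + str(len(payload)).encode("ascii")
    + b"\n"
    + payload
    + b"\n"
)

sha1, obj_type, contents = split_batch_record(record)

# Reading everything after the header keeps the extra trailing LF.
naive = record.partition(b"\n")[2]
```

Here `contents` equals `payload`, while `naive` is one byte longer. Dropping
the final byte of `naive` yields the same result in this sample, which matches
the alternate approach mentioned in the commit message.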

git.py +2 -2

diff --git a/git.py b/git.py
index 9f73fd1..ad3952d 100644
--- a/git.py
+++ b/git.py
@@ -345,7 +345,7 @@ class Repo:
             ref = self.branch
         cmd = self.cmd('cat-file')
         cmd.raw(True)
-        cmd.batch = None
+        cmd.batch = '%(objectsize)'
 
         if isinstance(ref, unicode):
             ref = ref.encode('utf8')
@@ -356,7 +356,7 @@ class Repo:
         if not head or head.strip().endswith('missing'):
             return None
 
-        return Blob(out.read())
+        return Blob(out.read()[:int(head)])
 
     def last_commit_timestamp(self):
         """Return the timestamp of the last commit."""