commit 5568fd50c26e0474b3b1e02fa3581053d1478c5f
Author: Eric Sunshine
Date:   Wed Jan 14 02:46:35 2015 -0500

    Repo: retire new_in_branch() and notion of "bound" branch

    Binding (or "pegging") a Repo at a particular branch via
    new_in_branch() increases the cognitive burden since the reader must
    maintain a mental model of which Repo instances are pegged and which
    are not. This burden outweighs whatever minor convenience (if any) is
    gained by pegging the Repo at a particular branch.

    It is easier to reason about the code when the branch name is passed
    to clients directly rather than indirectly via a pegged Repo.

    Preceding patches retired all callers of new_in_branch(), therefore
    remove it.

    Signed-off-by: Eric Sunshine
    Signed-off-by: Alberto Bertogli

commit 89a637660fed90af8991d734ea758a07780e9ac1
Author: Eric Sunshine
Date:   Wed Jan 14 02:46:34 2015 -0500

    branch: pass branch name to view explicitly

    Passing the branch name into the view indirectly via
    Repo.new_in_branch() increases cognitive burden, thus outweighing
    whatever minor convenience (if any) is gained by doing so. The code
    is easier to reason about when the branch name is passed to the view
    directly.

    Signed-off-by: Eric Sunshine
    Signed-off-by: Alberto Bertogli

commit 37e731fc2ebe5d249ea9caa85e15491654450238
Author: Eric Sunshine
Date:   Wed Jan 14 02:46:33 2015 -0500

    blob: pass branch name to view explicitly

    Passing the branch name into the view indirectly via
    Repo.new_in_branch() increases cognitive burden, thus outweighing
    whatever minor convenience (if any) is gained by doing so. The code
    is easier to reason about when the branch name is passed to the view
    directly.

    Signed-off-by: Eric Sunshine
    Signed-off-by: Alberto Bertogli

commit e6099cf2729fc446f7cba2dab5da2d6b0646b567
Author: Eric Sunshine
Date:   Wed Jan 14 02:46:32 2015 -0500

    tree: pass branch name to view explicitly

    Passing the branch name into the view indirectly via
    Repo.new_in_branch() increases cognitive burden, thus outweighing
    whatever minor convenience (if any) is gained by doing so. The code
    is easier to reason about when the branch name is passed to the view
    directly.

    Signed-off-by: Eric Sunshine
    Signed-off-by: Alberto Bertogli

commit 46640c68b9758b3c76d6a81d5a7c8601b6c4b5ee
Author: Eric Sunshine
Date:   Tue Jan 13 04:57:16 2015 -0500

    views: blob: render empty blobs specially

    Empty (zero-length) blobs are currently rendered by 'pygments'
    misleadingly as a single empty line, or, when 'pygments' is
    unavailable, as "nothingness" preceding a horizontal rule. In either
    case, it is somewhat difficult to glean concrete information about
    the blob.

    Address this by instead rendering summary information about the
    blob: in particular, its classification ("empty") and its size
    ("0 bytes"). This is analogous to the summary information rendered
    for binary blobs ("binary" and size).

    Signed-off-by: Eric Sunshine
    Signed-off-by: Alberto Bertogli

commit c91beccdb04f0437ac6cd8f13c09117ea9766296
Author: Eric Sunshine
Date:   Tue Jan 13 04:57:15 2015 -0500

    blob: cap amount of rendered binary blob content

    Although hexdump(1)-style rendering of binary blob content may
    reveal some meaningful information about the data, it wastes even
    more storage space than embedding the raw data itself. However, many
    binary files have a "magic number" or other signature near the
    beginning of the file, so it is often possible to glean useful
    information from just the initial chunk of the file without having
    the entire content available. Thus, limiting the rendered data to
    just an initial chunk saves storage space while still potentially
    presenting useful information about the binary content.
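As an illustration of the capped, hexdump(1)-style rendering described above, here is a minimal Python sketch. The function name `hexdump`, the `limit` parameter and its default of 256 bytes, and the trailing "more bytes" line are assumptions made for this example, not git-arr's actual code. The layout only approximates "hexdump -C".

```python
def hexdump(data: bytes, limit: int = 256) -> str:
    """Render at most `limit` leading bytes, 16 per row, roughly like 'hexdump -C'."""
    chunk = data[:limit]  # only the initial chunk is rendered
    lines = []
    for offset in range(0, len(chunk), 16):
        row = chunk[offset:offset + 16]
        # Two groups of eight hex byte values, as in 'hexdump -C'.
        left = " ".join(f"{b:02x}" for b in row[:8])
        right = " ".join(f"{b:02x}" for b in row[8:])
        # Printable ASCII is shown as-is; everything else becomes '.'.
        text = "".join(chr(b) if 32 <= b < 127 else "." for b in row)
        lines.append(f"{offset:08x}  {left:<23}  {right:<23}  |{text}|")
    if len(data) > limit:
        # Hypothetical summary line noting how much content was omitted.
        lines.append(f"... ({len(data) - limit} more bytes not shown)")
    return "\n".join(lines)
```

With a small `limit`, a file signature near the start of the data (for example the "PNG" bytes of a PNG header) is still visible in the ASCII column, while the bulk of the content is left out.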

    Signed-off-by: Eric Sunshine
    Signed-off-by: Alberto Bertogli

commit 6f3942ce38d0417baf57188eebf9bc2075f2f59a
Author: Eric Sunshine
Date:   Tue Jan 13 04:57:14 2015 -0500

    blob: render hexdump(1)-style binary blob content

    Raw binary blob content tends to look like "line noise" and is
    rarely, if ever, meaningful. A hexdump(1)-style rendering
    (specifically, "hexdump -C"), on the other hand, showing runs of
    hexadecimal byte values along with an ASCII representation of those
    bytes can sometimes reveal useful information about the data.

    (A subsequent patch will add the ability to cap the amount of data
    rendered in order to reduce storage space requirements.)

    Signed-off-by: Eric Sunshine
    Signed-off-by: Alberto Bertogli

commit 09c2f33f5a1f7137d50b3638e1a3f937e0701a6e
Author: Eric Sunshine
Date:   Tue Jan 13 04:57:13 2015 -0500

    blob: render binary blob summary information rather than raw content

    Binary blobs are currently rendered as raw data directly into the
    HTML output, looking much like "line noise". This is rarely, if
    ever, meaningful, and consumes considerable storage space since the
    entire raw blob content is embedded in the generated HTML file.

    Address this issue by instead emitting summary information about the
    blob, such as its classification ("binary") and its size. Other
    information can be added as needed.

    As in Git itself, a blob is considered binary if a NUL is present in
    the first ~8KB.

    Signed-off-by: Eric Sunshine
    Signed-off-by: Alberto Bertogli

commit 58037e57c591b1c55594a0adb637d1880bfacaee
Author: Eric Sunshine
Date:   Tue Jan 13 04:57:12 2015 -0500

    Repo.blob: respect reported blob size

    Batch output of git-cat-file has the form:

        <sha1> SP <type> SP <size> LF
        <contents> LF

    It unconditionally includes a trailing line-feed which Repo.blob()
    incorrectly returns as part of blob content. For textual blobs, this
    extra character is often benign, however, for binary blobs, it can
    easily change the meaning of the data in unexpected or disastrous
    ways.

    Fix this by respecting the blob size reported by git-cat-file. (The
    alternate approach of unconditionally dropping the final LF also
    works, however, respecting the reported size is perhaps a bit more
    robust and "correct".)

    Signed-off-by: Eric Sunshine
    Signed-off-by: Alberto Bertogli

commit 50c004f8a5f26ad27f03597b050f9b1a1910cc45
Author: Eric Sunshine
Date:   Tue Jan 13 04:57:11 2015 -0500

    embed_image_blob: retire reload of image blob

    Historically, the 'blob' view was unconditionally handed cooked
    (utf8-encoded) blob content, so embed_image_blob(), which requires
    raw blob content, has been forced to reload the blob in raw form,
    which is ugly and expensive. However, now that the Blob returned by
    Repo.blob() is able to vend raw or cooked content, it is no longer
    necessary for embed_image_blob() to reload the blob to gain access
    to the raw content.

    Signed-off-by: Eric Sunshine
    Signed-off-by: Alberto Bertogli

commit 1d79988228a9b08c86d9e595fdc1b92f7ca50424
Author: Eric Sunshine
Date:   Tue Jan 13 04:57:10 2015 -0500

    Blob: vend raw or cooked content

    Some blob representations require raw blob content, however, the
    'blob' view is unconditionally handed cooked (utf8-encoded) content,
    thus representations which need raw content are forced to reload the
    blob in raw form, which is ugly and expensive.

    The ultimate goal is to eliminate the wasteful blob reloading when
    raw content is needed. Toward that end, teach Blob how to vend raw
    or cooked content.

    Signed-off-by: Eric Sunshine
    Signed-off-by: Alberto Bertogli

commit 0ba89d75e6e26bf14f5b6cfb6526e601f7ad7e2d
Author: Eric Sunshine
Date:   Tue Jan 13 04:57:09 2015 -0500

    git.py: introduce Blob abstraction

    Some blob representations (such as embedded images) require raw blob
    content, however, the 'blob' view is unconditionally handed cooked
    (utf8-encoded) content, thus representations which need raw content
    are forced to reload the blob in raw form, which is ugly and
    expensive (due to shelling out to git-cat-file a second time).

    The ultimate goal is to eliminate the wasteful blob reloading when
    raw content is needed. As a first step, introduce a Blob abstraction
    to be returned by Repo.blob() rather than the cooked content. A
    subsequent change will flesh out Blob, allowing it to return raw or
    cooked content on demand without the client having to specify one or
    the other when invoking Repo.blob().

    Signed-off-by: Eric Sunshine
    Signed-off-by: Alberto Bertogli

commit 6b83e32bc1cc6adb831631e30de59b026971534a
Author: Eric Sunshine
Date:   Tue Jan 13 04:57:08 2015 -0500

    Repo.blob: employ formal mechanism for requesting raw command output

    Sneakily extracting the raw 'fd' from the utf8-encoding wrapper
    returned by GitCommand.run() is ugly and fragile. Instead, take
    advantage of the new formal API for requesting raw command output.

    Signed-off-by: Eric Sunshine
    Signed-off-by: Alberto Bertogli

commit 43f4132bf1e7855f17187f3f496b62a0a8192424
Author: Eric Sunshine
Date:   Tue Jan 13 04:57:07 2015 -0500

    GitCommand: teach run() how to return raw output stream

    Currently, clients which want the raw output from a Git command must
    sneakily extract the raw 'fd' from the utf8-encoding wrapper
    returned by GitCommand.run(). This is ugly and fragile. Instead,
    provide a formal mechanism for requesting raw output.

    Signed-off-by: Eric Sunshine
    Signed-off-by: Alberto Bertogli

commit 66afd72d6d49ad26f53457c4fda31d01c927e0fa
Author: Eric Sunshine
Date:   Tue Jan 13 04:57:06 2015 -0500

    run_git: add option to return raw output stream

    Currently, clients which want the raw output from a Git command must
    sneakily extract the raw 'fd' from the utf8-encoding wrapper
    returned by run_git(). This is ugly and fragile. Instead, provide a
    formal mechanism for requesting raw output.

    Signed-off-by: Eric Sunshine
    Signed-off-by: Alberto Bertogli

commit bb9bad89d17ed584f52015ae6db6398889d23a81
Author: Eric Sunshine
Date:   Mon Jan 12 01:23:15 2015 -0500

    git-arr: increase default 'max_pages' value

    The 'max_pages' default value of 5 is quite low.
    Coupled with the 'commits_per_page' default of 50, this allows for
    only 250 commits, which is likely unsuitable for even relatively
    small projects. Options are to remove the cap altogether or to raise
    the default limit. At this time, choose the latter, which should be
    friendlier to larger projects, in general, while still guarding
    against run-away storage space consumption.

    Signed-off-by: Eric Sunshine
    Signed-off-by: Alberto Bertogli

commit 56fcfd0278377eb6f3b0318b1f3c0a9e6abf7895
Author: Eric Sunshine
Date:   Wed Dec 31 04:50:10 2014 -0500

    route: recognize hierarchical branch names

    Branch names in Git may be hierarchical (for example,
    "wip/parser/fix"), however, git-arr's Bottle routing rules do not
    take this into account. Fix this shortcoming.

    Signed-off-by: Eric Sunshine
    Signed-off-by: Alberto Bertogli

commit e930f9e4f75983948a7b6cfcf0d844b935369e2a
Author: Eric Sunshine
Date:   Wed Dec 31 04:50:09 2014 -0500

    route: prepare to fix routing of hierarchical branch names

    Branch names in Git may be hierarchical (for example,
    "wip/parser/fix"), however, git-arr does not take this into account
    in its Bottle routing rules.

    Unfortunately, when updated to recognize hierarchical branch names,
    the rules become ambiguous in their present order since Bottle
    matches them in the order registered. The ambiguity results in
    incorrect matches. For instance, branch pages (/r/<repo>/b/<bname>/)
    are matched before tree pages (/r/<repo>/b/<bname>/t/), however,
    when branch names can be hierarchical, a tree path such as
    "/r/proj/b/branch/t/" also looks like a branch named "branch/t", and
    thus undesirably matches the branch rule rather than the tree rule.

    This problem can be resolved by adjusting the order of rules.
    Therefore, re-order the rules from most to least specific as a
    preparatory step prior to actually fixing them to accept
    hierarchical branch names.

    This is a purely textual relocation. No functional changes intended.
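The ordering problem described above can be reproduced with a small sketch that uses plain regular expressions in place of Bottle routes. The two patterns, the rule names, and the `dispatch` helper are hypothetical stand-ins for illustration and are not git-arr's actual routing rules. Like the behavior described for Bottle, `dispatch` returns the first rule that matches.

```python
import re

# Hypothetical stand-ins for the "branch" and "tree" rules; the branch
# name is allowed to contain slashes (hierarchical names).
BRANCH = ("branch", re.compile(r"^/r/(?P<repo>[^/]+)/b/(?P<bname>.+)/$"))
TREE = ("tree", re.compile(r"^/r/(?P<repo>[^/]+)/b/(?P<bname>.+)/t/$"))


def dispatch(rules, path):
    """Return (rule name, branch name) for the first matching rule, or None."""
    for name, pattern in rules:
        match = pattern.match(path)
        if match:
            return name, match.group("bname")
    return None


# Branch rule first: the tree URL is mistaken for a branch named "branch/t".
wrong = dispatch([BRANCH, TREE], "/r/proj/b/branch/t/")

# Most specific rule first: the tree URL reaches the tree rule, and a
# genuinely hierarchical branch URL still falls through to the branch rule.
right = dispatch([TREE, BRANCH], "/r/proj/b/branch/t/")
nested = dispatch([TREE, BRANCH], "/r/proj/b/wip/parser/fix/")
```

In this sketch only the order of the rule list changes between the two calls, which mirrors the "purely textual relocation" the commit describes.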

    Signed-off-by: Eric Sunshine
    Signed-off-by: Alberto Bertogli

commit 93b161c23ea90f2dd9f7329a1afac51e23ebe3aa
Author: Eric Sunshine
Date:   Wed Dec 31 04:50:08 2014 -0500

    views: fix broken URLs involving hierarchical branch names

    Git branch names can be hierarchical (for example,
    "wip/parser/fix"), however, git-arr does not take this into account
    when formulating URLs on branch, tree, and blob pages. These URLs
    are dysfunctional because it is assumed incorrectly that a single
    "../" is sufficient to climb over the branch name when computing
    relative paths to resources higher in the hierarchy. This problem
    manifests as failure to load static resources (stylesheet, etc.),
    broken links to commits on branch pages, and malfunctioning
    breadcrumb trails.

    Fix this problem by computing the proper number of "../" based upon
    the branch name, rather than assuming that a single "../" will work
    unconditionally. (This is analogous to the treatment already given
    to hierarchical pathnames in tree and blob views.)

    Signed-off-by: Eric Sunshine
    Signed-off-by: Alberto Bertogli

commit 7f2f67629f1d6b7c5046ae43c1a293137c89b4d3
Author: Eric Sunshine
Date:   Thu Jan 1 16:41:09 2015 -0500

    views: branch/paginate: teach "next" link to respect 'max_pages'

    Pagination link "next" does not respect 'max_pages', thus it
    incorrectly remains enabled on the final page capped by 'max_pages'.
    When clicked, the user is taken to a "404 Page not found" error
    page, which makes for a poor user experience.

    Fix this problem by teaching the "next" link to respect 'max_pages'.
    (As a side-effect, this also causes 'serve' mode to respect
    'max_pages', which was not previously the case. This change of
    behavior is appropriate since it brings 'serve' mode, which is
    intended primarily for testing, more in line with 'generate' mode.)
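One way to decide whether a "next" link should be enabled is sketched below. The function name `paginate`, the zero-based `page` index, and the in-memory list of commits are assumptions made for this example and do not reflect git-arr's actual view code. The sketch treats `max_pages == 0` as "unlimited" and looks one commit past the current page to detect the final page.

```python
def paginate(commits, page, per_page, max_pages):
    """Return (commits to show, whether a "next" link should be enabled).

    `page` is zero-based; `max_pages == 0` means no cap on page count.
    """
    start = page * per_page
    # Fetch one commit more than is displayed: if it exists, at least
    # one more page of commits follows this one.
    window = commits[start:start + per_page + 1]
    shown = window[:per_page]
    more_exist = len(window) > per_page
    # The following page must also fall within the 'max_pages' cap.
    within_cap = max_pages == 0 or page + 1 < max_pages
    return shown, more_exist and within_cap
```

With 100 commits and 50 per page, the second page holds exactly 50 commits and the lookahead finds nothing, so "next" is disabled. With 300 commits and `max_pages` set to 5, "next" is disabled on the fifth page even though more commits exist.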

    Signed-off-by: Eric Sunshine
    Signed-off-by: Alberto Bertogli

commit ac105c838396a6bd0cb13b455dc7d3562ac43360
Author: Eric Sunshine
Date:   Thu Jan 1 16:41:08 2015 -0500

    views: branch/paginate: fix incorrectly enabled "next" link

    When the number of commits on a branch page is less than
    'commits_per_page', the pagination "next" link is disabled,
    indicating correctly that this is the final page. However, if the
    number of commits on the branch page is exactly 'commits_per_page',
    then the "next" link is incorrectly enabled, even on the final page.
    When clicked, the user is taken to a "404 Page not found" error
    page, which makes for a poor user experience.

    Fix this problem by reliably detecting when the branch page is the
    final one. Do so by asking for (but not displaying) one commit more
    than actually needed by the page. If the additional commit is
    successfully retrieved, then another page definitely follows this
    one. If not retrieved, then this is definitely the final page.

    (Unfortunately, the seemingly more expedient approach of checking if
    the final commit on the current page is a root commit -- has no
    parents -- is not a reliable indicator that this is the final page
    since a branch may have multiple root commits due to merging.)

    Signed-off-by: Eric Sunshine
    Signed-off-by: Alberto Bertogli

commit bebc7fa3f00e9e9d11db488bef6a76836ac6730c
Author: Eric Sunshine
Date:   Wed Dec 31 23:41:37 2014 -0500

    repo: diff: add option to show "creation event" diff for root commit

    At its inception, Git did not show a "creation event" diff for a
    project's root commit since early projects, such as the Linux
    kernel, were already well established, and a large root diff was
    considered uninteresting noise. On the other hand, new projects
    adopting Git typically have small root commits, and such a "creation
    event" is likely to have meaning, rather than being pure noise.

    Consequently, git-diff-tree gained a --root flag in dc26bd89
    (diff-tree: add "--root" flag to show a root commit as a big
    creation event, 2005-05-19), though it was disabled by default.
    Displaying the root "creation event" diff, however, became the
    default behavior when configuration option 'log.showroot' was added
    to git-log in 0f03ca94 (config option log.showroot to show the diff
    of root commits; 2006-11-23). And, gitk (belatedly) followed suit
    when it learned to respect 'log.showroot' in b2b76d10 (gitk: Teach
    gitk to respect log.showroot; 2011-10-04).

    By default, these tools now all show the root diff as a "creation
    event", however, git-arr suppresses it unconditionally. Resolve this
    shortcoming by adding a new git-arr configuration option "rootdiff"
    to control the behavior (enabled by default).

    Signed-off-by: Eric Sunshine
    Signed-off-by: Alberto Bertogli

commit 9ef78aaffd9ca5100659b8737cbd41523be330e2
Author: Eric Sunshine
Date:   Wed Dec 31 04:50:12 2014 -0500

    git-arr: interpret 'max_pages = 0' as unlimited

    By default, git-arr limits the number of pages of commits to 5,
    however, it is reasonable to expect that some projects will want all
    commits to be shown. Rather than forcing such projects to choose an
    arbitrarily large number as the value of 'max_pages', provide a
    formal mechanism to specify unlimited commit pages.

    Signed-off-by: Eric Sunshine
    Signed-off-by: Alberto Bertogli

commit d7604dab4dcd4c826828cb1abbed6eabdbcfa790
Author: Eric Sunshine
Date:   Wed Dec 31 04:50:06 2014 -0500

    write_tree: suppress double-slash in blob HTML filename

    When emitting a blob in the root tree of a commit, write_tree()
    composes the blob's HTML filename with an extra slash before the
    "f=", like this:

        output/r/repo/b/master/t//f=README.txt.html

    Although the double-slash is not harmful on Unix, it is unsightly,
    and may be problematic for other platforms or filesystems which
    interpret double-slash specially or disallow it.

    Therefore, suppress the extra slash for blobs in the root tree.

    Signed-off-by: Eric Sunshine
    Signed-off-by: Alberto Bertogli

commit aaf2968538121f9443f1aea2156814fede5a5648
Author: Eric Sunshine
Date:   Wed Dec 31 04:50:07 2014 -0500

    route: commit: match only hexadecimal rather than digits + full alphabet

    A human-readable representation of a Git SHA1 commit ID is composed
    only of hexadecimal digits, thus there is no need to match against
    the full alphabet.

    Signed-off-by: Eric Sunshine
    Signed-off-by: Alberto Bertogli

commit 420afd3206d59ca76309087234f838edb32d7b8b
Author: Eric Sunshine
Date:   Wed Dec 31 04:50:05 2014 -0500

    views: summary: suppress extra horizontal rule when no "master" branch

    When a repository has a "master" branch, a short summary of its most
    recent commits is shown, followed by a horizontal rule. If there is
    no "master" branch, then the commit summary is suppressed, however,
    the rule is shown unconditionally, which is ugly, particularly when
    there is already a rule following the web_url/git_url block.

    Therefore, suppress the "master" branch horizontal rule when not
    needed. (This is analogous to how the rule following the
    web_url/git_url block is suppressed when that information is not
    shown).

    Signed-off-by: Eric Sunshine
    Signed-off-by: Alberto Bertogli

commit 605421f2d6696db307309aa7b454ff368c2ba742
Author: Eric Sunshine
Date:   Wed Dec 31 04:50:04 2014 -0500

    sample.conf: document embed_markdown and embed_images

    These repo-specific options were added in 54026b75 (Make embedding
    markdown and images configurable per-repo, 2013-11-02) but not
    documented.

    Signed-off-by: Eric Sunshine
    Signed-off-by: Alberto Bertogli

commit df00293a7cd6e3fb11d0c1bf2fbf1c3ad485824c
Author: Alberto Bertogli
Date:   Wed Dec 31 17:01:28 2014 +0000

    git: Add '--' to "git rev-list" runs to avoid ambiguous arguments

    If there is a branch and a file with the same name, git-arr will
    fail to generate, as git will complain when running git rev-list.

    For example, if there is both a file and a branch called "hooks" in
    the repository, git-arr would fail as follows:

      === git-arr running: ['git', '--git-dir=/some/repo', 'rev-list', '--max-count=1', '--header', u'hooks'])
      fatal: ambiguous argument 'hooks': both revision and filename
      Use '--' to separate paths from revisions, like this:
      'git <command> [<revision>...] -- [<file>...]'
      Traceback (most recent call last):
        File "./git-arr", line 457, in <module>
          main()
        File "./git-arr", line 452, in main
          skip_index = len(opts.only) > 0)
        File "./git-arr", line 388, in generate
          branch_mtime = r.commit(bn).committer_date.epoch
      AttributeError: 'NoneType' object has no attribute 'committer_date'

    To fix that, this patch appends a "--" as the last argument to
    rev-list, which indicates that it has completed the revision list,
    which disambiguates the argument.

    While at it, a minor typo in a comment is also fixed.

    Signed-off-by: Alberto Bertogli

commit 7898b2becdc9ad35b0d853bc5d46be24a05a6a48
Author: Alberto Bertogli
Date:   Sun Oct 5 22:15:54 2014 +0100

    git.py: Parse timestamps from UTC, not from local time

    The current parsing of dates from git incorrectly uses
    datetime.fromtimestamp(), which returns the *local* date and time
    corresponding to the given timestamp.

    Instead, it should be using datetime.utcfromtimestamp() which
    returns the UTC date and time, as the rest of the code expects.

    Signed-off-by: Alberto Bertogli
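The local-versus-UTC distinction in the last commit above can be illustrated with a short sketch. It is not the code from that commit, which uses datetime.utcfromtimestamp(). The sketch instead uses the timezone-aware form `datetime.fromtimestamp(ts, tz=timezone.utc)`, which yields the same UTC wall-clock values in current Python versions. The helper name `commit_time_utc` is an assumption made for this example.

```python
from datetime import datetime, timezone


def commit_time_utc(epoch: int) -> datetime:
    """Convert a Unix timestamp (as reported by git) to an aware UTC datetime."""
    return datetime.fromtimestamp(epoch, tz=timezone.utc)


# The naive call below depends on the machine's local time zone, which is
# the behavior the commit identifies as the bug.
local_naive = datetime.fromtimestamp(0)

# The UTC conversion gives the same answer on every machine.
utc_aware = commit_time_utc(0)
```

Because the result carries its time zone, it cannot be silently mixed with naive local datetimes elsewhere in a program.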