git-arr: commit 52862dd

Skip calling write_to() more than once for the same commit

author Alberto Bertogli
2025-05-18 19:35:26 UTC
committer Alberto Bertogli
2025-05-18 20:48:43 UTC
parent 182c807ca3c05fec9fc710ecd2a8f38495241ed5

The commit view does not depend on the branch, and the write_to()
function will skip the write if the output file already exists.

However, that existence check still takes a significant amount of time,
especially when extending an existing repo.

So to speed up re-generation of static content, this patch makes git-arr
keep track of commits that we've already generated.

On a repo with 4 branches and 600 commits, this results in a ~10% speedup
on a no-op regeneration.
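
As a rough illustration of the idea described above, here is a minimal,
self-contained sketch. It is not the git-arr code: write_once() is a
hypothetical stand-in for write_to(), and generate_commits(), the branch
mapping, and the output layout are all assumptions made for the example.
It only shows how a per-run set lets shared commits skip the write call
(and its file existence check) entirely.

```python
import os
import tempfile


def write_once(path, render):
    """Hypothetical stand-in for write_to(): skip if the file exists."""
    if os.path.exists(path):
        return False
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "w") as f:
        f.write(render())
    return True


def generate_commits(outdir, branches, write=write_once):
    """Write one page per unique commit id.

    `branches` maps a branch name to a list of commit ids (an assumed
    shape, chosen for the example). Returns the number of write calls.
    """
    calls = 0
    commits_written = set()  # commit ids already handled in this run
    for _branch, commit_ids in branches.items():
        for cid in commit_ids:
            if cid in commits_written:
                continue  # shared commit: no call, no existence check
            commits_written.add(cid)
            path = os.path.join(outdir, "c", cid, "index.html")
            write(path, lambda cid=cid: "commit %s\n" % cid)
            calls += 1
    return calls


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as out:
        branches = {"main": ["a", "b", "c"], "dev": ["a", "b", "d"]}
        print(generate_commits(out, branches))
```

In this sketch, two branches that share commits "a" and "b" lead to four
write calls instead of six. The set only lives for one run, so a later
regeneration still relies on the existence check once per unique commit.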

git-arr +13 -0

diff --git a/git-arr b/git-arr
index 9148d92..5322f7d 100755
--- a/git-arr
+++ b/git-arr
@@ -453,6 +453,15 @@ def generate(output: str, only=None):
 
     for r in rs:
         write_to("r/%s/index.html" % r.name, summary(r))
+
+        # It's very common that branches share the same commits. While we
+        # only write commits once (because write_to() will skip writing if the
+        # file already exists), doing that call and file existence check
+        # repeatedly takes a significant amount of time.
+        # To reduce that, we keep track of which commits we've already
+        # written, and skip writing them again.
+        commits_written = set()
+
         for bn in r.branch_names():
             commit_count = 0
             commit_ids = r.commit_ids(
@@ -460,6 +469,10 @@ def generate(output: str, only=None):
                 limit=r.info.commits_per_page * r.info.max_pages,
             )
             for cid in commit_ids:
+                if cid in commits_written:
+                    continue
+                commits_written.add(cid)
+
                 write_to(
                     "r/%s/c/%s/index.html" % (r.name, cid), commit, (r, cid)
                 )