10 April 2012

Character-level diff in git gui

Per-character differences are often more usable than line-level differences.[1] WinMerge has pretty decent character-scoped diffing:


...but git gui defaults to line-level "unified diff":



To get word-level diff, you can specify:
--color-words --word-diff
...but I want something closer to character-level diff.
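
To see what word-level diff changes in practice, here is a minimal sketch run against a throwaway repository. The file name and contents are made up for illustration. It uses --word-diff=plain rather than the colored mode so the markers are visible in any terminal; per the git documentation, --color-words is equivalent to --word-diff=color.

```shell
#!/bin/sh
# Sketch: compare line-level and word-level output in a throwaway repository.
# The file name and contents are invented for this example.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
printf 'total = price * quantity\n' > demo.txt
git add demo.txt
git -c user.name=demo -c user.email=demo@example.com commit -q -m 'initial'
printf 'total = price * amount\n' > demo.txt

# Default unified diff: the whole line is reported as removed and re-added.
line_diff=$(git diff --no-color)

# Word diff, plain mode: only the changed word is marked, as [-old-]{+new+}.
word_diff=$(git diff --no-color --word-diff=plain)

printf '%s\n' "$line_diff"
printf '%s\n' "$word_diff"
```

Words here are still whitespace-delimited, so a change inside a long token such as foo.bar(baz) marks the whole token.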

Per-character diff semantics are not easy to get right (they depend on the content), so git diff doesn't explicitly support them out of the box, but it does let you define the word boundary[2] via regex. Thomas Rast suggested this word-boundary regex, which achieves a reasonable compromise:
--word-diff-regex=[^[:space:]]|([[:alnum:]]|UTF_8_GUARD)+
This breaks words at punctuation, so the above sample looks like this:
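
For a text-only way to try this, here is a minimal sketch comparing two made-up files with git diff --no-index. It rests on two assumptions: UTF_8_GUARD in the regex above is read as a placeholder for a multi-byte UTF-8 pattern rather than literal text, and the sketch simply drops it, so it is only meant for ASCII input. The regex is quoted because the shell would otherwise interpret the | and parentheses.

```shell
#!/bin/sh
# Sketch: default word splitting versus a punctuation-aware word regex.
# File names and contents are invented for this example.
dir=$(mktemp -d)
printf 'result = foo.bar(baz)\n' > "$dir/old.txt"
printf 'result = foo.bar(qux)\n' > "$dir/new.txt"

# ASCII-only simplification of the regex in the text: the UTF_8_GUARD
# alternative is left out (assumption: the input is plain ASCII).
regex='[^[:space:]]|[[:alnum:]]+'

# --no-index compares two plain files outside any repository. It exits
# with status 1 when the files differ, hence the "|| true".
default_words=$(git diff --no-index --no-color --word-diff=plain \
    "$dir/old.txt" "$dir/new.txt" || true)
regex_words=$(git diff --no-index --no-color --word-diff=plain \
    --word-diff-regex="$regex" "$dir/old.txt" "$dir/new.txt" || true)

printf '%s\n' "$default_words"   # marks the whole token foo.bar(...)
printf '%s\n' "$regex_words"     # marks only baz/qux inside the parentheses
```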



Update: The git contrib/ tree provides the diff-highlight script.

Use it like this:

$ curl https://git.kernel.org/cgit/git/git.git/plain/contrib/diff-highlight/diff-highlight > ~/bin/diff-highlight
$ chmod u+x ~/bin/diff-highlight
$ git diff --diff-algorithm=patience --color=always HEAD~10 | ~/bin/diff-highlight | less -R

to get output like this:




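Since git accepts an external diff provider (via the GIT_EXTERNAL_DIFF environment variable or the diff.external config setting), you could try routing diffs through google-diff-match-patch instead of git's built-in diff.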
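
As a rough sketch of how an external diff provider plugs in: git calls the program named in GIT_EXTERNAL_DIFF with seven arguments (path, old-file, old-hex, old-mode, new-file, new-hex, new-mode). The wrapper name ext-diff.sh is hypothetical, and plain diff is only a stand-in engine; no particular diff-match-patch command-line tool is assumed to exist.

```shell
#!/bin/sh
# Sketch: route `git diff` through an external program via GIT_EXTERNAL_DIFF.
# ext-diff.sh is a hypothetical wrapper; `diff` stands in for the real engine.
set -e
work=$(mktemp -d)

# Git invokes the wrapper with seven arguments:
#   path old-file old-hex old-mode new-file new-hex new-mode
cat > "$work/ext-diff.sh" <<'EOF'
#!/bin/sh
echo "external diff for: $1"
# Stand-in engine: replace this line with any tool that compares two files.
diff "$2" "$5" || true   # diff exits 1 on differences; do not make git fail
EOF
chmod u+x "$work/ext-diff.sh"

mkdir "$work/repo"
cd "$work/repo"
git init -q
printf 'hello world\n' > demo.txt
git add demo.txt
git -c user.name=demo -c user.email=demo@example.com commit -q -m 'initial'
printf 'hello there\n' > demo.txt

ext_out=$(GIT_EXTERNAL_DIFF="$work/ext-diff.sh" git diff)
printf '%s\n' "$ext_out"
```

To make the choice permanent rather than per-invocation, the same wrapper path can be set as diff.external with git config.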


1. Caveat: git gui depends on unified diff format to selectively stage/unstage lines or hunks ("partial commit"), so enabling word-diff breaks this feature.

2. You can set the word-diff regex such that it is effectively a character boundary (for example, --word-diff-regex=.), but that's not useful for most text.
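
A minimal sketch of the character-boundary setting from footnote 2, using the --word-diff-regex=. form described in the git diff documentation. The two files and their contents are made up for illustration.

```shell
#!/bin/sh
# Sketch: treat every character as its own "word" with --word-diff-regex=.
# File names and contents are invented for this example.
dir=$(mktemp -d)
printf 'color\n' > "$dir/old.txt"
printf 'colour\n' > "$dir/new.txt"

# --no-index exits with status 1 when the files differ, hence "|| true".
char_diff=$(git diff --no-index --no-color --word-diff=plain \
    --word-diff-regex=. "$dir/old.txt" "$dir/new.txt" || true)

printf '%s\n' "$char_diff"   # only the inserted character is marked
```

This works well for a one-letter edit like the one above. When whole words are replaced, the output tends to break into interleaved single-character fragments, which is the readability problem the footnote refers to.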

Creative Commons License   This work is licensed under a Creative Commons Attribution 3.0 Unported License.