Automatic Arrow Characters in Emacs

Keywords: emacs

Intro

While adding technical writing to my new website using Emacs, I realized I was used to working with software that automatically converted the character - followed by > to the Unicode arrow character →. I decided to add the same behavior to my Emacs setup, and I’d like to describe how I did it.

Emacs has a library called quail that allows defining custom character mappings. If the left side of the mapping has multiple characters, and you type them consecutively, Emacs replaces them with the right side.

Quail Setup

(quail-define-package
 "Arrows" "UTF-8" "" nil
 "Arrow input mode"
 nil t t nil nil nil nil nil nil nil t)

(quail-define-rules
 ("->" ?→))

This creates a new input method called “Arrows” that converts -> to → as you type. One caveat: the long positional argument list of quail-define-package is not very ergonomic, and it is a bit odd that quail-define-rules never names the “Arrows” method explicitly (it applies its rules to the current Quail package, which is the most recently defined one). Still, it gets the job done.
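For readers who find the positional arguments hard to follow, here is a sketch of the same setup with each argument labeled according to the parameter names of quail-define-package, and with the rules added through quail-defrule, which accepts the package name as an optional third argument. It is an alternative spelling of the setup above rather than something to add on top of it, and the <- rule is a hypothetical extra that is not part of my configuration.

```elisp
(require 'quail)

;; Same package definition as above, with the positional arguments labeled.
(quail-define-package
 "Arrows"            ; NAME
 "UTF-8"             ; LANGUAGE
 ""                  ; TITLE (shown in the mode line)
 nil                 ; GUIDANCE
 "Arrow input mode"  ; DOCSTRING
 nil                 ; TRANSLATION-KEYS
 t                   ; FORCED
 t                   ; DETERMINISTIC
 nil nil nil nil     ; KBD-TRANSLATE SHOW-LAYOUT CREATE-DECODE-MAP MAXIMUM-SHORTEST
 nil nil nil         ; OVERLAY-PLIST UPDATE-TRANSLATION-FUNCTION CONVERSION-KEYS
 t)                  ; SIMPLE

;; Add rules one at a time, naming the target package explicitly.
(quail-defrule "->" ?→ "Arrows")
(quail-defrule "<-" ?← "Arrows")  ; hypothetical extra rule for illustration
```

Naming the package in each rule makes it harder to attach rules to the wrong input method if another Quail package is defined in between.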

Markdown Mode Activation

To use this input method specifically in Markdown mode, I have the following:

(defun my-turn-on-arrow-input ()
  "Turn on arrow input mode."
  (interactive)
  (set-input-method 'Arrows))

(add-hook 'markdown-mode-hook #'my-turn-on-arrow-input t)

Handling Indirect Buffers

The above is already enough to accomplish what’s needed, but I’ll explain one other edge case that I cover for Markdown specifically.

I use (setopt markdown-fontify-code-blocks-natively t) to fontify code blocks within Markdown buffers using their native major modes. Previously I used poly-markdown-mode for that, but the built-in markdown-mode approach seems more robust lately.

One precaution that I ported over from polymode is the ability to exclude certain hooks from running in the indirect buffers associated with those code blocks, in case they cause slowdowns or problems. I’ll present those here:

(defun my-in-indirect-md-buffer-p ()
  "Return non-nil if the current buffer is an indirect Markdown buffer.
That is, an indirect buffer whose base buffer is in a mode derived
from `markdown-mode'."
  (when-let* ((buf (buffer-base-buffer)))
    (and (buffer-live-p buf)
         (with-current-buffer buf
           (derived-mode-p 'markdown-mode)))))

(defun my-inhibit-in-indirect-md-buffers (orig-fun &rest args)
  "Don't run ORIG-FUN (with ARGS) in indirect markdown buffers.
Use this to advise functions that could be problematic."
  (unless (my-in-indirect-md-buffer-p)
    (apply orig-fun args)))

(defun my-around-advice (fun advice)
  "Apply around ADVICE to FUN.
If FUN is a list, apply ADVICE to each element of it."
  (cond ((listp fun)
         (dolist (el fun) (my-around-advice el advice)))
        ((and (symbolp fun)
              (not (advice-member-p advice fun)))
         (advice-add fun :around advice))))

I then ensure that the Arrows input method only activates in the main Markdown buffers, not indirect ones:

(my-around-advice #'my-turn-on-arrow-input
                  #'my-inhibit-in-indirect-md-buffers)
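To sanity-check the predicate logic without markdown-mode installed, the following sketch uses the built-in text-mode as a stand-in. Both function names are hypothetical and exist only for this demonstration; the first is the same predicate with the mode passed as a parameter.

```elisp
(require 'subr-x)  ; provides `when-let*' on older Emacs versions

(defun my-demo-in-indirect-buffer-of-mode-p (mode)
  "Return non-nil if the current buffer is an indirect buffer of a MODE buffer.
Hypothetical generalization of `my-in-indirect-md-buffer-p' for testing."
  (when-let* ((buf (buffer-base-buffer)))
    (and (buffer-live-p buf)
         (with-current-buffer buf
           (derived-mode-p mode)))))

(defun my-demo-check-indirect ()
  "Return a list (IN-BASE IN-INDIRECT) of predicate results as booleans."
  (with-temp-buffer
    (text-mode)
    (let ((indirect (make-indirect-buffer
                     (current-buffer)
                     (generate-new-buffer-name " *my-demo-indirect*"))))
      (unwind-protect
          ;; First element: predicate in the base buffer itself.
          ;; Second element: predicate in the indirect buffer.
          (list (and (my-demo-in-indirect-buffer-of-mode-p 'text-mode) t)
                (with-current-buffer indirect
                  (and (my-demo-in-indirect-buffer-of-mode-p 'text-mode) t)))
        (kill-buffer indirect)))))
```

Evaluating (my-demo-check-indirect) should return (nil t): the predicate is false in the base buffer and true in the indirect one, which is the distinction the advice relies on.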

This implementation, along with quite a few other customizations, is also available in my shared .emacs setup.

Some inspiration was taken from this Mastering Emacs post which describes how to make your own emoji input method with Quail.

Why Not Use Abbrevs?

After publishing this post, someone suggested using a local abbrev instead, which would remove the need to inhibit activation in non-Markdown buffers. This seemed like a reasonable alternative, so I investigated.

Alternative: abbrev-mode

It turns out that abbrevs aren’t quite as ergonomic for this use case. The reason comes down to how Emacs categorizes characters into syntax classes. Abbrevs expand when you type a non-word-constituent character after a sequence of word-constituent characters. The problem is that - is typically a punctuation or symbol-constituent character in most modes, not a word constituent. This means:

  1. You can’t define an abbrev -> that works normally, because - itself would trigger expansion of whatever word came before it.
  2. To make abbrevs work, you’d need to either modify the syntax table to make - and > word constituents, or implement a custom abbrev-expand-function to handle symbol syntax.

If you wanted to pursue the abbrev approach anyway, here’s a working example that modifies the syntax table:

(define-abbrev gfm-mode-abbrev-table "->" "→")

(defun my-markdown-abbrev-setup ()
  "Make `-' and `>' word constituents so that the `->' abbrev works."
  (modify-syntax-entry ?- "w")
  (modify-syntax-entry ?> "w")
  (abbrev-mode 1))

(add-hook 'markdown-mode-hook #'my-markdown-abbrev-setup)

Note: the gfm-mode-abbrev-table is specific to gfm-mode. If you use plain markdown-mode, substitute markdown-mode-abbrev-table instead.

However, this approach has several trade-offs compared to Quail:

  • Changing the syntax class of - and > affects word motion commands (M-f, M-b), so foo-bar would now be treated as a single word rather than two.
  • You must type a space (or other non-word character) after -> to trigger expansion, rather than having it expand immediately as you type.
  • There’s no visual feedback while typing. Quail highlights the first character when a potential expansion is in progress, but abbrev-mode has no equivalent preview feature.
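The word-motion trade-off in the first bullet can be observed with a small sketch. The function name is hypothetical, and the demo works on a copy of the standard syntax table so that evaluating it does not change the syntax of any other buffer.

```elisp
(defun my-demo-word-end (dash-is-word)
  "Return point after one `forward-word' from the start of \"foo-bar\".
If DASH-IS-WORD is non-nil, first make `-' a word constituent."
  (with-temp-buffer
    ;; Use a private copy so the shared standard table stays untouched.
    (set-syntax-table (copy-syntax-table (standard-syntax-table)))
    (when dash-is-word
      (modify-syntax-entry ?- "w"))
    (insert "foo-bar")
    (goto-char (point-min))
    (forward-word 1)
    (point)))
```

With the default syntax, (my-demo-word-end nil) returns 4, the position just after foo. With - as a word constituent, (my-demo-word-end t) returns 8, the end of the buffer, because foo-bar is now a single word.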

The Quail input method approach avoids these trade-offs. It operates at a different level, matching literal character sequences regardless of syntax class, and expands immediately as you type, which matches the Notion-like behavior I was after.

Alternative: auto-activating-snippets

If you’d prefer a package-based solution that behaves more like Quail, auto-activating-snippets (available on MELPA as aas) expands snippets as you type without requiring a trigger key.
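A minimal configuration sketch, assuming the aas package is already installed from MELPA and that its aas-set-snippets and aas-activate-for-major-mode entry points behave as described in the package README:

```elisp
;; Assumes the `aas' package from MELPA is installed.
(require 'aas)

;; Register the arrow snippet for buffers in markdown-mode.
(aas-set-snippets 'markdown-mode
  "->" "→")

;; Turn on aas and the snippets registered for the buffer's major mode.
(add-hook 'markdown-mode-hook #'aas-activate-for-major-mode)
```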

This gives you immediate expansion like Quail, without syntax table modifications. The main difference from Quail is that aas doesn’t provide visual feedback during typing - you won’t see highlighting on the first character when an expansion is possible.

Also, at the time of this writing, aas hasn’t been updated since 2023-03-03 and may produce byte-compilation warnings.