Working with regular expressions in Emacs is fun. With conventional regex tools such as Perl or Bash, one has to type the expression and run it in order to test it. In Emacs, regex is highly interactive: as we create or edit a regular expression, the matching text is highlighted in the target buffer.
In this post, I am going to introduce Emacs's re-builder command, which I have personally enjoyed a lot. I am going to take some header lines from the Linux kernel source code (I altered some of them) and build a regular expression that matches them.
Let us consider the following header lines as an example:
#include <stdio.h>
#include <linux/stdio.h>
#include  <linux/stdio.h>
#include <linux/module.h>
#include<linux/slab.h>
#include<linux/init.h>
#include <linux/types.h>
#include <linux/dmi.h>
#include <linux/delay.h>
#include <linux/platform_device.h>
#include <linux/power_supply.h>
#include "stdio.h"
#include "linux/stdio.h"
#include "linux/stdio.h"
#include  "linux/module.h"
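With the header lines in the current buffer, run M-x re-builder. This will open a buffer named *RE-Builder*, in which the expression is typed.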
Build an expression
- As each header line starts with #, let's type ^#. ^ matches the beginning of a line, string, or buffer. It is followed by the string include, so the complete expression will be
^#include
This should highlight every #include that begins a line.
[Figure: Beginning of line, string, or buffer]
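The behaviour of ^ can be cross-checked outside Emacs. The sketch below uses Python's re module purely as a stand-in (it is not part of re-builder), and the indented and commented-out sample lines are hypothetical additions for illustration. In Python the re.MULTILINE flag is needed for ^ to match at the start of every line, which is roughly what happens on each line of an Emacs buffer.

```python
import re

# Hypothetical sample text: two real headers plus two lines that should
# NOT match because "#include" is not at the start of the line.
text = "\n".join([
    "#include <stdio.h>",
    '#include "linux/module.h"',
    "    #include <linux/slab.h>",
    "// #include <linux/init.h>",
])

# re.MULTILINE lets ^ match at the start of every line, not just at the
# start of the whole string.
pattern = re.compile(r"^#include", re.MULTILINE)

# 1-based numbers of the lines on which the pattern matched.
matched_lines = [text.count("\n", 0, m.start()) + 1
                 for m in pattern.finditer(text)]
print(matched_lines)
```

Only the first two lines should be reported, since the other two have text before #include.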
- The next thing to match is the white space after the string include. In some cases it does not exist, as in the line #include<linux/slab.h>, so the white space should be matched only where it is present. As a first step, we make use of square brackets, which match any one of the characters listed inside them. Let's append [ ] (notice the space between the square brackets). The expression will be
^#include[ ]
[Figure: Highlight white spaces]
The problem with this is that it skips lines with no space, like
#include<linux/slab.h>
and highlights only the first space in lines that have more than one, like
#include  <linux/stdio.h>
#include  "linux/module.h"
This can easily be handled using *, which matches the previous pattern zero or more times. So our expression will be
^#include[ ]*
[Figure: Highlight zero or more white spaces]
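To see the difference between [ ] and [ ]* in isolation, here is a small sketch, again using Python's re module as a stand-in for Emacs. The helper name matched_text is made up for this example, and the three sample lines are chosen to have zero, one, and two spaces after include.

```python
import re

lines = [
    "#include <stdio.h>",         # one space
    "#include<linux/slab.h>",     # no space
    "#include  <linux/stdio.h>",  # two spaces
]

one_space = re.compile(r"^#include[ ]")    # exactly one space required
any_spaces = re.compile(r"^#include[ ]*")  # zero or more spaces


def matched_text(pattern, line):
    """Return the part of the line the pattern matched, or None."""
    m = pattern.match(line)
    return m.group(0) if m else None


with_one = [matched_text(one_space, line) for line in lines]
with_star = [matched_text(any_spaces, line) for line in lines]

for line, a, b in zip(lines, with_one, with_star):
    print(repr(line), repr(a), repr(b))
```

With [ ] the line without a space is not matched at all and the two-space line is matched only up to the first space, while [ ]* covers all three.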
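- The next task is to match < or " (double quote). Let's put both in another pair of square brackets. Note that " has to be escaped with a leading \, because re-builder's default input syntax reads the expression as a Lisp string. < is not special inside square brackets, and escaping it as well is harmless there. So our expression will be
^#include[ ]*[\<\"]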
- Now we need to match a string of characters. This can be done with [a-z], which matches any one character from 'a' to 'z'. So the expression will be
^#include[ ]*[\<\"][a-z]
But this will highlight just one character, so let's append a + sign after it. + matches the previous pattern one or more times. Now the expression will be
^#include[ ]*[\<\"][a-z]+
To be on the safer side, let's also match the capital letters, which turns the expression into
^#include[ ]*[\<\"][a-zA-Z]+
[Figure: Match all characters]
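The effect of the range and of + can be checked with a quick sketch in Python's re module, used here only as a stand-in for Emacs. The mixed-case line is a hypothetical example, not one of the headers above. The sketch also compares the escaped class [\<\"] with the plain [<"] to suggest that the backslashes do not change what is matched in this case.

```python
import re

line = "#include <linux/module.h>"
mixed_case = "#include <Linux/Module.h>"  # hypothetical, for illustration


def matched_text(pattern, text):
    """Return the text matched at the start of the string, or None."""
    m = re.match(pattern, text)
    return m.group(0) if m else None


single = matched_text(r'^#include[ ]*[<"][a-z]', line)      # one letter
run_lower = matched_text(r'^#include[ ]*[<"][a-z]+', line)  # a run of letters
escaped = matched_text(r'^#include[ ]*[\<\"][a-z]+', line)  # escaped class

# Lowercase-only range fails on a capital letter; a-zA-Z accepts it.
lower_on_mixed = matched_text(r'^#include[ ]*[<"][a-z]+', mixed_case)
both_on_mixed = matched_text(r'^#include[ ]*[<"][a-zA-Z]+', mixed_case)

print(single, run_lower, escaped, lower_on_mixed, both_on_mixed, sep="\n")
```

The run stops at the first / because that character is not yet part of the class.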
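- Now let's also match the remaining characters that appear in the header names: /, . and _. Inside square brackets they stand for themselves, and prefixing each with \ is harmless in re-builder's default input syntax. The expression will look like this:
^#include[ ]*[\<\"][a-zA-Z\/\.\_]+
[Figure: Match special characters]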
- Only > and the closing " (double quote) remain. These can again be matched using [\>\"]. Our final expression will be:
^#include[ ]*[\<\"][a-zA-Z\/\.\_]+[\>\"]
[Figure: Regex in animated form]
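As a rough cross-check outside Emacs, the finished pattern can be run over the sample headers. This is only a sketch in Python's re module: it assumes the final expression allows letters, /, . and _ between the delimiters, and it drops the backslashes because Python does not need them inside square brackets for these characters.

```python
import re

headers = [
    "#include <stdio.h>",
    "#include <linux/stdio.h>",
    "#include  <linux/stdio.h>",
    "#include <linux/module.h>",
    "#include<linux/slab.h>",
    "#include<linux/init.h>",
    "#include <linux/types.h>",
    "#include <linux/dmi.h>",
    "#include <linux/delay.h>",
    "#include <linux/platform_device.h>",
    "#include <linux/power_supply.h>",
    '#include "stdio.h"',
    '#include "linux/stdio.h"',
    '#include  "linux/module.h"',
]

# Assumed final pattern, written without the optional backslashes.
final = re.compile(r'^#include[ ]*[<"][a-zA-Z/._]+[>"]')

# Headers the pattern fails to cover completely (expected to be none).
unmatched = [h for h in headers if not final.fullmatch(h)]
print(unmatched)

# The opening and closing delimiters are matched independently, so a
# mismatched pair is accepted as well.
accepts_mismatched = final.fullmatch('#include <stdio.h"') is not None
print(accepts_mismatched)
```

The last check points at a limitation worth knowing about: the expression does not require the closing delimiter to pair with the opening one, which is fine for highlighting but may matter if the pattern is reused for validation.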
This ends the introduction to Emacs's re-builder. For more information, please visit Xah Lee's page on regex.