Introduction to Regex in Emacs

Working with regular expressions in Emacs is fun. With Perl or Bash you typically type the expression and run it to see whether it works; in Emacs the process is interactive. As you create or edit a regular expression, the matching text is highlighted in the target buffer.

[Figure: Emacs’s re-builder]

In this post I introduce Emacs's re-builder command, which I have personally enjoyed a lot. I will build a regular expression for some header lines taken from the Linux kernel source code (I altered some of them).

Let us consider the following header lines as an example:

#include <stdio.h>
#include <linux/stdio.h>
#include  <linux/stdio.h>
#include <linux/module.h>
#include<linux/slab.h>
#include<linux/init.h>
#include <linux/types.h>
#include <linux/dmi.h>
#include <linux/delay.h>
#include <linux/platform_device.h>
#include <linux/power_supply.h>
#include "stdio.h"
#include "linux/stdio.h"
#include "linux/stdio.h"
#include  "linux/module.h"

Call re-builder

  • Call re-builder using
    M-x re-builder  
    

    This opens a buffer named *RE-Builder*.

    [Figure: RE-Builder buffer]

Build an expression

  • As the header lines start with #, let's type ^#. Here ^ matches at the beginning of a line (or of the string or buffer being searched). Follow it with the string include, so the complete expression is ^#include. This should highlight #include at the start of every line.

    [Figure: Beginning of line, string or a buffer]
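As a rough cross-check outside Emacs, the anchoring step can be sketched in Python. This is an illustration only: Python's re module is not the Emacs regex engine, the sample text is a shortened copy of the header lines above, and its indented third line is a hypothetical addition. Note that Python needs re.MULTILINE before ^ will match at the start of every line.

```python
import re

# Shortened copy of the sample header lines; the indented third
# line is hypothetical and only there to show what ^ excludes.
sample = '#include <stdio.h>\n#include<linux/slab.h>\n  #include "late.h"\n'

# re.MULTILINE makes ^ match at the start of every line, not just the text.
anchored = re.findall(r"^#include", sample, flags=re.MULTILINE)

# Without ^ the indented third line is picked up as well.
unanchored = re.findall(r"#include", sample)

print(len(anchored), len(unanchored))
```

The anchored pattern ignores the indented line because #include is not the first thing on it.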

  • The next thing to match is the white space after the string include. In some lines it does not exist, as in #include<linux/slab.h>, so the space has to be optional. To start, we use square brackets [], which match any one character from the set they enclose. Let's append [ ] (notice the space between the square brackets). The expression becomes ^#include[ ].

    [Figure: Highlight white spaces]

    The problem with this is that it skips lines like

    #include<linux/slab.h>
    #include<linux/init.h>
    

    and does not highlight more than one space, as in

    #include  <linux/stdio.h>
    #include  "linux/module.h"
    

    This is easily handled with *, which matches the previous pattern zero or more times. So our expression becomes ^#include[ ]*

    [Figure: Highlight zero or more white spaces]
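The difference between [ ] and [ ]* can be sketched in Python as follows. This is a minimal illustration, assuming Python's re module behaves like Emacs for these simple patterns; the helper matched_text is a made-up name, and the three lines are taken from the sample headers.

```python
import re

lines = [
    "#include <stdio.h>",         # one space
    "#include  <linux/stdio.h>",  # two spaces
    "#include<linux/slab.h>",     # no space
]

def matched_text(pattern, line):
    """Return the text the pattern matches at the start of line, or None."""
    m = re.match(pattern, line)
    return m.group(0) if m else None

# [ ] demands exactly one space, so the last line is skipped.
one_space = [matched_text(r"^#include[ ]", l) for l in lines]

# [ ]* accepts zero or more spaces, so every line matches.
any_spaces = [matched_text(r"^#include[ ]*", l) for l in lines]

print(one_space)
print(any_spaces)
```

With [ ] the two-space line still matches, but only its first space is covered; with [ ]* both spaces are covered and the line without a space matches too.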

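  • The next task is to match < or " (double quote). Let's put both in another pair of square brackets. Neither character is special inside brackets in an Emacs regex. The backslash before " is needed only because re-builder, in its default read syntax, takes the expression as a Lisp string; the backslash before < is not required and is harmless here. So our expression becomes ^#include[ ]*[\<\"]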

    [Figure: Special characters]
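A small Python sketch of this step, which also checks whether the backslashes inside the brackets change anything in that engine. It is only an approximation of what re-builder does: Python's escaping rules differ from Emacs Lisp string syntax, and the last test line (no delimiter at all) is hypothetical.

```python
import re

lines = [
    '#include <stdio.h>',
    '#include "stdio.h"',
    '#include<linux/init.h>',
    '#include stdio.h',        # hypothetical line with no < or "
]

# The same bracket set written without and with backslashes.
plain = re.compile(r'^#include[ ]*[<"]')
escaped = re.compile(r'^#include[ ]*[\<\"]')

plain_hits = [bool(plain.match(l)) for l in lines]
escaped_hits = [bool(escaped.match(l)) for l in lines]

print(plain_hits)
print(escaped_hits)
```

Both spellings accept the same lines here, which suggests the bracket set itself, not the backslashes, is doing the work.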

  • Now we need to match a run of characters. [a-z] matches any one character from 'a' to 'z', so the expression becomes ^#include[ ]*[\<\"][a-z]

    [Figure: Match characters]

    But this highlights just one character, so let's append +, which matches the previous pattern one or more times. Now the expression is ^#include[ ]*[\<\"][a-z]+. To be on the safe side, let's also match capital letters, which gives ^#include[ ]*[\<\"][a-zA-Z]+

    [Figure: Match all characters]

  • Now let's also match /, . and _. Inside square brackets these characters are literal, so the backslashes are not required (they are harmless here). With them the expression looks like this: ^#include[ ]*[\<\"][a-zA-Z\/\.\_]+

    [Figure: Match special characters]
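How the match grows over the last three steps can be sketched in Python. This is an illustration under assumptions: Python's re module stands in for Emacs, the patterns are written without backslashes inside the brackets (this engine does not need them there), and the dictionary names steps and grown are made up for the example.

```python
import re

line = "#include <linux/platform_device.h>"
prefix = r'^#include[ ]*[<"]'

# One pattern per step of the walkthrough, keyed by the part that was added.
steps = {
    "[a-z]": prefix + r"[a-z]",
    "[a-z]+": prefix + r"[a-z]+",
    "[a-zA-Z/._]+": prefix + r"[a-zA-Z/._]+",
}

# The text each pattern covers on the sample line.
grown = {name: re.match(p, line).group(0) for name, p in steps.items()}

for name, text in grown.items():
    print(f"{name:14} {text}")
```

The single [a-z] stops after one letter, [a-z]+ stops at the first /, and the wider set runs up to, but not including, the closing >.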

  • Finally, the closing > or " (double quote) remains. It can again be matched with [\>\"]. Our final expression is:
    "^#include[ ]*[\<\"][a-zA-Z\/\.\_]+[\>\"]"
    

    [Figure: Match all]
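The finished expression can be checked against all of the sample header lines with a short Python sketch. Treat it as a sanity check rather than a statement about Emacs: Python's re module is a different engine, and the pattern is written without the backslashes inside brackets. It also probes one hypothetical line with mismatched delimiters, which the two independent bracket sets accept.

```python
import re

# The sample header lines from this post, one per line.
headers = '''#include <stdio.h>
#include <linux/stdio.h>
#include  <linux/stdio.h>
#include <linux/module.h>
#include<linux/slab.h>
#include<linux/init.h>
#include <linux/types.h>
#include <linux/dmi.h>
#include <linux/delay.h>
#include <linux/platform_device.h>
#include <linux/power_supply.h>
#include "stdio.h"
#include "linux/stdio.h"
#include "linux/stdio.h"
#include  "linux/module.h"'''.splitlines()

final = re.compile(r'^#include[ ]*[<"][a-zA-Z/._]+[>"]')

# Lines the expression fails to cover completely (expected to be none).
unmatched = [h for h in headers if not final.fullmatch(h)]

# Hypothetical malformed line: opens with < but closes with ".
loose = bool(final.match('#include <stdio.h"'))

print(unmatched, loose)
```

Every sample line is covered end to end. The malformed line is accepted as well, because the opening and closing bracket sets are not tied to each other; that is fine for highlighting headers but worth knowing if the expression is reused for validation.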

[Figure: Regex in animated form]

This ends the introduction to Emacs’s re-builder. For more information, please visit Xah Lee’s page on regex.
