Amanda Olsen

Front End Developer User Experience Enthusiast

Regular Expression Basics

Note: I prepared the following notes for a presentation/workshop that I recently gave at work to the UI Engineer community.

Terms

  1. Regular Expression – a pattern describing a certain amount of text
    1. It is case sensitive by default; most engines offer a flag (often i) to ignore case.
  2. Literal character – a character we want to find
  3. Metacharacter – a character which does the finding
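To make the three terms concrete, here is a minimal sketch in TypeScript (also valid JavaScript). The sample words and variable names are made up for illustration and are not part of the original workshop.

```typescript
// "cat" is made only of literal characters: it matches exactly that text.
const literalPattern = /cat/;
// "c.t" mixes literals with the "." metacharacter (any one character).
const metaPattern = /c.t/;

const hasLowercaseCat = literalPattern.test("concatenate"); // true
const hasUppercaseCat = literalPattern.test("CAT");         // false: case sensitive
const ignoresCase = /cat/i.test("CAT");                     // true: the i flag ignores case
const matchesCut = metaPattern.test("cut");                 // true: "." stands in for "u"
```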

Metacharacters

  1. All Metacharacters: . | () [] ^ $ ? * + {} \
  2. Descriptions
    1. .
      1. Official Name: period, dot, or wildcard
      2. Shorthand: period
      3. Matches: any character except newline
      4. Example: .
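A quick sketch of the dot in action, assuming JavaScript-style regular expressions. The sample string is hypothetical.

```typescript
// "." matches any single character except a newline.
const dotPattern = /gr.y/g;
const dotMatches = "gray grey gr8y gr\ny".match(dotPattern) || [];
// dotMatches is ["gray", "grey", "gr8y"]; "gr\ny" is skipped because of the newline
```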
    2. |
      1. Official Name: alternation
      2. Shorthand: Pipe
      3. Matches: the left-hand or right-hand value
        1. The left-hand side is tried first; if it matches, the right-hand side is skipped at that position.
        2. Example:
          1. find e or m
          2. e|m
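The pipe example can be tried out as follows. This is a small sketch assuming JavaScript-style regular expressions; the second pattern is an extra illustration of the left-side-first rule and is not from the original notes.

```typescript
// Find every e or m in a word.
const eOrM = /e|m/g;
const eOrMMatches = "metacharacter".match(eOrM) || [];
// eOrMMatches is ["m", "e", "e"]

// The left-hand side wins when both sides could match at the same position.
const leftSideFirst = ("grayscale".match(/gray|grayscale/) || [""])[0];
// leftSideFirst is "gray", not "grayscale"
```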
    3. ()
      1. Official Name: capture group
        1. A slightly more official name might be “subexpression”.
        2. They can be nested to any depth.
      2. Matches: a group
        1. Example:
          1. Find ‘gray’ or ‘grey’
          2. RegEx: gr(a|e)y
        2. However, it also creates a variable which some call a “backreference”.
          1. $1 through $9
          2. Example:
            1. Wrap all “grays/greys” in strong tag
              1. Find: (gr(a|e)y)
              2. Replace:
                <strong>$1</strong>
      3. Shorthand Name: none
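The find-and-replace example above might look like this in code. It is a sketch that assumes JavaScript's String.replace, where $1 and $2 refer to the first and second capture groups; the sample sentence is made up.

```typescript
// Group 1 is the whole word; group 2 is the nested (a|e).
const grayWords = /(gr(a|e)y)/g;

const wrappedText = "gray cat, grey dog".replace(grayWords, "<strong>$1</strong>");
// wrappedText is "<strong>gray</strong> cat, <strong>grey</strong> dog"

// The nested group gets its own number, counted by opening parenthesis.
const vowelsOnly = "gray cat, grey dog".replace(grayWords, "$2");
// vowelsOnly is "a cat, e dog"
```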
    4. []
      1. Official name: Character Class
      2. Shorthand Name: Character Set
      3. Matches: exactly one character, which may be any of the characters listed in the brackets.
        1. Matches each character individually (i.e. it is not matching a string of characters).
        2. The order of the characters does not matter.
        3. If the global flag is set, it matches every occurrence in the text rather than only the first.
      4. Examples:
        1. Find: all lowercase letters
          1. RegEx: [a-z]
        2. Find: lowercase letters a through g and digits 1 through 4
          1. RegEx: [a-g1-4]
        3. Find: gray and grey
          1. RegEx: gr[ae]y
      5. Shorthand Character Classes (sometimes simply called “Character Classes”, but that’s less accurate)
        1. \d any digit – shorthand for [0-9]
        2. \D anything but a digit
        3. \s any whitespace character (space, tab, newline)
        4. \S anything but whitespace
        5. \w any word character (letter, digit, or underscore)
        6. \W anything but a word character
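Here is a small sketch of character classes and their shorthands, assuming JavaScript-style regular expressions with the global flag. The sample strings are hypothetical.

```typescript
// Each match is a single character drawn from the set.
const lowercaseLetters = "Regex 101!".match(/[a-z]/g) || []; // ["e", "g", "e", "x"]
const aToGOr1To4 = "bad 1995 zoo".match(/[a-g1-4]/g) || [];  // ["b", "a", "d", "1"]
const grayOrGrey = "gray grey gruy".match(/gr[ae]y/g) || []; // ["gray", "grey"]

// Shorthand character classes.
const digitChars = "Room 42B".match(/\d/g) || [];   // ["4", "2"]
const wordChars = "a_1 !".match(/\w/g) || [];       // ["a", "_", "1"]
const nonWordChars = "a_1 !".match(/\W/g) || [];    // [" ", "!"]
const whitespaceChars = "a b\tc".match(/\s/g) || []; // [" ", "\t"]
```

Note that \w matches digits and the underscore as well as letters.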
  3. Anchors
    1. ^
      1. Official Name:
        1. Left Anchor (if outside of square brackets)
        2. Caret (if inside square brackets)
      2. Shorthand Name: none
      3. Matches
        1. If it’s before a letter, it means the letter following must be at the beginning of a string.
          1. Find: S at beginning of line
          2. RegEx: ^S
        2. If it’s the first character inside square brackets, it means “not”.
          1. Example: [^m-z] (any character except m through z)
    2. $
      1. Official Name: Right Anchor
      2. Shorthand: none
      3. Matches: If after a letter, it means the letter previous must be at the end of a string.
      4. Example
        1. Find: ! at end of string
        2. RegEx: !$
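The anchor examples can be checked with a few one-liners. This is a sketch assuming JavaScript-style regular expressions; the sample strings are made up.

```typescript
// ^ outside brackets: the match must start at the beginning of the string.
const startsWithS = /^S/.test("Sam I am"); // true
const sInMiddle = /^S/.test("I am Sam");   // false

// $: the match must end at the end of the string.
const endsWithBang = /!$/.test("Stop!");      // true
const bangInMiddle = /!$/.test("Stop! Now");  // false

// ^ as the first character inside brackets: "not".
const notMToZ = "maze".match(/[^m-z]/g) || []; // ["a", "e"]
```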
  4. Quantifiers – control the number of times the preceding character or group is matched
    1. ?
      1. Official name: question mark
      2. Shorthand Name: none
      3. Matches: 0 or 1 repetition of the preceding character
      4. Unsure, but careful
        1. I don’t know what I want! I just know I don’t want a lot of it.
      5. Example
        1. Find metacharacter with or without the “s”
          1. RegEx: metacharacters?
        2. Find all variations of Jennifer
          1. Jenn(ifer)?(s)?
          2. Jenn(ifer)?(s|y)?
          3. We’ll do further variations on this in the next section.
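A sketch of the question mark examples follows. The ^ and $ anchors are an addition here (not in the patterns above) so that each test has to match the whole word; the list of names is hypothetical.

```typescript
// The trailing "s" is optional.
const optionalS = /^metacharacters?$/;
const singularOk = optionalS.test("metacharacter");   // true
const pluralOk = optionalS.test("metacharacters");    // true
const doubleSOk = optionalS.test("metacharacterss");  // false: at most one "s"

// Optional groups: "ifer" and a final "s" or "y" may each be absent.
const jenniferVariants = /^Jenn(ifer)?(s|y)?$/;
const acceptedNames = ["Jenn", "Jenny", "Jennifer", "Jennifers", "Jen"]
  .filter((candidate) => jenniferVariants.test(candidate));
// acceptedNames is ["Jenn", "Jenny", "Jennifer", "Jennifers"]
```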
    2. *
      1. Official Name: Asterisk or Star
      2. Matches: 0 or more repetitions of the preceding character
      3. Shorthand Name: The Star
        • I’m the star, I’ll take everything, even if it’s not there.
          1. “Even if it’s not there” means it won’t break the regular expression.
      4. Example
        • Find metacharacter with or without the “s” (this also matches several s’s in a row)
          1. RegEx: metacharacters*
        • Find all variations of Jennifer and include the last name (whether or not it exists).
          1. Jenn(ifer)?(s|y)?\s*\w*
        • Find dollar amounts with or without dollars (i.e. could be cents only).
          1. \$\d*\.
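Here is a sketch of the star examples, assuming JavaScript-style regular expressions. The Jennifer pattern is written with a star after both \s and \w so that the space and the last name are each optional, which is an assumption about what the example intends; the names are made up, and the ^ and $ anchors in the first pattern are added so the whole word must match.

```typescript
// Zero, one, or many "s" characters are all accepted.
const starS = /^metacharacters*$/;
const noS = starS.test("metacharacter");       // true
const manyS = starS.test("metacharactersss");  // true

// \s* and \w* let the last name be present or absent.
const withLastName = /Jenn(ifer)?(s|y)?\s*\w*/;
const fullNameMatch = ("Jenny Smith".match(withLastName) || [""])[0]; // "Jenny Smith"
const firstNameOnly = ("Jennifer".match(withLastName) || [""])[0];    // "Jennifer"
```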
    3. +
      1. Official Name: Plus sign
      2. Matches: 1 or more repetitions of the preceding character
      3. Shorthand Name: More friends!
        • He always wants more friends, and won’t take no for an answer.
      4. Example
        • Find: metacharacter with one or more s’s
        • RegEx: metacharacters+
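The plus sign differs from the star only in refusing an empty match, as this sketch shows. The anchors and the digit example are additions for illustration.

```typescript
// At least one "s" is required.
const plusS = /^metacharacters+$/;
const plusNoS = plusS.test("metacharacter");       // false: needs one or more "s"
const plusOneS = plusS.test("metacharacters");     // true
const plusManyS = plusS.test("metacharactersss");  // true

// Combined with a shorthand class: runs of one or more digits.
const digitRuns = "a1 b22 c333".match(/\d+/g) || [];
// digitRuns is ["1", "22", "333"]
```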
    4. {}
      1. Official Name: Repetition Operator
      2. Shorthand Name: quantifier, or curly brackets
      3. Matches: n number of repetitions of the preceding character
      4. Example
        1. Find: metacharacters with a specific number of “s”
          1. Exactly 3: metacharacters{3}
          2. Between 2 and 3: metacharacters{2,3}
          3. 4 or more: metacharacters{4,}
        2. Find a standard telephone number
          1. RegEx: [0-9]{3}-[0-9]{4} finds 123-4567 (a standard telephone number)
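The three curly-bracket forms can be compared side by side. This is a sketch assuming JavaScript-style regular expressions; the anchors and the sample sentence are additions for illustration.

```typescript
// {3}: exactly three.
const exactlyThree = /^metacharacters{3}$/;
const threeS = exactlyThree.test("metacharactersss"); // true
const twoSFails = exactlyThree.test("metacharacterss"); // false

// {2,3}: between two and three.
const twoToThree = /^metacharacters{2,3}$/;
const twoSOk = twoToThree.test("metacharacterss");  // true
const oneSFails = twoToThree.test("metacharacters"); // false

// {4,}: four or more.
const fourOrMore = /^metacharacters{4,}$/;
const fiveSOk = fourOrMore.test("metacharactersssss"); // true
const threeSFails = fourOrMore.test("metacharactersss"); // false

// Three digits, a hyphen, four digits.
const phonePattern = /[0-9]{3}-[0-9]{4}/;
const phoneFound = ("Call 123-4567 today".match(phonePattern) || [""])[0]; // "123-4567"
```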
  5. \
    1. Official Name: Backslash
    2. Matches: forces the following metacharacter to be treated as a literal character.
    3. Example
      1. Find: dollar amounts
    2. RegEx: \$\d*\.\d{2}
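To see why escaping matters, the sketch below compares escaped and unescaped versions, assuming JavaScript-style regular expressions. It assumes the dollar-amount pattern is meant to escape the period as well as the dollar sign; the sample strings are made up.

```typescript
// Escaped $ and . are literal characters.
const dollarAmounts = "Lunch was $12.50, gum was $.99, total 13.49"
  .match(/\$\d*\.\d{2}/g) || [];
// dollarAmounts is ["$12.50", "$.99"]

// An unescaped $ is the end anchor, so this can never match.
const unescapedDollar = /$5/.test("$5"); // false
const escapedDollar = /\$5/.test("$5");  // true

// An unescaped . matches any character, so a digit can stand in for the period.
const looseDot = /\$\d*.\d{2}/.test("$1234");   // true
const strictDot = /\$\d*\.\d{2}/.test("$1234"); // false: no literal period
```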

Resources
