Regular Expression Tips

These regular expression features are sometimes useful when the things you want to match are tricky.

Look-around Assertions

  • ?= Positive Lookahead
  • ?<= Positive Lookbehind
  • ?! Negative Lookahead
  • ?<! Negative Lookbehind

(?=foo) asserts what immediately follows the current position in the string is foo.
(?<=foo) asserts what immediately precedes the current position in the string is foo.
(?!foo) asserts what immediately follows the current position in the string is not foo.
(?<!foo) asserts what immediately precedes the current position in the string is not foo.
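
As a rough illustration of the four assertions above, here is a minimal sketch. It assumes Ruby, whose regexp syntax matches the notation used in this post; the helper names and the sample string are made up for the example.

```ruby
# Hypothetical helpers; names and sample text are illustrative only.

# Positive lookahead: digits immediately followed by " USD".
def followed_by_usd(text)
  text.scan(/\d+(?= USD)/)
end

# Positive lookbehind: digits immediately preceded by "$".
def preceded_by_dollar(text)
  text.scan(/(?<=\$)\d+/)
end

# Negative lookahead: whole numbers NOT followed by " USD".
# The \b anchors stop the engine from matching only part of a number.
def not_followed_by_usd(text)
  text.scan(/\b\d+\b(?! USD)/)
end

# Negative lookbehind: digit runs NOT preceded by "$" (or by another digit,
# which would mean the match started in the middle of a number).
def not_preceded_by_dollar(text)
  text.scan(/(?<![\$\d])\d+/)
end

sample = 'price: $42, cost: 17 USD, id#99'
p followed_by_usd(sample)        # ["17"]
p preceded_by_dollar(sample)     # ["42"]
p not_followed_by_usd(sample)    # ["42", "99"]
p not_preceded_by_dollar(sample) # ["17", "99"]
```

Note that the assertions consume nothing: " USD" and "$" are checked but never become part of the match.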

Look-around assertions can be expensive to compute, and in a backtracking engine the cost of a badly written pattern can grow exponentially with input size, so use them with great care. A single bad regular expression can take down a large part of the internet: read Cloudflare's excellent postmortem on their outage of July 2, 2019. Remember, the plural of regexp is regrets.
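
To make the cost concern concrete, here is a small sketch, again assuming Ruby. The two patterns and the method names are illustrative and are not taken from the Cloudflare incident. Both patterns accept exactly the same strings, but the nested form gives a backtracking engine many ways to split the same run of characters, which is where the blow-up on near-miss inputs comes from (newer engines with memoization may avoid it). The inputs are kept tiny on purpose.

```ruby
# Two patterns that accept exactly the same strings: one or more "a".
NESTED = /\A(a+)+\z/ # nested quantifiers: many ways to split the same run
FLAT   = /\Aa+\z/    # single quantifier: only one way to match

# Hypothetical helpers for comparing the two patterns.
def nested_match?(s)
  NESTED.match?(s)
end

def flat_match?(s)
  FLAT.match?(s)
end

# Short inputs only, so even the nested pattern stays cheap here.
["a", "aaa", "aaab", "", "b"].each do |s|
  puts "#{s.inspect}: nested=#{nested_match?(s)} flat=#{flat_match?(s)}"
end
```

When a pattern can be flattened like this without changing what it accepts, the flat form is usually the safer choice.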

Character Properties

\p{Blank} matches a space or tab
\p{Han} matches Han characters (Chinese characters, also used as Japanese kanji)
\p{Hiragana} matches Hiragana characters
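
A minimal sketch of these properties in use, assuming Ruby and a UTF-8 source file; the helper names and the sample string are invented for the example.

```ruby
# Hypothetical helpers; the sample text below is illustrative only.
# Assumes the source file and the input string are UTF-8.

# Runs of Han characters.
def han_runs(text)
  text.scan(/\p{Han}+/)
end

# Runs of Hiragana characters.
def hiragana_runs(text)
  text.scan(/\p{Hiragana}+/)
end

# Number of blank characters (spaces and tabs).
def blank_count(text)
  text.scan(/\p{Blank}/).length
end

sample = "東京 は\tとても hot"
p han_runs(sample)      # ["東京"]
p hiragana_runs(sample) # ["は", "とても"]
p blank_count(sample)   # 3
```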

Remember, no matching is faster than any matching! Where you can, avoid these features by transforming your input first, eliminating the need to match at all, or matching smaller parts of the string.
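
As one possible illustration of that advice, here is a sketch assuming Ruby. The task (pulling the value out of a single-line "name=value" string) and both method names are hypothetical. The first version uses a lookbehind; the second gets the same result with a plain string split and no regular expression at all.

```ruby
# Hypothetical task: extract the value from a single line shaped like "name=value".

# Regex version: everything after the first "=", found with a lookbehind.
def value_with_regex(line)
  line[/(?<==).*/]
end

# Plain string version: split once on the first "=" and keep the rest.
def value_without_regex(line)
  _key, sep, value = line.partition("=")
  sep.empty? ? nil : value
end

["name=value", "a=b=c", "k=", "novalue"].each do |line|
  puts "#{line.inspect}: #{value_with_regex(line).inspect} / #{value_without_regex(line).inspect}"
end
```

The plain version has no pattern to backtrack through, so its cost is easy to reason about.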

Hope these are helpful, cheers!
