Friday, June 26, 2015

Python Regular Expressions

Regular expressions (regexes) are a language for describing patterns in text. Although Python string methods like str.startswith() and str.endswith() can test for fixed prefixes and suffixes, a regex can recognize string patterns where the exact value is not known in advance.
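As a small illustration of that difference, here is a sketch using Python's standard re module. The filenames and the pattern are made up for this example.

```python
import re

# Hypothetical filenames, invented for this example.
filenames = ["report_2015.txt", "report_final.txt", "notes_2014.txt"]

# Fixed prefix: str.startswith() only helps when the exact text is known.
fixed = [name for name in filenames if name.startswith("report_")]

# Pattern: some word characters, an underscore, a four-digit year, then ".txt".
year_pattern = re.compile(r"^\w+_\d{4}\.txt$")
by_pattern = [name for name in filenames if year_pattern.match(name)]

print(fixed)
print(by_pattern)
```

The first list keeps every file that begins with "report_", whatever follows. The second keeps only files whose names end in a four-digit year, no matter how they begin.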

Books

Mastering Regular Expressions (O'Reilly Publishing)

This is truly the only book you will ever need: a massive encyclopedia (over 500 pages) covering regular expressions in depth across several major programming languages. Affiliate link


Online Testers

Regex101

This is one of the best regular expression testers. It supports several regex dialects (PHP, JavaScript, and Python). You can build your expressions interactively and test them against sample text. Additional features include:
  • online regex reference guide
  • display regular expressions in plain English
  • display match groups
  • automatically generate Python code for any regular expression
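If match groups are new to you, the sketch below shows what they look like in plain Python. The log line and the group names are invented for illustration, and the code a tester generates for you will likely look different.

```python
import re

# Hypothetical log line, made up for this example.
line = "2015-06-26 ERROR disk full"

# Named groups (?P<name>...) label each captured piece of the match.
pattern = re.compile(
    r"(?P<date>\d{4}-\d{2}-\d{2})\s+(?P<level>[A-Z]+)\s+(?P<message>.*)"
)

match = pattern.match(line)
if match:
    print(match.group(0))     # the entire matched text
    print(match.groups())     # every group, in order, as a tuple
    print(match.groupdict())  # named groups as a dictionary
```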


Command Line Tools

grep

Written more than 40 years ago, grep is one of the oldest regular expression tools. You can find free versions for all the major operating systems (Linux, Windows, Mac). With grep you can search through large numbers of text files, using regular expressions to narrow your search. If you work with text files a lot, grep will probably end up being your main search tool. If you need a quick overview of grep's options, look at this short tutorial. Read the full article
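To get a feel for what grep does, here is a rough Python imitation. The function name grep_lines and the sample lines are hypothetical, and this is not how grep itself is implemented. Note too that grep's default pattern syntax differs in places from Python's.

```python
import re

def grep_lines(pattern, lines, ignore_case=False):
    """Return (line_number, line) pairs for lines where the pattern is found."""
    flags = re.IGNORECASE if ignore_case else 0
    regex = re.compile(pattern, flags)
    # search() looks for the pattern anywhere in the line, as grep does.
    return [(n, line) for n, line in enumerate(lines, start=1) if regex.search(line)]

# Made-up sample text standing in for the contents of a file.
sample = [
    "error: disk full",
    "all systems normal",
    "Error: timeout after 30s",
]

print(grep_lines(r"^error", sample))
print(grep_lines(r"^error", sample, ignore_case=True))
print(grep_lines(r"\d+s$", sample))
```

The ignore_case argument plays roughly the role of grep's -i option, and the line numbers resemble what grep -n reports.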


ack

This command line tool is very similar to grep, but it's optimized for searching source code. It's written in Perl, which means you'll need to install Perl if you don't already have it on your system. Read the full article


Puzzles and Games

RegEx Crossword

Test your regular expression skills! This is an online game similar to Sudoku, but all the clues are written as regular expressions. Start at the "beginner" level and see how far you can go. Play now


Videos

Python for Informatics : Regular Expressions

This 35-minute lecture is Lesson 11 in the Python for Informatics course by Charles Severance. Dr. Severance explains how to use the Python regular expression library to clean up "dirty" data. And he'll show you his tattoo. Watch the video
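For a taste of what cleaning up "dirty" data can mean, here is a minimal sketch. It is not taken from the lecture. The records, the clean() helper, and the target phone format are all assumptions made for this example.

```python
import re

# Made-up "dirty" records: uneven spacing and inconsistent phone formats.
raw = [
    "  Alice   Smith , 555.123.4567 ",
    "Bob Jones,(555) 987-6543",
    "Carol  White , 555 222 3333",
]

def clean(record):
    """Normalize one 'name, phone' record (hypothetical helper)."""
    name, phone = record.split(",", 1)
    # Collapse runs of whitespace into one space, then trim the ends.
    name = re.sub(r"\s+", " ", name).strip()
    # Remove every non-digit character, then rebuild a uniform format.
    digits = re.sub(r"\D", "", phone)
    return name, f"{digits[:3]}-{digits[3:6]}-{digits[6:]}"

cleaned = [clean(record) for record in raw]
for name, phone in cleaned:
    print(name, phone)
```

This sketch assumes every record holds exactly ten phone digits. Real data would need a check for records that do not.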


Libraries

Python "re" module

A detailed explanation of Python's standard "re" module with examples. Read the full article
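As a quick orientation before you read it, this sketch touches a few of the module's most common functions. The sample text is made up. All the functions shown are part of the standard library.

```python
import re

text = "cat concat catalog"

at_start = re.match(r"cat", text)      # match() only tries the start of the string
anywhere = re.search(r"log", text)     # search() scans the whole string
whole = re.fullmatch(r"cat", text)     # fullmatch() needs the entire string to match
words = re.findall(r"\bcat\w*", text)  # findall() returns every non-overlapping match
replaced = re.sub(r"\bcat\b", "dog", text)  # sub() replaces each match

print(at_start, anywhere, whole)
print(words)
print(replaced)
```

The \b in the last two patterns marks a word boundary, which is why "concat" is skipped even though it contains "cat".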


2 comments:

  1. Hi! Thank you for the good articles, on behalf of python community.

    PS: There is a typo in this article ..It is not ack, but awk...
