Python Find Words

From Regex Regular Expression Encyclopedia

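You can use this recipe to find a single word in a block of text. The expression matches only the complete word: it must be bounded on each side by a non-word character (such as a space or punctuation) or by the start or end of the text.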

Code

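import re

# \b on each side restricts the match to "word" as a whole word.
r = re.compile(r'\bword\b')
with open('sample.txt') as f:  # the with block closes the file afterwards
    text = f.read()
if r.search(text):
    print("I finally found what I'm looking for.")
else:
    print('"word" is not here, man.')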

How It Works

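A special zero-width assertion, \b, allows you to easily search for whole words in Python, just as in Perl. It matches the position between a word character (a letter, digit, or underscore) and a non-word character, or the start or end of the text, without consuming any characters itself. This is an advantage because, without doing a whole bunch of extra work, you can make sure that a search for word, for example, doesn't yield unexpected matches such as sword.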
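To see the difference \b makes, the following minimal sketch compares the whole-word pattern with a plain substring search. The sample strings and the names whole_word and substring are made up for illustration and are not part of the recipe.

```python
import re

whole_word = re.compile(r'\bword\b')  # the recipe's pattern
substring = re.compile(r'word')       # same letters, no boundaries

# Illustrative sample strings (assumptions, not from the recipe).
samples = ['a word here', 'sword fight', 'word, then more', 'wordy', 'password']

for text in samples:
    # Report whether each pattern finds a match somewhere in the text.
    print(repr(text),
          'whole word:', bool(whole_word.search(text)),
          'substring:', bool(substring.search(text)))
```

Only the first and third samples should match as whole words, while the plain substring search matches all five.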

You can easily break the regular expression shown here into the following:

Regular Expression    Description
\b                    a word boundary (the start of the text, or the position just after a space, punctuation, or other non-word character), followed by . . .
w                     a w, followed by . . .
o                     an o, followed by . . .
r                     an r, then . . .
d                     a d, and finally . . .
\b                    a word boundary at the end of the word.
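The same breakdown can be written directly into the pattern with the re.VERBOSE flag, which ignores whitespace and allows comments inside the expression. The sketch below also includes contains_word, a hypothetical helper (not part of the recipe) that builds the pattern for any search word, using re.escape so that special characters in the word are treated literally.

```python
import re

# Equivalent to r'\bword\b', spelled out with one comment per piece.
pattern = re.compile(r"""
    \b   # word boundary before the word
    w    # the letter w
    o    # the letter o
    r    # the letter r
    d    # the letter d
    \b   # word boundary after the word
""", re.VERBOSE)


def contains_word(word, text):
    """Hypothetical helper: True if `word` occurs as a whole word in `text`."""
    # re.escape keeps characters such as '.' or '+' from acting as regex syntax.
    return re.search(r'\b' + re.escape(word) + r'\b', text) is not None


print(bool(pattern.search('the word is out')))
print(contains_word('word', 'swords and passwords'))
```

Note that this approach assumes the search word starts and ends with a word character. For a word such as C++, the trailing \b would not behave as a whole-word check.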