Regular Expressions in Python

Introduction

Regular expressions are a powerful tool for many kinds of string manipulation. They form a domain-specific language (DSL) that is available as a library in most modern programming languages. A regular expression is a special sequence of characters that describes a pattern, which is then used to match or find text inside another string.

In Python, regular expressions are available through the re module, which is part of the standard library.

The match() Function

The match() function tries to match the pattern at the beginning of a string. On success, match() returns a match object describing the match; otherwise it returns None.

The syntax of the match() function is

re.match(pattern, string, flags = 0)

The flags argument is optional. It accepts constants defined in the re module, such as re.IGNORECASE, re.MULTILINE, and re.DOTALL, which can be combined with the | operator.
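
As a rough illustration of the optional flags argument, the sketch below uses re.IGNORECASE and re.MULTILINE, two constants that the re module provides. The sample strings and variable names are made up for this example.

```python
import re

text = 'Hello World'

# Without flags the match is case-sensitive, so 'hello' does not match 'Hello'.
case_sensitive = re.match('hello', text)

# re.IGNORECASE makes the pattern match regardless of letter case.
case_insensitive = re.match('hello', text, flags=re.IGNORECASE)

# Flags are combined with the bitwise OR operator.
# re.MULTILINE lets ^ match at the start of every line, not only the first.
multi = re.search('^world', 'Hello\nWORLD', flags=re.IGNORECASE | re.MULTILINE)

print(case_sensitive)            # None
print(case_insensitive.group())  # Hello
print(multi.group())             # WORLD
```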

import re
string = 'Hello world'
pattern = 'world'
if re.match(pattern, string):
    print('Match found')
else:
    print('Match not found')

# Output: Match not found

import re
string = 'Hello world'
pattern = 'Hello'
if re.match(pattern, string):
    print('Match found')
else:
    print('Match not found')

# Output: Match found

re.match() finds a match only at the beginning of the string. That is why the first example fails: 'world' is present in 'Hello world', but not at the start.
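
The match object returned on success can also be inspected. The following is a minimal sketch using group() and span(), two methods of match objects; the variable names are only illustrative.

```python
import re

string = 'Hello world'
match = re.match('Hello', string)

if match:
    matched_text = match.group()   # the text that matched
    position = match.span()        # (start, end) indexes of the match
    print(matched_text, position)  # Hello (0, 5)

# 'world' is in the string, but not at the start, so match() returns None.
no_match = re.match('world', string)
print(no_match)  # None
```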

The search() Function

The search() function looks for the first occurrence of a pattern anywhere within a string, with optional flags. If the search succeeds, a match object is returned; otherwise the result is None.

The syntax of the search() function is

re.search(pattern, string, flags = 0)

import re
string = 'Hello world'
pattern = 'world'
if re.search(pattern, string):
    print('Match found')
else:
    print('Match not found')

# Output: Match found

re.search() finds a match of a pattern anywhere in the string, not only at the beginning.
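
To make the difference between the two functions concrete, here is a small sketch that runs both on the same input. The sample string is invented for this example and contains the pattern twice, which shows that search() reports only the first occurrence.

```python
import re

string = 'Hello world, wonderful world'
pattern = 'world'

at_start = re.match(pattern, string)   # None: the string does not start with 'world'
anywhere = re.search(pattern, string)  # match object for the first 'world'

print(at_start)         # None
print(anywhere.span())  # (6, 11)
```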

The sub() Function

The sub() function searches a string for a pattern and replaces each match with a replacement string.

The syntax of the sub() function is

re.sub(pattern, repl, string, count=0, flags=0)

import re
string = 'she sells sea shells on the sea shore'
pattern = 'sea'
replace = 'ocean'
new = re.sub(pattern, replace, string, count=1)
print(new)

# Output: she sells ocean shells on the sea shore

In this program only one occurrence is replaced because count is 1. With the default count of 0, every occurrence is replaced.
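
For comparison, the sketch below leaves out the limit on replacements (the parameter that the re module names count), so every match is replaced. It also uses a pattern that is more than a literal word; the second input string is made up for illustration.

```python
import re

string = 'she sells sea shells on the sea shore'

# count=0 (the default) replaces every occurrence.
replaced_all = re.sub('sea', 'ocean', string)

# The pattern can be a real regular expression, not only a literal word.
# \s+ matches one or more whitespace characters.
collapsed = re.sub(r'\s+', ' ', 'too    many   spaces')

print(replaced_all)  # she sells ocean shells on the ocean shore
print(collapsed)     # too many spaces
```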

The findall() Function

The findall() function searches a string and returns a list of all non-overlapping matches of a pattern. If no match is found, the returned list is empty.

The syntax of the findall() function is

re.findall(pattern, string, flags=0)

import re
txt = "The rain in Spain"
x = re.findall("ai", txt)
print(x)

# Output: ['ai', 'ai']
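
The empty-list case mentioned above can be checked directly. This sketch reuses the same sentence; the patterns 'snow' and \w+ain are chosen only for illustration.

```python
import re

txt = 'The rain in Spain'

no_hits = re.findall('snow', txt)   # the pattern does not occur in txt
words = re.findall(r'\w+ain', txt)  # whole words ending in 'ain'

print(no_hits)  # []
print(words)    # ['rain', 'Spain']
```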

The finditer() Function

The finditer() function returns an iterator that yields a match object for each non-overlapping match of the pattern in a string.

The syntax of the finditer() function is

re.finditer(pattern, string, flags=0)

import re
s = 'apple, banana '
pattern = '[aeoui]'
matches = re.finditer(pattern, s)
for match in matches:
    print(match)

# Output:
# <re.Match object; span=(0, 1), match='a'>
# <re.Match object; span=(4, 5), match='e'>
# <re.Match object; span=(8, 9), match='a'>
# <re.Match object; span=(10, 11), match='a'>
# <re.Match object; span=(12, 13), match='a'>
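
Printing the match objects is rarely the end goal. A minimal sketch of pulling the matched text and its position out of each one, using the group() and start() methods of match objects, might look like this; the variable name vowel_positions is just illustrative.

```python
import re

s = 'apple, banana '
pattern = '[aeiou]'

# Each item yielded by finditer() is a match object, so group() gives the
# matched text and start() gives its index in the string.
vowel_positions = [(m.group(), m.start()) for m in re.finditer(pattern, s)]

print(vowel_positions)
# [('a', 0), ('e', 4), ('a', 8), ('a', 10), ('a', 12)]
```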

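The split() Function

The split() function splits an input string into a list of substrings at each position where the regular expression matches. The optional maxsplit argument limits the number of splits; the rest of the string is returned as the last element.

The syntax of the split() function is

re.split(pattern, string, maxsplit=0, flags=0)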

import re
txt = "The apple is red in color"
x = re.split(r"\s", txt, maxsplit=2)
print(x)

# Output: ['The', 'apple', 'is red in color']
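
For contrast, here is a small sketch of split() without a split limit, plus a separator pattern that matches more than one kind of delimiter. The second input string is invented for this example.

```python
import re

txt = 'The apple is red in color'

# maxsplit=0 (the default) splits at every match.
all_words = re.split(r'\s', txt)

# The separator can be any pattern, here a comma or semicolon
# followed by optional whitespace.
fields = re.split(r'[,;]\s*', 'red, green;blue')

print(all_words)  # ['The', 'apple', 'is', 'red', 'in', 'color']
print(fields)     # ['red', 'green', 'blue']
```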