Introduction
Regular expressions are a powerful tool for many kinds of string manipulation. They are a domain-specific language (DSL) available as a library in most modern programming languages. A regular expression is a special sequence of characters that describes a pattern used to match or find strings within another string.
In Python, regular expressions are available through the re module, which is part of the standard library.
The match() Function
The match() function tries to match a pattern at the beginning of a string. On success it returns a match object; otherwise it returns None.
The syntax of the match() function is:
re.match(pattern, string, flags = 0)
The flags argument is optional and defaults to 0 (no flags). It accepts constants such as re.IGNORECASE or re.MULTILINE, which can be combined with the | operator.
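As a small illustration of the optional flags argument, the sketch below passes re.IGNORECASE to re.match(). The variable names (text, case_sensitive, case_insensitive) are chosen only for this example.

```python
import re

text = 'Hello world'

# Without flags, matching is case-sensitive: 'hello' does not match 'Hello'.
case_sensitive = re.match('hello', text)

# re.IGNORECASE tells the engine to ignore letter case.
case_insensitive = re.match('hello', text, flags=re.IGNORECASE)

print(case_sensitive)             # None
print(case_insensitive.group())   # Hello
```

Note that group() returns the text as it appears in the string, not as it was written in the pattern.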
import re
string = 'Hello world'
pattern = 'world'
# 'world' is not at the start of the string, so match() returns None
if re.match(pattern, string):
    print('Match found')
else:
    print('Match not found')
# Output: Match not found
import re
string = 'Hello world'
pattern = 'Hello'
# 'Hello' is at the start of the string, so match() succeeds
if re.match(pattern, string):
    print('Match found')
else:
    print('Match not found')
# Output: Match found
The re.match() function finds a match only at the beginning of the string.
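The object returned by match() can also be inspected rather than just tested for truth. The following is a minimal sketch using the group() and span() methods of the match object; the variable names are illustrative.

```python
import re

text = 'Hello world'

result = re.match('Hello', text)
if result is not None:
    print(result.group())   # Hello   (the matched text)
    print(result.span())    # (0, 5)  (start and end index of the match)

# 'world' occurs in the string, but not at position 0, so match() gives None.
no_match = re.match('world', text)
print(no_match)             # None
```

Checking for None before calling group() avoids an AttributeError when the pattern does not match.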
The search() Function
The search() function scans a string for the first occurrence of a pattern, with optional flags. If the search succeeds, a match object is returned; otherwise None is returned.
The syntax of the search() function is:
re.search(pattern, string, flags = 0)
import re
string = 'Hello world'
pattern = 'world'
# search() scans the whole string, so 'world' is found
if re.search(pattern, string):
    print('Match found')
else:
    print('Match not found')
# Output: Match found
The re.search() function finds a match of a pattern anywhere in the string.
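To make the difference between search() and match() concrete, here is a short sketch on a string that contains the pattern twice. The sample string is an assumption made for this example.

```python
import re

text = 'Hello world, wonderful world'

# search() reports only the first occurrence, wherever it is in the string.
found = re.search('world', text)
print(found.group())   # world
print(found.span())    # (6, 11)

# match() looks only at the start of the string, so it finds nothing here.
at_start = re.match('world', text)
print(at_start)        # None
```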
The sub() Function
The sub() function searches for a pattern in a string and replaces each match with a replacement string.
The syntax of the sub() function is:
re.sub(pattern, repl, string, count = 0, flags = 0)
import re
string = 'she sells sea shells on the sea shore'
pattern = 'sea'
replace = 'ocean'
# count=1 limits the substitution to the first match
new = re.sub(pattern, replace, string, count=1)
print(new)
# Output: she sells ocean shells on the sea shore
In this program only one occurrence was replaced because count is 1.
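For comparison, the sketch below runs the same substitution with and without a limit on the number of replacements. It assumes the standard re.sub() behaviour in which a count of 0 (the default) means "replace every match".

```python
import re

text = 'she sells sea shells on the sea shore'

# Default count (0): every occurrence of 'sea' is replaced.
replaced_all = re.sub('sea', 'ocean', text)

# count=1: only the first occurrence is replaced.
replaced_one = re.sub('sea', 'ocean', text, count=1)

print(replaced_all)   # she sells ocean shells on the ocean shore
print(replaced_one)   # she sells ocean shells on the sea shore
```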
The findall() Function
The findall() function searches a string and returns a list of all non-overlapping matches of a pattern. If no match is found, the returned list is empty.
The syntax of the findall() function is:
re.findall(pattern, input_str, flags = 0)
import re
txt = "The rain in Spain"
# Collect every occurrence of 'ai' in the text
x = re.findall("ai", txt)
print(x)
# Output: ['ai', 'ai']
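Two further cases are worth seeing side by side: a pattern with no matches, and a pattern with a capturing group. The sketch below is illustrative; the patterns 'snow' and (\w)ai are made up for this example.

```python
import re

text = 'The rain in Spain'

# No occurrence of 'snow', so the result is an empty list.
none_found = re.findall('snow', text)

# With one capturing group, findall() returns the group's text,
# not the whole match: here, the character just before each 'ai'.
before_ai = re.findall(r'(\w)ai', text)

print(none_found)   # []
print(before_ai)    # ['r', 'p']
```

Because an empty list is falsy, the result can be tested directly with "if none_found:".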
The finditer() Function
The finditer() function returns an iterator that yields match objects for all non-overlapping matches of the pattern in a string.
The syntax of the finditer() function is:
re.finditer(pattern, string, flags=0)
import re
s = 'apple, banana '
pattern = '[aeoui]'
# Each item produced by finditer() is a match object for one vowel
matches = re.finditer(pattern, s)
for match in matches:
    print(match)
# Output:
# <re.Match object; span=(0, 1), match='a'>
# <re.Match object; span=(4, 5), match='e'>
# <re.Match object; span=(8, 9), match='a'>
# <re.Match object; span=(10, 11), match='a'>
# <re.Match object; span=(12, 13), match='a'>
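Since finditer() yields match objects, each match's position and text can be pulled out individually. A minimal sketch, using the same sample string and the start() and group() methods:

```python
import re

text = 'apple, banana '

# Build a list of (position, vowel) pairs from the match objects.
vowels = [(m.start(), m.group()) for m in re.finditer('[aeiou]', text)]

print(vowels)   # [(0, 'a'), (4, 'e'), (8, 'a'), (10, 'a'), (12, 'a')]
```

This is the main practical difference from findall(), which returns only the matched strings and not where they were found.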
The split() Function
The split() function splits a string into a list of substrings at each position where the pattern matches.
The syntax of the split() function is:
re.split(pattern, string, maxsplit=0, flags=0)
import re
txt = "The apple is red in color"
# Raw string keeps \s intact; maxsplit=2 allows at most two splits
x = re.split(r"\s", txt, maxsplit=2)
print(x)
# Output: ['The', 'apple', 'is red in color']
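The effect of maxsplit is easiest to see next to an unlimited split. The sketch below also shows \s+ as one possible way to treat a run of whitespace as a single separator; the second sample string is an assumption made for illustration.

```python
import re

text = 'The apple is red in color'

# maxsplit defaults to 0, which means "no limit".
all_parts = re.split(r'\s', text)

# maxsplit=2: at most two splits, so the rest of the string stays in one piece.
first_two = re.split(r'\s', text, maxsplit=2)

# \s+ matches one or more whitespace characters as a single separator.
collapsed = re.split(r'\s+', 'a  b   c')

print(all_parts)   # ['The', 'apple', 'is', 'red', 'in', 'color']
print(first_two)   # ['The', 'apple', 'is red in color']
print(collapsed)   # ['a', 'b', 'c']
```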