Common methods of the re module

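A regular expression (often abbreviated in code as regex, regexp, or RE) is a concept from computer science. Regular expressions are typically used to search for and replace text that fits a given pattern (rule).
Given a regular expression and a string, we can do the following:
Check whether the string satisfies the filtering logic of the regular expression (this is called a "match");
Extract the specific parts we want from the string.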
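
The two goals above can be sketched as follows. This is only a minimal illustration: the sample string and the names text, has_digits, and order_id are made up for the example and do not come from the original text.

```python
import re

# Hypothetical sample data, invented for this illustration.
text = "order id: 12345, status: shipped"

# Goal 1: does the string contain something that fits the pattern?
has_digits = re.search(r"\d+", text) is not None

# Goal 2: extract the specific part we care about with a capture group.
m = re.search(r"order id: (\d+)", text)
order_id = m.group(1) if m is not None else None

print(has_digits, order_id)
```

Both calls use re.search, which is described later in this section.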
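Regular expressions have the following characteristics:
They are highly flexible, logical, and capable;
They can achieve complex string manipulation quickly and very concisely;
They can be cryptic and hard to read for newcomers.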
Working with the re module
In Python, regular expression operations are performed through the re module.

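match(string[, pos[, endpos]])
This is the signature of the match method on a compiled Pattern object. string is the string to match against; pos and endpos are optional arguments that give the start and end positions within the string, and they default to 0 and len(string) respectively.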
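
As a rough sketch of how pos and endpos behave, assuming a pattern compiled with re.compile (the sample string is invented for the example):

```python
import re

pat = re.compile("hello")
text = "say hello world"

# Without pos, matching starts at index 0, where "hello" does not begin.
at_start = pat.match(text)

# With pos=4, matching starts at index 4, exactly where "hello" begins.
from_four = pat.match(text, 4)

# With endpos=8 the string is treated as if it ended after "hell".
cut_short = pat.match(text, 4, 8)

print(at_start, from_four.span(), cut_short)
```

Note that the module-level re.match function does not take pos or endpos; only the Pattern method does.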

# match: tries to match at the start of the string and returns at most one match
# Signature: re.match(pattern, string, flags=0)
import re

result = re.match("hello", "hellolzt world")
print(result, result.group(), type(result))

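re.match tries to match pattern at the beginning of string. If it succeeds (the match may be an empty string), it returns the corresponding Match object; otherwise it returns None.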
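
Because re.match can return None, calling .group() on the result without a check raises AttributeError when nothing matches. The sketch below shows one way to guard against that, and also shows that an empty match still counts as a success. The helper name first_word is hypothetical.

```python
import re

def first_word(text):
    """Return the leading run of lowercase letters, or None if there is none."""
    m = re.match(r"[a-z]+", text)
    return m.group() if m is not None else None

print(first_word("hello world"))
print(first_word("2018hello"))

# "\d*" can match zero digits, so this succeeds with an empty string.
empty = re.match(r"\d*", "hello")
print(empty is not None, repr(empty.group()))
```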

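search method

Looks for the pattern anywhere in the string and matches only once, returning as soon as the first match is found.
On a compiled Pattern object the signature is search(string[, pos[, endpos]]), where string is the string to search and the optional pos and endpos arguments give the start and end positions within it. search scans through string for the first location where the regular expression pattern matches (the match may be an empty string) and returns the corresponding Match object. If nothing matches, it returns None.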

# Signature: re.search(pattern, string, flags=0)
result = re.search("hello", "2018hellolzt world")
print(result.group())  # hello
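
To make the difference between match and search concrete, here is a small side-by-side sketch using the same input string as above:

```python
import re

text = "2018hellolzt world"

# match only looks at the start of the string, which is "2018...".
match_result = re.match("hello", text)

# search scans forward and finds "hello" at index 4.
search_result = re.search("hello", text)

print(match_result)
print(search_result.span())
```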

fullmatch method

fullmatch(pattern, string, flags=0) is the whole-string counterpart of match: the pattern must match from the beginning of the string through to its end.

# Signature: re.fullmatch(pattern, string, flags=0)
result = re.fullmatch("hello", "hello1")
print(result)  # None, because the trailing "1" is not covered by the pattern

If the whole string matches pattern, the corresponding Match object is returned; otherwise None is returned.
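
A short comparison of match and fullmatch on the same inputs may help. The dictionary name results is just for this sketch.

```python
import re

results = {}
for s in ("hello", "hello1"):
    # Record whether each function finds a match for this input.
    results[s] = (
        re.match("hello", s) is not None,
        re.fullmatch("hello", s) is not None,
    )

print(results)
```

match accepts "hello1" because the pattern fits at the start, while fullmatch rejects it because the trailing "1" is left over.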

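findall method

Returns all non-overlapping matching substrings as a list, or an empty list if nothing matches. On a compiled Pattern object the signature is findall(string[, pos[, endpos]]), where string is the string to search and the optional pos and endpos arguments give the start and end positions within it.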

# Signature: re.findall(pattern, string, flags=0)
result = re.findall("hello", "lzt hello china hello world")
print(result, type(result))
# Returns a list: ['hello', 'hello'] <class 'list'>
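
The following sketch illustrates the empty-list case and the pos argument of the compiled-pattern form of findall. The variable names are chosen for the example only.

```python
import re

pat = re.compile("hello")
text = "lzt hello china hello world"

all_hits = pat.findall(text)        # search the whole string
later_hits = pat.findall(text, 10)  # start searching at index 10
no_hits = re.findall("xyz", text)   # nothing matches

print(all_hits, later_hits, no_hits)
```

Starting at index 10 skips the first "hello" (which begins at index 4), so only the second one is found.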

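split method

Splits the string at every substring that matches the pattern and returns the pieces as a list. On a compiled Pattern object the signature is split(string[, maxsplit]); maxsplit sets the maximum number of splits, and if it is omitted the string is split at every match.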

# Signature: re.split(pattern, string, maxsplit=0, flags=0)
result = re.split("hello", "hello china hello world", maxsplit=2)
print(result, type(result))
# Returns the list of pieces: ['', ' china ', ' world'] <class 'list'>
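
As an illustration of how maxsplit limits the number of pieces, here is a sketch that splits on runs of whitespace. The sample string is made up.

```python
import re

text = "a b  c d"

# Split at every run of whitespace.
all_parts = re.split(r"\s+", text)

# Stop after two splits; the remainder stays in the last element.
limited = re.split(r"\s+", text, maxsplit=2)

print(all_parts, limited)
```

Passing maxsplit by keyword keeps it from being confused with the flags argument.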

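sub method

Used for replacement. On a compiled Pattern object the signature is sub(repl, string[, count]); repl can be either a string or a function:
(1) If repl is a string, it is used to replace every matching substring.
(2) If repl is a function, it takes a single argument (a Match object) and returns the string to use as the replacement.
(3) count sets the maximum number of replacements; if it is omitted, every match is replaced.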

# Signature: re.sub(pattern, repl, string, count=0, flags=0)
result = re.sub("hello", "hi", "hello china hello world", count=2)
print(result, type(result))  # hi china hi world <class 'str'>

Replaces the text matched by pattern with repl, performing at most count replacements.
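
The case where repl is a function is easier to see with an example. In this sketch the function double and the sample sentence are hypothetical.

```python
import re

def double(m):
    """Take a Match object and return the replacement string."""
    return str(int(m.group()) * 2)

text = "3 apples and 10 pears"

every = re.sub(r"\d+", double, text)           # replace all numbers
first = re.sub(r"\d+", double, text, count=1)  # replace only the first

print(every)
print(first)
```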

finditer method

# Signature: re.finditer(pattern, string, flags=0)
result = re.finditer("hello", "hello world hello china")
print(result, type(result))
# Returns an iterator that yields Match objects
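
Printing the iterator itself shows little, so in practice one loops over it. A minimal sketch that collects the position of each match:

```python
import re

spans = []
for m in re.finditer("hello", "hello world hello china"):
    # Each item is a Match object with the usual group() and span() methods.
    spans.append(m.span())

print(spans)
```

Unlike findall, which returns only the matched strings, finditer gives access to the full Match object for every hit.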

compile method

The compile function compiles a regular expression and produces a Pattern object.

# Signature: re.compile(pattern, flags=0)
pat = re.compile("hello")
print(pat, type(pat))
result = pat.search("helloworld")
print(result, type(result))
# compile returns a reusable Pattern object
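
One common reason to compile is to reuse the same pattern, together with its flags, across many strings. A small sketch with made-up inputs:

```python
import re

# Compile once, with the case-insensitive flag baked in.
pat = re.compile("hello", re.I)

inputs = ["Hello world", "say HELLO", "goodbye"]

# Reuse the compiled pattern for every input.
hits = [s for s in inputs if pat.search(s)]

print(hits)
```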

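flags

Some functions in the re module accept flags as an optional argument. The most commonly used flags are listed below; each one corresponds to a binary bit value, so they can be combined with bitwise OR. Flags can change how a regular expression behaves:
re.I, re.IGNORECASE: case-insensitive matching.
re.M, re.MULTILINE: "^" matches at the start of the string and right after each "\n"; "$" matches right before each "\n" and at the end of the string. Commonly called multiline mode.
re.S, re.DOTALL: "." matches any character, including the newline. Commonly called single-line mode.
To use single-line mode and multiline mode together, set the optional flags argument to re.S | re.M.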

result = re.match("hello", "HeLlo", flags=re.I)
print(result)  # a Match object: case is ignored
result = re.findall("^abc", "abcde\nabcd", re.M)
print(result)  # ['abc', 'abc']
result = re.findall("e$", "abcde\nabcd", re.M)
print(result)  # ['e']
result = re.findall(".", "hello \n china", flags=re.S)
# With re.S, "." also matches the newline
print(result)
result = re.findall(".", "hello \n china", flags=re.M)
# re.M does not affect ".", so the newline is not matched
print(result)
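
Combining single-line and multiline mode with bitwise OR can be sketched as below. The sample text and pattern are invented so that the match needs both flags at once: re.M lets "^" match at the start of the second line, and re.S lets "." match the newline between "middle" and "end".

```python
import re

text = "start\nmiddle\nend"
pattern = r"^middle.end$"

only_m = re.findall(pattern, text, flags=re.M)      # "." cannot cross "\n"
only_s = re.findall(pattern, text, flags=re.S)      # "^" only at string start
both = re.findall(pattern, text, flags=re.S | re.M)  # both behaviours enabled

print(only_m, only_s, both)
```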