I'm surprised that I can't match a German umlaut in a regexp. I tried several approaches, most of them involving locale settings, but so far to no avail.
import locale
import re
locale.setlocale(locale.LC_ALL, 'de_DE.UTF-8')
re.findall(r'\w+', 'abc def g\xfci jkl', re.L)
re.findall(r'\w+', 'abc def g\xc3\xbci jkl', re.L)
re.findall(r'\w+', 'abc def güi jkl', re.L)
re.findall(r'\w+', u'abc def güi jkl', re.L)
None of these versions matches the u-umlaut (ü) correctly with \w+. Removing the re.L flag or prefixing the pattern string with u (to make it unicode) did not help either.
Any ideas? How is the re.L flag used correctly?
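A minimal sketch of the behaviour being asked for, assuming Python 3 (the snippets above look like Python 2, where the rough equivalent would be a unicode string combined with the re.UNICODE flag rather than re.L). In Python 3, str patterns are Unicode-aware by default, so no locale setup is involved. The variable names are only illustrative.

```python
import re

# "güi" written with an escape so the example does not depend on source encoding
text = "abc def g\u00fci jkl"

# str patterns are Unicode-aware by default in Python 3, so \w matches ü
words = re.findall(r"\w+", text)
print(words)  # ['abc', 'def', 'güi', 'jkl']

# With re.ASCII, \w only covers [a-zA-Z0-9_], so the word is split at ü
ascii_words = re.findall(r"\w+", text, re.ASCII)
print(ascii_words)  # ['abc', 'def', 'g', 'i', 'jkl']
```

The second call shows the kind of split that appears when \w is not treated as Unicode-aware.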