Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Python regex to extract phone numbers from string

I am very new to regex. Using Python's re module, I want to extract the phone numbers from the following multi-line string:

 Source = """<p><strong>Kuala Lumpur</strong><strong>:</strong> +60 (0)3 2723 7900</p>
        <p><strong>Mutiara Damansara:</strong> +60 (0)3 2723 7900</p>
        <p><strong>Penang:</strong> + 60 (0)4 255 9000</p>
        <h2>Where we are </h2>
        <strong>&nbsp;Call us on:</strong>&nbsp;+6 (03) 8924 8686
        </p></div><div class="sys_two">
    <h3 class="parentSchool">General enquiries</h3><p style="FONT-SIZE: 11px">
     <strong>&nbsp;Call us on:</strong>&nbsp;+6 (03) 8924 8000
+ 60 (7) 268-6200 <br />
 Fax:<br /> 
 +60 (7) 228-6202<br /> 
Phone:</strong><strong style="color: #f00">+601-4228-8055</strong>"""

Once I have the right pattern, I should be able to find the numbers using

phone = re.findall(pattern, Source, re.DOTALL)

 ['+60 (0)3 2723 7900',
  '+60 (0)3 2723 7900',
  '+ 60 (0)4 255 9000',
  '+6 (03) 8924 8686',
  '+6 (03) 8924 8000',
  '+ 60 (7) 268-6200',
  '+60 (7) 228-6202',
  '+601-4228-8055']

Please help me identify the right pattern.



1 Reply


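Using the re module: match a literal + followed by digits, whitespace, hyphens, and parentheses, and stop just before the next + or < (ignoring any whitespace in between).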

>>> import re
>>> Source = """<p><strong>Kuala Lumpur</strong><strong>:</strong> +60 (0)3 2723 7900</p>
        <p><strong>Mutiara Damansara:</strong> +60 (0)3 2723 7900</p>
        <p><strong>Penang:</strong> + 60 (0)4 255 9000</p>
        <h2>Where we are </h2>
        <strong>&nbsp;Call us on:</strong>&nbsp;+6 (03) 8924 8686
        </p></div><div class="sys_two">
    <h3 class="parentSchool">General enquiries</h3><p style="FONT-SIZE: 11px">
     <strong>&nbsp;Call us on:</strong>&nbsp;+6 (03) 8924 8000
+ 60 (7) 268-6200 <br />
 Fax:<br /> 
 +60 (7) 228-6202<br /> 
Phone:</strong><strong style="color: #f00">+601-4228-8055</strong>"""

>>> for i in re.findall(r'\+[-()\s\d]+?(?=\s*[+<])', Source):
    print(i)


+60 (0)3 2723 7900
+60 (0)3 2723 7900
+ 60 (0)4 255 9000
+6 (03) 8924 8686
+6 (03) 8924 8000
+ 60 (7) 268-6200
+60 (7) 228-6202
+601-4228-8055
>>> 
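
For readers new to regex, the same pattern can be written in verbose form with one comment per piece. This is only a sketch: the names `sample` and `PHONE_RE` are illustrative, and `sample` is a shortened excerpt of the question's text rather than the full string. Note that re.DOTALL is not needed here, because the pattern contains no `.` and `\s` already matches newlines.

```python
import re

# Illustrative, shortened excerpt of the question's Source string.
sample = """<p><strong>Penang:</strong> + 60 (0)4 255 9000</p>
<strong>&nbsp;Call us on:</strong>&nbsp;+6 (03) 8924 8000
+ 60 (7) 268-6200 <br />
Phone:</strong><strong style="color: #f00">+601-4228-8055</strong>"""

# Same pattern as in the answer, spelled out with re.VERBOSE.
PHONE_RE = re.compile(r"""
    \+              # a literal plus sign starts every number
    [-()\s\d]+?     # hyphens, parentheses, whitespace, digits (lazy)
    (?=\s*[+<])     # stop before optional whitespace and the next + or <
""", re.VERBOSE)

phones = PHONE_RE.findall(sample)
for phone in phones:
    print(phone)
```

The lazy quantifier together with the lookahead is what keeps trailing whitespace out of each match. The pattern relies on every number being followed by a tag or by another number, which holds for the text in the question but may not hold for other input.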
