Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
269 views
in Technique[技术] by (71.8m points)

regex - php regular expression to get the specific url

I would like to get the urls from a webpage that starts with "../category/" from these tags below:

<a href="../category/product/pc.html" target="_blank">PC</a><br>
<a href="../category/product/carpet.html" target="_blank">Carpet</a><br>

Any suggestion would be very much appreciated.

Thanks!

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

No regular expressions is required. A simple XPath query with DOM will suffice:

$dom = new DOMDocument;
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

$nodes = $xpath->query('//a[starts-with(@href, "../category/")]');
foreach ($nodes as $node) {
    echo $node->nodeValue.' = '.$node->getAttribute('href').PHP_EOL;
}

Will print:

PC = ../category/product/pc.html
Carpet = ../category/product/carpet.html

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...