Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
400 views
in Technique[技术] by (71.8m points)

php - DomDocument and special characters

This is my code:

$oDom = new DOMDocument();
$oDom->loadHTML("èàéìòù");
echo $oDom->saveHTML();

This is the output:

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
<html><body><p>&Atilde;&uml;&Atilde;&nbsp;&Atilde;&copy;&Atilde;&not;&Atilde;&sup2;&Atilde;&sup1;</p></body></html>

I want this output:

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
<html><body><p>èàéìòù</p></body></html>

I've tried with ...

$oDom = new DomDocument('4.0', 'UTF-8');

or with 1.0 and other stuffs but nothing.

Another thing ... There is a way to obtain the same untouched HTML? For example with this html in input <p>hello!</p> obtain the same output <p>hello!</p> using DOMDocument only for parsing the DOM and to do some substitutions inside the tags.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

Solution:

$oDom = new DOMDocument();
$oDom->encoding = 'utf-8';
$oDom->loadHTML( utf8_decode( $sString ) ); // important!

$sHtml = '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">';
$sHtml .= $oDom->saveHTML( $oDom->documentElement ); // important!

The saveHTML() method works differently specifying a node. You can use the main node ($oDom->documentElement) adding the desired !DOCTYPE manually. Another important thing is utf8_decode(). All the attributes and the other methods of the DOMDocument class, in my case, don't produce the desired result.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...