Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
242 views
in Technique[技术] by (71.8m points)

php - HTML filter that is HTML5 compliant

Is there a simple approach to add a HTML5 ruleset for HTMLPurifier?

HP can be configured to recognize new tags with:

// setup configurable HP instance
$config = HTMLPurifier_Config::createDefault();
$config->set('HTML.DefinitionID', 'html5 draft');
$config->set('HTML.DefinitionRev', 1);
$config->set('Cache.DefinitionImpl', null); // no caching
$def = $config->getHTMLDefinition(true);

// add a new tag
$form = $def->addElement(
  'article',   // name
  'Block',     // content set
  'Flow',      // allowed children
  'Common',    // attribute collection
  array(       // attributes
  )
);

// add a new attribute
$def->addAttribute('a', 'contextmenu', "ID");

However this is clearly a bit of work. Since there are a lot of new HTML5 tags and attributes that had to be registered. And new global attributes should be combinable even with existing HTML 4 tags. (It's difficult to judge from the docs how to augment core rules). So, is there a more useful config format/array structure to feed new and updated tag+attribute+context configuration (inline/block/empty/flow/..) into HTMLPurifier?

# mostly confused about how to extend existing tags:
$def->addAttribute('input', 'type', "...|...|...");

# or how to allow data-* attributes (if I actually wanted that):
$def->addAttribute("data-*", ...

And of course not all new HTML5 tags are fit for unrestricted allowance. HTMLPurifier is all about content filtering. Defining value constraints is where it's at. -- <canvas> for example might not be that big of a deal when it appears in user content. Because it's useless at best without Javascript (which HP already filters out). But other tags and attributes might be undesirable; so a flexible configuration structure is imperative for enabling/disabling tags and their associated attributes.

(Guess I should update some research...). But there's still no practical compendium/specification (no, XML DTDs aren't) that suits a HP configuration.

(Uh, and HTML5 is no longer a draft.)

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

The php tidy extension can be configured to recognize html5 tags. http://tidy.sourceforge.net/docs/quickref.html#new-blocklevel-tags


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...