
blowsie/Pure-JavaScript-HTML5-Parser: A Pure JavaScript HTML5 Parser


Project name:

blowsie/Pure-JavaScript-HTML5-Parser

Repository URL:

https://github.com/blowsie/Pure-JavaScript-HTML5-Parser

Programming language:

JavaScript 78.4%

Project description:

Pure JavaScript HTML5 Parser

A working demo can be seen here.

Credit goes to John Resig for his code written back in 2008 and Erik Arvidsson for his code written prior to that.

This code has been updated to work with HTML5 and to fix several problems.

4 Libraries in One!

A SAX-style API

Handles tags, text, and comments with callbacks. For example, to implement a simple HTML-to-XML serialization scheme, you could use the following:

var results = "";

HTMLParser("<p id=test>hello <i>world", {
  start: function( tag, attrs, unary ) {
    results += "<" + tag;
 
    for ( var i = 0; i < attrs.length; i++ )
      results += " " + attrs[i].name + '="' + attrs[i].escaped + '"';
 
    results += ">";
  },
  end: function( tag ) {
    results += "</" + tag + ">";
  },
  chars: function( text ) {
    results += text;
  },
  comment: function( text ) {
    results += "<!--" + text + "-->";
  }
});

results == '<p id="test">hello <i>world</i></p>'
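The same callback interface can drive other kinds of processing besides serialization. The following is a minimal sketch, not part of the library: `makeCollector` is a hypothetical helper that gathers tag names and plain text. Only the callback names and their argument order are taken from the example above; the parser's calls are simulated by hand here so the snippet runs without the library loaded.

```javascript
// Hypothetical helper (not part of the library): builds a handler object
// that records every opened tag name and concatenates all text nodes.
function makeCollector() {
  var state = { tags: [], text: "" };
  return {
    state: state,
    handlers: {
      start: function (tag, attrs, unary) {
        state.tags.push(tag);
      },
      end: function (tag) {
        // Nothing to record when a tag closes.
      },
      chars: function (text) {
        state.text += text;
      },
      comment: function (text) {
        // Comments are ignored by this collector.
      }
    }
  };
}

// Assumption: these are the calls a parser would make, in order, for
// "<p id=test>hello <i>world". They are written out manually here.
var collector = makeCollector();
collector.handlers.start("p", [{ name: "id", escaped: "test" }], false);
collector.handlers.chars("hello ");
collector.handlers.start("i", [], false);
collector.handlers.chars("world");
collector.handlers.end("i");
collector.handlers.end("p");

console.log(collector.state.tags.join(","));
console.log(collector.state.text);
```

With the library loaded, `collector.handlers` would presumably be passed as the second argument to `HTMLParser` in place of the manual calls.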

XML Serializer

There is no need to implement the above yourself, since it is included directly in the library. Just feed in HTML and it returns an XML string.

var results = HTMLtoXML("<p>Data: <input disabled>")
results == '<p>Data: <input disabled="disabled"></p>'

DOM Builder

If you’re using the HTML parser to inject into an existing DOM document (or within an existing DOM element) then htmlparser.js provides a simple method for handling that:

// The following is appended into the document body
HTMLtoDOM("<p>Hello <b>World", document)
 
// The following is appended into the specified element
HTMLtoDOM("<p>Hello <b>World", document.getElementById("test"))

DOM Document Creator

This is a more advanced version of the DOM builder: it includes logic for handling the overall structure of a web page and returns a new DOM document.

This method enforces a few rules:

  • There will always be an html, head, body, and title element.
  • There will only be one html, head, body, and title element (if the user specifies more, they will be moved to the appropriate locations and merged).
  • link and base elements are forced into the head.

You would use the method like so:

var dom = HTMLtoDOM("<p>Data: <input disabled>");
dom.getElementsByTagName("body").length == 1
dom.getElementsByTagName("p").length == 1
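The structural guarantees described above can be checked in one pass. This is a rough sketch under stated assumptions: `countStructuralElements` and `hasSingleStructure` are hypothetical helpers, not library functions, and `stubDoc` is a stand-in object so the snippet runs outside a browser. The only DOM call used is `getElementsByTagName`.

```javascript
// Hypothetical helper: counts html, head, body, and title elements in
// any object that exposes getElementsByTagName.
function countStructuralElements(doc) {
  var names = ["html", "head", "body", "title"];
  var counts = {};
  for (var i = 0; i < names.length; i++) {
    counts[names[i]] = doc.getElementsByTagName(names[i]).length;
  }
  return counts;
}

// Hypothetical helper: true only when each structural element appears
// exactly once.
function hasSingleStructure(doc) {
  var counts = countStructuralElements(doc);
  for (var name in counts) {
    if (counts[name] !== 1) {
      return false;
    }
  }
  return true;
}

// Assumption: a stub standing in for a real DOM document, so the check
// can run without a browser.
var stubDoc = {
  getElementsByTagName: function (name) {
    var present = { html: 1, head: 1, body: 1, title: 1, p: 1 };
    return { length: present[name] || 0 };
  }
};

console.log(hasSingleStructure(stubDoc));
```

In a browser, the document returned by `HTMLtoDOM` could be passed to `hasSingleStructure` instead of the stub.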

While this library doesn’t cover the full gamut of possible weirdness that HTML provides, it does handle a lot of the most obvious stuff. All of the following are accounted for:

Unclosed Tags:

HTMLtoXML("<p><b>Hello") == '<p><b>Hello</b></p>'

Empty Elements:

HTMLtoXML("<img src=test.jpg>") == '<img src="test.jpg">'

Block vs. Inline Elements:

HTMLtoXML("<b>Hello <p>John") == '<b>Hello </b><p>John</p>'

Self-closing Elements:

HTMLtoXML("<p>Hello<p>World") == '<p>Hello</p><p>World</p>'

Attributes Without Values:

HTMLtoXML("<input disabled>") == '<input disabled="disabled">'
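The cases listed above lend themselves to a table-driven check. The sketch below is an illustration rather than part of the library: `findFailures` is a hypothetical helper that takes any converter function, so it can be exercised with a stub here and with `HTMLtoXML` once the library is loaded. The input and expected strings are copied from the examples above.

```javascript
// Input/expected pairs taken from the examples above.
var cases = [
  ["<p><b>Hello", "<p><b>Hello</b></p>"],
  ["<img src=test.jpg>", '<img src="test.jpg">'],
  ["<b>Hello <p>John", "<b>Hello </b><p>John</p>"],
  ["<p>Hello<p>World", "<p>Hello</p><p>World</p>"],
  ["<input disabled>", '<input disabled="disabled">']
];

// Hypothetical helper: runs every case through `convert` and returns a
// list describing each mismatch (an empty list means all cases passed).
function findFailures(convert, table) {
  var failures = [];
  for (var i = 0; i < table.length; i++) {
    var input = table[i][0];
    var expected = table[i][1];
    var actual = convert(input);
    if (actual !== expected) {
      failures.push({ input: input, expected: expected, actual: actual });
    }
  }
  return failures;
}

// Assumption: an identity function stands in for the real converter, so
// every case is reported as a mismatch in this demonstration.
var demoFailures = findFailures(function (html) { return html; }, cases);
console.log(demoFailures.length + " of " + cases.length + " cases differ");
```

With the library loaded, `findFailures(HTMLtoXML, cases)` would be the natural call; per the examples above it should return an empty list.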


