
machawk1/warcreate: Chrome extension to "Create WARC files from any webpage ...


Open-source project name:

machawk1/warcreate

Open-source project URL:

https://github.com/machawk1/warcreate

Programming language:

JavaScript 84.3%

Project description:

WARCreate

"Create WARC files from any webpage"


WARCreate is a Google Chrome extension that aims to "Create WARC files from any webpage".

Because WARC files are normally generated only by the Internet Archive's Heritrix archival crawler, providing another means of generating them from webpages opens the door to:

  • Preserving content that crawlers cannot reach (e.g., deep web content)
  • Sparing end users the complexity and overhead of setting up a Heritrix instance
  • Allowing a webpage to be interacted with (e.g., Facebook comments unrolled) prior to preservation, so that content not initially present in the page can still be captured

...among many other use cases.
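
For readers unfamiliar with the format, the following is a minimal sketch of how a single WARC 1.0 `response` record could be assembled in JavaScript. It is not taken from WARCreate's source: the function name `buildWarcResponseRecord`, its parameters, and the sample values are all hypothetical, and a real capture would also need `warcinfo` and `request` records plus binary-safe handling of the payload.

```javascript
// Hypothetical helper for illustration only; not WARCreate's actual code.
function buildWarcResponseRecord({ targetUri, httpResponse, recordId, date }) {
  const CRLF = '\r\n';
  // Content-Length counts bytes, not UTF-16 code units.
  const blockLength = new TextEncoder().encode(httpResponse).length;
  // WARC 1.0 dates are UTC timestamps without fractional seconds.
  const warcDate = date.toISOString().replace(/\.\d{3}Z$/, 'Z');
  const header = [
    'WARC/1.0',
    'WARC-Type: response',
    `WARC-Target-URI: ${targetUri}`,
    `WARC-Date: ${warcDate}`,
    `WARC-Record-ID: <urn:uuid:${recordId}>`,
    'Content-Type: application/http; msgtype=response',
    `Content-Length: ${blockLength}`,
  ].join(CRLF);
  // A blank line separates the header from the block; two CRLFs end the record.
  return header + CRLF + CRLF + httpResponse + CRLF + CRLF;
}

// Sample values are assumptions chosen to keep the output deterministic.
const record = buildWarcResponseRecord({
  targetUri: 'http://example.com/',
  httpResponse: 'HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n<p>héllo</p>',
  recordId: '00000000-0000-4000-8000-000000000000',
  date: new Date(Date.UTC(2012, 5, 10, 12, 0, 0)),
});
console.log(record);
```

The record ID and date are passed in rather than generated inside the function so that the output is reproducible; in practice they would come from a UUID generator and the capture time.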

WARCreate is in active development, though it has gone through several release and retraction periods because of changes to the Google Chrome extension API and to the rules governing extension distribution.

The original idea and prototype were published in the Proceedings of the Joint Conference on Digital Libraries 2012 (JCDL '12).

Install

The latest stable binary can be downloaded from the Chrome Web Store.

Citing Project

A publication related to this project appeared in the proceedings of JCDL 2012. Please cite it as follows:

Mat Kelly and Michele C. Weigle. WARCreate - Create Wayback-Consumable WARC Files from Any Webpage. In Proceedings of the ACM/IEEE Joint Conference on Digital Libraries (JCDL), pages 437–438, Washington, DC, June 2012.

@INPROCEEDINGS{warcreate-jcdl2012,
  AUTHOR    = {Mat Kelly and
               Michele C. Weigle},
  TITLE     = {{WARCreate} - Create Wayback-Consumable WARC Files from Any Webpage},
  BOOKTITLE = {Proceedings of the ACM/IEEE Joint Conference on Digital Libraries (JCDL)},
  PAGES     = {437--438},
  MONTH     = {June},
  YEAR      = {2012},
  ADDRESS   = {Washington, DC},
  DOI       = {10.1145/2232817.2232930}
}

Contact

WARCreate is a project of the Web Science and Digital Libraries (WS-DL) research group at Old Dominion University (ODU), created by Mat Kelly.

For support, e-mail [email protected] or tweet us at @machawk1 and/or @WebSciDL.

License

MIT



