Html Agility Pack: Html Agility Pack (HAP) is a free and open-source HTML parser written in C# that reads and writes the DOM and supports plain XPath or XSLT. It is a .NET code library that allows you to parse "out of the web" HTML files.
Stars: ✭ 2,014 (+381.82%)
Myhtml: Fast C/C++ HTML 5 parser, using threads.
Stars: ✭ 1,512 (+261.72%)
DHTMLParser: D HTML parser, similar to Python's BeautifulSoup
Stars: ✭ 17 (-95.93%)
Kanna: Kanna (鉋) is an XML/HTML parser for Swift.
Stars: ✭ 2,227 (+432.78%)
Clojure Soup: Clojurized access for Jsoup.
Stars: ✭ 38 (-90.91%)
htmlparser: Delphi HTML parser (code adapted from wr960204's original HtmlParser)
Stars: ✭ 65 (-84.45%)
Wxparse: Rich-text parsing for WeChat Mini Programs
Stars: ✭ 135 (-67.7%)
ioBroker.parser: Parse a website or file and extract data from it.
Stars: ✭ 14 (-96.65%)
Hyntax: Straightforward HTML parser for JavaScript
Stars: ✭ 84 (-79.9%)
pd3f: 🏭 PDF text extraction pipeline: self-hosted, local-first, Docker-based
Stars: ✭ 132 (-68.42%)
React Native Htmlview: A React Native component which renders HTML content as native views
Stars: ✭ 2,546 (+509.09%)
Modest: Modest is a fast HTML renderer implemented as a pure C99 library with no outside dependencies.
Stars: ✭ 572 (+36.84%)
html-parser: Html Parser - Html to Pug, Jinja2, Blade Converter | AppSeed
Stars: ✭ 40 (-90.43%)
Pywebcopy: Python library to mirror webpages and websites.
Stars: ✭ 156 (-62.68%)
ocr: Simple app to extract text from pictures using Tesseract
Stars: ✭ 98 (-76.56%)
Nsoup: NSoup is a .NET port of the jsoup (http://jsoup.org) HTML parser and sanitizer originally written in Java
Stars: ✭ 145 (-65.31%)
html-parser: Simple HTML to JSON parser using Regexp and String.indexOf
Stars: ✭ 34 (-91.87%)
Lua Gumbo: Moved to https://gitlab.com/craigbarnes/lua-gumbo
Stars: ✭ 116 (-72.25%)
Jsoupxpath: A pure-Java HTML parser that supports the W3C XPath 1.0 standard syntax, based on Jsoup and Antlr4. Maybe it is the best in Java, ha ha. Just try it.
Stars: ✭ 331 (-20.81%)
Wxmlify: A lightweight, fast plugin that helps you display HTML generated by rich-text editors in WeChat Mini Programs.
Stars: ✭ 93 (-77.75%)
mobi: Python-based software to unpack kindlegen-generated ebooks
Stars: ✭ 37 (-91.15%)
Flutter html: A Flutter widget for rendering static HTML as Flutter widgets (will render over 80 different HTML tags!)
Stars: ✭ 1,046 (+150.24%)
sherpa 41: Simple browser engine.
Stars: ✭ 31 (-92.58%)
Fuzi: A fast & lightweight XML & HTML parser in Swift with XPath & CSS support
Stars: ✭ 894 (+113.88%)
Prettyhtml: 💅 The formatter for the modern web https://prettyhtml.netlify.com/
Stars: ✭ 241 (-42.34%)
Posthtml: PostHTML is a tool to transform HTML/XML with JS plugins
Stars: ✭ 2,737 (+554.78%)
Html Parser: PHP HTML parser, similar to PHP Simple HTML DOM Parser but several times faster
Stars: ✭ 510 (+22.01%)
Aris: A fast and powerful tool to write HTML in JS easily. Includes syntax highlighting, templates, SVG, CSS autofixing, debugger support and more...
Stars: ✭ 61 (-85.41%)
Nokogiri: HTML parser for PHP
Stars: ✭ 214 (-48.8%)
modest ex: Elixir library to do pipeable transformations on HTML strings (with CSS selectors)
Stars: ✭ 31 (-92.58%)
Unhtml.rs: A magic HTML parser
Stars: ✭ 180 (-56.94%)
HtmlMonkey: Lightweight HTML/XML parser written in C#.
Stars: ✭ 37 (-91.15%)
Didom: Simple and fast HTML and XML parser
Stars: ✭ 1,939 (+363.88%)
Htmlquery: htmlquery is a Go XPath package for querying HTML.
Stars: ✭ 338 (-19.14%)
Minimize: Minimize HTML
Stars: ✭ 150 (-64.11%)
AdvancedHTMLParser: Fast indexed Python HTML parser which builds a DOM node tree, providing common getElementsBy* functions for scraping, testing, modification, and formatting. Also supports XPath.
Stars: ✭ 90 (-78.47%)
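Autocser: AutoCSer is a high-performance RPC framework. It is an integrated development framework designed around high efficiency, mainly comprising a TCP interface service framework, a TCP function service framework, a remote expression chain component, an integrated front-end/back-end web view framework, an ORM in-memory index cache framework, a log-stream in-memory database cache component, a message queue component, binary/JSON/XML data serialization, and other seamlessly integrated high-performance components.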
Stars: ✭ 140 (-66.51%)
html2any: 🌀 Parse and convert an HTML string to anything
Stars: ✭ 43 (-89.71%)
Harser: Easy way for HTML parsing and building XPath
Stars: ✭ 135 (-67.7%)
trafilatura: Python & command-line tool to gather text on the Web: web crawling/scraping, extraction of text, metadata, comments
Stars: ✭ 711 (+70.1%)
Save For Offline: Android app for saving webpages for offline reading.
Stars: ✭ 114 (-72.73%)
Jodd: Jodd! Lightweight. Java. Zero dependencies. Use what you like.
Stars: ✭ 3,616 (+765.07%)
Floki: Floki is a simple HTML parser that enables search for nodes using CSS selectors.
Stars: ✭ 1,642 (+292.82%)
bkit: Build a messenger bot using HTML
Stars: ✭ 36 (-91.39%)
Sax Wasm: The first streamable, fixed memory XML, HTML, and JSX parser for WebAssembly.
Stars: ✭ 89 (-78.71%)
any-text: Get text content from any file
Stars: ✭ 19 (-95.45%)
Oga: Read-only mirror of https://gitlab.com/yorickpeterse/oga
Stars: ✭ 1,147 (+174.4%)
html5parser: A super tiny and fast HTML5 AST parser.
Stars: ✭ 153 (-63.4%)
Marigold.openxhtml: MariGold.OpenXHTML is a wrapper library for Open XML SDK to convert HTML documents into Open XML word documents.
Stars: ✭ 44 (-89.47%)
Hquery.php: An extremely fast web scraper that parses megabytes of invalid HTML in the blink of an eye. PHP 5.3+, no dependencies.
Stars: ✭ 295 (-29.43%)
Htmlagilitypack.netcore: An agile HTML parser that builds a read/write DOM and supports plain XPath or XSLT. Deprecated, as the original HAP project has a new maintainer.
Stars: ✭ 31 (-92.58%)
html-parser: A simple and general-purpose HTML/XHTML parser, using Pest.
Stars: ✭ 56 (-86.6%)
Apifier: Apifier is a very simple HTML parser written in Python based on CSS selectors
Stars: ✭ 5 (-98.8%)
wagtail textract: Text extraction for Wagtail document search
Stars: ✭ 27 (-93.54%)
Webpageparser: A delightful XML and HTML parsing relish for iOS
Stars: ✭ 236 (-43.54%)
Nlp: [UNMAINTAINED] Extract values from strings and fill your structs with nlp.
Stars: ✭ 367 (-12.2%)
Pdftools: Text extraction, rendering and converting of PDF documents
Stars: ✭ 349 (-16.51%)
Htmlparser2: The fast & forgiving HTML and XML parser
Stars: ✭ 3,299 (+689.23%)
Skrape.it: A Kotlin-based testing/scraping/parsing library providing the ability to analyze and extract data from HTML (server & client-side rendered). It places particular emphasis on ease of use and a high level of readability by providing an intuitive DSL. It aims to be a testing lib, but can also be used to scrape websites in a convenient fashion.
Stars: ✭ 231 (-44.74%)