About
Parsel is a BSD-licensed Python library for extracting and removing data from HTML and XML using XPath and CSS selectors, optionally combined with regular expressions. To use it, create a selector object for the HTML or XML text you want to parse, then select elements with CSS or XPath expressions. CSS is a language for applying styles to HTML documents; it defines selectors that associate those styles with specific HTML elements. XPath is a language for selecting nodes in XML documents, and it also works with HTML. Either can be used: CSS is usually more readable, but some things can only be done with XPath. Because Parsel is built atop lxml, its selectors support some EXSLT extensions and come with pre-registered namespaces for use in XPath expressions. Selectors can also be chained, so most of the time you can select by class using CSS and switch to XPath only when needed.
|
About
python-docx is a Python library for creating and updating Microsoft Word (.docx) files. Paragraphs are fundamental in Word: they are used for body text, but also for headings and for list items such as bullets. A paragraph style applies a whole set of formatting options to a paragraph at once, so it is worth learning about if you have not met one before. When adding a picture, you are free to specify both width and height, but usually you would not want to. If you specify only one, python-docx uses it to calculate the properly scaled value of the other, so the aspect ratio is preserved and the picture does not look stretched. python-docx lets you create new documents as well as change existing ones. Strictly speaking, it only changes existing documents; starting from a document that has no content just feels like creating one from scratch.
|
About
Warcat is a tool and library for handling Web ARChive (WARC) files. Its commands naively join archives into one, extract files from an archive, list the available commands, list the contents of an archive, load an archive and write it back out, split archives into individual records, and verify digests and validate conformance. The goal of the Warcat project is to make manipulating WARC files as easy and fast as manipulating any other archive, such as tar and zip archives. Warcat is designed to handle large, gzipped files by partially extracting them as needed. The library may not be entirely thread-safe yet. Warcat is provided without warranty and cannot guarantee the safety of your files, so remember to make backups and test them. A WARC file contains one or more records concatenated together. Each record contains named fields, a newline, a content block, a newline, and a final newline. A content block may be one of two types: {binary data} or {named fields, newline, and binary data}.
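The record layout described above can be illustrated with a small standard-library sketch. This is not Warcat's API: `make_record` and `read_records` are hypothetical helpers written for this example, the records are simplified (real WARC records carry more mandatory fields), and the input is assumed to be uncompressed with no folded header lines.

```python
import io


def make_record(record_type, payload):
    """Build one simplified record: named fields, newline, block, two newlines."""
    header = (
        "WARC/1.0\r\n"
        f"WARC-Type: {record_type}\r\n"
        f"Content-Length: {len(payload)}\r\n"
        "\r\n"
    ).encode("utf-8")
    return header + payload + b"\r\n\r\n"


def read_records(stream):
    """Yield (version, fields, block) for each record in the stream."""
    while True:
        version = stream.readline()
        if not version:
            return  # end of file
        if not version.strip():
            continue  # blank lines that close the previous record
        fields = {}
        while True:
            line = stream.readline()
            if line in (b"\r\n", b"\n", b""):
                break  # the empty line ends the named fields
            name, _, value = line.decode("utf-8").partition(":")
            fields[name.strip()] = value.strip()
        # Content-Length tells us how many bytes the content block holds.
        block = stream.read(int(fields["Content-Length"]))
        yield version.strip().decode("ascii"), fields, block


# Two records concatenated together, as in a WARC file.
sample = make_record("warcinfo", b"software: example\r\n") + make_record(
    "resource", b"hello"
)
records = list(read_records(io.BytesIO(sample)))
for version, fields, block in records:
    print(version, fields["WARC-Type"], len(block))
```

The first sample record shows a content block that itself holds named fields, while the second holds plain binary data.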
|
||||
Platforms Supported
Windows
Mac
Linux
Cloud
On-Premises
iPhone
iPad
Android
Chromebook
|
Platforms Supported
Windows
Mac
Linux
Cloud
On-Premises
iPhone
iPad
Android
Chromebook
|
Platforms Supported
Windows
Mac
Linux
Cloud
On-Premises
iPhone
iPad
Android
Chromebook
|
||||
Audience
Anyone searching for a library to extract data from HTML and XML using XPath and CSS selectors
|
Audience
Any user in need of a solution to create new documents as well as make changes to existing ones
|
Audience
Individuals requiring both a tool and a library solution for handling web ARChive files
|
||||
Support
Phone Support
24/7 Live Support
Online
|
Support
Phone Support
24/7 Live Support
Online
|
Support
Phone Support
24/7 Live Support
Online
|
||||
API
Offers API
|
API
Offers API
|
API
Offers API
|
||||
Pricing
Free
Free Version
Free Trial
|
Pricing
Free
Free Version
Free Trial
|
Pricing
Free
Free Version
Free Trial
|
||||
Training
Documentation
Webinars
Live Online
In Person
|
Training
Documentation
Webinars
Live Online
In Person
|
Training
Documentation
Webinars
Live Online
In Person
|
||||
Company Information
Python Software Foundation
United States
pypi.org/project/parsel/
|
Company Information
python-docx
python-docx.readthedocs.io/en/latest/
|
Company Information
Python Software Foundation
United States
pypi.org/project/Warcat/
|