Newxp.PreProcessing
(newxp v0.1.0)
Summary
Functions
Get configured html2text options.
Process content for general applications.
Convert HTML to plain text for summarization.
Functions
Get configured html2text options.
Process content for general applications.
This includes:
- Core HTML cleaning (figures, tables, read-more)
- Conversion to plain text (preserving most of the HTML structure)
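
The two steps listed above can be illustrated with a minimal sketch. It is written in Python with only the standard library and is not the module's actual implementation: the names `PlainTextExtractor`, `html_to_text`, `DROPPED_TAGS`, `DROPPED_CLASSES`, `BLOCK_TAGS` and `VOID_TAGS` are hypothetical, the `read-more` class name is an assumed way of marking read-more links, and the input is assumed to be well-formed HTML. Only block-level line breaks are kept here, which is a much cruder notion of "structure" than html2text offers.

```python
from html.parser import HTMLParser

# Hypothetical choices for this sketch: which elements count as noise.
DROPPED_TAGS = {"figure", "table"}
DROPPED_CLASSES = {"read-more"}  # assumed class name for read-more links
BLOCK_TAGS = {"p", "div", "li", "h1", "h2", "h3", "h4", "h5", "h6"}
VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}  # no closing tag


class PlainTextExtractor(HTMLParser):
    """Collects text while skipping dropped elements and their children."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self.skip_depth = 0  # > 0 while inside a dropped element

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            # Void elements never nest; only <br> contributes a line break.
            if tag == "br" and self.skip_depth == 0:
                self.parts.append("\n")
            return
        classes = set((dict(attrs).get("class") or "").split())
        if self.skip_depth or tag in DROPPED_TAGS or classes & DROPPED_CLASSES:
            # Step 1, cleaning: enter (or go deeper into) a dropped subtree.
            self.skip_depth += 1
        elif tag in BLOCK_TAGS:
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.skip_depth:
            self.skip_depth -= 1
        elif tag in BLOCK_TAGS:
            self.parts.append("\n")

    def handle_data(self, data):
        if self.skip_depth == 0:
            self.parts.append(data)


def html_to_text(html: str) -> str:
    """Step 2, conversion: one line per block element, whitespace collapsed."""
    parser = PlainTextExtractor()
    parser.feed(html)
    parser.close()
    lines = (" ".join(line.split()) for line in "".join(parser.parts).splitlines())
    return "\n".join(line for line in lines if line)
```

Calling `html_to_text` on a fragment that mixes paragraphs with a `<figure>`, a `<table>` and a `read-more` paragraph should return only the paragraph and heading text, one block per line. Tracking a skip depth instead of matching tag names keeps nested content inside a dropped element (such as a `<figcaption>`) out of the result, at the cost of relying on properly closed tags.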
Convert HTML to plain text for summarization.