Unstructured Core Library
The unstructured library is designed to help preprocess and structure unstructured text documents
for use in downstream machine learning tasks. Examples of documents that can be processed
using the unstructured library include PDFs, XML, and HTML documents.
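As a rough illustration of what "structuring" a document can mean, the sketch below splits a small HTML string into typed text elements. It uses only the Python standard library and does not use or reproduce the unstructured API. The names ElementCollector and structure_html, and the category labels, are hypothetical choices made for this example.

```python
# Illustrative sketch only: standard library, not the unstructured API.
from html.parser import HTMLParser


class ElementCollector(HTMLParser):
    """Collect (category, text) pairs from a few common HTML tags."""

    # Hypothetical mapping from tag name to an element category label.
    CATEGORIES = {"h1": "Title", "h2": "Title", "p": "NarrativeText", "li": "ListItem"}

    def __init__(self):
        super().__init__()
        self.elements = []
        self._current = None  # tag currently being collected, if any
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag in self.CATEGORIES:
            self._current = tag
            self._buffer = []

    def handle_data(self, data):
        # Text outside a tracked tag is ignored.
        if self._current is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == self._current:
            # Collapse runs of whitespace so the text is ready for downstream use.
            text = " ".join("".join(self._buffer).split())
            if text:
                self.elements.append((self.CATEGORIES[tag], text))
            self._current = None


def structure_html(html):
    """Return a list of (category, text) tuples found in the HTML string."""
    collector = ElementCollector()
    collector.feed(html)
    collector.close()
    return collector.elements


sample = "<h1>Quarterly  Report</h1><p>Revenue grew\n in Q3.</p><ul><li>Item one</li></ul>"
for category, text in structure_html(sample):
    print(f"{category}: {text}")
```

The output is a flat list of labeled text chunks, which is the general shape of input that downstream machine learning tasks tend to expect. Real documents need far more handling than this sketch provides.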
Library Documentation
- Installation
Instructions on how to install the unstructured library on your system.
- Getting Started
Check out this section to learn about basic workflows in unstructured.
- Bricks
Learn more about partitioning, cleaning, and staging bricks, including advanced usage patterns.
- Examples
Examples of other types of workflows within the unstructured package.
- Integrations
We make it easy for you to connect your output with other popular ML services.