Unstructured Core Library

The unstructured library is designed to help preprocess and structure unstructured text documents for use in downstream machine learning tasks. Examples of documents that can be processed using the unstructured library include PDFs, XML and HTML documents.

Library Documentation

Installation

Instructions on how to install the unstructured library on your system.

Getting Started

Check out this section to learn about basic workflows in unstructured.

Bricks

Learn more about partitioning, cleaning, and staging bricks, including advanced usage patterns.

Examples

Examples of other types of workflows within the unstructured package.

Integrations

Connect the output of unstructured with other popular ML services.