The Pipeline
1. Parse & Extract
Plomo extracts text from your file, whether it’s a PDF, Word document, Excel spreadsheet, or PowerPoint. The parsing layer handles 13+ file formats using specialized processors for each type, preserving document structure like headings, tables, and sections.
2. Multimodal Classification
The extracted content is sent to a multimodal LLM along with your deal’s taxonomy. Plomo uses a chain-of-thought reasoning approach: the AI reads the document, reasons about each category step by step, and picks the best match with evidence. For PDFs, Plomo also uses computer vision to analyze actual pages as images. This multimodal approach catches information in tables, charts, letterheads, and formatting that text extraction alone would miss.
3. Ensemble Voting
A single AI call can sometimes make mistakes. Plomo uses an ensemble voting system with multiple classification rounds to pick the most reliable answer:
- Multiple inference rounds with diverse parameters ensure no single outlier drives the result
- An anomaly detection layer filters out suspicious or inconsistent responses
- The final answer is selected by consensus voting
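
The voting steps above can be sketched in a few lines of Python. This is a minimal illustration, not Plomo’s implementation: the function name `consensus_vote`, the `(category, confidence)` input shape, and the `min_confidence` filter standing in for the anomaly detection layer are all assumptions made for the example.

```python
from collections import Counter


def consensus_vote(rounds, min_confidence=0.2):
    """Pick a category by majority vote over several inference rounds.

    rounds: list of (category, confidence) pairs, one per round (assumed shape).
    Returns (winning_category, agreement), where agreement is the share of
    surviving rounds that voted for the winner.
    """
    # Stand-in anomaly filter (hypothetical): drop empty labels and rounds
    # whose self-reported confidence is implausibly low.
    kept = [(cat, conf) for cat, conf in rounds if cat and conf >= min_confidence]
    if not kept:
        return None, 0.0

    # Count votes; on a tie, Counter keeps the category that appeared first.
    counts = Counter(cat for cat, _ in kept)
    winner, votes = counts.most_common(1)[0]
    return winner, votes / len(kept)


rounds = [("NDA", 0.9), ("NDA", 0.8), ("Lease", 0.7), ("", 0.9), ("NDA", 0.1)]
winner, agreement = consensus_vote(rounds)
```

In this example two of the five rounds are filtered out, and the remaining three vote two to one for `"NDA"`. A low agreement value is a natural signal to send a document to review rather than accept it automatically.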
4. Evidence Grounding
Every classification must cite the exact text from the document that supports it. Plomo locates the cited text in the source, matching it to a specific page and position. If the evidence cannot be traced back, the confidence score drops and the result is flagged.
5. Confidence Scoring & Triage
Plomo does not rely on raw model confidence. Instead, it calibrates scores by analyzing the distribution across the model’s top alternatives. When the margin between the leading candidates is narrow, the score is penalized. This calibrated score drives triage:
| Confidence | What Happens |
|---|---|
| 85% or higher | Automatically accepted — no review needed |
| 60–84% | Flagged for your review |
| Below 60% | Marked as uncategorized — needs manual classification |
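
As a rough sketch of how a margin penalty and the thresholds in the table could fit together, consider the Python below. The thresholds come from the table; everything else (the function names, the linear penalty, and the `margin_floor` parameter) is an assumption for illustration, since the exact calibration formula is not described here.

```python
def calibrated_confidence(scores, margin_floor=0.2):
    """Turn raw per-category scores into a margin-penalized confidence.

    scores: dict mapping category -> non-negative raw score (assumed shape).
    margin_floor: hypothetical margin below which the score is scaled down.
    """
    total = sum(scores.values())
    if total <= 0:
        return None, 0.0

    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    top_category, top = ranked[0]
    runner_up = ranked[1][1] if len(ranked) > 1 else 0.0

    share = top / total                 # winner's share of the total score
    margin = (top - runner_up) / total  # gap between the two leading candidates

    # Assumed penalty: scale linearly when the margin is under margin_floor.
    penalty = min(1.0, margin / margin_floor)
    return top_category, share * penalty


def triage(confidence):
    """Map a calibrated confidence in [0, 1] to the actions in the table."""
    if confidence >= 0.85:
        return "auto_accept"
    if confidence >= 0.60:
        return "needs_review"
    return "uncategorized"


category, confidence = calibrated_confidence({"NDA": 90, "Lease": 5, "Other": 5})
action = triage(confidence)
```

With a clear winner (`NDA` at 90 of 100) the margin is wide, no penalty applies, and the result is accepted automatically. With scores of 50 and 45 for the top two categories, the narrow margin pulls the confidence well below 60% and the document is left uncategorized.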
6. Pipeline Optimization (Opt-In)
If you choose to provide labeled examples from past deals, Plomo’s evolutionary optimizer can improve classification accuracy for your specific document types:
- The optimizer evolves prompt instructions through a reflection loop
- It evaluates candidates against your labeled data and selects the best performers
- Top-performing prompts are saved as an ensemble for higher accuracy
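
One plausible shape for such a loop is sketched below in Python. It is a simplified illustration under stated assumptions, not Plomo’s optimizer: `evolve_prompts`, `classify`, and `reflect` are hypothetical names, and in practice the last two would be backed by LLM calls rather than plain functions.

```python
def evolve_prompts(seed_prompts, examples, classify, reflect, generations=3, keep=2):
    """Evolve prompt instructions against labeled examples.

    seed_prompts: list of starting prompt strings.
    examples: list of (document_text, expected_category) pairs.
    classify(prompt, document_text) -> category   (hypothetical callable)
    reflect(prompt, failures) -> revised prompt   (hypothetical callable)
    Returns the `keep` best prompts found, best first.
    """
    def accuracy(prompt):
        hits = sum(classify(prompt, doc) == label for doc, label in examples)
        return hits / len(examples)

    population = list(seed_prompts)
    for _ in range(generations):
        # Selection: keep the best-scoring prompts as parents.
        parents = sorted(population, key=accuracy, reverse=True)[:keep]

        # Reflection: revise each parent using the examples it got wrong.
        children = []
        for prompt in parents:
            failures = [(doc, label) for doc, label in examples
                        if classify(prompt, doc) != label]
            if failures:
                children.append(reflect(prompt, failures))

        # Next generation: parents plus children, de-duplicated in order.
        population = list(dict.fromkeys(parents + children))

    # The surviving top performers form the saved ensemble.
    return sorted(population, key=accuracy, reverse=True)[:keep]
```

Keeping the parents in each generation means a revised prompt only displaces an existing one if it scores better on the labeled data, so accuracy on those examples never gets worse from one generation to the next.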
Plomo does not collect or use your documents for training. Optimization only runs when you explicitly provide labeled examples. We’re planning to add an opt-in option for contributing anonymized feedback to improve the platform — this will always be your choice.
Supported File Formats
Plomo handles the most common document types used in due diligence: PDF · DOCX · DOC · XLSX · XLS · PPTX · CSV · TXT and more
Ready to try it?
Set up Plomo and classify your first documents.