Free PDF to Text Extractor
Drop a PDF and pull out its real text content, page by page. There are two layout modes: "Flow" joins everything into a single paragraph per page (best for plain prose), while "Preserve lines" groups text runs by their approximate Y-coordinate (better for tables and structured documents). Files are parsed entirely in your browser via PDF.js.
How to use
- 01 Drop a PDF file. Parsing starts immediately.
- 02 Pick a layout mode. "Flow" produces one big paragraph per page; "Preserve lines" tries to reconstruct the original line breaks.
- 03 Each page is shown as an expandable section with character count.
- 04 Use Copy all or Download .txt to save the extracted text.
FAQ
Why is my output empty?
The PDF probably contains scanned images instead of real text. PDF.js can only extract text that exists as text in the file. For scanned PDFs you need OCR (Optical Character Recognition), which this tool does not include.
Why do tables come out garbled?
PDFs do not record table structure; they record positioned text runs. Even Adobe Acrobat struggles with tables. The "Preserve lines" mode helps but is not a full table parser.
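To make the difference between the two modes concrete, here is a minimal TypeScript sketch of how positioned text runs could be turned into "Flow" and "Preserve lines" output. It is not this tool's actual code: the `TextRun` shape, the function names, and the 2-unit Y tolerance are all assumptions made for illustration.

```typescript
// Hypothetical shape for a positioned text run. With PDF.js, x and y would
// typically be read from a text item's transform matrix (an assumption here).
interface TextRun {
  str: string;
  x: number;
  y: number;
}

// "Flow": join every non-empty run on the page into one paragraph.
function flowLayout(runs: TextRun[]): string {
  return runs
    .map((r) => r.str.trim())
    .filter((s) => s.length > 0)
    .join(" ");
}

// "Preserve lines": runs whose Y values differ by at most `tolerance`
// are treated as belonging to the same line.
function preserveLines(runs: TextRun[], tolerance = 2): string {
  const lines: { y: number; runs: TextRun[] }[] = [];
  for (const run of runs) {
    if (run.str.trim().length === 0) continue;
    const line = lines.find((l) => Math.abs(l.y - run.y) <= tolerance);
    if (line) {
      line.runs.push(run);
    } else {
      lines.push({ y: run.y, runs: [run] });
    }
  }
  // PDF coordinates grow upward, so a larger Y is closer to the top of the page.
  lines.sort((a, b) => b.y - a.y);
  return lines
    .map((l) =>
      [...l.runs]
        .sort((a, b) => a.x - b.x) // left to right within a line
        .map((r) => r.str.trim())
        .join(" ")
    )
    .join("\n");
}
```

Grouping by Y recovers rows, but the columns are still just runs joined with spaces, which is why this approach helps with tables without truly parsing them.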
Will it work on large PDFs?
Yes, up to roughly 200-300 pages or about 50 MB before browser memory becomes a bottleneck. Parsing happens page by page so you see progress incrementally.
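Page-by-page parsing can be sketched roughly as follows. The interfaces are minimal stand-ins modelled on the document and page objects PDF.js exposes (`numPages`, `getPage`, `getTextContent`); the `extractPages` function, the `PageText` type, and the `onPage` callback are hypothetical names, and this is an illustration under those assumptions, not the tool's implementation.

```typescript
// Minimal structural types modelled on PDF.js objects (assumed shapes).
interface TextItemLike {
  str?: string;
}
interface PageLike {
  getTextContent(): Promise<{ items: TextItemLike[] }>;
}
interface DocLike {
  numPages: number;
  getPage(pageNumber: number): Promise<PageLike>;
}

// Hypothetical result record for one page.
interface PageText {
  page: number;
  text: string;
  chars: number;
}

// Extract text one page at a time and report each page as soon as it is
// ready, so the UI can show progress instead of waiting for the whole file.
async function extractPages(
  doc: DocLike,
  onPage: (result: PageText) => void
): Promise<void> {
  // PDF page numbers start at 1.
  for (let n = 1; n <= doc.numPages; n++) {
    const page = await doc.getPage(n);
    const content = await page.getTextContent();
    const text = content.items
      .map((item) => item.str ?? "")
      .join(" ")
      .trim();
    onPage({ page: n, text, chars: text.length });
  }
}
```

With the real library, `doc` would presumably be the result of `pdfjsLib.getDocument({ data }).promise`. A page that reports `chars: 0` is a hint that it contains only scanned images.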
Is the file uploaded?
No. The PDF is read into your browser's memory and parsed by PDF.js. The library's worker script is loaded from a CDN once; after that, all parsing is local. You can verify this in the Network tab of your browser's DevTools.
Can it extract images or annotations?
No, this tool only extracts text. To render pages as images, use the PDF to Images tool.