Extracting structured tables from PDFs is harder than it looks. PDF files do not store tables as structured data. Instead, they position text at specific coordinates on the page. Table extraction tools must reconstruct the structure by determining which values belong in which rows and columns. The problem becomes even harder when tables include multi-level headers, merged cells, or complex layouts.

To explore this problem, I experimented with three tools designed for PDF table extraction: LlamaParse, Marker, and Docling. Each tool takes a different approach.

Performance overview:
• Docling: Fastest local option, but struggles with complex tables
• Marker: Handles complex layouts well and runs locally, but is much slower
• LlamaParse: Most accurate on complex tables and fastest overall, but requires a cloud API

In this article, I share the code, examples, and results from testing each tool.

🚀 Full article: bit.ly/40jDWVF #PDFExtraction #Python #DataEngineering
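To make the coordinate problem concrete, here is a minimal sketch of the reconstruction step. It is not how LlamaParse, Marker, or Docling work internally; it assumes a hypothetical `Word` record with toy `x`/`y` positions (y measured from the top of the page) and a hypothetical `rebuild_rows` helper that groups words into rows by vertical position, then orders each row left to right.

```python
from typing import NamedTuple


class Word(NamedTuple):
    """Hypothetical record for one piece of positioned text."""
    text: str
    x: float  # horizontal position of the word (toy units)
    y: float  # vertical position, measured from the top of the page


def rebuild_rows(words, y_tolerance=2.0):
    """Group positioned words into rows by y, then order each row by x."""
    rows = []
    for word in sorted(words, key=lambda w: (w.y, w.x)):
        # Join the current row if the word is within the tolerance of the
        # row's first word; otherwise start a new row.
        if rows and abs(word.y - rows[-1][0].y) <= y_tolerance:
            rows[-1].append(word)
        else:
            rows.append([word])
    return [[w.text for w in sorted(row, key=lambda w: w.x)] for row in rows]


# Toy input: words arrive in arbitrary order with slightly uneven y values,
# as they might after raw text extraction.
words = [
    Word("Docling", 10, 30.4), Word("Tool", 10, 10), Word("local", 80, 30),
    Word("Runs", 80, 10.5), Word("LlamaParse", 10, 50), Word("cloud", 80, 50.3),
]
table = rebuild_rows(words)
for row in table:
    print(row)
```

A fixed tolerance like this only holds up on simple grids. Merged cells and multi-level headers break the one-row-per-y assumption, which is the part the post describes as hard.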
27 Sep 2025
Built an n8n workflow 🤖 → Scrape product listings w/ PDFs, extract text, style w/ Google Gemini AI, convert to Google Docs & log links in Sheets 🚀 #n8n #Automation #PDFExtraction #GoogleGemini #AIWorkflows #GoogleDocs #GoogleSheets
19 Jan 2025
PDF Dino is #5 on Product Hunt right now! 🚀 Let's keep it going! producthunt.com/posts/pdf-di… #AItools #PDFExtraction #Productivity
Despite having limited experience with PDF tech, I managed to build a fast, WASM-compatible text extractor by integrating o1 into my Cursor workflow in just 3 hours! The result is efficient and produces code that's bespoke to the needs of my project (clarity vs speed). 🦀⚡️ #RustLang #WebAssembly #AI #PDFExtraction #Programming
7 Feb 2023
Just successfully extracted an audio attachment from a #PDF using Bytescout PDF Extractor SDK in C#! 🎶📄 Simplifying my #workflow and saving time. Let me show you how 💡 bytescout.com/blog/how-to-ex… #PDFextraction #CSharp #Bytescout 💻
Extract key data from complex #PDF #documents, see a comprehensive summary in minutes, and export in any format. Check it out: lnkd.in/gd9JyNyP #pdf #data #ai #dataextraction #dataautomation #pdfextraction #idp #ipa #rpa #documentautomation #documentprocessing #datacapture
aiMunshi, an Intelligent Financial Documents Processing tool from AIBridge ML, a pioneer in Artificial Intelligence solutions... #MachineLearning #DeepLearning #Automation #BigData #DataExtraction #ImageExtraction #PDFExtraction #InvoiceAutomation youtu.be/pkRCB36uGtM

Check out some intelligent features of 'aiMunshi', an Intelligent Financial Documents Processing tool from AIBridge ML, a pioneer in Artificial Intelligence solutions. #aiMunshi #PDFExtraction #InvoiceAutomation #DataCaptureTool #MachineLearning #DeepLearning #BigData #AI #ML
Call for Semantic Publishing Challenge @ #ESWC2016 lists.w3.org/Archives/Public… #LinkedData #SemanticWeb HTML2RDF PDFextraction Interlinking
