Microsoft has released its own document parser for LLM use!
Introducing MarkItDown, a 100% open-source, one-stop solution for converting a wide range of file types to Markdown. Perfect for text analysis, indexing, and more!
Here’s what makes it special:
↳ Converts PDF, Word, Excel, PPT, images, and audio to Markdown
↳ Extracts EXIF metadata, OCR text, and audio transcripts automatically
↳ Available via CLI, Python API, or Docker
↳ Offers LLM-based image descriptions
↳ Supports batch conversions
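
Here's a rough sketch of what a batch conversion through the Python API might look like. It assumes the `MarkItDown().convert(path).text_content` call shown in the project's README and a `pip install markitdown`; the helper names `markitdown_converter` and `convert_folder`, and the default suffix list, are made up for illustration and are not part of the library.

```python
from pathlib import Path
from typing import Callable, Iterable


def markitdown_converter() -> Callable[[Path], str]:
    """Return a path -> Markdown function backed by MarkItDown.

    The import is lazy so the rest of this sketch works without the
    package installed (assumption: `pip install markitdown`).
    """
    from markitdown import MarkItDown  # third-party, not in the stdlib

    md = MarkItDown()
    return lambda path: md.convert(str(path)).text_content


def convert_folder(
    src: Path,
    dst: Path,
    convert: Callable[[Path], str],
    suffixes: Iterable[str] = (".pdf", ".docx", ".xlsx", ".pptx"),
) -> list[Path]:
    """Convert each matching file in `src` and write `<filename>.md` into `dst`."""
    dst.mkdir(parents=True, exist_ok=True)
    wanted = {s.lower() for s in suffixes}
    written = []
    for path in sorted(src.iterdir()):
        if path.is_file() and path.suffix.lower() in wanted:
            # Keep the original extension in the name so report.pdf and
            # report.docx do not overwrite each other.
            out = dst / (path.name + ".md")
            out.write_text(convert(path), encoding="utf-8")
            written.append(out)
    return written
```

With the package installed, something like `convert_folder(Path("docs"), Path("markdown"), markitdown_converter())` should turn a folder of documents into Markdown files (the folder names here are placeholders). Passing the converter in as a function keeps the loop easy to try out with a stub before pointing it at real files.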
Link to the repo in the next tweet!
_____
Find me →
@akshay_pachaar ✔️
For more insights & tutorials on AI and Machine Learning.