Tagged Content
Everything on the platform tagged with document-parsing.
Unstructured is a San Francisco company building the data layer for generative AI. Its open-source library and enterprise platform ingest PDFs, slide decks, emails, images and 70+ other file types, then transform them into clean, structured data that LLMs and RAG pipelines can actually use.
LlamaIndex is a San Francisco company building the data framework and cloud platform that lets enterprises turn messy unstructured documents into knowledge agents powered by large language models. Its open-source library is one of the most-used scaffolds for retrieval-augmented generation, and its hosted product, LlamaCloud, packages parsing, extraction, and indexing for production teams.
Jerry Liu is the co-founder and CEO of LlamaIndex, the open-source data framework that became essential infrastructure for connecting large language models to enterprise data. What started in November 2022 as a weekend side project, an indexing tool he built to feed his own data into GPT-3, grew into a company with $46.5M in total funding, 600,000+ monthly downloads, and clients including Salesforce, KPMG, and Carlyle. A Princeton computer science graduate who published GAN research as an undergraduate, Liu moved from Quora's feed ranking team to Uber's autonomous vehicle research labs before co-founding LlamaIndex with former Uber colleague Simon Suo in early 2023.