Chronological feed of everything captured from LlamaIndex.
tweet / @llama_index / 4d ago
LlamaIndex and LanceDB developed a structure-aware PDF QA pipeline that significantly improves agentic search. This pipeline addresses the challenge of processing visually rich documents by integrating multimodal data storage and retrieval. The combined approach of robust parsing with LiteParse and multimodal storage in LanceDB enables agents to achieve high accuracy in complex reasoning tasks involving PDFs.
multimodal-ai pdf-processing llm-agents information-retrieval rag-pipelines lancedb liteparse
“Visually rich documents pose a significant challenge for traditional document processing pipelines and AI agents.”
tweet / @llama_index / 4d ago
LlamaIndex is hosting an in-person workshop in NYC on May 13th for fintech leaders. The workshop will focus on practical applications of agentic OCR to transform complex financial documents into LLM-ready data, including insights from a top-tier PE firm's production agent. Attendees are expected to bring their own laptops to build real pipelines.
fintech llms ocr agentic-ai data-pipelines workshops
“LlamaIndex is hosting a workshop for fintech leaders in NYC on May 13th.”
tweet / @llama_index / 8d ago
LlamaIndex hosted a community gathering at their new San Francisco office, attracting over 100 developers. The event served as a networking session for AI builders, coinciding with local festivities in the city.
new-office community-event ai-builders llamaindex-x-feed networking
“LlamaIndex has opened a new office in San Francisco.”
tweet / @llama_index / 9d ago
LlamaIndex has launched Extract v2, a significant upgrade to its document extraction tool. This new version offers simplified operation through intuitive tiers, pre-saved extraction configurations for efficiency, and configurable document parsing for greater control and improved results. Extract v1 will remain available for a limited transition period.
document-extraction llm-data-processing data-pipelines platform-updates llamaindex
“Extract v2 features simplified, intuitive tiers, replacing previous modes.”
tweet / @llama_index / 10d ago
LlamaIndex is sponsoring Stanford FutureLaw Week 2026, an event focused on the intersection of AI and law, featuring bootcamps, hackathons, and a conference. This initiative aims to train future legal professionals in AI. However, a significant need remains for AI legal tools supporting commercial teams in small to mid-sized companies that lack dedicated legal support.
legal-ai ai-applications ai-bootcamps legal-tech stanford future-law
“LlamaIndex is sponsoring Stanford FutureLaw Week 2026.”
tweet / @llama_index / 11d ago
LlamaIndex has been named to the 2026 Enterprise Tech 30, ranking #3 in the Early Stage category. The list is based on votes from over 90 leading investors and corporate development leaders.
llamaindex-recognition enterprise-tech early-stage-companies wing-vc industry-awards startup-ecosystem
“LlamaIndex was recognized in the 2026 Enterprise Tech 30.”
tweet / @llama_index / 12d ago
LlamaIndex is hosting an office warming party on April 2nd at their new "AI Waterfront" location on 2nd Street. The event will offer networking opportunities, food, and drinks. Due to limited space, early RSVP is encouraged.
new-office networking-event ai-community llama-index miami-events
“LlamaIndex has moved to a new office location.”
tweet / @llama_index / 12d ago
LiteSearch serves as a reference implementation for high-performance, fully local document ingestion and retrieval. The stack integrates LiteParse for parsing, Chonkie for chunking, and a Rust-based Qdrant edge shard for vectorized storage, executed via the Bun runtime.
open-source local-ai retrieval-augmented-generation developer-tools document-parsing
“LiteSearch is a fully local document ingestion and retrieval CLI/TUI application.”
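The parse, chunk, store, and retrieve flow described in this item can be illustrated with a minimal sketch. It does not use LiteParse, Chonkie, Qdrant, or Bun: every function and class below (`chunk`, `embed`, `LocalIndex`) is a hypothetical pure-Python stand-in, and the bag-of-words vector is a toy substitute for a real embedding model.

```python
import math
import re
from collections import Counter


def chunk(text, max_words=8):
    """Split text into fixed-size word windows (stand-in for a real chunker)."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def embed(text):
    """Toy bag-of-words vector (stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class LocalIndex:
    """In-memory vector store (stand-in for a local vector database)."""

    def __init__(self):
        self.items = []  # list of (vector, chunk text) pairs

    def ingest(self, text):
        # Chunk the already-parsed text and store one vector per chunk.
        for piece in chunk(text):
            self.items.append((embed(piece), piece))

    def search(self, query, k=1):
        # Rank stored chunks by similarity to the query vector.
        q = embed(query)
        ranked = sorted(self.items, key=lambda item: cosine(q, item[0]), reverse=True)
        return [piece for _, piece in ranked[:k]]


index = LocalIndex()
index.ingest("Invoices are due within thirty days of receipt. "
             "Late payments incur a two percent monthly fee.")
print(index.search("late payments fee"))
```

Everything here runs in one process with no network calls, which is the property the "fully local" description emphasizes; a real stack would swap each stand-in for the corresponding component.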
tweet / @llama_index / 15d ago
Modern OCR solutions for tables go beyond basic text recognition by reconstructing spatial relationships, preserving header hierarchies, and ensuring data integrity. This deep dive explains the three core phases of table extraction: detection, structure recognition, and data extraction with validation. The applications are wide-ranging, from financial services to healthcare, enabling the conversion of complex tabular data into structured formats like JSON for seamless integration.
document-processing intelligent-table-extraction ocr llama-parse data-extraction ai-applications
“Modern OCR for tables reconstructs spatial relationships and preserves header hierarchies.”
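As a rough illustration of the last phase named above (data extraction with validation), the sketch below assumes detection and structure recognition have already produced a grid of cell strings. It is not LlamaParse's API or method: the function names, the dotted header paths, and the rule that an empty header cell inherits the label to its left are all assumptions made for the example.

```python
import json


def flatten_headers(header_rows):
    """Merge stacked header rows into one dotted path per column.

    Assumption: an empty header cell inherits the label to its left,
    which is how a spanning (merged) header cell often appears.
    """
    filled = []
    for row in header_rows:
        last = ""
        out = []
        for cell in row:
            last = cell.strip() or last
            out.append(last)
        filled.append(out)
    # Join the levels column by column, skipping repeated labels.
    paths = []
    for column in zip(*filled):
        parts = []
        for label in column:
            if label and (not parts or parts[-1] != label):
                parts.append(label)
        paths.append(".".join(parts))
    return paths


def extract_records(grid, n_header_rows):
    """Turn a recognized cell grid into records, reporting malformed rows."""
    headers = flatten_headers(grid[:n_header_rows])
    records, errors = [], []
    for i, row in enumerate(grid[n_header_rows:], start=n_header_rows):
        if len(row) != len(headers):
            # Validation: a row with the wrong cell count is reported, not guessed at.
            errors.append(f"row {i}: expected {len(headers)} cells, got {len(row)}")
            continue
        records.append(dict(zip(headers, (c.strip() for c in row))))
    return records, errors


grid = [
    ["Region", "Revenue", ""],   # top-level header; "Revenue" spans two columns
    ["", "Q1", "Q2"],            # second-level header
    ["North", "120", "135"],
    ["South", "98"],             # malformed row: one cell missing
]
records, errors = extract_records(grid, n_header_rows=2)
print(json.dumps(records, indent=2))
print(errors)
```

Keeping the header hierarchy in the key (`Revenue.Q1`) is one simple way to preserve it in flat JSON; nested objects would work equally well.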
tweet / @llama_index / 15d ago
Modern OCR solutions like LlamaParse address the challenges of extracting structured data from complex tables in PDFs. This technology reconstructs spatial relationships, preserves header hierarchies, and validates data integrity, going beyond basic OCR capabilities. It transforms visual table formats into usable structured data, crucial for various industry applications.
document-processing intelligent-table-extraction ocr llama-parse data-extraction pdf-processing
“Table extraction is more challenging than standard text OCR due to the importance of spatial relationships.”