Conf42 Large Language Models (LLMs) 2025 - Online

- premiere 5PM GMT

Use LLMs and OCR to Extract Data from Images

Abstract

Organizations increasingly use images as data sources, but extracting structured information from them remains difficult. Optical Character Recognition (OCR) has been used for many years to extract text from images and photos, but it does not take into account the context, structure, and relationships of the extracted data. Combining OCR with Large Language Models (LLMs) lets businesses go beyond text extraction: the extracted words can be understood, categorized, organized, and placed in context.

In this talk, I will explain how this combination of technologies can turn unstructured image data into useful information and actionable insights. We will walk through the shortcomings of conventional OCR, how it continues to improve, and the fundamentals of extraction from low-quality images, handwriting, and multilingual text. The session will also discuss how LLMs can support OCR by assigning meaning to identifiers and entities and by defining the relations between them that help the primary processing.
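
As a rough illustration of the LLM step described above, here is a minimal Python sketch. It is not the speaker's implementation: the field names (`vendor`, `date`, `total`), the prompt wording, and the helpers `build_prompt`, `extract_fields`, and `fake_llm` are all hypothetical. It assumes the OCR engine has already produced plain text, and it takes the model call as a plain function so that any LLM client could be plugged in.

```python
import json
from typing import Callable

# Hypothetical field names, chosen only for illustration.
FIELDS = ["vendor", "date", "total"]


def build_prompt(ocr_text: str, fields: list[str]) -> str:
    """Ask the model to map raw OCR text onto named fields as JSON."""
    return (
        "Extract the following fields from the OCR text below. "
        "Reply with a single JSON object using exactly these keys: "
        f"{', '.join(fields)}. Use null for anything you cannot find.\n\n"
        f"OCR text:\n{ocr_text}"
    )


def extract_fields(
    ocr_text: str,
    ask_llm: Callable[[str], str],
    fields: list[str] = FIELDS,
) -> dict:
    """Run the LLM step and keep only the requested keys."""
    reply = ask_llm(build_prompt(ocr_text, fields))
    # Models often wrap JSON in extra prose, so take the outermost braces.
    start, end = reply.find("{"), reply.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object in model reply")
    data = json.loads(reply[start : end + 1])
    return {name: data.get(name) for name in fields}


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model call (assumption); returns a canned reply."""
    return (
        'Here is the result: {"vendor": "ACME", "date": "2025-01-31", '
        '"total": "41.50", "extra": 1}'
    )


if __name__ == "__main__":
    raw = "ACME\n2025-01-31\nTOTAL 41.50"  # what an OCR engine might return
    print(extract_fields(raw, fake_llm))
```

In practice `fake_llm` would be replaced by a call to a real model, and `raw` by the output of an OCR engine. Keeping both behind simple function boundaries makes the structuring step easy to test on its own.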

...

Vladimir Pesterev

Software Engineer @ WhalesCorp



