May 08 2025 - premiere 5PM GMT

Beyond BLEU and ROUGE — Modern Approaches to Evaluating LLMs and AI Systems

Abstract

Traditional metrics like BLEU and ROUGE fall short of capturing advanced LLM capabilities. This talk covers modern methods that better assess AI performance, from benchmarks like MMLU and TruthfulQA to real-world evaluations and human-in-the-loop insights. Join us to rethink AI evaluation.

Conf42 Machine Learning 2025 - Online

Alok Ranjan

Engineering Manager @ Dropbox
