absorb.md

DDPO

Unknown speaker
paper · 2026-04-27

Our core technical contribution is the DDPO (Diversity Driven Policy Optimization) algorithm, a multi-turn GRPO-based approach designed to preserve dialogue diversity while holistically optimizing dialogue quality.

LLM-Driven Dialogue System for K-12 English Language Learning