Structural Rationale Distillation via Reasoning Space Compression

arXiv:2605.07139v1 · Announce Type: new

Abstract: When distilling reasoning from large language models (LLMs) into smaller ones, teacher rationales for similar problems often vary wildly in structure and strategy. Like a chef who makes the same dish differently each time, this inconsistency burdens the student with noisy supervision that is hard to internalize. We propose Distillation through Reasoning Path Compression (D-RPC), which constrains the teacher to follow a compact, dynamically…

cs.CL updates on arXiv.org · May 12
