arXiv:2605.05278v1 Announce Type: new Abstract: Resource-efficient machine learning increasingly uses sparse Mixture-of-Experts (MoE) architectures, where the gate acts as both a learning component and a routing interface controlling computation, communication, and accuracy. Motivated by finite-rate interpretations of MoE gating, we treat the gate as a stochastic channel and use it to quantify the routing information available to the selected expert. To make the associated information quantities…
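The abstract does not specify how the routing information is estimated, so the following is only a minimal illustrative sketch of one common approach: treating the gate's softmax outputs as a stochastic channel p(e | x) and computing a plug-in estimate of the mutual information I(X; E) = H(E) - H(E | X) over a batch, with the marginal p(e) approximated by the batch average. The function name routing_mutual_information and the synthetic logits are assumptions for illustration, not the paper's method.

```python
import numpy as np

def routing_mutual_information(gate_probs: np.ndarray) -> float:
    """Plug-in estimate of I(X; E) for a gate viewed as a stochastic channel.

    gate_probs: (batch, num_experts) array of per-token routing
    probabilities p(e | x), e.g. the gate's softmax outputs.
    Returns H(E) - H(E | X) in nats, where p(e) is approximated
    by the batch average of gate_probs.
    """
    eps = 1e-12  # guard against log(0)
    p_marginal = gate_probs.mean(axis=0)                        # p(e)
    h_marginal = -np.sum(p_marginal * np.log(p_marginal + eps))  # H(E)
    h_conditional = -np.mean(                                    # H(E | X)
        np.sum(gate_probs * np.log(gate_probs + eps), axis=1)
    )
    return float(h_marginal - h_conditional)

# Toy usage: random gate logits for 512 tokens routed over 8 experts.
rng = np.random.default_rng(0)
logits = rng.normal(size=(512, 8))
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
print(f"estimated routing information: {routing_mutual_information(probs):.3f} nats")
```

Under this reading, a gate that routes every token identically carries zero routing information (H(E) = H(E | X)), while a gate whose per-token distributions are sharp but whose batch marginal is uniform approaches the log(num_experts) upper bound.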