MTP: Multi-Token Prediction
The Super and Ultra models incorporate MTP layers, which make long-form text generation more efficient and improve model quality.
Key Benefits
- Predicts multiple future tokens simultaneously
- Enhances long-context coherence and consistency
- Improves generation quality for extended sequences
- Reduces autoregressive generation latency
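
To make the latency benefit concrete, here is a toy sketch of one common way multi-token predictions are used at inference time: the extra predictions serve as a draft, and the main model keeps only the part of the draft it agrees with. This page does not say that Super or Ultra decode this way, so treat the scheme as an assumption. `next_token`, `draft_tokens`, `generate`, and `greedy` are hypothetical stand-ins invented for illustration, not part of any real API, and the "model" is a simple arithmetic rule.

```python
from typing import List, Tuple


def next_token(context: List[int]) -> int:
    """Hypothetical stand-in for the main model: a deterministic toy rule."""
    return (sum(context) + len(context)) % 7


def draft_tokens(context: List[int], k: int) -> List[int]:
    """Hypothetical stand-in for MTP heads: guess the next k tokens at once.

    Some guesses are deliberately corrupted so that the rejection path
    in generate() is exercised.
    """
    draft: List[int] = []
    for i in range(k):
        token = next_token(context + draft)
        if (len(context) + i) % 4 == 3:
            token = (token + 1) % 7  # inject a wrong guess
        draft.append(token)
    return draft


def generate(prompt: List[int], n_new: int, k: int) -> Tuple[List[int], int]:
    """Draft k tokens per step, then keep the prefix the main model agrees with.

    Returns the generated sequence and the number of verification passes.
    In a real system one pass would score all k draft positions in a single
    forward call; here that is only simulated by counting.
    """
    tokens = list(prompt)
    passes = 0
    while len(tokens) - len(prompt) < n_new:
        draft = draft_tokens(tokens, k)
        passes += 1
        accepted: List[int] = []
        for guess in draft:
            target = next_token(tokens + accepted)
            accepted.append(target)  # the main model's token is always kept
            if guess != target:
                break  # first mismatch: discard the rest of the draft
        tokens.extend(accepted)
    return tokens[: len(prompt) + n_new], passes


def greedy(prompt: List[int], n_new: int) -> List[int]:
    """Baseline: one token per sequential step."""
    tokens = list(prompt)
    for _ in range(n_new):
        tokens.append(next_token(tokens))
    return tokens


if __name__ == "__main__":
    sequence, passes = generate([1, 2, 3], n_new=12, k=3)
    print("same output as greedy:", sequence == greedy([1, 2, 3], 12))
    print("verification passes:", passes, "vs. 12 greedy steps")
```

Because every pass keeps at least the main model's own token, the output matches plain greedy decoding; the saving comes only from needing fewer sequential passes when the drafts are right.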