vLLM/Recipes

inclusionAI/Ling-2.6-flash

Ling-2.6-flash (BailingMoeV2_5): an instruct model with 104B total / 7.4B active parameters, hybrid linear + MLA attention, and a 128K context window, optimized for agent workloads.

moe · 104B / 7.4B · 131,072 ctx · vLLM nightly+ · text
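Since the card notes the model requires a recent vLLM nightly build, a minimal launch sketch might look like the following. The parallelism degree is an assumption for illustration (a 104B-parameter MoE will not fit on a single typical GPU); only the model ID and the 131,072-token context length come from the card itself.

```shell
# Sketch, not a verified recipe. Assumes a vLLM nightly build with
# BailingMoeV2_5 support and 4 GPUs (tensor-parallel degree is illustrative).
vllm serve inclusionAI/Ling-2.6-flash \
  --tensor-parallel-size 4 \
  --max-model-len 131072 \
  --trust-remote-code
```

`--max-model-len` matches the advertised 131,072-token context; lowering it reduces KV-cache memory pressure if the full window is not needed.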