Bunn (@BunnHack) posted in "Qwen 3.5 Is Coming: New PR Reveals Linear Attention and MoE Details"

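Qwen appears ready to release its next-generation model. Recent activity in the Hugging Face Transformers repository shows that a PR titled "Adding Support for Qwen3.5" has been submitted, paving the way for the upcoming Qwen 3.5 model series.
The most notable technical detail in this update is a new class named Qwen3_5DynamicCache. According to its code comment, the cache is designed as: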

“A dynamic cache that can handle both the attention cache (which has a seq_len dimension) and the linear attention cache (which has a constant shape regardless of seq_len).” 
(In other words, a single dynamic cache that can handle both the attention cache, which has a sequence-length dimension...
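
To make the distinction in that comment concrete, here is a minimal pure-Python sketch of a cache holding both kinds of state. It is not the Qwen3_5DynamicCache code from the PR: the class `HybridCacheSketch`, its methods, and the helper `run_demo` are hypothetical names, and the linear-attention update shown (a running sum of outer products k v^T) is the simplest textbook recurrence, assumed here only for illustration. The actual update rule in Qwen 3.5 may well differ.

```python
class HybridCacheSketch:
    """Toy cache (hypothetical, not the PR's class) with two kinds of state."""

    def __init__(self, head_dim):
        self.head_dim = head_dim
        # Attention cache: one key/value vector per token, so it grows with seq_len.
        self.keys = []
        self.values = []
        # Linear-attention cache: a fixed head_dim x head_dim state matrix.
        self.state = [[0.0] * head_dim for _ in range(head_dim)]

    def update_attention(self, k, v):
        # Append the new token's key and value; the cache gets one row longer.
        self.keys.append(list(k))
        self.values.append(list(v))

    def update_linear(self, k, v):
        # Assumed update: state += outer(k, v). The shape never changes.
        for i in range(self.head_dim):
            for j in range(self.head_dim):
                self.state[i][j] += k[i] * v[j]

    def shapes(self):
        attention_shape = (len(self.keys), self.head_dim)
        linear_shape = (len(self.state), len(self.state[0]))
        return attention_shape, linear_shape


def run_demo(num_tokens, head_dim=4):
    """Feed num_tokens dummy tokens and report both cache shapes."""
    cache = HybridCacheSketch(head_dim)
    for t in range(num_tokens):
        k = [float(t + 1)] * head_dim
        v = [1.0] * head_dim
        cache.update_attention(k, v)
        cache.update_linear(k, v)
    return cache.shapes()


if __name__ == "__main__":
    for n in (3, 10):
        print(n, run_demo(n))
```

Running `run_demo` with different token counts shows the first shape growing with the sequence while the second stays at `(head_dim, head_dim)`, which is the property the quoted comment describes.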