朱百万oOZZXX (@zhubaiwan-oozzxx) posted in "Qwen released Qwen2.5-Omni-7B"

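Qwen2.5-Omni is an end-to-end multimodal large language model designed to perceive multiple modalities, including text, images, audio, and video, while generating text and natural speech responses in a streaming manner.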

Official blog post: Qwen2.5 Omni: See, Hear, Talk, Write, Do It All! | Qwen
GitHub: QwenLM/Qwen2.5-Omni: Qwen2.5-Omni is an end-to-end multimodal model by Qwen team at Alibaba Cloud, capable of understanding text, audio, vision, video, and performing real-time speech generation.
Hugging Face: Qwen2.5-Omni - a Qwen Collectio...