@fengchris posted in "Microsoft's reasoning model upgraded again: Phi-4-reasoning-plus"

Some benchmark numbers:





Model                     AIME 24   AIME 25   OmniMath   GPQA-D   LiveCodeBench (8/1/24–2/1/25)
Phi-4-reasoning           75.3      62.9      76.6       65.8     53.8
Phi-4-reasoning-plus      81.3      78.0      81.9       68.9     53.1
OpenThinker2-32B          58.0      58.0      -          64.1     -
QwQ 32B                   79.5      65.8      -          59.5     63.4
EXAONE-Deep-32B           72.1      65.8      -          66.1     59.5
DeepSeek-R1-Distill-70B   69.3      51.5      63.4       66.2     57.5
DeepSeek-R1               78.7      70.4      85.0       73.0     62.8
o1-mini                   63.6      54.8      -          60.0     53.8
...