Gemini 2.0 takes the top spot!

LM Arena




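| Model | Overall | Overall w/ Style Control | Hard Prompts | Hard Prompts w/ Style Control | Coding | Math | Creative Writing | Instruction Following | Longer Query | Multi-Turn |
|---|---|---|---|---|---|---|---|---|---|---|
| gemini-exp-1206 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| chatgpt-4o-latest-20241120 | 1 | 1 | 3 | 4 | 1 | 5 | 1 | 2 | 1 | 1 |
| gemini-2.0-flash-exp | 3 | 3 | 2 | 2 | 3 | 1 | 2 | 2 | 1 | 1 |
| o1-preview | 4 | 3 | 2 | 1 | 1 | 1 | 4 | 2 | 3 | 3 |
| o1-mini | 5 | 7 | 3 | 4 | 1 | 1 | 16 | 5 | 4 | 5 |
| gemini-1.5-pro-002 | 5 | 6 | 6 | 7 | 7 | 5 | 4 | 5 | 5... | |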