黯绛 (@amlkiller) posted in "Running gemma4-12B-qat inference locally at 60 tok/s: how can I speed it up further?"

As the title says.
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 595.79                 Driver Version: 595.79         CUDA Version: 13.2     |
+-----------------------------------------+------------------------+----------------------+
| GPU  Name                  Driver-Model | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf ...