It appears that asking for direct answers generally decreased performance across all non-reasoning models.