Legend: R = GSM8K, G = OlyBench, B = Countdown; colored planes = base levels; ★ = pretrained model.

Accuracy landscape in weight space. Task accuracy of models perturbed around pretrained weights (★), shown as 2D heatmaps (RGB = tasks) and interactive 3D surfaces.
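
As a rough illustration of how such an RGB heatmap can be assembled, the sketch below evaluates three accuracy functions on a 2D grid of perturbations around a starting point and packs them into the red, green, and blue channels. Everything here is an assumption made for illustration: the two perturbation directions `d1` and `d2` are random unit vectors, the grid radius and resolution are arbitrary, and the toy Gaussian "tasks" stand in for real benchmark evaluations. The figure's actual perturbation scheme is not specified in the caption.

```python
import numpy as np

def landscape_rgb(theta0, d1, d2, task_fns, radius=1.0, steps=5):
    """Evaluate three task accuracies on a 2D grid around theta0.

    Returns the grid coordinates and a (steps, steps, 3) array whose
    channels are the accuracies of task_fns[0..2] (R, G, B).
    """
    alphas = np.linspace(-radius, radius, steps)
    rgb = np.zeros((steps, steps, 3))
    for i, a in enumerate(alphas):
        for j, b in enumerate(alphas):
            theta = theta0 + a * d1 + b * d2  # perturbed weights
            for c, fn in enumerate(task_fns):
                rgb[i, j, c] = fn(theta)
    return alphas, rgb

# Hypothetical stand-ins: an 8-dimensional "model" and two random unit directions.
rng = np.random.default_rng(0)
theta0 = np.zeros(8)  # plays the role of the pretrained weights (the star)
d1, d2 = rng.standard_normal((2, 8))
d1 = d1 / np.linalg.norm(d1)
d2 = d2 / np.linalg.norm(d2)

def make_toy_task(center, width):
    # Toy "accuracy" in [0, 1]: a Gaussian bump centered at `center`.
    return lambda theta: float(np.exp(-np.sum((theta - center) ** 2) / width))

tasks = [
    make_toy_task(theta0, 1.0),             # R channel
    make_toy_task(theta0 + 0.3 * d1, 0.5),  # G channel
    make_toy_task(theta0 + 0.3 * d2, 0.25), # B channel
]

alphas, rgb = landscape_rgb(theta0, d1, d2, tasks)
```

The resulting `rgb` array can be passed directly to an image-plotting routine such as matplotlib's `imshow`, since every value lies in [0, 1]. In a real setting each entry of `task_fns` would load the perturbed weights into the model and run the corresponding benchmark, which is far more expensive than the toy functions used here.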