There's an easy way to make the automatic test app more accurate - make it run longer.
To do this, all you need to do is edit the file work_unit.sah in the benchmark directory. Open it in your favourite text editor and find both occurrences of chirp_limit (they're right next to each other). Change those values (the current ones are ridiculously low; you can try 1/2.5), save the file, and rerun the benchmark app.
This should make for more accurate testing, but it will definitely take a bit longer.
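If you'd rather script the edit than do it by hand, here is a rough Python sketch. It rests on a few assumptions that are mine, not something I've checked against every work unit: that the two values sit in elements written as <chirp_limit>value</chirp_limit>, and that "1/2.5" means 1 for the first occurrence and 2.5 for the second. The function names set_chirp_limits and patch_work_unit are made up for this example.

```python
import re

# Assumption: each value appears as <chirp_limit>VALUE</chirp_limit>.
CHIRP_LIMIT = re.compile(r"(<chirp_limit>)[^<]*(</chirp_limit>)")


def set_chirp_limits(text, new_values):
    """Replace successive chirp_limit values in text with new_values, in order.

    Occurrences beyond the supplied values are left unchanged.
    """
    values = iter(new_values)

    def swap(match):
        try:
            value = next(values)
        except StopIteration:
            return match.group(0)  # no more new values: keep the original
        return f"{match.group(1)}{value}{match.group(2)}"

    return CHIRP_LIMIT.sub(swap, text)


def patch_work_unit(path, new_values):
    """Rewrite the file at path in place with the new chirp_limit values."""
    # latin-1 round-trips every byte and newline="" keeps line endings,
    # so the rest of the file is written back untouched.
    with open(path, "r", encoding="latin-1", newline="") as handle:
        text = handle.read()
    with open(path, "w", encoding="latin-1", newline="") as handle:
        handle.write(set_chirp_limits(text, new_values))
```

From the benchmark directory you would call patch_work_unit("work_unit.sah", [1, 2.5]) and then rerun the benchmark app. Keep a copy of the original file first so you can go back to the short run.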
Regards,
Simon.