Thanks for the tolerances. Being largely memory bound, the Flops tolerances are more than enough, and indicate +/- 10% variation of the worst case on that. I presume that card's driving a display, so that's reasonable.
Ahhh, hi Patrick. Looks like your card should still be able to use pinned host memory, but isn't. It indeed doesn't support mapped memory (a different kind), but it didn't engage the pinned memory improvement because I need to change how I detect that feature. I'm checking the wrong feature flags, it looks like... ooops. Will make a #7 end of week, and pay special attention to making sure that engages properly on compute capability 1.0 cards (which don't support mapped memory). Cheers for finding the problem.
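To illustrate the detection fix described above, here is a minimal sketch of the decision logic, not the actual test code (which isn't shown in this thread). It assumes the host side knows two things: whether the device reports mapped-memory support (in the CUDA runtime that is the `canMapHostMemory` field of `cudaDeviceProp`) and whether a page-locked allocation (e.g. via `cudaHostAlloc`) succeeded. The names `DeviceCaps`, `HostMemoryPlan` and `planHostMemory` are hypothetical.

```cpp
// Hypothetical summary of what the host code found out about the device.
struct DeviceCaps {
    bool canMapHostMemory;  // device reports mapped (zero-copy) host memory support
    bool pinnedAllocOk;     // a page-locked host allocation succeeded
};

// Hypothetical result: which host-memory features to engage.
struct HostMemoryPlan {
    bool usePinned;  // keep the array in page-locked host memory
    bool useMapped;  // additionally map it into the device address space
};

HostMemoryPlan planHostMemory(const DeviceCaps& caps) {
    // Pinned memory is NOT gated on the mapped-memory flag, so a card
    // without mapped-memory support can still take the pinned path.
    bool pinned = caps.pinnedAllocOk;
    // Mapped memory is a kind of page-locked memory, so it needs both.
    bool mapped = caps.pinnedAllocOk && caps.canMapHostMemory;
    return HostMemoryPlan{pinned, mapped};
}
```

With this shape, a card that lacks mapped memory but can pin host memory gets `usePinned == true` and `useMapped == false`, which is the case that wasn't engaging.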
Opt1 (PSmod3+SM): 64 thrds/block
PowerSpectrumSumMax array pinned in host memory.
64 threads, fftlen 64: (worst case: full summax copy)
1.5 GFlops 5.9 GB/s 121.7ulps
Every ifft average & peak OK
64 threads, fftlen 64: (best case, nothing to update)
1.6 GFlops 6.7 GB/s 121.7ulps
Thanks. It's what you (the test #6, anyway) didn't do. This line's missing:
Quote:
Opt1 (PSmod3+SM): 64 thrds/block
PowerSpectrumSumMax array pinned in host memory.
64 threads, fftlen 64: (worst case: full summax copy)
1.5 GFlops 5.9 GB/s 121.7ulps
Every ifft average & peak OK
64 threads, fftlen 64: (best case, nothing to update)
1.6 GFlops 6.7 GB/s 121.7ulps
When operational, that feature seems to add a touch of throughput on both XP & Vista/Win7, and seems to close the performance difference we've been so worried about. You should get a boost when I fix that.
Jason
570 wooot!
It's hot, almost non-overclockable
probably do Batman really well though