GPU crunching question
Devaster:
A small problem: I installed the latest DirectX SDK (04/2007) and, uh-oh, there is an incompatibility between fxc from the SDK and the brcc compiler :o
Haos:
Just out of curiosity: using a 9500 (softmodded to 9700), 326/586 GPU/mem clocks,
on Win2k3, I was able to achieve:
-FFT bench:
min_n = 4
max_n = 4
RapidMind FFT Benchmark
-----------------------------------------------
Length: 16 = 2^4
Warming up...
Run timings, to and from host (in us):
5795.75 5615.26 9583.4 4686.23 4654.09
5703.83 4779.27 5022.07 6880.41 4667.78
4814.75 4944.96 9120.98 5642.92 4681.48
4857.22 4657.45 5188.32 6032.41 5560.77
4826.49 4694.61 5058.68 4724.78 5647.95
4876.22 4744.62 4652.42 10326.3 9268.23
4911.71 5770.61 4956.97 23194.5 4759.99
4882.65 6180.5 5031.01 4836.55 5471.36
4928.47 4928.47 7661.36 5651.02 5982.4
4808.33 6698.24 4948.59 5036.04 5189.72
5267.11 4874.27 5834.03 4966.47 4908.07
5025.15 5394.24 5988.82 4784.02 4641.8
5427.77 6573.07 4754.12 6100.31 4694.61
4805.81 4694.89 6234.14 4818.94 5904.72
4763.34 4658.56 5026.82 5687.9 6996.09
4931.55 4993.85 4619.45 5373.01 4758.59
6509.92 11045.3 5100.31 7362.39 4694.89
4770.61 4720.03 4724.78 4840.46 5887.12
5021.79 4970.66 7732.61 4761.39 5846.88
4848.28 6482.82 8503.49 6538.7 5774.52
Average execution time: 5751.77us
Normalized execution time (T/N): 359.485us/sample
Normalized by complexity (T/N lg N): 89.8713
Mflops (5 N lg N/T): 0.0556351
Average execution time: 5751.77us
Minimum execution time: 4619.45us
Normalized average execution time (T/N): 359.485us/sample
Normalized minimum execution time (T/N): 288.715us/sample
Average time normalized by complexity (T/N lg N): 89.8713
Minimum time normalized by complexity (T/N lg N): 72.1789
Average Mflops (5 N lg N/T): 0.0556351
Peak Mflops (5 N lg N/T): 0.0692724
---
Warming up...
Run timings, GPU-local (in us):
4287.51 4164.57 6281.36 6275.22 4418.55
5913.66 4145.01 5304 5119.87 4263.48
4521.65 5006.15 4357.64 4280.25 4391.73
5377.48 4325.79 4395.92 4089.41 4129.09
4823.97 5475.55 4131.6 4458.51 8534.23
4578.93 4113.44 4511.32 4092.76 4383.63
4261.25 4618.33 4183.01 6111.48 4119.31
9139.98 15454.9 4327.19 4232.47 5113.16
4495.11 17601.6 4422.74 5288.91 4215.42
4183.29 5226.6 4343.67 4503.77 4434.2
5019.84 4253.98 5049.18 4101.43 4438.95
4985.75 4206.48 4177.42 4077.95 5292.26
4396.48 6117.35 4233.86 4148.09 5918.13
4221.29 4130.48 4120.98 4343.39 14860.3
4552.39 4233.31 5142.78 4885.16 5926.24
4205.92 4913.66 4260.69 4510.2 4202.85
4182.73 4203.97 7359.32 4228.56 4182.17
4232.47 5304.55 5454.88 4221.57 5075.16
4208.44 4438.11 4200.89 5349.54 6816.99
4436.71 5529.76 4514.95 6238.61 4691.53
Average execution time: 5122.26us
Minimum execution time: 4077.95us
Normalized average execution time (T/N): 320.141us/sample
Normalized minimum execution time (T/N): 254.872us/sample
Average time normalized by complexity (T/N lg N): 80.0354
Minimum time normalized by complexity (T/N lg N): 63.718
BenchFFT average Mflops (5 N lg N/T): 0.0624724
BenchFFT peak Mflops (5 N lg N/T): 0.0784707
Residuals (compare with inverse):
Average absolute: 4.37377e-006
Maximum absolute: 2.29192e-005
Average relative: -1.#IND
Maximum relative: 1.#INF
-----------------------------------------------
-fft2d:
stops after printing this line:
Total number of floating point operations: 5.24288e+006
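For anyone who wants to check the normalized figures in the FFT bench log above, here is a minimal sketch in Python. It is not part of the RapidMind benchmark, and the function name `fft_metrics` is made up for illustration. It only applies the formulas printed in the log (T/N, T/N lg N, 5 N lg N/T) to the average and minimum host timings for N = 16.

```python
import math

def fft_metrics(time_us, n):
    """Normalized figures for one FFT time given in microseconds.

    Hypothetical helper: it just applies the formulas printed in the log.
    """
    lg_n = math.log2(n)
    return {
        "per_sample_us": time_us / n,            # T/N
        "per_n_lg_n": time_us / (n * lg_n),      # T/(N lg N)
        "mflops": 5 * n * lg_n / time_us,        # flops per microsecond == Mflops
    }

# Average and minimum "to and from host" times from the log, N = 2^4.
avg = fft_metrics(5751.77, 16)
best = fft_metrics(4619.45, 16)

print("average:", avg)   # compare with 359.485, 89.8713, 0.0556351 in the log
print("minimum:", best)  # compare with 288.715, 72.1789, 0.0692724 in the log
```

Since a time in microseconds goes straight into the Mflops formula, the result is flops per microsecond, which is the same thing as Mflops.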
Devaster:
After a discussion with the Brook creators: they are working on a fix for the compatibility issue between fxc and brcc, and on the issue of why kernels can't run on Vista ....
Back to the RapidMind backend: another reason why powerspectrum is so slow is the number of data uploads/downloads. GPU computation is only effective for massive amounts of data and high arithmetic intensity... ???
What to do? That's the question :-\
What's bad is that there is no simple way to get any info .... :'(
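To put rough numbers on the upload/download point, here is a toy cost model in Python. It is only a sketch: all four constants are made-up illustrative values, not measurements of any card. The 4000 us fixed overhead is loosely inspired by the GPU-local timings earlier in the thread, where even a 16-point FFT takes about 4 ms.

```python
import math

# All four constants are hypothetical, illustrative values, not measurements.
OVERHEAD_US = 4000.0        # fixed per-call cost (driver, shader setup)
BUS_BYTES_PER_US = 1000.0   # roughly 1 GB/s host <-> GPU transfer
GPU_FLOPS_PER_US = 10000.0  # roughly 10 Gflops sustained
CPU_FLOPS_PER_US = 1000.0   # roughly 1 Gflops sustained

def fft_flops(n):
    """Nominal FFT cost, same 5 N lg N convention as the benchmark."""
    return 5 * n * math.log2(n)

def cpu_time_us(n):
    """CPU time: pure compute, no transfer."""
    return fft_flops(n) / CPU_FLOPS_PER_US

def gpu_time_us(n, bytes_per_sample=8):
    """GPU time: fixed overhead + upload and download + compute."""
    # n complex samples (two 4-byte floats each), sent up and read back.
    transfer = 2 * n * bytes_per_sample / BUS_BYTES_PER_US
    return OVERHEAD_US + transfer + fft_flops(n) / GPU_FLOPS_PER_US

for exp in (4, 10, 16, 20):
    n = 2 ** exp
    winner = "GPU" if gpu_time_us(n) < cpu_time_us(n) else "CPU"
    print(f"N=2^{exp}: cpu={cpu_time_us(n):.1f}us gpu={gpu_time_us(n):.1f}us -> {winner}")
```

With these assumed numbers the GPU only wins once the transform is large enough for the compute savings to outweigh the fixed overhead and the transfers. Keeping data on the card between kernels, instead of uploading and downloading around every call, would shrink the transfer term.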
Vyper:
You're forgetting that not 100% of the code has to run on the GPU.. I don't know if Powerspectrum is the most demanding part of the code, but you could convert the parts that benefit most from GPU programming first, as an experiment, and then move on to optimizing other parts of the code..
You need to start somewhere and feel proud of it..
Btw, if you want GPU testing, don't hesitate to contact me. I'm running Vista x64 (Aero off, so as not to disturb the GPU) and a factory-clocked 8800GTX...
I'm eager to assist you and to try to persuade you to use the CUDA API as well :)
Kind Regards Vyper
citroja:
Wow.... I have been gone for some time now and it seems like things are moving... though slowly, and I am not entirely sure in which direction :)
Anyway, I am back for a bit, but I have to rebuild multiple computers over the next few weeks, so I don't know how much I can help.
Also, has anyone seen or heard from Hans Dorn recently? He was working on the same project but has disappeared....
Let me know if you need any help or testing.
-citroja