Author Topic: GPU crunching question  (Read 120035 times)

Offline Simon

  • Ni!
  • Knight who says 'Ni!'
  • *****
  • Posts: 1045
    • Is it a bird? Is it a plane? No...its-the.net!
Re: GPU crunching question
« Reply #15 on: 13 Feb 2007, 05:55:02 pm »
I have no numbers to compare with, but going by how the 6600 compares with the 6800, the 7800 and 7900 should deliver a similar sort of speedup, and the 8800 GTX may be quite a lot quicker due to its different architecture.

I'm estimating that if a GF 6600 can do ~0.5 GFlops in FFTs, then a 7800/7900 would do around 3x-4x as much. Have you gotten anyone to test your code on those cards?

Regards,
Simon.
« Last Edit: 14 Feb 2007, 02:01:52 pm by Simon »

pepperammi

  • Guest
Re: GPU crunching question
« Reply #16 on: 13 Feb 2007, 10:17:08 pm »
I've got a 7950 GT 512MB to run it on, if ever you need it and it can be of any help. I'm sure you've had loads of people offer to test for you  ;) :D

Incredible stuff, by the looks of it. Keep up the great work  ;D

citroja

  • Guest
Re: GPU crunching question
« Reply #17 on: 13 Feb 2007, 11:48:34 pm »
I just found (some of) my notes on an FFT project I did about 2 years ago.  I am a bit rusty, but if you want an extra set of eyes to cross-reference... just let me know.

And as stated before, I have a 7800GTX ready and waiting. I also have a second one coming in soon, so I will be able to do SLI testing as well if needed.

-citroja

Offline Devaster

  • Volunteer Developer
  • Knight who says 'Ni!'
  • *****
  • Posts: 653
  • I like Duke !!!
Re: GPU crunching question
« Reply #18 on: 14 Feb 2007, 10:43:56 am »
Want to test? ;D

Unpack and run...

and post the numbers and your card...

[attachment deleted by admin]
« Last Edit: 14 Feb 2007, 10:46:57 am by Devaster »

pepperammi

  • Guest
Re: GPU crunching question
« Reply #19 on: 14 Feb 2007, 11:04:49 am »
Fails to run in Vista Ultimate x64  :'( For both fft and fft2d it says:
"The application has failed to start because its side-by-side configuration is incorrect. Please see the application log for more detail."

Event viewer info:
Quote
Activation context generation failed for "C:\xxxxxxx\fft.exe". Dependent Assembly Microsoft.VC80.CRT,processorArchitecture="x86",publicKeyToken="1fc8b3b9a1e18e3b",type="win32",version="8.0.50727.762" could not be found. Please use sxstrace.exe for detailed diagnosis.
Sorry. I won't be much help, I suppose  :'(

[EDIT] same on a second XP machine.
« Last Edit: 14 Feb 2007, 11:38:14 am by pepperammi »

Offline Devaster

  • Volunteer Developer
  • Knight who says 'Ni!'
  • *****
  • Posts: 653
  • I like Duke !!!
Re: GPU crunching question
« Reply #20 on: 14 Feb 2007, 12:17:21 pm »
32-bit only...

Oops, I forgot to add the MS VC8 runtime... sorry...

Here it is:

[attachment deleted by admin]

pepperammi

  • Guest
Re: GPU crunching question
« Reply #21 on: 14 Feb 2007, 01:54:11 pm »
Yeah, it kind of crashes on x64. I was able to get this info before it does, though, if it's any help to you.
On a PD945 with a 7950 GT 512MB OC edition, but I can't remember the clock speed. I will try to find that again. Also, the drivers for Vista aren't brilliant at the moment.

I'll try and get it on the XP 32bit system I've got.

[EDIT]Got results from XP 32bit system. Much better. On a 6700XL 128MB
Quote
Size: 256 x 256 = 2^8 x 2^8
Radix: 4 = 2^2
Total number of floating point operations: 5.24288e+006

Run timings, to and from host (in ms):

Average execution time: 10.3317ms
Overall average execution time: 10.333ms
Minimum execution time: 9.10301ms
Average Mflops: 507.458
Peak Mflops: 575.95

Run timings, GPU-local (in ms):

Average execution time: 7.88121ms
Overall average execution time: 7.88255ms
Minimum execution time: 7.25584ms
Average Mflops: 665.238
Peak Mflops: 722.574

[attachment deleted by admin]
« Last Edit: 14 Feb 2007, 02:10:01 pm by pepperammi »

keeleysam

  • Guest
Re: GPU crunching question
« Reply #22 on: 14 Feb 2007, 06:54:00 pm »
On a 7900GT:

min_n = 4
max_n = 4
RapidMind FFT Benchmark
-----------------------------------------------
Length: 16 = 2^4
Warming up...
Run timings, to and from host (in us):
 20597 9077.37 9696.11 9197.03 9168.72
 9431.07 8683.53 9497.21 8846.68 9282.54
 9018.77 9536.37 8525.4 10783.7 8275.49
 8725.45 9378.13 8728.19 8879.72 9141.6
 9507.8 9493.01 8800.25 11025.1 8919.49
 8545.89 9093.98 10293.4 9472.42 10200.4
 8922.43 9307.07 9000.57 9144.88 9039.11
 9070.22 8831.57 10942 8566.39 10773.8
 8636.83 8644.92 8682.37 8773.49 9290.59
 7589.43 9198.22 8743.57 7973.17 9571.23
 7876.32 8255.47 9064.25 8775.51 8158.25
 9060.96 7676.09 7666.71 9149.89 9774.81
 10266.7 10175.7 9520.35 8725.29 10543.8
 8581.63 7617.03 15456.1 8748.53 8726.4
 9638.18 9400.55 10548.7 8776.72 10612.3
 9235.29 9257.36 9272.04 8578.75 10260.5
 9040.53 7605.66 9057.08 9349.05 9530.74
 8781.51 9602.82 9365.02 7739.68 7746.63
 8837.69 10425.8 8660.14 9671.3 9630.79
 9706.52 9869.04 9411.83 9261.09 9144.61
Average execution time: 9323.61us
Normalized execution time (T/N): 582.726us/sample
Normalized by complexity (T/N lg N): 145.681
Mflops (5 N lg N/T): 0.0343215
Average execution time: 9323.61us
Minimum execution time: 7589.43us
Normalized average execution time (T/N): 582.726us/sample
Normalized minimum execution time (T/N): 474.339us/sample
Average time normalized by complexity (T/N lg N): 145.681
Minimum time normalized by complexity (T/N lg N): 118.585
Average Mflops (5 N lg N/T): 0.0343215
Peak Mflops (5 N lg N/T): 0.0421639
---
Warming up...
Run timings, GPU-local (in us):
 5976.24 5896.85 5840.59 6505.2 5775.57
 7056.4 6266.14 5998.06 5886.62 7327.8
 6065.59 5858.28 6421.34 5776.61 5926.31
 5250.09 5871.16 7021.49 5823.92 6924.32
 5780.96 5904.29 5706.06 7206.85 6377.11
 6465.32 6095.81 6328 5976.41 6630.75
 5816.1 5795.21 7562.49 5496.43 6818.26
 5466.12 5741.6 5980.02 5716.79 7440.3
 5966.9 6397.72 5532.77 5484.52 5601.83
 6377.94 5580.49 6659.62 5603.51 6320.36
 5269.05 5209.39 6419.08 5713.91 5216.8
 5260.48 7587.09 5241.04 5475.64 5406.69
 7129.43 5858.2 5725.67 5813.34 6022.91
 5768.2 5609.28 6125.66 5996.56 6007.18
 7563.85 6086.56 6230.87 6926.92 5960.09
 6062.77 5800.01 6015.09 5505.55 5892.75
 6236.54 5841.23 5506.36 5892.58 5654.26
 6105.84 5710.56 5600.19 6400.18 6086.03
 6659.31 5882.92 5838.27 6343.58 6125.2
 6492.9 6064 5760.77 5854.11 5531.29
Average execution time: 6057.85us
Minimum execution time: 5209.39us
Normalized average execution time (T/N): 378.616us/sample
Normalized minimum execution time (T/N): 325.587us/sample
Average time normalized by complexity (T/N lg N): 94.654
Minimum time normalized by complexity (T/N lg N): 81.3967
BenchFFT average Mflops (5 N lg N/T): 0.052824
BenchFFT peak Mflops (5 N lg N/T): 0.0614276
Residuals (compare with inverse):
  Average absolute: 2.4984e-008
  Maximum absolute: 1.19267e-007
  Average relative: -1.#IND
  Maximum relative: 1.#INF
-----------------------------------------------


RapidMind 2D FFT Benchmark
===============================================
Size: 256 x 256 = 2^8 x 2^8
Radix: 4 = 2^2
Total number of floating point operations: 5.24288e+006

Run timings, to and from host (in ms):

Average execution time: 16.7119ms
Overall average execution time: 16.7127ms
Minimum execution time: 14.8866ms
Average Mflops: 313.721
Peak Mflops: 352.189

Run timings, GPU-local (in ms):

Average execution time: 10.8762ms
Overall average execution time: 10.8781ms
Minimum execution time: 9.81122ms
Average Mflops: 482.052
Peak Mflops: 534.376
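
For readers checking the numbers: the benchmark output labels its throughput as "5 N lg N/T". The following is a minimal sketch in Python, not the RapidMind benchmark's own code, that recomputes two of the figures above from that formula. The function names `fft_flops` and `mflops` are made up for illustration, and treating the 256 x 256 transform as N = 65536 points is an assumption that happens to reproduce the reported operation count of 5.24288e+006.

```python
import math


def fft_flops(n: int) -> float:
    """Nominal operation count 5 * N * lg N, as labelled in the benchmark output."""
    return 5.0 * n * math.log2(n)


def mflops(n: int, seconds: float) -> float:
    """Millions of nominal floating point operations per second for one transform."""
    return fft_flops(n) / seconds / 1e6


# 1D run on the 7900GT: N = 16, average host round trip 9323.61 us
print(f"1D average Mflops: {mflops(16, 9323.61e-6):.7f}")

# 2D run on the 7900GT: 256 x 256 treated as N = 65536 (assumption),
# average host round trip 16.7119 ms
print(f"2D nominal flops:  {fft_flops(256 * 256):.0f}")
print(f"2D average Mflops: {mflops(256 * 256, 16.7119e-3):.3f}")
```

The tiny Mflops values for the 1D case follow directly from the formula: a length-16 transform is only 320 nominal operations, so a run of roughly 9 ms is almost entirely fixed overhead rather than arithmetic.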

popandbob

  • Guest
Re: GPU crunching question
« Reply #23 on: 14 Feb 2007, 09:22:54 pm »
It spat out an error for both...

fft.exe (top in pic)
fft2d.exe (bottom in pic) showed that, then went to the same as fft.exe

BoB


[attachment deleted by admin]

citroja

  • Guest
Re: GPU crunching question
« Reply #24 on: 14 Feb 2007, 10:18:27 pm »
OK, I ran it... but got an error both times. Used an XFX 7800 GTX OC.

RESULTS:

FFT

min_n = 4
max_n = 4
RapidMind FFT Benchmark
-----------------------------------------------
Length: 16 = 2^4
Warming up...
Run timings, to and from host (in us):
 11088.9 10020 10061.1 9965.72 9864.96
 9933.09 9944.54 9765.17 10057.2 9835.17
 9694.47 9850.44 9889.76 9837.63 9770.31
 9745.29 9740.39 10029.6 9977.12 9747.09
 9773.63 9721.04 9869.84 9799.63 9861.39
 9877.91 9840.76 10061.1 9847.86 9776.06
 9863.19 9510.83 9619.27 10084.2 9967.15
 9788.94 9841.71 9879.99 9715.2 9831.11
 10047.5 9785.2 9878.41 9814.68 9767.72
 9773.21 9901.7 10074.6 10086.7 9847.63
 9846.62 9976.32 10008.6 9875.92 9859.49
 9764.52 9779.82 9774.2 9933.79 9897.1
 9915.27 9792.4 9807.99 9823.81 9846.13
 9873.5 9807.47 10006.1 9770.74 9872.61
 9938.64 9916.57 9874.38 9941.68 9819.74
 9913.2 9837.42 9671.82 9753.61 9805.79
 9752.28 9730.36 9751.96 9912.53 10012.9
 10133.2 9882.52 9870.45 9763.79 9948.21
 10232.1 9924.38 9935.36 9899.92 9818.8
 10061.7 9916.66 9969.69 9952.8 9904.88
Average execution time: 9884.06us
Normalized execution time (T/N): 617.753us/sample
Normalized by complexity (T/N lg N): 154.438
Mflops (5 N lg N/T): 0.0323754
Average execution time: 9884.06us
Minimum execution time: 9510.83us
Normalized average execution time (T/N): 617.753us/sample
Normalized minimum execution time (T/N): 594.427us/sample
Average time normalized by complexity (T/N lg N): 154.438
Minimum time normalized by complexity (T/N lg N): 148.607
Average Mflops (5 N lg N/T): 0.0323754
Peak Mflops (5 N lg N/T): 0.0336459
---
Warming up...
Run timings, GPU-local (in us):
 9748.17 9507.76 9554.45 9612.01 9610.69
 9481.02 9496.95 9411.37 9427.72 9407.54
 9517.09 9602.89 9635.99 9578.78 9604.73
 9608.02 9468.44 9477.32 9497.57 9727.09
 9508.55 9551.91 9555.9 9560 9550.06
 9614.92 9521.42 9391.96 9365.14 9369.59
 9557.56 9480.28 9525.28 9642.08 9370.73
 9727.39 9779.86 9979.25 9611.85 9492.61
 9580.91 9439.35 9497.55 9502.86 9545.7
 9548.19 9523.97 9503.56 9537.42 9514.92
 9627 9618.37 9531.4 9570.15 9555.49
 9562.65 9598.57 9823.91 9509.34 9603.7
 9600.79 9564.68 9567.27 9671.98 9453.32
 9650.67 9525.09 9515.26 9536.27 9488.43
 9562.71 9416.56 9415.84 9441.23 9630.29
 9598.56 9515.82 9514.17 9532.05 9507.69
 9569.8 9491.44 9446.88 9423.49 9439.6
 9511.41 9481.26 9477.17 9664.5 9769.24
 9616.25 9560.46 9517.15 9606.68 9453.77
 9401.95 9459.16 9489.44 9437.21 9485.7
Average execution time: 9543.38us
Minimum execution time: 9365.14us
Normalized average execution time (T/N): 596.461us/sample
Normalized minimum execution time (T/N): 585.321us/sample
Average time normalized by complexity (T/N lg N): 149.115
Minimum time normalized by complexity (T/N lg N): 146.33
BenchFFT average Mflops (5 N lg N/T): 0.0335311
BenchFFT peak Mflops (5 N lg N/T): 0.0341693
Residuals (compare with inverse):
  Average absolute: 2.4984e-008
  Maximum absolute: 1.19267e-007
  Average relative: -1.#IND
  Maximum relative: 1.#INF




******************EXITS WITH ERROR***************

The instruction at "0x6962e876" referenced memory at "0x0000045c".
The memory could not be "read".

Click on OK to terminate the program

******************End Message********************


FFT2d


RapidMind 2D FFT Benchmark
===============================================
Size: 256 x 256 = 2^8 x 2^8
Radix: 4 = 2^2
Total number of floating point operations: 5.24288e+006

Run timings, to and from host (in ms):

Average execution time: 15.329ms
Overall average execution time: 15.3299ms
Minimum execution time: 14.7271ms
Average Mflops: 342.024
Peak Mflops: 356.001

Run timings, GPU-local (in ms):

Average execution time: 13.1125ms
Overall average execution time: 13.1131ms
Minimum execution time: 12.8642ms
Average Mflops: 399.839
Peak Mflops: 407.557


******************EXITS WITH ERROR***************

The instruction at "0x6962e876" referenced memory at "0x0000045c".
The memory could not be "read".

Click on OK to terminate the program

******************End Message********************



I hope this helps...let me know if you need anything else.

-citroja

[attachment deleted by admin]
« Last Edit: 14 Feb 2007, 10:20:51 pm by citroja »
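
With three cards now reporting the 2D benchmark, the "to and from host" and "GPU-local" averages can be compared side by side. The sketch below is in Python and uses only the average execution times posted in this thread. Reading the difference between the two timings as host transfer overhead is an assumption about what the benchmark measures, and the helper name `transfer_overhead` is hypothetical.

```python
# Average 2D FFT times in ms as posted in this thread:
# (to and from host, GPU-local)
results = {
    "6700XL 128MB": (10.3317, 7.88121),
    "7900GT": (16.7119, 10.8762),
    "7800 GTX OC": (15.329, 13.1125),
}


def transfer_overhead(host_ms: float, gpu_ms: float) -> tuple[float, float]:
    """Return (extra ms beyond the GPU-local time, that extra as a share of the host time)."""
    extra = host_ms - gpu_ms
    return extra, extra / host_ms


for card, (host_ms, gpu_ms) in results.items():
    extra, share = transfer_overhead(host_ms, gpu_ms)
    print(f"{card}: {extra:.2f} ms outside the GPU-local time ({share:.0%} of the round trip)")
```

These are single runs on different machines, drivers and operating systems, so the comparison is only indicative.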

Offline Devaster

  • Volunteer Developer
  • Knight who says 'Ni!'
  • *****
  • Posts: 653
  • I like Duke !!!
Re: GPU crunching question
« Reply #25 on: 15 Feb 2007, 09:34:25 am »
Quote
It spat out an error for both...

fft.exe (top in pic)
fft2d.exe (bottom in pic) showed that, then went to the same as fft.exe

BoB


Maybe your hardware is not compatible with RapidMind (it needs Shader Model 3.0)...
In that case it tries the CPU backend, and then you must have a C++ compiler set up correctly...

citroja: I don't know what happened; on all the machines I have tested it works fine...

To all: please post your OS version too, thanks.

(This algorithm is heavily tuned for RapidMind.)
« Last Edit: 15 Feb 2007, 09:37:05 am by Devaster »

Offline Simon

  • Ni!
  • Knight who says 'Ni!'
  • *****
  • Posts: 1045
    • Is it a bird? Is it a plane? No...its-the.net!
Re: GPU crunching question
« Reply #26 on: 15 Feb 2007, 09:58:47 am »
Hi Devaster,

how is RapidMind working out for you? From what I saw when looking at the documentation, it seems pretty usable compared to having to write direct shader code.

Since I have no basis for comparison: how would you say BrookGPU, RapidMind, CUDA or other solutions you have tried compare in performance, and how long does it take you to code for them?

From what I found out, RapidMind seems the most useful because it can use both ATI X1K+ and nVidia 6x+ GPUs without modification; I guess you'd still need to compile different kernels with it, or include more DLLs.

Do you know whether your code works on ATI GPUs right now? Should be possible, with RM.

<edit>Yes, it does. Amazingly, even on AGP ones; I just tested on my ATI X800. Interesting, as I thought RapidMind would only work on X1K+ ATIs. It is very slow though; it needs PCI-Express to work correctly (AGP isn't all that bidirectional). Results attached: XP32, A64 3500+ (2.2 GHz) </edit>

Regards,
Simon.

[attachment deleted by admin]
« Last Edit: 15 Feb 2007, 10:08:40 am by Simon »

Offline Devaster

  • Volunteer Developer
  • Knight who says 'Ni!'
  • *****
  • Posts: 653
  • I like Duke !!!
Re: GPU crunching question
« Reply #27 on: 15 Feb 2007, 10:29:25 am »
I have tested only BrookGPU and RapidMind. For CUDA I have neither a GPU nor an NDA...
My implementation of FFT in Brook was very slow, but Naga's (in GLSL, GPUFFTW) is comparable in speed with RapidMind...

The usability of RapidMind... is cool...

The RapidMind GPU backend should run on all cards that have SM 3.0 and GLSL...
the Cell backend on Cells, and the CPU backend with a classic C++ compiler...

citroja

  • Guest
Re: GPU crunching question
« Reply #28 on: 15 Feb 2007, 08:34:55 pm »
Quote
Quote
It spat out an error for both...

fft.exe (top in pic)
fft2d.exe (bottom in pic) showed that, then went to the same as fft.exe

BoB

Maybe your hardware is not compatible with RapidMind (it needs Shader Model 3.0)...
In that case it tries the CPU backend, and then you must have a C++ compiler set up correctly...

citroja: I don't know what happened; on all the machines I have tested it works fine...

To all: please post your OS version too, thanks.

(This algorithm is heavily tuned for RapidMind.)

OS is Win XP Pro SP2

Hmm, do you need .NET 2.0?

I ran it both with BOINC running and not running and got the same thing... maybe bad RAM? Though it tests fine???

-citroja

popandbob

  • Guest
Re: GPU crunching question
« Reply #29 on: 16 Feb 2007, 12:11:50 pm »
My OS is the same as citroja
Card is an ATI HIS 9250 Excalibur
I do have .net 2.0

Bob

 
