Author Topic: Seemingly Silly Question  (Read 12209 times)

Offline VoidPilot

  • Squire
  • *
  • Posts: 18
Seemingly Silly Question
« on: 07 Jul 2010, 05:44:22 am »

I was trawling through the SETI board and saw something that caught my eye at the time. I started wondering about it much later on, and my curiosity has been piqued, so I was wondering if anyone knows the answer to this.

Is it possible to get the GPU to process 2 seti WUs at the same time and if so, does it take each WU, say, 10% longer or 50% longer etc...

If this can be done, how does one do it?

rgds

VP

Offline Jason G

  • Construction Fraggle
  • Knight who says 'Ni!'
  • *****
  • Posts: 8980
Re: Seemingly Silly Question
« Reply #1 on: 07 Jul 2010, 07:04:58 am »
Quote from: VoidPilot
Is it possible to get the GPU to process 2 seti WUs at the same time and if so, does it take each WU, say, 10% longer or 50% longer etc...

Short Answer 'Yes and No'.

In more detail, pre-Fermi CUDA-capable cards are architected to do basically one thing at a time. With older drivers and hardware, context switches (needed if doing multiple things at once) usually fail for memory or other resource reasons, because of the physical memory model used by XP-style drivers. The Fermi cards have hardware devoted to context switching between applications, at least under the WDDM (Vista/Win7) device driver model, which enables multiple CUDA contexts to run at once.

This new driver model allows each application to be isolated (ignoring driver bugs) and so theoretically to use the full resources of the card, by paging things in and out. That's an expensive process across the PCI Express bus, so overloading the cards wouldn't be advised. However, current Fermi applications don't use the whole card's resources, which means some gains have been seen by running more instances on the Fermis under the newer driver/OS. For now, running two instances makes each task run slower, but the total throughput seems to increase by about 50% on a GTX 480. I would expect that figure to reduce greatly, though, as we use more resources to speed things up and take advantage of greater capabilities in the CUDA framework.
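To put that in the terms of the original question, here is a rough sketch of the arithmetic. It assumes only the figures quoted above (two instances, roughly +50% total throughput); the function name `per_task_slowdown` is made up for illustration and the result is an estimate, not a measurement.

```python
def per_task_slowdown(instances: int, throughput_gain: float) -> float:
    """Estimate how many times longer each task takes when `instances`
    tasks share one GPU and total throughput rises by `throughput_gain`
    (0.5 means +50%) compared with running one task at a time.
    """
    # Total throughput becomes (1 + gain) tasks per unit time, shared
    # equally by `instances` tasks, so each one runs at (1 + gain) / instances
    # of the single-task speed. Elapsed time is the inverse of that.
    return instances / (1.0 + throughput_gain)


# Hypothetical inputs taken from the GTX 480 observation above.
slowdown = per_task_slowdown(instances=2, throughput_gain=0.5)
print(f"each task takes about {slowdown:.2f}x as long")
```

So under those assumptions each WU would take roughly a third longer, while the card finishes about one and a half times as many WUs overall.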

That's achieved in BOINC using the anonymous platform, by setting the number of GPUs needed per task to, say, 0.5. As I suggested, the value of doing this will likely reduce, but it is something that may (or may not) help in the short term if you have a Fermi card running with WDDM drivers. (I doubt it will work on previous-generation hardware/drivers.)
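As a minimal sketch of what that setting looks like, assuming the usual anonymous-platform app_info.xml layout: the GPU count lives in the `<coproc>` element of the GPU application's `<app_version>` section. The other entries are elided here because they depend on the installed application, and the `CUDA` type string is the one BOINC clients of this era use; check your own file rather than copying this verbatim.

```xml
<app_version>
    <!-- app_name, version_num, plan_class and file_ref entries elided;
         keep whatever your existing app_info.xml already has here -->
    <coproc>
        <type>CUDA</type>
        <!-- each task claims half a GPU, so two tasks run at once -->
        <count>0.5</count>
    </coproc>
</app_version>
```

BOINC reads app_info.xml at startup, so the client has to be restarted for the change to take effect.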

Hope that helps, Jason
« Last Edit: 07 Jul 2010, 07:19:01 am by Jason G »

Offline Raistmer

  • Working Code Wizard
  • Volunteer Developer
  • Knight who says 'Ni!'
  • *****
  • Posts: 14349
Re: Seemingly Silly Question
« Reply #2 on: 07 Jul 2010, 01:28:31 pm »
And for pre-Fermi NV GPUs it was tested a few times with a negative result: total performance dropped.

Offline VoidPilot

  • Squire
  • *
  • Posts: 18
Re: Seemingly Silly Question
« Reply #3 on: 07 Jul 2010, 05:52:07 pm »
J, R

thnx

VP

hellsheep

  • Guest
Re: Seemingly Silly Question
« Reply #4 on: 07 Jul 2010, 09:06:33 pm »
I'm going to assume for any performance increase with the fermi's we require nVIDIA to release some new CUDA files of some sort? Maybe work on newer drivers too?

Offline Jason G

  • Construction Fraggle
  • Knight who says 'Ni!'
  • *****
  • Posts: 8980
Re: Seemingly Silly Question
« Reply #5 on: 08 Jul 2010, 01:38:08 am »
Quote from: hellsheep
I'm going to assume for any performance increase with the fermi's we require nVIDIA to release some new CUDA files of some sort? Maybe work on newer drivers too?

Not quite the full picture, but tool & SDK refinement would probably help as things go on. 

There are already extra facilities in the newer CUDA libraries that are designed to cram more concurrent processing onto the cards, and the hardware is being underutilised. We're not using everything yet, and only some portions use sufficient threads in the traditional CUDA kernel sense. The only difficulty so far seems to be that the existing CUDA apps are full of design flaws that need to be fixed first, which requires a deeper understanding of the multibeam algorithms than I had ever needed previously. That takes a long time (for me anyway). On top of that, the drivers, tools and techniques for programming in parallel are fundamentally more difficult, as there is less prior work to draw upon and much less experience on platforms like this.

It won't require newer CUDA libraries to extract more performance, but it will take time. Step-by-step refinement in all areas, particularly reliability first, will see things stabilise in the right direction, and we can turn to making use of the resources for speed as all the components mature.

Jason
« Last Edit: 08 Jul 2010, 01:50:48 am by Jason G »

hellsheep

  • Guest
Re: Seemingly Silly Question
« Reply #6 on: 08 Jul 2010, 04:19:15 am »
Ah thanks for that Jason.

Good to know that little bit of information.

Take all the time in the world you need. :) I know a lot of people here and over at SETI are working hard and trying to do their best. Long time or short time, it doesn't matter. The point is eventually it'll be working as we desire. :)

Offline Josef W. Segur

  • Janitor o' the Board
  • Knight who says 'Ni!'
  • *****
  • Posts: 3112
Re: Seemingly Silly Question
« Reply #7 on: 08 Jul 2010, 01:41:34 pm »
Quote from: hellsheep
...
The point is eventually it'll be working as we desire. :)

LOL, infinite speed is a target which will never be reached  ;D

Gecko_R7

  • Guest
Re: Seemingly Silly Question
« Reply #8 on: 08 Jul 2010, 02:33:03 pm »
Quote from: Josef W. Segur
Quote from: hellsheep
...
The point is eventually it'll be working as we desire. :)

LOL, infinite speed is a target which will never be reached  ;D

I'll settle just for a quantum-entanglement optimized application.  :P
Anyone know how to write for qbit processing?

hellsheep

  • Guest
Re: Seemingly Silly Question
« Reply #9 on: 09 Jul 2010, 04:13:23 am »
Quote from: Gecko_R7
Quote from: Josef W. Segur
Quote from: hellsheep
...
The point is eventually it'll be working as we desire. :)

LOL, infinite speed is a target which will never be reached  ;D

I'll settle just for a quantum-entanglement optimized application.  :P
Anyone know how to write for qbit processing?

Give me a moment, I'll just call up my Vulcan friend. :P

 
