Author Topic: C-60 APU and Radeon HD6920  (Read 33455 times)

Offline Jason G

  • Construction Fraggle
  • Knight who says 'Ni!'
  • *****
  • Posts: 8980
Re: C-60 APU and Radeon HD6920
« Reply #15 on: 04 Jan 2012, 06:37:10 pm »
I would not talk about any convergence after these tests. "Divergence" looks like a much more adequate term...
Look at the number of builds that DON'T run on these new AMD chips...
The whole x64 AKv8b2 set fails without even reporting an error...

When Intel-specific target chip libraries & builds were used by design, and they run at all (let alone better in some cases), I find that surprising. A static Intel build should run badly, if at all, on the wrong chip, even within Intel silicon, because the micro-architectural optimisation is on the heavy side.  With the instruction sets, I'm referring more to the fact that general SSE3 performs pretty well across the board on newer chips from both vendors, whereas on Core2, Athlon and Phenom I/II 'plain Intel SSE3' was not a good choice of code & libraries.

With some juggling, we *should* find that a static build for each of x86 & x64, each with generically optimised SSE3 & SSE2 paths, static IPP in both flavours & FFTW, has a workable combination for most chips except Core2 & AVX.  Since AVX availability is OS dependent, it would need to be tacked on with proper detection, and doing something similar for SSSE3 seems viable.

Jason
« Last Edit: 04 Jan 2012, 06:53:15 pm by Jason G »

Offline Mike

  • Alpha Tester
  • Knight who says 'Ni!'
  • ***
  • Posts: 2427
Re: C-60 APU and Radeon HD6920
« Reply #16 on: 04 Jan 2012, 07:42:40 pm »
I would really like to see how the FX performs with a 64 bit app.
I'm almost certain it would benefit more than an Intel, to be honest.

Mike

Offline Jason G

  • Construction Fraggle
  • Knight who says 'Ni!'
  • *****
  • Posts: 8980
Re: C-60 APU and Radeon HD6920
« Reply #17 on: 04 Jan 2012, 08:11:55 pm »
I would really like to see how the FX performs with a 64 bit app.
I'm almost certain it would benefit more than an Intel, to be honest.

Mike
I tend to agree there would be some benefit, though I would expect only around 10% from just changing bitness, since much of the hot code is already 128 bit vectorised SIMD (and will be 256 bit once AVX support is added later). The peripheral non-vectorised CPU code (which would become 64 bit) has only a marginal possible impact in this kind of application.
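As a rough illustration of why a bitness change alone might land in that range, here is a minimal Amdahl's law sketch in Python. The 30% scalar share and the 1.5x local gain are made-up numbers chosen for the example, not measurements from the app, and the function name is hypothetical.

```python
def overall_speedup(affected_fraction, local_speedup):
    """Amdahl's law: whole-program speedup when only part of the runtime gets faster.

    affected_fraction: share of total runtime spent in the code that improves (0..1)
    local_speedup: how many times faster that share becomes (> 0)
    """
    if not 0.0 <= affected_fraction <= 1.0:
        raise ValueError("affected_fraction must be between 0 and 1")
    if local_speedup <= 0:
        raise ValueError("local_speedup must be positive")
    return 1.0 / ((1.0 - affected_fraction) + affected_fraction / local_speedup)


# Hypothetical figures: 30% of the runtime is scalar code, and that
# scalar code runs 1.5x faster once compiled as 64 bit.
gain = overall_speedup(0.30, 1.5)
print(f"overall: {gain:.3f}x ({(gain - 1) * 100:.1f}% faster)")
```

With those assumed inputs the overall gain is about 11%, because the already vectorised 70% of the runtime does not change at all.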

The technical challenge is working out how to approach that correctly, given that the 64 bit Intel libraries don't include suitable generically optimised 64 bit variants.  For a 64 bit build, at the moment it looks as though a worthy option to try will be bypassing the issues by using the newest FFTW, with its automatic AVX support (ironically built with the Intel compiler), & using generic arch:SSE3 optimisations for the core, with an arch:SSE2 path for some earlier chips.  Where I talk about convergence is that it's looking as though i3-i7 might also work well under that arrangement, so only poking around as Raistmer's V7 updates are merged in will really point to the best methods.

All I can say with absolute certainty at this point is that separate builds for every chip, or even every class of chips, would not be a sustainable approach, though it worked well in the past.  Even 'simply' rebuilding the full AKv8b2 set for the bugfix was far too much work for that amount of subtle difference. That knowledge could & should be embedded in one app per platform instead, as stock does.

Jason
« Last Edit: 04 Jan 2012, 08:17:36 pm by Jason G »

Offline Mike

  • Alpha Tester
  • Knight who says 'Ni!'
  • ***
  • Posts: 2427
Re: C-60 APU and Radeon HD6920
« Reply #18 on: 04 Jan 2012, 08:20:26 pm »
10% isn't too shabby IMHO.

Offline Jason G

  • Construction Fraggle
  • Knight who says 'Ni!'
  • *****
  • Posts: 8980
Re: C-60 APU and Radeon HD6920
« Reply #19 on: 04 Jan 2012, 08:35:54 pm »
10% isn't too shabby IMHO.

In micro-architectural optimisation terms it's a bucketload.  For more impact, it's quite possible that a newer FFTW may fly on that chip compared to Intel's x86 SSE3 (Pentium 4!!!!) library, which appears to work best on it now.  We'll just have to make the comparisons easy with switches or similar, then hard-wire defaults to suit the findings later.

Jason

Offline Urs Echternacht

  • Volunteer Developer
  • Knight who says 'Ni!'
  • *****
  • Posts: 4121
  • ++
Re: C-60 APU and Radeon HD6920
« Reply #20 on: 04 Jan 2012, 09:59:48 pm »
For Bulldozer arch try CompilerOptQuickRef-62004200.pdf
_\|/_
U r s

Offline Jason G

  • Construction Fraggle
  • Knight who says 'Ni!'
  • *****
  • Posts: 8980
Re: C-60 APU and Radeon HD6920
« Reply #21 on: 04 Jan 2012, 10:42:23 pm »
For Bulldozer arch try CompilerOptQuickRef-62004200.pdf

Thanks! Interesting that they recommend aggressive unrolling & prefetch, which suggests long pipelines. That's the opposite of Core2 onward, which use loop stream detectors and often prefer loops to remain rolled up.  When optimising for those chips it'll probably be worthwhile cross-checking what Agner Fog says for extra insight.

I'm planning to try other compilers as well, so that's some good starting info.

Jason

Offline skildude

  • Knight o' The Round Table
  • ***
  • Posts: 168
Re: C-60 APU and Radeon HD6920
« Reply #22 on: 05 Jan 2012, 12:35:44 am »
I did the benchmarks for the SSE3 x64 non-AMD build.  All WUs failed to start.
No real data to report at all on that test.
The following are the results from the AMD SSE3 build on Win7 64 bit, OC to 3.9 GHz.  The app doesn't state whether it is 64 bit, but it is from the 64 bit Lunatics installer.
I think there is a dramatic speed difference from Mike's 32 bit testing.  I don't think the minimal OC can account for the speed difference.  In fact these times are substantially faster than Mike's!!!

WU : PG0009.wu
AK_v8b2_win_SSE3_AMD.exe : 326.697 secs CPU
AK_v8b2_win_SSE3_AMD.exe : 328.554 secs CPU
Speedup     : -0.57%
Ratio       : 0.99 x

WU : PG0395.wu
AK_v8b2_win_SSE3_AMD.exe : 307.104 secs CPU
AK_v8b2_win_SSE3_AMD.exe : 306.776 secs CPU
Speedup     : 0.11%
Ratio       : 1.00 x

WU : PG0444.wu
AK_v8b2_win_SSE3_AMD.exe : 249.430 secs CPU
AK_v8b2_win_SSE3_AMD.exe : 250.740 secs CPU
Speedup     : -0.53%
Ratio       : 0.99 x

WU : PG1327.wu
AK_v8b2_win_SSE3_AMD.exe : 201.584 secs CPU
AK_v8b2_win_SSE3_AMD.exe : 200.134 secs CPU
Speedup     : 0.72%
Ratio       : 1.01 x
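
For anyone reading the bench output: the Speedup and Ratio lines above are consistent with the two formulas sketched below, treating the first time of each pair as the reference and the second as the tested run. This is inferred from the numbers posted here, not taken from the bench script itself, and the function names are only illustrative.

```python
def speedup_percent(ref_secs, test_secs):
    """Percentage of the reference CPU time saved by the tested run (negative = slower)."""
    return (ref_secs - test_secs) / ref_secs * 100.0


def ratio(ref_secs, test_secs):
    """Reference time divided by tested time (above 1.0 = tested run is faster)."""
    return ref_secs / test_secs


# CPU times in seconds from the four work units above: (first run, second run).
RESULTS = {
    "PG0009.wu": (326.697, 328.554),
    "PG0395.wu": (307.104, 306.776),
    "PG0444.wu": (249.430, 250.740),
    "PG1327.wu": (201.584, 200.134),
}

for wu, (ref_secs, test_secs) in RESULTS.items():
    print(f"{wu}: Speedup {speedup_percent(ref_secs, test_secs):.2f}%  "
          f"Ratio {ratio(ref_secs, test_secs):.2f} x")
```

Since both runs in each pair use the same executable, differences of well under 1% mostly show the run-to-run noise of the test.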


Offline Raistmer

  • Working Code Wizard
  • Volunteer Developer
  • Knight who says 'Ni!'
  • *****
  • Posts: 14349
Re: C-60 APU and Radeon HD6920
« Reply #23 on: 05 Jan 2012, 04:38:23 am »
Currently I'm preparing a new build environment on a netbook. It will be an x64 one, because it came with x64 Win7 on board.
I want to take the opportunity to do a big upgrade of the build environment too.
Ultimately I will use VS2010 (unfortunately I only have access to the x86 Professional version, so I will stay with VS2008 a little longer, because there I have the full x64 Pro suite).
It looks like the Intel part should be upgraded too. Perhaps the new Intel Composer? Should it support AVX? Should VS2010 support AVX? VS2008 apparently should not? Or are some patches/service packs available?
I put the new MB7 OCL NV online. It is still CUDA 3.2, but I will try CUDA 4.1RC2 on the netbook, so some more speed comparisons will be needed.
Testers, stay tuned ;)

Offline Mike

  • Alpha Tester
  • Knight who says 'Ni!'
  • ***
  • Posts: 2427
Re: C-60 APU and Radeon HD6920
« Reply #25 on: 05 Jan 2012, 05:15:06 am »
The FX benefits dramatically from clock speed.

Even so, in my last test I still had 6 cores running BOINC.
It was just a speed comparison between the AMD and Intel apps, not an overall speed test.

FX @ 3.9 GHz

AK_v8b2_win_SSE3_AMD.exe -verb -nog / PG0009.wu :
AppName: AK_v8b2_win_SSE3_AMD.exe
AppArgs: -verb -nog
TaskName: PG0009.wu
Started at  : 10:28:53.820
Ended at    : 10:34:18.863
    324.981 secs Elapsed
    322.875 secs CPU time
Speedup     : 8.87%
Ratio       : 1.10 x

AK_v8b2_win_SSE3_INTEL.exe -verb -nog / PG0009.wu :
AppName: AK_v8b2_win_SSE3_INTEL.exe
AppArgs: -verb -nog
TaskName: PG0009.wu
Started at  : 10:34:22.592
Ended at    : 10:39:43.889
    321.204 secs Elapsed
    319.131 secs CPU time
Speedup     : 9.93%
Ratio       : 1.11 x

AK_v8b2_win_SSE3_AMD.exe -verb -nog / PG0395.wu :
AppName: AK_v8b2_win_SSE3_AMD.exe
AppArgs: -verb -nog
TaskName: PG0395.wu
Started at  : 10:39:47.555
Ended at    : 10:44:51.209
    303.607 secs Elapsed
    301.550 secs CPU time
Speedup     : 14.89%
Ratio       : 1.17 x

AK_v8b2_win_SSE3_INTEL.exe -verb -nog / PG0395.wu :
AppName: AK_v8b2_win_SSE3_INTEL.exe
AppArgs: -verb -nog
TaskName: PG0395.wu
Started at  : 10:44:54.860
Ended at    : 10:49:49.590
    294.684 secs Elapsed
    292.673 secs CPU time
Speedup     : 17.40%
Ratio       : 1.21 x

Also, @ 3.9 GHz the FX is up to 20% faster than the Phenom at 3.6 GHz, instead of 28% slower.  :o

Bench attached.





Offline Raistmer

  • Working Code Wizard
  • Volunteer Developer
  • Knight who says 'Ni!'
  • *****
  • Posts: 14349
Re: C-60 APU and Radeon HD6920
« Reply #26 on: 05 Jan 2012, 05:58:42 am »
I think those AVX-enabled builds need at least an FFT update, if not an update of all the IPP functions used.
AFAIK Joe implemented some hand-written AVX optimisations in stock. It would be good if they were incorporated in the opt apps too... governed by a USE_AVX macro, perhaps.

Offline Jason G

  • Construction Fraggle
  • Knight who says 'Ni!'
  • *****
  • Posts: 8980
Re: C-60 APU and Radeon HD6920
« Reply #27 on: 05 Jan 2012, 07:29:09 am »
I think those AVX-enabled builds need at least an FFT update, if not an update of all the IPP functions used.
AFAIK Joe implemented some hand-written AVX optimisations in stock. It would be good if they were incorporated in the opt apps too... governed by a USE_AVX macro, perhaps.

Yes, definitely looking at that.  Any AVX enabled app should really have the AVX path and at least one viable alternative path/library, since AVX availability must be detected at runtime, and on Windows it is only supported under Win7 w/SP1 (or presumably the Win8 beta).
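
To make the detection order concrete, here is a minimal sketch of the decision logic in Python. It assumes the raw register values have already been read in native code (CPUID leaf 1 for ECX/EDX, XGETBV for XCR0), since Python cannot read them portably. The bit positions are the documented CPUID ones; the function name and the path labels are hypothetical, not anything from the actual apps.

```python
# CPUID leaf 1 feature bits (documented positions).
SSE2_EDX_BIT = 26
SSE3_ECX_BIT = 0
SSSE3_ECX_BIT = 9
OSXSAVE_ECX_BIT = 27
AVX_ECX_BIT = 28
# XCR0 bits 1 (SSE state) and 2 (AVX state) must both be enabled by the OS.
XCR0_SSE_AVX_MASK = 0x6


def has_bit(value, bit):
    """True if the given bit is set in a register value."""
    return bool((value >> bit) & 1)


def select_code_path(cpuid_ecx, cpuid_edx, xcr0):
    """Pick the best usable code path, falling back step by step.

    xcr0 is only meaningful when OSXSAVE is set; pass 0 otherwise.
    The returned labels are illustrative names for build variants.
    """
    cpu_has_avx = has_bit(cpuid_ecx, AVX_ECX_BIT)
    os_exposes_xgetbv = has_bit(cpuid_ecx, OSXSAVE_ECX_BIT)
    os_saves_avx_state = (xcr0 & XCR0_SSE_AVX_MASK) == XCR0_SSE_AVX_MASK
    if cpu_has_avx and os_exposes_xgetbv and os_saves_avx_state:
        return "avx"
    if has_bit(cpuid_ecx, SSSE3_ECX_BIT):
        return "ssse3"
    if has_bit(cpuid_ecx, SSE3_ECX_BIT):
        return "sse3"
    if has_bit(cpuid_edx, SSE2_EDX_BIT):
        return "sse2"
    return "generic"


# An AVX-capable chip on an OS that does not enable AVX state falls back to SSSE3.
avx_chip_ecx = (1 << AVX_ECX_BIT) | (1 << OSXSAVE_ECX_BIT) | (1 << SSSE3_ECX_BIT) | 1
print(select_code_path(avx_chip_ecx, 1 << SSE2_EDX_BIT, xcr0=0x3))
```

The point of checking XCR0 as well as the CPUID bit is that the chip can report AVX while the OS does not save the wider registers across context switches, in which case the AVX path must not be taken.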

So far I have managed to build an AVX enabled static fftw lib, both x86 & x64 (which will be useful at least for AP in particular), that uses its own internal detection. That build is only MSVS2010sp1 so far, so I will definitely need to bench an ICC built variant soon (with the bench supplied in the fftw project at least).  So I'll probably try linking it in to AKv8 as well as slotting in your V7 updates & splitting core functions into different SSE base versions.  I'm hopeful that once I get going I'll have that operational pretty quickly.

I haven't gotten around to test linking several SSE level static IPPs into the same build yet, but I looked at how to do it & it doesn't appear to be difficult.  What I'll probably do is a huge build with every kind of FFT available linked in at the same time, then switch variants with a command line parameter so the testers can tell us which library is faster for which chip  ( :D  let the testers work out the tricky mappings )
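
The switch-by-parameter idea can be sketched as below, in Python rather than the app's own code. Two pure-Python transforms stand in for the different linked-in FFT libraries, and the `--fft` and `--size` flags are invented for this example; nothing here reflects the real app's command line.

```python
import argparse
import cmath
import time


def dft_naive(samples):
    """O(n^2) reference transform, standing in for one FFT library."""
    n = len(samples)
    return [sum(samples[k] * cmath.exp(-2j * cmath.pi * j * k / n) for k in range(n))
            for j in range(n)]


def fft_radix2(samples):
    """Recursive radix-2 FFT (length must be a power of two), standing in for another."""
    n = len(samples)
    if n == 1:
        return list(samples)
    even = fft_radix2(samples[0::2])
    odd = fft_radix2(samples[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddled = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddled
        out[k + n // 2] = even[k] - twiddled
    return out


# Every variant is available in the same build; the flag only picks one.
FFT_VARIANTS = {"naive": dft_naive, "radix2": fft_radix2}


def run(argv):
    """Parse the (hypothetical) flags, run the chosen variant once, and time it."""
    parser = argparse.ArgumentParser(description="FFT variant bench sketch")
    parser.add_argument("--fft", choices=sorted(FFT_VARIANTS), default="radix2")
    parser.add_argument("--size", type=int, default=256)
    args = parser.parse_args(argv)
    data = [complex(i % 7, 0) for i in range(args.size)]
    start = time.perf_counter()
    result = FFT_VARIANTS[args.fft](data)
    elapsed = time.perf_counter() - start
    return args.fft, elapsed, result


name, elapsed, _ = run(["--fft", "naive", "--size", "64"])
print(f"{name}: {elapsed:.6f} s")
```

A tester would run the same work unit once per variant and report the times; once the mapping is known, the default for each chip can be hard-wired.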

Jason

Offline skildude

  • Knight o' The Round Table
  • ***
  • Posts: 168
Re: C-60 APU and Radeon HD6920
« Reply #28 on: 10 Jan 2012, 08:33:21 am »
It looks like PrimeGrid has an AVX app for their LLR projects for Linux and Apple, but currently none for Windows.

Offline cristipurdel

  • Knight o' The Realm
  • **
  • Posts: 123
Re: C-60 APU and Radeon HD6920
« Reply #29 on: 11 Jan 2012, 08:38:24 am »
Are these libraries already included in the AMD app?
http://developer.amd.com/libraries/appmathlibs/Pages/default.aspx

 
