optimized sources
_heinz:
--- Quote from: Jason G on 28 Nov 2008, 12:30:43 pm ---Ahhh, 6 meg per package ( 1.5 meg per core )... Okay, yep it is 12 meg total for the 8 cores.
Compared 32 bit ICC 10.1 / TBB 2.0 build of fibonacci, and it IS slower than Parallel Composer 32 bit build under XP64 ... Will have to try that build under XP32 to confirm though. I will probably update all my ICC/IPP base packages as soon as I get time, in a few weeks.
Jason
--- End quote ---
12 MB per chip:
BX80574E5405A (active cooler, or for 1U systems), 45 nm, E5405, 2.00 GHz (80 W), 1333 MHz FSB, 12 MB total
We have 2 processors, so we have 24 MB for 8 cores.
Jason G:
Err, well, CPU-Z shows only per core then? In any case:
Hmm, not a lot of Fibonacci difference here, but some (the fastest thread count was 2):
Built under XP32 with ICC 10.1 + TBB (run on XP32):
--- Quote ---Threads number is 2
Shared serial (mutex) - in 0.286294 msec
Shared serial (spin_mutex) - in 0.196978 msec
Shared serial (queuing_mutex) - in 0.301214 msec
Shared serial (Conc.HashTable) - in 4.313505 msec
Parallel while+for/queue - in 1.485761 msec
Parallel pipe/queue - in 1.980293 msec
Parallel reduce - in 0.523162 msec
Parallel scan - in 0.338611 msec
Parallel tasks - in 0.566134 msec
--- End quote ---
And built under XP64 with Parallel Composer Beta Update 2 + TBB 2.0 (but also run on XP32):
--- Quote ---Threads number is 2
Shared serial (mutex) - in 0.279819 msec
Shared serial (spin_mutex) - in 0.208223 msec
Shared serial (queuing_mutex) - in 0.284642 msec
Shared serial (Conc.HashTable) - in 4.461598 msec
Parallel while+for/queue - in 1.718736 msec
Parallel pipe/queue - in 2.188073 msec
Parallel reduce - in 0.571781 msec
Parallel scan - in 0.357319 msec
Parallel tasks - in 0.534837 msec
--- End quote ---
So some things look a bit slower, but I will carefully consider shifting to ICC 11 soon, and check how our projects of interest compare.
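To put rough numbers on "a bit slower", here is a minimal sketch that computes the percentage change per variant from the two 2-thread tables quoted above. The names `Row`, `kRuns100`, `percent_change` and `count_slower` are made up for illustration and are not part of the TBB Fibonacci example; the times are copied straight from the quoted output.

```cpp
#include <cassert>
#include <cstddef>

// One row of the quoted 2-thread output (times in msec).
struct Row {
    const char* name;
    double icc101;    // built with ICC 10.1 + TBB
    double composer;  // built with Parallel Composer Beta Update 2 + TBB 2.0
};

// Values copied from the two quoted runs (default of 100 numbers).
const Row kRuns100[] = {
    {"mutex",           0.286294, 0.279819},
    {"spin_mutex",      0.196978, 0.208223},
    {"queuing_mutex",   0.301214, 0.284642},
    {"Conc.HashTable",  4.313505, 4.461598},
    {"while+for/queue", 1.485761, 1.718736},
    {"pipe/queue",      1.980293, 2.188073},
    {"reduce",          0.523162, 0.571781},
    {"scan",            0.338611, 0.357319},
    {"tasks",           0.566134, 0.534837},
};
const std::size_t kRowCount = sizeof(kRuns100) / sizeof(kRuns100[0]);

// Percentage by which the Composer build is slower (+) or faster (-)
// than the ICC 10.1 build for one variant.
double percent_change(const Row& r) {
    assert(r.icc101 > 0.0);
    return (r.composer - r.icc101) / r.icc101 * 100.0;
}

// Number of variants where the Composer build took longer.
int count_slower() {
    int n = 0;
    for (std::size_t i = 0; i < kRowCount; ++i)
        if (kRuns100[i].composer > kRuns100[i].icc101) ++n;
    return n;
}
```

On these figures the Composer build is slower in six of the nine variants, with the largest gap (roughly 16%) in while+for/queue. At sub-millisecond timings a single run each way may well be within noise, so treat this as a reading aid rather than a verdict.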
_heinz:
How many numbers did you let it generate? 1000?
Jason G:
No, just used default which was 100... will try 1000
[Later:] Fastest 32-bit run, built on XP32 with ICC 10.1 / TBB 2.0, is now at 3 threads :o
--- Quote ---Threads number is 3
Shared serial (mutex) - in 162.014407 msec
Shared serial (spin_mutex) - in 11.609819 msec
Shared serial (queuing_mutex) - in 50.960339 msec
Shared serial (Conc.HashTable) - in 401.327768 msec
Parallel while+for/queue - in 93.399315 msec
Parallel pipe/queue - in 164.994829 msec
Parallel reduce - in 27.500117 msec
Parallel scan - in 22.918168 msec
Parallel tasks - in 25.904447 msec
--- End quote ---
Getting Parallel Composer build data:
--- Quote ---Threads number is 3
Shared serial (mutex) - in 76.449678 msec
Shared serial (spin_mutex) - in 13.449323 msec
Shared serial (queuing_mutex) - in 50.961819 msec
Shared serial (Conc.HashTable) - in 413.186277 msec
Parallel while+for/queue - in 93.995606 msec
Parallel pipe/queue - in 171.541281 msec
Parallel reduce - in 28.647254 msec
Parallel scan - in 27.231642 msec
Parallel tasks - in 24.389762 msec
--- End quote ---
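One way to read the two 3-thread tables above is to pick the fastest variant in each build and express the others as a multiple of it. The sketch below does that with the quoted times; `kNames`, `kIcc101`, `kComposer`, `fastest_index` and `slowdown_vs_fastest` are hypothetical names chosen for this illustration, not anything from the TBB example itself.

```cpp
#include <cassert>
#include <cstddef>

const std::size_t kVariants = 9;

const char* const kNames[kVariants] = {
    "mutex", "spin_mutex", "queuing_mutex", "Conc.HashTable",
    "while+for/queue", "pipe/queue", "reduce", "scan", "tasks",
};

// Times in msec, copied from the two quoted 3-thread runs (1000 numbers).
const double kIcc101[kVariants] = {
    162.014407, 11.609819, 50.960339, 401.327768, 93.399315,
    164.994829, 27.500117, 22.918168, 25.904447,
};
const double kComposer[kVariants] = {
    76.449678, 13.449323, 50.961819, 413.186277, 93.995606,
    171.541281, 28.647254, 27.231642, 24.389762,
};

// Index of the smallest time in one run.
std::size_t fastest_index(const double (&t)[kVariants]) {
    std::size_t best = 0;
    for (std::size_t i = 1; i < kVariants; ++i)
        if (t[i] < t[best]) best = i;
    return best;
}

// How many times slower variant i is than the fastest variant of the same run.
double slowdown_vs_fastest(const double (&t)[kVariants], std::size_t i) {
    assert(i < kVariants);
    return t[i] / t[fastest_index(t)];
}
```

With these numbers `spin_mutex` is the fastest variant in both builds. The plain `mutex` run is about 14 times slower than it in the ICC 10.1 build and a little under 6 times slower in the Parallel Composer build, which is the one place where the two builds differ by a wide margin.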
_heinz:
--- Quote from: Jason G on 28 Nov 2008, 01:00:42 pm ---No, just used default which was 100... will try 1000
[Later:] Fastest 32 bit run built on XP32 ICC10.1 / TBB2.0 now 3 threads :o:
--- Quote ---Threads number is 3
--- End quote ---
--- End quote ---
Now you know why I chose 5: an odd number.
We can create any number of threads (1, 2, 3, 4 ... 128, 256, 512, etc.), odd numbers too.
And we can use /QxHOST: best performance, using the latest features of the processor supported by the compilation host.
::)
heinz