Author Topic: Benchmarking Older p4  (Read 9452 times)

Offline Jason G

  • Construction Fraggle
  • Knight who says 'Ni!'
  • *****
  • Posts: 8980
Benchmarking Older p4
« on: 19 Apr 2007, 11:37:14 am »
Hi,
    I have been running the p4-sse2 2.2B app since you released it, but decided to finally try the benchmark tool.  I was surprised to find the times for a long test showed the SSE2-generic app ran the benchmark much quicker.

My questions are:
  Is this an indication that the p4-sse2 version is meant for newer P4s with a larger L2 cache? (Mine only has 512k.)
  Is the difference in the long test (below) a realistic indication of the difference I would get by moving to the generic sse2 version?

Thanks in advance :D

Code:
Application started. Click "Test science apps!" above.
Starting tests. This will take a few minutes, please be patient!

Testing KWSN_2.2B_SSE3-C2_Ben-Joe.exe...does not work on your system.

Testing KWSN_2.2B_SSE3-P4_Ben-Joe.exe...does not work on your system.

Testing KWSN_2.2B_SSE2-P4_Ben-Joe.exe...ran for  776 seconds.
Validating...Result      : strongly similar.   

Testing KWSN_2.2B_SSE2-PM_Ben-Joe.exe...ran for  794 seconds.
Validating...Result      : strongly similar.   

Testing KWSN_2.2B_SSE2-generic_Ben-Joe.exe...ran for  739 seconds.
Skipping other apps - SSE2 is quicker than SSE if supported.
Validating...Result      : strongly similar.   

Finished with test run!
Restarted BOINC service.
« Last Edit: 19 Apr 2007, 11:41:31 am by j_groothu »
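
For anyone wanting to put numbers on the gap, here is a minimal sketch that parses the timing lines from the output above and expresses each run relative to the SSE2-P4 build. It assumes the benchmark tool always prints timings in the "Testing <app>.exe...ran for N seconds." form seen in this log; the function names are made up for illustration and are not part of the tool.

```python
import re

# Benchmark tool output copied from the post above (timing lines only).
LOG = """\
Testing KWSN_2.2B_SSE3-C2_Ben-Joe.exe...does not work on your system.
Testing KWSN_2.2B_SSE3-P4_Ben-Joe.exe...does not work on your system.
Testing KWSN_2.2B_SSE2-P4_Ben-Joe.exe...ran for  776 seconds.
Testing KWSN_2.2B_SSE2-PM_Ben-Joe.exe...ran for  794 seconds.
Testing KWSN_2.2B_SSE2-generic_Ben-Joe.exe...ran for  739 seconds.
"""

# Assumed line format: "Testing <app>.exe...ran for  <N> seconds."
TIMING = re.compile(r"^Testing (\S+?\.exe)\.\.\.ran for\s+(\d+) seconds\.", re.M)


def parse_timings(log):
    """Return {app_name: seconds} for every app that completed a run."""
    return {app: int(secs) for app, secs in TIMING.findall(log)}


def relative_to(timings, baseline):
    """Return {app: percent change in run time versus the baseline app}."""
    base = timings[baseline]
    return {app: 100.0 * (secs - base) / base for app, secs in timings.items()}


timings = parse_timings(LOG)
fastest = min(timings, key=timings.get)
deltas = relative_to(timings, "KWSN_2.2B_SSE2-P4_Ben-Joe.exe")

# Print the apps from fastest to slowest with their gap to the SSE2-P4 build.
for app, secs in sorted(timings.items(), key=lambda kv: kv[1]):
    print(f"{app}: {secs} s ({deltas[app]:+.1f}% vs SSE2-P4)")
```

On these figures the generic SSE2 build comes out a little under 5% faster than the SSE2-P4 build (739 s against 776 s). That is from a single long test run, so treat it as a rough guide only.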

Offline Crunch3r

  • Knight who says 'Ni!'
  • *****
  • Posts: 602
    • 64 bit boinc clients
Re: Benchmarking Older p4
« Reply #1 on: 19 Apr 2007, 11:46:45 am »
Hi,

If your P4 has a "Willamette" core then the answer is yes. The generic sse2 app is built for those and for AMD CPUs.

The p4-sse2 apps are built for the later cores like Northwood (compiler switch -xN).

I suggest going for the generic sse2 app in your case.

HTH

I want to share something with you: The three little sentences that will get you through life. Number 1: Cover for me. Number 2: Oh, good idea, Boss! Number 3: It was like that when I got here.

Homer Simpson

Offline Jason G

  • Construction Fraggle
  • Knight who says 'Ni!'
  • *****
  • Posts: 8980
Re: Benchmarking Older p4
« Reply #2 on: 19 Apr 2007, 11:49:29 am »
Actually it's a Northwood 2.0A, but I think your point still applies to this one because of the small cache. Thanks!

olomlufi

  • Guest
Re: Benchmarking Older p4
« Reply #3 on: 19 Apr 2007, 03:19:12 pm »
Please take a look at THESE:

http://www.digit-life.com/articles2/cpu/insidespeccpu2000-part-r.html
http://www.digit-life.com/articles2/cpu/insidespeccpu2000-part-n.html


The benchmark numbers are rather chaotic sometimes. Consider the Opteron 875 SPECfp test: overall /QxW is rather slow, but in some tests it leads by a lot and in others it's horrible. It's very good in 179.art, 188.ammp and 189.lucas and smokes the other opts in 183.equake, while it sucks in others. Of course, looking at the graphs, the same could be said of most other opts, and even no opt wins sometimes.
By the way, just look at my earlier post. I came to the same conclusion, for the test WUs at least: generic SSE2 (QxW) ran fastest on my Northwoods, both normal and Celeron, and even on the Core2Duo I had on my hands. (However, I didn't test QxT.)

I wonder if it would be possible to find the best opts for each sub-unit and CPU and compile them with different opts before linking.

The belief that the higher the level of opt the better (no opt, xK, xW, xN, xB, xP, xT, in that order) is flawed. It seems to me they are NOT just instruction sets but other architectural optimizations too, like unroll thresholds, instruction scheduling, and cache and prefetch considerations, and the complexity of all those makes the speed change of some applications highly chaotic.

Looking at WU crunch times, I can confirm with small uncertainty that the best for PM / A64 is QxB (Pentium M build), and for Northwood and Core2 the QxW (Generic SSE2) build.

Is there any chance of making such a build with mixed opts, or even one with PGO? It won't let me use it...

olomlufi

  • Guest
Re: Benchmarking Older p4
« Reply #4 on: 19 Apr 2007, 03:26:59 pm »
Oh, I forgot: the opts seem to stand for CPU codenames.

K - Katmai (First P3, the Slot 1 one)
W - Willamette (Socket 423 P4 0.18 micron)
N - Northwood
P - Prescott
T - Tejas???? (The presumed successor of Prescott that got cancelled because it was more suitable for BBQ than computing. They probably forgot to change the opt letter to C for Conroe when 9.1 came out. Just my theory, do not take it seriously.)
Also, I read somewhere that /QxW does not use floating-point SSE2 instructions, only integer ones. Can someone confirm that?
