Author Topic: Single vs Dual memory channel - no effect ?  (Read 14843 times)

Offline Raistmer

  • Working Code Wizard
  • Volunteer Developer
  • Knight who says 'Ni!'
  • *****
  • Posts: 14349
Single vs Dual memory channel - no effect ?
« on: 14 Nov 2009, 06:34:45 pm »
I tested a quad-core fully loaded with CPU AP tasks, with a single memory module installed (2 GB) vs. a dual-channel config (4 GB).
There seems to be almost no difference. AstroPulse should be very L2-cache hungry, so cache misses (and the resulting memory-access delays) should be quite frequent when all cores run AstroPulse. Nevertheless, I see no benefit from the dual-channel config...

Any thoughts?


Offline Richard Haselgrove

  • Messenger Pigeon
  • Knight who says 'Ni!'
  • *****
  • Posts: 2819
Re: Single vs Dual memory channel - no effect ?
« Reply #1 on: 14 Nov 2009, 07:37:36 pm »
Try it again with CPU MB VHAR - then I think you'll see a difference!

I think all you've proved is that AP does too little memory transfer to saturate even a single-channel bus.
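One rough way to check how much headroom the memory bus has is to time a large block copy with one and then two modules installed. The sketch below is only an illustration, not something used in the tests above: the function name `copy_bandwidth_gb_s` and its parameters are made up here, and a single-threaded copy from Python will usually understate what the bus can really deliver.

```python
import time


def copy_bandwidth_gb_s(size_mb=256, repeats=5):
    """Best-of-N estimate of memory copy bandwidth in GB/s (hypothetical helper)."""
    src = bytearray(size_mb * 1024 * 1024)
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        dst = bytes(src)  # one full read of src plus one full write of dst
        elapsed = time.perf_counter() - start
        best = min(best, elapsed)
        del dst
    # Count bytes read plus bytes written; guard against a zero timer reading.
    moved = 2 * len(src)
    return moved / max(best, 1e-9) / 1e9


if __name__ == "__main__":
    print(f"approx. copy bandwidth: {copy_bandwidth_gb_s():.2f} GB/s")
```

If the figure barely moves between single and dual channel, the test itself is not bus-bound and says little; if it moves a lot while AP run times do not, that would fit the idea that AP simply does not push the bus hard.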

Offline _heinz

  • Volunteer Developer
  • Knight who says 'Ni!'
  • *****
  • Posts: 2117
Re: Single vs Dual memory channel - no effect ?
« Reply #2 on: 14 Nov 2009, 08:23:31 pm »
Hi Raistmer,
keep in mind how many slots you fill (two or all four) with single- or dual-mode memory.
Remember my Xeon: at first I filled only two slots and got 6000; after filling all 4 slots I got 12000 throughput.
With 2 modules the quad-channel still works only as dual-channel.


Offline Raistmer

  • Working Code Wizard
  • Volunteer Developer
  • Knight who says 'Ni!'
  • *****
  • Posts: 14349
Re: Single vs Dual memory channel - no effect ?
« Reply #3 on: 14 Nov 2009, 08:34:27 pm »
I have a dual-channel motherboard AFAIK, not quad-channel.
Gigabyte GA-Q35M-S2

Offline Jason G

  • Construction Fraggle
  • Knight who says 'Ni!'
  • *****
  • Posts: 8980
Re: Single vs Dual memory channel - no effect ?
« Reply #4 on: 15 Nov 2009, 12:29:00 am »
Another good test would be to see if relaxing the memory timings completes the picture (lengthening the elapsed time or not).

IMO, what makes our 45 nm chips 'difficult' is the intelligent prefetchers. If the TLB entries for the dataset are still resident in the page table, then memory fetches for single or dual channel will be bound by the prefetcher speed itself, rather than by the memory subsystem directly.

For our 45 nm Core2s we really have an obscenely large cache relative to the dataset size, and are accessing fairly linear, medium-sized datasets, so really we should be 'mostly' L2 bound for a single task. The dual channel is presumably interleaved, so it should halve the effective latency into L2. To my mind, the improved caching on our chips accounts for the large difference from earlier Intel implementations. It's the hyperthreaded P4s & other smaller-cache designs where thrashing becomes a problem.

So I think you've proved the application matches our chips better than older ones (Dual channel or not).

A good indicator for latency (as opposed to bandwidth) is elapsed time variation, since you can get 'lucky' some of the time:
Dual channel, while offering enhanced prefetchability, unfortunately doesn't add more 'read/write ports' to the CPU cores. The min-to-max elapsed variation for single channel I make to be 3079 - 3044.5 seconds = 34.5 seconds (a bit over 1%; doesn't sound like much, does it?)

But,
Contrast this with the dual channel elapsed variation:
3052.5 - 3031 = 21.5 seconds

So over one third of the elapsed time variation has been removed ... Still doesn't sound like much, *except* that this can mean a big difference over a long run.

That IMO, is where the Dual channel benefit is supported by your numbers, in reducing the 'Worst case' scenario, not the 'best case' (or even the Average case much)
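For anyone wanting to redo the arithmetic, here is a small sketch using only the four min/max elapsed times quoted in this post. The variable names are just for illustration.

```python
# Min and max elapsed times (seconds) quoted in the post above.
single_min, single_max = 3044.5, 3079.0
dual_min, dual_max = 3031.0, 3052.5

# Min-to-max spread for each memory configuration.
single_spread = single_max - single_min  # 34.5 s
dual_spread = dual_max - dual_min        # 21.5 s

# Spread relative to the fastest single-channel run ("a bit over 1%").
relative_single = single_spread / single_min

# Fraction of the single-channel spread removed by going dual channel.
reduction = (single_spread - dual_spread) / single_spread

print(f"single-channel spread: {single_spread} s ({relative_single:.2%} of best run)")
print(f"dual-channel spread:   {dual_spread} s")
print(f"spread removed by dual channel: {reduction:.1%}")
```

The last figure comes out at a little over a third, which is the 'worst case' improvement described above; the best-case times differ by far less.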
« Last Edit: 15 Nov 2009, 12:54:21 am by Jason G »

Offline Raistmer

  • Working Code Wizard
  • Volunteer Developer
  • Knight who says 'Ni!'
  • *****
  • Posts: 14349
Re: Single vs Dual memory channel - no effect ?
« Reply #5 on: 15 Nov 2009, 03:59:00 am »
Quote
That IMO, is where the Dual channel benefit is supported by your numbers, in reducing the 'Worst case' scenario
Unfortunately, it seems there is another factor that can cause slowdown.
[That is, I think I just haven't seen the worst case for dual channel yet, because I rebooted the OS recently]
I've seen speed degradation with the hybrid build when the system has been up for a long time. There the effect is pretty big. But it seems to be present for the CPU-only app too.
So I would not trust timings that were received without rebooting the OS (or would count them separately from "just rebooted" ones).
Why exactly I see such speed degradation is another (and pretty important) question...
[ ADDON: Windows memory pool becomes fragmented??? ]
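Counting "just rebooted" timings separately from long-uptime ones could be done with something like the sketch below. It is only an illustration: the function name `summarize_by_condition` and the condition labels are hypothetical, and no real timings are included.

```python
from collections import defaultdict
from statistics import mean


def summarize_by_condition(runs):
    """Group (condition_label, elapsed_seconds) pairs and summarize each group.

    The labels are whatever the tester chooses, e.g. "fresh boot" and
    "long uptime" (hypothetical names, not from any real log).
    """
    groups = defaultdict(list)
    for label, elapsed in runs:
        groups[label].append(elapsed)
    return {
        label: {
            "n": len(values),
            "min": min(values),
            "max": max(values),
            "mean": mean(values),
            "spread": max(values) - min(values),
        }
        for label, values in groups.items()
    }
```

Feeding it pairs such as `("fresh boot", elapsed)` and `("long uptime", elapsed)` keeps the two populations apart, so an uptime-related slowdown does not get mistaken for a memory-channel effect.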

And yes, it seems CPU AP is not very sensitive to memory speed directly (on my CPU). I'm trying to study which system components have a big impact on SETI performance and which do not.
For now it seems (at least for AP) that one can save money and buy ordinary memory, not the best possible, provided one manages to get a CPU with a really big cache :)
Today I will complete the experiment proposed by Richard and post new data.
« Last Edit: 15 Nov 2009, 04:38:34 am by Raistmer »

Offline Jason G

  • Construction Fraggle
  • Knight who says 'Ni!'
  • *****
  • Posts: 8980
Re: Single vs Dual memory channel - no effect ?
« Reply #6 on: 15 Nov 2009, 05:08:33 am »
...
Why exactly I see such speed degradation - it's another (and pretty important) question...
[ ADDON: windows memory pool becomes fragmented??? ]
...
Hmmm, yes, I don't really see it with the CPU app & long uptime, so will give it some thought. Yes, I think the aggressive memory management of Windows might be related (the VMM gets less aggressive with successive Windows versions), so heap management accumulating stale crap could be an issue, which is why I was originally trying to bypass a layer by avoiding the CRT & using the VMM directly instead.

Just a theory: TLB misses for code (rather than data), caused by accumulated entries, could be exacerbating front-end stalls, which would indeed IMO get worse after longer uptime, since later Windows versions page out less aggressively and so accumulate more stale entries for rarely accessed services & drivers etc. Over Christmas, I will be setting up some special performance counters to see if we're getting certain kinds of stalls in the decoding that are prone to happen on Core2. If not, then I will keep looking for the culprit. If front-end stalls are a problem (OS/uptime induced or otherwise), then I might need to do a bit more reading on how to rectify the situation. Not sure what part new OS features like SuperFetch might play, but I will have a chance to look in a couple of weeks, when migrating to Win7.

Jason

Offline Raistmer

  • Working Code Wizard
  • Volunteer Developer
  • Knight who says 'Ni!'
  • *****
  • Posts: 14349
Re: Single vs Dual memory channel - no effect ?
« Reply #7 on: 15 Nov 2009, 06:35:37 am »
In my current config swap is enabled ( :o usually I disable it completely if the host has enough memory) and SuperFetch is stopped (set to manual instead of automatic).

 
