Difference?
Josef W. Segur:
--- Quote from: sunu on 09 Jun 2011, 04:45:32 pm ---Yes, don't think about that stuff. Let's say we have x and y. Why sum(x) / sum(y) or avg(x) / avg(y) is different from avg(x / y)?
--- End quote ---
Methods 1 and 2 give more weight to long-running tasks. Take two tasks, one which runs in 6 hours and gives 100 credits, another which runs in 2 hours and gives 40 credits. The 6 hours of the first task make the 2 hours of the second task only 1/4 of the total time. So you get 140 credits / 8 hours = 17.5 credits/hour, which is closer to the 16.7 c/h of the first task than the 20 c/h of the second.
But method 3 gives equal weight to the tasks no matter how quickly or slowly they run. So you get 18.333 c/h.
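The figures above can be checked with a short sketch. The two tasks are the ones from the example; the variable names are just illustrative.

```python
# The two example tasks from the post: (credits, hours)
tasks = [(100.0, 6.0), (40.0, 2.0)]

credits = [c for c, _ in tasks]
hours = [h for _, h in tasks]
n = len(tasks)

method1 = sum(credits) / sum(hours)              # sum(x) / sum(y)
method2 = (sum(credits) / n) / (sum(hours) / n)  # avg(x) / avg(y), the n cancels
method3 = sum(c / h for c, h in tasks) / n       # avg(x / y)

print(f"{method1:.3f} {method2:.3f} {method3:.3f}")  # 17.500 17.500 18.333
```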
BOINC uses method 3 for its server-side averages, so a 100-hour task is weighted the same as a 1-minute task...
Joe
Jason G:
--- Quote from: sunu on 09 Jun 2011, 07:17:27 pm ---The last method now seems goofy but why is it right or wrong? And is the difference just a rounding error or avg (x / y) calculates something different?
--- End quote ---
--- Quote from: sunu on 09 Jun 2011, 04:45:32 pm ---Yes, don't think about that stuff. Let's say we have x and y. Why sum(x) / sum(y) or avg(x) / avg(y) is different from avg(x / y)?
--- End quote ---
--- Quote from: Josef W. Segur on 09 Jun 2011, 09:04:51 pm ---But method 3 gives equal weight to the tasks no matter how quickly or slowly they run. So you get 18.333 c/h.
--- End quote ---
That's right, they are different. Nothing is goofy (except maybe me): the order of operations matters, so it's a different calculation with or without precision issues.
#1: sum(x) / sum(y) simplifies to the same as #2 (multiply top and bottom by n/n),
#2: avg(x) / avg(y) is the ratio of two averages, which weights each x/y by its y, so tasks with large y dominate,
#3: avg(x / y) is the arithmetic mean of x/y, so likely the one you want,
but depending on what you want to achieve, if you want a more robust statistic you could possibly use the medians instead, or even truncated means to chuck out outliers.
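As a rough sketch of the robust alternatives mentioned above: the rates below are made-up numbers with one deliberate outlier, and `trimmed_mean` is a hypothetical helper written for this example, not a library function.

```python
from statistics import mean, median

def trimmed_mean(values, k):
    """Mean after dropping the k smallest and k largest values."""
    if 2 * k >= len(values):
        raise ValueError("k is too large for this many values")
    ordered = sorted(values)
    return mean(ordered[k:len(ordered) - k])

# Hypothetical credit/hour rates; the last one is a deliberate outlier.
rates = [16.0, 17.0, 18.0, 19.0, 200.0]

print(mean(rates))             # pulled up by the outlier
print(median(rates))           # middle value, unaffected by the outlier
print(trimmed_mean(rates, 1))  # mean of the middle three values
```

With these numbers the plain mean is 54, while the median and the trimmed mean both come out at 18.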
sunu:
Yes, "weight" seems the magic word here. After Josef's post I looked at various weighted means but still avg(x / y) doesn't look anything like them.
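One way to see the connection to weighted means, sketched with the two example tasks from earlier in the thread: avg(x / y) is the unweighted mean of the ratios, while sum(x) / sum(y) is the mean of the same ratios weighted by y.

```python
x = [100.0, 40.0]  # credits
y = [6.0, 2.0]     # hours
ratios = [xi / yi for xi, yi in zip(x, y)]

# Method 3: every ratio counts equally.
unweighted = sum(ratios) / len(ratios)

# Methods 1 and 2: each ratio is weighted by its y (run time).
# sum(y_i * (x_i / y_i)) / sum(y_i) collapses to sum(x) / sum(y).
weighted_by_y = sum(yi * ri for yi, ri in zip(y, ratios)) / sum(y)

print(unweighted, weighted_by_y, sum(x) / sum(y))
```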
--- Quote from: Jason G on 10 Jun 2011, 01:54:25 am ---but depending on what you want to achieve, if you want a more robust statistic you could possibly use the medians instead, or even truncated means to chuck out outliers.
--- End quote ---
I just wanted to calculate the credit / sec output of my machine broken down to CPU, GPU, AP, MB etc. :)
As for the problem with the car above, the answer isn't as simple as I thought http://en.wikipedia.org/wiki/Harmonic_mean#In_physics
Well, I guess we need a professional statistician :D
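For what it's worth, the harmonic-mean case from the linked article can be sketched as below. The speeds are illustrative numbers, not the ones from the car problem, and the sketch assumes both legs cover the same distance.

```python
from statistics import harmonic_mean

# Illustrative speeds for two legs of equal distance (km/h).
speeds = [60.0, 20.0]
distance = 60.0  # km per leg; any value works as long as both legs match

total_time = sum(distance / s for s in speeds)          # hours
average_speed = (distance * len(speeds)) / total_time   # total distance / total time

print(average_speed, harmonic_mean(speeds))
```

Both values come out at 30 km/h, not the 40 km/h that the arithmetic mean of the speeds would suggest. It is the same effect as with the tasks: total distance over total time gives more weight to the slow leg because it takes longer.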
Jason G:
--- Quote from: sunu on 10 Jun 2011, 06:50:30 am ---As for the problem with the car above, the answer isn't as simple as I thought http://en.wikipedia.org/wiki/Harmonic_mean#In_physics
Well, I guess we need a professional statistician :D
--- End quote ---
Hahaha, yep. Don't know about Joe, but my statistics is certainly rusty. If you intend to process a lot of results, keep the golden rules of floating point in mind as well, since anything that compounds tiny errors in unexpected ways will change the result too.
Jason
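A small sketch of the kind of compounding error meant here, using a plain accumulation loop next to Python's `math.fsum`, which tracks the lost low-order bits. The ten values of 0.1 are only an illustration.

```python
import math

values = [0.1] * 10

# Plain left-to-right accumulation: each addition rounds, and the errors add up.
naive = 0.0
for v in values:
    naive += v

# math.fsum returns the correctly rounded sum of the stored values.
accurate = math.fsum(values)

print(naive, accurate)
```

Here the loop ends up slightly below 1.0 while `math.fsum` returns exactly 1.0. Over many thousands of results the same drift can become visible in an average.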
Miep:
I do plain linear regression, mainly to prove that credit new is not linear ;D
0.188 credit/second on beta with some flavour of x37.