JANUARY
2001
ARCHIVES

 

January 28, 2001       v3.03+ results latest: 28 January 2001 (9)

A thought provoking last few days...
One of the greatest concerns with benchmarking is relevance: does the result allow comparison of the relative performance between systems? Hence the eventual change to a 0.417 angle range WU just before Christmas. For those who don't know, the unit was actually submitted and tested by Beyond from a large pool he accumulated specifically for the purpose. He contributed greatly to the discussion on whether to change the WU and it seemed a small honour to allow him to choose it (with a few criteria that I felt were warranted for inclusion in a benchmark WU). Though it is possible to construct arguments over its absolute validity (I'd say nit-pick), I think it was a good choice and the uptake of benchmarking with it has been above my expectations - I wrongly believed that the extra time involved (a combination of the greater processing incurred by v3.03 and having to use a slower WU - lower angle ranges process significantly slower than higher ones) would deter a lot of you.

A little benchmark history…
The next greatest concern was security/falsification of results. In my time with RB compiling the benchmarks, there were many weird and wonderful results submitted. But, you may be surprised to learn, none that could be labelled 'malicious' or intentionally trying to misrepresent the speed a system processed WU's at. Almost all 'dodgy' submissions were correctible errors that a conversation with the owner sorted out. I felt quite confident that the numbers put up on the results tables were to the best of our abilities accurate and honest. One up for the SETI/TLC community I felt. I think it is safe to say that by far the majority of TLC members put processing accuracy before processing speed. Even those who appear to froth eagerly when trying to achieve a few seconds gain would be quite upset if they discovered their systems were returning corrupt data. They would alter it to a more stable, SETI-useful configuration (we are talking reducing the clock speed mainly but there are other factors) if they knew there was a problem.
As processing time depended almost exclusively on the angle range of a WU there was some merit in the idea of allowing people to submit any time as long as it utilised a 0.417 WU. This was vetoed as no easy checks could be made on the submission's accuracy. With the introduction of the new WU benchmark it was possible to tighten up on the security of the numbers you submitted by asking for the result.sah (a file generated by the SETI client containing everything that Berkeley wanted to know about you and the data you had extracted from the WU). At the time of asking I didn't appreciate how important this would become, initially only thinking of it as a useful double-check. I could scan the header info for the CPU time, OS and client version and even whether the right WU had been used to run the benchmark. Even more, you could examine the spike/pulse/triplet details for anomalies.

Unsettling news...
The 3.03 benchmarks started to trickle in and a few results were obviously in question because they contained extra spikes or the values were ridiculous. About this time along came Roelof and the fun started - not content with my slow old ways he cobbled together some code to check the result.sah far more thoroughly than a mere eyeballing could achieve. Boy! Did the results ever give us a shock. Many of them produced on highly-overclocked processors contained errors. If you overclock there’s a good chance you fall into this category. Many of you OC to the point where it locks and then reduce it a few MHz believing it to be now ‘stable’. The debate about whether overclocking and its ramifications were acceptable in a scientific enterprise suddenly loomed up. It seemed that although machines completed the benchmark in seemingly reasonable times the results showed that errors aplenty had been generated in the result.sah file! So actually reaching 100% completion of a WU is not a satisfactory measure of your system's reliability. You cannot be sure that your machine is producing kosher results just because it completes WU's at a close to average time! Just because you can play games, burn CD's and run the SETI client concurrently does not give any guarantee your system is error free.

Criteria for accepting bench results.
There have been several 'amusing' reports on alt.sci.seti and the TLC forum of people who have had whole strings (hundreds even) of 2 or 3 minute completions, great for stats but an obvious anomaly and a clear indication that some component in your box was F.U.B.A.R. But now we know that entirely acceptable looking systems are also capable of creating bad results. Just altering the overclock by a few MHz can push your system into uncharted, error-producing territory and you will never know. As far as the results table was concerned we both agreed that if the result.sah contained any errors it would not be included as a genuine bench. This obviously set a clear new standard for benchmarking. Accurate results became the instant, absolute priority. This new knowledge is a bit disturbing as it makes me wonder fundamentally about the TLC benchmarks for earlier client versions. Good faith notwithstanding, many could be erroneous. History, I guess.

Problems in paradise...
Ignoring the small matter of TLC benchmarks, there are some serious implications here... Berkeley collects all your returned sahs and compares them for each WU to cross-check and thereby authenticate the data (multiple duplication being an intentional and necessary statistical validation) - as soon as two results (or three or whatever the requirement is) for a WU are identical then that becomes the ‘result’ and anything different can be discarded as in error. They have a massive database of results all referenced by a user id number. A little analysis would immediately show up which users were submitting duff results and which were reliable. Hard to believe that they haven't already done data sifting along these lines! How much do they know about corrupt results? And have they decided to let people keep on downloading WU's to avoid bad publicity about the link between 'competition' (driven in part by overclockers) and wasted effort? Very recently the sad, disturbing events concerning the ‘gti’ hacked client being used have surfaced. Its initial use by a small caucus of people has led to a number of names high up the SETI top users page being deleted or modified to ‘waiting’ (6th, 9th, 11th, 16th and others at time of writing). I can only assume that the Berkeley crowd are furiously sifting through the many hundreds of thousands (perhaps topping a million) of results they submitted to discover the extent of the damage. Remember that if the only results returned for a WU came from the hacked client then in effect the WU was not processed at all. A monster cross-referencing job is being done right now to locate unprocessed or invalidated WU’s. Since a significant percentage of WU’s legitimately do not return any data in the result.sah in the first place, the hacked client's blank results are as bad as false data. A superficially embarrassing and unpleasant prank has actually fundamentally compromised the SETI project's results.
The whole distributed arm of the project has become a multi-headed monster that though not beyond control is certainly not firmly on the rails. It will be sorted out but the damage is done.
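
The cross-check described above (accept a WU once two returned results are identical, discard anything different, then see which user ids keep turning up among the discards) can be sketched roughly as follows. This is only an illustration of the idea, not Berkeley's actual code: the function names, the quorum of two and the way a result is represented as a plain string are all assumptions made for the example.

```python
from collections import Counter


def validate_work_unit(results, quorum=2):
    """Pick the agreed result for one WU.

    results: list of (user_id, payload) pairs, where payload stands in for
    the contents of a returned result (a hypothetical representation).
    Returns (accepted_payload, rejected_user_ids); accepted_payload is None
    while fewer than `quorum` identical results have come back.
    """
    counts = Counter(payload for _, payload in results)
    if not counts:
        return None, []
    payload, matches = counts.most_common(1)[0]
    if matches < quorum:
        return None, []
    rejected = [user for user, p in results if p != payload]
    return payload, rejected


def tally_rejections(work_units, quorum=2):
    """Count, per user id, how many returned results were discarded."""
    tally = Counter()
    for results in work_units:
        _, rejected = validate_work_unit(results, quorum)
        tally.update(rejected)
    return tally
```

Fed a WU with two matching results and one odd one out, `validate_work_unit` returns the matching payload and the id of the odd one out; running `tally_rejections` over many WUs gives the kind of 'duff results by user' list speculated about above.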

What this means to us…
Are your overclocked results of any value at all? Are your normally clocked results of any value? Only one way to find out at present... we now have a reliable test of stability: run the benchmark and let Roelof check the result with his latest software. The numbers involved in the result.sah data have lots of decimal places. If you produce identical results you can be confident that for SETI purposes your under/normal/over-clocked system is running clean, smooth and producing valid data. Roelof will reply to your result submission with a short email letting you know whether your benchmark was flawed or okay. Eventually the acceptable ones will be added to those already in the table. Even if you were not thinking of submitting a benchmark it now becomes the best tool for validating your system for SETI work.
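
For the curious, the sort of comparison involved can be sketched like this. It is emphatically not Roelof's software, and the file layout is assumed purely for illustration: the sketch supposes that each detected signal sits on its own line beginning with a word such as `spike`, `pulse` or `triplet`, and it ignores everything else (header items like CPU time legitimately differ between machines). The point is simply that the signal data must match a known-good result digit for digit.

```python
# Hypothetical line prefixes; the real result.sah layout is not reproduced here.
SIGNAL_PREFIXES = ("spike", "pulse", "triplet")


def signal_lines(text):
    """Return the stripped lines that (by assumption) describe detected signals."""
    lines = (line.strip() for line in text.splitlines())
    return [line for line in lines if line.lower().startswith(SIGNAL_PREFIXES)]


def compare_results(reference_text, candidate_text):
    """List the differences between a known-good result and a submitted one.

    An empty list means every signal line matched exactly, decimals and all.
    """
    ref = signal_lines(reference_text)
    cand = signal_lines(candidate_text)
    problems = []
    if len(ref) != len(cand):
        problems.append(f"expected {len(ref)} signal lines, found {len(cand)}")
    for number, (r, c) in enumerate(zip(ref, cand), start=1):
        if r != c:
            problems.append(f"signal line {number} differs: {r!r} vs {c!r}")
    return problems
```

A submission with an extra spike, or with a single digit out of place in any value, comes back with a non-empty list of problems and would not make the table.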

Final thoughts
The average number of resendings of WU’s is probably far higher than expected – due to corrupt result.sahs it might take several users to process a WU before two results were the same. Is there a list of users whose results are ignored routinely due to regular 'unrepeatable' returns? A significant number of results come from people like us who pride ourselves on running kit that is fast but stable (because we think we know how to do it properly) - are we deluding ourselves, are many of our overclocked boxes producing junk?
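
To put a rough number on the resending worry, here is a back-of-envelope sketch. It rests on simplifying assumptions that are mine, not Berkeley's: every returned result is corrupt with the same independent probability, a corrupt result never happens to match anything else, and two identical results are enough. Under those assumptions the average number of results that must come back is 2 / (1 - p), and the corruption rates fed in are hypothetical, not measured.

```python
import random


def expected_returns(p_corrupt, quorum=2):
    """Average number of returned results needed before `quorum` clean ones agree."""
    if not 0 <= p_corrupt < 1:
        raise ValueError("p_corrupt must be at least 0 and less than 1")
    return quorum / (1 - p_corrupt)


def simulate_returns(p_corrupt, quorum=2, trials=100_000, seed=1):
    """Monte Carlo check of expected_returns under the same assumptions."""
    rng = random.Random(seed)
    total_sent = 0
    for _ in range(trials):
        clean = 0
        while clean < quorum:
            total_sent += 1
            if rng.random() >= p_corrupt:  # this result came back clean
                clean += 1
    return total_sent / trials
```

With no corrupt results the answer is the bare minimum of two returns per WU; if half of all returns were junk it would double to four.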
(This was written at leisure several days before the ‘hacked’ client surfaced and has been hastily edited by Roelof and myself in light of that fact.)
Max out.

January 25, 2001       v3.03+ results latest: 24 January 2001 (8)

Short, sweet & fast (sounds like my life)...
An updated results page from Roelof is up and we have a confirmed sub-5 kill from Tim Cole at a monster 4:39. This is the fastest reliably verified benchmark so far. I will scribble more on that subject (verification) soon. Also new to the table are ten more results including Duron, PII Xeon, PowerPC G3 and Celeron silicon.
Max out.

January 23, 2001       v3.03+ results latest: 22 January 2001 (7)

Ladies and Gentlemen the Analyst has entered the building...
As you might have noticed the results table has been revised extensively by Roelof (TLC Benchmark Analyst) and includes a number of extra columns which should give the data-devourers and comparison-kiddies amongst you a small thrill. For details about CpF (cycles per flop) you'd be best off at the SETI Spy site which has quite (understatement) a detailed explanation. Suffice to say it gives a reasonable estimate of the relative efficiencies of processors.

On the benchmarking scene a number of things stand out from submissions so far. Looking at the systems appearing it seems that everyone and their 'pets' are running monster hardware. A longer time to complete a benchmark WU does not seem to have put people off as much as I had expected! One exception to this is older, slower systems (Pentium IIs and K6-2s for instance). There is also a dearth of Celerons and Durons. So if you have a little time and want a small line in the TLC table, crunch the benchmark and submit.

Though the top of the table is occupied by Gibbo205 at 5:05, the 5:09 result (RogerW) immediately underneath gives an indication of how well Alpha processors can compete while being only half the speed of the competition. Not that I think you are going to find too many salespeople quoting SETI benchmarks to prospective business clients (though admins. might have it in the back of their minds)! A happy thought.
At present breaking 6 hours is impressive and everything under 6:20 has needed 1GHz or more (excepting the raw power of the Alpha 21264 of course). But sub-fives can only be an update away! Stay tuned.

As a matter of policy the results table gets priority for updating over my fulsome verbiage so I'm including the 'results latest...' on the date header for your information - one of those many small things that needed implementing. Any technical points, errors, corrections or discrepancies results-wise to Roelof, anything pleasant, helpful, funny or thoughtful to me (Max), anything else to Hanser.
Max out.

January 21, 2001

Hello goodbye, tears and cheers...
RB's decision to go back to the real world and family life is a sad loss to TLC in general and his mentorship of my efforts will be very much missed. He put considerable time, effort and humour into the site and helped me get up to speed long ago when I first volunteered to help out. So the wheel turns and his resignation brings to a close his excellent contribution to TLC. I wish you luck, and fun to your family, in whatever direction you decide to head.

Of course you can never know when something unexpected (that being the definition of the word) is going to pop up to brighten an otherwise miserable day. Roelof Engelbrecht contacted zAmboni and myself shortly after RB officially announced his return to more important things: children, partners, work etc. Roelof has volunteered to help on the benchmarking page and I don't think I can imagine anyone better to bring some light and understanding to this corner of the web. Initially he is going to oversee benchmarking results collation but we shall see what the future brings by way of his input. For those of you not familiar with Roelof's contribution to SETIdom he is the author of SETI Spy, speedy squasher of bugs and keen respondent on alt.sci.seti. He is always ready to supply factual advice to forum devotees, being generally knowledgeable on SETI implementation on a wide range of hardware. If you have ever emailed him, you already know that his almost legendary support speed for SETI Spy is justified. His appearance here brings wisdom and enlightenment in abundance. [Good enough Roelof or do you need more?]

Minor but important note to benchmarkers: the result.sah is a vital piece of authentication to include with your submission. Almost everyone has been gracious enough to include this small file and as such I am now making it mandatory for inclusion of your results in the table.
Max out.

January 17, 2001

Backroom action only...
I've put more systems up on the results page - don't you just love those sexy olive hues, I'm an Autumn person myself. Also there are a couple of minor extras to the 'benchmark file' and 'submit' pages...so things are buzzing along in fits and starts and I'll try and keep the benchmarks table a little more up to date. If you really like hunting for changes take a look around but there's nothing that will explode your footwear (probably a good job). For H.Oda fans there's been a new WCPUid out for a few days (thanks Roelof, I should check more often).
Max out.

January 15, 2001

Catching-up backwards
A new results table for v3.03 has started (at last) and thanks to the contributors so far. The obvious reluctance of manic crunchers to benchmark while the sands are still flowing for v3 is understandable. Plus of course the 0.417 benchmark takes longer and there is not a great deal left (if anything) to discern from such activities - in the land of WU, Angle Range is King. For those of you who wish to 'competitively benchmark' you can begin all over again with v3.03, submit your times and they'll be entered (in the table). Fun for all the family. 
We (the Royal 'we') are receiving upgrade messages from Berkeley and when it becomes mandatory everyone will be 'even' for a while - who knows the next quirk of SETI software progress. As promised there will be a round up of the final v3 submissions and words on the v3.03s so far but I will be pleased just to put this in place for tonight...
Just a final thought - Being a very conservative IE user (and Netscrape at work by imposition from on high) I decided it was time to try out Opera 5.01 and very pleasant it is too, except of course for the TLC site! I have roamed a fair few net nooks and crannies and it has displayed delightfully. Yet here on my own patch it produces some rather ugly formatting quirks. Could be me and my devotion to Front Page 2000 (sad) or just some Operatic non-adherence to standard html tags - I don't know. But to those of you (and they are multiplying) using Opera my apologies and I understand a little of your angst now! 
Max out.

January 3, 2001

A short emptying of the head to start the year afresh...
It was very good having a break and now I'm back to the grind there's nothing too major to report, but you know the style - some minor housekeeping activity here, archiving, posting final v3 old bench WU table (at last)...worth noting that it loads and then appears to hang with blue bar and 'done' message in IE5.5 (well mine anyway) - give it a few seconds and up it comes, interesting. Nice (slight understatement) to see the old firm making it to number one (officially), Mike Ober being too popular for his own good (and now has a new host it appears), the Berkeley machines being a 'little' more consistent in sending out WU's and in updating data, and being able to write '2001' for the year gives me a little thrill every time! Just in case you feel like taking a crunching vacation, give SUN a couple of weeks to decide whether to lie down and accept the situation or whether they'll find an extra few thousand boxes to fight us with. All to the good methinks, but as Larry Loen (amongst several) pointed out, and as many of you have probably started to grasp, the question is: what next for TLC? The forum is awash with questions but few answers; when will 3.03 become mandatory is one of the more practical ones and so far the Berkeley Boys'n'Girls have been anything but predictable. Hang in, enjoy the ride and make your thoughts known. The last few unacknowledged v3 benches will get some words later this week. Any corrections, errors noticed or worthy words to me and I'll try and make good the imperfections in my little patch of this planet that I have some control over.
Max out.