Basic SETI FAQ

Mad props go out to IronBits for hosting the site and all the others who have kept this site going for the past couple of years!

The Search for Extra Terrestrial Intelligence at Home -- The Basic, Getting Started Frequently Asked Questions

By Larry Loen

Introduction

This is not the official FAQ of the SETI@home project.

This FAQ is independent of SETI@home and not officially affiliated in any way.

For official information, including a FAQ, see: http://www.setiathome.ssl.berkeley.edu.

That said, the official FAQ file doesn't cover every topic. This FAQ is written entirely from the point of view of a participant, especially the enthusiast who wants to make stronger than average contributions to the project. It also covers topics frequently seen in the Ars Technica forum and other SETI-related newsgroups on the net.

Basic Introduction

Basic Jargon

Uploading and Downloading

Client Performance

Miscellaneous

Getting Started

Caching

Buying and Borrowing Machines

SETI@home as competition

History, Hacking


Basic Introduction to SETI@home

What is SETI (the Search for Extraterrestrial Intelligence) and why do I care?

See the main SETI@home site for a great explanation and pointers to many resources. 

My personal explanation is simple: The goal of SETI is to see if we can successfully identify a signal from an alien world. If we did, it would be as momentous an occasion as any in human history, right up there with Columbus reaching the New World. We'd know, if we learned nothing else, that once and for all we aren't alone in the universe. Who knows what that would mean?

And, all anyone has to do to be part of it is contribute spare CPU cycles. Pretty cool.

If that doesn't make you want to participate, there are plenty of other worthy projects which will be happy to take your spare computer cycles, but this is about SETI.

Who is "SETI"?

If you read the sci.astro.seti and alt.sci.seti newsgroups, or the Ars Technica SETI discussions, you'd think (most of the time) that there was only one group doing SETI and one possible approach. Participants tend to say "SETI" and really mean "SETI@home." But, the SETI@home project is only one particular approach to the search for alien life-forms. There are many others.

Will SETI as a project really succeed?

Who knows? You can't find something if you never look. And, SETI@home has reported having found many interesting signals worth more analysis beyond what we do for them. All that said, there's a lot going against it. 

Some respected scientists say that the current SETI@home project in particular is looking in altogether the wrong set of frequencies or not in enough of the sky. There are serious alternative proposals for "optical SETI." The SETI@home FAQ itself implies that interstellar distances may simply be too great to allow us to detect alien radio signals from far away that aren't explicitly designed for us to discover. Maybe the ET congresses on other worlds will uniformly refuse to fund such a project that doesn't benefit their district. If so, we're going to be looking, de facto, at a much smaller, and closer, set of possible stars.

In short, one has to admit this is a bit of a longshot by any standard, even if there are plenty of inhabited worlds out there, which itself may not be so.

There's a famous equation, attributed to the astronomer Frank Drake, which attempts, very roughly, to figure out how many civilizations are waiting to be found.

The findings are roughly this: Plenty of stars out there, so even if life is rare, the many stars make up for that. But, whether a given planet will achieve a stable radio-based civilization, required for SETI to succeed, is a lot more chancy. If most civilizations nuke or pollute themselves to oblivion within a century or two of when they achieve radio, odds are against us hearing their signals. And, that's assuming all life-bearing planets evolve a critter to produce that first radio.
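For readers who like to see the arithmetic, the equation is usually written N = R* x fp x ne x fl x fi x fc x L: a chain of multiplied factors ending in L, the number of years a civilization stays detectable. The Python sketch below is only an illustration. The function name and every input value are assumptions picked to show how strongly the lifetime term matters; none of them are measurements or figures from SETI@home.

```python
def drake_estimate(star_formation_rate, frac_with_planets, habitable_per_system,
                   frac_life, frac_intelligent, frac_radio, lifetime_years):
    """Return N, the estimated number of detectable civilizations in the galaxy."""
    return (star_formation_rate * frac_with_planets * habitable_per_system
            * frac_life * frac_intelligent * frac_radio * lifetime_years)


# Illustrative guesses only (assumptions, not data). The two runs differ
# solely in how long a radio-using civilization is assumed to last.
short_lived = drake_estimate(1.0, 0.5, 2.0, 0.5, 0.1, 0.1, 200)
long_lived = drake_estimate(1.0, 0.5, 2.0, 0.5, 0.1, 0.1, 1_000_000)
print(short_lived, long_lived)
```

With these made-up inputs, a civilization that lasts two centuries yields roughly one detectable world in the galaxy, while one that lasts a million years yields thousands. That is the "nuke or pollute themselves" point above in numeric form.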

It is a fascinating quirk of human nature that SETI@home participants have all levels of expectation about the project's ultimate success, including expecting it to be a wild goose chase. Yet, regardless, they participate. Most people care passionately about the science, but there are other motivations to participate than being excited about whether ET will actually be found. And yet, the possibility is always there.


Getting Started with SETI@home

How do I participate?

Download and install the SETI@home software from the official web site. The installation procedure varies but is fairly painless and pretty typical for the environment. A Unix version looks like any other Unix install (i.e. a tarball). The Windows version most participants start with is pretty simple and normal to deal with. It functions as a screen saver. But, as you'll discover, SETI fanatics configure the Windows version to run all the time. And, without the screen saver graphics. This is surprisingly benign for most computer users and lets you run SETI all the time, even when using the computer normally.

How do I get the SETI@home software?

The SETI@home license agreement specifically states that you must download the code from their web site. The software is free, but you are legally bound to download from them. That license provision is to protect you. This is powerful software that runs all the time and accesses your disk and your network frequently. Cyber-vandals and hackers would adore it if people got in the bad habit of downloading this kind of thing from all over the net. Ordinary prudence would say "don't accept this kind of code from any unofficial source."

What preparation must I do before executing SETI@home for the first time?

Have you got permission to run it? Not just from the boss at work (if it's your work machine) but from whoever runs the network? Do you know if you can run SETI-related 3rd party programs (you may be able to run SETI@home and not the 3rd party code)? Get this all straight ahead of time. Trust us on this: You don't want to be explaining yourself after the fact. Get the required permissions up-front.

What happens when I run the SETI@home code for the first time?

When you start up SETI@home for the first time, you'll need to answer a few benign questions (the main bit of information that counts is your e-mail identifier). Then, to join Team Lamb Chop, you'll need to go to the SETI@home site and join. The instructions on the site are easy to find and follow. Welcome to the greatest team in SETI!


Basic Jargon

What is the SETI "client"?

The SETI@home software is typically called the "client." This comes from a bit of common computer jargon. Generally, when one computer hands out work to lots of other computers, the one computer is called the "server" and the rest are called the "clients." SETI@home and SETI@home participants use this nomenclature. Thus, the SETI@home software is usually just called "the client."

What is the GUI? What is the graphical user interface version of the client all about?

The Graphical User Interface is the "Windows screen saver" version of the software. Being more intensely interested in the project, few of us run this code much, for reasons that will become clear soon. But, there's nothing wrong with starting out with this code and running with it. It is stable, easy to run, and produces results. We usually call this the "GUI" or "GUI client."

What is CLI? What does Command Line Interface mean?

Most people who get enthusiastic about SETI will find the GUI is too cumbersome. Fortunately, SETI has developed (originally for Unix machines) a version that runs from a command line. This means you open up a DOS window, move to the correct directory, and type the name of the command (e.g. setiathome). Basically, in the Windows case, you download it, probably rename it to something handier, and run it in the directory of your choice. On many non-Windows and non-Mac machines, this "command line interface" or "CLI" is all there is. While the Windows CLI download has "winnt" in its name, it runs on everything from Windows 95 on. To get started, see the Tips page for an excellent .pdf guide.

What is a "WU" or "work unit"?

SETI@home divides the project up into "work units." Each work unit (also called WU by participants) is a sample of some location of the sky at a known time and a narrow band of radio frequencies. SETI@home gets recorded tapes from radio telescopes, such as the one at Arecibo, and breaks them into these discrete and independent units. Eventually, if you install the SETI client, you end up downloading one of these work units onto your computer and the client on your computer analyzes it on SETI@home's behalf.

A work unit is about 350,000 bytes of input. When the client completes its analysis, you'll end up with a result file on the order of 4,000 bytes, though occasionally larger.

Even on a fast PC, a typical work unit will take many hours to run. Once it completes a work unit, it will wish to upload the results and download another work unit.
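To put those file sizes in perspective for dial-up users, here is a rough Python sketch of the transfer time for one upload/download cycle. The two modem speeds and the 70 percent efficiency factor are assumptions chosen for illustration; real throughput depends on your line and your provider.

```python
WORK_UNIT_BYTES = 350_000   # approximate input size quoted above
RESULT_BYTES = 4_000        # approximate result size quoted above


def transfer_seconds(num_bytes, line_bits_per_sec, efficiency=0.7):
    """Estimate the seconds needed to move num_bytes over a link.

    efficiency is an assumed fudge factor for protocol overhead and line noise.
    """
    return num_bytes * 8 / (line_bits_per_sec * efficiency)


for rate in (28_800, 56_000):   # assumed modem line rates, in bits per second
    total = transfer_seconds(WORK_UNIT_BYTES + RESULT_BYTES, rate)
    print(f"{rate} bit/s: about {total:.0f} s per upload/download cycle")
```

Under these assumptions the whole exchange takes a minute or two, which is tiny next to the hours spent crunching.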

What's all this talk about "crunch" and "crunching"?

This is a venerable bit of computer jargon. The "crunch" is simply the act of processing a work unit in whole or in part. We sometimes call ourselves or our machines "crunchers" as well.

What is an "outage"?

This is a real-world project. That means that while the SETI@home site is surprisingly robust, it is sometimes unavailable for many hours at a time.

Why does this matter? Well, it takes many hours to crunch a unit, but the crunch does end. Once it ends, it's time to upload the results. Your contribution is invisible until the answer is uploaded. If the site is unavailable, then no upload, at least not for a while. Moreover, your computer can run out of work and simply wait. Yuck! Fortunately, this isn't as bad as it sounds. Team Lamb Chop is prepared!

The good news is that the SETI@home site has been available at least 95 per cent of the time, really more like 98 per cent. While this is very good, objectively, it can be a strain to dedicated crunchers' minds when an outage happens. Many of us just hate to see our machines stand idle. Moreover, the outages tend to come in bunches. A famous case had a contractor cut a key cable somewhere on the UC Berkeley grounds. That took three or four days to recover.

Even better news is that we have helped develop a technology called "caching" to work around outages even when they happen. This will be described later and it even has whole other FAQs devoted to it.
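To give the 95 and 98 per cent availability figures some scale, the short Python sketch below converts them into hours of outage per year. The function name is just an illustrative choice, and the calculation assumes a 365-day year.

```python
HOURS_PER_YEAR = 365 * 24


def downtime_hours_per_year(availability):
    """Hours per year a site is unreachable, given availability as a fraction from 0 to 1."""
    return (1.0 - availability) * HOURS_PER_YEAR


for availability in (0.95, 0.98):
    hours = downtime_hours_per_year(availability)
    print(f"{availability:.0%} available: about {hours:.0f} hours "
          f"({hours / 24:.1f} days) down per year")
```

Even at 98 per cent, that is about a week of outage spread over the year, which is why a cache of a few days' work is attractive.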

What is a "brownout"? What is all this talk about "bandwidth restrictions"?

Recently, we've experienced a new and slightly different form of outage. Instead of the SETI@home site being down, it is merely overloaded. Since it has been so successful, SETI@home has an upper bound on how much communications bandwidth it can consume. Sometimes, especially first shift US Pacific Time, it reaches these limits. When it does, people have more or less difficulty with both uploading and downloading. Persistence and caching (I promise, we'll have plenty on caching in due course) are again the solutions.

What is a "farm"?

Some of us have managed to get more than one machine, even many more than one machine, running SETI@home. Somehow, running a bunch of machines came to be known as a "farm" and the term stuck.

What is "borging" or "assimilating"?

Running (with permission!) any number of machines at work. The idea here is to convince friends to run SETI@home on their machines, ideally on your behalf or at least for Team Lamb Chop. See the "Buying and Borrowing" section for more.


Uploading and Downloading Work

I guess this means that the client will have to access the Internet from time to time. How can I get a new work unit when I'm away from my computer?

The simplest way is to run the client as-is. If you start with the easiest method, running the Windows GUI client, this is how you will operate: The client will need access to the Internet when it finishes one WU and needs to download another. You can set it up to automatically download a new unit (if you're on-line all the time) or have it prompt you for the new one so as to save on connect charges.

If you pay for Internet access by the minute, even the prompting technique will become unwieldy. To be sure, the GUI code does a fairly good job of letting you manage this if you own only a single computer. And, if you have software that manages a feature called "dial on demand" skillfully, you can still let SETI@home automatically download new work. The "dial on demand" feature can ensure that when the SETI@home client needs new work, the computer will connect itself, download the unit, waste a bit of connect charges (based on a time limit you specify), and disconnect. For some, this is satisfactory.

But, many of us try to run SETI on many machines. We try to run them twenty four hours a day, seven days a week. The facilities built into the SETI GUI client frustrate most people who try and run many machines or even one machine operated so intensely. No matter what speed your computer is, running all the time means it will want a new work unit at 3AM. For many people, this leads sooner or later to something undesired -- dial tones in the middle of the night, expensive connect charges and so on.

There are several schemes that enable you to "cache" work units so that you don't need to be baby sitting your computer all day and all night.

The CLI can be run under the same "continuous connection" rules as the GUI or it can be used with "caching" to manage this whole upload/download cycle much better.

What about firewalls? Can SETI@home work in a company setting?

SETI@home has built-in proxy configuration. You can use ordinary proxy or socks proxy. See the instructions -- it is easy to set up and use.


Caching SETI@home Work Units

What is "caching" and why do I care?

The short answer is: Caching cancels problems. Those who cache will see their machines busier, more often.

Caching is any scheme that lets you have work units set aside ("in your pocket," as it were) for later use. Reasons to cache include:

  • Overcoming outages at the SETI@home site.
  • Controlling "connect charges" if you have a dialup connection.
  • Dealing with "brownout" conditions at SETI@home.
  • Managing a local network of machines (see "farm").
  • Moving some specific work unit types to machines that "crunch" them better.
  • Obtaining statistics about the work units you processed and tracking progress.

For Windows users, there are many free "3rd party" products created by SETI@home enthusiasts to manage the mechanics of caching. In addition, some permit you to keep records of the type and duration of the work units you run.

See our "Tips Page" for articles relating to many available caching and monitoring tools.

Most Team Lamb Chop participants use at least one of these facilities and love them. Typically, one machine is chosen as kind of an intermediate server. It deals with uploading and downloading to the SETI@home site. It, in turn, distributes work units to (usually several) other machines. While these third party tools all seem to run on Windows, some have varying capability of managing non-Windows machines.
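The intermediate-server idea can be pictured with a toy model in Python. This is a sketch under stated assumptions, not the design of any actual third party caching tool: the WorkUnitCache class and its method names are hypothetical, and work units and results are stood in for by plain strings.

```python
from collections import deque


class WorkUnitCache:
    """Toy model of an intermediate machine that holds work units for a farm."""

    def __init__(self):
        self._pending = deque()   # downloaded units not yet handed out
        self._results = []        # finished results waiting for upload

    def refill(self, work_units):
        """Add freshly downloaded work units to the cache."""
        self._pending.extend(work_units)

    def hand_out(self):
        """Give the oldest cached unit to a cruncher, or None if the cache is empty."""
        return self._pending.popleft() if self._pending else None

    def store_result(self, result):
        """Hold a finished result until the main site can be reached."""
        self._results.append(result)

    def flush_results(self):
        """Return and clear the results waiting for upload."""
        ready, self._results = self._results, []
        return ready


cache = WorkUnitCache()
cache.refill(["wu-1", "wu-2"])      # hypothetical work unit names
first_unit = cache.hand_out()
cache.store_result("result-for-" + first_unit)
```

Handing out the oldest unit first keeps results from going stale, which matters for the "return it within about a week" guidance discussed later in this section.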

Once you begin running the CLI, you really should give these tools a chance. There are a few exceptions to this, especially for cases where you can't have 3rd party software running, or can't have or don't want your own centralized server (which may idle many machines if it fails). A supplementary FAQ is available to assist with these cases if they apply to you.

How Many Units Should I Cache?

This is a surprisingly difficult question and getting more difficult to answer as time goes on. The most obvious answer might be "as many as you can get." Indeed, for quite a while, this was a good answer.

Factors:

  • Many members just can't abide to have their machine idle. Even one per cent idle time is too much loss. This reality drives much of what is written in the forum. Our demands for uptime (at least from SETI@home) are extreme!
  • We know that SETI@home sends out a given work unit more than once (largely for security reasons).
  • We know that the first two results for a given work unit are likely to be more important than any additional returns (see "Miscellaneous" for more).
  • "Brownout" conditions, a more recent phenomenon, make obtaining a lot of cached units at least intermittently difficult and may change how and why we cache.

Analysis of these factors suggests that a result ought to be returned within about a week of when it was obtained. The sooner your result is returned, the more certain your contribution is genuine. Results returned after a week are still credited to your account. Formally, these "late" results are used with the earlier results, according to the SETI@home project designers. In that sense, they contribute. But, there are technical reasons to doubt that the third and fourth returned answer really contribute. While late units "count" on your stats, they won't help find ET.

Accordingly, most Team Lamb Choppers have caches of several days to a week's worth of production. After one has run SETI@home for a while, one will know how many units this is. Having a week's worth "in hand" reflects experiences like the time the cable was cut and SETI@home was offline for many days. It has cost little or nothing, up to now, to have a "deep" cache of this kind. 

Only the new factor of "brownouts" could cause this very popular strategy to change. Brownouts make filling a cache much more difficult to do. We may eventually (and reluctantly) have to settle for fewer units in the cache. The reasons for this are complex and are dealt with elsewhere (see the Tips page for an appropriate article).

Some Team Lamb Chop members, writing in Ars Technica's forum, may go so far as to suggest that it is impossible to enjoy the project without the kinds of "deep" caches we're used to. Don't believe it. The author, who has been forced by unusual circumstances to run without caching, can state with confidence that for all the effort involved, the difference between caching and not caching is only about three per cent a year.

For now, one should get two to seven days' worth of units in the cache and wait to see if brownouts change our strategy. So far, brownouts are intermittent, which means one can have the cache depth one wants nearly all the time.
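Turning "two to seven days' worth" into a number of work units is simple arithmetic once you know how long your machines take per unit. The Python sketch below assumes every machine runs around the clock at the same speed. The function name and the example farm (three machines at ten hours per work unit) are hypothetical.

```python
import math


def cache_size(hours_per_work_unit, machines=1, days=7.0):
    """Work units needed to keep `machines` busy for `days`.

    Assumes each machine crunches 24 hours a day at the same speed.
    """
    units_per_machine = days * 24 / hours_per_work_unit
    return math.ceil(units_per_machine * machines)


# Hypothetical farm: 3 machines, each taking 10 hours per work unit.
low_end = cache_size(10, machines=3, days=2)
high_end = cache_size(10, machines=3, days=7)
print(low_end, high_end)
```

Rounding up is deliberate: being one unit over costs nothing, while being one unit short can leave a machine idle.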


Client Performance

How do I know if my machine is as fast as it should be?

The Team Lamb Chop site has a standard benchmark. It is a specific work unit we have set aside just so we can measure our machines. Someone long ago processed it and returned the result to SETI@home.

What we do is run it in a special mode provided by the SETI client software just to see how long the work unit processing takes. The result isn't uploaded nor is a new work unit downloaded. The client itself records the run time to do the work unit processing, which is far and away the majority of the time spent.  That run time is recorded on this site by many participants.

See: bench/303results.htm for results for many computers.

Keep in mind that these are often highly tuned machines. See the next question.

Hey, my machine is a lot slower than the benchmark results tables show, why?

There are a lot of reasons. 

One is that many participants "overclock" their machines. Many of our participants own machines that they know how to "tweak" in special ways. Their BIOS allows them to do things ordinary users would never dream of. This enables them to do two major things: Run the CPU faster than its official MHz rating and, almost as important, run the main memory of their machine faster than its official rating. Either can make the machine much faster than it "looks." These tricks are too varied and arcane to list here. But, Ars Technica is a great place to learn how to do them when and if you are ready.

More basic slowdowns come from taking the defaults on the Windows GUI. If you take the defaults, the screen saver runs all the time when SETI@home is running. Moreover, SETI@home shuts down when not screen saving. Both of these really stretch out the amount of time a work unit may take. It is easy enough to set the Windows screen saver to both run the GUI and to have the screen "blank" fairly soon after the GUI starts up. Access the control panel and "display" to do this. This greatly speeds the SETI@home time. It is also important, for high performance, to set the GUI to the "run continuously" mode. This is a simple option in the SETI@home program itself. Right click on the little SETI@home antenna and you'll soon find this option. Better still, switch to the CLI (command line interface) and that will, by itself, ensure faster work unit processing.

Another reason is that work units vary in how much processing they require. If you are within 20 percent of the posted benchmark times, you are probably doing OK. If you want that last 20 percent, or just want help, a posting in the Ars Technica fora will get you prompt answers.
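The 20 percent rule of thumb is easy to check with a couple of lines of Python. This is only a sketch: the function names are made up for illustration, and it assumes you are comparing against a posted time for a machine similar to yours.

```python
def benchmark_gap(my_hours, posted_hours):
    """Fraction by which my run time exceeds the posted time (negative means faster)."""
    return (my_hours - posted_hours) / posted_hours


def probably_ok(my_hours, posted_hours, tolerance=0.20):
    """Apply the 20 percent rule of thumb from the text."""
    return benchmark_gap(my_hours, posted_hours) <= tolerance


# Hypothetical numbers: the posted time is 10 hours for a comparable machine.
print(probably_ok(11.5, 10.0))   # 15 percent slower than posted
print(probably_ok(13.0, 10.0))   # 30 percent slower than posted
```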


Buying and Borrowing Machines -- The Joys and Sorrows of Multiple Machines

Should I buy one or more computers to run SETI@home?

As you may have gathered, many of us run more than one machine on SETI@home. But, it is a major personal watershed to actually buy a machine whose sole or principal purpose is to run SETI@home.

Doing this is a very personal decision. It is not required for team membership. That said, some people get the 'bug' very badly and do buy their own machines (often "stripped down" in various ways so that they really are just for SETI). However, the history of the 3.03 client (see "History" later on) reminds us of what could happen and, to a degree, has happened a couple of times now. If you buy a machine just for this project, you must be prepared to see arbitrary changes made to the client software. Most of them will 'devalue' how many work units you will be able to produce. If you know this and understand this, then you can make an informed decision about building up a SETI farm. You have been warned. That said, there's a lot to be learned about building systems on the cheap, running Linux, and overclocking standard Intel or AMD boxes that come from this.

How does one "borg" machines at work?

There's a fine art to this. Always remember the other person is doing you a favor. When I approach someone about running SETI@home on their machine, I always prominently offer to run SETI@home for the benefit of whoever has the machine. That is, I offer them the credit on their user ID. With near universality, they are happy to let me have the credit instead and are interested in letting me run the project, business conditions permitting. Two cautions: 1) Get permission. People have been criminally prosecuted (really) for running this stuff without permission. Also, do you want to be grousing in the unemployment line about the dumbo who fired you for running SETI? 2) Don't interfere with real work, ever. You don't want to talk to your boss about crashing the month end report program so you could squeeze off a few work units. There's plenty of idle time -- I've left machines alone for weeks until the time was ripe. I'm thousands of WU richer for it. I have also written bits of operational code to make my running invisible and painless. There are techniques to run SETI@home out of the system tray, which can help you get permission to run on work machines, including when the owner has signed off but left the machine running.

Aren't there other machines than Intel machines involved in this project? How do I get my system involved?

There are plenty of non-Intel machines involved. The author of this FAQ runs many non-Intel machines himself. But, it is a fact of life that Intel CPUs dominate the world in terms of raw CPU count, with the Intel-compatible AMD rapidly moving into second place, if not already there. So, in terms of sheer volume, postings on the project (especially for custom-built machines) will sometimes be so Intel-oriented as to drown out other voices as a matter of sheer demographics. And, in our group, AMD machines seem to have at least equal footing with Intel these days. Since they are largely compatible, a lot of comments for one apply to the other.

But, there is a substantial inventory of other machine types. As always, the Macintosh crowd has made a good showing. All major Unix boxes are there in force; Sun had been the leading team in terms of production for most of the SETI@home period (through March of 2001 when we overtook them for the number one team spot). Indeed, by now, nearly anything with any market share has a SETI@home client. Most CPU types have not only their "home" operating system (e.g. OS/400, Solaris) but also a Linux version available. Things like BeOS, BSD, and OS/2 are also available if you like those operating systems, so even the Intel crowd is fully represented by its various OS alternatives.

Your machine and its operating system probably can be set up for SETI. See the "text only" download page at the SETI site and look for your combination.


Competitive SETI@home -- Running SETI as Blood Sport, Running "Gauntlets"

What are motivations besides sheer science to run SETI@home?

Many of us get very excited about SETI@home as a sheer competition. For some, this is the entire motivation to participate. Just like fishing or golf, SETI@home can be done for fun or as a near blood-sport. Some even admit that SETI is as addicting as either golf or fishing.

The only downside is that some people lose sight of the science. See "Hacking".

I see references to "gauntlets". What's that about?

Many members challenge each other as individuals or as organized "subteams" to short contests to see who can produce the most over some relevant interval, usually several weeks. Some of us have added resources that we do not always apply to SETI@home. "Gauntlets" can bring such resources to bear on producing more SETI@home results for individuals and for the team at large.


Hacking and SETI@home

Has anyone cheated and done any hacking of the SETI@home project?

Yes. However, the damage to the project itself so far appears to be minimal. Most hackers seem motivated by "putting up big work unit completion numbers," so the known hacks have been crude and easily segregated from valid results. You may sometimes see arguments about the hacks that have been done, and whether it has hurt the project. The controversy is: Have any "hack" results been accepted as valid and if so, how many and what does it mean?

The SETI@home FAQ admits that it has certain added information that goes back with each result to help prove that the regular SETI client code created the result. This has not been perfect, but it also means some work needs to be done to hack in. In addition, they have also revealed in newsgroups that they send out each work unit more than once and require "at least" two results to be returned before a work unit is discarded. Even simple modelling suggests requiring multiple results by itself is a very powerful limit on hacks, to say nothing of hardware bugs. (We also know some overclocked machines have turned in incorrect results.)
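One way to picture that "simple modelling" is the Python sketch below. It is an assumed toy model, not SETI@home's actual validation logic: it supposes that some fraction of all returned results is bad (hacked or hardware-damaged), that copies of a work unit go to unrelated participants, and that a bad answer can only slip through if every copy of that unit is bad.

```python
def undetected_fraction(bad_fraction, copies=2):
    """Chance that every copy of a work unit comes back bad.

    Assumes copies go to independent participants, a simplifying assumption.
    """
    return bad_fraction ** copies


# Hypothetical case: 1 percent of all returned results are bad.
for copies in (1, 2, 3):
    print(copies, undetected_fraction(0.01, copies))
```

Under these assumptions, going from one copy to two cuts the exposure from one unit in a hundred to about one in ten thousand. Bad results from unrelated sources are also unlikely to agree with each other, so the real exposure is probably smaller still.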

More fundamentally, any positive results will be reanalyzed by the SETI@home scientists themselves, which will catch any bogus positive results. Thus, any hacks that stay hidden would have to report "no interesting signal found."

Ultimately, the most hackers can expect to do is reduce the number of work units the project has available to process. When discovered, hacked results are purged from the SETI statistics, which makes their "big numbers" rather irrelevant.

Things aren't all sweetness and light, however. The most significant known hack was self-confessed and of long duration. The hackers weren't detected before the confession in that case. We've since seen certain "participants" suddenly disappear from the project statistics. So, it is clear that some of this is detectable. In a few cases, we helped the SETI@home administrators find such people.

What's a replay attack and why should I care?

The most recent hack attack was simultaneously less technically challenging and yet more exciting to participants. Basically SETI@home advertised that it has a check in case someone tries the simplest hack of all -- simply sending back the same result multiple times. The SETI@home client will return (and re-return) any result file it sees. It turns out that if you work things just so, you can replay (return) the same result to SETI@home over and over again without getting caught. The reason SETI@home's check doesn't always work is thought to be a bit of economy. There is no doubt that any replay unit can be easily detected and eliminated. We know this because the SETI@home administrators have done so for individual cheaters. Since the returns are of validly crunched units, all this attack really does is boost up someone's participation statistics. This gets honest participants excited, but at least there's no harm to the project.
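For the curious, the kind of duplicate check described above can be sketched in a few lines of Python. This is a hypothetical illustration only: the ReplayFilter class, the use of SHA-256, and the choice to key on the user identifier plus the result contents are all assumptions, and nothing here reflects how the SETI@home servers actually work.

```python
import hashlib


class ReplayFilter:
    """Hypothetical server-side check that credits each distinct result only once."""

    def __init__(self):
        self._seen = set()   # (user_id, digest) pairs already credited

    def accept(self, user_id, result_bytes):
        """Return True the first time a user submits this exact result, False on replays."""
        digest = hashlib.sha256(result_bytes).hexdigest()
        key = (user_id, digest)
        if key in self._seen:
            return False
        self._seen.add(key)
        return True


checker = ReplayFilter()
first = checker.accept("alice", b"result file contents")    # hypothetical user and data
replay = checker.accept("alice", b"result file contents")   # same file sent again
print(first, replay)
```

The cost of such a check is the memory and lookups needed to remember every result ever credited, which fits the guess above that the real check was trimmed for economy.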

At this writing, most of the "replay" units seem to have been eliminated, though some of the cheaters may be trying again with different user identifiers. If so, we can expect these units to be removed whenever the SETI@home administrators feel like it.

Still, it was exciting for a while since the SETI@home administrators were not "feeling like it." It took press attention for them to take organized action. One of our own members, fragile, was particularly energetic in uncovering and exposing this so that action was taken at long last. This action protects the participants' statistics (i.e. if you're in 540th place, the other 539 ahead of you are honest) and renders this attack pointless in the extreme. Before SETI@home took action, some of us were wondering if cheaters were going to dictate things like who wins the team competition. Now it looks like there may be a few small scale cheaters left flying below the radar, but any greedy cheaters (are there any other kind, really?) will eventually get zeroed out.

I notice a phrase WTK a lot. What's that?

See the "Hacking" question. One of the self-confessed hackers' name began with a K. It has become a sort of swear word. In many forums, WTF is "what the (expletive)." Substitute the name and you get the idea. 

Some of those long duration hackers were members of Team Lamb Chop for a while. This is a bit of an embarrassment, in fact, but one can't control who joins a given SETI team. Suffice to say that when they confessed, they were read the riot act by us. They did have the good grace to leave our team before confessing, taking the unwanted bogus work units with them, but since they confessed in our Ars Technica forum, we still feel the sting of their time with us. Hence, things like WTK.


SETI@home History

This project has lived long enough to have a history. This section covers a lot of this history. If you don't want to spend time on this now, come back when the references in the forum get too obscure to follow.

I notice there's a discussion about the "3.03 client" or the "2.4 client." Why has SETI@home created so many versions of its code?

SETI@home has exceeded its own expectations. This has created both problems and opportunities. It has created problems in that its sponsoring university has been forced to put an arbitrary "cap" on how much communications bandwidth it consumes. It has created opportunities in that one response to the large number of participants, and the bandwidth problem, has been to add to the amount of signal processing done in each work unit. This was done when the current 3.03 client was created. This has resulted in several versions of the code, each one looking a bit harder than the last for ET, and (except for the 3.0 version), each taking longer on the very same machine to calculate a work unit.

But, if they need to conserve bandwidth, won't they someday have to confront this directly instead of just adding more and more analysis to the client code?

Unknown. Veteran Team Lamb Chop posters strongly suspect this will be true. The SETI@home team has so far claimed they keep adding new science analysis to each client. Many prominent Team Lamb Chop participants, however, strongly suspect that the 3.03 client did not add significant scientific value over its immediate predecessor, 3.0, which delivered faster performance for some added science (the only time this happened). When the March 2002 brownout problems began, postings from the project designers in the various SETI@home newsgroups seemed to tacitly confirm this long-held belief, or at least said there was nothing left to add. They now appear to be attacking the bandwidth problem more directly. However, at this project's scale, the costs of the bandwidth will be significant and the project may end up with a more-or-less permanent bandwidth limit. We are preparing ourselves for such a situation, if it occurs. But SETI@home has so far sidestepped this issue.

Can someone explain why there's so much fuss about 3.03?

While it has pretty well died down now, you'll still see comments about the 3.03 client. There are a couple of reasons. One is that 3.03 came out in comparative haste, suggesting that accountants more than scientists drove the analysis added over 3.0. SETI@home had long trumpeted the improved science and speed of the immediate predecessor, 3.0, which took longer to arrive. The 3.03 client is much slower than the 3.0 client; 60 percent slower to twice as slow. It is suspected of including a lot of marginal processing simply to reduce bandwidth requirements at the main site. SETI@home conceded all but the "marginal" part when 3.03 came out and again circa March of 2002, when bandwidth apparently hit some internal limits.

Some also don't like the 3.03 because it (and its immediate 3.0 predecessor) was re-coded to be less sensitive to large L2 caches. Earlier clients were much faster with large L2 caches. Why does this matter? Some serious SETI@home fans purchase their own machines just for SETI@home. This far back, some of these users were buying older and fairly inexpensive 400 MHz Xeon processors with large caches. These gave, for the 2.4 client, results comparable to regular 800 MHz Pentiums, hot machines at the time.

When SETI refocussed its emphasis on its then real-world inventory (that is, Pentium III 256 KB cache Coppermine chips, P IIs of all kinds, and Celerons), even the 3.0 client came as an unpleasant surprise to these Xeon owners. The 3.03 only added to the injury. The 400 MHz Xeons performed like ordinary (and cheaper) 400 MHz Pentiums because SETI reduced its memory sensitivity. It still cares about memory speed, but it now cares substantially less than before.

Even without this, the net effect is that the 3.03 client greatly reduced the value of computers purchased just for SETI@home. 

Finally, the 3.03 client is the first to replace the old clients in total. This was done by the draconian means of invalidating the original web address of SETI@home as 3.03 "crossed over." The "forced march" nature of this upgrade created practical problems for anyone who had managed to get SETI running on a lot of machines.


Miscellaneous

What is the lifetime of a work unit and why does it matter?

The exact details of the "life of a typical work unit" are not fully known. Some details may be held back simply because they haven't gotten around to telling anyone. Some may be held back for anti-hacking reasons. But, a lot of interested, smart people have made some good surmises.

The site www.roving-mouse.com/setiathome has a pretty good-looking analysis of the probable work flow. A work unit starts life as part of a long, continuous recorded signal at a radio telescope (usually, the "big one" at Arecibo). Tapes are sent from Arecibo to the SETI site in Berkeley. The tapes are "broken up" or "split" into work units of about 350,000 bytes each.

Once the WU is born, it has a name, is recorded in some data base at SETI@home, and is part of a fairly large pool (probably about 150,000 work units minimum) to be handed out to SETI@home clients. If roving mouse has this right, a typical work unit is currently shipped out between 2.4 and 4 times. This has also been informally admitted by various SETI@home officials in the fora.

What happens when the work unit results come back is less clear. What is known is that once two results come in, the work unit can be deleted (not made available for further crunching). That means that the first two results for a work unit certainly contribute to the project. SETI officials have said that every unit returned, no matter how late, will be used in a resolution procedure to determine whether the WU has anything to do with hearing an alien signal.

But, there is actually some room to doubt this, at least in practice. It is certainly clear that the first two results, if agreeing within the bounds of floating point accuracy, contribute, because it has a very practical value in eliminating error and fraud.

Work units that come in after that may officially participate in some sort of resolution procedure, but if they are virtually identical to the first two units, then all those later returned units really did was waste SETI@home's time later on.
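To make the "agreeing within the bounds of floating point accuracy" idea concrete, here is a minimal sketch in Python. It is not SETI@home's actual validation code, which is not public as far as this FAQ knows. The function names, the idea of a result being a simple list of numbers, and the tolerance value are all assumptions made up for illustration.

```python
import math


def results_agree(first, second, rel_tol=1e-6):
    """Return True if two results match value-by-value within rel_tol.

    A "result" here is just a list of floats (an assumption); rel_tol
    is an illustrative tolerance, not a value published by the project.
    """
    if len(first) != len(second):
        return False
    return all(math.isclose(a, b, rel_tol=rel_tol)
               for a, b in zip(first, second))


def is_redundant(late, first, second, rel_tol=1e-6):
    """A late result adds nothing new when the first two results already
    agree with each other and the late one matches them as well."""
    return (results_agree(first, second, rel_tol)
            and results_agree(late, first, rel_tol))
```

Under this toy model, a third result that merely repeats what the first two already agreed on is the "wasted" case described above.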

For enthusiasts, this matters, because if one's strategy for caching delays the return of results too long, it means that SETI@home will have handed out each of their work units to at least two others and those others would have returned theirs long since. This means that one's crunching could be only for personal statistics. Being one of the first two "holders" of a given work unit to return data will ensure, under any scheme, that the work counts the most significantly.

The obvious answer is to make sure results are returned as soon as possible. Strategies that "dump" caches infrequently put the value of the crunch somewhat at risk.

The current best guess (informed by a little crude modelling) is that work units returned within a week (certainly within two or three days) are highly likely to contribute to the project under any scenario.
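If you want to check your own cache against that one-week guess, the arithmetic is simple enough to sketch in Python. This is a rough estimate only: it assumes every CPU works steadily through the cache in order and that results are uploaded as soon as they finish. The function names and the seven-day default are this sketch's own choices, taken from the guess above rather than from anything official.

```python
def days_until_last_return(cached_units, hours_per_unit, cpus=1):
    """Days before the last unit in a full cache has been crunched.

    Assumes all CPUs crunch continuously and share the cache evenly.
    """
    if cpus < 1:
        raise ValueError("need at least one CPU")
    return cached_units * hours_per_unit / cpus / 24.0


def within_window(cached_units, hours_per_unit, cpus=1, window_days=7.0):
    """True if even the last cached unit comes back inside the window."""
    return days_until_last_return(cached_units, hours_per_unit, cpus) <= window_days
```

For example, 20 cached units at 6 hours each on a single CPU works out to 5 days for the last unit, which is inside the week. Double the cache to 40 units and the last one takes 10 days, which is not. If your caching program only dumps results every so often, add that delay on top.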

What is the VLAR problem?

VLAR means "very low angle range." The angle range is an aspect of the work unit and affects the content of the data. There is a chart somewhere on the SETI@home site that describes this in exhaustive detail. Suffice to say that what is looked for in a WU varies because some things won't be detected at certain angle ranges.

Windows users of the client, analyzing this information, noticed that the VLAR units were taking longer, sometimes very much longer, than one would expect given the declared analysis that should be taking place. We still are not certain what the cause is. Some, including this author, think it is in some of the operating system calls (the client checkpoints about once a minute). Others suspect the client itself. That such angle ranges are slower in Windows is certain. Some of the queuing software has gone as far as to steer such units to Linux machines, which do not have the slowdown. Newcomers to SETI@home shouldn't get too excited about this. If it bothers you, find out how to steer them to Linux or to your slower Windows machines.
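For the curious, the steering idea boils down to something like the following Python sketch. This is not taken from any real queuing program. The function name, the host list format, and the threshold parameter are all hypothetical, and the cutoff below which a unit counts as "VLAR" is left for the caller to supply, since this FAQ does not give one.

```python
def pick_host(angle_range, hosts, vlar_threshold):
    """Choose a host name for a work unit.

    hosts is a list of (name, os_name) tuples (a made-up format).
    Units whose angle range falls below vlar_threshold go to the first
    Linux host if there is one; everything else goes to the first host.
    """
    if not hosts:
        raise ValueError("no hosts available")
    if angle_range < vlar_threshold:
        for name, os_name in hosts:
            if os_name == "linux":
                return name
    return hosts[0][0]
```

A real queuing program would also care about how busy each machine is. This only shows the one decision discussed above.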

I think the SETI client should do multi-threading. How come it doesn't handle my multiple processor machine?

Actually, it can and it will handle multiple processors. There is nothing magical about multi-threading versus non-multi-threading applications. The questions on caching have already talked about this. SETI is about as "pure" a distributed application as there has ever been. A single copy of the SETI program (using, therefore, only one "thread") will consume 99 percent of a single CPU in all the environments this author knows of. Therefore, the way to use multiple CPUs is to create another directory and start up another copy of the SETI@home client that points to the new directory. On a two-processor machine, this will eat up about 99 percent of both CPUs, which is about as good as it gets. Multi-threading the SETI client could be done, but in terms of raw production, it has no advantage whatever over this approach. The author has successfully operated 24-way multi-processors this way, with all the CPUs being very effectively utilized. Simple and effective. Moreover, it is occasionally helpful to be able to start and stop individual copies of the program, something difficult with multi-threading. Given that, why build something more complicated than necessary? (PS, this approach assumes the command line interface client, but anyone running SMPs, especially larger ones, will want and even require the command line.)
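The one-directory-per-copy approach can be scripted. Here is a minimal Python sketch, under a few assumptions: that the command line client keeps its files in whatever directory it is started from (so "pointing" a copy at a directory just means starting it there), and that you pass in the full path to your own client executable. The directory names and function names are made up for illustration, and no client command line options are used.

```python
import os
import subprocess


def prepare_directories(base_dir, copies):
    """Create one working directory per client copy and return the paths."""
    dirs = []
    for i in range(copies):
        path = os.path.join(base_dir, f"seti{i + 1}")  # hypothetical naming
        os.makedirs(path, exist_ok=True)
        dirs.append(path)
    return dirs


def launch_copies(client_path, directories):
    """Start one copy of the client in each directory.

    client_path should be an absolute path to your own client executable,
    because each copy is started with a different working directory.
    Returns the process handles so copies can be stopped individually.
    """
    return [subprocess.Popen([client_path], cwd=d) for d in directories]
```

On a dual-processor box you would call prepare_directories with 2 copies (or use os.cpu_count() to get the number) and hand the result to launch_copies. Because each copy is its own process, stopping one is just a matter of calling terminate() on its handle, which is the "start and stop individual copies" convenience mentioned above.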