Self-caching and Sneakernetting
Hardware freaks won’t let lack of tools and networks get in the way of crunching!


Mad props go out to IronBits for hosting the site and all the others who have kept this site going for the past couple of years!

By Larry Loen

Introduction

Some of us are just a bit unlucky. We may have many machines available to us at work, but our work environments may not allow us to run anything but SETI@home itself. The bigger the company, the more likely some legal impediment would arise.

Or, perhaps there are a whole bunch of machines that for any number of reasons aren’t connected to the Internet, but are otherwise usable. Or, maybe you just like writing your own scripts.

This section is dedicated to schemes that allow you to function quite successfully on your own, without programs like SETI Queue and SETI Driver when special problems make you wish to do special things.

Scenario One. A Manual Process

The SETI@home README file suggests the solution adapted next; fairly adequate facilities that anticipate caching are built into the SETI@home command line client as options.

When to use: When you have few machines and when you wish to manually control when they do and don’t run SETI@home.

A simple Windows script (.bat file) looks like this. It assumes the “command line” client is in c:\Program Files\setiathome and that the short (8.3) name of the client executable ends up being setiat~1. It further assumes that you run SETI@home out of the “home” directory (…\setiathome) and then switch to subdirectories seti2, seti3, seti4, etc. as far as you wish to go:

xferseti.bat:
cd c:\progra~1\setiathome
type result.sah >> results.txt
setiat~1 -stop_after_xfer
cd seti2
type result.sah >> results.txt
..\setiat~1 -stop_after_xfer
cd ..\seti3
type result.sah >> results.txt
..\setiat~1 -stop_after_xfer
cd ..
compseti.bat:
cd c:\progra~1\setiathome
setiat~1 -stop_after_process
cd seti2
..\setiat~1 -stop_after_process
cd ..\seti3
..\setiat~1 -stop_after_process
cd ..

In a typical use, xferseti would be run “by hand” from an ordinary DOS prompt. It would go to your main directory, c:\program files\setiathome and download a new work unit to that directory.

It would then stop, change to the directory c:\Program Files\setiathome\seti2 and download another work unit into that directory and stop. Then, it changes to …\seti3 and does this again.

Thus, there are as many as three outstanding work units. If more are wanted, this idea can be repeated as needed.

Actually, it does one other (optional) thing. If you include the “type” statements shown above, it will also save your results (always in result.sah) to a file for later analysis if you wish it. If there is no current result file, the “type” statement does nothing and leaves “results.txt” as it was. If you don’t care about that and don’t want a slow consumption of disk space, omit the lines starting with “type result …” in the above.

The second case, compseti.bat, is also started up by hand from the DOS prompt using:

start compseti.bat

as the command. It will be the one that does the actual computation. It can be running while xferseti is running, because the setiathome code guards against anyone trying to run two copies of the program in the same directory.

Accordingly, the number of directories should be chosen so that the total number of outstanding units is one or two days’ worth of production. Or six or seven days’ worth if you really need a large cache.
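
As a rough worked example of that sizing rule, here is a small sketch in shell arithmetic. The function name dirs_needed and the hours-per-unit figures are assumptions for illustration only; measure your own machines before settling on a number.

```shell
#!/bin/sh
# dirs_needed HOURS_PER_UNIT DAYS [CPUS]
# Prints how many cache directories (one work unit each) cover DAYS of
# crunching when one unit takes HOURS_PER_UNIT hours on each of CPUS CPUs.
# The timings passed in below are made-up examples, not measurements.
dirs_needed() {
    hours_per_unit=$1
    days=$2
    cpus=${3:-1}
    total_hours=$((days * 24 * cpus))
    # Integer ceiling division: round up so the cache never runs short.
    echo $(( (total_hours + hours_per_unit - 1) / hours_per_unit ))
}

dirs_needed 8 2      # one CPU, 8 hours per unit, two days  -> prints 6
dirs_needed 12 7 2   # two CPUs, 12 hours per unit, a week  -> prints 28
```

Rounding up rather than down errs on the side of a slightly stale unit instead of an idle machine.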

Selecting the number of directories is a bit of an art. Letting compseti run right out to the bitter end risks idle time, because you run out of work units. But too many directories means work units can get a bit “old,” since the last directory or two may never be reached at all. The solution? Windows lets you rename directories readily, so this is no problem. Overall, this means you can cancel compseti whenever you like. If you are worried about an aging work unit in, say, directory seti7, just swap its name with that of the seti2 directory. The old unit will then get crunched reasonably soon after the next restart, or even during the current run if compseti is still in the first directory. Just make sure the client isn’t currently running in the seti2 or seti7 directory (very recently modified files in a directory tip you off).

Since SETI@home’s client checkpoints every minute or so, you should feel free to stop and restart the client code whenever you need to. This is modestly important because we have long held that you are surer to actually contribute to the science if you don’t hold onto work units too long. If you don’t do it, it’s not the end of the world — the units still count and (at least formally) will be processed by SETI@home.

The rename trick to move some “older” directory to seti2 is the classic:

cd c:\progra~1\setiathome
rename seti7 seti7a
rename seti2 seti7
rename seti7a seti2
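
On Linux or Unix the same three-step swap can be wrapped in a small function. This is only a sketch: the name swap_dirs and the “.swap” temporary suffix are invented for illustration, and it assumes the client is not running in either directory.

```shell
#!/bin/sh
# swap_dirs A B: exchange the names of two cache directories.
# Refuses to run if either directory is missing or the temporary
# name (A.swap, an arbitrary choice) is already taken.
swap_dirs() {
    a=$1
    b=$2
    tmp="${a}.swap"
    if [ ! -d "$a" ] || [ ! -d "$b" ] || [ -e "$tmp" ]; then
        echo "swap_dirs: cannot swap $a and $b" >&2
        return 1
    fi
    mv "$a" "$tmp" && mv "$b" "$a" && mv "$tmp" "$b"
}
# Example (run from the setiathome home directory, client stopped):
#   swap_dirs seti7 seti2
```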


This process will work very well for those with intermittent dialup access to the Internet on a couple of machines. It requires no programming and is pretty foolproof.

Scenario Two — Continuous, Self-hosted Access

There will be those who have access to many machines, but who cannot use the regular caching software. What’s looked for in this case is “hands off” execution.

When to use: When you want the machines to operate themselves, with little effort on your part after setup.

Basic strategy: Execute the client so that it crunches the work unit. Then have it upload the result, download a new work unit, and stop. The script then goes on to the next directory and its work unit. SETI@home has command-line options that enable this. One critical feature (shown here with a bit of Java code) is a “sleep” function. More on that after the script:

cd c:\progra~1\setiathome
call setupYourOwnJavaIfYouHaveItElseOmitThisLine
:loop
rem Replace with "java" if you don't like jview
jview Sleep 600
rem Main directory
setiat~1 -proxy yourproxy:8080 -stop_after_process
type result.sah >> results.txt
setiat~1 -proxy yourproxy:8080 -stop_after_xfer
rem Second directory
cd seti2
..\setiat~1 -proxy yourproxy:8080 -stop_after_process
type result.sah >> results.txt
..\setiat~1 -proxy yourproxy:8080 -stop_after_xfer
rem Third directory
cd ..\seti3
..\setiat~1 -proxy yourproxy:8080 -stop_after_process
type result.sah >> results.txt
..\setiat~1 -proxy yourproxy:8080 -stop_after_xfer
cd ..
goto loop

Note that the “Sleep” program is a Java program you must supply. Unlike Unix, there is no built-in “sleep” command in DOS that delays execution. The “sleep” is highly desirable because if the SETI site is down long enough to exhaust your cache, it prevents you from beating the heck out of the network when doing so benefits no one, least of all you. It also avoids a lot of redundant stuff in the results.txt file. In normal operation, the cost of the “sleep” is negligible. If you ever run out of work, you’ll be grateful for sleeping.

The heart of the Java sleep program is simply:

public class Sleep {
    public static void main(String[] args) {
        int sleep_time = 100000; // 100 sec default, in milliseconds
        try {
            // Optional first argument: number of seconds to sleep
            if (args.length > 0) sleep_time = Integer.parseInt(args[0]) * 1000;
            Thread.sleep(sleep_time);
        } catch (Exception e) {
            System.err.println("Error: " + e.toString());
            System.exit(1);
        }
    }
}

If you have Microsoft C/C++ available, the alternative is to create a program called “sleep” and use that. The heart of such a program is:

#include <windows.h>
#include <winbase.h>
#include <stdlib.h>
int main(int argc, char *argv[]) {
    int sleeptime = 10000; // default is 10 sec.
    if (argc > 1) sleeptime = (atoi(argv[1]) * 1000);
    Sleep(sleeptime);
    return 0;
}

Note carefully that you cannot use this technique without some form of the sleep function.

The reason is simple: If there is an extended outage (or, nowadays, a brownout), you may loop more-or-less continuously. In that case, the looping script will resemble a “script kiddie” attack of some kind. You may face very unpleasant discussions with your boss, or with local or even not-so-local network administrators. Meanwhile, when running normally, the sleep time (ten minutes in perhaps two days’ running time) is insignificant.

Adding Linux to the Mix

But, what if you don’t do Windows? What if you do Linux, Unix, etc.?

The self-hosting scripts can be very easily adapted to run on Unix machines. These vary slightly as there are so many “shell” or “scripting” facilities.

For Linux users, here’s how a script for the popular bash shell could do the looping version:

while true ; do
  # Linux and Unix have a built-in sleep command
  sleep 600
  # Main directory
  ./setiathome -proxy yourproxy:8080 -stop_after_process
  cat result.sah >> results.txt
  ./setiathome -proxy yourproxy:8080 -stop_after_xfer
  # Second directory
  cd seti2
  ../setiathome -proxy yourproxy:8080 -stop_after_process
  cat result.sah >> results.txt
  ../setiathome -proxy yourproxy:8080 -stop_after_xfer
  # Third directory
  cd ../seti3
  ../setiathome -proxy yourproxy:8080 -stop_after_process
  cat result.sah >> results.txt
  ../setiathome -proxy yourproxy:8080 -stop_after_xfer
  cd ..
done
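
If you go beyond three directories, repeating the same three lines gets tedious. One possible way to fold them into a loop is sketched below. The names SETI_BIN, SETI_ARGS and crunch_cycle are made up for this example and are not part of the SETI@home client; only the -proxy, -stop_after_process and -stop_after_xfer options come from the scripts above.

```shell
#!/bin/sh
# Illustrative names only: adjust the path and proxy to your own setup.
SETI_BIN=${SETI_BIN:-$PWD/setiathome}
SETI_ARGS=${SETI_ARGS:--proxy yourproxy:8080}

# crunch_cycle DIR...: in each directory, finish the current unit, log the
# result, then return it and fetch a fresh unit.
crunch_cycle() {
    for dir in "$@"; do
        (
            # The subshell keeps each cd from affecting the next directory.
            cd "$dir" || exit 1
            "$SETI_BIN" $SETI_ARGS -stop_after_process
            # Only append when a result exists, so the log stays clean.
            [ -f result.sah ] && cat result.sah >> results.txt
            "$SETI_BIN" $SETI_ARGS -stop_after_xfer
        )
    done
}
# Typical use, from the setiathome home directory:
#   while true; do sleep 600; crunch_cycle . seti2 seti3; done
```

Using an absolute path for the client means the same call works from the home directory and from every subdirectory.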

Multiple Copies Instead in Linux/Unix

Finally, Unix and Linux machines have a separate form of caching, main storage permitting: Simply run more than one instance of SETI@home per CPU! If you have one CPU, run two simultaneous copies in different directories. If you have two, run four, and so on. This is very simple; each copy has a short script that just does a cd to some directory and runs setiathome. This overcomes all but the longest outages.

This can be particularly easy, robust, and effective with the cron daemon facilities. Since SETI@home sometimes has outages, even brief ones, restarting is important to avoid idle time for unattended machines.

Another special benefit in a corporate setting: Starting and stopping can be valuable when one might only have permission to run overnight. It can also overcome machine situations which you don’t have full control over. For instance, machines that might be rebooted without your knowledge.

A typical setup would look like this:

5 21 * * * cd /home/you/seti ; ./setiathome \
    -proxy yourproxy:8080  > /dev/null 2> /dev/null &
7 21 * * * cd /home/you/seti/seti2 ; ./setiathome \
    -proxy yourproxy:8080  > /dev/null 2> /dev/null &
55 7 * * 1-5 cd /home/you/seti ; kill `cat pid.sah`
56 7 * * 1-5 cd /home/you/seti/seti2 ; kill `cat pid.sah`

The reverse quotes (`) are significant and important. The backslash is for readability of this FAQ file only. Delete it and join the next line at that point.

This scheme would start up two copies of SETI@home at five and seven minutes after 9 PM every day and stop them at 7:55 and 7:56 in the morning, Monday through Friday. Thus, it would have you running overnight every weekday and all weekend.

Another setup would be:

5 21 * * * cd /home/you/seti ; ./setiathome \
    -proxy yourproxy:8080  > /dev/null 2> /dev/null &
7 21 * * * cd /home/you/seti/seti2 ; ./setiathome \
    -proxy yourproxy:8080  > /dev/null 2> /dev/null &

Here, two copies are fired up every night around 9 PM. If the existing copies are still running, SETI detects this and terminates the new versions. If the existing copies terminated for any reason, the cron entries above will restart new ones. The effect should be nearly around-the-clock operation with little or no outage time, including local problems like someone rebooting your computer.

Sneakernetting — What to do Without an Internet Connection

The above concepts are almost everything required to deal with a machine lacking an internet connection. What does that mean?

There are two basic cases: a totally standalone machine and a set of machines in a standalone network. The latter is more common nowadays.

The basic requirement over and above the preceding discussion is some form of media (diskette, classically) and a little added ingenuity.

Let’s change the example we’ve been using a little bit. Let’s have the remote machine crunch entirely out of secondary directories. Just to keep the numbering constant, let’s start with seti2 and then seti3, seti4, and so on all the way to seti21 so that there are twenty total directories. Let’s assume that the machine is a two-way SMP.

Now, it turns out that with either gzip on Linux or the regular zip facilities on Windows, a work unit will compress to about 250 K or so. That works out to about five work units per diskette.

So, with twenty directories, that’s four diskettes to deal with. How to divide things up intelligently?
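
The diskette arithmetic can be checked with a few lines of shell. This is a sketch under stated assumptions: the 1440 KB capacity of a standard 1.44 MB diskette is supplied here, the 250 KB figure is the estimate given above, and the names DISK_KB, UNIT_KB and diskettes_needed are illustrative.

```shell
#!/bin/sh
# Assumed figures: a 1.44 MB diskette holds about 1440 KB, and a compressed
# work unit takes about 250 KB (the estimate from the text).
DISK_KB=1440
UNIT_KB=250

# Integer division rounds down: partial units do not fit.
units_per_disk=$((DISK_KB / UNIT_KB))

# diskettes_needed DIRS: round up so no unit is left behind.
diskettes_needed() {
    echo $(( ($1 + units_per_disk - 1) / units_per_disk ))
}

echo "units per diskette: $units_per_disk"
echo "diskettes for 20 directories: $(diskettes_needed 20)"
```

With these figures the script reports 5 units per diskette and 4 diskettes for 20 directories, which matches the counts used in the rest of this section.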

First, expand xferseti to go all the way to directory seti21. Then, expand compseti the same way.

That would work fine if there were only one CPU. But we have two, and we want to keep both busy.

Therefore, we need a compseti1 and a compseti2. We also want an xferseti1 and an xferseti2. Done smartly, we can use the compseti1 and compseti2 to run each CPU. Then, in a separate move, we can use xferseti1 and xferseti2 to manage each “half” of the total work unit “stash” and do the upload/download cycle while both CPUs are in the “alternate” half of the work unit stash.

Watch the directory names carefully and realize that this is what we want: To upload or download half of the units at a time and to ensure both CPUs are in the “other” half when doing so.

This is what you end up with for upload/download (Linux version):


cd seti2
../../setiathome -proxy yourproxy:8080 -stop_after_xfer
cd ../seti3
../../setiathome -proxy yourproxy:8080 -stop_after_xfer
cd ../seti4
../../setiathome -proxy yourproxy:8080 -stop_after_xfer
cd ../seti5
../../setiathome -proxy yourproxy:8080 -stop_after_xfer
cd ../seti6
../../setiathome -proxy yourproxy:8080 -stop_after_xfer
cd ../seti7
../../setiathome -proxy yourproxy:8080 -stop_after_xfer
cd ../seti8
../../setiathome -proxy yourproxy:8080 -stop_after_xfer
cd ../seti9
../../setiathome -proxy yourproxy:8080 -stop_after_xfer
cd ../seti10
../../setiathome -proxy yourproxy:8080 -stop_after_xfer
cd ../seti11
../../setiathome -proxy yourproxy:8080 -stop_after_xfer
cd ..
rm -f seti.dl1.tar.gz
# delete the backslash and join the next line there.
tar -cf seti.dl1.tar seti2/work_unit.sah seti3/work_unit.sah \
    seti4/work_unit.sah seti5/work_unit.sah seti6/work_unit.sah
tar -tvf seti.dl1.tar
gzip seti.dl1.tar
rm -f seti.dl2.tar.gz
# delete the backslash and join the next line there.
tar -cf seti.dl2.tar seti7/work_unit.sah seti8/work_unit.sah \
    seti9/work_unit.sah seti10/work_unit.sah seti11/work_unit.sah
tar -tvf seti.dl2.tar
gzip seti.dl2.tar

and then for the second two:


cd seti12
../../setiathome -proxy yourproxy:8080 -stop_after_xfer
cd ../seti13
../../setiathome -proxy yourproxy:8080 -stop_after_xfer
cd ../seti14
../../setiathome -proxy yourproxy:8080 -stop_after_xfer
cd ../seti15
../../setiathome -proxy yourproxy:8080 -stop_after_xfer
cd ../seti16
../../setiathome -proxy yourproxy:8080 -stop_after_xfer
cd ../seti17
../../setiathome -proxy yourproxy:8080 -stop_after_xfer
cd ../seti18
../../setiathome -proxy yourproxy:8080 -stop_after_xfer
cd ../seti19
../../setiathome -proxy yourproxy:8080 -stop_after_xfer
cd ../seti20
../../setiathome -proxy yourproxy:8080 -stop_after_xfer
cd ../seti21
../../setiathome -proxy yourproxy:8080 -stop_after_xfer
cd ..
rm -f seti.dl3.tar.gz
# delete the backslash and join the next line to it.
tar -cf seti.dl3.tar seti12/work_unit.sah seti13/work_unit.sah \
    seti14/work_unit.sah seti15/work_unit.sah seti16/work_unit.sah
tar -tvf seti.dl3.tar
gzip seti.dl3.tar
rm -f seti.dl4.tar.gz
# Delete the backslash and join the next line to it.
tar -cf seti.dl4.tar seti17/work_unit.sah seti18/work_unit.sah \
    seti19/work_unit.sah seti20/work_unit.sah seti21/work_unit.sah
tar -tvf seti.dl4.tar
gzip seti.dl4.tar
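
The four tar/gzip stanzas differ only in the archive name and the range of directory numbers, so they could be folded into one helper. The following is a minimal sketch, assuming it is run from the directory that contains seti2 through seti21; the function name pack_units is invented for this example.

```shell
#!/bin/sh
# pack_units ARCHIVE FIRST LAST: bundle the work units from directories
# setiFIRST..setiLAST into ARCHIVE.tar.gz, replacing any older archive.
pack_units() {
    archive=$1
    n=$2
    last=$3
    set --                     # reuse the positional parameters as the file list
    while [ "$n" -le "$last" ]; do
        set -- "$@" "seti$n/work_unit.sah"
        n=$((n + 1))
    done
    rm -f "$archive.tar.gz"
    tar -cf "$archive.tar" "$@" && gzip "$archive.tar"
}
# Example, matching the four archives above:
#   pack_units seti.dl1 2 6;   pack_units seti.dl2 7 11
#   pack_units seti.dl3 12 16; pack_units seti.dl4 17 21
```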

. . .and note carefully the different order of execution for each CPU when we launch these to crunch as compseti1 and compseti2:


cd seti2
./setiath2 -stop_after_process -nice 19
cd ../seti3
./setiath3 -stop_after_process -nice 19
cd ../seti4
./setiath4 -stop_after_process -nice 19
cd ../seti5
./setiath5 -stop_after_process -nice 19
cd ../seti6
./setiath6 -stop_after_process -nice 19
cd ../seti12
./setiath12 -stop_after_process -nice 19
cd ../seti13
./setiath13 -stop_after_process -nice 19
cd ../seti14
./setiath14 -stop_after_process -nice 19
cd ../seti15
./setiath15 -stop_after_process -nice 19
cd ../seti16
./setiath16 -stop_after_process -nice 19
cd ..

and then:


cd seti7
./setiath7 -stop_after_process -nice 19
cd ../seti8
./setiath8 -stop_after_process -nice 19
cd ../seti9
./setiath9 -stop_after_process -nice 19
cd ../seti10
./setiath10 -stop_after_process -nice 19
cd ../seti11
./setiath11 -stop_after_process -nice 19
cd ../seti17
./setiath17 -stop_after_process -nice 19
cd ../seti18
./setiath18 -stop_after_process -nice 19
cd ../seti19
./setiath19 -stop_after_process -nice 19
cd ../seti20
./setiath20 -stop_after_process -nice 19
cd ../seti21
./setiath21 -stop_after_process -nice 19
cd ..
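
The two orderings above follow a simple pattern: the first CPU takes the first five directories of each half, the second CPU takes the last five. A small sketch that prints the order for each CPU makes the pattern easy to check; the function name dir_order is hypothetical, and the numbers assume the seti2 to seti21 layout used here.

```shell
#!/bin/sh
# dir_order CPU: print the directory numbers that CPU's script visits.
# CPU 1 takes the first five of each half (2-6, 12-16),
# CPU 2 takes the last five of each half (7-11, 17-21).
dir_order() {
    case $1 in
        1) offset=0 ;;
        2) offset=5 ;;
        *) echo "dir_order: CPU must be 1 or 2" >&2; return 1 ;;
    esac
    out=""
    for half_start in 2 12; do
        n=$((half_start + offset))
        end=$((n + 4))
        while [ "$n" -le "$end" ]; do
            out="$out $n"
            n=$((n + 1))
        done
    done
    echo "${out# }"
}

dir_order 1   # prints: 2 3 4 5 6 12 13 14 15 16
dir_order 2   # prints: 7 8 9 10 11 17 18 19 20 21
```

Because both CPUs work through seti2 to seti11 before either touches seti12 to seti21, one half is always free for the upload/download cycle.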

Notice that in the execution versions on Linux, the name “setiathome” is not used. Instead, soft links are used as in:

cd seti2
ln -s ../setiathome setiath2

When adapting this for Windows, you can do the same trick, though it has a little less value.

The value of this little trick for Linux and Unix is that you can use the Linux “top” command and see where you are in the flow of the execution script:


42 processes: 39 sleeping, 3 running, 0 zombie, 0 stopped
CPU0 states: 100.0% user,  0.0% system, 100.0% nice,  0.0% idle
CPU1 states:  99.1% user,  0.2% system,  99.2% nice,  0.0% idle
Mem:  255100K av, 247360K used,   7740K free,    612K shr,  69944K buff
Swap: 184708K av,      0K used, 184708K free                19588K cached
  PID USER     PRI  NI  SIZE  RSS SHARE STAT %CPU %MEM   TIME COMMAND
16129 yourid    19  19 15364  15M   664 R N  99.9  6.0  36:34 setiath11
15952 yourid    20  19 15664  15M   664 R N  99.7  6.1  84:41 setiath6
16269 yourid     9   0  1032 1032   836 R     0.1  0.4   0:00 top

This enables you to manage your “halves” and do the manual uploads and downloads easily, since you know where you are in the flow. Here, we are near the end of the first “half”.
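
If you would rather not work out the half by eye, the link name from the COMMAND column can be mapped to its half with a few lines of shell. This is only a sketch: the function name half_of is invented, and it assumes the setiathN link naming and the seti2 to seti21 layout described above.

```shell
#!/bin/sh
# half_of NAME: given a command name as shown by top (e.g. setiath11),
# print 1 if it belongs to directories seti2..seti11,
# or 2 if it belongs to seti12..seti21.
half_of() {
    n=${1#setiath}
    case $n in
        ''|*[!0-9]*)
            echo "half_of: not a numbered link: $1" >&2
            return 1 ;;
    esac
    if [ "$n" -ge 2 ] && [ "$n" -le 11 ]; then
        echo 1
    elif [ "$n" -ge 12 ] && [ "$n" -le 21 ]; then
        echo 2
    else
        return 1
    fi
}

half_of setiath11   # prints 1
half_of setiath6    # prints 1
```

When both running copies report the same half, the other half is the one that is safe to upload and download.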

Note also that the “-nice 19” is optional. It allows the cycles to be split evenly between setiathome and G@H if one chooses to run both simultaneously.

This scheme works when you have the machines dedicated enough that you don’t have to control execution with cron facilities. The cron trick shown above really doesn’t work with the compseti and compseti1 types of scripts. Why? Because if you restart them, they tend to hang, trying to return units when, of course, there is no network connection to do so in this case. For the same reason, they can’t be “looped.” You just have to have enough units on the machine so that you only need to get back to it once per 20 directories or (as I’ve shown) once per 10 directories. Or whatever overall multiple of four makes sense.