Ars Super Computer Database Online
IronBits and Jtrinkle
need your help. They are in the process of compiling the firepower
involved in the different Distributed Computing projects. I would
like for ALL TLC members and members of different Ars projects to check
out the database site at http://asc.dbestern.net/.
Login and input the specs of the computers in your DC farm and see how you
rank compared to other Ars DC members. Right now we have the stats
for 282 Boxen running for TLC and I know there are tons more machines that
aren't in the database yet. We have over 900 active members per day
for the team, so let's get those numbers in there. I'm sure we have more
firepower than any of the other teams!
If you have any questions about the Ars Super Computer you
can check out the thread
on the Ars DC forum.
Hey Look!
I finally got around to updating the weekly stats for the past several
weeks. If you want to check out the past weeklies you can find links
to them on the weekly archives
page.
Sad News
Welp, one of the best S@H add-ons will no longer be updated.
News comes from the SETI Spy webpage that SETI Spy has some
incompatibilities with Windows XP.
11/14/2001: Thanks to everyone who
responded to my request for information on how to improve SETI Spy's
stability under Windows XP. Unfortunately, I have to come to the
conclusion that SETI Spy is not completely compatible with Windows XP.
Setting the compatibility option to Windows 2000 and disabling visual
themes may help, but you will probably still experience crashes,
especially if you have SETI Spy start up automatically. Due to other
commitments and interests I will unfortunately not be able to fix this
problem. Please consider SETI Spy 3.0.7 the last release. My apologies
to the very loyal SETI Spy user base out there. I had a lot of fun
developing and supporting SETI Spy, but after 27 months it is time to
move on to other things. Keep 'em crunching!
I have used SETI Spy for what seems to be ages...and to me
this is one good reason not to upgrade to XP ;)
Purge Old Users?
There was a thread on the SETI newsgroups trying to get a petition to get
rid of "useless users". Matt Lebofsky chimed in with the
following:
There are a lot of inactive users,
as you know, but the information about these users take up about 3-4
Gbytes, tops. The entire database disks space for SETI@home is somewhere
over 300 Gbytes. So removing dead users would save us, tops, 1% of our
disk space. And that's grossly overestimating.
As well, the user database is completely separate from our science
database. Two different machines with two different disks arrays. Since
the user database is so small, we never have problems with it. The
science database is more of a headache, and reducing the user database
size will do nothing to cure that.
And furthermore: What happens if a user who signed up ages ago decides,
after many months, maybe even a year or so, to pick it back up and start
again? What if their account is missing? What if we accidentally remove
an active user? What are the exact criteria?
Since the effort to do this would be significant, yet the payoff minimal,
we work on other things. Yeah, it's ugly we have so many extra entries
in our user database. It doesn't slow down queries and it doesn't hamper
our science, so we just deal with it for now.
I guess it is a good explanation...but I do wish they
would get rid of the people who have signed up for the project but never
returned a work unit. It would make things look better, since there
are several hundred thousand "members" who have never
turned in a work unit.
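As a rough check of Matt's numbers, the short Python sketch below works out what share of the disk the user records occupy. The figures are the ones in his post (3-4 Gbytes of user data, "somewhere over" 300 Gbytes in total, so 300 is used as a floor); the function name is made up for the example.

```python
def user_db_share(user_gbytes, total_gbytes):
    """Return the user database's share of total disk space, in percent."""
    return 100.0 * user_gbytes / total_gbytes


# Figures quoted in Matt's post; 300 Gbytes is a lower bound on the total,
# so the real share is a little smaller than what is printed here.
for user_gbytes in (3, 4):
    share = user_db_share(user_gbytes, 300)
    print(f"{user_gbytes} Gbytes of user data = {share:.1f}% of 300 Gbytes")
```

Either way the whole user database is on the order of 1% of the disk, and purging only the inactive accounts would free less than that.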
Ug...
The computer that is running the stats decided that it had to
spontaneously reboot during the stats pull today, so they didn't get
updated on time. Sorry for the delay, but they should be up at the same
time as this post.
Search User Profiles
The S@H site now allows you to search for people with user profiles.
There are several different options for you to search and view them
by...you can check it out here.
Tech News For S@H
There are several entries on the technical
news pages on the S@H site which are new since the last time I
updated things here. None of them are earth-shattering posts, but if
you want to know what is going on behind the scenes, take a look.
Old Newsgroup Stuff
A shade over a month ago there were some problems with the S@H
servers, and they were traced back to the RAID configuration on
those servers. Here is what Eric Korpela said about the problems on
the newsgroup:
I can tell you about the bulk of the increasing problems in 4 words:
"malfunctioning hardware RAID cards". Just about all of the problems of the
last year (including lack of inodes, error -22, error -63, sluggish response,
duplicates, today's outage, etc.) can be traced to those cards. It's a shame,
hardware RAID seemed like a great idea, but the controller is susceptible to
single point failures. We learned the hard way that these cards weren't even
acceptable as non-RAID caching SCSI controllers. Whoever wrote the firmware
and drivers decided that certain errors didn't warrant the attention of the OS
even if they caused data corruption. Search through
http://setiathome.ssl.berkeley.edu/tech_news.html
for occurrences of the word RAID.
As of today, they are gone. As of tomorrow we will be back to full capacity
(once the software raid groups have fully rebuilt) with simpler, more
reliable SCSI controller cards.
Shortly thereafter, we're going to try mirroring the database to the new
Network Appliance filer. If all goes well and performance is adequate, we'll
break the links to the SCSI drives and rely on the NetApp box.
Another problem related to the screwy RAID situation
has to do with the database indices. When the RAID screws up, it
tends to screw up the indices. This resulted in older work units
getting sent out again, and as a result many people were sending in
duplicate results and not getting credit for them. Here is what Eric
had to say about those problems:
One of the problems we've had is that when a page in the database
workunit table gets corrupted it results in a problem with the index
that controls which workunits are sent. The symptom is that we end up
looping through a few thousand (or a few*ten-thousand) workunits. We
ended up in that state over the weekend (possibly it started thursday
or friday), people who process a large number of workunits probably saw
some duplicates.
and a follow up on the same lines...
>I don't mind re-processing WU's but will we get credit for them?
Probably not, we'd need to deactivate duplicate result checking which
would open us up to the people who continue to send us the same result
every 5 seconds
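To illustrate why duplicate result checking and credit are tied together, here is a minimal Python sketch. It is not how the S@H server is implemented: the class name, the use of a (user, workunit) pair as the key, and the one-credit-per-result rule are all assumptions made for the example.

```python
class CreditLedger:
    """Toy model of duplicate result checking (hypothetical, not S@H code)."""

    def __init__(self, check_duplicates=True):
        self.check_duplicates = check_duplicates
        self.seen = set()    # (user, workunit) pairs already credited
        self.credit = {}     # user -> number of credited results

    def submit(self, user, workunit):
        """Record a returned result; return True if credit was granted."""
        key = (user, workunit)
        if self.check_duplicates and key in self.seen:
            return False     # same result sent again: no credit
        self.seen.add(key)
        self.credit[user] = self.credit.get(user, 0) + 1
        return True


# With checking on, resending one result 100 times earns a single credit.
checked = CreditLedger(check_duplicates=True)
for _ in range(100):
    checked.submit("spammer", "wu_0001")

# With checking off, every resend is counted.
unchecked = CreditLedger(check_duplicates=False)
for _ in range(100):
    unchecked.submit("spammer", "wu_0001")

print(checked.credit["spammer"], unchecked.credit["spammer"])
```

In this toy model the same check that blocks the spammer also denies credit to an honest user who was handed the same workunit twice, which is the trade-off Eric describes.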
The final Eric post had to do with a user installing Red
Hat 7.1 and not being able to get S@H to run on his machine....
>My problem is that setiathome won't run. It always generates the following
>error:
>
>Couldn't get lock file. This is probably because
>another instance of SETI@home is running in this directory.
>Each instance of SETI@home must run in a separate directory
and Eric's reply...
At some point in the distant past, the people who put together RedHat
distributions of Linux decided that file locking on NFS volumes was
unnecessary and set the default configuration to have NFS locking
disabled.
IMHO, they made the wrong decision. There are two ways to fix the
problem. The preferred method is to find out which daemon handles
NFS file locking and turn it on. (Usually this is statd or lockd).
The other method that appears to work is to link lock.sah to a file on
a local filesystem such as /tmp.
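The second workaround Eric mentions can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper name link_lock_file and the per-directory target name under /tmp are invented here (the idea being that two SETI@home directories should not end up sharing one lock), and the client should be stopped before the lock file is replaced.

```python
import os


def link_lock_file(seti_dir, local_dir="/tmp"):
    """Point seti_dir/lock.sah at a file on a local filesystem.

    Hypothetical helper: the target naming scheme is an assumption,
    chosen so that each SETI@home directory gets its own lock target.
    """
    lock_path = os.path.join(seti_dir, "lock.sah")

    # Build a target name unique to this SETI@home directory.
    tag = os.path.abspath(seti_dir).strip(os.sep).replace(os.sep, "_")
    target = os.path.join(local_dir, f"lock.sah.{tag}")

    # Make sure the local target file exists.
    open(target, "a").close()

    # Replace any existing lock file or stale link with the new symlink.
    if os.path.lexists(lock_path):
        os.remove(lock_path)
    os.symlink(target, lock_path)
    return target
```

Calling link_lock_file with the directory the client runs in leaves lock.sah there as a symlink into /tmp, so the lock is taken on the local disk instead of the NFS volume. Eric only says this method "appears to work", so turning NFS locking back on remains the preferred fix.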