These are some notes about the organization of the terravivo website
for internal consumption.
Starting with Jeff's .txt files, I built some HTML and pseudo-CGIs to see
how it might look in a browser.
Most of the orientation in Jeff's stuff
is encapsulated here (e.g., manager information, etc.) but I added a few
things (e.g., "Complete genomes section"), simplified others (e.g., unified
configuration page), and fleshed a lot out.
It was an educational experience!
The example state is a server which started recently (dates and times are
all bogus) and is serving ENZYME (monthly update), PDB and Genbank (daily),
SWISS-PROT and TrEMBL (weekly), and updating itself daily.
I'll try to explain the putative lessons in the following notes.
What do you guys think?
- This could be really, really neat!
- Simplification is critically important.
- There's a lot of it, so automation is also critically important.
- It will be important to develop a control language (as makefiles?)
corresponding to a consistent directory structure for internal files.
the name "terravivo"
We've been using "terravivo" as the working name for a unified pillage
and vnfs product. We should decide on the real name soon.
"terravivo" is fine with me. Some advantages are that it hints at the
scale of what we're up to ("earth life"), it's available as a domain
(it would be nice to get www.terravivo.com and ftp.terravivo.com for the
terravivo server), the initials are easy to remember (root directory /tv),
and it's generally distinctive (sort of scientific, not too English-oriented).
I personally like "Global Pillage" but it's perhaps a bit flippant.
Here, I've continued to use terravivo and /tv.
It would be nice to have a distinctive and consistent look to this stuff.
To this end, I used "metaphorics paper" BACKGROUND with white BGCOLOR
and tables in most places. The background that we previously used
(sand.gif) was a bit dark, making it hard to read black text, so I
lightened it up and used that (sandy.gif).
All pages have a logical title in the HEAD section and the same text
visible as a centered title, followed by a short description of what the
page is about. This helps maintain orientation and supports bookmarking.
A "standard Metaphorics trailer" appears at the bottom of each page.
I think it's important to have a consistent trailer so people know they
can get back to us from all our pages (not just the tv ones).
I'm not married to the specifics of this particular trailer, though it
seems OK to me.
organization of pages
I started out with the main pages being "status" and "configuration".
By the time I entered all the databases that Jeff picked, and imagined
that this number might grow significantly, it became apparent that a
more "star-like" top-level organization works better. In turn, this
makes it handy to have a site map, but
I don't think a big site map on the first page (a la www.daylight.com)
is called for. Perhaps a more graphical site map than the example here
would seem less clunky and more professional.
Are we ever going to have more than one Terravivo server per network?
Or for that matter, per host?
Assuming it's possible, we need to identify the one we're talking about.
I've used the IP number throughout (e.g., 18.104.22.168) but a string
indicating host:service name would be better (e.g., "origin:terravivo2").
This is the page entitled just "Terravivo".
It's Terravivo's home on the local network, not the global home page
at Metaphorics' site.
Is the use of the phrase "home page" too confusing?
If so, what else can we call it?
The home page provides a server synopsis,
manager information including a local message (i.e., information about
the manager for the benefit of users; is this what you intended, Jeff?),
news from us to everyone (probably the 10 most recent messages, with a link
to the rest if there are more?), and links to the rest of the site.
Only the first link (25-Oct-1998) is live in this example.
It would be nice to keep this first page short and sweet.
This is a read-only page describing all possible Terravivo resources.
My gut feeling is that it's better to have a comprehensive list which
is consistent with the configuration page rather than to separate "current"
and "potential" resources. If necessary, the categories could appear on
I like the idea of an overall condition synopsis, if we can make it clear
what it means to the users, e.g., "healthy" (working as it should),
"unreliable" (down more than 50% of the time), "dysfunctional" (not doing
anything useful), "misconfigured", "brand new" (not configured), etc.
This might be linked to a more longwinded explanation of any problems.
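A minimal sketch of how such a synopsis might be derived. The inputs (a configured flag, a count of configuration errors, an uptime fraction, and the number of resources currently serving) and the order of the checks are assumptions for illustration; only the labels and the 50% threshold come from the notes above.

```python
def overall_condition(configured, config_errors, uptime_fraction, resources_serving):
    """Return one of the condition labels suggested in these notes.

    All four parameters are hypothetical inputs; the check order is a guess.
    """
    if not configured:
        return "brand new"        # not configured yet
    if config_errors > 0:
        return "misconfigured"
    if resources_serving == 0:
        return "dysfunctional"    # not doing anything useful
    if uptime_fraction < 0.5:
        return "unreliable"       # down more than 50% of the time
    return "healthy"              # working as it should


print(overall_condition(True, 0, 0.98, 5))
```

Checking the most fundamental problems first means a server is reported with a single label even when several conditions hold at once.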
The main part of the status page is the table of resources.
Each resource name is linked to a page describing the resource and
its status. (I created preliminary pages for all the ones here.)
Users don't get to change anything; they just get to see the description
of the resource and its status, and look at any resource-specific messages
(e.g., the hypothetical warning about GenPept).
message from manager
This seems like a nice feature.
It seems useful to separate resources into categories.
I suggest "Protein databases", "Nucleic acid databases", "Complete genomes"
and "Software". These will be soft (like everything else) so we will be
able to add/split categories in the future
(e.g., add "Small molecule databases",
or split "Software" into "Public software" and "Metaphorics software").
Given the eventual orientation of this stuff, it seems appropriate that
we have a "Complete genomes" category. Splitting it out by species is
sensible, but this list might get quite long quite soon and merit its
own page. Perhaps by then the "Complete genomes" resources will be
implemented as Biothor genomic universe databases.
These are linked to both user (Terravivo status)
and manager (Configure Terravivo) pages.
The information shown on these pages seems appropriate,
though the content did evolve a bit as I compiled them
(abandoning "author:" and "reference:" items in favor of "home:").
A link to the resource-specific log would be nice.
We might also need to put in copyright/licensing messages,
the original format, and a list of original files and their URLs.
IMO, the resources should be organized logically at a relatively coarse
granularity, i.e., from the user's POV, resources are not broken up into
their components (e.g., sub-databases or individual programs).
Does it make sense to have links to the original sources here,
or to maintain local copies of documentation, or both?
Here's an approach that might work for management security.
All routes to the configuration page lead here
unless access is from the console
(in which case, the manager's password can be changed).
This page asks for a password and generates a cookie (timestamped hidden field)
for the real "Configure Terravivo" page.
For this HTML-files-only example, the "for now, click here" link moves you on.
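One way the timestamped hidden field might work, as a sketch only. The notes do not say how the field is protected; signing the timestamp with an HMAC, the function names, and the 15-minute lifetime are all assumptions made here for illustration.

```python
import hashlib
import hmac


def make_token(secret: bytes, issued_at: int) -> str:
    """Build a hidden-field value: '<unix time>:<signature>' (assumed layout)."""
    stamp = str(issued_at)
    sig = hmac.new(secret, stamp.encode("ascii"), hashlib.sha256).hexdigest()
    return stamp + ":" + sig


def token_is_valid(token: str, secret: bytes, now: int, max_age: int = 900) -> bool:
    """Accept the token only if the signature matches and it is not too old."""
    stamp, sep, sig = token.partition(":")
    if not sep or not (stamp.isascii() and stamp.isdigit()):
        return False
    expected = hmac.new(secret, stamp.encode("ascii"), hashlib.sha256).hexdigest()
    # Constant-time comparison on bytes avoids leaking how much of it matched.
    if not hmac.compare_digest(sig.encode("utf-8"), expected.encode("utf-8")):
        return False
    return 0 <= now - int(stamp) <= max_age


token = make_token(b"server-secret", 1000)
print(token_is_valid(token, b"server-secret", 1100))
```

The password page would emit the token after a successful login, and the "Configure Terravivo" CGI would refuse any request whose token fails the check.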
This is a CGI FORM which looks as much like the status page as possible,
except that optional fields can be changed.
The "Submit changes" button on the bottom takes the manager to a confirmation
page. There is also a "Reset factory defaults" button which seems OK but
might be better split into "Restore" and "Factory defaults".
We need to be able to individualize each of these choices through our updates.
E.g., recommended update frequencies* and servers may vary;
there is no "never" for terravivo updates.
Re update frequency, settings such as "daily" and "weekly" seem
better than specific days and time-of-day. Even if we use the crontab
format, having the manager specify general frequencies allows terravivo
to pick the appropriate time of day (perhaps adaptively?).
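A sketch of how a general frequency could be turned into a crontab entry, with terravivo (not the manager) choosing the time of day. The function name, the default hour, and the example command path under /tv are hypothetical; the five-field layout (minute, hour, day of month, month, day of week) is the standard crontab format.

```python
# Templates for the general frequencies; weekday and day of month are assumed.
SCHEDULES = {
    "daily": "{minute} {hour} * * *",
    "weekly": "{minute} {hour} * * 0",    # Sundays
    "monthly": "{minute} {hour} 1 * *",   # first day of the month
}


def crontab_line(frequency, command, hour=3, minute=0):
    """Return a crontab line for the frequency, or None for "never"."""
    if frequency == "never":
        return None
    schedule = SCHEDULES[frequency].format(minute=minute, hour=hour)
    return schedule + " " + command


# "/tv/bin/update" is a made-up command name used only for this example.
print(crontab_line("weekly", "/tv/bin/update swissprot", hour=4, minute=30))
```

Because the hour and minute are parameters rather than manager settings, they could later be picked adaptively without changing what the manager sees on the form.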
One choice for the initial state would have everything set to "never" and
require the manager to configure things. Or we could ship live (but slightly
obsolete) data and have the initial state reflect that.
The ftp servers you see here are real ones for the specific resources.
(Jeff, weren't you going to send me your list?!)
I picked them based on intuition and a look-see, so they're not authoritative.
The general idea is that a manager can pick a single server only (in which
case only that one is used) or pick primary and secondary servers (in
which case, if they both fail, terravivo can use other ones).
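The server-selection rule could be sketched as follows. The function names are hypothetical, the `fetch` callable stands in for whatever actually does the transfer, and the example.org and example.net hosts are placeholders rather than real mirrors.

```python
def servers_to_try(known_servers, primary, secondary=None):
    """Order the servers according to the manager's choice."""
    if secondary is None:
        return [primary]                      # single server: use only that one
    others = [s for s in known_servers if s not in (primary, secondary)]
    return [primary, secondary] + others      # fall back to the rest if both fail


def fetch_with_fallback(servers, fetch):
    """Try each server in turn; `fetch` is assumed to raise OSError on failure."""
    last_error = None
    for server in servers:
        try:
            return server, fetch(server)
        except OSError as err:
            last_error = err
    raise RuntimeError("all servers failed") from last_error


known = ["ftp.ndbserver.rutgers.edu", "ftp.example.org", "ftp.example.net"]
print(servers_to_try(known, "ftp.example.org", "ftp.example.net"))
```

Keeping the ordering separate from the fetching makes it easy to show the manager, on the confirmation page, exactly which servers would be tried and in what order.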
The ftp service is the CGI choice VALUE, the domain/location is what appears
on the form, e.g. "ftp.ndbserver.rutgers.edu" appears as "rutgers.edu".
Since this page is autogenerated from our "configurables" data, the values
don't really matter as long as the choices are unambiguous.
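A sketch of generating the form choices from the "configurables" data and checking that the displayed labels are unambiguous. Taking the last two dot-separated parts of the host name reproduces the "rutgers.edu" example above, but it is only a naive assumption (it would not suit hosts under domains like .ac.uk); the function names are hypothetical.

```python
def display_label(host):
    """Naive label: the last two parts of the host name."""
    return ".".join(host.split(".")[-2:])


def build_choices(hosts):
    """Return (VALUE, label) pairs, refusing labels that would be ambiguous."""
    choices = [(host, display_label(host)) for host in hosts]
    labels = [label for _, label in choices]
    duplicates = sorted({label for label in labels if labels.count(label) > 1})
    if duplicates:
        raise ValueError("ambiguous labels: " + ", ".join(duplicates))
    return choices


print(build_choices(["ftp.ndbserver.rutgers.edu", "ftp.example.org"]))
```

If two servers ever share a label, the page generator would fail loudly instead of presenting the manager with two identical-looking choices.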
This is a page to allow us to do consistency checks and make the manager
confirm what was specified on the previous page.
It seems prudent to require that the manager re-enter the password.
This page reports the success (or not) of Terravivo configuration changes.
This page might offer to move on to the status page and remind them to
shift-reload to see changes.
Aside from availability and capacity plots, I find it useful to think of
Terravivo as a box that acquires information from various data producers
and delivers it to data consumers. The "Data gathering" and "Data
delivery" tables reflect this POV (numbers are totally bogus but they add
up correctly). I think "last week, last month, to date" is adequate
granularity in a table, but a line plot might be better.
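A sketch of how the three table columns might be computed from a transfer log. The log layout (a date and a size in megabytes per transfer), the function name, and the reading of "last week" and "last month" as trailing 7-day and 30-day windows are assumptions; the numbers are as bogus as the ones in the example pages.

```python
from datetime import date, timedelta


def summarize(transfers, today):
    """Total megabytes for the trailing week, trailing 30 days, and all time."""
    week_ago = today - timedelta(days=7)
    month_ago = today - timedelta(days=30)
    summary = {"last week": 0, "last month": 0, "to date": 0}
    for day, megabytes in transfers:
        summary["to date"] += megabytes
        if day > month_ago:
            summary["last month"] += megabytes
        if day > week_ago:
            summary["last week"] += megabytes
    return summary


log = [(date(1998, 10, 24), 10), (date(1998, 10, 10), 20), (date(1998, 8, 1), 30)]
print(summarize(log, today=date(1998, 10, 25)))
```

Since the windows are nested, each column can never exceed the one to its right, which gives a cheap sanity check on the tables.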