Doyen

The {problem, need, market} seems to be to develop a scientific
computation platform. The goal of such a platform would be to
create an ecosystem where scientists can

develop research work

use common tools which are readily available

share and publish their work

effectively use work done by others

teach

Computational science seems to be in a state where people work
independently. They develop whole systems from scratch just to
support their research work. Many make an effort to distribute
their system but fail for lack of interest. Worse yet, the
research that gets published often cannot be used by others or
verified for correctness because the supporting code is based
on a specialized system and gets lost. There is currently no
community expectation that software should accompany research
results. The end result is a loss of significant scientific wealth.

We need to build a collection of systems in such a way that they
will attract attention and use. The collection needs to achieve
a critical mass of users. Thus we have to not only build it but
promote it widely.

In the ideal case scientists should be able to do research on
common platforms that are centrally supported but locally available.
The research results and their supporting software could be
"published" to a central repository and made available worldwide. Conferences
would use this repository as both a publication and presentation
center. Scientists would use this repository to dynamically update
local systems with the latest research results and software changes.

So there appear to be some design criteria that we can express.

There needs to be a "mother" of the doyen which contains

a resource for current systems

a set of updates including

latest system versions of a wide variety of scientific software

published software algorithms from research work

associated papers and conference proceedings

collaborative software tools

collaborative publishing tools

organization of scientific materials

There needs to be a "daughter" doyen package which contains

a way to install/update software

support for

collaborative software tools

collaborative publishing tools

a method of "synchronizing" with published materials

live-CD distribution mechanisms

The breakdown of "Mother Doyen" and "Daughter Doyen" seems necessary
to achieve certain goals. The Axiom wiki is most useful in a shared
environment. Software and Research papers are best distributed in a
one to many, publish-when-ready model which argues for a server (the
mother doyen). However, research usually isn't published until it is
ready. Heavy computation and special purpose rewrites are best done
on a local machine. This argues for a client (the daughter doyen).
Since neither completely covers the issue it seems best to assume
both are needed and architect the solution to have both.

The Mother Doyen

The browser model of a front end, such as the Axiom wiki,
supports working on a remote server.

The browser model also supports collaborative work with the advantage
that there can be hyperlinking to other online work. This would be
especially useful if combined with online conference proceedings.
Perhaps there needs to be some "privacy" mechanism that would allow
only certain groups to view and modify pages that represent active
unpublished research. Further, a centralized model allows shared work
of indexing and cross-referencing that makes the repository more
valuable.

It would also be possible to develop and maintain systems such as
Axiom directly on the host. Suppose the user is writing a literate
program (using a tool such as noweb). This literate program can be
rendered using the wiki software (zope, python), or printed
(assuming a local tool chain: noweb->latex->dvips). The program can
be versioned automatically under a DARCS or CVS system.
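
As a rough illustration of the print tool chain mentioned above
(noweb->latex->dvips), the following Python sketch builds the command
sequence for one literate source file. The function names and the file
name algebra.nw are hypothetical, and the sketch assumes the standard
noweave, latex, and dvips programs are installed locally. It only
prints the commands; run_pipeline would execute them.

```python
from pathlib import Path
import subprocess


def print_pipeline(source):
    """Return the commands that turn a noweb file into PostScript.

    Each entry is (argv, stdout_path); stdout_path is None when the
    tool writes its own output file.
    """
    src = Path(source)
    tex = src.with_suffix(".tex")
    dvi = src.with_suffix(".dvi")
    ps = src.with_suffix(".ps")
    return [
        # noweave writes LaTeX to standard output, so it is redirected
        (["noweave", "-latex", str(src)], str(tex)),
        (["latex", str(tex)], None),
        (["dvips", "-o", str(ps), str(dvi)], None),
    ]


def run_pipeline(source):
    """Run the steps in order (needs the tools on PATH; not called here)."""
    for argv, stdout_path in print_pipeline(source):
        if stdout_path is None:
            subprocess.run(argv, check=True)
        else:
            with open(stdout_path, "w") as out:
                subprocess.run(argv, check=True, stdout=out)


# Hypothetical file name, shown only to display the command sequence.
for argv, stdout_path in print_pipeline("algebra.nw"):
    print(" ".join(argv), ("> " + stdout_path) if stdout_path else "")
```

Keeping the command list separate from its execution means the same
list could be reused by a wiki-side renderer or a local print job.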

The key advantages of the mother doyen are that software can be
centrally maintained, it could be demonstrated and used without
being locally installed, it could be run from systems which do not
have a port and it could be updated on a local machine with a single
request.

The Daughter Doyen

The browser model as a front end also supports working locally.

Using the live-CD approach (Quantian) is a good way to distribute and
advertise the available software. In addition it is useful for building
a "standardized" scientific platform with a defined set of available
tools like noweb, latex, darcs, a browser setup, as well as certain
scientific software and libraries that are difficult for a user to
install and configure. An initial implementation of this approach is
described in DoyenCD and a copy of the CD is available for download.

Another advantage of the daughter live-CD model is spreading awareness.
CDs can be given out at every scientific conference, of which there must
be at least one per week. If we think about scientific software beyond
the mathematical we can find other packages, such as Molgen, which is
used in Molecular Biology.

Yet another advantage of the live-CD approach is that the CDs can be
distributed in an educational environment, where both online (such
as the MIT online course work) and standalone, class-specific work
can be used.

The daughter CD can be easily updated with the latest software using
the yum update facility allowing the user to install new or needed
software as well as fetch and use research papers.

The 30 Year Horizon

I think it is important that we focus our attention on building for
the longer term. We need to think about the fundamental issue of how
the computer will affect scientific research, collaboration, and
teaching. We need to think about what is needed for the long term and
try to architect it now. Thirty years from now everyone will take this
work for granted. We might as well start on it now.

I tried to set up a mailing list on the axiom site but the
mailer program is broken. I'll search for another way to host
a mailing list.

You might not be aware that the Axiom MathAction wiki is able to
operate like a mailing list. Basically, anyone can "subscribe" to
the individual web pages or to the whole web site. First they must
identify themselves by clicking preferences (or logging in) and
specifying their name (or pseudonym) and email address. Then all
they have to do is click the "subscribe" link at the top right
side of the page. Any comments subsequently attached to a page
will be automatically distributed by email to all subscribers.

If you are subscribed to any page on the MathAction web site,
then it is also possible for you to use email to reply directly
to the emails sent out by MathAction. These replies will in
turn be attached to the original MathAction web page and again
sent out to subscribers. This way a chronological record of the
discussion is kept with the web page and later (if desired) this
discussion can be edited and kept for posterity.

So ... I have just set up a web page on MathAction for doyen.
To subscribe to it, all you have to do is click on the following
link

Then click subscribe and fill in your email address, click
change and then click the appropriate subscribe option
(this page) or (whole wiki).

If you want you can click preferences (on the upper left) and
fill in your name and email address. Click Save options and
the back button. This way it will remember you for next time.

Email sent to mathaction@axiom-developer.org with [doyen] in
the subject (like this one) will be automatically attached to
the doyen web page and also sent out to all subscribers.

Let me know if this is ok and if it works for you.

Second. I have to admit that I am not so fond of the project
name doyen and phrases like daughter doyen sound even
more strange to my ear. But for now we need at least a temporary
name so I'll go with it... :)

Third, although I see your point about promotion at conferences
etc. I have tried the Knoppix/Quantian distribution. It's got
a lot of "neat stuff" but really as a more or less experienced
linux user, my point of view is: "I wouldn't install it on one
of my machines" ... but if someone wants to try it, well ok...
I don't really know what level of experience one should assume
for the "average scientific computer user" these days but I do
believe we are moving more and more to the state where the issue
is no longer whether "linux or not" but really how easy is it
for me to install this thing. So in that regard it is pretty
hard to beat Debian apt-get (with the RPM format a close 2nd).

Then click subscribe and fill in your email address, click
change and then click the appropriate subscribe option
(this page) or (whole wiki).
Let me know if this is ok and if it works for you.

Actually, I was unaware of this feature. Very nice.
I subscribed to the doyen page.

I see you've arrived at the "MathAction" name for the wiki.
I'll start using it (although Bill Page's wiki has a nice ring :-)

Second. I have to admit that I am not so fond of the project
name doyen and phrases like daughter doyen sound even
more strange to my ear. But for now we need at least a temporary
name so I'll go with it... :)

Point taken but we need to name things and finding an unused name
that has any relation to anything is quite challenging. If I
recall the mother-daughter distinction came up in our phone
conversation. You can't blame me for all of it :-) The idea is
important though. If the terms are painful please suggest others.

Third, although I see your point about promotion at conferences
etc. I have tried the Knoppix/Quantian distribution. It's got
a lot of "neat stuff" but really as a more or less experienced
linux user, my point of view is: "I wouldn't install it on one
of my machines" ... but if someone wants to try it, well ok...
I don't really know what level of experience one should assume
for the "average scientific computer user" these days ...

In the beginning there was Rosetta. I collected and distributed FOSS
computer algebra systems and gave them away at every conference I
attended. The effort was entirely my own and the CDs were all at my
own personal expense (CAISS eventually supported the concept). I'd
hoped to get people aware of the range of systems and to start using a
standard distribution. There is a Rosetta document (I thought it was
on MathAction but I don't see it) that detailed the syntax differences
between the systems.

In general, this was well received and I received requests for
additional copies after every conference. One fellow set up a
mirror for Rosetta and built a windows version for distribution.

One issue that arises is that each algebra system has to be built
for a particular opsys distribution. I included a prebuilt runnable
version for RedHat and the sources if someone wanted to build for
some other system. I helped a couple people get algebra systems
running on non-RedHat so somebody actually used the Rosetta CDs.
But they only chose one of the systems to use and there was no
way to update the software easily.

Quantian is a similar idea (except using quantitative software). It
has two key advantages. The first is that all of the systems are
pre-built to run under Quantian so you don't have to assume that a
user has a Linux system to run the examples. I found that a lot of the
feedback was related to assuming the user had RedHat linux. Quantian
fixes this issue by including the system.

The second feature is that users can "try and buy" since Quantian can
be easily installed on a system. This is important because you can
set up examples in the user's area of work (e.g. physics) and they
can see the results without much personal cost. Windows is still
quite pervasive and the choice is either a canned demo or a trivial
redirect to a website. Neither one shows the power of these freely
available systems. Quantian steps around Windows for the initial try.

But both Rosetta and Quantian fail to provide a comprehensive,
attractive platform with wide support. It needs to be more than
a collection of "neat stuff".

Both Dirk and I tried by individual effort and experiment. The "take
away" lessons are useful. One is that I don't think we can change the
world without heavily marketing the idea and, your point, a central
location to focus the awareness. I rented the axiom-developer virtual
machine years after starting Rosetta and it never occurred to me to
make it into a Mother Doyen. Even if I had done so I was only
personally capable of distributing a few hundred Rosetta CDs at a few
conferences.

I also tried to pioneer the "proceedings on CD" for the ACM. ACM
still wants to bind up the result so that only subscribers can get at
the published papers. I was hoping to introduce the idea that all of
the papers (as well as the supporting source code) would be
electronically available to all. There is great institutional
resistance to this and it will take great effort with the ACM to
change (since they make money off the proceedings and
subscriptions). Science may be free but you can't get the results
(yet) without paying for them. This needs to change and the whole
Doyen approach might make a dent in the current thinking.

And the ACM, despite their mission, has not been promoting a
standardized computational science platform. If we can enlist
their help (as well as other societies) we can get much better
leverage. Of course, we have to build the infrastructure at the
same time. If eggs didn't morph into chickens the conundrum would
be solved :-)

.... but I do
believe we are moving more and more to the state where the issue
is no longer whether "linux or not" but really how easy is it
for me to install this thing. So in that regard it is pretty
hard to beat Debian apt-get (with the RPM format a close 2nd).

Would that were true but in fact I don't use apt-get or yum or
update because almost every system I touch is either firewalled
against it, lacking the software, or not net connected (like my
366MHz laptop). I do use RPM for installation of most things.

A second issue is that I know quite a few people (including computer
science professors at my college) who are unaware of these
facilities. The school uses a mix of windows, linux, apple, and
solaris but it is very rare to find a savvy linux user on my campus.

As a future direction your point is well taken. The Mother Doyen
(perhaps even the MathAction wiki) can be the primary collection,
distribution and update site. The Lindows distribution (a debian
desktop) has institutionalized apt-get in such a way that the naive
user doesn't know they are using it. Most mathematicians care only for
the result, not the underlying machinery. We need to make the whole
process very "user-affectionate" if we are to succeed in promoting a
standardized platform.

{apt-get, yum, update} also need to extend their range so they
support more than just the software. Computational science is going
to need online archives of papers which are trivially accessed and
cross-indexed. These tools need to know how to find and fetch
bibliographic references.

I've taken to rambling again. 'tis way past bedtime and the light
dawns on the morrow already.

In the beginning there was Rosetta. I collected and distributed FOSS
computer algebra systems and gave them away at every conference I
attended. The effort was entirely my own and the CDs were all at my
own personal expense (CAISS eventually supported the concept). I'd
hoped to get people aware of the range of systems and to start using a
standard distribution. There is a Rosetta document (I thought it was
on MathAction but I don't see it) that detailed the syntax differences
between the systems.

The RosettaStone document was only loaded on
http://test.axiom-developer.org since I was experimenting
with conversions from LaTeX to HTML. But I think it looks
pretty good, so I have transferred it here for easier
reference and updating. Perhaps we should split it into
several smaller pages...

Perhaps this only works if mathaction@axiom-developer.org is listed
in the To: field? The other thing is that for security reasons, the
sender of the message must first be subscribed to the MathAction
wiki.
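
The routing rules discussed in this thread can be summarized in a
small Python sketch. This is only an illustration of the behaviour as
described here, not the MathAction implementation: the function name
route_message and the subscriber set are hypothetical, and the To:
check reflects the guess above rather than a confirmed rule.

```python
from email.message import EmailMessage
from email.utils import getaddresses, parseaddr
import re

WIKI_ADDRESS = "mathaction@axiom-developer.org"
# A bracketed tag such as [doyen] anywhere in the subject line.
TAG_PATTERN = re.compile(r"\[([A-Za-z0-9_-]+)\]")


def route_message(msg, subscribers):
    """Return the page name a message should attach to, or None."""
    # Assumed rule: the wiki address must appear in the To: field.
    to_addrs = {addr.lower()
                for _, addr in getaddresses(msg.get_all("To", []))}
    if WIKI_ADDRESS not in to_addrs:
        return None
    # The sender must already be subscribed to the wiki.
    _, sender = parseaddr(msg.get("From", ""))
    if sender.lower() not in subscribers:
        return None
    # The bracketed tag in the subject selects the page.
    match = TAG_PATTERN.search(msg.get("Subject", ""))
    return match.group(1) if match else None


# Hypothetical sender address, used only for the example.
msg = EmailMessage()
msg["From"] = "Reader <reader@example.org>"
msg["To"] = WIKI_ADDRESS
msg["Subject"] = "Re: [doyen] liveCD progress"
print(route_message(msg, {"reader@example.org"}))
```

A message from an unsubscribed sender, or one addressed elsewhere,
returns None and would not be attached to any page.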

Dirk Eddelbuettel has pointed me at the build instructions for
Quantian. Quantian can also be built on DVD and I now have access
to a DVD burner (although no experience yet).

I have set up a Fedora Core 2 box.
I have set up a Quantian box.
I have a DVD burner.
I have the Rosetta pile of algebra systems.

Steve Grubb can build Fedora liveCDs.
Dirk Westfal at Linux4all was mentioned but I can't find an email addy.
Please copy him if you can find him.

WORK AHEAD

So, the basic vision is to build a Doyen (Linux Science Platform)
consisting of two parts, a "Mother Doyen" which is a wiki website
and a "Daughter Doyen" which is a liveCD. The platform should support
a wide range of scientific software and science activities.

The following steps need to happen to put together a prototype:
1) Build the daughter doyen
   a) build a liveCD example
      Currently Quantian is built on Knoppix/Debian but is
      agnostic in terms of platform.

      1) explode/rebuild a liveCD (just for the experience)
      2) explode/rebuild a liveDVD (same)
      3) make a local MathAction wiki
         - involves configuring Apache, setting up necessary
           local packages, setting up zope, tailoring the
           local html to look professional and be clear.
      4) mod the liveCD to include the MathAction wiki
      5) rework the local MathAction wiki to include links to
         the mother doyen on the axiom-developer version
         - make some decisions about what to include vs
           what can be downloaded.
      6) work out an example of yum/apt update of a math package
         from the host to a local copy.
         - figure out how to host yum/apt.
         - make the packages available
         - test the download/install process per package
      7) work out an example of "publish/upload/CVS" a research
         paper that includes runnable examples
         - zope code to communicate whole pages
         - CVS/DARCS backup of changed pages

      At this point we have a working version of the daughter doyen.
      A person could boot it up, start the wiki, get an updated
      package from the host, write a paper, and publish it back to
      the host.

   b) build a fedora liveCD example
      Since RedHat is in the game we'd like to use Fedora as
      the basis for the platform
      1) explode/rebuild a liveCD of fedora
      2) perform steps 2-7 above
      ...
      8) set up a bugzilla to handle problems
      9) discuss ways to "customize" the platform for particular targets
         e.g. a physics platform vs a chem platform vs math
         Ideally we'd have a selection tool thru the wiki.

      At this point we have the ability to support a larger number of
      users, bug feedback, and smaller, more focused target markets.

      At this point we can start a campaign to get the daughter doyen
      distributed at conferences, thru mailings, and thru schools.
      Users have the ability to search (and possibly contribute to)
      electronic collections.

   e) test market the idea
      We'd like to have a few live users so we could pick a
      conference and give out copies at the conference. Also
      useful would be a presentation at the conference.
      1) reproduce in small quantities
      2) give it out at a conference
      3) get feedback.
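
The host-to-local update step in the plan above (a yum/apt update of
a math package) amounts to comparing what the mother doyen offers
with what the daughter has installed. The Python sketch below shows
that comparison only; it does not call yum or apt. The function names
and the version numbers are made up for illustration, and it assumes
simple dotted numeric version strings.

```python
def parse_version(text):
    """Turn '1.2.10' into (1, 2, 10) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def plan_updates(mother, daughter):
    """List (name, local_version, host_version) for packages to fetch.

    Both arguments map package name -> version string. A package that
    is missing locally is reported with local_version None.
    """
    plan = []
    for name in sorted(mother):
        host = mother[name]
        local = daughter.get(name)
        if local is None or parse_version(local) < parse_version(host):
            plan.append((name, local, host))
    return plan


# Hypothetical manifests; the version numbers are invented.
mother = {"axiom": "3.4", "noweb": "2.10", "molgen": "5.9.1"}
daughter = {"axiom": "3.0", "noweb": "2.10"}
for name, local, host in plan_updates(mother, daughter):
    print(name, local, "->", host)
```

Comparing versions as integer tuples avoids the string-comparison
trap where "2.10" would sort before "2.9".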

From Dirk Eddelbuettel I'm going to need support for items like
building a liveCD, adding items to menus, etc. Also need support for
moving to a Fedora base.

From Bill and Bob I'm going to need support for creating a clone of
the MathAction site locally and ideas about publish/upload mechanisms.
We may need support for modifications to allow whole page/subtree
uploads from a daughter wiki to a host wiki.
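
One possible shape for a whole page/subtree upload is sketched below
in Python: the daughter computes a checksum per page under a chosen
root, and only pages that are new or changed relative to the host are
sent. This is an assumption about how such an upload could work, not
an existing MathAction or zope feature; the function names and page
names are hypothetical.

```python
import hashlib


def subtree_manifest(pages, root):
    """Map each page at or below `root` to the SHA-256 of its text."""
    base = root.rstrip("/")
    prefix = base + "/"
    return {
        path: hashlib.sha256(text.encode("utf-8")).hexdigest()
        for path, text in pages.items()
        if path == base or path.startswith(prefix)
    }


def pages_to_upload(local_manifest, host_manifest):
    """Return the paths that are new or changed relative to the host."""
    return sorted(
        path for path, digest in local_manifest.items()
        if host_manifest.get(path) != digest
    )


# Hypothetical daughter wiki content.
local_pages = {
    "Doyen": "front page",
    "Doyen/Paper1": "draft with runnable examples",
    "Sandbox": "scratch",
}
local = subtree_manifest(local_pages, "Doyen")
host = {"Doyen": local["Doyen"]}  # the host already has the front page
print(pages_to_upload(local, host))
```

Pages outside the chosen root (Sandbox here) never enter the
manifest, so unpublished scratch work stays on the local machine.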

From Steve I'm going to need guidance about building a Fedora liveCD.

From Ed I need some discussion about his offer to move the Quantian
domain name. Perhaps we can point it at the axiom-developer IP address.
I need to know what support has to exist.

Dirk Eddelbuettel also raised the possibility of a P2P architecture
rather than a mother/daughter (hub and spoke) architecture. I haven't
thought thru this comment but I throw it out for discussion and
thought.

With the proper set of programs in the bootable CD this could essentially
be the Daughter Doyen. There is, however, a fair amount of work to achieve
that "simple" result. For now I continue to trudge the path I previously
laid out.

Second, I managed to set up a system that can build and burn Live CDs.
It takes an amazingly muscular machine so I had several failed attempts
before I consed together a large enough horse. The Live CDs partially
boot; the cloop argument is wrong but I don't know where this is stored
yet. Once that is solved I will have a filesystem and the boot should
complete.

Patrizia Gianni is a researcher at the University of Pisa in Italy.
They have been working on AJCA, an Active Journal for Computer Algebra
which is an approach to writing papers which include executable content.
(http://mega.dm.unipi.it/submissions.html)

I think this is directly in line with the idea of a science platform
we discussed and might be of interest to you.