TWiki upgrades are currently very painful. Generally you only want to upgrade the code, not
modify all the TWiki webs. Also, replication of content between servers is complicated by the
fact that things aren't all in the same subdirectories.

As a result, what I've done is move everything to do with a particular web into its
own directory. Since TWiki doesn't support this kind of setup, a whole bunch of
symlinks are in place as a temporary measure. E.g. my directory structure is:

Symbolic linking of directories can be useful in certain environments. Symlinks work fine on Unix, but are not available on Windows. Also, you need to create two new symbolic links each time you create a new web.
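The two-symlinks-per-web step can be sketched as follows. This is a hypothetical illustration under /tmp, not the actual install paths; the per-web `webs/` layout is the one proposed in this discussion, not standard TWiki.

```shell
# Sketch: give a web its own directory, then symlink it back into the standard
# TWiki data/ and pub/ trees so the stock code still finds it.
# All paths are illustrative only.
root=/tmp/twiki-web-demo; rm -rf "$root"
mkdir -p "$root/webs/Web1/data" "$root/webs/Web1/pub" "$root/data" "$root/pub"

# Two symlinks are needed per web: one for data and one for pub
ln -s "$root/webs/Web1/data" "$root/data/Web1"
ln -s "$root/webs/Web1/pub"  "$root/pub/Web1"

ls -ld "$root/data/Web1"
```

On Windows (as noted above) this fails, since symbolic links are not available there.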

Another way is to keep the main installation and create some upgrade directories to test drive an upgrade before going live with the new version. Sample scenario:

Existing production environment:

bin directory: /user/twiki/bin

data directory: /user/twiki/data

template directory: /user/twiki/template

pub directory: /user/twiki/pub

Test upgrade environment:

bin directory: /user/twiki/bin.upgrade

data directory: /user/twiki/data (same as production!)

template directory: /user/twiki/template.upgrade

pub directory: /user/twiki/pub (same as production!)

The config file in /user/twiki/bin.upgrade is set to use the same data and pub directories, but different bin and template directories. This way it is possible to test an upgrade first, then go live by renaming two directories and editing one config file.
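The go-live step could look something like the following sketch. Directories under /tmp stand in for /user/twiki, and the `view` file is just a marker to show the new code ends up in place; this is an illustration of the renaming idea, not a tested upgrade procedure.

```shell
# Sketch of the go-live step: swap the .upgrade bin and template directories
# into place, keeping the old ones around as .old for rollback.
root=/tmp/twiki-upg-demo; rm -rf "$root"
mkdir -p "$root/bin" "$root/template" "$root/bin.upgrade" "$root/template.upgrade"
echo new > "$root/bin.upgrade/view"       # marker standing in for the new scripts

mv "$root/bin"      "$root/bin.old"
mv "$root/template" "$root/template.old"
mv "$root/bin.upgrade"      "$root/bin"
mv "$root/template.upgrade" "$root/template"
# ...then edit the config file in the new bin/ to drop the ".upgrade" paths.

cat "$root/bin/view"   # → new
```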

Your breakdown assumes that everyone uses the exact same codebase and the same
templates... (I know this only hits the default templates, but it also
hits the templates for any subdirectories a new release contains.) For a locally customised
TWiki with a local look & feel, and also if you want to take the setup and duplicate it elsewhere,
the steps are rather more of a pain - especially when you consider that wikicfg.pm has been
renamed, which leaves no constant place to move things. The plugins system, whilst useful,
needs installation instructions (hence why I'm working on TWikiUnixInstaller) - currently I'm
hardcoding the list into Plugins.pm, so that's clearly not working correctly on my system,
and it's in a state of flux - as indicated by the problems the DrawPlugin currently has
due to this.

BTW, this discussion is simply designed to illustrate more reasons why the separation
would be very useful, not to say "that's bad, that's bad". TWiki is good, and has
a lot going for it.

Existing production environment:

bin directories:

/user/twiki/bin/

/user/twiki/bin/TWiki

/user/twiki/bin/Plugins

data directories:

/user/twiki/data/Web1

/user/twiki/data/Web2

template directories:

/user/twiki/template

/user/twiki/template/Web1

/user/twiki/template/Web2

pub directories:

/user/twiki/pub/Web1

/user/twiki/pub/Web2

What gets touched during an upgrade?

bin directories:

/user/twiki/bin/

/user/twiki/bin/TWiki

/user/twiki/bin/Plugins

data directories:

/user/twiki/data/Web1

/user/twiki/data/Web2

template directories:

/user/twiki/template

/user/twiki/template/Web1

/user/twiki/template/Web2

pub directories:

/user/twiki/pub/Web1

/user/twiki/pub/Web2

What do you need to touch in order to duplicate Web1 & Web2 onto another system where
TWiki has already been installed? (Assume this also means plugins need copying over, and
some specialised scripts (a la view/edit, but for other non-TWiki stuff) also need copying...)

And bin needs further touching after untarring, due to local config info being stored
in there - hence the etc directory I'm starting on. (See also TWikiUnixInstaller.)

I'm actively working on modifications to make this part of the code base when I get
the time, but in the meantime, symlinks are in place to make life somewhat simpler.

Why is this important to me? Well, I use TWiki on my laptop and also have it on a desk
machine (largely so that I won't get bugged on holiday). If I could rsync/CVS sync/whatever
the two via one simple set of directories, it'd be great. (Essentially I am using it
as a ReadWriteOfflineWiki but want to use it as a WikiCluster, partly with reference
to the desire for WebsitePublishing, and the current directory structure makes it a
complete pain!)

I.e. this isn't a theoretical need; it is in practice something that'd be useful to me
right now - hence why I'm working on it! (E.g. dumping a whole web into a CVS tree, or
making it available as an rsync share, etc.)

If you're going to plan this fully, then there's an important web security principle I always expound: 'only documents under the document root'.

This means that only pages/files that you want actually served or accessed by the httpd server should be visible in the web tree - '.html' files or cgi-bin scripts. Everything else, including configuration files, modules, plugins, libraries, includes, log files, '.changes' files, RCS versions, READMEs, documentation etc., should be outside the document root, or someday it will be retrieved, and something bad will happen.

This is a very simple principle, and does wonders for application security, but for some reason it's very hard to get people to understand and implement thoroughly.
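The principle can be sketched as a layout plus a quick audit. All paths here are hypothetical; the idea is simply that only bin/ and pub/ live under the document root, and a find over the docroot should never turn up config files, RCS ,v files or .changes files:

```shell
# Sketch: everything except served content lives OUTSIDE the document root.
root=/tmp/twiki-docroot-demo; rm -rf "$root"
mkdir -p "$root/docroot/twiki/bin" "$root/docroot/twiki/pub"      # served
mkdir -p "$root/twiki/data" "$root/twiki/lib" \
         "$root/twiki/templates" "$root/twiki/etc"                # never served
echo 'secret' > "$root/twiki/etc/TWiki.conf"

# Quick audit: no configs, RCS files or .changes files under the docroot
find "$root/docroot" \( -name '*.conf' -o -name '*,v' -o -name '.changes' \) | wc -l
```

If that count is ever non-zero, something retrievable has leaked into URL space.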

Not sure if we need to restructure the directories of the TWiki distribution as extensively. Synchronizing webs can be done with the current directory structure. This raises the question of WikiClusters.

My TWiki started on my laptop. Its prime home - for better sharing - is now on my desktop. Having a central location
for the system massively simplifies sharing. However, this means I'm regularly faced with a two-way sync situation.
The ideal for me is an N-way sync. CVS is designed for an N-way sync, but I don't want to put the
code in - just the stuff I want to N-way sync: the web content. As a result my preferred method would be:

All the webs structured as above - since it simplifies the stages below...

The editing clients (the TWiki'd web servers) act as clients to a CVS repository. (This is opposed to the idea of TWiki itself using CVS instead of RCS.)

When the editing clients try to commit their info to CVS, any failure to commit results in an email being sent to the person who last edited the page on their local system, so that they can go in and do a local edit, and then have the ability to force a central commit at that point in time.

Backing up a single web as a single tarball is also a damn sight simpler from an admin POV. OK, it's not
complicated right now, but suppose TWiki became the de facto setup for, say, SourceForge, whereby they had
(say) 5000 TWikis on the system... From a user-separation POV, having support for a separated setup
is massively beneficial.

What this really means is that it would be really nice to have wider support for a file system layout
different from the present one, while staying compatible with the current setup... Also, just because the above layout is good for me
and the setup I'd use it for, it doesn't mean that it's ideal for all, any more than the current TWiki default
will be good for all. (Personally I hate the default, since it makes my life awkward.)

I've been thinking about this some more, and playing around with rsync more,
and have come to the conclusion that there is a need for at least 3 things:

Local Configuration

Place for new code/version drops

Place for local data.

CrisBailiff's ideas on this, and the way you normally have a web server
configured, really pull all these thoughts together. The current TWiki
distribution munges all three of these things together in one location - akin
to the situation you would have if you decided to put all the Apache modules,
binaries & config in the same directory. The reconfig I
suggested above is not actually that much better in this regard. As a result I'd
propose something like this:

The key things here are putting a twiki.http.conf file into the
twiki/etc directory, with the TWiki.conf file also in the etc directory.
The key benefit this really brings is to someone using rsync to distribute
(say) read-only webs, or to sync the code in use/available in an
organisation.

As I indicated above, for this new structure to work without "forcing" a restructure
on people, we need the ability to be more flexible about the directory structure.
To support this on my local TWiki, I've added into TWiki.pm a dataFile function that
looks like this:

This is largely to support web names with multiple "."s in them, and it
also allows people to have a "topic" to include that looks like this:
"this/that/other/etc". As a result, for this portion I'm creating a wrapper
that splits these formats up into their parts first of all.

I suspect that this lot should really go into (or is already in) TWiki::Store,
and that a lot of this has already been done, but I'm in the process of removing
all references to $dataDir from the main 'binaries' as well... (After all,
it's easier to change one function to be TWiki::Store compliant than a couple
of dozen files.)

I'm mainly doing this since I don't want to create a code fork and don't want to
break everyone else's install, but I really do want the ability to do the
above. Also, having this configurable makes a lot of people's lives
easier. (It also simplifies translations, since if you have the two patterns
you can very simply create a mapping piece of code from them.)

It's probably obvious from the above that this setup was done before the
CVS code changes that restructured TWiki to have lib/ and tools/ directories...

This structure now makes mirroring and setting up of new TWiki servers a simple 5-10 line
list of things to type: edit the local config info, rebuild the twiki.conf file in etc, and merge
it into httpd.conf. This setup makes roving laptop access (from Unix-based
laptops for now, since that's what I use) practical, lightweight and fast.

Allows a distributed TWiki, with live replication.

Allows incremental code upgrades, and code syncing using rsync across an organisation.

Implementation of a new web is as simple as untarring a predefined web or cp -R'ing a _default web.

Unix permissions can allow file access to the tree by the organisational owner where appropriate, to allow end-user editing of templates.

Provides an easy structure for including the genhtml plugin's HTML directory, for using TWiki to edit/publish a website - allowing static site generation, downloadable versions, and simple PDFing of a web.

Backup and recovery of everything to do with a single web is much simplified.

Internally (*), using this and genhtml, we now have a static version for general browsing with an edit button pointing at the dynamic version, making TWiki look much, much more like a standard web page and, more importantly, as QUICK as a normal web page.

Also we have a directory for regular "jobs" triggered via cron, which automates the publishing and any other jobs.
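The "new web is just a cp -R of _default" point can be sketched as follows. The per-web `webs/` layout and the paths under /tmp are hypothetical stand-ins:

```shell
# Sketch: creating a new web under the proposed per-web layout by copying
# the _default template web. Layout and paths are hypothetical.
root=/tmp/twiki-newweb-demo; rm -rf "$root"
mkdir -p "$root/webs/_default/data" "$root/webs/_default/pub" \
         "$root/webs/_default/templates"
echo "%WEB% home" > "$root/webs/_default/data/WebHome.txt"

# One copy creates the whole web: data, pub and templates together
cp -R "$root/webs/_default" "$root/webs/Projects"

ls "$root/webs/Projects/data"   # → WebHome.txt
```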

Not integrated with current TWiki. My code fork (not released externally, since I don't want to split the community!) prevents me adding extra code to TWiki as is, really.

1) Symlinks are still used. By and large my changes are compatible with the current release. The TWikiUnixInstaller I wrote is designed to take a standard TWiki and "distributify" it (the main aim of the data and code separation in the first place).

2) Upgrades are currently awkward. If there was a commitment that the benefits of separation were viewed as useful, I'd be willing to make this easier for people (as I have for people making new installs via my installer...).

3) I've no idea how many people would like the feature. I know people here like the PDF'ing and downloadability etc. it provides as side effects, though. The distributed nature of the TWikis this allowed (UK, US East Coast, US West Coast), and the resulting responsiveness, also proved popular in practice (much more popular than centralised systems) - especially given local editing for local webs.

1) If this was accepted in, I'd recommend it as a permanent change, with symlinks for compatibility in the next version (possibly the one after next - whichever "next" we're talking about at the moment - no rush).

2) All of the code for compatibility links is in the TWikiUnixInstaller.

3) Consider this a vote in favour.

(*) I specifically can't mention my current employer's name here, since it's not allowed by the rules of the organisation. People are welcome to ask directly though. Much as I'm in favour of separation of code & data (and have my changes on a personal CVS server simplifying this), I wouldn't be surprised if this doesn't get accepted at this stage (even though I've been using this structure, with lots of benefits, for over two years...).

Anton: "Directories under Web1 conflict with the model of hierarchical webs" - I agree in principle, but if you make the assumption that webs will continue to use an initial capital letter, and we rename Daily to daily, I think we can avoid conflict. Alternatively we could insist that all directories inside the web ones have a prefix that will segregate the namespace. Does that alleviate your concern? Do you have any others?

John: I'm quite serious about this. If TWiki is to be aimed at the corporate market, and we want people to run it standalone, we have to entertain what corporate users use, i.e. Wintel XP. Now, my impression of XP is that it is a lot better at separating code and data than any of its predecessors. The beauty of Michael's work is that it goes some way towards implementing ReadWriteOfflineTWiki.

In the long run I would probably change the implementation to use a message based middleware instead of a file copy middleware (i.e. rsync) but I believe this can be done later with little trouble.

Martin, good stuff if we're thinking small perturbations to the extant code. .....
I'll just address point #1 here.

I'm thinking in terms of a config file that says where everything is and what to use.
This will simplify having storage plugins that deal with, for example, out-of-band
access control, preferences and metadata, as well as how to deal with the migration from the present system - what I call "compatibility mode" - to something more suited to a corporate setting.
In fact I think it can be done without things like rsync.

I've been playing with YAML recently and it's opening up a
plethora of ideas for me. One of them relates to this issue of code and data.
(Actually this topic is more about layout.) It also answers questions about symlinks vs. putting
it in a DB or hash, or on a CD-ROM or a remote file server.

The idea is that - modulo things like RSS - each web is configured with its own idea about storage.
Right now I'm working in raw YAML, but it can be "compiled" into Perl structures for performance.
I'll illustrate in YAML because it's human-readable - a lot more so than XML - and is intelligible
to "mere mortals".

The current system I'm calling a member of the Filesys group.
Full compatibility mode is everything in-band.
In the "Next Generation" that will be Storage::Compatible, which will just be the current code
in an OO wrapper. See, 100% compatibility is easy!

Next, there is Storage::Filesys, which is everything out of band but still stored in the file system.
Options will exist for different details that match the corporate needs.
For example, once something is "deleted" by moving it to Junk, its history is thrown away.

Then there is MySQL or another database. It will need appropriate parameters.
Details to be worked out.

That raises a point about Filesys.
If the database name and parameters say "where" the topic is stored, what about Filesys?
OK, so we need:

# Only set here things that don't need to be set in the per-web preferences
#
Main:
  title: "Sys"                  # rename Main to Sys
  searchable: true              # this is the default; can be overridden in WebPref
  skin: tiger                   # default skin for this web; can be overridden in WebPref
  autolink: true
  immutable: true               # can NOT be overridden in WebPref
  storage: Compatible level0    # short form
  location: "/twiki/data/MainRenamed"

Points to note:

It may seem wordy, but it makes configuration more visible.

All that Michael discusses about file system layout is now soaked up into this.

Issues about where to put CVS are now non-issues. The code in Storage.pm that scans the directories doesn't apply.

The name on the tabs in, for example, WinXPSkin, should follow title... =$webobj->{title}=. However, as The White Knight made clear, its name is not what it is called, nor what its title is.

Upgrade is via an engine that reads this config.

Actually the engine does more than just upgrades. It must do conversions too.
Suppose you want to move:

Then give it the second config as a parameter and say go. Details to be worked out, of course.

Yes, I know this isn't a complete design. I've only been playing with YAML for two days.
I'm doing a proof of concept for skins and preferences. Since it can use Data::Dumper
and input from tied hashes, it fits in with all I've suggested for storing preferences out of band.

I also suspect it may make an excellent medium for supplying updates.
It can specify which topics have an update available in the set:

.....
topicupdates:
  TWiki:                        # its title may be 'Sys' but this is its name
    WebRssBase:
      needversion: any
      thisversion: 1.2.1
      location:                 # list of alternative locations
        - "File://%UPDATEBASE%/data/TWiki/WebRssBase.txt"
        - "Ftp://twiki.org/data/TWiki/WebRssBase.txt"
        - "Ftp://mirrors.mirrors-R-us.com/twiki/1.2/data/TWiki/WebRssBase.txt.1.2.1"
      flags: force

As you see, it deals with the ability to update from local media such as a CD-ROM or
an expanded TAR or ZIP package, as well as from an FTP site.
Of course, that's just a preliminary idea. Lots of details to work out before a real design.

You see how readable all this YAML is! And you can also convert it to a Perl data structure
or into a DB_Hash for fast loading. Neat, eh?

My comment above about PCs wasn't intended to be taken too seriously. I do believe code and data separation is very important, but I've so often seen it muddled up on PCs. Mind you, I haven't found TWiki too bad in this regard, although there are certainly things that, in an ideal world, could be better.

I think making TWiki upgrades easier and synchronisation of multi-site TWikis possible are both excellent goals. Both of these, I can see, would need data changes. I'm still unclear, though, what those changes should be. [I'm not. I have a clear vision of this, and of how to make it very straightforward. I just need to get it all written up - AJA]

Personally the thing which appeals about TWiki is this virtuous pair of qualities:

Low requirements for the server

Low requirements for the client

Requiring DBI, a database, a YAML (or XML) parser, and various CPAN modules by default is not something I'd advocate. Also, using either XML or YAML or even Perl files for the filesystem layout strikes me as a problem. (They all break TWiki's low-requirement aspects.)

Regarding having config files for saying where things are - sure, no problem - config files are important; the TWiki installer I wrote asks questions and then generates one, after all. If you don't use an approach whereby someone can point (say) a Legato client at a specific directory tree and say "back up that directory hourly/nightly", and can back up another tree (say the code tree) weekly/monthly, then it's going to be awkward. Whilst filesystem-based approaches have problems, the advantages to systems admins cannot be overlooked. Likewise, in the case of config files, TWiki's main config is currently pretty useless to a systems admin - no shell script he/she uses can directly get at the values! (The installer mentioned above uses a config file that is both shell- and Perl-parsable - something neither XML nor YAML can claim...)
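A config of the kind described - readable both by sourcing from sh and by a trivial key="value" parse in Perl - might look like the following. The file name and variable names are hypothetical, not the installer's actual format:

```shell
# Sketch of a config file usable from both sh (via ". file") and Perl
# (via a trivial /^(\w+)="([^"]*)"/ line parse). Names are made up.
rm -rf /tmp/twiki-conf-demo; mkdir -p /tmp/twiki-conf-demo
cat > /tmp/twiki-conf-demo/twiki.conf <<'EOF'
TWIKI_ROOT="/user/twiki"
TWIKI_DATA="/user/twiki/data"
TWIKI_WEBS="/user/twiki/webs"
EOF

# A shell script (or a backup cron job) can use it directly:
. /tmp/twiki-conf-demo/twiki.conf
echo "$TWIKI_DATA"   # → /user/twiki/data
```

Because the file is plain assignments with no code, any backup or sync tool wrapper - not just TWiki itself - can discover where the trees live.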

All said and done, if anyone wants the code & data separation above, they can just grab the installer (which, now that I've got a chance, I'm updating to the Beijing release). Oh, nearly forgot... hierarchical webs work just as well with this layout as they do with the default TWiki install.

My current model is that - like so many other systems and applications - there is a kernel of the Next Generation TWiki, a core, and then add-ons. We can see that in the current implementation. I'm just drawing different boundaries.

And actually, one of the YAML storage formats IS Perl-parsable.
The more I work with YAML, the more impressed I am.
The limitation is imagination.
One way to use it is a "sysgen" - parse the YAML file, perhaps when Apache starts or by
hand whenever changes are made, and store it as a DB_Hash. Fast access.

The point was that config files which relate to file system divisions (code/data) should be both shell- and Perl-parseable. TWiki currently breaks this requirement (well, most TWiki installations do; the one I use/maintain doesn't), and neither YAML- nor XML-based configs meet it. By meeting this requirement for data & code separation, you allow syncing & backup to be done with many tools, not just the ones we might supply. (rsync is just one tool; there are plenty of others out there.) The system described above works, works with many different types of backup and syncing mechanisms, has code available for those who wish to use it, and has been used in a production environment with multiple sites, with multiple edit locations.

On the surface, there are structures that can be represented in something Perl-parsable -
such as Data::Dumper or YAML's 'ysh' output, something that looks
like the Perl code of a data representation - which cannot be represented in shell.

On another level, both are adequately powerful languages that can parse anything.
YAML is written in Perl; Glade is written in Perl. I suppose one could write a parser for C++
in shell. I've seen (and written) some very large applications in shell (in the days before Perl).

If you're running both Perl and shell, you're probably on a Unix machine and can do a lot of
pipelining. You might be using 'make' to automate things, as well as 'expect' and things like Wget
or cURL.

As for 'many types': my model allows for things such as Storage::rsync, and there is no reason,
given properly specified interfaces, that modules can't use external code, as is already done by
the RCS code.

The point of the separation I discussed above is to make life at the filesystem level simple for any systems admin - not for "elegance", or "wonder", or "beauty", etc. YAML, XML and even Perl files are essentially shell-hostile configuration files. Wget is not on all systems, expect is not available on all systems, many systems admins in many corporate environments don't like putting Perl on systems unless they have to, and many refuse to install CPAN modules.

Hence having something simple and agnostic enough for even sh to parse, IMHO, matches TWiki's aims of low requirements on the server and client. I'm not saying that you can't do things - I'm saying that if you did some things, I wouldn't take your code. Having had to build & maintain a distributed TWiki running on 4 different OSes with 4 majorly different installs, having a totally agnostic config readable (actually runnable) by the shell makes a big difference. I've seen RTSP implemented in sh; I don't view that as a valid config file format though, despite its having a key: value system.

I think I've said everything on this I can now, and AFAICT this discussion has very little to do with DataAndCodeSeparation anymore - it's more to do with configuration than actual separation - so I'll stop at this point.

The inclusion of modules such as CGI, IO::File, File::Copy, File::Spec, Algorithm::Diff,
Time::Local, Fcntl and IPC::Open3 (to take a grep of my twiki/lib tree)
wasn't a problem. As discussed elsewhere, any needed modules can be packaged with TWiki.

As for syncing multi-sites, the example update engine I described would use the Tng Storage::URL
(or perhaps just Storage::Ftp) to achieve that. Push and pull.

Take a few conceptually simple building blocks and permute them in innovative ways ... apply
imagination.

I think Michael has a point, Anton; the conversation has gone to a longer-term view than is workable for the code fork I am asking Michael to bring into CairoRelease. The other stuff you talk about sounds laudable, but it does need to be talked about in a different topic (perhaps you can refactor the long-term proposals you are suggesting?). Which brings us back to the short term. With this in mind, would you mind rethinking your answer to this question?

Anton: "Directories under Web1 conflict with the model of hierarchical webs" - I agree in principle, but if you make the assumption that webs will continue to use an initial capital letter, and we rename Daily to daily, I think we can avoid conflict. Alternatively we could insist that all directories inside the web ones have a prefix that will segregate the namespace. Does that alleviate your concern? Do you have any others?

Much appreciated.

Such confusion (between short/medium-term and longer-term, architecturally important matters) is exactly why initiatives such as PleaseCreateNewWeb are so important. For confusion causes misunderstanding and disharmony, both of which are bad. We don't want bad stuff happening here.