Monday, October 25, 2004

NAME

lab 15 - cluster file

DESCRIPTION

I'm still curious about layering of filesystems.
Ds(3) was an example I looked at before writing
signalfs. Another example was cryptfs (lab 2). Ds
currently resides in /os/port/devds.c. A while ago
I moved it into /emu/port/devds.c but hadn't made
any use of it. I have wanted to use it for a
rudimentary cluster file system, so I ported ds to
limbo to play with distributed files and further
explore files that are built from layering of other
file systems.

I translated the C code and used styxservers to
manage the simple, one level namespace. Here's
some testing. This is really as much as I've
tested it so far.

I read the googlefs paper again today. With that
in mind a cluster file system could be pieced
together using some inferno components.

A kfs(4), or any standard fs, represents the
master namespace. Each file contains only a ds
configuration--the chunk IDs and partitioning
info. All the inferno nodes write to the master
registry(4) the chunks they will serve. Ds grows
files by reading chunk IDs from a master process
that uses the registry to allocate new chunks. A
client navigates the master namespace to the file
containing the ds configuration and mounts ds in
its namespace. It then has a distributed file
which communicates directly with the nodes
storing the chunks on the network.
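The lookup a client would do against such a ds configuration can be sketched as follows. This is an illustrative Python sketch, not the dsfs.b code: the chunk size, node addresses, and chunk IDs are all assumptions.

```python
# Hypothetical ds-style lookup: given the chunk size and an ordered
# list of (node, chunk id) pairs from the configuration file, find
# which node serves a given byte offset of the distributed file.
CHUNK_SIZE = 64 * 1024  # assumed chunk size

def locate(chunks, offset, chunk_size=CHUNK_SIZE):
    """Return (node, chunk_id, offset_within_chunk) for a file offset."""
    idx = offset // chunk_size
    if idx >= len(chunks):
        raise ValueError("offset beyond end of file")
    node, chunk_id = chunks[idx]
    return node, chunk_id, offset % chunk_size

chunks = [("node1!styx", "c0"), ("node2!styx", "c1")]
print(locate(chunks, 70000))  # falls in the second chunk
```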

CONCLUSION

I haven't given much thought to the deeper
problems. How would multiple writers on different
clients append to the same file? Here is the
source for dsfs.b

Suppose that tick.bt is a log of time spent on
tasks where each record is the timestamp, task and
number of seconds spent on the task that instance.
number of seconds spent on the task. Rng produces
one or more date ranges. Remote
converts type /fd to an endpoint. Mapreduce will
then split a date range, such as one year, into M
smaller date ranges. For each subrange it calls
rexec passing it the address of an available node,
the subrange and map function as parameters.
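The splitting step can be sketched in a few lines. This is a Python illustration of the idea only; the real mapreduce would be limbo shell and rexec.

```python
# Split a date range into M roughly equal subranges, one per map task.
from datetime import date, timedelta

def split_range(start, end, m):
    total = (end - start).days
    step = total // m
    bounds = [start + timedelta(days=i * step) for i in range(m)] + [end]
    return list(zip(bounds[:-1], bounds[1:]))

# e.g. one year split into 12 subranges, one per map worker
subranges = split_range(date(2004, 1, 1), date(2005, 1, 1), 12)
```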

The output from all the map functions is directed
to R endpoints. The R partition function could be
hash(key) mod R as suggested in the paper. Then
mapreduce rexec's a reduce worker, which reads in
all the data from the endpoint, sorts it, and for
each key calls the reduce function with the key
and list of values (or /fd) as parameter. In this
example tock, the reduce function, sums all the
time values for a task and outputs the total.
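The partition and reduce steps described above look roughly like this. A Python sketch under stated assumptions: the record shape (task, seconds) and the summing reducer stand in for tickfs triads and tock.

```python
# Partition: send each map output record to one of R reduce endpoints.
R = 4

def partition(key):
    return hash(key) % R    # hash(key) mod R, as in the paper

# Reduce worker: read all records from one endpoint, sort them, and
# for each key sum the values -- here, total seconds per task.
def reduce_worker(records):
    totals = {}
    for task, secs in sorted(records):
        totals[task] = totals.get(task, 0) + secs
    return totals
```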

I've made the example specific to tickfs and the
use of triads merely because I already have these
tools and they make it easier for me to grasp. The
google paper uses key, value pairs. I'm ignoring
all the other factors they consider, such as fault
tolerance, locality, and much else.

Here's another example. In the distribution on my
homepage I include a command nsearch for searching
a tickfs index. The command is given a list of
keywords. Given the first keyword, which might be
a date range, it builds an initial set of keys. It
then partitions this set among a fixed number of
threads. Each thread tests the record coming in on
a channel against the index and the search term
given as parameter to the thread. The reduce
function would be an identity function, simply
passing through its input. This is a map, filter,
reduce pipeline. Alphabet seems to provide the
tools to express this whole query and more on the
command line, including distributing the
processing among nodes.
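The shape of that pipeline, one filter stage per keyword with an identity reduce, can be sketched like this. The names and record format are illustrative only, not nsearch's actual interface.

```python
# A map-filter-reduce pipeline: each stage drops records that don't
# match its search term; the reduce step just passes the stream through.
def filter_stage(stream, term):
    # bind term at call time so each stage keeps its own keyword
    return (rec for rec in stream if term in rec)

def search(records, terms):
    stream = iter(records)
    for term in terms:          # one filter stage per keyword
        stream = filter_stage(stream, term)
    return list(stream)         # identity reduce
```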

The implementation needs somewhere to look up the
list of available nodes. Administering all the
nodes would need some fancy fs that managed the
status of all executing workers. I'd keep this to
an absolute minimum for now.

CONCLUSION

This all sounds very promising but I don't know
how to implement it yet. Here are some more notes
while I think this through.

The revelation for me is the importance of
functional programming to distributed computing.
It wasn't long ago (lab 1) that I discovered limbo
shell supported functional programming. Alphabet
takes this to the next level by defining types.
Alphabet-grid provides the connection between
processing modules on distributed nodes.
Altogether it provides a framework for distributed
computing I'm still coming to grips with. It is a
different way of thinking about computing than I
am used to.

REFERENCES

Monday, October 18, 2004

NAME

lab 13 - implement the flute instrument from STK.

DESCRIPTION

I implemented more of the STK library but this
time as a straightforward translation to a limbo
module. Many of the protected classes and filters
are in dsp.b as ADTs. They all share a similar
interface that includes functions mk for building
the object and tick for processing the next
sample.
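The shared interface can be sketched as follows. This is a Python class standing in for a Limbo ADT, and the one-pole filter shown is an assumption for illustration, not a transcription of dsp.b.

```python
# Each ADT shares the same shape: mk builds the object, tick
# processes and returns the next sample.
class OnePole:
    @staticmethod
    def mk(pole):
        f = OnePole()
        f.b0 = 1.0 - pole   # scale input to keep unity gain at DC
        f.a1 = -pole
        f.last = 0.0
        return f

    def tick(self, sample):
        self.last = self.b0 * sample - self.a1 * self.last
        return self.last

f = OnePole.mk(0.5)
out = [f.tick(s) for s in (1.0, 0.0, 0.0)]  # impulse decays by half each tick
```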

The instruments are generally larger to implement
but follow the same interface. They can be plugged
into a signal module and then read and controlled
from within signalfs.

I've included a few simple modules that can be
used to start a new instrument. I also tried to
implement the more complicated Flute. It's close,
but still doesn't sound right. It all needs a lot
more debugging.

Wednesday, October 13, 2004

I'm converting all of STK to limbo, but not directly into signalfs. I'm creating
a module that will contain all the sound sources, filters, and effects in the STK,
with one ADT for each sound. This can then be used by signalfs to serve a
file, which can be a combination of any of the ADTs, or by any other limbo
app.

Rog has suggested an alternative application using the shell alphabet.
I will try this once the library is written.

Rog pointed out how inefficient signalfs is in its current form.
I agree; the performance is terrible, which makes it completely unusable
for realtime sound support. This re-implementation will improve performance.
But any hardcore DSP programmer is only likely to snicker at our attempt
to implement DSP in limbo. At the end of the day I'm doing this to create
a framework for ease of experimenting with DSP, not to create a sound
system that will outperform all others. That is the tradeoff I make
by writing this in limbo.

Another possible implementation is to create a C library module
or device compiled into emu. This would perform well but be
less malleable. I'd rather code in limbo.

Monday, October 11, 2004

NAME

DESCRIPTION

I implemented an oscilloscope called scope to view
the signals produced by signalfs, or other PCM
data such as an iaf file stripped of its header.

% scope < bach.raw > /dev/null

It writes its input to its output, but it
doesn't sound very good if directed to /dev/audio.
It writes small blocks of samples with small
delays between writes, making it sound very
choppy.

Scope tries to draw 25 frames a second, getting a
tick from a timer, and reads 1/25th of a second of
samples from the input, then draws it on a Tk
panel.
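The read size per frame is simple arithmetic. A sketch, assuming 16-bit PCM; the default rate and channel count are assumptions, since scope takes them as parameters.

```python
# At 25 frames per second scope reads rate/25 samples per tick,
# times channels, times bytes per sample for 16-bit PCM.
def frame_bytes(rate=44100, channels=2, bytes_per_sample=2, fps=25):
    return rate // fps * channels * bytes_per_sample

print(frame_bytes())  # 44100//25 = 1764 samples -> 7056 bytes per frame
```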

This might be useful when recording input from
a microphone:

% scope < /dev/audio > out.raw

It takes as parameter the sample rate and number
of channels, stereo or mono.

CONCLUSION

Not being able to listen and see the waveform at
the same time makes it less useful than I hoped.
How do I keep in sync the visual and audio stream?

I'd like to present a similar graph using the FFT,
but I keep getting errors from the fft module. It
correctly does the transform but isn't able to
permute the numbers into normal order (I think; I
don't know enough about this). Commenting out the
"optional code" in /appl/math/fft.b seemed to make
it work, or at least made it not exit.

FILES

Thursday, October 07, 2004

I spent the time this evening downloading the Microsoft C/C++ toolkit and the SDK. I have only built emu for linux and plan9 so far. So it's time to start building an XP version. I also expect to be modifying the audio driver to add some more advanced features, such as 8 channels.

While waiting for the downloads I found this site about audio networking interesting. A DTMF decoder would be a nice lab to do some night.

I tried compiling the inferno distribution for XP. I'm missing LIB.exe.
I used link /lib instead. Got most of it compiled, including emu.

Wednesday, October 06, 2004

NAME

DESCRIPTION

I am continuing to add signal modules to signalfs
copying the implementations from stk. Today I'm
working on the delay line, and whatever else I can
implement in two hours.

The delay line does not fit the model of signals I
have created so far. From the STK it looks like it
is used more as a utility class than a standalone
filter. It's used by the echo class, which actually
does the mix of the current input with the delayed
input. I could of course do the same thing and
have delay as functions within the dsp module.
Trying to use the delay, or echo, brings up a
number of issues.
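The delay-inside-echo structure looks roughly like this. A Python sketch of the idea only; the ring-buffer delay and the 0.5 mix factor are assumptions, not the STK code.

```python
# A delay line as a utility class, wrapped by an echo filter that
# mixes the current input with the delayed input.
class Delay:
    def __init__(self, length):
        self.buf = [0.0] * length
        self.i = 0

    def tick(self, sample):
        out = self.buf[self.i]          # sample written `length` ticks ago
        self.buf[self.i] = sample
        self.i = (self.i + 1) % len(self.buf)
        return out

class Echo:
    def __init__(self, length, mix=0.5):
        self.delay = Delay(length)
        self.mix = mix

    def tick(self, sample):
        # echo does the mix; the delay line just remembers samples
        return (1.0 - self.mix) * sample + self.mix * self.delay.tick(sample)
```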

How am I going to stack multiple filters on top of
one another and still be able to control each one
independently? To access each ctl file I'd need
to know the conversation number. This might be
tricky to find out if I have multiple instruments
each being built from many modules.

I want to alter the effect during playback
independently of the instrument being played. But
I'm not sure how to fit it in with a simple
instrument. Where in the stack should it go? And
how will I control it if it's placed under the
instrument?

This goes back to the problem of needing some kind
of patch bay. Given a particular instrument we
need to know all the effects tied in to it. Then we
want to write to the ctl file of any of them, not
via the instrument but directly, and alter the
effect. We need to remove the exclusive access
condition on the ctl, although we could place it
on data instead.

If I didn't do this I'd need a naming convention
within the ctl file that was at the end of the
filter pipeline. But that is ridiculous because
what else am I using a fs for.

Therefore, if I put the, say, echo filter in front
of the instrument, I still send note events to the
instrument, but read sound back from the echo data
file.

Is the sequencer going to be able to manage all
this? The skini language may have to include
naming of instruments using filenames. That is,
events are directed to specified ctl files
(filters, instruments) but audio data is read from
only one, the one at the end of the pipeline (is
pipeline the right term here? filter chain, patch?
sink?).

I need to specify the sequencer language and a means
for retrieving all the conversation directories
for a pipeline before going further.

CONCLUSION

It works much faster and uses much less main
memory. It still uses a lot of heap memory. Tk is
nice but has its limits. It works well for
creating and removing the highlight rectangles.
But it isn't appropriate for creating the main
image, because I don't want to keep track of
every rectangle image, and there can be so many
entries. The largest I tried was about 300,000
entries. I'd like to try creating a large image
with 1 million entries.

I'm still improving this because I intend to use
it as some kind of radar for monitoring file
system activity.

FILES

NAME

DESCRIPTION

I have been reading through spree code and
wondering what engines I could write to learn more
about this type of filesystem. Here is one idea
that incorporates signalfs.

The aim is to create a distributed electronic
music band. Each player is a client to spree and
can create instrument objects and play notes on
these instruments. The other players will have the
instruments and sounds replicated locally by their
spree client; and similarly instruments and notes
they play are heard by the other musicians. Each
spree client runs a separate signalfs but with a
common set of signal modules.

The spree action commands will be SKINI with
extra commands for creating and configuring the
instruments. The spree engine manages the shared
configuration of instruments. Note events are
broadcast to all players. The spree client is
therefore a type of sequencer, using timing and
note events from the engine to trigger reading
sound from the signalfs and writing to /dev/audio.

Using OpenAL, or implementing it in signalfs, each
instrument and player is placed in 3D space. The
engine manages the positioning of objects, which
now includes listeners, and each player hears the
sound relative to their position in the virtual
room. This requires a surround sound speaker
system.

If this is implemented successfully it should not
be too difficult to generalize. Surely if we get the
synchronization working for audio we could do the
same for visual object simulations.

These ideas are influenced by my recent reading
about Croquet. I'd like to see the implementation
of the TeaTime protocol. The time synchronization
seems the most challenging aspect of this idea.

Friday, October 01, 2004

Here is a new tmap.b with the memory leak removed. I just needed to call Tk cmd delete all on the canvas to remove all the old rectangles.

And finally a screenshot of tmap using output from du -a / on the root of the Inferno distribution. You can see a file and its path highlighted in red.

I've played about with the coloring a bit, but haven't found anything I really liked. The current scheme is for colors to represent depth. It goes from dark blue to light blue, through greens and yellows and ends on red for the deepest nodes.
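A ramp like that can be sketched by sweeping the hue from blue to red as depth grows. This is only an illustration of the scheme described, not the tmap code; the value ramp is an assumption.

```python
# Depth-to-color ramp: dark blue for shallow nodes, through greens
# and yellows, ending on red for the deepest.
import colorsys

def depth_color(depth, max_depth):
    t = min(depth / max_depth, 1.0)
    hue = (240.0 - 240.0 * t) / 360.0   # 240 degrees (blue) -> 0 (red)
    value = 0.5 + 0.5 * t               # dark at the top, bright at depth
    r, g, b = colorsys.hsv_to_rgb(hue, 1.0, value)
    return tuple(int(c * 255) for c in (r, g, b))
```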