I recently found the SPEECH program I used to use at college on the BBC (an early 6502 machine) all those years ago...

http://bbc.nvg.org/sw/Superior/Speech!.zip
This is an amazing piece of programming: it allowed any sentence to be spoken in English (possibly one of the most idiosyncratic languages around), and reproduction was remarkably good for its age (1983 iirc), yet it all sat in just 7.5K of memory!

I loaded up Speech, had a look at some of the utilities and found the original Speech relocator program, which told me the original load address was $5500. I knew the length (7.5K or $1E00), so I just had to find this memory on the speech.ssd disk image. I also found a free binary viewer.

http://www.howell1964.freeserve.co.uk/A ... BC_Mem.htm
This (and the second link) told me the locations to access the sound chip ($FE43-$FE45).
So I loaded the speech.ssd file into the binary viewer and searched for the bytes $43 $FE (16-bit data is stored low byte then high byte, so the address $FE43 appears as 43 FE). I found it and managed to locate the data around it, spanning exactly 7.5K.
I then extracted it using another editor and saved the flat file as speech.bin.
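For anyone who wants to repeat the search without a hex editor, here is a rough Python sketch of the same idea. The function names and the synthetic test image are made up for illustration; the real offset inside speech.ssd still has to be found by inspecting the hits, as described above.

```python
def find_le_address(image: bytes, address: int) -> list[int]:
    """Return every offset where `address` appears as a little-endian 16-bit word."""
    pattern = bytes([address & 0xFF, address >> 8])  # low byte first, then high
    hits = []
    pos = image.find(pattern)
    while pos != -1:
        hits.append(pos)
        pos = image.find(pattern, pos + 1)
    return hits


def extract_block(image: bytes, start: int, length: int = 0x1E00) -> bytes:
    """Slice `length` bytes (7.5K by default) out of the image."""
    if start < 0 or start + length > len(image):
        raise ValueError("block does not fit inside the image")
    return image[start:start + length]


if __name__ == "__main__":
    # Synthetic stand-in for a disk image: an STA $FE43 (8D 43 FE) in padding.
    demo = bytes(16) + bytes([0x8D, 0x43, 0xFE]) + bytes(16)
    print(find_le_address(demo, 0xFE43))
```

With a real image you would read the file with open("speech.ssd", "rb"), look at each hit, work out where the 7.5K block starts, and write extract_block(...) out as speech.bin.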

I then used Dbug's really cool HEADER.EXE (you can find it elsewhere on this site) to add an Oric tape header with the known start address of $5500.
I ran up Euphoric, loaded the tape file, then saved it to a new work disc.
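To give an idea of what that header step produces, here is a small Python sketch. The byte layout is my understanding of the Oric tape header (sync bytes, file type, autorun flag, end and start address high byte first, then a zero-terminated name); it is not taken from HEADER.EXE itself, and the function name and the file name "SPEECH" are hypothetical, so compare against the real tool's output before relying on it.

```python
def oric_tape_header(start: int, length: int, name: str = "SPEECH") -> bytes:
    """Build an Oric-style tape header for a machine-code block (assumed layout)."""
    end = start + length - 1  # assumption: end address is the last byte used
    fixed = bytes([
        0x16, 0x16, 0x16, 0x24,    # sync bytes and end-of-sync marker
        0x00, 0x00,                # reserved
        0x80,                      # file type: machine code / data
        0x00,                      # autorun flag: off
        end >> 8, end & 0xFF,      # end address, high byte first
        start >> 8, start & 0xFF,  # start address, high byte first
        0x00,                      # reserved
    ])
    return fixed + name.encode("ascii") + b"\x00"


if __name__ == "__main__":
    header = oric_tape_header(0x5500, 0x1E00)
    print(header.hex(" "))
```

Prepending this to speech.bin should give a tape file that loads at $5500-$72FF, which matches the range used later on.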
I enabled the printer (to a file).
I then loaded ORION (a well-known Oric disassembler) and then PDUMPO (an extension of ORION), which redirects output to the printer.

I then loaded the speech file from disc ($5500-$72FF), ran PDUMPO (BASIC), which calls into ORION ($8100-$97FF), and dumped the complete disassembled Speech memory range to the printer.
I then returned to the desktop and opened the printer file in my text editor, resaving it as an xa-compliant .s file.

I then searched again for the $FE43 access and found that section of code.
Many areas had partial disassembly followed by BYTE where ORION hit unrecognised opcodes; this just meant data (relying on the old code not using undocumented opcodes).
I went back to the binary viewer (HEX Editor) and found the word to phoneme conversion tables and other larger areas that hold the actual phoneme samples.

I am currently working out how the code works so that I can convert it to work with the AY-3-8912 sound chip.
I have established thus far...
$5500 Phoneme Samples (and maybe some other tables) - 2816 Bytes!
$6000 Machine Code to decode and play passed sentences - 2199 Bytes
$6897 Word (or part of) to Phoneme data records - 2664 Bytes
$72FF End of Data
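As a quick sanity check on that map, the sizes can be added up in a few lines of Python. The region names are just labels taken from the list above; nothing here comes from the Speech! code itself.

```python
BASE = 0x5500    # load address reported by the relocator
TOTAL = 0x1E00   # 7.5K

REGIONS = [
    ("phoneme samples", 0x5500, 2816),
    ("machine code", 0x6000, 2199),
    ("word-to-phoneme records", 0x6897, 2664),
]


def check_map(regions, base, total):
    """Verify the regions are contiguous; return their ranges and leftover bytes."""
    rows = []
    expected = base
    for name, start, size in regions:
        if start != expected:
            raise ValueError(f"{name} starts at ${start:04X}, expected ${expected:04X}")
        rows.append((name, start, start + size - 1))
        expected = start + size
    return rows, base + total - expected


if __name__ == "__main__":
    rows, slack = check_map(REGIONS, BASE, TOTAL)
    for name, first, last in rows:
        print(f"${first:04X}-${last:04X}  {name}")
    print("bytes left over:", slack)
```

The three regions are contiguous and add up to 7679 bytes, so the last table ends at $72FE and the single remaining byte at $72FF is the "End of Data" entry.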

I finished reformatting the memory file, although a couple of areas are a bit strange. The machine code appears to use a few BRK instructions, but I have not yet been able to determine what they are used for, since I know little about the inner workings of the BBC.
I provide below a link to the complete 7.5K source file of Speech!

Twilighte didn't comment much of the code but, as he mentioned in his post, the text-to-phoneme table is at the bottom and the top must be the data for the phonemes.
There should be pointers to where each phoneme data starts and if it's like other speech programs I've seen it should just loop outputting data to the sound chip. The real key is figuring out the format of the data which is bound to include some sort of timing along with each sound that makes up the phoneme.
Once each phoneme is isolated and you know how the bytes are stored you have to convert the data to the AY chip and change the output routine accordingly. The AY chip involves setting the output register # and then data for that register but the TI chip sets both in the same byte. This also happens to be one of the only things commented in the disassembly.
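To make the register difference concrete, here is a rough Python sketch of how one TI (SN76489) tone setting might be turned into AY register writes. It assumes the usual clocks (4 MHz for the SN76489 in the BBC, 1 MHz for the AY in the Oric) and the standard formulas, tone = clock / (32 * N) on the SN and clock / (16 * TP) on the AY. The function names are mine, the volume mapping is deliberately naive, and none of this says how Speech! actually drives the chip.

```python
SN_CLOCK_HZ = 4_000_000  # assumption: BBC Micro SN76489 clock
AY_CLOCK_HZ = 1_000_000  # assumption: Oric AY-3-8912 clock


def split_sn_latch(byte: int) -> tuple[int, bool, int]:
    """Decode an SN76489 latch byte: register select and data share one byte."""
    if not byte & 0x80:
        raise ValueError("not a latch byte (bit 7 clear)")
    channel = (byte >> 5) & 0x03
    is_volume = bool(byte & 0x10)
    return channel, is_volume, byte & 0x0F


def sn_period_to_ay(sn_period: int) -> int:
    """Convert a 10-bit SN tone divider to the nearest AY tone period."""
    if not 1 <= sn_period <= 0x3FF:
        raise ValueError("SN period must be 1..1023")
    return max(1, round(AY_CLOCK_HZ * 32 * sn_period / (16 * SN_CLOCK_HZ)))


def ay_writes_for_tone(channel: int, sn_period: int, sn_attenuation: int):
    """Return (register, value) pairs; the AY needs a select, then a data write."""
    tp = sn_period_to_ay(sn_period)
    volume = 15 - sn_attenuation  # naive: SN 0 is loudest, AY 15 is loudest
    return [
        (2 * channel, tp & 0xFF),     # tone period, fine
        (2 * channel + 1, tp >> 8),   # tone period, coarse
        (8 + channel, volume),        # channel amplitude
    ]


if __name__ == "__main__":
    print(split_sn_latch(0x9F))
    print(ay_writes_for_tone(0, 254, 0))
```

With those clocks the AY period works out to about half the SN divider. The two chips' volume steps are not the same size, so the straight 15 - attenuation mapping is only a starting point, and the AY mixer register (R7) also has to enable the tone channel.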

It *might* be easier to port some speech code from another AY based machine than to port the data.

Technically, I believe they aren't actually phonemes but sounds that are joined together to make them and the conversion table lists the sound sequence.

r654C seems to set up the pointer to the phoneme conversion table at $70 and $71

r655E seems to advance through bytes in the phoneme (sound) conversion table so that's probably the text decode routine.

r61BF seems to access the first table of data using x as an offset and it updates page zero for what is probably loop parameters for the sound output. It could be pitch or other parameters set with commands listed at r1401. ??
There clearly needs to be some sort of timing for different pitches but there is no guarantee that is what this does. Odds are the delay calls in the code are in the playback routine.

r6327 seems to be testing for 255 ($FF) as the end marker for something. A phoneme (sound)? It also seems to call a ROM routine. Keyboard or string input maybe?

r6334 tests for $23 so that must be some sort of command or separator in the data. ASCII code for #?

$97 and $98 appear to be loaded with a pointer or counter from a table. Phoneme/sound pointer?
$77 and $78 also appear to be a pointer or counter. I'm guessing a pointer since x is stored in $77 which is the LSB.
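For readers less used to the 6502, this is what a two-byte page zero pointer like $70/$71 or $77/$78 amounts to, sketched in Python. The helper names are mine, and pointing $70/$71 at $6897 is only an assumption for the demo (it is where the first post puts the conversion table); the byte stored there is invented.

```python
def zp_pointer(memory, addr: int) -> int:
    """Read a 16-bit pointer from page zero: LSB at addr, MSB at addr + 1."""
    return memory[addr] | (memory[addr + 1] << 8)


def read_indirect_y(memory, zp_addr: int, y: int) -> int:
    """Mimic LDA (zp),Y: fetch the byte at pointer + Y."""
    return memory[(zp_pointer(memory, zp_addr) + y) & 0xFFFF]


if __name__ == "__main__":
    memory = bytearray(0x10000)
    memory[0x70], memory[0x71] = 0x97, 0x68  # pointer = $6897 (assumed)
    memory[0x6897 + 2] = 0x41                # invented table byte
    print(hex(zp_pointer(memory, 0x70)))
    print(hex(read_indirect_y(memory, 0x70, 2)))
```

Storing X into $77 only changes the low byte, which is why it looks more like the LSB of a pointer than a standalone counter.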

The first chunk of data appears to be a line starting with 0 followed by 4 sequences of 4 numbers, with the last sequence only having 3, possibly taking advantage of the data wrapping around to the 0 on the next line. The table is padded out with zeros, probably to align the next table on a page boundary. Anyway, expect data to be loaded in groups of 4 in the code somewhere.
The table appears to have 41 entries before the zero padding. The last byte in each 4-byte sequence is a multiple of 64, so only bit 6 or higher is used. Perhaps it's sound data for 4 sound channels, though I wouldn't think more than 1 is required. I could be wrong.
The numbers don't look like pointers but the first byte of each sequence corresponds to a page number that would follow that table.
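If someone wants to test the groups-of-4 idea against the real dump, something like this Python sketch would do it. The sample bytes are invented purely to exercise the code and are not from the Speech! table; the leading 0 and the record size of 4 are the guesses described above.

```python
def split_records(table: bytes, skip: int = 1, size: int = 4) -> list[bytes]:
    """Drop `skip` leading bytes, then cut the rest into complete records."""
    body = table[skip:]
    return [body[i:i + size] for i in range(0, len(body) - size + 1, size)]


def top_bits(record: bytes) -> int:
    """Return bits 7-6 of the last byte of a record (0..3)."""
    return record[-1] >> 6


def last_bytes_are_multiples_of_64(records) -> bool:
    """True if every record's last byte only uses bit 6 or higher."""
    return all(r[-1] % 64 == 0 for r in records)


if __name__ == "__main__":
    # Invented sample: a leading 0, then two made-up 4-byte records.
    sample = bytes([0x00, 0x56, 0x01, 0x02, 0x40, 0x57, 0x03, 0x04, 0x80])
    records = split_records(sample)
    print([r.hex(" ") for r in records])
    print([top_bits(r) for r in records], last_bytes_are_multiples_of_64(records))
```

Running this over the first table and listing the first byte of each record next to the page numbers that follow would show quickly whether the page-number hunch holds.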

There is a noticeable change in data near line 140 implying another table starts there but I can't be sure.

Sound data may be stored as byte pairs with a register setting and duration but that's a guess.

I didn't spend much time on it so I can't guarantee any of this is correct but maybe it will help someone else make some progress.
I couldn't find a decent memory map for the BBC so system specific stuff is a mystery.

Ha! This can't be a coincidence.
First - yesterday I was busy disassembling the Bulgarian speech program for Oric and Pravetz 8D - a nice synthesizer that talks relatively understandably. It uses a "hardware" hack - one simple wire between the tape-out and sound pins on the audio connector.
I am reverse-engineering it because the sources were never released and the author, Boby, said they are lost forever.
Second - you mentioned Twilighte - back in 2010 he contacted me asking me to improve support for the VIA's shift register in my emulator, so it could be used for sound output - unfortunately, at the time I didn't fully understand what exactly needed to be done...
Twilighte with his knowledge, was so good!
So, my plan is to modify the speech engine and make it (somehow) work without the wire-hack. I'll try to analyze speechdump too - maybe a mixture of both programs will be possible.
Let's make "Oric speak" ! (see Twilighte's last sentence in his first post...)

Twilighte made a routine for me long ago for playing small digitized sounds.
I then took phonemes from various sources and tried it, but some sounds were not working well (t for instance).
I think I uploaded all this on DF FTP long ago, if you wish I'll check again.

Yes, I truly believe it's doable.
Have you ever tried the Bulgarian synthesizer? It's really good.
As a first step I'll make the needed changes to Oricutron to support the "wire-hack" (I think Xeron will agree), so everyone can try this "speaking wonder". Together with this I'll translate the program's Bulgarian strings into English and manually "preprocess" the disassembled code for better readability.
(@Dbug: I have an update for OSDK)