It consists of eight 32-bit cores (called COGs) with 512 longs (32-bit words) of local memory each, attached to a common RAM/ROM through a HUB which 'revolves' at a constant speed, so that every COG has equal access to the common resources.

The common ROM contains a boot loader, a Spin (interpreted language**) interpreter, a character map and several useful tables (like sine and cosine values).

When powered up, it will first try to establish a serial link to load a user program, and if that fails, it will attempt to load one from an I2C-type EEPROM connected to two other pins. This program may be as large as 32KB and will be stored in the common RAM.
The system will then load the Spin interpreter into the first available COG, which will then begin to execute the code at a speed of up to 80,000 instructions/s (if running at 80MHz*).
All code will be read from the common RAM as it is needed, and all variables will also be accessed in common RAM.
While executing, the program may spawn other processes, which results in the interpreter code being copied into the RAM of additional COGs. Alternatively, other COGs may be loaded with machine code, which executes from local RAM at a speed of up to 20MIPS.
As all COGs have the ability to stop or start other COGs, the first COG, which must of necessity start out running Spin, can also be restarted to run pure machine code, resulting in a theoretical 160MIPS for the Propeller in total.
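
The 20MIPS and 160MIPS figures follow from simple arithmetic, sketched below in Python. The sketch assumes that a typical machine-code instruction takes 4 clock cycles, which is not stated in the text above; the function names are made up for illustration.

```python
# Back-of-the-envelope throughput figures for the Propeller.
# Assumption (not stated in the text above): a typical machine-code
# instruction takes 4 clock cycles.
CLOCK_HZ = 80_000_000
CLOCKS_PER_INSTRUCTION = 4
COG_COUNT = 8


def cog_mips(clock_hz=CLOCK_HZ, clocks_per_instruction=CLOCKS_PER_INSTRUCTION):
    """Peak millions of instructions per second for a single COG."""
    return clock_hz / clocks_per_instruction / 1_000_000


def chip_mips(cogs=COG_COUNT):
    """Peak figure when every COG runs machine code at full speed."""
    return cogs * cog_mips()


print(cog_mips())   # 20.0
print(chip_mips())  # 160.0
```

These are peak numbers: any instruction that has to wait for the HUB takes longer than the assumed 4 cycles.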

Other resources on this chip are:
8 semaphores which are user-controlled (for inter-process control of resources).
32 IO pins which all COGs can access simultaneously.
8 video cores, which only need 4 resistors each to produce colour composite video. In theory, a single Propeller can control 8 screens.
(There is a demo video on the forum on the Parallax site demonstrating graphics comparable to late-80s game consoles.)
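
To give an idea of how the semaphores listed above might be used, here is a toy Python model. It assumes that each semaphore is a single bit which is set in one indivisible step that also reports the previous state; the class and method names are hypothetical and are not the chip's actual instruction names.

```python
class LockPool:
    """Toy model of the eight shared semaphore bits (names are hypothetical)."""

    def __init__(self, count=8):
        self.state = [False] * count        # True = currently held
        self.checked_out = [False] * count  # True = handed out to a program

    def new(self):
        """Check out an unused semaphore; return its ID, or -1 if none is free."""
        for lock_id, used in enumerate(self.checked_out):
            if not used:
                self.checked_out[lock_id] = True
                return lock_id
        return -1

    def set(self, lock_id):
        """Set the bit and return its previous state (True = someone else holds it)."""
        previous = self.state[lock_id]
        self.state[lock_id] = True
        return previous

    def clear(self, lock_id):
        """Release the bit and return its previous state."""
        previous = self.state[lock_id]
        self.state[lock_id] = False
        return previous


pool = LockPool()
lock_id = pool.new()
first_try = pool.set(lock_id)    # False: the bit was free, so we now own it
second_try = pool.set(lock_id)   # True: already held, the caller must retry
pool.clear(lock_id)              # the owner releases the resource
third_try = pool.set(lock_id)    # False: free again
```

A COG that gets True back from set() simply tries again later; the chip does not enforce anything, which is what 'user-controlled' means here.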

Assembly programming is different from most processors as it does not use a stack (really!). Instead, every 'call' type instruction also contains the address where the corresponding 'return' instruction is stored and, when executed, updates that instruction with the correct return address.
This means that it is not capable of performing recursion.
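
Why the single return slot rules out recursion can be seen in a small Python simulation. This is only an illustrative model of the mechanism described above, not real Propeller code; the class, the subroutine name and the addresses are invented.

```python
class Cog:
    """Toy model: one return slot per subroutine instead of a stack."""

    def __init__(self):
        self.return_slot = {}  # subroutine name -> stored return address

    def call(self, name, return_address):
        # The 'call' overwrites the slot used by the matching 'return'.
        self.return_slot[name] = return_address

    def ret(self, name):
        # The 'return' jumps to whatever address was stored last.
        return self.return_slot[name]


cog = Cog()
cog.call("fact", return_address=100)  # outer call, made from address 100
cog.call("fact", return_address=205)  # recursive call from inside "fact"
inner = cog.ret("fact")  # 205: the inner return works as expected
outer = cog.ret("fact")  # 205 again: the outer return address (100) is gone
print(inner, outer)      # 205 205
```

The second call destroys the first return address, so the outer invocation can never get back to its caller.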

Also, while it does have status flags like Carry and Zero, whether or not an instruction actually updates them is up to the programmer.
This means that you can have this sequence:

1. Do a comparison, updating Zero flag
2. Subtract a value from a location, discarding any Zero flag update
3. Do a conditional JMP based on the status of the Zero flag from instruction 1.
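
The three steps above can be mimicked in Python to show the effect. This is a sketch with made-up function names, where the write_zero argument stands in for the programmer's per-instruction choice of whether the Zero flag is updated.

```python
class Flags:
    def __init__(self):
        self.zero = False


def compare(flags, a, b, write_zero=False):
    """Compare two values; touch the Zero flag only if asked to."""
    if write_zero:
        flags.zero = (a == b)


def subtract(flags, a, b, write_zero=False):
    """32-bit subtraction; touch the Zero flag only if asked to."""
    result = (a - b) & 0xFFFFFFFF
    if write_zero:
        flags.zero = (result == 0)
    return result


flags = Flags()
compare(flags, 5, 5, write_zero=True)  # step 1: values are equal, Zero is set
counter = subtract(flags, 10, 3)       # step 2: result is 7, Zero is left alone
branch_taken = flags.zero              # step 3: still reflects the comparison
print(counter, branch_taken)           # 7 True
```

On a processor where every subtraction updates the flags, step 2 would have cleared Zero and the jump in step 3 would not be taken.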

Access to common memory is also a point to watch out for, as the HUB will only grant each COG ONE READ/WRITE/ACCESS to common RAM/ROM every 16 clock cycles.
When doing repeating operations it is therefore vital to make certain that the READ/WRITE/ACCESS type commands match this timing, so that as few cycles as possible are lost waiting for the HUB to grant the request.
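
The cost of missing a HUB window can be estimated with a simplified Python model. The 16-cycle period comes from the text above; the 8-cycle cost of the access itself and the window phase are assumptions made for this sketch, and the function names are invented.

```python
HUB_PERIOD = 16        # clocks between access windows for one COG (from the text)
HUB_ACCESS_CLOCKS = 8  # assumed cost of the access itself once the window is open


def wait_for_window(clock, window_phase=0):
    """Clocks lost until this COG's next HUB window opens."""
    return (window_phase - clock) % HUB_PERIOD


def loop_period(work_clocks, iterations=10):
    """Steady-state clocks per iteration of: HUB access, then local work."""
    clock = 0
    starts = []
    for _ in range(iterations):
        clock += wait_for_window(clock)  # stall until the window opens
        starts.append(clock)             # the access begins here
        clock += HUB_ACCESS_CLOCKS + work_clocks
    return starts[-1] - starts[-2]


print(loop_period(8))   # 16: the loop fits exactly between two windows
print(loop_period(12))  # 32: four extra clocks of work cost a whole rotation
```

In this model, one more ordinary instruction in the loop body doubles the time per iteration, which is why the loop length should be matched to the HUB timing.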

*The Parallax BasicStamp 2px (based on a Ubicom SX microcontroller, running at 32MHz), currently the fastest of their series of microcomputers, can do a theoretical 19,000 PBASIC instructions/second.

**While Spin, like PBASIC, is an interpreted language created by Parallax, the Spin interpreter is not protected from being read out, as the PBASIC interpreter is. This is because it is highly optimized (less than 512 instructions) for the Propeller's instruction set and therefore not readily portable to any other platform.