The researchers, testing a technique they created for analyzing computer memory use, found over 100 errors involving incorrect orderings in the storage and retrieval of information from memory in variations of the RISC-V processor architecture. The researchers warned that, if uncorrected, the problems could cause errors in software running on RISC-V chips. Officials at the RISC-V Foundation said the errors would not affect most versions of RISC-V but would have caused problems for higher-performance systems.

But that led me to updating "state" from two different always blocks, which violates all the guidelines I have ever seen.

Next up, I'd like to talk about state machines and "case" statements. I was wondering why people use "one hot" encoding of their cases rather than my naive simple numbering. That led me to a paper that made simple state machines look so complex I don't ever want to go there.

Since you're only playing with FPGAs it's fine to experiment. I usually draw some stuff on an actual piece of paper when it involves so few signals.

Regarding one-hot. I guess a lot of people figure there are tons of flops in an FPGA and they visualize an efficient implementation based on it. Some synthesis tools are smart enough to recode this stuff. I don't know that I have the best state machine style, so don't want to write too much on this topic. But I don't typically use a one-hot style ;-)
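
To make the flop-count trade-off concrete, here is a small Python sketch (not Verilog, and not taken from anyone's design in this thread) comparing simple binary numbering with one-hot codes. The state names are hypothetical. As noted above, a synthesis tool may recode the states anyway, so treat this as an illustration of the idea only.

```python
# Hypothetical state names for a small SPI-style state machine.
STATES = ["IDLE", "LOAD", "SHIFT", "DONE"]

def binary_encoding(names):
    """Simple numbering: 0, 1, 2, 3, ..."""
    return {name: i for i, name in enumerate(names)}

def one_hot_encoding(names):
    """One flop per state: 0b0001, 0b0010, 0b0100, ..."""
    return {name: 1 << i for i, name in enumerate(names)}

def flops_needed(encoding):
    """Width of the state register needed to hold the largest code."""
    return max(encoding.values()).bit_length()

def in_state_binary(state_reg, code):
    # Binary: every bit of the register takes part in the comparison.
    return state_reg == code

def in_state_one_hot(state_reg, code):
    # One-hot: testing a single bit is enough to decode the state.
    return (state_reg & code) != 0

binary = binary_encoding(STATES)
one_hot = one_hot_encoding(STATES)
print(flops_needed(binary), flops_needed(one_hot))  # 2 flops vs 4 flops
```

With four states the binary version needs 2 flops and the one-hot version needs 4, but the one-hot decode is a single-bit test rather than a full compare.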

Talk about long days - I've been working on my taxes. In other countries (e.g. Spain) I heard that the government does your taxes for you, and you just check them for correctness if you feel like it. Here an entire industry of software and CPAs is created by the insanity. (For a lot of people it's pretty simple, but in a way that's bad news because it means that they don't get to participate in employee stock purchase plans, get stock options or restricted stock units, have much in the way of investments,...)

Oh poo, you are right, the sieve is still enabled in my latest version. I was just playing. It did once print its results to my SimpleIDE terminal.

I have to clean up my act in the firmware department. That will have to wait till I have figured out how to get real C programs, with a main() and using newlib, to work.

The "da da da" is all about counting bits that have been clocked out and changing state etc.

As for the one-hot thing, I found a paper by Clifford Cummings discussing state machines. He made it so horrendously complicated! I could not follow.

Like I said, I have no feel for what actual logic my verilog generates just now. When I write C code or whatever I can get a dump of the compiled machine instructions. One starts to realize what a compiler can optimize and what it cannot. The best I have now is the schematic produced by the Quartus RTL viewer.

If you draw just a little bit of a waveform on paper, can you create it just fine only using the posedge of SCLK, except for the MOSI edge coming out early? If so, then you just need to reclock MOSI with the falling edge of SCLK. Then all of the counting etcetera can be in one posedge block.
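
A rough behavioural sketch of that suggestion, in Python rather than Verilog: all shifting and counting happens in one "posedge" step, and a separate one-bit register copies the shift register's top bit to the MOSI pin on the falling edge. The function name, the load-on-first-edge convention and the mode-0 slave are assumptions made for the illustration, not details from the actual driver.

```python
def send_byte(byte, nbits=8):
    """Model one transfer; returns (byte seen by the slave, bits counted)."""
    mask = (1 << nbits) - 1
    shift = 0      # shift register, updated only on rising SCLK
    count = 0      # bit counter, updated only on rising SCLK
    mosi = 0       # pin register, updated only on falling SCLK
    received = 0   # what a mode-0 slave samples on rising SCLK

    for cycle in range(nbits + 1):
        # Rising edge: the slave samples the pin. Cycle 0 is the load
        # edge, so nothing valid is on the pin yet and it is skipped.
        if cycle > 0:
            received = (received << 1) | mosi

        # The single "posedge" block: load, shift and count all live here.
        if cycle == 0:
            shift = byte & mask
        else:
            shift = (shift << 1) & mask
            count += 1

        # Falling edge: reclock the top bit of the shift register onto
        # the pin, so MOSI only ever changes half a cycle before it is
        # sampled.
        mosi = (shift >> (nbits - 1)) & 1

    return received, count

print(send_byte(0xA5) == (0xA5, 8))  # True
```

The point of the split is that only the one-bit pin register uses the falling edge; "state" and the bit counter are updated from a single block.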

Edited to add: I wonder if I can get the tools up on a Linux Mint laptop until I try again with the Pi. I don't want to do much on my main computer until taxes are filed.

Regarding the HiFive1 - I assume that they have some sort of debugger interface? That's one thing about the picorv32 - don't you have to build the FPGA to get new software loaded? I guess you could put some off-chip non-volatile storage. Is there some sort of monitor program already written? Or some way to interface a hardware debugger?

SO much useful info...
I wanted to clarify some issues regarding clocks and some experience with SPI I gained over the last few days... KeithE said everything I wanted to say and more...

Regarding SPI: Once upon a time I did a processor core in Verilog; it could only execute from SPI RAM (FRAM was my target). I wrote a model of an SPI FRAM to test it. That is the way I saw some people do it at work too, the ones that know how it is really done. But I only simulated the code with Icarus Verilog. I am sure it has every possible mistake in it that you can make in Verilog, as I was at the beginning of the learning curve... I may have moved on a bit now...

On this other project, I wanted to drive a DOGM132 and a SSD1306-based graphic display.

Here is the SPI driver so far. Do let me know if it is crappy; I have never actually bit-banged an SPI device before, so the whole idea might be wrong:

SPI design is a whole topic in itself.

The very simplest is just a master shift register with parallel/serial modes.
That's usually enough to talk a few bytes of data to a slave peripheral.

However, this simplest design cannot accept a write while it is shifting, so you must add a buffer if you want to avoid dead zones.
Slave SPI complicates things further, as you have no control of clock or phase.
The smallest MCUs are byte-only slaves; better ones have FIFOs that allow multiple bytes as slaves.

Next is the question of whether you want 'gapless SPI'. That means no added clocks between one byte and the next, and many small MCUs fail on this one.
This tends to need a FIFO and some care in the next-value load, as you must both update the pin and fetch the next FIFO element on the same clock.
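
As a sketch of what "update the pin and fetch the next FIFO element on the same clock" could look like, here is a small Python model of a transmit-only master. The class and method names are made up for the illustration; a real design would be Verilog and would also handle SCLK and chip-select generation.

```python
from collections import deque

class GaplessSpiTx:
    """Toy model: shift register fed from a FIFO, one bit per SCLK."""

    def __init__(self, nbits=8):
        self.nbits = nbits
        self.mask = (1 << nbits) - 1
        self.fifo = deque()   # buffer, so writes are accepted mid-shift
        self.shift = 0        # the shift register itself
        self.remaining = 0    # bits of the current word still to send

    def write(self, value):
        """Queue a word; allowed at any time, even while shifting."""
        self.fifo.append(value & self.mask)

    def clock(self):
        """Advance one SCLK. Returns the MOSI bit, or None when idle."""
        if self.remaining == 0:
            if not self.fifo:
                return None                   # nothing queued: a gap
            # Fetch the next word on the same clock that drives its
            # first bit, so there is no dead clock between words.
            self.shift = self.fifo.popleft()
            self.remaining = self.nbits
        bit = (self.shift >> (self.nbits - 1)) & 1
        self.shift = (self.shift << 1) & self.mask
        self.remaining -= 1
        return bit

tx = GaplessSpiTx()
tx.write(0xA5)
tx.write(0x3C)
bits = [tx.clock() for _ in range(17)]
print(bits[:16])   # 16 back-to-back bits, no None in between
print(bits[16])    # None: FIFO empty, line goes idle
```

Without the FIFO, the second write would have to wait until the first word had finished shifting, which is where the dead zone comes from.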

Next is fine control of the number of bits and the clock speed, as in the smart pins.

I also like the idea of an SPI peripheral that can manage JTAG I/O too.
(I don't think the smart pins can do this?)

In the smart pins, everything gets registered from the I/O pins and then analyzed. This is how SPI stuff must work. You can't just bring a clock in straight from a pin and use it to clock things. You'll have a world of synchronization pitfalls. Better to nip it in the bud and get all signals into your own clock domain and then look for changes. This means, of course, that you must be going at least twice as fast as the external clock coming in, in order to grab it and track its changes without missing anything. Maybe 3x faster, or more, is practical, since you need to accommodate external signal skew.
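
A minimal Python model of that approach, assuming a two-stage synchronizer per pin and a mode-0 style link where the data is sampled on the rising edge of the external clock. It is an idealized sketch (no metastability, no skew), the function names are invented for the example, and it is not a description of how the smart pins are actually built.

```python
def make_waveform(bits, ticks_per_half=2):
    """Pin levels as seen once per system clock tick.

    ticks_per_half=2 means the system clock runs 4x the external clock.
    """
    sclk, mosi = [], []
    for b in bits:
        sclk += [0] * ticks_per_half + [1] * ticks_per_half
        mosi += [b] * (2 * ticks_per_half)
    # A few idle ticks so the last edge can pass through the registers.
    sclk += [0] * 4
    mosi += [0] * 4
    return sclk, mosi

def recover_bits(sclk_samples, mosi_samples):
    """Register both pins in the system clock domain, then look for edges."""
    sclk_ff = [0, 0, 0]   # two synchronizer stages plus one "previous" copy
    mosi_ff = [0, 0]      # same two-stage delay, so data stays aligned
    bits = []
    for sclk_pin, mosi_pin in zip(sclk_samples, mosi_samples):
        # Decide from the current register contents...
        rising = sclk_ff[1] == 1 and sclk_ff[2] == 0
        if rising:
            bits.append(mosi_ff[1])
        # ...then update every register together, as one system clock edge.
        sclk_ff = [sclk_pin, sclk_ff[0], sclk_ff[1]]
        mosi_ff = [mosi_pin, mosi_ff[0]]
    return bits

sent = [1, 0, 1, 1, 0, 0, 1, 0]
print(recover_bits(*make_waveform(sent)) == sent)  # True
```

Nothing here is clocked by the external clock itself: it is just another sampled input, and the only clock is the system clock that steps the loop.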

Yes, if Heater wants to support Slave as well, the above is exactly what I meant by "Slave SPI complicates things further, as you have no control of clock or phase."
Master-only SPI, with gaps, is the simplest but least flexible. In this case, 'small logic' may trump 'general purpose'.

This reminds me. All our synchronous modes use slave clocking. Rather than time our own output with our own clock, we read the clock coming back in, because that's what the other side is going to have to do, too. We can do some data read phasing by picking taps along a 2-3 bit delay, clocked by the system clock.

All that is something we miss in the P1, among other things, like the new all-powerful SERDES. Bitbanging is OK when you have many cores, but reading a clocked signal in is still a slow process. Conditional execution of opcodes helps too.

@Heater: I got your project working on the DE10-Lite (MAX10). The Fmax is 54 MHz, and no, it doesn't work at 100 MHz. That means the MAX10 is not a Cyclone IV but something else, though similar. I thought it would be at least as good...
The built-in Byte blaster isn't recognized when I use my propplug; the one on the BeMicro didn't have such a problem.

Heater - I had forgotten about an old MacBook with Linux Mint that was sitting unused in the office. So I installed everything on it, since it's a lot faster than the Pi for experimentation. It turns out that picorv32 includes a Lattice icestorm example in picorv32/scripts/icestorm. It seemed to work OK, but I had to delete an "-m32" from the Makefile because riscv32-unknown-elf-gcc didn't understand it.

Anyway, it seems like it would be easy to run your example through those tools and see what they have to say. You would need to disable the Altera PLL if you were targeting Lattice.