RISC V ?

Comments

There is a flow included with picorv32. I probably missed something given you said 5280, but here are results for the 7680 LC part, the HX8K.

Thanks, but I may have been unclear.
I want to target the iCE40UP5K-SG48, which is a new member with 5280 LUTs but, more importantly, it includes 128 kBytes of SRAM and 8 x (16 x 16 Multiply & 32-bit Accumulator Blocks), in a QFN48.

(HX8K has much less RAM on chip, at just 128k Bits)
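To put a number on that difference, here is a quick sketch in Python using only the two figures quoted above (128 kBytes for the UP5K, 128 kbits for the HX8K). The variable names are mine and purely illustrative.

```python
# Figures as quoted in the posts above; names are illustrative only.
BITS_PER_BYTE = 8

up5k_sram_kbits = 128 * BITS_PER_BYTE   # 128 kBytes expressed in kbits
hx8k_ram_kbits = 128                    # 128 kbits of on-chip RAM

ratio = up5k_sram_kbits / hx8k_ram_kbits
print(f"UP5K: {up5k_sram_kbits} kbits, HX8K: {hx8k_ram_kbits} kbits, ratio {ratio:.0f}x")
```

Taking those figures at face value, the UP5K has eight times the on-chip RAM of the HX8K.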

I have Lattice iCECube2 tool flows for this, but the open-source efforts do not yet generate bitstreams for this new iCE40.
So I was looking for Lattice-Tool-Chain(Lattice iCECube2) design flows.

(from https://github.com/cliffordwolf/picorv32
For even smaller size it is possible to disable support for registers x16 .. x31 as well as RDCYCLE[H] , RDTIME[H] , and RDINSTRET[H] instructions, turning the processor into an RV32E core.

Furthermore it is possible to choose between a dual-port and a single-port register file implementation. The former provides better performance while the latter results in a smaller core.

Note: In architectures that implement the register file in dedicated memory resources, such as many FPGAs, disabling the 16 upper registers and/or disabling the dual-port register file may not further reduce the core size. )

Does anyone know how to force Quartus to implement the registers without block RAM? I don't know if there is a directive or parameter to do that.

"To use the ramstyle attribute in a Verilog Design File (.v), specify the synthesis attribute delimited by (* and *) preceding the Variable Declaration of an inferred RAM whose implementation you want to control. Specify the synthesis attribute value as "logic", "M512", "M4K", "M9K", "M20K", "M144K", "MLAB", or "M-RAM" depending on the type of memory block you want the Quartus II software to use when implementing the inferred RAM. If you use the synthesis attribute on anything but a variable that represents a RAM, or if you specify an illegal value, the Quartus II software ignores that synthesis attribute.

By setting the value to "M512", "M4K", "M9K", "M20K", "M144K", "MLAB", or "M-RAM", you can choose the type of memory block that the Quartus II software uses when implementing the inferred RAM. If the attribute is set to "logic", then the RAM is implemented in logic cells."

Interestingly, if you disable the counters ('ENABLE_COUNTERS64 = 0' for the RV32E version) they are still defined but not used (they take space). I don't know if they can be commented out (I think other code would need to be commented out too).

There are 5 control registers (reg_xxxx) that cannot be changed.

As for cpuregs, since ENABLE_IRQ is disabled by default, there are 32 of them by default, but that can be reduced to 16.

That means the minimum number of 32-bit CPU registers is 21 (16 cpuregs + reg_pc, reg_next_pc, reg_op1, reg_op2, reg_out) if we don't use the counters. That is 672 bits, which almost fits in an Arria MLAB (640 bits of LUTRAM).
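As a sanity check on that arithmetic, here is a small Python sketch. The register counts and the 640-bit MLAB figure come straight from the post above; the variable names are illustrative only.

```python
# Register counts and the 640-bit MLAB figure are taken from the post above;
# the variable names are illustrative only.
REG_WIDTH_BITS = 32

cpuregs = 16  # RV32E keeps only x0..x15
control_regs = ["reg_pc", "reg_next_pc", "reg_op1", "reg_op2", "reg_out"]

total_regs = cpuregs + len(control_regs)
total_bits = total_regs * REG_WIDTH_BITS
mlab_bits = 640  # MLAB capacity as quoted in the post

print(f"{total_regs} registers x {REG_WIDTH_BITS} bits = {total_bits} bits")
print(f"Over one MLAB by {total_bits - mlab_bits} bits")
```

So the count overshoots a single MLAB by 32 bits, which is exactly one register.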

It could be interesting to check how 'ramstyle MLAB', 'ramstyle M9K' and 'ramstyle logic' compare.

As the DE0-Nano (Cyclone IV) does not have MLABs, those registers are using only 14% of one M9K block. Yes, those are not the only 32-bit registers used, so it is not relevant for actual M9K usage. But it is still a good exercise for those who want to check whether a minimal picoRV32E can fit in a big CPLD that has no block RAM. It can also serve to test ways of implementing the CPU that do not use an FPGA (like Yosys/standard cell, OpenRAM, opencircuitdesign/Magic), or for a future Yosys to Quartus synthesis comparison (now that they seem to be starting to support that option too).

Thank you, those numbers are interesting. I am curious about how far we can go on minimal resource usage.

Haven't found much info about the RV32E or RV32EC variants. It seems that almost nobody cares about a minimalist riscv (maybe because it won't let you get into a Linux kernel or any other fancy multiuser OS) and maybe the toolchains are not configured by default to that option.

There is also a report (don't remember if it is 'Analysis & Synthesis > Summary') where it will show the number of memory blocks used and other interesting things. Do you remember, or did you keep that info?

Interestingly, if you disable the counters ('ENABLE_COUNTERS64 = 0' for the RV32E version) they are still defined but not used (they take space). I don't know if they can be commented out (I think other code would need to be commented out too).

Registers which are just declared and not used should get pruned out of the design.

Edited to add: also reg doesn't necessarily imply a storage element (flip-flop or RAM). It can be used for combinatorial logic. There's now a datatype called logic which was supposed to make this more clear. Not sure that worked.

In case there are any nutballs like me who want to build RISC-V C code on a Pi: I confirmed that the icoboard image mentioned previously does work. I didn't want to reimage an SD card, so I did the following. Maybe someone will point out a much easier way. (I did this under Mint Linux.)

Then I tarred up the appropriate subdirectories in /mnt/pi/opt. Things are a little large and I only wanted riscv32i. Here are the filesizes after the tarballs are compressed. They are roughly 600 MB each when uncompressed. I'm sure that there's a lot of cruft that could be cleaned up to reduce this.

I used this to build Heater's helloWorld on a Pi, and ran a simulation. I can see "Hello world!" being printed in gtkwave. Look at uartTx.buffer and view as ASCII - the simulation does NOT terminate, so you need to wait a minute or two and Ctrl-C it.
This is much easier than building these tools on the Pi. (I filled up a USB stick when I tried this a couple of nights ago - 8 GB of working space is not quite enough. Also, for anyone who tries, it's probably best not to use FAT. Or if you do, then you probably need to mount it with the exec option.)

Thank you, those numbers are interesting. I am curious about how far we can go on minimal resource usage.

Is this focus on size because you want to pack multiple cores into some larger FPGA?
The gain from shrinking the register file seems minimal, given that it means tool chains and libraries need careful mode control.
With multiple cores, you might be able to share BOOT memory, and with XIP you may have a choice of lock-step boot on initial reset, or soft-reset done one at a time?

Back in 1980 or so we built a Motorola 6809 board, all wire wrap. The guy I was working with added cassette tape storage. Programs were written out to a UART that was then FSK modulated. He managed to decode the audio for retrieval with just a flop or two and some C and R. It was good for 300 baud!

We never thought to make it into an actual boot loader. We had a monitor program in EEPROM that had a load command to read programs from tape.

What kind of modulation scheme does this bootloader use? I always thought we could transfer data over audio a lot faster than 300 baud!

Keith, thank you. I recall that Heater posted that before, but would you believe that I searched for that attachment through every page of the thread and was unable to find it!

And I still cannot find the original post. I searched for "firmware" and "attachment" and I can only find your post, not the original post by Heater. I guess there should be some attachment icon somewhere that I am not able to see yet.

I do feel your pain with downloading, building and installing things. There has been a lot of that going on around here.

I actually like downloading, building and installing. It is just that 'gcc' is soooooo special ...

I spent 4 hours today building the RISC-V toolchain. This time I used a relatively modern computer with an E6600 at 3.00GHz.

So I followed all the steps. Cloning git ... (I was fortunate to have a 7 MB download speed, otherwise I would still be downloading hundreds of MB), ./configure, make ... and the first error: I need a compiler with C++11 support.

My 13.04 has gcc 4.7.3 by default, and the RISC-V developers consider this version 'old', although it is only 4 years old. OK, if I need 4.8 for the latest RISC-V toolchain commit I will install it.

Then came finding a PPA for 13.04 and GCC 4.8. Great, Google saved my life: two posts with complete instructions on how to upgrade gcc / g++ to 4.8. So I followed all the steps and tried again to compile the RISC-V toolchain with my new gcc 4.8.1.

After 15 minutes of throwing garbage at the console, the second error: something about an unknown as opcode. OK, no problem, it must be some gcc first-stage compiler glitch. I will not modify anything and try again ...

Then a third 'make', and now it goes better: after 20 minutes it crashes badly with a coredump message.

Then I tried again ... 30 minutes of throwing garbage at the console (now I can understand why so many changed to clang instead of gcc with that wonderful percentage bar) ... and the same coredump message.

At this point I was thinking that maybe I should try something else. I guess someone else would have made that decision earlier, but I gave it another try (another make) ... and in parallel I opened a remote shell to a Xeon with 18 cores and 16 GB ECC to check what the hell was going on with this toolchain. I repeated all the steps on that computer (with gcc version 4.8.2) and it worked the first time.

So I went back to my computer and found that, after five failed attempts, it had worked. Now I also have the toolchain in /opt.

I don't know if the problem was that my computer has bad RAM, or that I need at least gcc 4.8.2 to compile correctly. I think I have had enough, and I will not repeat the experience for MinGW/MSYS.

I can understand some hate for how things are done nowadays.

First: what is wrong with a simple tar.gz?
Second: what is wrong with a .deb, or .rpm, or even a windows .zip installer?
Third: why do they need to use the latest compiler?
Is it really that c++2011 is so much better than c++2003?

I remember those old days when a compiler fit on a floppy disk ... too many things are going wrong.

I remember those days as well, and even earlier, including upgrading from a 1702 eprom board to a 2708 board so I could fit a small assembler in with the eprom monitor. Those were fun days, and finding the Propeller chip and this forum brought a lot of that fun back.

That sounds like many attempts to build this and that over the years.

What platform are you building on?

I built the riscv toolchain on a tiny Atom industrial PC running Debian Jessie. Worked first time. Debian is not famous for having the latest cutting edge everything.

To be fair:

A simple tar.gz will not work on all platforms.

A .deb, .rpm, or windows .zip install is not going to work everywhere. And besides that is extra work to maintain. These guys have better things to do. Like create the software. This is all very new remember.

C++2011 is six years old now. I don't think they are asking too much to use that.

But, yep. All these things trip us up. Thank God for Debian and GCC.

(now I can understand why so many changed to clang instead of gcc with that wonderful percentage bar)

Hmmm...last time I built stuff with clang there was the same verbose output. No percentage bar.