Basically this system should consist of one "Mainboard" housing the bus system and the IO connections, and a variable number of daughtercards in a standard format like SO-DIMM or similar, each housing one or multiple FPGA chips.

These daughterboards should be designed to be flexible in the FPGA chips used, so there might at first be a budget low-performance line, later a midrange card, and a high-performance range in a final stage.

This modular system would allow a wide range of Bitcoin participants to start with a minimum setup with the Mainboard and one FPGA card and increase the number of FPGA cards according to their needs and financial capabilities.

So, in order to bring this concept into reality, I invite everybody interested to contribute to this development process.

I hope we may be able to get to a prototype stage this year.

Status: design active

Currently discussed: layout of the first test board for debugging; firmware for the MSP430 IC.

Design features which have been decided on:

General

FPGA (currently a modified layout by Olaf.Mandel is to be used)

- The FPGA used on the prototype is the Xilinx Spartan 6 LX 150 FGG 484

- The current design is going to use at most two FPGAs per DIMM PCB.

- The prototype motherboard will hold 5 DIMM boards

Power supply (currently the power supply by li_gangyi is to be used)

- Each DIMM board features a wide-range input (~11-20 V) via either a Molex 8981 or a 2.5/5.5 mm barrel connector.

- In addition, the DIMM socket provides a 12 V rail supplied by an ATX PSU via the mainboard when in modular use.

- The voltage regulation providing the voltages needed for the components on the DIMM is located on the DIMM itself.

Communication

- Each board uses one USB mini-B connection in standalone operation, or the bus system via DIMM pins in modular operation.

- An MSP430 IC will be used for the bus (SPI) and for communication via USB (located on both daughter- and motherboard).

Table of FPGA performance data (estimated) (by Olaf.Mandel)

New version of the table with lower Altera prices (assuming 1 USD = 0.6891 EUR):

So the first step will be to determine the needs of this system, to provide a platform for the "Open Source FPGA Miner".

Therefore we should determine which number of LEs, or which chips from which manufacturers, would be needed to run the miner at minimum in its fully unrolled configuration.

And which additional hardware components (flash memory, EEPROM, power supply, bus system) are needed for the operation of the mainboard and the daughtercards.

This approach shall use standard stock components and no customised FPGA or ASIC chips, in order to get a prototype at reasonable cost. Introducing chips customised for mining might be an addition to this concept in the very long run.

Well, the XC2V1000 were great parts in their time, but they are a little outdated now and with the current open-source design they don't perform well. Besides that, Xilinx dropped support for them in their software, not to mention you would require a full version that costs $2000 or so ...

The next point is, you don't need a chip that can hold a fully unrolled loop; rather you need the chips with the best MHash/s per $ ratio, with the PCB space and support chips included. Then invest some time adding extra connectors and/or peripheral chips, so that after the mining is over you could sell the boards as overstock rather than only for scrapping the chips. As you point out, time is a factor, and buying semiconductor chips in quantity may involve long lead times.
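That selection criterion can be sketched as a small calculation. All figures below are made-up placeholders, not real chip prices or hash rates:

```python
# Hypothetical comparison of FPGA candidates by MHash/s per dollar,
# including a per-chip share of PCB and support-chip cost.
# The chip names and all numbers are invented for illustration.

def mhash_per_dollar(mhash_s, chip_cost, board_overhead):
    """Hash rate per total dollar spent on one populated chip position."""
    return mhash_s / (chip_cost + board_overhead)

candidates = {
    # name: (MHash/s, chip cost $, per-chip share of PCB + support $)
    "chip_A": (80.0, 180.0, 40.0),
    "chip_B": (50.0, 90.0, 40.0),
    "chip_C": (120.0, 300.0, 40.0),
}

ranked = sorted(candidates.items(),
                key=lambda kv: mhash_per_dollar(*kv[1]),
                reverse=True)

for name, (mh, chip, board) in ranked:
    print(f"{name}: {mhash_per_dollar(mh, chip, board):.3f} MHash/s per $")
```

Note that the cheapest or fastest chip is not automatically the winner once the board overhead is folded in.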

While the SO-DIMM form factor looks nice, I would think a bigger form factor will be more appropriate. Using high-performance chips will require cooling equipment, which has an impact on mechanical stability.

For small, light boards they can stand free, but heavier boards with fans could be mounted in a case (similar to ATX or not). PCIe x1 probably has enough signals for a bus here, but it may be good to have more pins for power.

I'm not sure the FPGAs are going to need such big fans, but it sounds like a good idea to have the option of more structural support if need be. We should stick with connectors that are mass-produced and easy to get at low cost.

For small, light boards they can stand free, but heavier boards with fans could be mounted in a case (similar to ATX or not). PCIe x1 probably has enough signals for a bus here, but it may be good to have more pins for power.

Maybe I got you wrong here. For the mainboard a PCIe connector might be a solution. But assuming an Altera Cyclone III (80k cells for one fully unrolled SHA-256 loop) is €180, and an SO-DIMM board plus passive components produced in China is something like €30, a local-bus-to-PCIe bridge would add roughly another €80, so I think we should stick to a classical bus system for the daughtercards.

In the case of power, I'm quite sure it is feasible to carry the current over the bus system or over additional pins on the card.

No, I'm just talking about using the same connectors on the main board to mate with the plugins.

I'm not saying they would use the actual PCIe standard, or bridge to PCIe or anything; just using these connectors because they are more robust than DIMM slots. PCI is wider, with more pins than PCIe x1, so it may provide more free pins for power.

Edit: Of course, I also really like the idea of using a surface-mount connector to get away from through-holes, which could push the cost up. It would be easier if we actually knew how much cooling and how much bulk a plug-in board would demand.

Well, SO-DIMM connectors do not need holes drilled in the PCB, that's correct, but I guess the connectors in turn are more expensive; the same goes for PCI Express sockets and so on. Look on eBay at the lowest-cost boards: they use standard IDC connectors, which I think will be the most economical solution. Using receptacles with long pins would allow stacking more than one board, as in the PC/104 standard.

By the way, there are PCI Express interface chips from Gennum for $15 or so, but that would require a PC to run; a standalone solution could reduce power consumption further. Even if an Atom needs only 10 W, I have an ARM board that draws less than 2.5 W running Linux.

Edit: Even if the full design uses only 5 W, 95% or so of that power will heat the FPGA. To get an idea of how much this is, take a little light bulb of just 5 W and try to cool it down until you can touch the glass ... There should be at least a passive heat sink, better in combination with some airflow. On the plus side, this would allow some more MHz. Of course, no cooling monster as for GPUs is needed ...

Given the title of this thread this might be a bad question, but here goes: why make the design modular at all? I understand the intention of keeping cost down, especially in the case of upgrading of hardware. But wouldn't different kinds of hardware (=different FPGAs) require different power lines, possibly different methods of uploading a bitstream?

If one truly wants to define a universal bus that connects different FPGA daughter boards, then the bus connector on the main board could provide some common voltages and a common (probably serial) interface. But it may still be necessary for the FPGA board to produce other voltages locally, and if the bus protocol is not compatible with the FPGA's bitstream loading, then an additional CPLD on each daughter board is needed.

Let's put numbers to the case. The prices I give are in EUR for prototype quantities (=1). I may not have found the best supplier, either.

So the power supply for a single daughter board can be up to 50% of the price for one simple FPGA. Daughter boards with multiple FPGAs are therefore more cost efficient.
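The amortisation argument can be put into a few lines. The FPGA and supply prices here are illustrative placeholders, not quotes from any supplier:

```python
# Sketch of the cost argument: a fixed per-board power-supply cost is
# amortised over the FPGAs on that board, so multi-FPGA boards come out
# cheaper per FPGA. Prices are invented round numbers in EUR.

FPGA_PRICE = 60.0   # one simple FPGA (placeholder)
PSU_PRICE = 30.0    # per-board supply, ~50% of one simple FPGA (placeholder)

def cost_per_fpga(n_fpgas, fpga_price=FPGA_PRICE, psu_price=PSU_PRICE):
    """Total board cost divided by number of FPGAs (PCB cost ignored)."""
    return (n_fpgas * fpga_price + psu_price) / n_fpgas

for n in (1, 2, 4):
    print(f"{n} FPGA(s): {cost_per_fpga(n):.2f} EUR per FPGA")
```

The per-FPGA cost approaches the bare chip price as the number of FPGAs per board grows, which is the whole point of packing several FPGAs onto one supply.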

In light of that I would propose making simple boards that contain one host interface (of a kind to be specified) and as many FPGAs as there is money for. This basically means making only daughter boards and no main boards at all. The pros and cons:

If no additional power supplies are needed on the daughter boards, then my argument from above does not hold. In this case, the modular approach can offer the cost advantages promised. But this can only be true for a specific class of daughter boards.

Cons:

- The host interface is needed for every board (cost can be around 5-15 EUR, maybe)

Pros:

- No error-prone and/or costly connectors (3 EUR per connector)

- No need to specify a universal interface that works across different FPGAs or later ASICs: just rewrite the firmware of the host interface for each type of board

- No additional adapter logic on the daughter board (maybe 5 EUR)

- Quicker design time: no motherboard needed

Before this thread was started, I looked into making something like this. While the original suggestion (http://forum.bitcoin.org/index.php?topic=9047.msg280549#msg280549) was based on a PIC18F97 because of its Ethernet capability, I thought of an FT2232H because it can operate as both a JTAG and an SPI master and needs minimal programming. So the board is not completely standalone but a USB slave. This is a work in progress. I started out with a board for a single FPGA to test whether I can get it to work. More FPGAs can later be added to the design: they are connected in two serial chains via JTAG and SPI. The SPI bus is used both for configuration and later for communication.

Given the title of this thread this might be a bad question, but here goes: why make the design modular at all? I understand the intention of keeping cost down, especially in the case of upgrading of hardware. But wouldn't different kinds of hardware (=different FPGAs) require different power lines, possibly different methods of uploading a bitstream?

There are no bad questions

So let me explain my point of view: one central assumption of my idea is that the upload routines of the two big companies, Xilinx and Altera, haven't changed during the last few generations, so I hope they continue this way. Further, the main board would supply multiple rails of 5 V, 3.3 V, 1.5 V and others if needed. Over the last generations of FPGAs the supply voltages were reduced step by step, so in future it should just be necessary to break these down further.

In the end, the motherboard was just meant to provide the lanes for the basic power supply and the hardware interface, and to hold the daughtercards.

I have to admit I hadn't checked the prices of all possible IO solutions and power supplies, so you may have a point regarding your parts argument.

I think this Cyclone III would be a good compromise between performance and cost, but I'm still waiting for confirmation that this FPGA would be suitable for the current FPGA miner software.

As this chip is around €175 and therefore a lot more expensive than your choice, the cost of the power supply would become less relevant. But I think this chip would be a minimum, as pipelined miner designs for fewer cells have proven to be a lot less effective.

So the approach would be to place just one FPGA per daughtercard, together with its power supply, and use the motherboard just for mounting, basic IO and power distribution. Given the part costs you quoted, we might end up somewhere around €220 for one card, which might seem like much but is an improvement compared to the current developer boards. The cost of the motherboard would be nearly irrelevant in comparison, as it would house no essential components.

I thought of an FT2232H because it can operate as both a JTAG and an SPI master and needs minimal programming. So the board is not completely standalone but a USB slave.

I like the idea of the USB device. But maybe I've misunderstood: would it be possible to use a chip on each board to provide it with USB capability, in order to communicate with a standard PC USB port? This way the system should be able to use different FPGAs and later even ASICs.

Thanks a lot for your project files. I have just had a brief look but will go through them within the next week.

I see you use EAGLE, so maybe I will come back to you with my own ideas for layouts.

I would be glad to have you help further pursue this project and discuss all the necessary components.

FT2232 would be great, USB + JTAG. It's what I'm looking at if I decide to make a board.

Good point: the normal FT2232 has a lot more support from existing upload programs. I chose the FT2232H because it has two MPSSE units: one for JTAG and one for SPI. The prices for both are nearly identical on Digikey, but you pay extra for the power supply the FT2232H needs. Maybe I will reverse this decision and go back to the FT2232D if I decide no more JTAG support is needed.

If SPI is adequate for both configuration and data traffic, then I'm happy with that alone. No need for JTAG except as an option, perhaps. The important thing is to come up with a bus standard and protocol (for job data, because the config protocol is FPGA-defined) and choose a physical connector.

Then people can make either plugin modules, controller boards or both and take whatever approach works for them, knowing that later they will have access to more than just their own work.

As for power I think having a look at what power the current FPGAs need is a start. I thought mostly they didn't need anything over 2.5V but I guess some older ones needed 3.3V. I could see making a main board where some power regulators were optional so that people could choose to add them in the case they need it for their design.

[...] So let me explain my point of view: one central assumption of my idea is that the upload routines of the two big companies, Xilinx and Altera, haven't changed during the last few generations, so I hope they continue this way. Further, the main board would supply multiple rails of 5 V, 3.3 V, 1.5 V and others if needed. Over the last generations of FPGAs the supply voltages were reduced step by step, so in future it should just be necessary to break these down further.

I don't know Altera, but you can combine Xilinx devices of different generations into one configuration chain. Whether you can combine devices from different manufacturers into one chain is more than I can guess, though. Basically, you should be able to do it with JTAG, but that seems more cumbersome to me than a different interface (personal preference).

I used that chip because I have used Xilinx in the past and because the XC6SLX75 is the largest chip the free ISE WebPack lets you work with. It is pin-compatible with larger chips, though. So once someone has a suitable binary, an XC6SLX150 can be used (€123.34 at Avnet for the commercial version in the FGG484 package at the maximum speed grade). But I agree: there are many possibilities for which FPGA to choose.

I thought of an FT2232H because it can operate as both a JTAG and an SPI master and needs minimal programming. So the board is not completely standalone but a USB slave.

I like the idea of the USB device. But maybe I've misunderstood: would it be possible to use a chip on each board to provide it with USB capability, in order to communicate with a standard PC USB port? This way the system should be able to use different FPGAs and later even ASICs.

Yes, that is the idea. You will need to invest in USB hubs if you have multiple cards, but that is not that expensive (e.g. all12013: 13 ports for 20EUR).

Framed in different words: I chose a modular approach where the "motherboard" consists of a normal PC or netbook, a large 12 V power supply and a USB hub (all off-the-shelf), and the "bus" consists of a single supply-voltage line and USB. My "daughter boards" can then be anything (Xilinx, Altera, whatnot; 1 FPGA or 10 FPGAs on one board, ...) and it mixes and works (if it works at all, that is). But I need one power supply solution on each board.

Your approach has merit if all boards use similar supply voltages. Then your daughter boards need no dedicated supply at all, its all on the motherboard. This can work if one makes motherboards for specific FPGAs (e.g. a 1.2V + 1.8V + 2.5V motherboard). This allows the miner to really add one FPGA at a time with minimal extra cost. Just pay attention to the voltage drop across the connector between the boards: 1.2V at up to 4A requires a very low resistance.
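A quick Ohm's-law check of that last point; the 3% drop budget is an assumption chosen for illustration, not a figure from any datasheet:

```python
# Back-of-the-envelope check of the connector resistance budget for a
# 1.2 V rail carrying 4 A, as mentioned above.

V_RAIL = 1.2        # volts
I_LOAD = 4.0        # amps
TOLERANCE = 0.03    # allow 3% of the rail to drop across the connector

max_drop = V_RAIL * TOLERANCE            # volts allowed across the connector
max_resistance = max_drop / I_LOAD       # Ohm's law: R = V / I
power_lost = max_drop * I_LOAD           # watts dissipated at the limit

print(f"allowed drop: {max_drop * 1000:.0f} mV")
print(f"max contact resistance: {max_resistance * 1000:.1f} mOhm")
print(f"power lost in connector: {power_lost:.2f} W")
```

A total path resistance in the single-digit-milliohm range is hard to hit through a cheap connector plus PCB traces, which is why low-voltage high-current rails are better generated locally on the daughter board.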

The small granularity certainly is important to people starting at mining. But it also assumes that boards are available somewhere at a fixed (low) price in small quantities. Forget about prototyping for now, let's talk about later. Unless a company steps in, we will have to pool together board orders to get reasonable prices. These orders will come in waves, so a budding miner may have to wait some time to get a new daughter board with one more FPGA. That's why I would like to keep the granularity a bit larger (buy less often, but more FPGAs in one go).

As an example: this is for an old design that had nothing to do with computation at all, but the price breaks can be illustrative. Stated is the price per board for different total produced boards in one run.

Total boards    Relative price per board
20              100%
25              91%
40              82%
50              78%

So buying in bulk (finding more people to divide the produced boards between) is probably unavoidable.

If SPI is adequate for both configuration and data traffic, then I'm happy with that alone. No need for JTAG except as an option, perhaps. The important thing is to come up with a bus standard and protocol (for job data, because the config protocol is FPGA-defined) and choose a physical connector.

So I am hearing a vote for going back to the FT2232D, dropping JTAG and using the MPSSE to do the SPI communication. Works for me, though I am not sure I will do it in this version of the prototyping stage. Maybe for the next board...

[...]As for power I think having a look at what power the current FPGAs need is a start. I thought mostly they didn't need anything over 2.5V but I guess some older ones needed 3.3V. I could see making a main board where some power regulators were optional so that people could choose to add them in the case they need it for their design.

Right, the Xilinx Spartan 6 can work with 1.2 V and 2.5 V (either a single 2.5 V rail or two, one with additional filters to get rid of spikes). Faster FPGAs may have even lower voltages. As I just wrote in answer to O_Shovah, it depends a lot on whether one is willing to build motherboards for one specific set of voltages. If yes, then many FPGA boards can share these supplies. If no, then each daughter board needs to contain dedicated power supplies, and the overhead of adding an FT2232(D/H) to each daughter board and doing away with the motherboard completely seems minimal to me.

So it comes down to these options (from my point of view):

Build motherboards that are specific to one set of supply voltages and then accept FPGA daughterboards.

If one goes with the motherboard approach, then the number of FPGA daughterboards that fit is comparable to or smaller than the largest number of FPGAs one could put on a single board. It is great for slow upgrading (if boards are available), but it is more expensive and bulkier than a dedicated board with the same number of FPGAs.

These things are not necessarily mutually exclusive: the schematic for a daughter board and the routed layout around the FPGA can relatively easily be incorporated into a design for an all-inclusive board that contains more FPGAs. So: the motherboard approach for entry-level miners, and later one PCB that contains everything for people wanting to upgrade in larger steps.

While I don't think there would be a shortage of IO pins on the FPGA, I want to point out that the daisy-chained configuration only applies to the very special case of using it as an output register. I don't even know if that is really SPI or just a bunch of serial shift registers. What if a slave is supposed to output a byte: will the next byte after that be the output byte or just the byte received? There is, of course, a protocol for using an 'unlimited' number of chips with just a constant number of 4 lines. Did you ever hear of JTAG?

So JTAG for uploading the bitstream would be fine; this would work for Altera, Xilinx and Lattice, even in mixed configurations, by using the SVF format with a JAM player.

One note on the proposed device configuration: the 3 A @ 1.2 V could be a little short, as this is only 3.6 W, while fpgaminer's design already needs 4.4 W. Maybe Xilinx chips have a better MHash/J ratio, but there seems to be a pipelined design in the queue with higher clocks that will need more power.

Multiple FPGAs per board sounds good, but I guess where one person just buys one FPGA for testing, the next buys a full tray. Of course there is the possibility of partially populated boards, but assembly in many different configurations will result in higher setup costs. I at least can resist the temptation to toast 500 Euro in a pizza oven.

While I don't think there would be a shortage of IO pins on the FPGA, I want to point out that the daisy-chained configuration only applies to the very special case of using it as an output register. I don't even know if that is really SPI or just a bunch of serial shift registers. What if a slave is supposed to output a byte: will the next byte after that be the output byte or just the byte received? There is, of course, a protocol for using an 'unlimited' number of chips with just a constant number of 4 lines. Did you ever hear of JTAG? [...]

Basically, this is a set of shift registers: at the latest when SS# is asserted (going low), the slaves put whatever they want to send into the register. Then the master clocks the register through, sending fresh instructions to the slaves. When SS# is deasserted (going high), the slaves read the received data out of their register. This of course requires a NOP command if the master wants to address only one slave; the others need to be sent this NOP. As for JTAG: it can be used, but is it accessible from the design after configuration is finished? That would be sweet, of course! JTAG is slower than SPI at the same clock speed because it has to do more register addressing. But the overhead may be worth it in the motherboard approach to get a universal bus. For the dedicated board with its own interface chips, there is not much difference between JTAG and SPI except for speed and protocol complexity.

One note on the proposed device configuration: the 3 A @ 1.2 V could be a little short, as this is only 3.6 W, while fpgaminer's design already needs 4.4 W. Maybe Xilinx chips have a better MHash/J ratio, but there seems to be a pipelined design in the queue with higher clocks that will need more power.

I said 4 A, which I got from the National Semiconductor WEBENCH suggestion. This is not the output of the ISE Power Report. As for MHash/J: no clue.

Multiple FPGAs per board sounds good, but I guess where one person just buys one FPGA for testing, the next buys a full tray. Of course there is the possibility of partially populated boards, but assembly in many different configurations will result in higher setup costs. [...]

I think the savings on PCB manufacturing are minimal, if they exist at all: you still have to pay for the extra area. It all comes down to the number of board sizes and the number of boards per size.

[...] As for JTAG: it can be used, but is it accessible from the design after configuration is finished? That would be sweet, of course! JTAG is slower than SPI at the same clock speed because it has to do more register addressing. But the overhead may be worth it in the motherboard approach to get a universal bus. For the dedicated board with its own interface chips, there is not much difference between JTAG and SPI except for speed and protocol complexity. [...]

OK, I just found the BSCAN_SPARTAN6 primitive. Still not clear on how to interface with it, though. But it answers the question: one can use JTAG in a motherboard-and-daughterboard design. Just need some DIP switches to short TDI to TDO for unpopulated slots.

I am more and more thinking about dropping the SPI completely even for the dedicated board in favour of JTAG. But does anyone have an example of how to use the primitive in a design?