Despite the fact that you have already found an example, you could look at OpenCores. There is an Advanced Debug System that is said to work with Altera, Xilinx and ? with the vendor-supplied JTAG primitives. Maybe this helps you design the 'host' software with portability in mind. I did not test this solution, since I am trying to burn the bitstream into the config PROMs of my board and do the communication via serial lines. My Cyclone III starter board has only JTAG (and a homebrew-unfriendly HSMC connector), and this requires my development PC to constantly poll the JTAG connection.

Soldering with a pizza oven was a joke, although with a proper setup it should be a viable approach. I have a university lab here, and there is at least one guy in town who offered to rent out his reflow oven for EUR 50 a week.

You can actually reflow at home in a small toaster oven. You wouldn't want to manufacture that way for resale though. I'm sure it's a bit hit and miss and you would need to have the process down before you tried a costly chip. I've also seen the local mobile phone repair guys do BGA chips with their SMD blowers. I live in Bangkok where they do this kind of work in stalls here and there in malls and by the roadside.

I don't think the cost of having a local regulator/filter on each board is too prohibitive, and I would think that having a few supply lines on the connector would be good. Maybe 1.2V and 2.5V available, plus a 5V line to allow a board to generate its own rail if it needs 3.3V or 5V.

The question of connectors that can handle that is another issue. I think smaller connectors are likely limited to < 1A per pin, so you are already talking 4 x 2.5V pins per FPGA; multi-chip cards get a bit crazy pin-wise, and multi-card setups even more so. If you then use a proper power connector for each card, the data connectors become smaller and simpler, maybe even a daisy-chained ribbon cable much like IDE (but with far fewer pins).

My initial idea was to limit the main board to about 4 slots. Having more than that is challenging power wise.

Maybe look at this mining problem another way: the FPGAs don't really achieve such high hash rates, and they cost too much.

What about making a main board that can hold 4-8 GPU cards and has an FPGA as a PCI Express master controller? The FPGA would pass data on this bus to the selected GPU card. I don't know enough about the PCI Express protocol to say whether this is doable, but this application doesn't seem to require high data rates between CPU and GPU. If that is the case, then maybe you can make a USB device, as discussed here, that tells an FPGA to pass data to card slot N on the bus.

I think the current FPGAs have the signaling capability to do PCIe control, but I'm not sure. There may even be an open-core design for a PCIe controller.

You could have several of these hanging off a USB hub, all controlled by the python script. This might work and would be a lot less design work (maybe). Or is that just crazy?

One PC with 8 USB hubs each one controlling an 8 board GPU stack. It may even be possible with ribbon cable and connectors tied to the controller board so as not to require a huge PCB.

No matter how you try to do it, having a central power supply on the backplane will just not work. An XC6SLX150 will use something like 10 amps, and with a maximum allowable voltage drop of 50 millivolts that would mean 5 milliohms for the full path from the voltage regulator to the FPGA. This is hard enough to achieve without a connector in between. If there are multiple FPGAs on a card, it gets even worse. There's just no way around point-of-load PSUs here.
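The resistance budget above follows directly from Ohm's law; a quick sanity check (the 10 A figure is the post's estimate for an XC6SLX150, not a datasheet value):

```python
# Check of the voltage-drop budget claimed above.
core_current_a = 10.0   # assumed FPGA core current (the post's estimate)
max_drop_v = 0.050      # allowed drop over the whole supply path

# Ohm's law: the total path resistance must stay below R = V / I.
max_path_ohm = max_drop_v / core_current_a
print(f"{max_path_ohm * 1000:.1f} milliohms")  # 5.0 milliohms
```

A single power-connector contact alone is often already in that resistance range, which is why the regulator has to sit on the card itself.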

So there are basically two options:

Backplane-based:

- I'd favor using a connector that's inexpensive and easy to get, but that doesn't let you plug regular PC stuff into it.

- If we go the backplane route, we should go for a really universal interface, and not something minimal. So I'd say 5V + 12V + I2C + JTAG seems to be the ideal combination. And each card should have a small I2C EEPROM (like those SPD chips on RAM DIMMs) that tells the backplane what it is and how to talk to it. Oh, and there should be an IRQ line from the FPGAs (or whatever) to the backplane.

- The backplane should be able to supply at least 500mA per board on the 5V rail and 2-5A per board on the 12V rail. What about using a regular ATX 2.0 power supply, with a 24-pin connector on the backplane, and possibly the 8-pin CPU connector on the other end, for some more cards? Really power-hungry cards can use the standard PCIe power connectors to allow for more FPGAs.

- Maximum number of chips per card: 127 (the JTAG scan chain length might further limit this)

- I'd think 8 to 16 slots on the backplane might be a good idea.

- Does the cost of the backplane CPU really matter? I'd rather go with some semi-decent ARM plus some flash and RAM, possibly running Linux, than a crappy PIC.

Standalone cards:

- USB seems to be the way to go.

- Standard 4-pin Molex connectors for power supply, and PCIe power connectors for the bigger ones?

- Mechanically stackable boards

- Who wants to build the 127-port USB hub?

Maybe look at this mining problem another way: the FPGAs don't really achieve such high hash rates, and they cost too much.

The point is the low energy consumption. GPUs will not be usable in a few months because of their high wattage, and then we need to be there with our low-energy boards. Although I'm a bit pessimistic right now, because somebody could have a non-free, very easily pluggable ASIC system by then and straight away fill a server farm with thousands of those boards.

Would "our" modular system be capable of taking ASIC boards if they become available? There was a thread here about someone "founding ASICs for the community", but I guess this was a troll, or he got selfish or busy or whatever: http://forum.bitcoin.org/index.php?topic=14910.80

The future of Bitcoin is ASIC, I believe. ASIC or death.

I don't expect anything, but I am listening at 18WN5YRGaBKGPus4n8QHuF7YnyzyDxMRQ6

If you want to go the GPU way, ask a contract manufacturer or even a card manufacturer for a special OEM version with the graphics port and the memories left unpopulated. I think that will be a much quicker solution if you can afford to buy in 1k quantities or so ...

Having your own PCB artwork could save you some $, but a board that can hold 6 or even 8 GPU chips would be far more expensive, as you would also need a proper cooling solution, which would have to be water cooling or an even more advanced scheme.

So I think we should start to break this idea down into two basic concepts:

Backplane based solution

- The motherboard is used for housing the daughtercards, routing the power supply and bundling the bus system.

- The slots are based on a widely available PC standard, e.g. DDR slots (to be determined), and the motherboard houses between 6 and 20 such slots (to be determined)

Power supply

- Standard ATX PC power supply for global distribution of e.g. the 12V, 5V and 3.3V rails (to be determined), plus Molex power adapters for more power-intensive cards.

- The necessary currents and voltages for each daughterboard's components and FPGA(s) are individually generated on the daughtercard from the rails and Molex adapters supplied by the motherboard.

IO and control

- at least one EEPROM as communication/ID memory

- bus system using USB, JTAG or similar (to be determined)

- ARM CPU for mainboard control (to be determined)

Standalone card based solution

Power supply

- similar to the daughtercard solution, but using just Molex power connectors for individual voltage generation

[needs further detail]

IO and control

- USB bus for individual cards (to be determined)

[needs further detail]

Global basics

- Both board designs use an FPGA chip which can process at least one full SHA-256 cycle, which was found to be more effective than a pipelined version.

-> This would result in one of the following: an Altera EP4CE75 at minimum, or an EP4SE530 for an upscaled version.

-> For Xilinx: an XC6SLX150, or maybe an XC6SLX100 (to be determined)

According to my knowledge, Xilinx chips have been found to be less performant and harder to program, but I think this should be evaluated by those with more experience with the current FPGA series. (to be discussed)
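For reference, the workload these chips compete on is just a double SHA-256 over the 80-byte block header. A host-side sketch of the check a hashing core performs, using only Python's hashlib (the all-zero header is a placeholder, not real block data):

```python
import hashlib

def double_sha256(data: bytes) -> bytes:
    """Bitcoin's hash: SHA-256 applied twice."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def meets_target(header: bytes, target: int) -> bool:
    """A result is valid if the hash, read as a little-endian
    integer, is below the current target."""
    return int.from_bytes(double_sha256(header), "little") < target

# A miner iterates the 4-byte nonce field inside the 80-byte header
# and reports any nonce whose hash meets the target.
example_header = bytes(80)  # all-zero placeholder header
print(double_sha256(example_header).hex())
```

The FPGA/ASIC only accelerates `double_sha256` over nonce candidates; everything else stays on the host.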

So I ask everybody who wishes to take part in the actual development to help plot out the individual specification of each of the two solutions, so we may decide on one approach as soon as possible.

We should try to keep the two solutions as similar as possible: if the external power supply and computer connection for the backplane-based cards and the standalone cards are identical, they can be interchanged in a hardware sense. If the computer-FPGA interface (e.g. FT2232) and the on-board bus to the FPGAs are identical, there is no difference from the programmer's point of view between the two approaches. Only the spending plan of the user differs. Basically, I suggest specifying everything so that the single board can be thought of as a motherboard without the slots to exchange FPGAs.

lame.duck got me thinking of using JTAG both for booting the FPGA and for later communication. While I originally thought SPI would be the way to go, it is potentially less portable between different FPGA makers (can all of them boot off SPI, can they boot off a mixed chain?). I am now convinced that using a JTAG ring to talk to the FPGAs is all that is needed. Here are the advantages:

Only four lines, no special requirements on routing (if you don't go too long)

The chips can be identified in software: no need for an extra EEPROM that stores the hardware type

Even a future ASIC has a good chance of being able to speak JTAG

There are two disadvantages:

For the backplane based cards: you cannot leave slots unpopulated. If a slot is empty there needs to be a jumper or DIP switch that connects TDI to TDO for that slot.

For mixing different FPGAs, the JTAG signals may have voltages that are not compatible between different chips. So each daughtercard should plan on having level shifters from a specified motherboard voltage to the local, correct voltage.
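On the software-ID point above: every JTAG-compliant chip reports a 32-bit IDCODE register whose field layout is fixed by IEEE 1149.1, so the host can identify parts generically. A minimal decoder sketch:

```python
def decode_idcode(idcode: int) -> dict:
    """Split a 32-bit JTAG IDCODE into its IEEE 1149.1 fields.

    bit 0      : always 1 (distinguishes IDCODE from BYPASS)
    bits 1-11  : JEDEC manufacturer ID
    bits 12-27 : part number
    bits 28-31 : version
    """
    if not idcode & 1:
        raise ValueError("LSB must be 1 for a valid IDCODE")
    return {
        "manufacturer": (idcode >> 1) & 0x7FF,
        "part": (idcode >> 12) & 0xFFFF,
        "version": (idcode >> 28) & 0xF,
    }

# Illustrative value: 0x49 is Xilinx's JEDEC manufacturer code; the
# part and version fields here are made up, not a real device code.
example = (0x2 << 28) | (0x4001 << 12) | (0x49 << 1) | 1
assert decode_idcode(example)["manufacturer"] == 0x49
```

Because the manufacturer and part fields come back for free during a scan-chain walk, the host learns the hardware type without any extra ID EEPROM.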

Given what I said, there should be minimal specifications that apply to both solutions. Here is a first suggestion:

The (mother) board has one USB-B connector for host communication and one Molex 8981 connector for power supply.

Only the 12V pins (+12V:1, GND:2+3) on the Molex are used (for simplicity).

The USB connector attaches to a FT2232 chip that is connected to a JTAG chain via its MPSSE.

The FT2232 operates bus powered (works without having +12V connected).

The PWREN# port of the FT2232 controls the on-board power supplies that convert the +12V to lower voltages. They are only on if the USB device is enumerated.

The presence of +12V (or the lower voltages?) can be read back on one of the GPIOs of the FT2232 (say: GPIOL0, H=voltage present, L=error).

The FT2232 can control an LED via a GPIO (say: GPIOL3, L=off, H=on).

Should we specify a vendor and device ID (needs a small EEPROM) or leave it at the default values for FTDI?

The JTAG signal connects all FPGAs in one ring.

All FPGAs have the same connected signals:

JTAG signals: TCK, TDI, TDO, TMS

Clock signal

For discussion: an open-collector IRQ signal (wire-OR). If present, we need to specify its connection to the FT2232.

All other IOs are unconnected.

There is no reset or configuration, as the JTAG interface handles that.
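The GPIO conventions proposed above (GPIOL0 = power sense, GPIOL3 = LED) could be captured as a tiny host-side helper. The pin masks assume the usual FT2232 MPSSE low-byte mapping (ADBUS4..7 = GPIOL0..3); everything else is a sketch of the convention, not tested against hardware:

```python
# Proposed FT2232 GPIO conventions from the spec above.
GPIOL0_POWER_GOOD = 0x10  # ADBUS4: H = supply voltage present, L = error
GPIOL3_LED        = 0x80  # ADBUS7: H = LED on

def power_good(gpio_state: int) -> bool:
    """Interpret a raw GPIO read against the proposed convention."""
    return bool(gpio_state & GPIOL0_POWER_GOOD)

def with_led(gpio_state: int, on: bool) -> int:
    """Return the GPIO word to write back for the desired LED state,
    leaving all other bits untouched."""
    return gpio_state | GPIOL3_LED if on else gpio_state & ~GPIOL3_LED
```

Pinning these semantics down in the spec means any host software can check power and blink the LED on any board, backplane or standalone.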

For every type of FPGA (i.e. for every JTAG IDCODE) the following information is specified and does not change between implementations:

Voltages and IO standards on the banks

Location and frequency of the clock input

If we decide to use it: location of the IRQ

This way, the software can talk to every bit of hardware! The initialisation code is identical for all devices. The commands for communication are device specific, but the software can detect what to do via the IDCODE. Firmware for one IDCODE is guaranteed to work independent of backplane or standalone board.
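The "firmware per IDCODE" idea above amounts to a dispatch table on the host side. A sketch, where the IDCODE keys and bitstream filenames are placeholders, not verified device codes:

```python
# Hypothetical host-side registry keyed by JTAG IDCODE, as proposed
# above. Keys and filenames are placeholders for illustration.
DEVICE_TABLE = {
    0x24001093: {"name": "Xilinx part (placeholder)", "bitstream": "miner_a.bit"},
    0x020F30DD: {"name": "Altera part (placeholder)", "bitstream": "miner_b.rbf"},
}

def plan_boot(scan_chain):
    """Given the IDCODEs read from the JTAG ring, return the bitstream
    to load at each chain position. This works identically for
    backplane and standalone boards, which is the whole point."""
    plan = []
    for pos, idcode in enumerate(scan_chain):
        dev = DEVICE_TABLE.get(idcode)
        if dev is None:
            raise KeyError(f"unknown IDCODE 0x{idcode:08X} at position {pos}")
        plan.append(dev["bitstream"])
    return plan

print(plan_boot([0x24001093, 0x020F30DD]))  # ['miner_a.bit', 'miner_b.rbf']
```

The initialisation code never branches on board type, only on IDCODE, so new cards just add table entries.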

For overall system control I'd stay with a standard PC for now. It's cheap and easy to work with. It takes some complexity out of the design, so there's less things that could go wrong.

Seconded. And one PC can run a whole lot of other cards (standalone or backplane). And having an ARM on the backplane but nothing on the standalone cards will split the development effort down the middle.

A cheap computer can be a netbook. It costs only little more than a custom solution and has a lot of added value (and is faster to get). If you want something embedded: there are already embedded solutions that contain ethernet and USB host on a few square centimetres, e.g. FOX Board G20, different plug computers, ...

The (mother) board has [...] one Molex 8981 connector for power supply.

Only the 12V pins (+12V:1, GND:2+3) on the Molex are used (for simplicity).

[...]

I just had a look at different ATX power supplies: while they had 9A of +12V their 5V rail was above 20A! So maybe I picked the wrong voltage?

The reason why I picked one voltage at all, instead of allowing the use of both 5V and 12V: some may not want to power their FPGAs from an ATX power supply but use something dedicated. There, a single supply is strongly preferable.

On that note: these people may not like the Molex 8981 as much. What about the Tyco 284513-2 from the Buchanan series?

It's basically a question what there are more of: ATX power supply users or dedicated supply users?

A cheap computer can be a netbook. It costs only little more than a custom solution and has a lot of added value (and is faster to get). If you want something embedded: there are already embedded solutions that contain ethernet and USB host on a few square centimetres, e.g. FOX Board G20, different plug computers, ...

...which are all way more expensive than a small ARM-based solution for the backplane, which removes the need for a PC.

My suggestion would be the following:

- One of those FTDIs on every FPGA/ASIC board, with access to the JTAG scan chain, the IRQ signal, and possibly the I2C bus

- A USB mini-B connector on each board, mounted in a way that doesn't allow you to plug in a cable while the board is sitting on a backplane

- Each board has its JTAG scan chain, the I2C bus and the USB interface of the FTDI connected to the backplane connector. This increases flexibility, and you get it almost for free.

- Future backplane implementations may use either USB or JTAG/I2C.

- FPGA designs may choose to do everything via JTAG, or go for I2C for post-boot communication.

- I'd still be strongly in favor of a small I2C EEPROM on each card. The cost for this is negligible.
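The SPD-style EEPROM idea could carry a tiny fixed-layout card descriptor. A sketch of one possible 8-byte format using struct; the layout is entirely made up for illustration, not a proposed standard:

```python
import struct

# Hypothetical card descriptor, loosely modeled on SPD EEPROMs.
# Layout (invented for illustration):
#   magic      : 2 bytes, b"FC" for "FPGA card"
#   version    : 1 byte
#   n_fpgas    : 1 byte, chips in the card's JTAG chain
#   current_da : 2 bytes, max 12V current draw in units of 100 mA
#   flags      : 2 bytes, bit 0 = "uses I2C for post-boot comms"
FMT = ">2sBBHH"

def pack_descriptor(version, n_fpgas, current_da, flags):
    return struct.pack(FMT, b"FC", version, n_fpgas, current_da, flags)

def unpack_descriptor(blob):
    magic, version, n_fpgas, current_da, flags = struct.unpack(FMT, blob)
    if magic != b"FC":
        raise ValueError("not a known card descriptor")
    return {"version": version, "n_fpgas": n_fpgas,
            "max_12v_amps": current_da / 10, "i2c_comms": bool(flags & 1)}

blob = pack_descriptor(version=1, n_fpgas=4, current_da=25, flags=1)
print(unpack_descriptor(blob))  # 4 FPGAs, 2.5 A on 12V, I2C enabled
```

Eight bytes fits comfortably in even the smallest I2C EEPROM, and the backplane can read it before applying full power to the slot.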

I'm not really concerned about the backplane/card implementation. This can easily be improved at a later point. The interface between those is what can't easily be fixed in the future, so this should be designed for maximum flexibility.

I just had a look at different ATX power supplies: while they had 9A of +12V their 5V rail was above 20A! So maybe I picked the wrong voltage?

The reason why I picked one voltage at all, instead of allowing the use of both 5V and 12V: some may not want to power their FPGAs from an ATX power supply but use something dedicated. There, a single supply is strongly preferable.

On that note: these people may not like the Molex 8981 as much. What about the Tyco 284513-2 from the Buchanan series?

It's basically a question what there are more of: ATX power supply users or dedicated supply users?

I would still stick with Molex 8981. People using bench supplies can easily stick a Molex on their wiring, and people using ATX are already set. If you use a different connector, then everyone has to make custom cables.

And I would go with a 240 pin DIMM socket for the cards. Figure on about a third of the pins being grounds. Add a few for SPD, a few more for JTAG, a few more for whatever serial bus you come up with. Still leaves you with plenty of open pins for power, even if you go with 8 or 10 pins for each likely voltage level.
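The 240-pin budget sketched above adds up comfortably; a quick tally, where every allocation is a rough figure following the suggestion in the post:

```python
# Rough pin budget for a 240-pin DIMM-style card connector,
# following the allocation suggested above (all numbers approximate).
total_pins = 240
budget = {
    "ground":     80,  # "about a third of the pins"
    "spd_i2c":     3,  # SDA, SCL, address strap
    "jtag":        4,  # TCK, TDI, TDO, TMS
    "serial_bus":  4,  # whatever bus is decided on
    "power":      30,  # e.g. 10 pins each for three voltage rails
}
used = sum(budget.values())
print(f"used {used} of {total_pins}, {total_pins - used} pins spare")
```

Even with generous ground and power allocations, roughly half the connector is left for future signals.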

I don't know too much about hardware, but I still want to comment on a few points:

1.) Power connector: I hate those Molex connectors, but the ability to use a standard ATX PSU without hard-to-get adapters is worth going with them. But I also think that an ATX supply isn't the most efficient choice when you use only the 12V rails. They are also rather expensive, usually overpowered, and need active cooling.

2.) Onboard controller: I like those embedded ARM/MIPS CPUs, but I think they shouldn't be on the motherboard if they are not necessary. If the USB bandwidth is enough to supply all those FPGAs with data, then leave the high-complexity tasks to the host. And it really doesn't have to be a PC: I have a SheevaPlug, which takes less than 5W and has a 1GHz ARM with 512MB RAM. More than enough, I think.

The FT2232 operates bus powered (works without having +12V connected).

Since you can't power the whole FPGA array through one USB port and we will need an external power supply anyway, I'd suggest also powering the FT2232 from that supply. Why? There are potential host devices that can't even handle the minimum 100mA, for example an Android phone with a modified kernel to support host mode ... which would also be a very power-efficient host.

Sorry if some things I said didn't make any sense, I'm really a software guy.

I don't know too much about hardware, but I still want to comment on a few points:

1.) Power connector: I hate those Molex connectors, but the ability to use a standard ATX PSU without hard-to-get adapters is worth going with them. But I also think that an ATX supply isn't the most efficient choice when you use only the 12V rails. They are also rather expensive, usually overpowered, and need active cooling.

Depending on how you do this you might want to plug those things directly into power-hungry mining cards, as the 12V rail might not even need to be regulated. It's only the supply for various other regulators.

2.) Onboard controller: I like those embedded ARM/MIPS CPUs, but I think they shouldn't be on the motherboard if they are not necessary. If the USB bandwidth is enough to supply all those FPGAs with data, then leave the high-complexity tasks to the host. And it really doesn't have to be a PC: I have a SheevaPlug, which takes less than 5W and has a 1GHz ARM with 512MB RAM. More than enough, I think.

My intention was to use this ARM CPU as the host, and communicate via ethernet from there. No need for a sheeva plug.

The FT2232 operates bus powered (works without having +12V connected).

Since you can't power the whole FPGA array through one USB port and we will need an external power supply anyway, I'd suggest also powering the FT2232 from that supply. Why? There are potential host devices that can't even handle the minimum 100mA, for example an Android phone with a modified kernel to support host mode ... which would also be a very power-efficient host.

I don't really see why one would ever want to do that. If the FTDI is the only chip that needs 5V power, bus-powered seems to be perfectly fine to me.

My intention was to use this ARM CPU as the host, and communicate via ethernet from there. No need for a sheeva plug.

I, as a noob, have a question here: as different daughterboards/FPGAs should be supported, would this ARM CPU program these FPGAs? And how would it do this, as nobody except the backplane knows which types of daughterboards are connected?

Additionally, I want to say that I really enjoy this thread. Thanks to everybody involved. I guess I will personally switch into read-only mode for now, as I have only the slightest experience with such electronics.

I, as a noob, have a question here: as different daughterboards/FPGAs should be supported, would this ARM CPU program these FPGAs? And how would it do this, as nobody except the backplane knows which types of daughterboards are connected?

Exactly, the ARM will have to boot the FPGAs, unless they have a configuration flash on the board itself.