Really nice design! Impressive routing, I was surprised you were able to fit two DDR chips on there as well.
It's still too bad that all the tooling for this is closed source (Altium, Xilinx ISE / Vivado, with the latter being far more offensive). But the Spartan 6 is a really nice chip otherwise.
It seems like the target application is compute acceleration. Do you have any particular applications in mind?
It should be mentioned that the '100T device pictured actually requires the full paid version of ISE. The free-as-in-beer ISE Webpack only supports up to the '75T. Thankfully Xilinx implemented compatible packages through the device family, so lower devices are a drop-in replacement (mostly).
Maybe one day we'll have a Free toolchain, but FPGA tools are much closer to chip design tools than to compilers, and the space is fraught with patent concerns. However I did read on Deep Chip that some fundamental synthesis patents have been invalidated, which might open up some breathing room.
Supposedly, a lot of key FPGA patents are set to expire Real Soon Now, or just did, or something along those lines. I'm holding out hope that we'll have a true Open Source FPGA environment sooner or later, but it sure is taking its time... :-(
I only have a passing understanding of the FPGA world, but I thought that everything in that space is shrouded in trade secrets. Do patent claim descriptions really have enough detail to help you produce a working bitstream?
Honestly, I'm far from an expert on this topic, but I've heard quite a few people who play in this space more than I do speak of the patent issue as being one important blocker. My guess is that patents are just one of multiple factors. What, exactly, it would take to get a full Open Source FPGA toolchain, I don't know. But some of the other answers in this thread are full of very illuminating info, so I'm kinda glad this came up. I've learned a lot just from reading the other responses.
Patent threats don't really hinder community reverse engineering efforts, but they do discourage the major vendors from being open-source friendly, because nobody wants to open up their tech only to get sued over the details they just disclosed.
That's a key point. Altera and Xilinx may well have decided to keep some of their inventions secret, banking on being able to keep them that way, in which case a patent may do them little good. And there is no 7-year date of expiry on secrets.
I see comments like these all the time about 'open source bitstream generators' and whatnot. If you have a very simple, regular architecture, then P&R (place & route) is relatively simple, and has been done (there is an open-source bitstream generator for the Lattice iCE40 chips, see http://www.clifford.at/icestorm/). This line from that page in particular:
"It has a very minimalistic architecture with a very regular structure. "
For anything close to a modern FPGA, the P&R is only the start of the problem - timing is the next big one. Having a large design not meet timing can be a nightmare, and the vendor tools take delay (extracted from the design) into account during routing, so an open tool would have to add that too. I also think the amount of effort to get the "hard macros"/specialized in-silicon blocks configured correctly would be enormous. You'll notice that even in the iceSTORM project, PLLs and timing analysis are not yet implemented. I use the Xilinx Zynqs at work, and they contain high speed serdes, dual-core ARM processors, and a boatload of other items.
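To make the timing point concrete, here is a toy sketch of what "meeting timing" means: find the longest delay path through the logic between two registers and compare it to the clock period. The netlist, the delay values, and the function name `worst_slack` are all made up for illustration. Real tools work from delays extracted for the actual routed design and handle far more than this.

```python
from collections import defaultdict

def worst_slack(edges, clock_period_ns):
    """Return (critical path delay, slack) for a DAG of (src, dst, delay_ns) edges."""
    graph = defaultdict(list)
    indegree = defaultdict(int)
    nodes = set()
    for src, dst, delay in edges:
        graph[src].append((dst, delay))
        indegree[dst] += 1
        nodes.update((src, dst))

    # Visit nodes in topological order (Kahn's algorithm), keeping the
    # latest signal arrival time seen at each node.
    arrival = {n: 0.0 for n in nodes}
    ready = [n for n in nodes if indegree[n] == 0]
    while ready:
        node = ready.pop()
        for nxt, delay in graph[node]:
            arrival[nxt] = max(arrival[nxt], arrival[node] + delay)
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)

    critical = max(arrival.values())
    return critical, clock_period_ns - critical

# Hypothetical register-to-register path with made-up delays (ns).
edges = [
    ("ff_a", "lut1", 0.5),
    ("lut1", "lut2", 1.25),
    ("ff_a", "lut2", 2.0),
    ("lut2", "ff_b", 0.75),
]
critical_ns, slack_ns = worst_slack(edges, clock_period_ns=4.0)
print(f"critical path {critical_ns} ns, slack {slack_ns} ns")
```

Negative slack means the design fails timing at that clock period. A timing-driven router would then have to try shorter routes for the nets on the critical path, which is the part that needs accurate per-device delay data.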
There are large teams who are employed full-time, and frequently get their Masters/PhD with a specialty in VLSI/EDA, developing these algorithms, testing them extensively, and then implementing them. With the money someone like Xilinx brings to the table, they can pick those individuals out from school directly.
It isn't a CPU, and it never will be. I'm amazed at what free software can do, but I just don't think a team without substantial financial backing and extreme expertise could ever even approach the problem.
All of this is without even considering errata that might exist for the FPGAs, or security features at the silicon level - items which will become different immediately after a free solution is announced. Why? There's zero incentive for Xilinx to allow a free software competitor, and they have the IC designers - BOOM! The next generation adds something goofy that means a team would have to decap a chip, and start probing it. Also, said team has to be able to probe something at 28nm or below. The freaking optics for that kind of work start to get expensive by themselves, let alone the lab you need with floating tables and microsteppers (depending). A team might have to install a special floor to even get the kind of mechanical precision necessary - let that sink in. Who the hell has the money for that? I'll tell you who - other nations. We've used the security features on FPGAs (from Actel and Xilinx both) to prevent another NATION from figuring out our algorithms. The goal is to make it so costly to reverse-engineer as to make development of your own entire system seem like a saner approach.
No. There will never be an open-source bitstream generator for a modern FPGA for the same reason the Apollo project required a nation's financial backing - SpaceX can get to LEO (Low Earth Orbit). They just don't have the money, time, or manpower to get people to the moon. And if they did, they still wouldn't be "free software" - I guess a better comparison than SpaceX would be Copenhagen Suborbitals (https://en.wikipedia.org/wiki/Copenhagen_Suborbitals).
TL;DR - No, there will never be a free software bitstream generator for any kind of modern FPGA, for a multiplicity of reasons.
EDIT: If anyone is interested in trying to do a free software FPGA, you'd have to design it from scratch and try to be a fabless semiconductor company. The price of one seat of the EDA tools for IC design (on a modern node) varies - but at the last place I worked they used Mentor Calibre, and I think it was somewhere between 50k and 100k? It would be fun to design one on an older process, but it would be laughably expensive, slow, and power hungry compared to more modern FPGAs. Bonus though: it will have better TID (total ionizing dose) radiation hardness, although it wouldn't be a serious contender unless you implemented scrubbing/error detection or a unique rad-hard transistor setup (then you'd need to pay to have it sent to a reactor and tested, which is a PITA - trust me).
Wow, that's a ridiculously defeatist attitude. Comparing reverse engineering an FPGA to landing on the moon?
First of all, it's a relatively new field for open source. The icestorm project only got a complete synthesis to bitstream pipeline working last month. So although you see "comments all the time", it's not like development is stagnating. In fact, it got off the ground for the first time. It's not surprising it lacks support for hard peripherals at this point in development. Lack of timing analysis is a showstopper for any serious design as well, but it's not like it's an unsolvable problem.
>There's zero incentive for Xilinx to allow a free software competitor
What? How is another system that sells their chips a competitor? AMD reacted to open source driver development by providing documentation for their GPUs. The exact opposite seems kind of an extreme position. Xilinx is already pretty open source friendly on the software side with their Linux kernel contributions. And it's not like they are the only vendor in existence.
The security features on FPGAs right now are to prevent people from getting the bitstreams out. Just like on virtually every modern microcontroller. That hasn't stopped open source toolchains one bit.
FWIW I've done designs on the Zynq too, it's certainly one of the most complicated cases and I doubt it will have open source support any time soon. But most of the critical stuff is in the hard blocks (e.g. ddr timing and the like), and it's actually the easiest because there is not really any p&r work to do there, it's just figuring out what the configuration bits are.
(sorry if I sound overly defensive, I work on free video codecs and get a very similar attitude out of many other people who feel that performant video codecs are not possible outside of MPEG. it gets incredibly frustrating)
Actually the IC design tools are a lot more expensive than 50k to 100k. You need a variety of tools for a modern process node (Calibre being one of them, usually used for LVS/DRC - checking the GDSII against the schematic and checking the process design rules). You also need a synthesis tool to convert RTL to a gate level representation (usually another 100k or so), a place and route tool (usually many 100's of K), simulators (probably around 50k again), tools for inserting test logic (again around 50k), and timing analysis tools (probably around 50k again or more). Usually you have a bunch of timing analysis tools, as you need to check timing at a variety of process corners and temperatures at the same time. There are also other tools that get used at various points in the flow that all seem to cost a lot of money too (like tools for analyzing static and dynamic IR drop, and logical equivalence tools that formally check that the gate level description matches the RTL).
So you can see you can end up spending a million dollars fairly easily.
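As a rough tally of the figures quoted above, here is a small sketch. The place and route price and the number of timing analysis licenses are my own placeholder assumptions (the comment only says "many 100's of K" and "a bunch"), and the IR drop and equivalence checking tools are left out because no price was given for them.

```python
# Per-seat prices in USD, taken from the figures in the thread unless noted.
tool_costs_usd = {
    "physical verification (LVS/DRC)": 50_000,  # low end of the 50k-100k guess
    "synthesis": 100_000,
    "place and route": 300_000,                 # ASSUMPTION: placeholder for "many 100's of K"
    "simulation": 50_000,
    "test logic insertion": 50_000,
    "timing analysis": 50_000,
}

def total_cost(costs, timing_licenses=1):
    """Sum one seat of each tool, plus any extra timing analysis licenses."""
    extra_timing = (timing_licenses - 1) * costs["timing analysis"]
    return sum(costs.values()) + extra_timing

# ASSUMPTION: four timing licenses to cover several corners in parallel.
total = total_cost(tool_costs_usd, timing_licenses=4)
print(f"${total:,} before IR drop and equivalence checking tools")
```

With these placeholder numbers a single seat of everything already lands at $750,000, so a second seat of almost anything, or the unpriced tools, pushes it past a million.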
On the subject of free toolchains, I personally suspect we as a community will get to FPGA tools some time after we nail down board design tools, which today are still a work in progress.
For example, boards can be designed by hand, but a large FPGA design would be pretty miserable without an autoplacer and autorouter.
Another possible application: talking to high speed serial busses. This isn't just a Spartan 6, it's a Spartan 6 XCblahblahT. Usually when you see cheap Spartan 6 boards they use the non-transceiver version to cut costs without mentioning that it cuts one of the killer features of the chip! This one is the real deal. You still have to use ISE (not even Vivado; they locked the Spartan 6s out of Vivado) but it's still nice.
EDIT: they may have brought all the transceivers out to PCIe; I'd have to look closer to tell for sure.
This is MiniPCIe which would only have a x1 connection to the edge fingers, which means only 4 GTP connections would be needed (and the XC6SLX100T has 8 IIRC).
I was a bit surprised to see an asymmetric stackup, many board-houses will complain about that. It can also become a problem during assembly, because the different heat distributions can cause the board to warp.
Neither the schematics nor the layout would have passed a proper review.
On the other hand, thanks for providing the data to the public, it is an interesting project either way.
I've since moved on to KiCAD and been pretty happy with it. Something like this board would still be difficult to near impossible, though, as impedance matching for the DDR lines doesn't really exist in KiCAD.
I also ordered a few ICEsticks to play with the new icestorm open source toolchain. That FPGA is in no way comparable to a Spartan 6, though.
New KiCAD versions (from bzr) have a push-and-shove interactive router that can also do length matching and differential pairs. The whole thing is far from complete and stable, but it is usable (for example, the "push and shove" part does not work when routing diff pairs, and when you move the mouse too fast in plain track routing mode it just crashes).
One problem with these new features is that they are somehow linked to the new implementation of pcbnew's canvas, which also implies somewhat different UI behavior (more Windows-like, with selection of individual PCB elements and so on), which I generally don't like that much. Anyhow, I recently used the current KiCAD from bzr for a somewhat non-trivial board (smallish FPGA in TQFP, ARM MCU, some analog RF, a few LVDS lines).
Altium have recently released 'circuitmaker' which is basically a $0 version of Altium, for open hardware. I guess it's meant to compete with Eagle, which a lot of open hardware projects use.
It's still closed source (like Eagle). And it's Windows-only. And currently there's a DirectX conflict with VirtualBox so you can't use it there (reportedly it works in other VMs). But it's cheaper than paying $9000-a-seat-plus-$1000-a-year-maintenance for commercial Altium.
Oh, nice! On the other hand, seeing this on HN now I feel really inadequate, having just finished the prototype of my miniPCIe Arduino clone last weekend... Really need to level up too!
I hope to try other vendors' ARM boards too. I made a list a few weeks ago of those that have mini-PCIe connectors, so plenty to experiment with....
Originally we built this board for a custom security enhanced SoC design based on OpenSPARC T1. It was to run OpenBSD from the microSD card and operate as the baseband controller for a new type of wireless network node.
The board could be repurposed for the lowRISC project. It is also suitable for use with embedded routers and laptops as a crypto coprocessor or HSM.
We are pairing the first batch of prototypes with the PCengines apu1d4 (w/coreboot) and will provide these reference platforms to developers.
To be considered for a unit you need to compile a bitstream of the OpenSPARC T1 for the LX150T variant of this board.
It looks like, from the scant info on the website and github, that their initial goal is to use these as autonomous mesh network nodes, probably with lots of crypto processing, given that the name of the board files is PolysomeCrypto01. It looks, though, like this was discovered before the folks were ready to really discuss it.
So there's a serial flash on board? How many milliseconds, at most, does it take to load the bitstream into the FPGA after power-on? Can you load a PCIe endpoint within the PCIe spec's allotted time of 100 ms?
The FPGA is an XC6SLX100T, so the bitstream could be up to 26.7 megabits. It could take up to two seconds or so to configure the part with a large design.
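For a back-of-the-envelope check of that estimate: load time is roughly bitstream size divided by (configuration clock times bus width). The clock rates and bus widths below are illustrative assumptions, not what this board or its flash actually uses, and the sketch ignores power-on initialization and any bitstream compression.

```python
def config_time_s(bitstream_bits, cclk_hz, bus_width_bits=1):
    """Approximate time to shift a bitstream in at a given clock and bus width."""
    return bitstream_bits / (cclk_hz * bus_width_bits)

BITSTREAM_BITS = 26.7e6  # upper bound quoted in the thread for the XC6SLX100T

# ASSUMPTION: hypothetical configuration clocks and SPI bus widths.
for cclk_hz in (13e6, 26e6):
    for width in (1, 4):
        t = config_time_s(BITSTREAM_BITS, cclk_hz, width)
        print(f"{cclk_hz / 1e6:.0f} MHz, x{width}: {t * 1000:.0f} ms")
```

At an assumed 13 MHz on a single data line this comes out at about two seconds, which matches the figure above. Even the fastest combination here is several times over a 100 ms budget for the full bitstream.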
So is it possible to load bitstream in parts, enough for basic PCIe endpoint within 100 ms to conform to PCIe spec and then load a bigger design afterwards?
I once had to work with a hardware team to debug a "boot issue" where we were right on the margin due to a two-phase boot process (boot initial bitstream, check journal in flash, load active bitstream) and didn't even realize going into the debugging that we were close to the margin.
Yes. Xilinx chips can do on-the-fly reconfiguration. So you can have your bootstrap loader and then load a more complete firmware either from the SPI chip or over PCIe.
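If you go the bootstrap route, the sizing question is how big the first-stage bitstream can be while still fitting in the 100 ms window. A minimal sketch, assuming a hypothetical configuration clock, bus width, and startup overhead (none of these are measured values for this board):

```python
def max_bits_within(budget_ms, cclk_hz, bus_width_bits=1, startup_overhead_ms=0):
    """Largest bitstream (in bits) that can be shifted in within the budget.

    Integer arithmetic throughout, so the result is exact.
    """
    usable_ms = budget_ms - startup_overhead_ms
    if usable_ms <= 0:
        return 0
    return usable_ms * cclk_hz * bus_width_bits // 1000

PCIE_BUDGET_MS = 100  # figure quoted in the thread

# ASSUMPTION: 20 MHz clock, x4 flash interface, 20 ms lost to power-on housekeeping.
limit = max_bits_within(PCIE_BUDGET_MS, 20_000_000, 4, startup_overhead_ms=20)
print(f"bootstrap bitstream must stay under {limit / 1e6:.1f} Mbit")
```

Under those assumptions the first stage has to stay under about 6.4 Mbit, a small fraction of the 26.7 Mbit full-size figure quoted earlier in the thread.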
Yeah, it isn't that hard to get the basics by reading the docs and doing some basic experiments with explicit placement and examining the resulting bitstream.
When I looked at the encryption format for the virtex 5 series it was quicker to reverse engineer it than it was to find someone within Xilinx that could explain it (just basic bit/byte order, what's covered by encryption, etc).
At the end of the day, FPGA configuration logic is just another state machine ;)
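To illustrate the state machine remark, here is a toy sketch of the very first thing such logic has to do: ignore padding until a sync word goes by, then start treating what follows as configuration data. The sync word value used here (AA 99 55 66) is what I recall from Xilinx configuration documentation, so treat it as an assumption. Real configuration logic goes on to decode packets, registers, and CRCs, none of which is modeled.

```python
SYNC_WORD = bytes([0xAA, 0x99, 0x55, 0x66])  # ASSUMPTION: recalled sync word value

def find_sync(data: bytes) -> int:
    """Return the offset of the first byte after the sync word, or -1 if absent."""
    state = 0  # how many sync bytes have matched so far
    for offset, byte in enumerate(data):
        if byte == SYNC_WORD[state]:
            state += 1
            if state == len(SYNC_WORD):
                return offset + 1
        else:
            # All sync bytes are distinct, so after a mismatch the only
            # possible partial match is a fresh first byte.
            state = 1 if byte == SYNC_WORD[0] else 0
    return -1

# Made-up stream: dummy padding, sync word, then two bytes of payload.
stream = b"\xff\xff\xff\xff\xaa\x99\x55\x66\x30\x00"
print(find_sync(stream))
```

Experiments like the explicit-placement ones described above amount to feeding known designs through the tools and diffing everything that comes after that offset.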
Not with official workflows at least, and even then, you would somehow have to keep the PCIe block up during reconfiguration for compliance. I think it would be difficult to reverse engineer to that level.
The easier thing to do would be to delay enumeration.
The FPGA itself is around $160. Not cheap, but not expensive either (look at some of the high-end Virtex devices for an idea of how much expensive FPGAs cost...)
Not on its own. You'd have to implement an endpoint on the FPGA to make it show up. (Same as most USB microcontrollers won't show up on the bus until there's appropriate code running on them.)
Being an FPGA (essentially a blank canvas where you can design a logic circuit), it's not really comparable to a general-purpose CPU. However, in specialised applications, it can be much, much faster. See: bitcoin mining and other crypto tasks.