Development Board 2 — Digilent ATLYS Spartan6

The Digilent ATLYS is the current winner in the bang-for-buck category: the S6LX45 is the second largest part that the free WebPack edition will handle, and the board has AC'97 audio I/O, HDMI video I/O (the FPGA has to drive those directly, so officially no FullHD at 60 Hz), 128 MiB of DDR2-DRAM on a 16-bit bus, 8 MiB of QSPI Flash, tri-mode Ethernet, USB UART and HID, and a few assorted on-board LEDs, buttons and switches. Programming is via a built-in USB port, so no extra JTAG gear is needed. An external PSU is mandatory, however, as the larger FPGA and the additional on-board hardware take their toll.
Acquired again from Trenz Electronic.

HDMI output
I've modified the design from Xilinx XAPP495 to check which output modes I could get working. Standard FullHD does work even though it is not officially supported. I've even produced modes with pixel clocks up to 193 MHz, more than twice the maximum frequency in the datasheet (my largest monitor just switches off at higher frequencies), but the pixel pipe is slower than that and you get some flickering pixels. WUXGA (1920x1200) works stably with reduced blanking (128 MHz @ 50 Hz / 154 MHz @ 60 Hz). A 1080p50 mode with reduced blanking gets the pixel clock down to 115 MHz, which is almost within the datasheet spec, but not all monitors (and even fewer TVs) recognize that mode.
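The pixel clocks quoted above follow from the total frame size (active area plus blanking) times the refresh rate. Here is a minimal Python sketch of that arithmetic. The function names are made up for illustration, and the horizontal/vertical totals are assumed timings chosen to land near the figures above, not values taken from the modified XAPP495 design.

```python
def pixel_clock_mhz(h_total, v_total, refresh_hz):
    """Pixel clock in MHz for a mode with the given total (active + blanking) size."""
    return h_total * v_total * refresh_hz / 1e6


def tmds_bit_rate_mbps(pixel_mhz):
    """Serial rate per TMDS data channel: 10 bits are sent per pixel clock."""
    return pixel_mhz * 10


# Assumed, illustrative totals (h_total, v_total, refresh in Hz).
MODES = {
    "1080p60, standard blanking": (2200, 1125, 60),
    "WUXGA 60 Hz, reduced blanking": (2080, 1235, 60),
    "1080p50, reduced blanking": (2080, 1106, 50),
}

for name, (h, v, hz) in MODES.items():
    clk = pixel_clock_mhz(h, v, hz)
    rate = tmds_bit_rate_mbps(clk)
    print(f"{name}: {clk:.1f} MHz pixel clock, {rate:.0f} Mb/s per channel")
```

The per-channel bit rate is the number the FPGA's serializers actually have to sustain, which is why shaving the blanking interval helps: fewer total pixels per frame means a lower pixel clock for the same visible resolution.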

Notes on ATLYS board

I'm still trying to wrap my head around this board, a few things to note:

Cray-1S

So, what does one do with that copious amount of logic? Rebuild a supercomputer icon, of course. Even better when a lot of the work has already been done for you: here's Chris Fenton's Homebrew Cray-1A project, and he has just put the sources on this Cray-1X Google Code SVN. The Verilog is nice and clean, and it is starting to work on the ATLYS board, currently at 40 MHz (with some inferred multipliers replaced by Coregen modules). I still hope to get it to the original 80 MHz, but at the moment there is too much routing delay in a handful of nets. If I can remove the longest few paths I should get to about 50 MHz, and it will be an uphill battle from there, as there is apparently a great number of paths sitting in the 20 ns region.
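The relation between path delay and achievable clock is simply f = 1 / t: a 25 ns critical path caps the design at 40 MHz, 20 ns gives 50 MHz, and 80 MHz needs every path at or under 12.5 ns. A small Python sketch of that reasoning follows. The path delays in it are hypothetical numbers made up for illustration, not figures from a real timing report.

```python
def period_ns(f_mhz):
    """Clock period in nanoseconds for a frequency in MHz."""
    return 1000.0 / f_mhz


def fmax_mhz(critical_path_ns):
    """Highest clock frequency (MHz) a given critical path delay allows."""
    return 1000.0 / critical_path_ns


def fmax_after_fixing(path_delays_ns, n_fixed):
    """fmax if the n_fixed slowest paths were removed or fixed."""
    remaining = sorted(path_delays_ns, reverse=True)[n_fixed:]
    if not remaining:
        raise ValueError("n_fixed must be smaller than the number of paths")
    return fmax_mhz(remaining[0])


# Hypothetical path delays in ns: a few long outliers, then a cluster near 20 ns.
paths = [25.0, 24.1, 23.2, 20.0, 19.9, 19.8, 19.5]

print(fmax_after_fixing(paths, 0))  # limited by the 25 ns path
print(fmax_after_fixing(paths, 3))  # limited by the 20 ns cluster
budget = period_ns(80)
print(sum(1 for d in paths if d > budget), "paths exceed the", budget, "ns budget")
```

With numbers shaped like these, fixing the three outliers buys 10 MHz, but every path in the 20 ns cluster then has to shrink to 12.5 ns before the design reaches 80 MHz.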

The 160 MFLOPS of the Cray-1 isn't something that gets much notice today. There's another aspect that made the Cray-1 the fabulous supercomputer that it was in its time: it sustained 640 MiB/s in its 4 MiW of main memory. This proved elusive for workstations, let alone PCs, for many more years, but should be possible to achieve on the ATLYS board. The peak bandwidth of the DDR2-DRAM at 320 MHz would be twice as good as needed (so the interface could even be timeshared with some I/O), but there seems to be no way to get to the data within a fixed latency of 11 clock periods (137.5 ns at an 80 MHz clock) unless I figure out how to keep the memory interface from inserting refresh cycles on its own. The memory, and indeed every functional unit of the Cray-1, always completes a request in a fixed number of cycles once it has been issued, so accommodating a variable latency requires changing the architecture or stopping the clock (either of which would require extensive changes to the current code).
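The bandwidth and latency figures above can be checked with a few lines of arithmetic. This Python sketch assumes one 64-bit word per 80 MHz clock on the Cray-1 side and a 16-bit double-data-rate bus on the DDR2 side, and it counts in decimal units (10^6 bytes per second). The function names are invented for illustration.

```python
def cray_bandwidth_mb_s(clock_mhz=80, word_bytes=8):
    """Sustained memory bandwidth, assuming one word transferred per clock."""
    return clock_mhz * word_bytes


def ddr_peak_mb_s(clock_mhz=320, bus_bits=16):
    """Peak DDR bandwidth: two transfers per clock across the whole bus width."""
    return clock_mhz * 2 * bus_bits // 8


def latency_ns(cycles, clock_mhz):
    """Duration in ns of a fixed number of clock cycles."""
    return cycles * 1000.0 / clock_mhz


needed = cray_bandwidth_mb_s()   # 80 MHz * 8 bytes
peak = ddr_peak_mb_s()           # 320 MHz * 2 * 2 bytes
print(f"needed {needed} MB/s, DDR2 peak {peak} MB/s, ratio {peak / needed:.1f}")
print(f"latency budget: {latency_ns(11, 80)} ns")
```

Peak bandwidth is the easy half of the problem. The latency budget is the hard one, because a refresh cycle inserted by the memory controller can push an individual access past the fixed window regardless of how much headroom the average throughput has.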

The project has not been progressing as well as I had hoped at first, but I've been reading up on the Cray Hardware Reference Manual [PDF 19MiB], which has lots and lots of details that are slowly beginning to make sense to me.