AMD announces the Spartan UltraScale+ FPGA family (cnx-software.com)
129 points by chrsw 10 months ago | 85 comments



It would be incredible if AMD made some open source moves with their FPGAs as they did with their GPUs. That would allow open source tools like Yosys and nextpnr to work with them easily, versus the painstaking reverse engineering efforts this now requires. Xilinx Vivado is a massive beast of a thing.

https://github.com/YosysHQ


I think the issue is there are different market drivers at play. GPUs are more 'visible' than FPGAs to consumers; there's probably a couple of orders of magnitude more people pushing for open source/hackable drivers for GPUs than FPGAs (at least today). So AMD doesn't feel the pressure in the FPGA market, and it's 'easier' to maintain the status quo. I suspect we'll have to wait for a use case where somewhere near as many average users are actually thinking about their FPGA vendor as they do about their GPU vendor before things will change. Maybe an AI/ML application?


From the article "AMD Spartan UltraScale+ samples and evaluation kits are expected to be available for sampling and evaluation in the first half of 2025." For hardware products, what is the point of making these announcements a year in advance? Are they available for select partners before 2025?


When you need to decide on what FPGA line you are going to base your products on, you certainly want to know years in advance what new models will obsolete your FPGA products. However, you don't just want to know their specifications but also their price.


This is kinda true.

But the flip side is most of them tell you ahead of time whether something is recommended for new designs, and how long they will guarantee production for.

Which is much more important.

In this day and age, with FPGA companies being merged and spun out left and right, I doubt anyone relies on the future line announcements this second.

In more normal times I agree you want to know if you are going down a dead end path.


If you know your product needs these features you can do early prototyping and development on similar or higher end platforms and then switch to this new platform when it becomes available.

Having knowledge of upcoming parts is important in hardware engineering for planning purposes.


Oh, yes. Hardware products and projects, especially those complex enough to need an FPGA, will usually have a long development pipeline, and even one year seems pretty short.

And you can very often prototype with either an overpowered development board or your own prototype board with another FPGA, and then downsize appropriately as the project advances. Most importantly, if you know there will be a viable version in a year, you can postpone the final decision, and Xilinx gets to avoid having you choose Altera now if it's possible that their new offering will match your project.


Do companies really gamble on their supply chain that hard?


They don't have a choice. FPGAs aren't like vanilla logic chips: they are highly proprietary both in terms of their specific hardware functionality and their software tool chains, and you can't second source them. Some of the larger customers are using them for government contracts (think military and space applications) where they are signing up to products they are contractually on the hook for, for years to decades.


Yes I know. So any product being developed today or in the next year would be done on a well established option that will be produced for many more years. It still isn't clear to me what such a long runway on the announcement does for anyone. It shouldn't be changing anyone's plans.


Yes they are.


What are some killer apps for FPGAs? What major products do they enable?


They enable a lot of crazy defence products. A well-known German product, for example, is the IRIS-T from Diehl Defence. Highly accurate and exceptional engineering. But I guess FPGAs are in most defence products nowadays. I think the biggest reason is that you can build/verify your own hardware without having to go through the expensive ASIC manufacturing.

Edit: I just realized that these are some literal killer apps. That wasn't even intentional, lol.


> But I guess FPGAs are in most defence products nowadays.

Yes.

> I think the biggest reason is that you can build/verify your own hardware without having to go through the expensive ASIC manufacturing.

Plus, you don't give out your secrets to fabs, too. Design, verify, launch, discard, without the need for signing an NDA.


"Plus, you don't give out your secrets to fabs, too."

That's a perfect answer to my question, thank you.

Also many other great answers in this thread, but I don't have much to add.


It also probably makes it easier to prevent adversaries from being able to delid/reverse engineer products. When using FPGAs you don't even need to have the firmware/gateware on or near the device until it's in use, which would help prevent any sensitive trade secrets from making it into the wrong hands.


"Jones, fire on the bandit at 270."

"Yessir."

"Jones, why isn't that SAM in the air?"

"Sarge, it's flashing the bitstream. The progress bar says 60%."


Any sensor that captures a ton of data that needs realtime processing to 'compress' the data before it can be forwarded to a data accumulator. Think MRI or CT scanners, but industrially there are thousands of applications.

If you need a lot of realtime processing to drive motors (think industrial robots of all kinds), FPGAs are preferred over microcontrollers.

All kinds of industrial sorting systems are driven by FPGAs because the moment of measurement (typically with a camera) and the sorting decision are less than a millisecond apart.

There are many more. It's a very 'industrial' product nowadays, but sometimes an FPGA will pop up in a high-end smartphone or TV because they allow certain features to be added late in the design cycle.


They enable a bunch of niches (some of which do have a large impact), as opposed to having a few high-volume uses. Basically anything where you really need an ASIC but you don't have the volume to justify an ASIC (and also have the large margins required for such a product to be viable). Custom RF protocols, the ASIC development process itself, super-low-latency but complex control loops in big motor drives, that kind of thing. You'll almost never see them in consumer products (outside of maybe some super-tiny ones which aren't useful for compute but just do 'glue logic') because they're so expensive.


What you're describing is correct for the top-end FPGA products (they're in every 5G base station, and almost every data centre has thousands of them rerouting information), but the low-end ($10 or less) 2k LE FPGAs are in a hell of a lot of products now too. They're fantastic for anything where you need a lot of logic that executes immediately/concurrently (vs sequentially, as it would with a microcontroller) in a tiny package. Think medical devices, robotics, comms devices, instrumentation, or power controllers.

I'm pretty sure there's an FPGA in most consumer devices now, but as you say they're there for some sort of glue logic - but that's a killer niche unto itself. Schematics can shift and change throughout a design cycle, and you only need to rewrite some HDL rather than go hunting for a different ASIC that's fit for purpose. It's a growing field again as their cost has come right down. They're in the Apple Vision headset, the Steam Deck, modern TVs, and a host of small form factor consumer computing products.


> they're in every 5G base station

Just a tiny nitpick to your great answer but Nokia's 5G base station stuff (Reefshark) is built around ASICs. I would expect others do the same. There's some reasoning at https://www.electronicdesign.com/technologies/embedded/artic...

https://www.nokia.com/about-us/news/releases/2020/06/15/noki...


The ReefShark ASIC sits alongside an FPGA which acts akin to an IPU. I know only because I played my own small part in the design. It was originally meant to be entirely FPGA-based, but they got hit with some severe supply constraints by Intel and Xilinx, which is why cost keeps getting discussed. Prices have dropped back down to stable numbers again since mid-last year, but at the time ASICs ended up being more affordable at the volume they're doing (demand spiked mid-project due to the removal of Huawei networking equipment).


We (outside Wireless) heard the Intel silicon didn't perform/yield and the original designs became infeasible, prompting a sudden mad scramble. I didn't realise it was originally planned to be FPGA-based. Interesting, thanks.

Very glad to hear things have improved.


> I'm pretty sure there's an FPGA in most consumer devices now,

I can’t think of the last time I saw an FPGA on a mainstream consumer device. MCUs are so fast and have so much IO that it’s rare to need something like an FPGA. I’ve seen a couple tiny CPLDs, but not a full blown FPGA.

I frequently see FPGAs in test and lab gear, though. Crucial for data capture and processing at high speeds.


Low-latency (e.g. less than 20 lines) video switchers/mixers. There's a huge amount of data (12 Gbps for 4K/UHD) per input, with many inputs and outputs, all with extremely tight timing tolerances. If you loosen the latency restrictions you can do a small number of inputs on regular PCs (see OBS Studio), but at some point a PC architecture will not scale easily anymore and it is much more efficient to just use FPGAs that will do the required logic in hardware. It's such a small market that for most devices an ASIC is not an option.
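
For a rough sense of these numbers, here is a back-of-the-envelope sketch in Python. It assumes a 2160p60 signal carried as 10-bit 4:2:2 (20 bits per pixel) on a 4400 x 2250 total raster including blanking; those format details are assumptions, not something stated in the comment above.

```python
# Back-of-the-envelope numbers for one UHD input.
# Assumptions (not from the comment above): 2160p60, 10-bit 4:2:2
# (20 bits per pixel), total raster of 4400 x 2250 including blanking.
TOTAL_PIXELS_PER_LINE = 4400
TOTAL_LINES_PER_FRAME = 2250
FRAMES_PER_SECOND = 60
BITS_PER_PIXEL = 20


def link_rate_gbps() -> float:
    """Raw serial rate of one input in Gbit/s."""
    pixels_per_second = (
        TOTAL_PIXELS_PER_LINE * TOTAL_LINES_PER_FRAME * FRAMES_PER_SECOND
    )
    return pixels_per_second * BITS_PER_PIXEL / 1e9


def latency_us(lines: int) -> float:
    """Time occupied by `lines` video lines, in microseconds."""
    line_time_s = 1.0 / (TOTAL_LINES_PER_FRAME * FRAMES_PER_SECOND)
    return lines * line_time_s * 1e6


if __name__ == "__main__":
    print(f"link rate: {link_rate_gbps():.2f} Gbit/s per input")
    print(f"20 lines:  {latency_us(20):.0f} us")
```

Under those assumptions one input is just under 12 Gbit/s and a 20-line budget is roughly 148 microseconds, a small fraction of the ~16.7 ms frame time that a frame-buffered PC pipeline would typically add.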


Blackmagic's whole gear line is based on Xilinx FPGAs. Whichever of their products you see, if you tear it down, it will almost always be nothing more than SerDes chips, I/O controllers and FPGAs.


Anything where you wish you could have an ASIC but you don't have the budget for a custom ASIC, and where using smaller chips either makes for a worse bill of materials or takes up more space.

They are used everywhere, including some very small ones I've seen used purely for power sequencing on motherboards - usually a very small FPGA with embedded memory that "boots" first on standby voltage and contains simple combinational logic that controls how other devices on the motherboard are powered up, faster than any MCU can do it, while taking less space than discrete components.

Glue logic, custom I/O systems (including high-end backplanes in complex systems), custom devices (often combined with "hard" components in the chip, like ARM CPUs in Zynq series FPGA), specialized filters that can be runtime updated.

Lots of uses.


They're used in places that require real-time processing (DSP) of raw digital signals at very high rates (several hundred MHz and more), where you cannot afford to miss a sample because of latency from a microcontroller (uC). I think even some PCI devices use them for this reason, and it allows you to update firmware whereas an ASIC doesn't.

A while back I wrote an entire FPGA pipeline to recalibrate signals from optical sensors before they were passed on to the CPU. Doing this allowed us to keep processing speed up with acquisition, so it was real time. A lot of FIR filters and FFTs. But my proudest achievement was a linear interpolation algorithm, which is fairly high level and tricky to implement on an FPGA, which is more geared towards simpler DSP algorithms like FIR filters and FFTs (not simpler, but so much effort has gone into making IPs that it effectively is, because you don't have to implement it yourself).

But other than that, for raw bulk compute GPUs are kicking their butts in most domains.
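
To illustrate the flavour of arithmetic involved, here is a minimal Python sketch of linear interpolation done the way FPGA logic typically prefers it: integers only, with a shift instead of a division. This is not the pipeline described above; the function name, the 8-bit fractional width and the uniform spacing are all assumptions for illustration.

```python
FRAC_BITS = 8  # assumed width of the fractional position between two samples


def lerp_fixed(y0: int, y1: int, frac: int) -> int:
    """Linear interpolation using only integer add, multiply and shift.

    `frac` is the position between y0 and y1, scaled to
    0 .. 2**FRAC_BITS - 1. The right shift stands in for a division,
    which is expensive to build in FPGA fabric.
    """
    if not 0 <= frac < (1 << FRAC_BITS):
        raise ValueError("frac out of range")
    # Python's >> on negative ints is an arithmetic (sign-preserving) shift,
    # matching a signed shift in hardware.
    return y0 + (((y1 - y0) * frac) >> FRAC_BITS)


if __name__ == "__main__":
    # Halfway between 100 and 200 with frac = 128/256.
    print(lerp_fixed(100, 200, 128))
```

The hard part in a real design is usually everything around this line: finding which pair of samples to interpolate between at full data rate, and keeping the pipeline stages aligned.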


To give you an example, these are often used in CNC machines.

Before you had to have:

A. a PLC that ran logic with real time guarantees to tie everything together. The PLC is often user-modified to add more logic.

B. Decoders that processed several-MHz encoder feedback signals, somewhere between 3 and 10 of these.

C. Something that decides what to do with the data in B.

D. Encoders and motor driving, also being output at several MHz (somewhere between 3 and 10 of these as well).

Among other tasks.

These were all separate chips/boards/things that you tried to synchronize/manage. Latency could be high. But if you are moving 1200 inches per minute (easy), 100 milliseconds latency is equivalent to losing track of things for 2 inches. Enough to ruin anything being made.

Nowadays it is often just an FPGA hooked up to a host core.

(or at a minimum, a time-synchronized bus like EtherCAT)
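
The 2-inch figure above is simple arithmetic, sketched here in Python so other feed rates and latencies can be plugged in. The function name is made up for illustration.

```python
def distance_lost_inches(feed_in_per_min: float, latency_ms: float) -> float:
    """Distance an axis travels during `latency_ms` at the given feed rate."""
    inches_per_second = feed_in_per_min / 60.0
    return inches_per_second * (latency_ms / 1000.0)


if __name__ == "__main__":
    # The case from the comment: 1200 in/min with 100 ms of latency.
    print(distance_lost_inches(1200, 100))
    # The same feed rate with 1 microsecond of latency.
    print(distance_lost_inches(1200, 0.001))
```

At 1200 inches per minute the axis moves 20 inches per second, so every millisecond of uncertainty is 0.02 inches of travel.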


- ASIC emulation and prototyping

- High-frequency trading (executed in-fabric)

- Niche real-time video devices, etc.

- Cryptocurrency mining

- Real-time motor pulse generation for robotics

- Custom NICs and HPC devices

- RF signals processing (radar, guidance, etc.)


Products with PCIe (PCI Express) and high speed interfaces like 10G Ethernet, SATA, HDMI, USB 3.0 and higher, Thunderbolt.

Most of the ASICs with these SerDes interfaces are not for sale on the open market, only to OEMs who buy at MOQs in the millions.

Take for example the Raspberry Pi SBCs. The Raspberry Pi only got PCIe very late (Compute Module 4); influencer Jeff unlocked it with a lot of difficulty https://pipci.jeffgeerling.com but you still can't buy these cheap microprocessors from Broadcom.

The reason is that no cheap PCIe chips are available for hobbyists and small company buyers (below a million dollars).

'Cheap' FPGAs starting at $200+ were and still are the only PCIe devices for sale to anyone. If you want to nitpick, a few low-speed SerDes are available in $27 ECP5 FPGAs, but no 10 Gbps and higher.

Another example: I sell $130 switches with 100 Gbps switching speeds and PCIe 4x8 and QSFP28 optics. But you can't buy the AWS/Amazon ASIC chips on this board anywhere, nor their competitors' chips from Broadcom, Intel, MicroSemi/Microchip, Marvell.

I went as high as Intel's vice president and also their highest level account manager VPs and still got no answer on how to buy their ASIC switches or FPGAs.


The core of modern oscilloscopes is often an FPGA that reads out the analog-to-digital converters at ~gigasamples/s and dumps the result into a RAM chip. Some companies (Keysight, Siglent) use custom chips for this, but FPGAs are very common.


From a consumer-facing perspective, FPGAs have enabled a golden age of reasonably affordable and upgradeable hardware for relatively niche tech hobbies.

* Replacement parts for vintage computers

* Flash cartridges and optical drive emulators for older video game consoles

* High-speed, high quality analog video upscalers

Many of these things aren't produced at a scale where producing bespoke chips is really viable. Using an FPGA lets you build your product with off-the-shelf parts, and lets you squish bugs in the field with a firmware update.

There is also MiSTer, an open source project to re-implement a wide range of vintage computer hardware on the Terasic DE10-Nano FPGA.


Stuff with a lot of simultaneous i/o that needs to be processed simultaneously, is one answer.

https://www.intel.com/content/www/us/en/healthcare-it/produc...


Lower-volume specialty chips for interfaces (lots of I/O pins), such as adapters for an odd interface, custom hardware designs for which there isn't an existing chip, etc.

For instance, audio, video or other signal processing can be done by putting the algorithm "directly" into the hardware design; it will run at a constant predictable speed thereafter.


RME realizes sub Firewire latencies over USB 2 with them in their audio interfaces, plus the ability to enable new functionalities via updates.


I think low latency is the main thing. In most cases, to get an FPGA that's faster in terms of compute than a GPU/CPU you're going to have to spend probably hundreds of thousands (which the military do, e.g. for radar and that sort of thing).

But even a very cheap FPGA will beat any CPU/GPU on latency.


In the past, I’d have tried to use Achronix’s FPGA’s for secure processors like Burroughs B5000, Sandia Secure Processor, SAFE architecture, or CHERI. One could also build I/O processors with advanced IOMMU’s, crypto, etc. Following trusted/untrusted pattern, I’d put as much non-security-critical processing as possible into x86 or ARM chips with secure cores handling what had to be secure.

High-risk operations could run the most critical stuff on these CPU’s. That would reduce the security effort from who knows how many person-years to basically spending more per unit and recompiling. Using lean, fast software would reduce the performance gap a bit.

CHERI is now shipping in ASIC’s, works with CPU’s that fit in affordable FPGA’s, and so this idea could happen again.


One particular use of FPGAs (and ASICs) is operating on bit-oriented rather than byte-oriented data. Certain kinds of compression and encryption algorithms can be implemented much more efficiently on custom chips. These are generally limited to niche applications, though, because the dominance of byte-oriented general-purpose CPUs and microcontrollers selected against such algorithms for more common applications.
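
A small Python sketch of why this is awkward on byte-oriented machines: pulling fields of arbitrary bit width out of a byte stream takes a shift and a mask per bit (or careful word-level juggling), whereas in an FPGA an unaligned field is just a different set of wires. The `BitReader` class is hypothetical and written for illustration only.

```python
class BitReader:
    """Reads unsigned fields of arbitrary bit width, MSB first, from bytes."""

    def __init__(self, data: bytes) -> None:
        self._data = data
        self._pos = 0  # current position, counted in bits

    def read(self, width: int) -> int:
        """Return the next `width` bits as an unsigned integer."""
        value = 0
        for _ in range(width):
            if self._pos >= len(self._data) * 8:
                raise EOFError("ran out of bits")
            byte = self._data[self._pos // 8]
            bit = (byte >> (7 - self._pos % 8)) & 1
            value = (value << 1) | bit
            self._pos += 1
        return value


if __name__ == "__main__":
    reader = BitReader(bytes([0b10110011, 0b01000000]))
    # A 3-bit field followed by a 7-bit field that straddles a byte boundary.
    print(reader.read(3), reader.read(7))
```

The second field crosses a byte boundary, which is exactly the case that costs extra instructions on a CPU and nothing extra in custom logic.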


Ultra-accurate classic computer and videogame emulators!


It's not so much about accuracy as the low-latency video output (and at the correct refresh rate).


Low latency video and correct refresh rate are part of why FPGA emulation is more accurate.


It can be of use in anything that handles a lot of data throughput but is not built in large enough numbers to justify producing an ASIC. First example that comes to mind is an oscilloscope, but by definition FPGAs can be used anywhere (from retrogame consoles to radars).


Broadly speaking, anything that does either a lot of reasonably specialized logic or medium-to-high performance broad work will have an FPGA in it (unless it's made in very high volumes, in which case it may be an ASIC; ditto for very high performance things).

Some FPGAs are absolutely tiny e.g. you might just use it as a fancy way of turning a stream of bits into something in parallel for a custom bit of hardware you have, other FPGAs are truly enormous and might be used for making a semi-custom CPU so you can do low latency signal processing or high frequency trading and so on.


I think we’re going to see a greatly increased use of FPGAs in AI applications. They can be very good at matrix multiplication. Think about an AI layer that tunes the FPGA based on incoming model requirements? Need an LPU like groq? Done. I would bet Apple Silicon gets some sort of FPGA in the neural engine.


But ASICs perform way faster and more efficiently. I doubt the gain you would get from "retuning" the FPGA would be large enough to compete with the benefit of a general purpose processor, GPU, or an ASIC.


Until you need floating point performance.


They are useful for products which do video encoding, decoding, and microwave receive and transmit of video data. They are useful for TCP/IP insertion and extraction of packet data, e.g., in video streams.


Some older video game consoles have been "emulated" in FPGA. You just map out the circuitry of a device and voila you get native performance without the bugs of a software implementation.


Deterministic latency so you know the upper bound and lower bound


Given the importance of storage and networking when working with big data for LLMs, having storage and network code in an FPGA might be useful.


Military Radar and Sonar


SDR


This "announcement" is basically a no-op. It's very likely not a new device but taking something like an Artix, re-branding it, and then limiting access to device features via software (Vivado). The same was done for 'Spartan-7' -- it was NOT a new device, just an existing design where they removed the SERDES block.

What made Spartan-6 unique was that, in addition to having transceivers, it had a hardened DDR memory controller.

If you read this announcement, DDR support is done via 'soft' memory controller, which chews up a lot of resources and makes meeting timing frustrating, forcing use of Xilinx's MIG IP.


> Transceivers – Up to 8 GTH transceivers supporting up to 16.3 Gb/s

Hey, the Spartan line is useful again!


Finally, I waited years for this. I could only use the Kintex line. Big question is: will these Spartan UltraScale+ parts with transceivers and PCIe 4.0 be cheaper or more expensive?


They have strong competition from Lattice and I really thought they had ceded the low-end to them already. For the PCIe chips you can figure out where the pricing must fall for a certain feature set, but I'd be surprised if the cheapest PCIe 4.0 part in the Spartan US+ family is under $500. Consider the Lattice Avant-X series as competitive reference: https://www.mouser.com/c/semiconductors/programmable-logic-i...


Strong competition from Lattice? I might be missing something, but AFAIK Lattice CertusPro-NX FPGAs are more expensive, and so is the synthesis software.


They already supported this line rate though, for years?

What am I missing?

(I.e., it's still only PCIe Gen3, and this announcement doesn't change that.)


Spartan 6 had transceivers. Spartan 7 dropped them. Boo. Spartan UltraScale+ brings them back. Hooray! (Though to morphie's point we should probably hold the champagne until we see pricing.)

Also, it looks like PCIe Gen 4. The bitrate is enough and the page says Gen4x8. Is there something I'm missing?
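
A quick Python sanity check of the "bitrate is enough" reasoning. The per-lane rates (8 GT/s for Gen3, 16 GT/s for Gen4) and the 128b/130b line coding are taken from the PCIe specifications; the 16.3 Gb/s and 8-transceiver figures are from the spec line quoted earlier in the thread. The function names are made up, and this says nothing about whether a particular part, package or speed grade actually enables Gen4.

```python
# Per-lane raw signalling rates in GT/s, per the PCIe specifications.
PCIE_RATE_GT_S = {3: 8.0, 4: 16.0}
ENCODING_EFFICIENCY = 128 / 130  # 128b/130b line coding (Gen3 and Gen4)

# Figures from the quoted announcement line.
GTH_MAX_GBPS = 16.3
GTH_COUNT = 8


def lane_fits(gen: int, transceiver_gbps: float = GTH_MAX_GBPS) -> bool:
    """True if one transceiver's line rate covers one lane of this generation."""
    return PCIE_RATE_GT_S[gen] <= transceiver_gbps


def payload_gbytes_per_s(gen: int, lanes: int) -> float:
    """Bandwidth after line coding, before protocol overhead, in GB/s."""
    return PCIE_RATE_GT_S[gen] * ENCODING_EFFICIENCY * lanes / 8


if __name__ == "__main__":
    print("Gen4 lane fits in a GTH:", lane_fits(4))
    print(f"Gen4 x{GTH_COUNT}: {payload_gbytes_per_s(4, GTH_COUNT):.2f} GB/s")
    print(f"Gen3 x{GTH_COUNT}: {payload_gbytes_per_s(3, GTH_COUNT):.2f} GB/s")
```

So the transceivers are fast enough on paper for Gen4 x8 (about 15.75 GB/s in each direction), double what Gen3 x8 would give.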


Yes, we are missing vital information.

Similar issues as Artix Ultrascale+ where the small print says " PCIe Gen4 is available in AU10P and AU15P in the FFVB676 package. AU10P and AU15P in other packages support Gen3x8.". Same die for all these FPGAs but not same speeds at given prices. Most of the die will be disabled, 'binned'.

So Spartan UltraScale+ can theoretically support 8 lanes of PCIe Gen 4, but it is unclear whether a given part actually does at the price you pay, and whether the PCIe Gen 4 IP is free or is very expensive and only available in the expensive toolchain under NDA.

[1] https://www.xilinx.com/products/boards-and-kits/device-famil...


Forgive my ignorance, and I made a brief attempt to answer it myself - what's GTH in this context?


Iirc it stands for “gigabit transceiver high speed” or something. There is also GTX, GTY, GTP, etc which are different implementations of SERDES on Xilinx chips.

It’s like USB full speed and USB high speed. They’re largely meaningless marketing terms just to differentiate generations.


Thanks, that explains why all my search turned up was xilinx product materials...


Interesting product. It seems they ported some features from the Versal family, like the hard LPDDR controller and the new XP5IO, so it's not just a cut-down version of the bigger Artix.


No hard core is a bummer. I want to see the first party SOM.


You mean something like this: https://www.amd.com/en/products/system-on-modules/kria.html

Spartan is the glue-logic family for applications that either do not require any kind of CPU core at all or where some simple soft-core (MicroBlaze, x51, hand-crafted state machine...) suffices.


Yes I do mean the Kria line, but one for this new Spartan Ultrascale+. I already buy packaged versions of $30 FPGAs because it alleviates most of the integration effort.


Why? They have a different line for that. Not worth wasting expensive silicon die area on something a lot of customers or projects won't need.


Because a hard core is much smaller and more efficient than a soft core. Any small edge application you want to run embedded code on (which I argue is almost all of them) will now need an external hard core, or will use a lot of precious PL resources and power for a soft core.


They already sell other models with hard cores for those who need that since not all applications do.


> No hard core is a bummer.

The Zynq product family already exists.


I wonder what is the hardware programming interface.

Something "horribly" simple I hope and suppose (memory mapped command ring buffers and dma buffers?)

Then quickly getting up and running a shmol RV64 core... and I guess this is hacker paradise.


In FPGAs you are literally programming the enablement of individual logic gates. That's it.

There is no DMA or command ring, you have to implement it yourself.

The FPGA bitstream itself is extremely proprietary to each vendor, as the details of how all the logic resources are controlled are secret sauce. So you have to use the vendor's tooling.
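
A toy Python model may make "programming individual logic gates" more concrete. SRAM-based FPGA fabric is largely built from small look-up tables (LUTs) whose configuration bits are simply a truth table. The 4-input size, the bit ordering and the names below are assumptions for illustration, not any vendor's actual bitstream format.

```python
class Lut4:
    """A 4-input look-up table: 16 configuration bits form the truth table."""

    def __init__(self, init: int) -> None:
        if not 0 <= init < (1 << 16):
            raise ValueError("init must fit in 16 bits")
        self.init = init

    def eval(self, a: int, b: int, c: int, d: int) -> int:
        """Look up the output bit for inputs a..d (each 0 or 1)."""
        index = (d << 3) | (c << 2) | (b << 1) | a
        return (self.init >> index) & 1


# Same "hardware", two different functions, chosen only by the config bits.
AND4 = Lut4(0x8000)  # only index 15 (all inputs high) outputs 1
XOR4 = Lut4(0x6996)  # parity of the four inputs

if __name__ == "__main__":
    print(AND4.eval(1, 1, 1, 1), AND4.eval(1, 0, 1, 1))
    print(XOR4.eval(1, 0, 0, 0), XOR4.eval(1, 1, 0, 0))
```

A real device has a very large number of these plus the routing between them, flip-flops, block RAM and hard IP, and the mapping from a design to those configuration bits is the part the vendors keep closed.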


If we have to go thru a closed source or brain damaged hardware programming interface, this is a loss for us.


It has basically always been this way. Xilinx wasn't different before this announcement. Altera wasn't and isn't either. Some of the smaller Lattice designs are a bit more open, but are a different market segment.

You will use Vivado and you will like it!


An FPGA is a blank slate, more so than a CPU is, the programming interface is whatever the user wants to implement (including ring buffers and DMA or whatever).


> I wonder what is the hardware programming interface.

Vivado

> Then quickly getting up and running a shmol RV64 core... and I guess this is hacker paradise.

https://github.com/SpinalHDL/NaxRiscv


Comparison to the one in MiSTer?


MiSTer is about 110K, this can go up to 218K. This new FPGA is probably in something like a BGA package, since it has over 500 I/O pins. Also not available this year: "AMD Spartan UltraScale+ samples and evaluation kits are expected to be available for sampling and evaluation in the first half of 2025."


This isn't "a" new FPGA; there'll be like 50+ (or even hundreds of) SKUs. Many will be various BGA/chip scale packages (with different numbers of pins available), but not necessarily all of them. The Spartan line, being the lowest level of Xilinx's FPGAs, has traditionally had a few versions in TQFP packages. I wouldn't be incredibly surprised if they did the same with this new version (although it's probably less and less likely with each generation).


There weren't any TQFP packages in 7-series either, and I don't expect that to change for US+. Even Spartan-6 only had that package for their smallest parts (XC6SLX4/LX9).


Keen to see how this impacts emulator development.


Shame it is only PCI-e Gen 4.


You don't seem to get PCIe Gen 5 outside of the top-end devices. Agilex 7 has it on the Intel side with the R-Tile devices (including a hard CXL IP too, which is lovely), but their recently announced Agilex 5 (the UltraScale+ competitor) only hosts PCIe 4.0 x8. I guess it takes up too much in terms of resources to double the throughput on a low-power device.



