Chip 2.0: High Level C to Verilog with Python Bindings (github.com/dawsonjon)
63 points by Immortalin on May 12, 2018 | 27 comments



As a hardware engineer, I'm always skeptical of these homebrewed high-level design flows; they rarely seem to translate to good hardware.

And that seems to be the case for this one too. I quickly generated some Verilog code from one of the given examples and was surprised to see flip-flops without resets; some registers are never reset at all.

I have some serious doubts about the quality of the hardware generated by this tool.


I agree with you: it pays to be skeptical of these high level design flows. I'm wary of HLS even from established vendors.

However, if it targets FPGAs rather than ASICs, maybe the reset omission is on purpose. An FPGA guarantees initialization on configuration load, and some registers may not need to be cleared afterwards. Besides, a global reset signal consumes precious routing resources spanning large areas.

But I would still triple-check this, even for FPGA use. I expect to actually do so in a few weeks.

EDIT: Ok, this comment sat far too long in the edit box before posting. Now I see the huge discussion in this thread. I'll keep this comment anyway.


No need for async reset in most FPGA flows--the flops all start at zero, or whatever you initialize them to. Adding one anyway may bloat your area and has no benefit. (Like, if you don't trust the FPGA's own power-on-reset logic, then how do you expect it to get configured?)
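For anyone who hasn't seen the FPGA idiom: a minimal sketch of a register relying on the bitstream's power-on value instead of a reset (a generic hand-written example, not output from this tool):

  module counter (
      input  wire       clk,
      output wire [7:0] count
  );
      // Power-up value comes from the bitstream at configuration time,
      // so no reset signal or reset routing is needed.
      reg [7:0] count_r = 8'd0;
      assign count = count_r;

      always @(posedge clk)
          count_r <= count_r + 8'd1;
  endmodule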


This. GP's comment is way old school. For most FFs in your designs, reset is unnecessary.

Ken Chapman at Xilinx wrote some wonderful whitepapers about that; look them up. Join us in enlightenment, brother.


It's not old school; it's standard practice in the ASIC design industry.

You mentioned those papers but didn't link them, so I'll have to guess that this is the one, in order to address your comment:

- https://www.xilinx.com/support/documentation/white_papers/wp...

I guess the keyword here is "FPGA design". The author of the paper argues that the FPGA already offers an abstraction layer that allows the designer to ignore the problem in certain cases in order to save area.

> The good news is that 99.99% of the time, the timing of the reset release really doesn't matter. (...) However, if you have ever had one of the circuits that doesn’t work the first time, then maybe you have encountered one of the 0.01% cases and have been unlucky enough to have released the reset at the wrong time.

> A design implemented in a Xilinx FPGA does not require insertion of a global reset network. For the vast majority of any design, the initialization state of all flip-flops and RAM following configuration is more comprehensive than any logical reset will ever be. There is no requirement to insert a reset for simulation because nothing will be undefined.

But that doesn't mean it is 100% correct, as this design would be very poor if it were ever translated to an ASIC. I guess since this project is done with FPGA design in mind, the strict ASIC design rules don't apply here.


It's not really a question of degrees of correctness. Optimizing for FPGA and for ASIC implementation are just different problems. Approaches that are optimal for one may be sub-optimal or outright wrong for the other.


The presence or lack of a reset on a FF isn't exactly the best proxy for hardware quality.


Strange bugs can happen if the hardware is not correctly initialized, and an asynchronous reset is almost universal practice for dealing with this.

Not to mention that the coding style of an always@ block might influence certain optimizations in the hardware synthesis tool. If the FF is not explicit in the code, the tool might infer a latch instead, for example.
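For readers who haven't hit this before, the classic latch-inference trap looks roughly like this (a generic hand-written sketch, unrelated to this tool's output):

  module latch_vs_ff (
      input  wire clk,
      input  wire en,
      input  wire d,
      output reg  q_latch,
      output reg  q_ff
  );
      // Incomplete combinational block: q_latch must hold its value while
      // en is low, so the synthesizer infers a level-sensitive latch.
      always @(*)
          if (en)
              q_latch = d;

      // Explicit flip-flop: edge-triggered block, non-blocking assignment.
      always @(posedge clk)
          if (en)
              q_ff <= d;
  endmodule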

It's true that there are reasons to use always@ blocks like the ones the generated code provides, but it's not really obvious in this case why that style is being used.


> ... the presence of an asynchronous reset is almost an universal practice when it comes to this

Using async resets isn't a great example when many modern ASIC flows shun them in favor of synchronous resets. :-)

(Among other reasons, an async reset will always mess up your design when you have crosstalk-induced glitches; a sync reset will only do so when the glitch coincides with a clock edge.)
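For anyone following along, the two coding styles side by side (a generic sketch, not tied to this tool or any particular flow):

  module reset_styles (
      input  wire clk,
      input  wire rst,      // assumed already synchronized to clk
      input  wire d,
      output reg  q_async,
      output reg  q_sync
  );
      // Asynchronous reset: rst sits in the sensitivity list, so a glitch
      // on it can clear the flop at any moment.
      always @(posedge clk or posedge rst)
          if (rst) q_async <= 1'b0;
          else     q_async <= d;

      // Synchronous reset: rst is only sampled at the clock edge, so it
      // gets folded into the D-input logic.
      always @(posedge clk)
          if (rst) q_sync <= 1'b0;
          else     q_sync <= d;
  endmodule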

> Not to mention that the coding style of a always@ block might influence certain optimizations in the hardware synthesis tool. If the FF is not explicit in the code, the tool might infer a latch instead, for example.

Yeah, let's just strongly disagree on that one and leave it at that.


Then both of us must be designing ASICs which are worlds apart. :-)

Asynchronous resets are fundamental because they are not dependent on the presence of a working clock. Imagine that you have a system working on synchronous reset: if your entire system breaks, including the PLL which generates the clock, and you want to reset it, how are you going to do it if the clock is not working in the first place?

Also, a reset signal is asynchronous by nature; if you wanted to use it synchronously with your system's clock, you would need to add extra hardware to bring it into your clock domain. And this would expose it to possible metastability problems.


The arguments that you give are pretty trivial and easy to solve once and for all, compared to the physical design consequences (which you don't mention at all).

They were definitely not sufficient to stop the conversion of millions of lines of Verilog from async to synchronous resets, a decision that was obviously not made lightly.

Ultimately, both async and sync reset methodologies can be made to work. Using sync reset in all of today’s chips doesn’t suddenly make the async chips of 5 years ago non-functional.

On an FPGA (which is the main target of this tool), it doesn’t really make a difference since there is no physical design to be done.


I didn't mention physical implementation (if that's what you mean by "physical design"; to me, "physical design" is a term describing full-custom design done directly in schematic/layout without even using design languages such as Verilog) because the async/sync choice is part of the digital architecture of the hardware.

I feel I don't understand what you are trying to say; you just dismissed my entire comment without addressing any of the things I mentioned.


Alright, let's talk about your comments:

> Asynchronous resets are fundamental because they are not dependent on the presence of a working clock.

Yes. And a clock is fundamental because without one, your design wouldn't work either. :-) So there's little point in worrying about your flip flops not having a defined value before the clock is present.

> Imagine that you have a system working on synchronous reset: if your entire system breaks, including the PLL which generates the clock, and you want to reset it, how are you going to do it if the clock is not working in the first place?

For core logic (which is 99.9999% of your gates), it doesn't matter that they don't get a defined state in the situation that you describe.

For the few signals where you actually care about a defined state (IOs or PLLs or whatever), you still connect the async reset. Problem solved.

> Also, a reset signal is asynchronous by nature; if you wanted to use it synchronously with your system's clock, you would need to add extra hardware to bring it into your clock domain. And this would expose it to possible metastability problems.

You use exactly the same reset synchronizer that you use with an asynchronous reset. After all, even when using async resets, you release the reset synchronously as well. The difference between async and sync resets is in the FFs, where the reset is folded into the D-input logic instead of driving the flop's asynchronous clear pin. It's not in the way the reset is generated.
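A typical reset synchronizer along those lines, sketched generically (assert asynchronously, release synchronously; the exact structure here is my assumption, not code from this tool):

  module reset_sync (
      input  wire clk,
      input  wire rst_n_in,   // raw, asynchronous, active-low reset
      output wire rst_n_out   // distribute this one to the core
  );
      reg sync0, sync1;
      // Assertion hits the flops immediately; de-assertion is filtered
      // through two stages so it is always released on a clean clock edge.
      always @(posedge clk or negedge rst_n_in)
          if (!rst_n_in) begin
              sync0 <= 1'b0;
              sync1 <= 1'b0;
          end else begin
              sync0 <= 1'b1;
              sync1 <= sync0;
          end
      assign rst_n_out = sync1;
  endmodule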

> I didn't mention physical implementation because the async/sync choice is part of the digital architecture of the hardware.

That's interesting, because the async/sync reset decision was almost entirely driven by backend-related considerations.

I already mentioned one of the biggest problems with async resets: they expose your design to glitches at all times, whereas a sync reset only exposes you during a clock edge. In other words: sync resets inherently make your design more resilient against non-digital secondary effects.

> you just dismissed my entire comment without addressing any of the things I mentioned.

I dismissed them because they are trivial to work around in the front end and they can be simulated. And because once you fix them, they'll never come back during the next layout spin.

On the other hand, a backend issue will reappear: no matter how many crosstalk-related ECOs you've done, you'll have to start all over again if there's a spin.

Choosing between async and sync reset is a matter of trade-offs. They both can work very well, and they each have benefits and disadvantages. You're doing yourself a disservice if you dismiss one just because it was always done that way.


Very interesting that this doesn't seem to lean on any existing compiler toolchains - it all appears to be hand-written, from the compiler through to the CPU generation.

I wonder how the results compare to something like TCE http://tce.cs.tut.fi/ which takes advantage of the LLVM ecosystem.

Given the lack of advanced C compiler optimizations and the inherent drawbacks of FPGAs, can this even beat the power or speed of a software implementation running on e.g. a modern ARM core?


I wish the documentation gave more details about the whole conversion process.

I'm guessing that the RISC core gets optimized by dropping instructions that aren't needed for a particular C program.

But does it have some kind of heuristic to determine whether or not to drop an instruction based on how often it's used? Or is it just a greedy "If the C compiler generates instruction A, then add instruction A to the CPU" decision?


Very interesting. Please take note that this is not your usual transpiler such as MyHDL or Migen:

"Behind the scenes, Chips uses some novel techniques to generate compact and efficient logic - a hybrid of software and hardware.

Not only does the compiler translate the C code into CPU instructions, it also generates a customised pipelined RISC CPU on the fly. The CPU provides the optimal instruction set for any particular C program."

I'll test it out soon and post about it.


I wonder what it would do if you compiled a small Lisp or Scheme interpreter that included eval with it.


I'm not sure if I have the software skills for that (my compilers course on Coursera is still pending), but I will try to do it!

However, I would not raise my expectations too much.


You could try one of the minischeme implementations floating around github without too much fuss.

Not sure how FPGAs do their thing, but swapping out fgets() for something else and setting up the garbage collector to use one block of memory would probably be necessary, I'd imagine.


The FPGA architecture is not the problem here. This Chips thing works on another level of abstraction: it will generate a small RISC CPU and run the interpreter it compiles. So, maybe I can try to port http://armpit.sourceforge.net/ or something like this.

But I don't expect anything very interesting to happen. The "CPU optimization" will not result in a Lisp Machine. It will probably just compile the interpreter as usual and prune unnecessary instructions.


Does anyone know how this compares to LegUp [0], an academic HLS tool for C to Verilog? Apparently that tool has also been commercialized into a startup [1].

[0] http://legup.eecg.utoronto.ca

[1] http://www.legupcomputing.com


If you have the money/business, BlueSpec is the best tool for high level synthesis.

http://bluespec.com/54621


This looks fascinating - thanks for sharing!


How is this different from myhdl [0]?

[0] http://www.myhdl.org/


Completely different.

myhdl is an alternative to Verilog, but it's still RTL.

This tool takes the C code, compiles it to a custom assembly language, and embeds the generated assembly code as a ROM in a custom-generated RISC processor.
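Conceptually something like this, I imagine (a hand-written sketch of a program ROM inferred from a case statement; the encoding and depth are made up for illustration, not the tool's actual output):

  module program_rom (
      input  wire [3:0]  addr,
      output reg  [15:0] instr
  );
      // Fully specified combinational case statement: synthesizes to a
      // small ROM/LUT holding the compiled program.
      always @(*)
          case (addr)
              4'd0:    instr = 16'h1001;  // hypothetical "load immediate"
              4'd1:    instr = 16'h2102;  // hypothetical "add"
              4'd2:    instr = 16'h3000;  // hypothetical "store"
              default: instr = 16'h0000;  // hypothetical "halt"
          endcase
  endmodule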


There's no mention of that in the first few paragraphs.


Indeed, there is not.

I had to run the example design to fully understand what it's trying to do.



