Having just been the victim of the Intel I219 on Linux, and a heavy user of OpenWrt, my initial response when I saw the logo was pure horror.
For those that don't know – as I didn't – the hardware behind Intel's more recent ethernet chipsets seems only to be tested and supported on the most basic of networking configurations. As soon as you add something like a macvlan, it is apparently normal to expect problems like dropped packets and module unloading every few GB. One of the solutions seems to be to disable offloading [0], but for me that just bought an order of magnitude more time before the hang.
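For anyone wanting to try the "disable offloading" workaround, here is a minimal sketch that only builds the `ethtool -K` command line rather than running it. The feature names (`tso`, `gso`, `gro`) are standard `ethtool` offload features, but which ones matter for the I219 hang is my assumption, as is the interface name `eth0`; I am not reproducing what the linked report recommends.

```python
import shlex

# Offload features commonly toggled when chasing NIC hangs. These are
# standard `ethtool -K` feature names; whether disabling them helps on a
# given I219 setup is an assumption, not something verified here.
OFFLOAD_FEATURES = ("tso", "gso", "gro")


def ethtool_disable_offload_cmd(iface, features=OFFLOAD_FEATURES):
    """Return the argv list for `ethtool -K <iface> <feature> off ...`."""
    argv = ["ethtool", "-K", iface]
    for feature in features:
        argv += [feature, "off"]
    return argv


# "eth0" is a placeholder interface name.
command_line = shlex.join(ethtool_disable_offload_cmd("eth0"))
print(command_line)
```

Running the printed command needs root, and the setting does not survive a reboot unless you hook it into your network configuration.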
However, this does look like good work – and I have to celebrate the contribution from the devs on this one. There are huge gains to be had by network offloading – and perhaps in time Intel's hardware division will iron out the problems with their chipsets.
No. If it was my own machine I'd certainly try it out – but it's work's machine running Debian stretch, so we switched to a different NIC with a Realtek interface which has given no problems and meets our performance requirements.
To be clear, this appears to be a problem with the newness of the chipset and the time it takes to get drivers into upstream – the chipset we switched to had its fair share of teething problems when it first came out.
Intel had a long-standing reputation for rock-solid network cards with great open-source drivers ready the day of release – a reputation Realtek has never really had.
This topic of hardware offloading is interesting, and, for some people's requirements, there is an open source consideration to keep in mind: are you offloading to something that is effectively "closed source"?
I'm a long-time OpenWrt user on various reflashed SOHO WiFi router SoC boxes, and I recently decided to move my home router(s) to somewhat more open PC hardware (with pfSense, OpenWrt, or DIY atop a GNU/Linux distro), and obtained some Intel gigabit NICs for this purpose.
Since I want to use fanless PC hardware (although even the first Intel card alone runs quite warm at idle), I'm wondering how the performance will be, running all packets through Linux+CPU+I/O, compared to whatever hardware optimizations OpenWrt is doing on the SoC boxes. If the performance on PC hardware falls short, offloading could make the difference, but that would be starting to look more like the closed SoC I was going to a lot of effort to move away from.
(Ultimately, I would want to do this on a fully-open RISC-V board, with open hardware NICs, open hardware AES, etc., but I don't know that will ever be viable. For now, PC hardware, as open and trustworthy as I can get it, might be closest.)
The MediaTek MT7621 SoC has support for hardware NAT in OpenWrt. The driver is being developed by Felix Fietkau of ath9k fame, who has stated that it's the most open (blob-free) 5GHz chip on the market. I've been using an MT7621 device since before the LEDE split and the progress has been amazing. If you're reading this Felix, thank you!!
I haven’t been following the development branch but the current stable branch says “Experimental feature. Not fully compatible with QoS/SQM”. For this reason I haven’t enabled it.
EDIT: My decision to go with the mt7621 came from watching the rant by Felix on wireless drivers.
https://youtu.be/hiUosbhR0Wo
We once tried to develop a repeater on the MT7628, and it turned out to be a disaster. MediaTek's platform support just does not exist for anybody buying fewer than 1m chips.
Everybody else is footballed to regional partners who can't do a thing. Their official BSP is a fork of 2.4... No userspace tools can work with their drivers, which essentially run their own IP and MAC stack. Even their own tools are not sufficient to configure things; communications with them often went like "put byte A1 into secret register B and arm watchdog beforehand just in case it crashes."
Is this "hardware NAT" something that's closed in the hardware/firmware, or is it specialized hardware that is programmable with open source to implement NAT in some kind of accelerated way?
(I don't care as much as the FSF does, about where closed behavior is represented -- downloadable blob from host, on-device storage, or burnt into hardware. What's more important to me is getting more of the behavior represented in open source/hardware.)
(BTW, I love the Linux ath9k work. Besides various routers and PCIe WiFi cards, I have a stockpile of Corebooted ThinkPads in which I've replaced the mini-PCIe cards despite the original whitelisting.)
I’m not an expert but as I understand it, many chips have hardware offloading (what I referred to as Hardware NAT). However, it’s bundled into a closed blob. Meaning it may only work with the kernel it was developed against and that it can’t be modified.
However, the mt7621 has the smallest closed blob and exposes enough for open development of hardware offloading. It’s this same access that allowed for the great ath9k and is what is allowing for a great mt7621.
If open is what you are interested in try OPNsense before pfSense.
I'd recommend starting off with keeping your LAN on a dumb pure L2 switch (no management interface or anything) with your box acting as the upstream swouterwall. It's a lot easier to start with that side of things and see how much load it generates there than it is to throw the whole thing at it and troubleshoot what parts are making it run slow.
I'll chime in, since I moved to PC-based fanless router hardware about 5 years ago, and for my needs it's been fantastic.
One of my WAN connections is 100Mbps down (25 up), and I don't have any problem saturating it. I'm only using whatever hardware offloading Linux supports out of the box, which from the parent article I guess is pretty minimal (e.g. stuff like checksum offloading, I suppose). I haven't noticed any bottlenecks when routing between two gigabit LANs (I just ran a quick dd|nc test that measures 940Mbps). However, my needs may be relatively simple and I'm sure there must be more complex scenarios that would benefit from the offloading described in the parent article.
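As a side note on the 940Mbps figure: that is essentially line rate for TCP over gigabit Ethernet once framing overhead is subtracted. A rough back-of-the-envelope sketch, assuming a 1500-byte MTU, IPv4, no VLAN tag, and TCP timestamps enabled (common on Linux, but an assumption about this particular setup):

```python
# Per-frame overhead on the wire for standard Ethernet, in bytes.
PREAMBLE_SFD = 8      # preamble + start-of-frame delimiter
ETH_HEADER = 14       # dst MAC, src MAC, EtherType (no VLAN tag assumed)
FCS = 4               # frame check sequence
INTERFRAME_GAP = 12   # mandatory idle time between frames
IPV4_HEADER = 20      # no IP options assumed
TCP_HEADER = 20       # base TCP header
TCP_TIMESTAMPS = 12   # timestamp option incl. padding (assumed enabled)


def tcp_goodput_mbps(link_mbps=1000.0, mtu=1500, timestamps=True):
    """Best-case TCP payload rate for full-size frames on an Ethernet link."""
    options = TCP_TIMESTAMPS if timestamps else 0
    payload = mtu - IPV4_HEADER - TCP_HEADER - options
    wire_bytes = PREAMBLE_SFD + ETH_HEADER + mtu + FCS + INTERFRAME_GAP
    return link_mbps * payload / wire_bytes


print(f"with timestamps:    {tcp_goodput_mbps():.1f} Mbps")
print(f"without timestamps: {tcp_goodput_mbps(timestamps=False):.1f} Mbps")
```

With those assumptions the ceiling comes out at roughly 941 Mbps, so a dd|nc run measuring about 940 suggests the box is not the bottleneck for full-size packets.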
I bought a 4-port Intel NIC because I wanted discrete network interfaces for 2 LANs and 2 WANs. I ended up having to saw off some of the PCI-e edge connector segments to make it fit in my motherboard, so I'm presumably not getting the benefit of all the lanes the NIC could otherwise take advantage of, but nonetheless it seems satisfactory for my needs.
The two advancements I'm following in the router field (which do not fit your requirements though) are 1) Router 7 [a router implementation fully written in Go] [1] 2) Turris MOX [a modular, extensible router] [2].
You could run Router 7 on PC Engines [3] who at least use Coreboot.
These are great links. Router in Go is impressive (and making me wonder whether one could do the same in Racket/Scheme, and work around GC). I like to see parties acquiring/regaining the ability to make their own hardware. The PC Engines apu2 looks tempting, and a couple more NICs would be even better, for an all-in-one device supporting more LANs (that are a bit more compartmentalized), and avoiding the need for separate Ethernet switches/hubs in some cases.
Currently I went for a Ubiquiti route: ER-L, UAP AC Pro, and an ES‑16‑150W. Together it cost over 500 EUR. Add to it a Synology NAS with 16 GB RAM and 2x 6 TB HDD and it all cost ~1500 EUR.
I'd want the Unifi AP AC Pro either way though because the location of my router/switch isn't appropriate for WLAN AP, and I want the flexibility. My ER-L runs WireGuard, Pi-Hole (for adblocking), and Unbound (for DNSSEC). This allows my roaming devices (which roam over e.g. LTE or corporate/public WLAN) to utilize the mentioned services. The Synology NAS runs Docker, e.g. UNMS/Unifi Controller, but also things such as Nextcloud.
As long as I keep hardware offloading enabled (which does not work with certain software) the throughput of all of the above hardware saturates the specifications. I'm happy with the hardware, but I bought it all before I knew about R7.
Turris MOX software is based on OpenWrt btw. You could add 2x the 8 port module for a total of 17 ethernet ports (A+E+E) [1].
Are there any disadvantages to just using a SBC (pi, rock64 or whatever) that has a wifi chip and a LAN port, connected to a basic switch? I mean, besides having a slightly bigger box.
Well... if you want to route more than say 50-100 megabits... then the Pi won't suffice.
OpenWRT on many simple platforms will do a couple of hundred megabits, and on something like the edgerouter-x or routerboard 750gv3 will do about 800-900 megabits.
I like the Pi, but am leaning away from it for routers. I haven't tried it for a router, so the following is just my initial impression, take with a grain of salt...
The Pi is built around a closed hardware SoC, you can't pick&choose devices (like people often do when they can), devices are limited (e.g., 1 Ethernet), RAM is limited, devices might be on funny buses internally (e.g., on internal USB), microSD cards are not very reliable.
I don't know whether there's as solid a router software setup for the Pi as OpenWrt on a well-supported SOHO router.
I don't know how well the built-in WiFi in the Pi 3 can be made to work as an AP, and I think it's only one radio. Picking&choosing a WiFi device(s) that you plug in via USB gives you more options. You might need a powered USB hub, depending on how much total draw you've got on USB, and whether you've hacked your Pi board for USB power limit.
There used to be another consideration with the Pi, which is that I'd end up having to plug a bunch of things into it, including a powered USB hub, somewhat fragile, and I ended up putting everything into an electronics project box, and running just a power cable on a strain relief out. You also used to have to consult a list of known-good wallwarts and microSD cards, to reduce risk of flakiness, but I think that's improved. By comparison, a WNDR3700v2 with OpenWrt was an off-the-shelf appliance box, and could do more than the Pi, router-wise.
There are SBCs with better specs for routing. Where the SBC (and SoC) is designed could be a factor. Separate from SBCs, of course there's amd64 PC hardware on a Mini-ITX or MicroATX board, with PCIe slots for your choice of NICs, and gigabytes of RAM (though AES-NI is in the minority of fanless-capable options I've found). All these boards are effectively unauditable, of course.
(I'm not criticizing the Pi in general. I've used older Pis successfully, and currently use a Pi 3B+ with a 64-bit kernel as a builder&programmer&powersupply for PostmarketOS devices. I'm also thinking of putting a Pi into/onto the chassis of my laser printer, to provide a CUPS IPPS server that prints to USB, rather than try to keep firewalled in the router all the things this printer seems to want to do if you give it network access.)
If I wanted to go the SBC route, what specs would I want to pay attention to? There are many boards that are (or seem to be) well supported in mainline kernel, but I have no idea how to tell whether the wifi is well suited to this use case or the internal connection stuff... Throughput benchmarks are also few and far between.
Friendly PSA that modern Go has GC pauses under 1ms in almost all cases. That may still be a problem for your application, but it is a far cry from what people expect from other GC languages like Java.
I just found out about Turris Omnia. It's not open source but it's modular like a PC. But it's a router. I know it does not make sense, so please take a look yourself.
My experience with OpenWrt and hardware acceleration is that it matters once speeds get high enough.
I’ve had some TP-Link Archer c7’s I’ve used as a main router in the past.
When all it has to do is switch packets on my local network there’s literally no load (as inspected by using htop). Gigabits ahoy.
However when routing stuff out to the internet, it needs not only to switch but also to do software NAT, and then the SoCs performance starts saturating between 350 and 380 mbps. Doing 500mbps is not happening.
For that reason I changed my main router to a Linksys WRT 1900 ACS (with a Marvell-based SoC) which has hardware NAT.
It runs symmetric 500mbps without seeing the gauges in htop even move. It’s a night and day difference.
Once speeds get high enough, HW offloading isn’t just nice to have, it’s a requirement.
And when you do have that, almost all those other tasks the router has to do become trivial because you have wads and wads of CPU to spare.
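To put rough numbers on why software NAT runs out of steam, here is a sketch of the per-packet CPU budget. The 720 MHz clock is an assumed, illustrative figure for a SOHO router SoC (not a measurement), and the calculation ignores Ethernet framing overhead, so treat the results as order-of-magnitude only.

```python
def packets_per_second(bits_per_second, packet_bytes):
    """Packets needed each second to carry the given bit rate."""
    return bits_per_second / (packet_bytes * 8)


def cycles_per_packet(cpu_hz, bits_per_second, packet_bytes):
    """CPU cycles available per packet if one core does all the work."""
    return cpu_hz / packets_per_second(bits_per_second, packet_bytes)


CPU_HZ = 720e6            # assumed clock of a single-core router SoC
SYMMETRIC_500 = 2 * 500e6  # 500 Mbps down plus 500 Mbps up

for size in (1500, 64):   # full-size packets vs. minimum-size packets
    pps = packets_per_second(SYMMETRIC_500, size)
    budget = cycles_per_packet(CPU_HZ, SYMMETRIC_500, size)
    print(f"{size:>4}-byte packets: {pps:>12,.0f} pps, "
          f"{budget:,.0f} cycles per packet")
```

Under these assumptions full-size packets leave a few thousand cycles each for conntrack, NAT rewriting, and routing, while small packets leave only a few hundred, which is the regime where a hardware fast path stops being optional.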
Imo PC-based solutions are overkill, but yes, you may end up with a more open solution that way.
I have no idea about the economics in this area, but it kind of baffles me that they add these proprietary, closed source and buggy "accelerators" instead of improving the cores a bit. A bit more L1 cache would go a long way for networking.
Many switched from MIPS to ARM in the past 10? years, but the cores remain mostly just as anaemic as they were.
>Imo PC-based solutions are overkill, but yes, you may end up with a more open solution that way.
Yes exactly! If you look at it from the point of view of functionality to price ratio a pc might just be the best router there is.
- It's modular
- It's easily repairable
- It's portable (in the form of mini pc or a laptop)
- You can buy a second hand PC
Finally if you're taking the trouble of installing openwrt on a router it is a reasonable assumption that you want to do more with it. And in that case you're severely limited by the hardware that you choose.
To get the best of both worlds we can have a dedicated PC as a main router. And other cheap routers as your network extenders.
There's a push at the moment to get the Qualcomm QSDK hardware offloading ported to modern kernels for IPQ8064 (Dual Core Arm SoC with an additional 2 Packet Processing cores) - sadly, Qualcomm haven't kept pace with upstream Kernel dev so it's a mammoth task.
And that's the crux of the problem with these SoC's - they can have fantastic packet processing performance (and in the case of the IPQ8064 NSS cores accelerated PPPoE, qdisc and crypto) but it's all tied up in vendor only repos, doesn't track upstream and so you're stuck with buggy vendor firmware.
I was reading QSDK page [1] and it seems to me that they are doing something different from openwrt.
>The QCA Software Development Kit (QSDK) project allows users to build an OpenWrt based platform containing additional enhancements for Qualcomm Atheros chipsets that have not yet made it into the public OpenWrt repository.
I'm not aware of the project goals so I may be wrong.
Sort of - QSDK uses an old version of OpenWRT and a 3.x Linux Kernel to allow board partners (e.g. Netgear et al) to use their reference designs and spin up a working home router firmware quickly and easily.
An awful lot of devices these days ship with firmware that is actually OpenWRT (often v10-v15) based.
The actual NSS kernel modules have source available, and this is pulled into QSDK OpenWRT builds, but they've not had much luck getting stuff upstreamed[1] and getting them working on a recent 4.x kernel is non trivial.
This was also before the netfilter flow offloading framework, so the work is further compounded because they used their own offloading system.
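For readers unfamiliar with the idea behind flow offloading, here is a toy model of it: the first packet of a flow takes the full (slow) path, the resulting decision is cached per flow, and later packets reuse it. This is a conceptual sketch only. None of the names below are kernel or QSDK APIs, and it simplifies things by caching after the first packet, whereas real implementations typically wait until the connection is established.

```python
from collections import namedtuple

# A flow is identified by the classic 5-tuple.
FlowKey = namedtuple("FlowKey", "src dst sport dport proto")


class ToyRouter:
    """Conceptual model only; every name here is hypothetical."""

    def __init__(self, nat_ip):
        self.nat_ip = nat_ip
        self.flow_table = {}      # FlowKey -> cached forwarding decision
        self.slow_path_hits = 0
        self.fast_path_hits = 0

    def _slow_path(self, key):
        # Stand-in for the expensive walk: conntrack lookup, firewall
        # rules, NAT decision, and route lookup.
        self.slow_path_hits += 1
        return {"new_src": self.nat_ip, "out_if": "wan"}

    def forward(self, key):
        action = self.flow_table.get(key)
        if action is None:
            action = self._slow_path(key)
            self.flow_table[key] = action   # later packets skip the slow path
        else:
            self.fast_path_hits += 1
        return action


# Documentation-range addresses used purely as example data.
router = ToyRouter(nat_ip="203.0.113.1")
flow = FlowKey("192.168.1.10", "198.51.100.7", 40000, 443, "tcp")
for _ in range(5):
    router.forward(flow)
print(router.slow_path_hits, router.fast_path_hits)
```

Hardware offload pushes that cached table into the switch or packet engine, so established flows never touch the CPU at all. A vendor-specific offload system has to be reworked to fit whatever shared framework upstream provides, which is part of why the porting effort is so large.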
> An awful lot of devices these days ship with firmware that is actually OpenWRT (often v10-v15) based.
This is a great explanation and it answers some of my questions as well. But I have one more. I have not worked with the devices that you mention, but I was thinking: if they already have OpenWrt, what is stopping an end user from simply updating to the latest version?
Is there some kind of hardware incompatibility or maybe disabled updates?
Versions of OpenWRT are tied to different Kernel versions, it's an entire distro, not just a layer above the Kernel.
So the QCA NSS drivers, for example, are kernel modules. The source is fully open and available, but trying to get it to build on the 4.14.x kernel used by OpenWRT is an exercise in futility unless you know the linux networking code inside out, as well as understand what the drivers are doing.
Work is ongoing and some progress is being made for the IPQ8064, and some MediaTek SoCs now have full hardware offloading, but taking vendor-provided code and massaging it into something acceptable either to OpenWRT (which they are loath to do, as it's a huge job) or to the upstream Kernel is a huge effort.
The manufacturer usually modifies OpenWRT/QSDK to support their device. AFAIK, most of the time individual components from the device (CPU, Ethernet switch, wireless chip) are already supported in OpenWRT, it's just that the specific combo that the device contains just isn't there yet. This configuration is done with the device tree. On top of that, some manufacturers (Tp-Link, for example) don't use the standard OpenWRT sysupgrade image format, so the device rejects the new firmware that you try to flash.
OpenWrt is such a gem, not just for home routers but also for IoT gateways etc. Its installation base is probably on par with Debian. I have long been hoping that the Linux Foundation or somebody else would endorse this already great project and give it a huge boost.
Just the other day I tried the build system[1] for OpenWrt and I was amazed to find how easy it is to create a custom image for a platform of your choice. And not just the kernel: you can also build specific packages, if they are not already supported by the package repository or are outdated.
Everything can be done from a graphical interface. Just a few strokes of the keyboard and you have your own custom build. Really impressive stuff.
Not of this particular presentation. But there are quite a few other videos talking about openwrt on the website (you'll find them under previous summits menu)
[0] https://sourceforge.net/p/e1000/bugs/571/