Glad to see that GNU/Linux is getting a DTrace-equivalent. Sucks that it's all a bunch of string and staples rather than a single ioctl interface like DTrace -- but hey, that's the Linux way.
Reminds me of developing countries that leapfrog industrialized G8 countries in certain technologies, like skipping landline deployments and going straight for wireless networks.
It's a shared medium and can never reach the qualities of wired networking, though it might get good enough that we don't notice the difference as much as we do now. This reminds me of efforts like G.fast in the xDSL world. People only talk about approaching the speeds of common fiber configurations, but miss the point of the very low latency of FTTH and the easy upgrade path to, say, 10 or 40 Gbit/s, which would require a technological leap for DSL or cable. With fiber it would only require swapping out the "modem" and the network gear in the local uplink station (hut).
Eh, you're not gonna get 10 or 40Gbps over most FTTH installs. Most run GPON with passive optical splitters, which maxes out at ~2.5Gbps per line, shared across the split. To get 10/40/100 you need dedicated OS1/OS2 singlemode or OM3/OM4 multimode runs, which I've never seen used in residential/small-business FTTP setups.
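To put rough numbers on the shared part (a back-of-envelope sketch; the 1:32 split ratio is an assumption, real deployments run anywhere from 1:16 to 1:128):

    # GPON arithmetic: the ~2.5Gbps downstream is one line rate shared
    # by every subscriber behind the same passive splitter.
    GPON_DOWNSTREAM_GBPS = 2.488   # ITU-T G.984 downstream line rate
    SPLIT_RATIO = 32               # assumed subscribers per PON port

    worst_case_mbps = GPON_DOWNSTREAM_GBPS * 1000 / SPLIT_RATIO
    print("guaranteed per subscriber at full load: ~%.0f Mbps" % worst_case_mbps)
    # ~78 Mbps guaranteed; any single line can still burst toward the
    # full 2.5Gbps while its neighbours are idle.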
GPON isn't used everywhere, but it's used a lot, yes. Still, getting a dedicated 2.5Gbps with low latency isn't possible over the other connection types (yet?).
No doubt fiber is definitely the way to go if it's an option. Other than fiber, the closest you'll get is DOCSIS 3, which theoretically runs at ~1Gbps downstream. Anecdotally, I've had Fios for a couple of years, consistently get faster-than-advertised speeds, and have never had an outage, so I'm all about fiber.
I think it's high time that having an actual fiber drop becomes as commonplace as having a phone line or cable line. With how much people use the internet these days, it makes sense for it to have its own connection instead of piggybacking on whatever physical connection people already have.
That is not leapfrogging. To have decent internet access that can power an economy, you need some form of wire to the home: S/FTP, coaxial, twisted pair, or fiber.
I'm not suggesting that, just that economics changes things much more rapidly; cf. 5G, which may roll out multi-gigabit speeds to consumers long before landlines do.
And how will you provide multi-gigabit speeds to 100 people in close proximity simultaneously? There is the question of the throughput of the whole network.
The problem is that broadcasts are exactly that: broadcasting. They transmit in all directions. While you have bandwidth constraints in wires, they're all localised to the wire. Even parabolic dishes (which make getting a signal a huge PITA) still aren't as localised as cables. Wired connections get a bad rap, but they're actually a much more preferable (and scalable) medium for transmission than radio waves.
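As a rough sketch of the scaling difference (the cell capacity figure here is an assumption for illustration, not a 5G spec):

    # Shared radio cell: all active users divide one aggregate capacity.
    CELL_CAPACITY_GBPS = 10.0   # assumed aggregate capacity of one sector
    ACTIVE_USERS = 100          # the "100 people in close proximity" above

    per_user = CELL_CAPACITY_GBPS * 1000 / ACTIVE_USERS
    print("per-user share when everyone is active: %.0f Mbps" % per_user)
    # 100 Mbps each: "multi-gigabit" only holds while the cell is mostly
    # idle. A wired plant scales by pulling more strands; a cell has to
    # add spectrum, sectors, or towers.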
Not to mention that "bidirectional beams" is another way of saying "noise".
Sorry, this is going to be rather long for HN, but it's been bugging me for a while now, and this is a good opportunity to talk about it.
Can't enhanced BPF do this too?
Yes, if you program it to. I feel like we've been waiting for advanced tracing in Linux for years, and now two buses have arrived at the same time. This overlap concern was raised and discussed on lkml, and it was eventually deemed that hist triggers was different enough to be included.
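For a concrete sense of what "if you program it to" means, this is roughly the shape of an in-kernel histogram via bcc, the Python front end to enhanced BPF (adapted from bcc's bitehist example; the exact kprobe'd function name varies by kernel version):

    from bcc import BPF
    from time import sleep

    # Histogram of block I/O sizes, aggregated in the kernel; only the
    # rendered summary crosses to user space, not per-event data.
    b = BPF(text="""
    #include <uapi/linux/ptrace.h>
    #include <linux/blkdev.h>

    BPF_HISTOGRAM(dist);

    int kprobe__blk_account_io_completion(struct pt_regs *ctx, struct request *req)
    {
        dist.increment(bpf_log2l(req->__data_len / 1024));
        return 0;
    }
    """)

    print("Tracing block I/O... Ctrl-C to end.")
    try:
        sleep(99999999)
    except KeyboardInterrupt:
        pass
    b["dist"].print_log2_hist("kbytes")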
This is what's wrong with Linux kernel development. We have pluggable subsystems like I/O scheduling or CPU scheduling where BFQ or BFS are not mainlined and have to live as external patchsets. Then we have more than a few competing tracing/probing frameworks all somehow magically in torvalds/linux.git despite their overlap and unfinished (partly experimental) state.
I used to think that Linux kernel development is about who comes up with good, working code, but more and more I'm starting to get the impression it's that you need to have the right connections for a competing implementation to land.
I won't argue here that Linux needs DTrace or a similarly coherent solution, although I absolutely believe that. However, seeing the impossible-to-explain difference between what experimental, half-implemented code gets mainlined and which widely used patches have to live outside the kernel, I cannot find another explanation than needing the right street cred and connections for competing implementations to land. This is especially inexplicable for BFQ, because it plugs into an already pluggable system.
To be clear, I'm not referring to projects like GRsec, whose maintainers are unwilling to split it into smaller chunks; I'm primarily talking about simple things like BFQ, which has solved my global lockups during large (long) USB mass-storage flushes. With BFQ, at most the application that called fsync() blocks, and the rest of the applications stay responsive.
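For anyone wanting to try it: on a kernel that carries the BFQ patches, switching schedulers is just a sysfs write. A minimal sketch; "sda" is a placeholder device, and it needs root:

    from pathlib import Path

    def io_scheduler(dev):
        # Reads e.g. "noop deadline [cfq] bfq"; the active one is bracketed.
        return Path("/sys/block/%s/queue/scheduler" % dev).read_text().strip()

    def set_io_scheduler(dev, sched):
        # Takes effect immediately, per device, no reboot needed.
        Path("/sys/block/%s/queue/scheduler" % dev).write_text(sched)

    print(io_scheduler("sda"))
    # set_io_scheduler("sda", "bfq")  # only if this kernel ships BFQ

Which is exactly why it's so odd that a plugin for an already pluggable, per-device, runtime-switchable system can't land.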
With all that being said, the evolutionary style of kernel subsystem development doesn't work for everything, and especially not for a DTrace replacement, which would benefit from a single, well-thought-out design and broad availability of probes. While I can see that people are maybe afraid of Oracle patents, DTrace isn't so complicated that a similar effort couldn't be implemented while keeping the same benefits. A simple transpiler for DTrace scripts could be more than sufficient if a different syntax is preferred.
In summary, I don't understand how we have so many tracing alternatives, experimental and unfinished ones at that, while a pluggable subsystem has only the plugins personally accepted by that subsystem's maintainer.
> This is what's wrong with Linux kernel development. We have pluggable subsystems like I/O scheduling or CPU scheduling where BFQ or BFS are not mainlined and have to live as external patchsets. Then we have more than a few competing tracing/probing frameworks all somehow magically in torvalds/linux.git despite their overlap and unfinished (partly experimental) state.
> I used to think that Linux kernel development is about who comes up with good, working code, but more and more I'm starting to get the impression it's that you need to have the right connections for a competing implementation to land.
I think this is true of all sufficiently large organizations with human interactions: politics is an integral part of them.
Let's view this from another angle: you want to work on a project with tens of thousands of developers and literally billions of users. No matter your work's technical merits, you simply cannot DEMAND that Linus accept your huge patchset without having a very good track record.
Con did not have that track record. In fact, he made it very very clear that he was not interested in a long-term commitment and that he was not willing to compromise. I am sure Con was a good programmer, but he was also a cowboy programmer with awful people skills.
So this was never about "knowing the right people".
Let's ignore BFS because my motivator for using a -ck kernel is the availability of BFQ.
I mean, I didn't claim it's generally about knowing the right people, but some occasions, like the one mentioned above, do not leave much room for interpretation. BFQ in particular was discussed years ago on LKML, and while its general design was deemed beneficial, the consensus among the kernel maintainers was to refactor CFQ and turn it into BFQ. Given the pluggability of I/O schedulers, and that there are already at least three bundled schedulers, this makes no sense and is hard to explain to users of the kernel. Years have now passed, and neither did CFQ evolve in the discussed way nor did BFQ enter Torvalds' tree.
The argument I'm making is that in the meantime many unfinished and experimental bits landed in mainline. If I were the maintainer of a patch like BFQ, I might well take the effort to a more inclusive kernel, get it mainlined there, and integrate and refine it afterwards.
Yes, there are two frameworks in the kernel that now have some overlap: ftrace (incl. hist triggers) and perf_events (which can incl. BPF). But it's not a case where they magically got included. Initially they didn't really have overlap: ftrace was doing tracing, and perf_events was doing PMC stats. But they grew over time (especially perf) to the point where there was some overlap. This was noticed, and has been discussed and argued about; e.g., see the later half of https://lwn.net/Articles/442113/. At some point Steven Rostedt, the ftrace author, said:
> Now that perf has entered the tracing field, I would be happy to bring the two together. But we disagree on how to do that. I will not drop ftrace totally just to work on perf. There's too many users of ftrace that want enhancements, and I will still support that. The reason being is that I honestly do not believe that perf can do what these users want anytime in the near future (if at all). I will not abandon a successful project just because you feel that it is a fork.
Hist triggers is an enhancement to ftrace, not its own thing. BPF is a kernel facility originally designed for network I/O that can also be used by ftrace and perf.
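To make the distinction concrete, driving a hist trigger is just writing strings to tracefs; a sketch assuming tracefs is mounted at /sys/kernel/debug/tracing and you're root (the keys/values spec is the kmalloc example from the hist triggers documentation):

    from pathlib import Path
    from time import sleep

    EVENT = Path("/sys/kernel/debug/tracing/events/kmem/kmalloc")
    SPEC = "hist:keys=call_site:values=bytes_req"

    (EVENT / "trigger").write_text(SPEC)        # aggregation happens in-kernel
    sleep(10)                                   # let it collect for a while
    print((EVENT / "hist").read_text())         # kernel renders the histogram
    (EVENT / "trigger").write_text("!" + SPEC)  # "!" removes the trigger

No bytecode, no compiler: ftrace parses the string and aggregates in the kernel, which is the sense in which it's an ftrace enhancement rather than a BPF competitor.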
But yes, why not just one unified project? Such a project could bring in the experts from the other external tracing projects too (SystemTap, LTTng, ktap, sysdig, ...). Well, for better or for worse, Linux isn't a company, and Linus isn't its CEO. Scott McNealy, CEO of Sun while DTrace was developed, was fond of the saying:

> All the wood behind one arrow.
If Linux were actually Linux Microsystems, the CEO could pick one tracing project, and all teams would either toe the company line or leave. But we don't have that.
What we have, including the external tracers, is a mess. There are too many tracers. And this becomes a tax on contributors who want to help fix the problem: which one should you pick? Better learn the pros and cons of them all, which is no easy task.
Fortunately we're getting enhanced BPF, which does everything, so the tracing mess will finally come to an end. Hist triggers was a surprise. It would have been a slam dunk before enhanced BPF existed, but now?? I suppose I understand it as an ftrace enhancement, and not its own thing.
That's not the way that Sun worked at all, actually -- and certainly not with respect to DTrace. To the contrary, Sun often deliberately allowed competing projects, believing that the internal competition would generate a better result. (Or, perhaps more likely, Sun simply did not have the command-and-control to prevent such internal competition.) So contrary to your assertion, there was never an executive fiat around DTrace (indeed, there was never much executive fiat at all at Sun), and DTrace had loads of internal competition -- but DTrace succeeded where the others didn't because it was actually useful for solving the problems that others weren't.
There are roughly ten different tracers for Linux. Yes, Sun liked competition, but it did not endure a situation like this -- what project had ten competitors, all staffed and paid for by Sun? If Linux were a company, I cannot believe the situation would have gotten this out of hand.
There were (at least!) ten different kernel debuggers at peak in the late 1990s -- and many different tracers too. (As the saying goes, you haven't heard of them for a reason.) My point is that Sun didn't get out of the morass of myriad half-assed solutions by executive fiat; it got out of it by a small, focused, self-motivated team that was hell-bent on building something that could work on real problems. The nascent success of DTrace had nothing to do with executive management, corporate heft, or marketing; these things only were brought to bear when DTrace had already succeeded where it mattered most: solving real problems.
A nice story, but it's not working out that way for Linux. The Linux kernel engineering talent is there, but it's split among the different tracers, which have different priorities to work on. And this has dragged on for years. A rational company could see that it wasn't working (no nascent success) and combine the right forces to get it done. Linux isn't a company.
You summed it up nicely, thanks. I kinda had LTTng and the other solutions in mind when I said there's many competing implementations, with varying degrees of functionality and maturity, and of course not all in mainline.
As a user, it's a mess. For example, Erlang/OTP has added LTTng support, which I expect was born out of a user requirement in an environment like SLES or RHEL; that doesn't make it universally practical on a random kernel. Then we have DTrace emulation in SystemTap, and I've heard it kinda works, but I don't know how many probes exist for it.
If I want to trace on illumos or FreeBSD (NetBSD as well), there's just DTrace, so I can share tracing code to some extent, and DTrace probes in, say, VMs only have to be implemented once.
Like you, I hope there will be a coherent solution, but I don't see it yet.