Oh God.

Pilots use LGTM too.




Let's Gamble, Try Merging


This is a life changing comment for me. Thank you.


Let's Get That Money


Move fast, break things, bruh.


Notice that they are not using it as the only means of verifying the system. They are using it as an additional check that supplements the extensive regression test suite. The test suite failed in this case because the bug occurred only under high load (after this, they added a new test case that simulates high load). The human verification that was in place in production as an additional safety check operated as intended and caught the bug.
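As a minimal sketch of what such a "simulates high load" test case might look like - process_batch here is a hypothetical stand-in for whatever their suite actually verifies, and the concurrency numbers are arbitrary:

    from concurrent.futures import ThreadPoolExecutor

    def process_batch(n):
        # stand-in for the real operation; the point is that their
        # bug only surfaced when many of these ran concurrently
        return sum(range(n))

    def test_consistent_under_high_load():
        expected = process_batch(1_000)
        # hammer the same operation from many threads at once and
        # assert the answers stay consistent under contention
        with ThreadPoolExecutor(max_workers=64) as pool:
            results = pool.map(lambda _: process_batch(1_000), range(5_000))
        assert all(r == expected for r in results)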


What we need to do is package LGTM as an OpenAI-based automated service...


Honestly, I’m glad they do that much. God forbid they just blindly accept the data.


Indeed. Designing software so that it helps humans eyeball things in the places where human eyeballing actually helps is really good.

For example, take that incident in Ireland where the crew erroneously told their jet it was very, very cold, and then (cold air is dense, so less power is needed) barely had enough thrust to take off before they ran entirely out of runway. That was partly caused (Swiss cheese model) by the pilots being issued software that didn't make it easy for them to cross-check the facts they can eyeball. Is it certain they'd have realised if their software did show this information, like the software issued to similar crews at other outfits? No, but that's one more opportunity for everybody to not die, and the whole point of the Swiss cheese model is that lots of things have to go wrong or else you're fine.

One of the other potential mitigations for that incident illustrates where human eyeballs don't help. In principle a human pilot could realise they aren't accelerating enough. In practice humans are lousy at estimating acceleration to the tolerance needed, so this isn't effective unless you use the same strip so often that your brain learns exactly how things ought to look.

However, thanks to GPS, a machine can measure acceleration very well, and it can know how long your chosen runway is and what the aircraft's take-off speed is - so it can estimate whether you're accelerating enough. The latest models can announce "Caution: Acceleration" early in the take-off roll, before V1, which means the crew can abort and check why they didn't have the intended acceleration, regardless of the cause.
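A toy sketch of that logic in Python - made-up names and thresholds, nothing like real avionics code, and it assumes the speed samples start at brake release:

    # From v^2 = 2*a*d: the minimum constant acceleration needed to
    # reach take-off speed within the remaining runway.
    def required_acceleration(v_takeoff: float, runway_remaining: float) -> float:
        if runway_remaining <= 0:
            return float("inf")
        return v_takeoff ** 2 / (2 * runway_remaining)

    # speeds: GPS ground-speed samples (m/s) taken every dt seconds.
    # Cautions as soon as the average acceleration so far is too low
    # (by `margin`) to reach v_takeoff before the runway ends.
    def monitor_takeoff_roll(speeds, dt, v_takeoff, runway_length, margin=1.15):
        distance = 0.0
        for i in range(1, len(speeds)):
            distance += (speeds[i - 1] + speeds[i]) / 2 * dt  # trapezoidal distance
            a_avg = (speeds[i] - speeds[0]) / (i * dt)        # average accel so far
            a_needed = required_acceleration(v_takeoff, runway_length - distance)
            if a_avg < a_needed * margin:
                return f"CAUTION: ACCELERATION ({distance:.0f} m into the roll)"
        return "acceleration sufficient"

The point being that the machine has exactly the numbers a human can't eyeball: remaining runway, required speed, and measured acceleration.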


That has historically been my issue with LGTM. It is ultimately a meaningless rubber stamp. Many people get a large PR, give it a quick scroll, type LGTM, and move on. If I had my way, every PR with a diff over 100 LOC would require two code comments from the reviewer before it could be approved.
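For what it's worth, that policy is easy to automate as a merge gate. A sketch against GitHub's REST API - the endpoints are real, but the 100-line and two-comment thresholds are just my rule above:

    import os
    import requests

    API = "https://api.github.com"
    HEADERS = {"Authorization": f"Bearer {os.environ['GITHUB_TOKEN']}"}

    # True if the PR may be approved under the rule above: small diffs
    # pass outright, large diffs need two inline review comments.
    def review_gate(owner: str, repo: str, pr_number: int) -> bool:
        pr = requests.get(f"{API}/repos/{owner}/{repo}/pulls/{pr_number}",
                          headers=HEADERS).json()
        if pr["additions"] + pr["deletions"] <= 100:
            return True
        comments = requests.get(
            f"{API}/repos/{owner}/{repo}/pulls/{pr_number}/comments",
            headers=HEADERS).json()
        return len(comments) >= 2  # evidence the diff was actually read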


I see it more as “this result is commensurate with my experience,” the problem, of course, being that it relies on the pilot’s experience. But otherwise we’re just talking about a “sniff test” on a number. LGTM sucks in software because software is inherently a lot more complex than a number.


Let's Get This Moving


Devil's advocate: what if no AI will ever reach the point of being perfect without a human LGTM verification?


looks good to me



