If you watch the top-tier social deduction players on YouTube (games like Blood on the Clocktower, etc.), you can tell they’d figure out the LLM's weaknesses and exploit them immediately.
Assuming that the models getting better at SWE benchmarks and math tests would translate into positive outcomes in all other domains could be an act of spectacular hubris by the big frontier labs, which themselves are chock-full of mathematicians and software engineers.
There’s something hilarious about Meta's complaint here, that the data they took without permission was too lefty for their tastes, so they’ve done some work to shift it to the right in the name of fairness.
This is a great initiative which I fully support. But why does it take 4-6 years (including several months of complete road closure) to build what is, in the grand scheme of things, a quite small bridge?
Believe it or not, the people that design and build these bridges have a lot more experience than you do. So if they're doing something that doesn't make sense to you, you should ask yourself what you're missing, and not question their basic competence.
The point isn't just to make a bridge, or they would use a simple concrete bridge.
The point is to make a bridge that is preferable to crossing the highway. And to do that they need to landscape it as if it were part of the landscape. That takes more time because the bridge needs to be stronger to handle the weight of the soil and plants, which means the design, permitting, and construction phases are longer.
They're closing the surrounding road for the soil movement part of the project because that part of the project entails risks of landslides, and it's safer to just close the road. However, the road won't be closed the entire time; it will only be closed for a few hours at a time. The months-long period is just the part of the project when the road will be subject to potential closure.
So the article says that Agoura Road (which is not the highway) will be closed, but the project FAQ still says that it won't be? And the government pre-notices don't seem to mention anticipating closing the road either...
Because they aren’t building a cheap old bridge animals will probably learn to use. They are effectively regrading the mountainside and landscaping over this bridge to try and blend it in with the landscape. IMO it's a waste of money. Crabs learn to use the dead simple ugly bridges. Crabs. A coyote or a mountain lion or deer (what you’d see using this thing in CA) all already use human-style bridges to cross highways.
Completely agreed. All the nonsense about getting exactly the right soil composition, letting the soil "age", and whatnot seems bizarre and silly. If an animal needs to use the bridge, it'll learn to use the bridge. Spending so much extra money and time to get the vegetation just right is a waste of taxpayer dollars.
They can go hyperspeed if they want. See the freeway rebuilding after the Northridge earthquake, or when they replaced an entire bridge over the 405 in one night.
>This is also a cultural issue. In large cities, people often don't feel part of the community and they don't take pride in their surroundings. They leave rubbish everywhere and vandalise. Little is done to change that. They see a neighbour has nice flowers in the garden? Instead of admiring them, they will cut them off.
I don't think this aligns with the lived experience of most Britons. The big cities are mostly litter-free areas, and people can have well-tended gardens go unmolested by neighbours.
London is not representative of the rest of the UK. It’s about 11% of the population, consisting of numerous councils and boroughs with different demographics and “upkeep”.
It's trivial to get around these rules. Northern Ireland is (or was at some point) a country of origin for both the EU and the UK. So a company could produce something in Greece, ship it to Dublin within the EU, then truck it to Belfast in Northern Ireland, and export it to the US with a UK certificate of origin.
>While I agree that LLMs are hardly sapient, it's very hard to make this argument without being able to pinpoint what a model of intelligence actually is.
Maybe so, but it's trivial to do the inverse and pinpoint something that's not intelligent. I'm happy to state that an entity which has seen every game guide ever written but still can't beat the first-generation Pokemon games is not intelligent.
This isn't the ceiling for intelligence. But it's a reasonable floor.
>There are so many services where you've registered on one domain (and that address is stored in 1Password), then you legitimately log on to a different domain.
This is a huge issue at the moment. For some reason, tonnes of companies have decided it's OK to have you register online at www.corporate-domain.com, and then have the login service hosted at corporate-domain-account.onmicrosoft.com, with emails arriving from mailhost.corporate-domain-mail-services.com.
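To illustrate why a password manager (or a careful human) can't connect these hosts, here is a minimal sketch. It assumes a crude "last two labels" comparison as a stand-in for real registrable-domain matching; actual password managers use the Public Suffix List and their own rules. The URL paths and the function names `naive_site` and `same_site` are hypothetical; only the hostnames come from the example above.

```python
from urllib.parse import urlsplit


def naive_site(url: str) -> str:
    """Return the last two labels of the hostname.

    Assumption: this is only a rough approximation of the registrable
    domain. Real matching should consult the Public Suffix List.
    """
    host = urlsplit(url).hostname or ""
    return ".".join(host.split(".")[-2:])


def same_site(saved_url: str, other_url: str) -> bool:
    """True if both URLs share the same naive site."""
    return naive_site(saved_url) == naive_site(other_url)


# Hostnames from the comment above; the paths are made up for illustration.
saved = "https://www.corporate-domain.com/register"
login = "https://corporate-domain-account.onmicrosoft.com/login"
mail = "https://mailhost.corporate-domain-mail-services.com/"

for url in (login, mail):
    print(naive_site(url), same_site(saved, url))
```

Neither the login host nor the mail host shares a site with the address where the account was registered, so nothing in the names themselves distinguishes them from a lookalike phishing domain.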
This is also a problem for organizations internally.
I have a university email where IT tries to train people to recognize legitimate vs phishing emails by whether the login is on some onmicrosoft.com domain no one remembers. It then mangles all links in emails, so users without clients that demangle them can't actually see whether a link goes to that domain. And, of course, legitimate logins often involve redirects. With wide use of SSO, users can also expect login screens to appear while in a variety of vaguely related places, from journals, to news sites, to various subscription services. This is in the context of a login system that always requires OTP, regardless of 'remember this device' settings, practically ends up needing at least one login per week for staff, and reportedly can require students to log in (with OTP!) multiple times per day. The login process is so frequent that it is trivialized, and being careful with each login would take an enormous amount of time in total.
To further confuse things, IT repeatedly sends out fake phishing emails with links to Microsoft-owned domains with valid Microsoft SSL certificates.
I expect IT would respond that these arrangements satisfy all requirements they have, and that the solution is more user training and online webinars.
It seems like Microsoft has some sort of fake phishing system with all of these ridiculous properties, which many organizations then use.
The first time I received one, I initially thought our email server had been compromised, because rather than realizing it was a fake test, my mind went from "Why was this obvious phishing email not caught by the spam filter?" to "How does this email not have Received headers!?" to "How does an obviously fake login page have a valid Microsoft SSL certificate on a validly Microsoft-registered domain name and a Microsoft-ASN IP address!?" to "How much of the university's infrastructure would have to be compromised for an attacker to do that!?".
Sometimes it is due to using third parties for some portions, and those services needing DNS control but not supporting sub-domain delegation in the custom domain options. Why a company running it all themselves would do the same is more of a mystery, though simple not-joined-up-thinking is rather likely.
Typical symptom of a dysfunctional organization. It's easier to do anything else than to get the team managing the company domain to add a subdomain for them.
>Wouldn’t the market find a balance then, where the marginal utility of additional computation is aligned with customer value? That fixed point could potentially be much higher than where we are now in terms of compute.
I think the author's point here is that the costs are going to continue to fall for inference at an astonishing rate. We're in a situation where the large frontier companies were all consolidated around "inference is computationally expensive", and then DeepSeek - the talented R&D arm of a hedge fund - was able to cut orders of magnitude out of that cost. To me, that hints that nobody was focusing on inference efficiency. It's unlikely that DeepSeek found 100% of the efficiency gains available, so we can expect the cost of inference to continue to be volatile for some time to come.
It's difficult for any market to find equilibrium when price points move around that much.