OtherShrezzing's comments

If you watch the top tier social deduction players on YouTube (things like Blood on the Clocktower etc), they’d figure out weaknesses in the LLM and exploit it immediately.


Testing against people like that would be the way to do it. Otherwise it’s like testing a chess engine against casual players or worse.


Assuming that the models getting better at SWE benchmarks and math tests would translate into positive outcomes in all other domains could be an act of spectacular hubris by the big frontier labs, which themselves are chock-full of mathematicians and software engineers.


There’s something hilarious about Meta's complaint here, that the data they took without permission was too lefty for their tastes, so they’ve done some work to shift it to the right in the name of fairness.


This is a great initiative which I fully support. But why does it take 4-6 years (including several months of complete road closure) to build what is, in the grand scheme of things, a quite small bridge?


Believe it or not, the people that design and build these bridges have a lot more experience than you do. So if they're doing something that doesn't make sense to you, you should ask yourself what you're missing, and not question their basic competence.

The point isn't just to make a bridge, or they would use a simple concrete bridge.

The point is to make a bridge that is preferable to crossing the highway. And to do that they need to landscape it as if it were part of the landscape. That takes more time because the bridge needs to be stronger to handle the weight of the soil and plants, which means the design, permitting, and construction phases are longer.

They're closing the surrounding road for the soil movement part of the project because that part of the project entails risks of landslides, and it's safer to just close the road. However, the road won't be closed the entire time; it will only be closed for a few hours at a time. The months-long period is just the part of the project when the road will be subject to potential closure.


So the article says that Agoura Road (which is not the highway) will be closed, but the project FAQ still says that it won't be? And the government pre-notices don't seem to mention anticipating closing the road either...

https://101wildlifecrossing.org/crossing-faq/

https://www.agourahillscity.org/Home/Components/News/News/39...


Because they aren’t building a cheap old bridge animals will probably learn to use. They are effectively regrading the mountainside and landscaping over this bridge to try and blend it in with the landscape. Imo it's a waste of money. Crabs learn to use the dead simple ugly bridges. Crabs. A coyote or a mountain lion or deer (what you’d see using this thing in CA) all already use human style bridges to cross highways.


Not sure that it's a complete waste of money, but it's pretty obvious that animals will use any means of getting across that's convenient.


Completely agreed. All the nonsense about getting just exactly the right soil composition, letting the soil "age", and what-not, seems bizarre and silly. If an animal needs to use the bridge, it'll learn to use the bridge. Spending so much extra money and time to get the vegetation just right is a waste of taxpayer dollars.


This is lightning speed for California construction...


They can go hyperspeed if they want. See the freeway rebuilding after the Northridge earthquake, or when they replaced an entire bridge over the 405 in one night.


>This is also a cultural issue. In large cities, people often don't feel as being part of the community and they don't take pride in their surroundings. They put rubbish everywhere, vandalise. There is little done to change that. They see neighbour has nice flowers in the garden? Instead of admiring, they will cut them off.

I don't think this aligns with the lived experience of most Britons. The big cities are mostly litter-free areas, and people can have well-tended gardens go unmolested by neighbours.


Not my experience from living in South London. There is rubbish everywhere and I had my front garden vandalised many times.


London is not representative of the rest of the UK. It’s about 11% of the population, consisting of numerous councils and boroughs with different demographics and “upkeep”.


South London is pretty extreme though. In the towns and smaller cities it's usually not as bad.


It's trivial to get around these rules. Northern Ireland is (or was at some point) a country of origin for both the EU and the UK. So a company could produce something in Greece, ship it to Dublin within the EU, then truck it to Belfast in Northern Ireland, and export it to the US with a UK certificate of origin.


Pretty much every single Aliexpress purchase I've made has been shipped from the Netherlands for years now.

They use it to get around EU customs and tariffs, dunno how but it works.


>While I agree that LLMs are hardly sapient, it's very hard to make this argument without being able to pinpoint what a model of intelligence actually is.

Maybe so, but it's trivial to do the inverse, and pinpoint something that's not intelligent. I'm happy to state that an entity which has seen every game guide ever written, but still can't beat the first generation Pokemon is not intelligent.

This isn't the ceiling for intelligence. But it's a reasonable floor.


There are sentient humans who can't beat the first generation Pokémon games.


Is there a sentient human that has access to (and actually uses) all of the Pokémon game guides yet is incapable of beating Pokémon?

Because that's what an LLM is working with.


I'm quite sure my grandma could not. You can make the argument these people aren't intelligent but I think that's a contrived argument.


Does the design need to change to address technical debt?


>There are so many services where you've registered on one domain (and that address is stored in 1Password), then you legitimately log on to a different domain.

This is a huge issue at the moment. For some reason, tonnes of companies have decided it's OK to have you register online at www.corporate-domain.com, and then have the login service hosted at corporate-domain-account.onmicrosoft.com, with emails arriving from mailhost.corporate-domain-mail-services.com.
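To make the mismatch concrete, here's a rough sketch in Python of a naive "is this host under the domain I registered on?" check. This is my own illustration, not how 1Password or any other password manager actually matches sites. The function name `is_under_domain` and the `login.corporate-domain.com` host are hypothetical, and it does plain label suffix matching with no Public Suffix List handling.

```python
def is_under_domain(host: str, registered: str) -> bool:
    """Return True if host equals registered or is a subdomain of it.

    Naive label-wise suffix match (an assumption for illustration);
    real matchers also consult the Public Suffix List.
    """
    host_labels = host.lower().rstrip(".").split(".")
    reg_labels = registered.lower().rstrip(".").split(".")
    if len(host_labels) < len(reg_labels):
        return False
    # Compare whole labels, so "evil-corporate-domain.com" does not match.
    return host_labels[-len(reg_labels):] == reg_labels


registered = "corporate-domain.com"
hosts = [
    "www.corporate-domain.com",                     # where you registered
    "login.corporate-domain.com",                   # hypothetical subdomain login
    "corporate-domain-account.onmicrosoft.com",     # actual login host
    "mailhost.corporate-domain-mail-services.com",  # where the emails come from
]
results = {h: is_under_domain(h, registered) for h in hosts}
for host, ok in results.items():
    print(f"{host}: {'matches' if ok else 'does NOT match'}")
```

Only the first two hosts pass. The login and mail hosts look related to a human but share nothing with the registered domain that a machine (or a careful user) could verify.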


This is also a problem for organizations internally.

I have a university email where IT tries to train people to recognize legitimate vs phishing emails by whether the login is on some onmicrosoft.com domain no one remembers. It then mangles all links in emails, so users without clients that demangle them can't actually see whether a link goes to that domain. And, of course, legitimate logins often involve redirects. With wide use of SSO, users can also expect login screens to appear while in a variety of vaguely related places, from journals, to news sites, to various subscription services. This is in the context of a login system that always requires OTP, regardless of 'remember this device' settings, practically ends up needing at least one login per week for staff, and reportedly can require students to log in (with OTP!) multiple times per day. The login process is so frequent that it is trivialized, and being careful with each login would take an enormous amount of time in total.

To further confuse things, IT repeatedly sends out fake phishing emails with links to Microsoft-owned domains with valid Microsoft SSL certificates.

I expect IT would respond that these arrangements satisfy all requirements they have, and that the solution is more user training and online webinars.


> To further confuse things, IT repeatedly sends out fake phishing emails with links to Microsoft-owned domains with valid Microsoft SSL certificates.

The org I work for does something similar. All links are obfuscated by some scanning service, unless it’s a trap…


It seems like Microsoft has some sort of fake phishing system with all of these ridiculous properties, which many organizations then use.

The first time I received one, I initially thought our email server had been compromised, because rather than realizing it was a fake test, my mind went from "Why was this obvious phishing email not caught by the spam filter?" to "How does this email not have Received headers!?" to "How does an obviously fake login page have a valid Microsoft SSL certificate on a validly Microsoft-registered domain name and a Microsoft-ASN IP address!?" to "How much of the university's infrastructure would have to be compromised for an attacker to do that!?".


I never understood this. Is there some reason not to use subdomains?


Sometimes it is due to using third parties for some portions, and those services needing DNS control but not supporting sub-domain delegation in the custom domain options. Why a company running it all themselves would do the same is more of a mystery, though simple not-joined-up-thinking is rather likely.


Typical symptom of a dysfunctional organization. It's easier to do anything else than to get the team managing the company domain to add a subdomain for them.


Oh yes, and the reason they will give for not adding it is security.


>Wouldn’t the market find a balance then, where the marginal utility of additional computation is aligned with customer value? That fix point could potentially be much higher than where we are now in terms of compute.

I think the author's point here is that the costs are going to continue to fall for inference at an astonishing rate. We're in a situation where the large frontier companies were all consolidated around "inference is computationally expensive", and then DeepSeek - the talented R&D arm of a hedge fund - was able to cut orders of magnitude out of that cost. To me, that hints that nobody was focusing on inference efficiency. It's unlikely that DeepSeek found 100% of the efficiency gains available, so we can expect the cost of inference to continue to be volatile for some time to come.

It's difficult for any market to find equilibrium when price points move around that much.

