dontreact's comments

Cosine similarity is equal to the dot product of the two vectors after each has been normalized to unit length.
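A minimal NumPy sketch of that identity (variable names here are just illustrative):

    import numpy as np

    def cosine_similarity(a, b):
        # dot product of the unit-normalized vectors
        a_hat = a / np.linalg.norm(a)
        b_hat = b / np.linalg.norm(b)
        return float(np.dot(a_hat, b_hat))

    a = np.array([1.0, 2.0, 3.0])
    b = np.array([4.0, 5.0, 6.0])

    # identical to the textbook form a.b / (|a| |b|)
    assert np.isclose(
        cosine_similarity(a, b),
        np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)),
    )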


“In my humble opinion, these companies would not allocate a second of compute to lightweight models if they thought there was a straightforward way to achieve the next leap in reasoning capabilities.”

The rumour/reasoning I’ve heard is that most advances are being made on synthetic data experiments happening after post-training. It’s a lot easier and faster to iterate on these with smaller models.

Eventually a lot of these learnings/setups/synthetic data generation pipelines will be applied to larger models but it’s very unwieldy to experiment with the best approach using the largest model you could possibly train. You just get way fewer experiments per day done.

The models the bigger labs are playing with seem to be converging on roughly the largest size a researcher can still run an experiment on overnight.


> You just get way fewer experiments per day done.

Smaller/simpler/weird/different models can be an incredible advantage due to iteration speed. I think this is the biggest meta-problem in AI development. If you can try a large range of hyperparameters, fitness-function implementations, etc. in a few hours, you will eventually wipe the floor with the parties forced to wait days, weeks, or months for each set of results.

The bitter lesson certainly applies and favors those with a lot of compute and data, but if your algorithms fundamentally suck or are approaching a dead end, none of that compute or information will matter.


“Layoffs usually have nothing to do with performance”

This has not at all been my experience. When forced to do layoffs in a large company, executives tend to look at performance reviews.

What are other people’s experience with this?


> This has not at all been my experience. When forced to do layoffs in a large company, executives tend to look at performance reviews

Speaking specifically to Facebook and Instagram, I know of more than one team where the manager wasn't consulted when someone higher up (I think they were advised by BCG) chose whom to cut.

The kicker? They frequently cut the highest paid, which obviously skewed the cuts towards seniority. But it also meant that managers woke up to find their best-bonussed people gone while their worst performers--the cheapest on paper--remained.


Happened to me. Ranked near highest in company, promoted, pay bump two months before a layoff targeting US staff. They kept engineers in Australia who made 50-60% less. Before severance ran out, I landed a job that paid 50% more than before. Layoffs don’t indicate that much about the people laid off; they say a lot more about management.


I used to work _doing_ layoffs. Think George Clooney in Up In the Air.

I never encountered a layoff where performance reviews were considered. It was all line of business considerations.

My “favorite” was when the top performing call center got chopped wholesale simply because their lease was up soonest.


At the companies I have worked for, layoffs come down to two things: cutting a product or changing strategy, and how much you are being paid. Layoffs for performance tend to be one-offs rather than part of a massive cut.

What I am most surprised about is how many really good performers get cut over a product or strategy change. A company will cut a high performer while searching for a high performer at the same time, as if it were taboo to move people within the company.


IME, it's based on "performance", not performance. So what happens is someone 4 levels above you on the org chart has a google sheet of all their underlings, their most recent grade - erm, I mean performance review result - and their total comp.

They sort by total comp, then go down the list and figure out a reason to let that person go. Was there literally anything in your performance review summary they can ding you for? Yes? Phew, that was easy. No? Well, keep looking. "Strategic mismatch for skillset" "too junior, want senior" "too senior, want junior" "role eliminated due to headcount allocated to team being reduced" etc.

So, in the layoff I was privy to, somehow everyone who still had the large lucrative 4-year stock-denominated grants was suddenly gone, and the people who had the newer cash-denominated grants were still there. Meanwhile, several cheaper employees who were perennially underperforming were retained.

Honestly it really soured me on equity grants. It's a game of "heads I win (my company didn't grow and i got to pay you peanuts), tails you lose (my company grew and now i can just fire you and re-hire someone with a cheaper grant so you can't vest those now-very-lucrative appreciated shares)".


One of my former employers did a big layoff last year -- lots of folks who I remembered as being "important" (long time employees, large contributions, lots of domain knowledge) were let go. Seemed pretty dumb, but the wreckage stumbles forward anyway.


> Seemed pretty dumb, but the wreckage stumbles forward anyway.

The momentum in the org will carry it along its course, but without those long-time employees with deep domain knowledge built up over the years, there's no way to steer or alter that course. So it's luck whether the company keeps going, because momentum alone can't do anything but continue on the current path.

It's why startups can beat a behemoth.


I think it depends on the size of the company and who ends up being involved in the decision making, and when.

From what I have seen, how much someone is making can be a contributing factor. Performance reviews may factor in too, but those don't always paint a full picture of your actual performance, since they are often kind of black and white.

In every layoff I have been a part of (whether or not I was personally affected), the managers found out the morning of and were not consulted before it happened. In more than a few cases someone critical was let go.

Making it mostly a numbers game.


Personal experience:

Counterexample 1: the company had two products, one Java-based and one C++-based. The founders decided to focus on one product to extend runway. Everyone on the second product's team was laid off, regardless of perf reviews.

Counterexample 2: a layoff needed to boost short-term profit metrics for a potential sale of the company. Since the focus was cost cutting, not long-term viability, expensive folks were targeted (i.e. senior high performers).

Counterexample 3: a union shop (yes, rare, I know, but gov and academia often have union IT workers). Layoffs were purely seniority based (as in most recently joined the union = first to be laid off).


At our company, HR requires that entire teams be removed, to show that the company is changing strategy.

This gets rid of good people, but it lowers the risk of lawsuits.


Last layoff I was in, my boss was in the middle of writing my promotion paperwork. The people cut were mainly people with seniority ($$$). The company then sent my contact info to scammers in a probably well-intentioned attempt at placement help.


I've been at two small companies that had layoffs, and who was cut/retained was 100% based on what was best for the company (except some visa holders were retained). Which is what it should be.


What you said is one of the reasons a person can be laid off. Another is when the product, project, division, etc. is shut down and all personnel are laid off regardless of performance. Sure, they'll be given an opportunity to find a new role within the company but there are always far too few openings for more than a few to be retained.


In my experience performance reviews don't accurately differentiate the high performers from the low performers. No one gets rated below average unless you are really struggling. The only real currency is whether PMs want you on their team or not.


>No one gets below average unless you are really struggling.

Plenty of companies require someone to be rated below average. See "stack ranking" for example. Is it dumb? Probably. Is it common? Sadly yes.


My experience is that a list of employees ordered by (compensation/contribution) is created, sorted highest to lowest. Those nearer the top of the list are most likely to be let go, those lower on the list are most likely to stay. And, contribution != performance. Profitability (margin) and revenue growth of the product and team affect contribution as much as individual performance.


I don't think this is necessarily true for some larger companies. Sometimes, entire divisions are axed. It's more trouble than it's worth (to the higher-ups) to pick out the few good ones when the company is in turmoil. In startups, I agree, it's never "nothing to do with performance."


When whole orgs/product lines are cut, you’re laid off regardless of performance.


They still mature and yield, so the principal is not at risk. But yes, it is if you are a bank and people want to withdraw.


The talking point has been that if we do this in cases where there isn't shelter to offer, the people will come back. Let's see how that plays out; it will be informative.

I think it's also quite possible that for some people it's a needed wake-up call.


I agree with you but it also makes me think: Google's TPUs are also fixed costs and these research experiments could have been run at times when production serving need isn't as high.


They sell them on the spot market, so there’s someone that would consume the baseline compute.


I imagine it would never be optimal to set a price so low that utilization is always 100% externally


There are plenty of people who want cheap compute and are willing to wait till 3am if that's when the cheap compute is available. This happens for all computing services, but for ML the effect is even more pronounced because compute costs are typically a large part of the cost of many projects.


Flag this phishing attempt


I'm sorry, we found the problem, and it is fixed now.


I think numpy closely maps to how I think, so it's not as hard for me to read these dense lines as it would be to read expanded versions. I think my point of view is shared by a lot of leading researchers, and this is why it is used more heavily.

The kinds of type safety you want might be good for other use cases but for ML research they get in the way too much.
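To illustrate the kind of density I mean (a made-up example, not anyone's actual research code): the dense NumPy form of all-pairs cosine similarity versus the expanded loop version it replaces.

    import numpy as np

    X = np.random.rand(100, 64)  # 100 vectors of dimension 64

    # dense version: all pairwise cosine similarities in two lines
    X_hat = X / np.linalg.norm(X, axis=1, keepdims=True)
    sims = X_hat @ X_hat.T

    # expanded version: the same computation spelled out with loops
    sims_loop = np.empty((len(X), len(X)))
    for i, a in enumerate(X):
        for j, b in enumerate(X):
            sims_loop[i, j] = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

    assert np.allclose(sims, sims_loop)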


How are they gonna pay for their compute costs to get to the frontier? Seems hard to attract enough investment while almost explicitly promising no return.


What if there are ways to improve intelligence other than throwing more money at running gradient descent?


Perhaps. But also throwing more flops at it has long been Ilya’s approach so it would be surprising. Notice also the reference to scale (“scale in peace”).


6-figure free compute credits from every major cloud provider to start


5 minutes of training time should go far


Six figures would pay for a week of what he needs. Maybe less than a week.


I don't believe ssi.inc's main objective is training expensive models, but rather to create SSI.


Wonder if funding could come from profitable AI companies like Nvidia, MS, Apple, etc., sort of like the Apache or Linux foundations.


I was actually expecting Apple to get their hands on Ilya. They also have the privacy theme in their branding, and Ilya might help that image while also having the chops to catch up to OpenAI.


This is interesting. It seems like sidestepping a problem that needs to be attacked head-on: the amount of solar we need to build is more than what can be built in invisible areas, isn't it? Perhaps this helps get things going for now, but can we actually build enough in areas with no visibility from residential neighborhoods?


The amount of land required isn't that high. Nate Lewis at Caltech once told me that "the amount of light falling on the numbered highways of the USA can generate more power than the US's entire generating capacity."

Of course he didn't mean that we should cover all the highways; the point is just that we already afford to build and maintain that much land area, so building an equivalent area of solar would be both cheaper and feasible.

That was over a decade ago; solar is now both cheaper and more efficient. IIRC something small like a hundred km² or so of desert would do the trick. We devote much more land than that to growing corn that we wastefully turn into fuel.


I think that only holds if you ignore distribution and overall grid integration, which constrain where you can place solar plants, and ignore that in the winter you're not going to get much power production in the Northeast and PNW (and when you do, it's super spotty).

By that logic of ignoring constraints, nuclear takes 0 land area compared to solar and could generate more power than solar panels we’d ever produce.


Were you replying to my note? The point, as I noted, is not to say "just stick them where the roads are" but rather to make a sort of Fermi estimate -- the cost of deployment is demonstrably quite feasible. Whether people want to do it or not is a separate question.

FWIW there are a lot of deserts and plains that can be used for year round generation.


My point is that the Fermi approximation is flawed because HVDC is still extremely expensive, so transmitting power from places where solar is plentiful to where it's needed carries real-world costs that make the approximation off, probably by an order of magnitude. There's a lot of solar energy, but effectively time- and distance-shifting it turns out to be extremely difficult and expensive.

That's why solar and wind continue struggling to replace fossil fuels in the grid (modulo places like California, Florida and Texas with abundant sunshine throughout the state) despite the generators themselves being cheaper than ever; all they've managed to do is absorb daytime energy growth. It's something, but our absolute fossil fuel consumption in the grid has continued to grow substantially even if as a percentage it has stagnated or marginally decreased.

Nuclear continues to have far more success at actually replacing fossil fuels in the grid and has a meaningfully smaller land footprint than solar. While Gen III reactors require some work to maintain safety, nuclear remains remarkably safe per MWh, comparable to solar, and Gen IV reactors have fail-safe designs that don't carry the same maintenance concerns. We gave up on nuclear fission too early and too easily due to FUD from the coal industry, and it still remains the better path to fix the grid's contribution to global warming.


Yes and: IIRC, just 1/4 of the space devoted to golf courses.


I guess my question is:

Does this work merely cater to a specious, disingenuous argument being put forth by NIMBYs to block solar installations, or does it address a real pain point?

I agree we should be able to build enough solar, but does this work address a real bottleneck, or a fake problem presented as real by people with ulterior motives?


How often do you see farm fields, for example?

We could easily find many, many large expanses of land. Given the margins on farming, many farmers in my local area are renting, selling, or building solar farms on the corners or harder-to-reach areas of their fields, since those areas tend to have trees around anyway. But none of that is really a viewshed problem unless you've modelled trees.

Now in hillier areas or mountainous regions this gets more complicated.


My rural community has a solar farm going in, and it is not going over well.

Lots of NIMBYs who "did their own research" and concluded that solar is actually bad for the environment, or uneconomical, or whatever. But the real reason for their rage is that they see it as a physical manifestation of the spread of "progressive ideas", and for that reason alone it must be stopped.

