Hacker News | vasco's comments

We added an account for our dog that doesn't get used but it's funny to see.

A previous dog of mine loved watching the BBC show Animal Hospital (1994), and would run to the television within seconds of the theme song starting. She'd just lie down in front, looking at the television for a whole episode. This was all pre-Netflix but imagine the recommendations for that dog account.

That's a funny way of looking at it because the banana republics weren't called that because they were "bananas" or something. They were called that to identify which of those countries had had state and megacorp interference and government topplings, mostly by the United Fruit Company - an American company.

Whatever the banana republics were they were turned into that by the US's doing, so it's funny that now the term comes back home.


It bears some resemblance to the Imperial Boomerang.

https://en.m.wikipedia.org/wiki/Imperial_boomerang


This has been the best TIL moment for me on HN.

Thanks, man, I am now in the rabbit hole of reading up.

In that same context, did you read the article about how diplomats were "convincing" the Mexican government to not use open source over Microsoft?

It sure sounds like the same strategy.

[1] https://lwn.net/Articles/1013776/


People commenting here about Trump corruption are correct, but it's also not new. This is regression to the mean. America has historically been a highly corrupt nation with extreme wealth inequality that occasionally has shocks (e.g. Abolitionism, the Progressive movement, WWII) that allow liberals to take over and purge the system of corruption. If anything, we've had to deal with and defeat (or at least, outlive) smarter and more well-connected fascists than Trump.

I agree, and my explanation is that it's related to the US's dedication to capitalism and its resulting aversion to any form of socialism (even small pockets that, in my opinion, are evidently positive for society as a whole), which it treats as some kind of governmental totalitarianism.

At the pace models improve, the advantage of going the dark route shouldn't really hold for long, unless I'm missing something.

Access to proprietary training data: Search, YouTube, Google Books might give some moat.

We have Common Crawl, which is also scraped web data for training LLMs, provided for free by a non-profit.

The Common Crawl is going to become increasingly contaminated with LLM output, so training data that is less likely to contain LLM output will become more valuable.

I see this misconception all the time. Filtering out LLM slop is not much different than filtering out human slop. If anything, LLM generated output is of higher quality than a lot of human written text you'd randomly find on the internet. It's no coincidence that state-of-the-art LLMs increasingly use more and more synthetic data generated by LLMs themselves. So, no, just because training data was produced by a human doesn't make it inherently more valuable; the only thing that matters is the quality of the data, and the Internet is full of garbage which you need to filter out one way or another.

But the signals used to filter out human garbage are not the same as the signals that would be needed to filter LLM garbage. LLMs generate texts that look high-quality at a glance, but might be factually inaccurate. For example, an LLM can generate a codebase that is well-formatted, contains docstrings, comments, maybe even tests; but it will use a non-existent library or be logically incorrect.

LLM output is uniquely harmful because LLMs trained on LLM output are subject to model collapse:

https://www.nature.com/articles/s41586-024-07566-y


The problem with filtering is that LLMs can generate a few orders of magnitude more slop than humans.

Are the differences between Google Books and LibGen documented anywhere? I believe most models outside of Google are trained on the latter.

It's not banned; there are many approved ones: https://ec.europa.eu/food/food-feed-portal/screen/gmo/search

But it's also important to review each variation and study it closely to avoid potential food safety issues.


This appears to be the main difference between the EU and the US.

In the EU you need to prove your thing won't be harmful before you launch it. In the US you launch it, but then if it's proven to be harmful it might get banned.

I refer to that form of regulation as "closing the door after the horse has already bolted regulation".


You can worry about market manipulation, or you can buy GME shares, lock them in the Direct Registration System and wait. I just like the stock.

> any normal fire engine costing more than $1,000,000 just leaves me gob-smacked. Even $500,000 would seem outlandish to me

No, they said 500k would also be outlandish. But the commenter correctly caught that 500k would in fact be exactly the same price, inflation adjusted. This makes a big difference.


Deducting R&D in Software is very funny. I've mostly seen it being used fraudulently, but luckily I haven't seen much of it at all.

Spend a month managing kubernetes clusters (not R&D), do one commit to a custom operator or library you're "researching" and boom, magic money.


It’s funny that this comment keeps putting “researching” in quotes, but you guys do realize that R&D has the letter D in it as well right?

For Development?

Yeah, committing to a library isn’t research, but it is very clearly development.

You may not want this kind of development to receive tax breaks, which is a reasonable position to have, but this rhetorical trick of pretending R&D is only about research, so that actions that are very clearly development look like illegitimate R&D, is highly misleading, to put it nicely.


Well R&D is just the English term I used for it - but you have a point. How the tax law evaluates this and what words are used depends on jurisdiction, whereas I was speaking imprecisely in an internet forum. In my country for example the term is similar but you usually have to submit projects in advance for approval and have reviews of your research output.

The D in R&D means developing research into a product.

That's pretty far off keep-the-lights-on or CRUD coding.


You got it exactly backwards.

R&D is not tax deductible, so the cheat would be to say the custom operator (=R&D) is maintenance and therefore OpEx and a fully deductible expense.

The entire problem is that software companies hire workers and need to deduct salaries in the month they are paid. The 5 year R&D rules basically say that your salary in one month must be deducted over five years. If your business is breaking even and making no net profit, you would now have to pay taxes, which turns your "break even" business into a loss-making business.

There is no situation where the old rules let you get away with paying less taxes than you should. The rules simply affect the timing of the tax deductions.

The new tax rules basically assume every business has infinite access to 0% interest loans to bridge over temporary tax induced insolvency.
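
To put rough numbers on the timing argument above, here is a minimal sketch. Everything in it is an assumption for illustration: the revenue and salary figures, the flat 21% rate, the `year_one_tax` helper, and the straight-line "one fifth per year" schedule used elsewhere in this thread (the real schedule may differ in its details). It only compares year-one tax for a break-even business when salaries are deducted immediately versus spread over five years.

```python
def year_one_tax(revenue, salaries, amortize_years, tax_rate_percent=21):
    """Year-one tax for a business whose only cost is salaries.

    Hypothetical helper: amounts are whole dollars, and the deduction is
    spread evenly over `amortize_years` (1 means fully deducted when paid).
    """
    deductible_now = salaries // amortize_years
    taxable_income = max(revenue - deductible_now, 0)
    return taxable_income * tax_rate_percent // 100


# Assumed numbers: a business that exactly breaks even in cash terms.
revenue = 1_000_000
salaries = 1_000_000

tax_if_expensed = year_one_tax(revenue, salaries, amortize_years=1)
tax_if_amortized = year_one_tax(revenue, salaries, amortize_years=5)

print(tax_if_expensed)   # 0
print(tax_if_amortized)  # 168000
```

Under these assumptions the business has no cash profit in either case, but with amortization it owes tax on 800,000 of paper income in year one. The remaining deductions arrive in later years, which is why this is a timing effect and not a change in the total deducted.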


How does deducting R&D in software equal magic money? Isn't it just making the deduction timeframe the same timeframe as the actual expense/money paid?

It's weirder that you spend money on R&D today & then get a 20% deduction for 5 years in a row, no? You've spent money, but also can't deduct it from your profit so you're both spending money and paying taxes on that money as if you had it.


With k8s u’re 80% researching innovations in container orchestration, & only 20% doing actual work, so it’s clearly an R&D expense ;)

Is this true though? Salaries are generally fully deductible from income at time of payment. This article is about amortizing it, which delays deductibility. Would be more magic loss of money, no?

personally, i think amortization is generally stupid and all expenses should come off income at time of accrual with the exception of a matching amount to net debt change over the year, but i am a dreamer.


Don’t take it personally, but if you aren’t researching and developing new and improved methods of performing Information Technology to better equip your company to compete in its marketplaces, then you had better be purely in Operations.

I see you're one of the people I worked with that liked classifying things like "our A/B test says the red button converts better" or "we should let the users message each other" as Research. CRUD app #548 isn't research in my book, unless you're doing something you can actually publish papers about with some innovation. Even in "basic UX" it is possible to do research, but most people are making a button, not researching.

every startup I worked at used the research thing for things that definitely were not studying how to improve the computing field.

That isn't how the IRS defines it though. Essentially anytime software is being "developed" they will let you classify it as "R&D" (well you can claim the 174 R&D expense deduction but not the R&D Tax Credit under Section 41 which requires activities closer to what you describe).

> Section 5 of Rev. Proc. 2000-50 provides, “The costs of developing computer software (whether or not the particular software is patented or copyrighted) in many respects so closely resemble the kind of research and experimental expenditures that fall within the purview of section 174 as to warrant similar accounting treatment.” As a result, “the Service will not disturb a taxpayer’s treatment of costs paid or incurred in developing software for any particular project, either for the taxpayer’s own use or to be held by the taxpayer for sale or lease to others,” where the taxpayer either treated the costs as currently deductible expenses under former Sec. 174(a) or capitalized and amortized them under former Sec. 174(b) (id.).


Almost makes you wonder what that D stands for in “R&D”.

It stands for "marketing", obviously. After all, that's what you do to "develop" the output of your "research" into a product.

I know. I'm not saying they were committing fraud. I'm saying this is of questionable social value to subsidize.

It’s not about “improving the computing field”, but creating new IP (Intellectual Property).

Old industries - get paid for doing the actual work

Enterprise SW - get paid for automating things

Software companies - get paid for writing COTS (Commercial Off-The-Shelf) software

Sharing economy startups - get paid percentage of every transaction in a specific marketplace


If you're a director of global public policy for seven years, who are you really whistleblowing, the company, or yourself?

I appreciate the information coming out, but in some of these situations I can't help but picture that "the worst person in the room" in regards to the offenses might also end up being the person that then becomes the most holier than thou when they get out of the company.

If you've ever met someone who used to work in software ads and asked them about privacy, you'll get what I mean.


Everything is incentives of the company. People are not told to do something evil, they're told "We don't have the budget to look into that" or "don't do that until we hear from the higher-ups."

I'm uninterested in who made what call because ultimately it's designed to be nebulous. Someone breaking out of that horrific cycle and telling us what it leads to is a good thing. Even though it will likely lead to nothing, like it did with Sophie Zhang [1].

[1] https://www.buzzfeednews.com/article/craigsilverman/facebook...


I get your point, but also, so what? The point of this whistle-blowing is that the public gets to make informed decisions about these companies that provide us with services that many of us use daily, and that the government gets a chance to figure out whether this is the truth or not and take action accordingly. It's not about who is guilty right now but what needs to be done to prevent this in the future, IMO.

And whatever she might be guilty of doing, the chances of her ever having the same access to sensitive information in any company have gone drastically down after coming forward with this information.



Can't really eat plastic figures. And maybe you've only eaten frozen fish in your life, but fresh fish is great.

It's way more profitable for me to work than to sleep but I still sleep.


The fishing industry is especially weird because you can't make it more productive and decades of legislation have been spent desperately trying to make it less productive so there are still some fish left. I like fresh fish but there's a very real limit as to how much we can have.

This is one of the cases where you should indeed rely on Google.

You just created a modern take on LMGTFY

It's now pronounced LMCGPTTFY

I've had people do this to me (albeit in an attempt to be helpful, not snarky) and it felt so weird. The answers are something a copywriter would have thrown together in an hour. Generic, unhelpful drivel.
