Hacker News | ChikkaChiChi's comments

4o will get the answer right on the first go if you ask it "Search the Internet to determine how many R's are in strawberry?" which I find fascinating


I didn't even need to do that. 4o got it right straight away with just:

"how many r's are in strawberry?"

The funny thing is, I replied, "Are you sure?" and got back, "I apologize for the mistake. There are actually two 'r's in the word strawberry."
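For what it's worth, the ground truth here is trivial to verify outside the model, which never sees individual letters at inference time (the commonly cited tokenization caveat; an aside, not something established in this thread):

```python
# Plain string counting: the kind of letter-level check an LLM
# cannot do directly, since it operates on tokens rather than characters.
word = "strawberry"
count = word.count("r")
print(f"'r' appears {count} times in '{word}'")  # 'r' appears 3 times in 'strawberry'
```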


I tried to replicate your experiment in German, where "Erdbeere" has four e's, and it went the same way. The interesting thing was that after I pointed out the error, I couldn't get it to doubt the result again. It stuck to the correct answer, which seemed somehow "reinforced".

It was also interesting to observe how GPT-4o even tried to prove/illustrate the result typographically, writing the word four times and putting the respective letter in bold each time (without being prompted to do so).
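The German variant checks out the same way. A small sketch that also reproduces that kind of unprompted typographic illustration (using Markdown-style asterisks for bold, which is an assumption about how it was rendered):

```python
word = "Erdbeere"

# Case-insensitive count: 'Erdbeere' has e's at positions 0, 4, 5, and 7.
positions = [i for i, ch in enumerate(word) if ch.lower() == "e"]
print(len(positions))  # 4

# Print the word once per occurrence, bolding a different letter each time.
for p in positions:
    print(word[:p] + "**" + word[p] + "**" + word[p + 1:])
```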


GPT-4o-mini consistently gives me this:

> How many times does the letter “r” appear in the word “strawberry”?

> The letter "r" appears 2 times in the word "strawberry."

But also:

> How many occurrences of the letter “r” appear in the word “strawberry”?

> The word "strawberry" contains three occurrences of the letter "r."


Neither phrase causes the LLM to evaluate the word itself; it just steers attention toward different parts of the training data.

Using more 'erudite' speech is a good technique to help focus an LLM on training data from folks with a higher education level.

Using simpler speech opens up the floodgates toward the general populace.


All that's happening is that it finds 3 most commonly in the training set. When you push it, it responds with the next most common answer.


But then why does it stick to its guns on other questions but not this one?


I haven't played with this model, but working with Claude or GPT-4, I rarely find that to be the case. If you say it's incorrect, it will give you another answer instead of insisting it was right.


Wait what? You haven’t used 4o and you confidently described how it works?


It's how LLMs work in general.

If you find a case where forceful pushback is sticky, it's either because the primary answer is overwhelmingly more common in the training set than the next best option, or because the training data contains conversations that exhibited similar stickiness, especially if the structure of the pushback itself resembles those conversations.
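The mechanism being claimed here can be sketched as a toy frequency model (the frequencies are entirely made up for illustration; real models sample from a learned distribution, not a lookup table):

```python
from collections import Counter

# Hypothetical frequencies of answers to "how many r's in strawberry?"
# in a training corpus. These numbers are invented for illustration.
answer_freq = Counter({"2": 900, "3": 350, "1": 50})

ranked = [answer for answer, _ in answer_freq.most_common()]
initial_answer = ranked[0]         # the most common answer wins at first
answer_after_pushback = ranked[1]  # "Are you sure?" falls back to the runner-up
print(initial_answer, answer_after_pushback)  # 2 3
```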


Right... except you said:

> If you say it's incorrect, it will give you another answer instead of insisting on correctness.

> When you push it, it responds with the next most common answer.

Which clearly isn't as black and white as you made it seem.


I'll put it another way: behavior like this is extremely rare in my experience. I'm just trying to explain why it's likely happening if one does encounter it.


> OpenAI will receive access to current and archived content from News Corp’s major news and information publications, including The Wall Street Journal, Barron’s, MarketWatch, Investor’s Business Daily, FN, and New York Post; The Times, The Sunday Times and The Sun; The Australian, news.com.au, The Daily Telegraph, The Courier Mail, The Advertiser, and Herald Sun; and others. The partnership does not include access to content from any of News Corp’s other businesses.

These publications aren't exactly known for their level-headed, unbiased journalism. What a shame that we're poisoning the well and allowing old media to maintain a stranglehold on crafting their own narrative.


“We” don’t have any say.

It's naive to take the charter of OpenAI at face value. It's hilarious what qualifies for non-profit tax treatment, something I've only come to appreciate as I've grown up. There is tremendous wealth and commerce flowing through the US tax-free. Religions, universities, "good" tech companies. To whose benefit? Humanity's? Lol

I bet the nonprofit with a small for-profit arm that licenses out the tech is the best thing since the now-defeated trick of incorporating in Ireland for tax avoidance.


I don't think the main issue with News Corp is that they're "old" media.


Travis Bickle wasn't a real person.


https://archive.is/hsKG0

Title is misleading. This article is about the NHTSA investigating the efficacy of Tesla's recall last year. There is no mention of "hundreds of crashes."


This does not appear to affect domestic customers.


How would they know a customer is domestic or foreign without some level of identification on everyone?


Bingo. They'll have to KYC everyone to avoid liability for missing a foreigner posing as domestic.


Then surely all the good actors have to do KYC, and all the bad actors can just pretend to be American entities.

I don't agree with this on principle, but even just from a practical perspective it seems like they are leaving the door completely open by doing that. What's even the point?


Yet.


Bard already integrates with Google Search. These companies aren't competing to be the best, just the most "good enough" within existing form factors.


Time to put on my robe and wizard hat.


Gasp. Is bash.org dead? The Internet Archive hasn't shown a mirror in a long time.



It goes down for periods of time and often re-emerges later.



Has anyone actually gotten a job specifically for the Generative AI skills mentioned in the article?


Built a tool to summarize certification and licensing costs associated with jobs that require State credentialing.

