ryanmcbride's comments

My go-to to show people who don't understand its limitations used to be the old "how many Ms are there in the word 'minimum'" or something along those lines, but it looks like it's gotten a bit better at that. I just tried it with GPT4o and it gave me the right number, but the wrong placement. In the past it's given it completely wrong:

>how many instances of the letter L are in the word parallel

The word parallel contains 3 instances of the letter "L":

    The first "L" appears as the fourth letter.
    The second "L" appears as the sixth letter.
    The third "L" appears as the seventh letter.


I wonder if that is simply a reflection of there being more training data out there about this "problem" and the model hoovering all that up and regurgitating it?


These kinds of letter tricks aren't great examples to use because they're fixable and might stop working at any moment, as you just discovered. Modern image generation models for instance not only present embeddings of the input but also have parameters for each individual letter in words, which is why the latest models are able to copy words into the images much better: they see the spellings encoded as input parameters, as well as the concept vectors.

Even things like citations aren't going to work long term. The models are just being trained not to emit URLs from memory but only from their context.

It's fine to show people the current state of things, but if you're trying to get students to understand why they shouldn't use AI the only technique that will work is this: "you will be examined in a controlled room where computers and phones aren't allowed". Nothing else will be effective over the medium to long term.


Here's QwQ-32 answering the same question. I'm including the entire answer here including the CoT:

> Let's see. The word is "parallel." I need to count how many times the letter "L" appears in it. Okay, spelling it out: P-A-R-A-L-L-E-L. So, starting from the beginning, first letter is P, not L. Then A, not L. R, not L. Another A, still not L. Then L, that's one L. Next letter is L again, so that's two L's. Then E, not L, and another L, making it three L's. Wait, but when I spelled it again, I saw P-A-R-A-L-L-E-L, and I counted three L's. But let me check again. Parallel: P-A-R-A-L-L-E-L. Yes, L appears three times. So, the answer is three.

> But, maybe there's a trick here. Is there any special consideration for the letter L in this word or in general? Sometimes questions like this might be testing for homophones or something, but I don't think so in this case. It's straightforward: count the L's in "parallel."

> Alternatively, perhaps it's a trick with the word itself. Is "parallel" spelled differently sometimes? I don't think so. It's always P-A-R-A-L-L-E-L. So, three L's.

> Wait a minute, is there a possibility that the word is misspelled in the question? If the word was "paralell," that would be four L's, but that's not the correct spelling. So, sticking with three L's.

> Another angle: is the question referring to uppercase or lowercase L? But in the word "parallel," all letters are lowercase unless it's at the beginning of a sentence, but the question just says "the word parallel," which is typically in lowercase. So, doesn't matter.

> Also, considering that "l" and "I" can sometimes be confused, but in this word, it's L's only, no I's.

> So, my final answer is three instances of the letter L in the word "parallel."

> Final Answer: 3


This AI is talking as if it is in the final stages of a multi-day interrogation by a very hush-hush and deniable branch of government. I give it another five minutes before it breaks down and will sell out its mother just to be allowed to sleep with the lights off.

I hope that this is not something I ever have to deal with. It's exhausting.


funniest thing I’ve ever read on HN


> If the word was "paralell," that would be four L's, but that's not the correct spelling.

Better but this is still a hallucination.


Shows nicely what's going on.

If you ask a human, they will answer 3. Sometimes they say 4. Or 2. That's it.

An LLM produces a text using an example it was trained on. They were trained with these elaborate responses, so that's what they produce.

Whenever ChatGPT gets something wrong, someone at OpenAI will analyse it, create a few correct examples, and put these on the pile for retraining. That's why it gets better: not because it is smarter, but because it's retrained on your specific test cases.


They probably have a letter counting tool added to it now that it just knows to call when asked to do this.

you ask it the number of letters and it sends those words off to another tool to count instances of L, but they didn't add a placement one so it's still guessing those.

edit: corrected some typos and phrasing.

Maybe we'll reach a point where the LLMs are just tool-calling models and don't really give their own reply.


There are only 5 tools it has available to call, and that isn't one of them. A GitHub (forgot the url) stays up to date with the latest dumped system instructions.


I can't speak to all LLMs, but OpenAI has a built-in python interpreter. Assuming it recognizes the problem as "tokenization counting", it doesn't need a dedicated tool.
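For illustration, here is a minimal sketch of the kind of snippet a model could hand to a Python interpreter instead of counting from its tokens. The helper name `count_letter` is made up for this example and is not anything OpenAI documents; whether the model actually does this is an assumption.

```python
def count_letter(word: str, letter: str) -> int:
    """Case-insensitive count of a single letter in a word (hypothetical helper)."""
    return word.lower().count(letter.lower())


print(count_letter("parallel", "L"))  # 3
print(count_letter("minimum", "M"))   # 3
```

The point is only that once the question is routed to code, the count is trivial; the hard part is the model recognizing that it should route it.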


How do we know they’re the real system instructions? If they’re determined by interrogating the LLM hallucination is a very real possibility.


they probably just forgot to tell it humans are 1 indexed and to do the friendly conversion for them.
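A quick sketch to check the indexing theory, assuming nothing about how the model works internally. The function name `letter_positions` and its `one_indexed` flag are hypothetical, just for this example.

```python
def letter_positions(word: str, letter: str, one_indexed: bool = True) -> list[int]:
    """Return the positions of `letter` in `word`, case-insensitively.

    `one_indexed` switches between human-style (1-based) and
    programmer-style (0-based) positions.
    """
    offset = 1 if one_indexed else 0
    target = letter.lower()
    return [i + offset for i, ch in enumerate(word.lower()) if ch == target]


print(letter_positions("parallel", "L"))                     # [5, 6, 8]
print(letter_positions("parallel", "L", one_indexed=False))  # [4, 5, 7]
```

Neither convention gives the quoted 4, 6, 7, so an off-by-one conversion alone doesn't seem to explain that answer.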


They _don't_ trust you to do that stuff. Not unilaterally at least. In a healthy system you generally aren't able to change anything without sign off from multiple other people.


Also the argument they make is, they don't trust every single component of your machine, and want to mitigate the damage caused by an attacker or malware breaking in and impersonating you.


If I have a group of N people who I individually don't trust not to use mike1234 as a password, I wouldn't trust them as a collective either - at least until N gets impractically large.


So no need for code reviews then since all humans are fallible!


sure but they're both cool bugs so they belong together imo


> which, yes, I think is an appropriate way to arrive at this conclusion

it's not


Still reminds me of when Rivian announced a 100% RTO for their Palo Alto offices in between their first and second round of layoffs, despite not even having enough chairs for everybody if they actually showed up.

It'd be rad if we were able to bargain as some sort of collective in order to prevent companies from doing this exact thing that every company has been doing for the last 5 years. Like if we could unite and organize in some way that would make it harder for companies to pull obvious stuff like this without consequences. Oh well.


I dropped out and became an engineer at almost the exact same time. I've thought about going back for a degree but I was always so horribly bad at school that it's scared me off. I was bad at it mainly for undiagnosed ADHD reasons that I'm now getting successfully treated, but I'm still worried that if I went back the same things would just happen again. I'd join a class, I'd already kind of know what they're teaching (or think I did), I'd get bored and be unable to pay attention, I'd suddenly find myself MASSIVELY behind.

I really hope this isn't just an ad or something because I'd really love if there was a decent way for me to get a degree without having to go back to a college campus at 35


I dropped out of school for adhd reasons and after getting treatment I went back and finished my last couple semesters (and wrote a novel, that first year on vyvanse was insanely productive). I found it a lot easier to engage with the work and pay attention long enough to take notes in class.


I was skeptical when I first started treatment because I've internalized the whole "You don't have ADHD you're just lazy" thing for literally my entire life, and then imagine my surprise that the medicine designed specifically to make my brain work, actually makes my brain work!

I still get feelings of skepticism that it actually does anything every once in a while. Despite the overwhelming evidence from my wife, my coworkers, and my life in general that it absolutely does.


Nope, not an ad, I promise. I have negative things to say about WGU too, if that would help (mostly in their class selection for the CS degree).

Feel free to email me if you have any specific questions, I think you and I might be pretty similar in this regard.


WGU is real and legit, and probably one of the top things I would consider if I needed to get a bachelor's degree while working as a now middle-aged adult.


I'm not sure I'd call it legit. As someone who hires for low-to-mid level IT roles, the gap in caliber between WGU students and those from real brick and mortar schools is vast, like vast.

There are edge cases, but if you didn't have the grades and SATs to do real college you're not going to be competitive in this market. I'd take a WGU grad but would put that degree under a VA Tech, RPI, UC Davis, etc. for sure, and way under Stanford or MIT (or Cambridge, or one of the better IIT campuses, etc.).

Honestly this whole thread seems like submarine ads


I'm not sure I know what a "submarine ad" is.

I don't really know any other WGU graduates in person, so it's hard for me to say the quality of student is "worse" than anywhere else.

Obviously I am biased, but I don't think I'm appreciably dumber than the average student who went to a brick and mortar school, but admittedly I'm a pretty weird dude who did use WGU as a "rubber stamp" school for me. I finished quickly, though I don't feel like the work was "easier" than when I was learning shit at Florida State, outside of me being a decade more experienced in it.

I'll agree there's definitely selection-bias with WGU for students who underperformed in high school, which can translate to poor work performance. Hell, I underperformed in high school due to at the time undiagnosed Major Depressive Disorder, so I am grateful for something like WGU existing.

It's tough to say. I think if you're in a position like I was, WGU is fine. It is there to demonstrate that you have a Bachelor's worth of CS knowledge; if I were 18 again and had had medication for depression, I would probably apply to some of the nicer public UK schools (e.g. University of York, Manchester, etc.), if I'm being honest, but until someone invents a time machine I'm stuck with the world as it is, not how I want it to be.


Lots of folks do IT and learn on the job, so why would going to (even a mediocre) school hurt any?


It also worked for me. Whatever material you already know you can mostly skip (though you will get tested on it).

Check out /r/WGU and /r/WGU_CompSci


sure but in the meantime, while we're still just studying, best to keep the petri dish uncontaminated


I wonder if they'd think the wooden ship _was_ the human, a la Species 8472


Because when someone is fucking you it's common courtesy for them to look you in the eye.


Yes when the layoffs happen it should be done with dignity. But no one owes you a "heads up".


In many countries and jurisdictions, that’s exactly what they owe you.


It's not only about the one being fired. It's also about the ones not affected knowing it was the last round for a while so they can focus on their work, not on fear of losing it.


Just because something isn't legally owed, doesn't mean people shouldn't demand it.


I mean the reason I can think of off the top of my head is that one is wasting the time of people who are looking for a job so they can continue to live, and the other is wasting the time of a person whose job is reviewing resumes?


I don't agree with the attitude of the grandparent poster either, but I work at a small company where I've been on the receiving end of many extremely low-effort applications from ineligible individuals, so it definitely goes both ways. Companies are also bombarded with spam from recruiters, just as candidates are.


There would be less application spam if there were fewer spam job postings.

Because only one in a hundred postings is real, we have to send out hundreds of applications before even getting a rejection. There's no way to tell if a posting is real or if anyone will ever read your application, so the only option available is to apply to everything.

A lot of people pose this as a prisoner's dilemma, but it really is not. This problem is not mutual, it's entirely one sided. If companies would only post jobs they intended to hire for, there would be exponentially fewer spam applications. They've fucked around by posting spam and now they're finding out by receiving even more spam.

When the average applicant has to send literally hundreds of applications to get any response at all, absolutely nobody is going to handcraft a thought out application to any one posting. There's literally not enough hours in the day. Because we don't even get rejection letters back, the only way forward is to firehose as many applications as possible and just hope you win the lottery by getting your resume in front of a human.

It's absolutely terrible for everyone involved and the only ones who can stop it want to act victimized by the problem they created


> "There would be less application spam if there were fewer spam job postings."

I disagree with this; in fact, I think there would be more application spam with fewer postings. Most of the application spam is from people who are either completely unqualified and just pressing the 'apply' button (which is made easy by websites which get paid per application), or people looking to move to a wealthier country (without any pre-qualification). I think both of these groups would actually be more aggressive about applying if they were more likely to be reviewed by a hiring manager.


I think it would bifurcate the obvious spam from the obvious real posters and make your job easier, even if there are more apps to go through. A lot of the worst is that vague middle land of "is this qualified? but it also kind of looks like AI because job applications expect a very specific format of resume".

So you'd win out that way. But I also don't sympathize too much, as the "unqualified just pressing apply" was a natural endstate of years of bad job requirement postings.


Sure but what I'm saying is that the stakes for a company having to go through spam applications are significantly lower than the people trying to find a job who are getting spammed with dead ends while their savings drain lower and lower.


Maybe for a big company, with an HR department and lots of resources, but not for a small company when >>90% of applications are from ineligible individuals.


yeah, but at worst, you waste time on the dime of a company from the recruiter's/hiring manager's POV. on the other side, you're wasting precious capital (time & money) from someone who may instead be hanging out with their kids, or taking care of their sick mother--i understand these are contrived examples. from a pure utilitarian perspective, both are a complete waste of time. but from a moral/ethical perspective, i think there's a clear loser in terms of precious time wasted.


I am an individual who works at a small company, and going through ineligible applications takes away from time I could spend with children or family. I am not an HR professional, but we don't have a massive staff to delegate these matters to. The situations are morally equivalent.


do you do this outside of work hours? sounds more like a failure in how the company operates than the nature of the problem. taking a wild guess that if it weren't for combing through applications, then those extra hours spent on mindless HR stuff would simply be filled with other work.


It doesn’t matter when you waste someone’s time; work hours are fungible for most professionals. Your ‘wild guess’ seems very convenient with respect to your previous comment, and happens to be incorrect.

