Hacker News

I quizzed ChatGPT about math a few days back. It correctly answers that 1023 is odd, correctly answers that the square of an odd number is also odd, and is about 50/50 on whether it decides 1023^2 is even or odd.

The author of the linked post repeatedly refers to ChatGPT's "reasoning". But ChatGPT doesn't reason, it just generates text. Sometimes that text might look like reasoning to a human--it's the sort of text that a human who reasoned through the question would generate--but there was no reasoning involved. It's important to understand this.

> Based on my teaching experience, I’d say chatGPT scored like a second year math major.

Based on my experience, ChatGPT scored worse than my 5-year-old, who gets the 1023^2 question right every time.




Prompt:

> is 1023^2 even or odd?

Response:

> 1023 squared is even because it is equal to 1,046,529, which is an even number. To determine if a number is even or odd, you can check whether it is divisible by 2. If it is divisible by 2, it is even. If it is not divisible by 2, it is odd.

> In general, any number that is a perfect square will be even if the number being squared is odd, and it will be odd if the number being squared is even. This is because odd numbers, when squared, always result in an odd number, and even numbers, when squared, always result in an even number.


Interesting, I got:

> 1023 squared is 1048129, which is an odd number. To determine whether a number is even or odd, you can check whether it is divisible by 2. If it is, it is an even number. If it is not, it is an odd number. In this case, 1048129 is not divisible by 2, so it is an odd number.


Yeah, like I said it's about 50/50 whether it spits out even or odd. Try it a few times and you'll see.
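For reference, the arithmetic is easy to check outside the chat. A minimal Python sketch (the helper name `parity` is only for illustration and does not come from the thread):

```python
def parity(n: int) -> str:
    """Return "even" or "odd" for an integer n."""
    return "even" if n % 2 == 0 else "odd"

n = 1023
square = n * n

print(square)          # 1046529
print(parity(n))       # odd
print(parity(square))  # odd

# Squaring never changes parity; spot-check a range of integers.
assert all(parity(k * k) == parity(k) for k in range(-1000, 1001))
```

So the first response above quotes the right value (1,046,529) but labels it even, while the second gets the parity right with a wrong value (1048129).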


lol; now get statistics on its answers to 'what is 10231 modulo 3?'.
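The ground truth for that one is also a one-liner. A small sketch, assuming plain Python integers; `digit_sum` is a hypothetical helper added here to show the usual divisibility-by-3 shortcut (a number and its digit sum leave the same remainder mod 3):

```python
def digit_sum(n: int) -> int:
    """Sum of the decimal digits of n (sign ignored)."""
    return sum(int(d) for d in str(abs(n)))

n = 10231
direct = n % 3                 # computed directly
via_digits = digit_sum(n) % 3  # 1 + 0 + 2 + 3 + 1 = 7, and 7 % 3 == 1

print(direct, via_digits)      # 1 1
```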


It reminds me of study methods that rely on "memorizing" as opposed to thinking about the problem and applying various methods to solve it. It can solve some algorithm problems at a certain level, but struggles to give a detailed answer in some scenarios. I guess that's no surprise, given ML is essentially exploring the local solution space of some abstract problem; it reminds me of a Taylor series expansion or a breadth-first search of an abstract problem space.


ChatGPT is a liberal arts major.

One day it will "dual major" in math and then learn to determine whether a situation demands rigor (hard calculation) or "correct word estimation" (its current interaction mode).


A liberal arts major should also be able to draw conclusions of the form

  All A are B  
  C is an A  
  Therefore C is a B.

  A = square of odd number  
  B = odd number  
  C = 1023^2

My 5-year-old is not a math major, but he manages it.
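For completeness, the first premise (the square of an odd number is odd) follows from a one-line expansion. A sketch, writing an odd number as 2k + 1 for some integer k (the symbol k is introduced here and is not in the thread):

```latex
(2k+1)^2 = 4k^2 + 4k + 1 = 2\,(2k^2 + 2k) + 1
```

The right-hand side is one more than an even number, so it is odd. For 1023, k = 511.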


The irony of ChatGPT not being logical enough is not lost on me


Your 5 year old is doing number theory? I think I was still working on like counting when I was 5


My five year old knows about evens and odds and squares, if that counts as number theory. He's really into multiplication.


How do you know there is no reasoning involved?


If there were reasoning, it would consistently get the parity of 1023^2 correct.


How do you know?



