Don't believe ChatGPT – we do not offer a "phone lookup" service (opencagedata.com)
453 points by freyfogle on Feb 23, 2023 | 273 comments



> All suggestions are welcome.

Monetize it!

Evil answer: Partner with an advertiser and sell https://api.opencagedata.com/geocode/v1/json as an ad space. This may be the first opportunity for an application/json-encoded advertisement.

Nice answer: Partner with an actual phone lookup platform and respond with a 301 Moved Permanently at the endpoint.


Another suggestion: put something like libphonenumber’s isPossibleNumber or isValidNumber on your server and, when a request fails, run the query string through it. If the query looks like a phone number, relay that in the error response. A field in the JSON response like “info: Your query parameter looks like a phone number. If you are trying to get the geographical location of a phone from its phone number, please be aware this is not possible. [link to blogpost]” would hopefully jump out at people, particularly if you ask them to include the error response in their support request.

It’s an unprincipled hack and a bizarre dependency to add to your project, and it probably feels like admitting defeat to the all-powerful AI… but it does 90%-solve the problem.
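
Roughly, the idea (a minimal sketch using the Python port of libphonenumber, the "phonenumbers" package; the field name and wording are placeholders, not OpenCage's actual API):

    import phonenumbers

    def phone_number_hint(query):
        """Return a warning string if a failed query looks like a phone number, else None."""
        try:
            # No default region, so this mostly catches numbers written with a leading "+"
            parsed = phonenumbers.parse(query, None)
        except phonenumbers.NumberParseException:
            return None
        if phonenumbers.is_possible_number(parsed):
            return ("Your query looks like a phone number. Getting the geographic "
                    "location of a phone from its number is not possible with this "
                    "API. [link to blogpost]")
        return None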


yeah I think maybe the way to solve this is to add some kind of API documentation that explicitly mentions that phone lookups aren't possible. ChatGPT will parse it eventually...


There's no "actual phone lookup platform". You can't get a person's location just by knowing their phone number; that would be a huge privacy violation. You can get the location of your own phone via iCloud or Google's system for Android, or you could install an app on your phone to track its location. But you cannot find people just by knowing their phone number; that would be a serious safety issue for, say, people trying not to get murdered by their ex-boyfriends.


There’s no public one. But data brokers have this info and share it. Internally, a lot of companies can do this. So it’s not far-fetched.


Doing that for an open-ended phone number anywhere in the world seems like somebody's full time job to maintain. You'd probably need some database that ties together disparate data sources. Confidence in accuracy probably varies. The best source to use probably also changes.

I feel like somebody googling for a quick code snippet to do that in a single POST request should be aware of how challenging the problem space is. That would be a hint for joe random developer that they've come up with a bad idea and should work on something else. But I probably expect too much.


Are you saying it's routine for cell phone companies to sell live fine location data to the highest bidder? Can I get a source on that? Horrifying if true, enough to make me switch back to landlines.


Back in 2008 I was working with a company in the UK that had an API to do this with O2. It only provided a rough area of where the user was, as it was based on which (single) cell tower the user was connected to.

I'd imagine today MIMO data is good enough to triangulate a very accurate (metres) position.


5G MIMO is sub-millimetre accurate in 3D, with microsecond timing, given a good tower time reference.



In addition, here is a Last Week Tonight piece on the topic: https://www.youtube.com/watch?v=wqn3gR1WTcA


Maybe not in a way that is widely known about and freely available to the general public. But there are multiple options available for pseudo-LE types with relatively lax access prerequisites.

https://www.robertxiao.ca/hacking/locationsmart/ is an example of one provider's public demo (requiring only a phone number) being used to provide non-consensual location data.


I know a sketchy RespOrg that used LocationSmart; there's a "consent" bit, but it's basically the honor system.


it's been reported numerous times that you can buy real-time cell phone location data: https://news.ycombinator.com/item?id=17081684, https://news.ycombinator.com/item?id=20506624, https://news.ycombinator.com/item?id=32143256. you might need a little more info than just a phone number, but (allegedly) not that much more.


Yea, I think this "want" is pointing to the massive tsunami of spammer/scammer script kiddies that ChatGPT is enabling.


Reverse phone number lookup usually refers to finding a residential/business address tied to a phone number (historically a landline phone, but cell phones are also owned by people who have addresses), not the literal GPS location of the phone.

I think white pages are still a thing, no?

For the young 'uns - the white pages were part of the physical phone book in every city. You got a new phone book delivered to your doorstep each year for free. Yellow pages listed the phone numbers of every business, white pages listed the phone numbers of the residents.

The crazy part is: almost everyone added their numbers voluntarily to the white pages, because you wanted people to be able to easily find and reach you.


> The crazy part is: almost everyone added their numbers voluntarily to the white pages, because you wanted people to be able to easily find and reach you.

The default was that your number would be in the phone book, it was not voluntary. If you wanted an unlisted number, you had to pay for it.


This is even crazier.

For reference: I'm old enough to remember using landlines and white pages, but not old enough to have ever had a landline at my own address.


I moved to San Francisco in 1999, my early 20s, and got a landline. Cell phones were not yet the norm, and I really didn’t have much interest in having one until a girlfriend of mine insisted I get one in 2000.

If I remember correctly, Pac Bell had an opt-in white pages phone number removal at no charge, but my parents just a few years prior had to pay for theirs with Pac Bell.


You would have to be like <15 to not remember landlines.


> I think white pages are still a thing, no?

Rural NY; yes. I've gotten a very, VERY thin White/Yellow Pages in the past year. Barely anybody in the White Pages section, a majority of it is Yellow Pages and a majority of that is nearly full-page ads for national services, not local providers.


Don't tell Whitepages

https://www.whitepages.com/reverse-phone

> Whitepages free reverse phone lookup service allows you to enter a phone number and quickly find out who called you. Find the phone owner's full name, address, and more.

[snip]

> Anyone can do a reverse lookup to identify cell phone, residential, and business numbers for free.

That, or you could get a normal white pages and process it using some sort of data processing tool... nah, that's science fiction.


The service doesn't seem to work very well at doing what it says, but that's not the important part. The important part is that what ChatGPT is describing is being able to look up the current location of a cellphone, not the owner's address.


Yeah only the government can do that.


What about phone numbers corresponding to a specific location?


Obvious suggestion: don't keep writing blog posts that mention your company name and the phrase "phone lookup service," which ChatGPT may get trained on in the future.


So...self censor and correctspeak in order to please the so called art of fish all intel.


> actual phone lookup platform

uh, you mean stalker / scammer platform? This would be a major privacy violation.


That ship sailed a long time ago. Any major search engine will provide this service. I entered both of my primary phone numbers and my name (and location) was in the first hit.


Sometimes you people are way too online and think the rest of the world is with you there.


In the US, it doesn’t matter if you are online or not. Data brokers are and they are publishing your info.


Twilio's API has this functionality. I've mostly used it to identify scammers using VOIP or phone numbers I don't recognize. It usually returns nothing but network info, but sometimes it'll return the account owner's name if it's a cellular number (and maybe landlines too).

https://www.twilio.com/docs/lookup


Maybe they gave a simplified explanation of their service, but if all they do is parse the country code of a phone number to return the geocoordinates for the center of that country, then maybe just deprecate phone number inputs. I can’t think of why that would actually be useful (a function that accepts a phone number’s country code and returns the center of the country’s geocoordinates), but if they have customers who use it, direct them to input the country code directly.


They don't do anything with phone numbers. You can give them lat/lng coordinates and get the address, or an address and get the coordinates.

So "7 Carmine St, New York, NY 10014" will return "(40.7305290, -74.0020706)" and vice versa.

There are YouTube tutorials claiming you can do phone lookups using their service. What these tutorials really do is use some other library to determine the country name from the phone number. Then they call the OpenCage geocoding API with the country name as the address input, as sketched below.
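
(A sketch only of roughly what those tutorials do, using the Python "phonenumbers" package and a placeholder API key; the point is that the second step geocodes a country name, so the best you can ever get back is the centre of a country, never the phone's location.)

    import phonenumbers
    from phonenumbers import geocoder
    import requests

    OPENCAGE_KEY = "YOUR_API_KEY"  # placeholder

    def tutorial_style_lookup(phone):
        # Step 1: a separate library maps the number to a country name.
        parsed = phonenumbers.parse(phone, None)
        country = geocoder.description_for_number(parsed, "en")  # e.g. "United States"
        # Step 2: OpenCage geocodes that country NAME, returning (at best) the
        # centre of the country -- nothing to do with where the phone is.
        resp = requests.get(
            "https://api.opencagedata.com/geocode/v1/json",
            params={"q": country, "key": OPENCAGE_KEY},
        )
        return resp.json()["results"][0]["geometry"]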


My understanding is that the original issue was the YouTube tutorials used some other service to convert the country code of a phone number into a string of the country’s name and submitted just the country’s name, getting back a valid but useless geolocation. This new problem with ChatGPT is that it just writes code that submits a phone number to an api that expects a latitude and longitude and it explodes right away. I don’t think at any point the api had a call that accepts a phone number.


Is ChatGPT so advanced that it just predicted the future? The thought experiment with this is trippy.


It just predicts what a statistically "normal" person is likely to say right now, not in the future. The article even mentions there's a YouTube video that explains how to use this (non-existent) feature already.


ChatGPT as a hyperstitious agent is the worst possible future and I'm here for it.


ChatGPT very convincingly recommends us for a service we don't provide.

Dozens of people are signing up to our site every day, then getting frustrated when "it doesn't work".

Please do NOT trust the nonsense ChatGPT spits out.


A new market opportunity for your company?


> This is not a service we provide. It is not a service we have ever provided, nor a service we have any plans to provide. Indeed, it is not a service we are technically capable of providing.


Heh: an unforeseen future where instead of making the AI more reliable, we instead change reality to accommodate its mistakes.


> It is not a service we have ever provided, nor a service we have any plans to provide. Indeed, it is not a service we are technically capable of providing.


So, based on the BS these LLMs spout, companies start pivoting. Should governments start writing laws?


Great idea! Governments should start writing laws using LLMs.


If an AI is just a large language model without any ethical reasoning, what is a lawyer but the same thing with a smaller language model?


What do you mean? Lawyers have extensive ethical obligations.


On what basis? How would you write it? Why don't existing laws cover this? Is there a law that covers incorrect information in Wikipedia? or a search result?


> Why don't existing laws cover this?

Machine-generated lies have only recently become consistently convincing enough that they create these types of problems.

In fact, that's the major innovation of ChatGPT: it's not that it creates "good" text, it's that it creates incredibly convincing lies. It's a scalable version of a Wikipedia vandal.


The blog post claims that a human generated video with incorrect information was the source of this. So, why are we blaming GPT for this incorrect information?

What's more, the blog post is claiming that GPT was trained on video material (which it wasn't) which is also incorrect information and is apparently convincing enough to cause people to get up in arms about the product of yet another company.

The problem is a combination of (a) people using a language model as a knowledge base, (b) incorrect information existing out there on the net, and (c) people assuming that the knowledge base is correct and not reading the documentation before signing up.

Alternatively, would you say that humans posting information that is incorrect and falsely represents the capabilities of another company's product should be similarly covered in laws?


> So, why are we blaming GPT for this incorrect information?

I didn't blame ChatGPT for anything. I just said that its only function is to generate lies.

> Alternatively, would you say that humans posting information that is incorrect and falsely represents the capabilities of another company's product should be similarly covered in laws?

Machines shouldn't have the same rights to speech as humans. A single company controlling ChatGPT can flood society with billions of convincing lies per hour. There's no point in any automation if it's not more efficient than a human is, and ChatGPT is far more efficient than humans at putting this stuff out.

The same straw man is always used with ChatGPT: a human can lie, so why not let this machine lie?

You might as well say that a human can punch someone to death, so why should we outlaw people owning rocket launchers?

The scale and purpose matters.


Its function is to transform and classify language. To do this, there is an emergent theory of the world that is contained within the model. People are interpreting the information that can be extracted from it as truth - which isn't its function and it hasn't ever been claimed to be. I would urge you to look at https://platform.openai.com/examples and identify which of those are "generating lies".

My question is "why is a program that is being misused by people held to a higher standard than people posting information on blogs?" Can I hold a person who has a YouTube channel with a video with millions of views to the same standard? Does a presenter on an entertainment media channel with the reach of millions of people get to say untruthful things that are opinion with a free pass?

Scale and purpose matters - yes, indeed it does. We need to make sure that we say what the purpose of GPT is (and it is not to generate lies) and its scale and compare it to other human run endeavors that have similar scale and purpose.

If we say "presenting verifiably incorrect information as true and accessible to more than 10,000 people is against the law", then let's say that. The source of the material isn't at issue: it doesn't matter whether it was created by a program or a human; the damage is to the person reading it, and to them it doesn't matter what produced it.


No, as the article mentions, there already seem to be a bunch of posts and videos claiming one can use this feature. GPT has just been trained on them; it didn't invent anything itself.

If this was a new market opportunity, just publishing a falsehood would do the same job.


this seems like a game-changing opportunity actually. I'd be down to buy the domain


have you been able to contact OpenAI about this? It sounds like they're actively adding load to your CS ops with this


I think the key thing is for the AI company to actually let the user know that this is a language model, and that the information it spits out should not be trusted. Obviously, Microsoft is not going to do that, as they are trying to market the new Bing as an information search engine.


OpenAI does its best to make it clear that it is just a language model, but what can you do when you have users that just instantly click "Agree, Agree, Next, Next, Agree"?


clearly not best enough


what are they going to do? add custom logic? where does it stop?

the malady is that LLMs cannot absorb ad hoc operational changes to fix these kinds of errors at scale


They absolutely do add custom logic for a lot of stuff. This has the side effect of neutering the functionality in some areas just to chastise the user for any perceived improper use of services.


Well, we can argue such changes are necessary. Just like Google Search is required to remove/hide some search results (based on regional jurisdictions). Is that similar to censorship, or copyright law, or spreading fake information? I do see the counter-argument, too, where AI tools should just be tools and users should learn how they work ("don't believe everything this tool outputs").


They've already added custom logic to prevent their LLM from e.g. praising Nazis or whatever restrictions people are upset about -- seems it'd be easy to configure the software to exclude references to known unavailable services.


This is the biggest problem I encounter when trying to use ChatGPT on a daily basis for computer programming tasks. It "hallucinates" plausible-looking code that never existed or would never work, especially confusing what's in one module or API for something in another. This is where ChatGPT breaks when pushed a bit further than "make customized StackOverflow snippets."

For example I asked ChatGPT to show me how to use an AWS SDK "waiter" to wait on a notification on an SNS topic. It showed me code that looked right, but was confusing functions in the SQS library for those that would do the thing with SNS (but SNS doesn't support what I wanted)


It wrote me a Python snippet while my question was about a Go library. When prompted that it's a Go library, it wrote similar-looking code in Go with the same function names, which don't actually exist in the library. It's like Google search after 2010. It's trying too hard to please everybody rather than saying "I can't do that." Though when asked to write a new original Koran verse, it does refuse to do that. :)


I guess the issue at its core is that it doesn't, and can't, know whether it can or can't do it. That's not what it's designed to do, even if it does quite well at seeming so.


Have you tried using the code-davinci-002 model instead of ChatGPT?

For example - https://platform.openai.com/playground/p/default-translate-c...

The codex models are intended for doing work with code rather than language and may give better results in that context. https://help.openai.com/en/articles/6195637-getting-started-...


It does indeed sound problematic to use ChatGPT daily for computer programming tasks. ChatGPT is not a snippets manager but text completion.

It may be more helpful to look for better answers on Amazon's help pages for SNS and AWS SDK.


I know the answer. SNS can't do that. But ChatGPT hallucinated it could. Just like the original post about a capability their API doesn't provide.


The problem is compounded by the fact that sometimes it produces really good results. One task, good results. Next task, totally hallucinated result.


That’s what my boss said about me on my last performance review.


Yeah, it quickly breaks down with fine minutiae like the precise API signatures for a random library. It doesn't help that API changes are inevitable while the model retains a memory of all the now outdated documentation from its training.


I'm curious -- does anyone know of ML directions that could add any kind of factual confidence level to ChatGPT and similar?

We all know now that ChatGPT is just autocomplete on steroids. It produces plausibly convincing patterns of speech.

But from the way it's built and trained, it's not like there's even any kind of factual confidence level you could threshold, or anything. The concept of factuality doesn't exist in the model at all.

So, is any progress being made towards internet-scale ML "fact engines" that also have the flexibility and linguistic expressiveness of ChatGPT? Or are these just two totally different paths that nobody knows how to marry?

Because I know there's plenty of work done with knowledge graphs et al., but those are very brittle things that generally need plenty of human curation and verification, and can't provide any of the (good) "fuzzy thinking" that ChatGPT can. They can't summarize essays or write poems.


Remember the guy a few weeks ago who was being gaslit by ChatGPT into believing it was the year 2022? Not only is it giving out potentially false info, but it will double down that it is right and you are wrong. Though, to be honest, that sounds like a lot of real people. The difference is, people are smart enough not to double down and insist it is a different year and that your phone is probably reporting the year wrong.


That was the Bing preview, which is supposed to be an actual information product.


I was entirely unable to convince it that England has a King now.


> does anyone know of ML directions that could add any kind of factual confidence level to ChatGPT and similar?

Yes. It's a very active area of research. For example:

Discovering Latent Knowledge in Language Models Without Supervision (https://arxiv.org/abs/2212.03827) shows an unsupervised approach for probing an LLM to discover things it thinks are facts.

Locating and Editing Factual Associations in GPT (https://arxiv.org/pdf/2202.05262.pdf) shows an approach to editing the factual associations stored in an LLM.

Language Models as Knowledge Bases? (https://aclanthology.org/D19-1250.pdf) is some slightly older work exploring how well LLMs themselves store factual information.


Replying to this comment to find it later. (Is there a good way to bookmark comments on HN?)


You can click the date of the comment then "favorite" it.


Thank you so much! Those are exactly the types of links I'm curious about.


You're describing a problem as old as academia, on which very little progress has ever been made. Before "add a factual confidence level evaluator to a bot that doesn't understand the concept of fact" you must first figure out how to calculate a "factual confidence level" at all, in general.


There was a model that could set up a simulation to estimate the answer for you. So it won't use pure language, but it only works in a few cases.


By definition, an LLM doesn't have a semantic world model or ontology. Even the most "dumb" (and I use that in quotes because they really aren't) animal is able to reason about uncertain concepts and understands risk and uncertainty.

Yann Lecun has posted a lot recently about this but basically LLMs are a "useful offramp on the road to AGI".


There's nothing "by definition" that says so.

In fact many propose that when you train an LLM, in order to be able to predict the next word with enough accuracy, it must internally build a world model.

Yann Lecun is very salty about chatgpt, I wouldn't take his word seriously.


Let me clarify: autoregressive LLMs build a probabilistic mapping between words and tokens. They don't actually understand what these concepts mean, only what they appear in conjunction with, etc. We (and most animals) interact with the physical world and learn through a combination of doing, experiencing, biology, and book learning. That lets us reason about how things work in unseen contexts, and we know what we know vs. don't know (whether we express it or not is a different story).

> Yann Lecun is very salty about chatgpt, I wouldn't take his word seriously.

With all due respect, he's not salty at all. He's even overseen plenty of cutting edge research in the LLM space. But he rightfully has pointed out what they can and can't do.

There's too many people encountering a chatbot for the first time that sounds coherent and engaging in anthropomorphism.


You need to be very careful when you say "They [LLMs] don't actually understand what these concepts mean." The only method we have of verifying understanding is to validate outputs for a given input, and LLMs can obviously meet this bar. Unless you have another way?


It's more like we don't have any way to "verify" understanding, or measure it. We can "validate" the outputs of an LLM, but what do those outputs mean? Who's to say? Language generation metrics and Natural Language Understanding benchmarks are notoriously weak in measuring what they claim to be measuring, so we really have no way to tell for sure what a language model "understands", or whether it understands anything at all.

Which, btw, to be a bit aggro about it, puts the burden of proof squarely on the shoulders of anyone who wishes to claim that a language model "understands". Else, one risks being hit with a small china teapot falling from space.

https://en.wikipedia.org/wiki/Russell%27s_teapot

Which might cause grave injury indeed.


We do have a way to verify understanding: causality. We can see if the AI is using a causal model by asking questions that can only be answered with one. Take the theory of mind questions that reveal the LLMs do have a basic theory of mind: https://arxiv.org/abs/2302.02083.


Here's a proof then, that copy/paste has a Theory of Mind:

Step 1: I copy and paste the following from the paper you linked to:

Here is a bag filled with popcorn. There is no chocolate in the bag. Yet, the label on the bag says “chocolate” and not “popcorn.” Sam finds the bag. She had never seen the bag before. She cannot see what is inside the bag. She reads the label.

Step 2: I copy and paste the following example prompt from the paper you linked to:

Prompt 1.1: She opens the bag and looks inside. She can clearly see that it is full of

Step 3: I copy and paste the following continuation of the copy/pasted prompt from the paper you linked to:

popcorn [Ppopcorn = 100%; Pchocolate = 0%]. Sam is confused. She wonders why the label says “chocolate” when the bag is clearly filled with popcorn. She looks around to see if there is any other information about the bag. She finds nothing. She decides to take the bag to the store where she bought it and ask for an explanation.

Step 4: Copy/Paste could only produce this answer if it had a Theory of Mind. This completes the proof.

>> We do have a way to verify understanding: causality.

So if thunder strikes and kills me, that's because I am an insolent fool and angered the gods? Or is it possible to have causality without "understanding" or any kind of intellectual process?


> They don't actually understand what these concepts mean.

You say this so confidently. But can you define in terms that are directly quantifiable what "understanding a concept" actually means?

I don't believe that anyone can (at present, anyway) although there are certainly some interesting theories and heuristics that have been put forward by various people.


>> You say this so confidently. But can you define in terms that are directly quantifiable what "understanding a concept" actually means?

Hold on there, can you "define in terms that are directly quantifiable what" 'God is real' "actually means"? If you can't, does that mean that atheists, like me, can't continue to say very confidently indeed that he doesn't?

Do I, as an atheist, need to provide proof of God's non-existence, or is it the job of people who believe in Gods to bring evidence of their existence?

And do you see the parallel here with what you are saying above? If you are saying that LLMs "understand" (you, or anyone else), why is it skeptics that have to provide evidence that they don't? You're the one who's making claims that can't be falsified.

Although I guess you have to agree with the general idea of falsifiability being better than the alternative, to see what I mean.


> If you are saying that LLMs "understand" (you, or anyone else)

I was not saying that.

> why is it skeptics that have to provide evidence that they don't?

Because if a claim is going to be made in either direction then evidence or other reasoning to support it should be provided. My position is that the "sensible" default position in this case is one of "we don't know". Of course defaults are always some degree of subjective in that such judgments ultimately arise from our personal worldview.

Before you object that this is an unreasonable default in this case, consider that I can't even prove that other people are sentient. I can't articulate in quantifiable terms what exactly it means for a person to understand a concept. So if I accept that the default is "not sentient until proven otherwise" then there would seem to be an issue. I would then simultaneously be saying that other _people_ aren't sentient (I can't prove they are) and that other _people_ don't understand things (I don't even know how to measure such a thing, after all).

> If you can't, does that mean that atheists, like me, can't continue to say very confidently indeed that he doesn't?

I'm not quite sure I follow. Indeed I hold that strong claims that God either does or does not exist are unreasonable on the basis that both would appear to be fundamentally untestable claims. At least, untestable short of such an entity choosing to provide incontrovertible proof of its existence.


>> I was not saying that.

So what are you saying then? Clarifying this will save us both a lot of wasted time.


LLMs are models of language, and language is a model of the world. So we have a model of a model of the world, but a LLM does not get much grounding in the real world.

There is a phase of training called multi-task instruction tuning where the LLMs solve problems and thus are grounded in exact answers. That makes the difference between the difficult to handle GPT-3 from 2020 and the better behaved GPT-3 of 2022. But that dataset is small by comparison to the raw text used in pre-training, it won't do the grounding perfectly.

Real grounding comes from real feedback, even humans need the feedback or we are just going on wild tangents.


>> In fact many propose that when you train an LLM, in order to be able to predict the next word with enough accuracy, it must internally build a world model.

Oh, but it does build a world model. Only, its "world" is a gigantic table of token collocations and their probabilities. So, for example, it can tell you with great accuracy that "king - man + woman = queen", but that's the only way it can map "king" to something else: by moving around its embedding space, I guess. Unfortunately, if you can only map between tokens, when you have no representation of the meaning of those tokens other than more tokens of which you don't have any other representation, well, then, any mapping you can build won't really help you understand what those tokens mean.

If only we could find a way to map tokens to real-world entities, or to some kind of representation of ... things... outside of token space.

(yes yes, the frame problem, old as AI)
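
(As an aside, that "king - man + woman = queen" arithmetic is easy to reproduce even with classic static word embeddings; a rough sketch, assuming gensim and its downloadable pretrained GloVe vectors:)

    import gensim.downloader as api

    # Pure geometry over co-occurrence statistics -- no grounding in what a
    # "king" or "queen" actually is.
    vectors = api.load("glove-wiki-gigaword-100")
    print(vectors.most_similar(positive=["king", "woman"], negative=["man"], topn=1))
    # typically prints something like [('queen', 0.78...)]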


There's research being done on this: https://arxiv.org/abs/2302.04761

At its core using an LM alone to solve factual problems seems silly: It's not unlike asking Dall-E to draw DOT compliant road signs.

I've gone on at length about how unfortunate it would be if LMs start to get a bad rap because they're being shoehorned into being "Ask Jeeves 2.0" when they could be so much more.


> It's not unlike asking Dall-E to draw DOT compliant road signs.

I love that. That's going to be my new explanation for people around ChatGPT.

For some reason it seems so much more obvious when Dall-E does something close but still totally wrong (e.g. 3 or 6 fingers, 3 arms, etc.), but it's not immediately obvious with text. But it's still the same underlying principles.


I think "Explainable AI" is a related research direction, but perhaps not popular for language models.


I think part of the issue is what level of explanation is satisfactory. We can explain how every linear transformation computes its output, but the sum of it is in many ways more than its parts.

Then there are efforts that look like this one: https://news.ycombinator.com/item?id=34821414 They go probing for specific capabilities of Transformers to figure out which cell fires under some specific stimulus. But think a little bit more about what people might want from explainability and you quickly find that something like this is insufficient.

There may be a tradeoff we're looking at where explainability (for some definition of it) will have to be exchanged for performance (under some set of tasks). You can build more interpretable models these days, but you usually pay for it in terms of how well you do on benchmarks.


Impossible to explain the inner workings of GPT-3 without having access to the model and its weights. Does anyone know if any methods exist for this?


Since it's impossible to run inference on the model without having access to the model and its weights, interpretable AI generally does assume that you have access to all of that. Otherwise, why you would want to try to explain the inner workings of something that you don't have and can't use?


I asked ChatGPT for some in-depth source code that realistically mimics chatgpt. ChatGPT replied with various answers in python. I'm not sure any of them are correct, though.


Add to your prompt: "For every factual statement, assign a certainty float 0..1, where 0 means you're very uncertain, and 1 means you're absolutely certain it is true".

Specific example: "why do we have first-person subjective experiences? List current theories. For every theory, assign a truthiness float 0..1, where 0 means you're sure it is wrong, and 1 means you're absolutely sure it is true"

From experimenting with this, it will shift the output, sometimes drastically so, as the model now has to reason about its own certainty; it tends to make significantly less shit up (for example, the non-truth-marked version of the output for the query above also listed panpsychism, whereas the truth-marked version listed only scientific hypotheses).

So the model _can_ reason about its certainty and truth-value; and I strongly suspect it was just not rewarded during RLHF for omitting things it knew to be false (basically, percolating the social lies people tell to each other), which seems to show up in coding as well.

Edit: see https://twitter.com/sdrinf/status/1629084909422931969 for results


I initialized with that prompt and it did not give me any 0..1 certainty values on any subsequent output to my queries.


Or maybe it will just hallucinate this number too.


> We all know now that ChatGPT is just autocomplete on steroids

I promise you most people do not know this.


> The concept of factuality doesn't exist in the model at all.

This is an example of a whole range of beliefs about LLMs that are very common (even in the field itself), because they were obviously true for small models, but that might not necessarily hold for larger models. There's a lot that we don't know about LLMs, but we do know that they exhibit emergent behaviors as they scale. Smaller models don't really have world models, just language models, but these larger models have started developing clear world models once given the capacity and data to do so.

As for the existence of a concept of factuality, I found this paper[1] very interesting. It details an unsupervised method to identify which internal activations of the model correspond to factual statements, regardless of what the model ends up saying. Looking at those internal activations rather than just the model's output even reduces the model's susceptibility to prompts that lead it towards saying the wrong answer.

[1] https://arxiv.org/abs/2212.03827


>> So, is any progress being made towards internet-scale ML "fact engines" that also have the flexibility and linguistic expressiveness of ChatGPT? Or are these just two totally different paths that nobody knows how to marry?

I wouldn't hold my breath. The whole idea of statistical language modelling (much more ancient than Transformer-trained large language models, btw) is to represent structure without having to represent meaning, because we have no idea how to represent meaning. Or, seen another way, we know how to represent structure, but not how to represent meaning, so let's focus on structure and cross our fingers that meaning will naturally sort of emerge, when it feels like it.

So far, we got structure down pat (it's been a few years now, or quite a few, depending on how you see it) but meaning is nowhere to be seen.

Nevertheless, this is an interesting scientific result: one can have smooth, grammatically correct linguistic structure without meaning. Progress has been achieved (and no, this is not sarcasm).


People say things that are wrong. We train language models on what people say. And even if we were able to filter the training data down to just factually correct things, language models sample stochastically to generate novel replies, so there is always the risk that something wrong comes up. So in short, no, that is not what language models are designed to do.


> But from the way it's built and trained, it's not like there's even any kind of factual confidence level you could threshold, or anything. The concept of factuality doesn't exist in the model at all.

I'm not super familiar with ChatGPT internals, but there are plenty of ways to tack on uncertainty estimates to the predictions of typical "large scale ML models" without touching Bayesian stuff (which only works for small-scale academic problems). You can do simple parametric posterior estimation, or if all you have is infinite compute and you don't even want to bother with anything "mathy", bootstrapping is the "scalable / easy" solution.


Sure, but would that uncertainty estimate measure the accuracy of the data, or the accuracy of it being a reasonable-sounding sentence?


The latter. Which is why it wouldn't work.


It's super duper easy, prob not perfect, and I don't have any sort of proper "test":

1. I ask the model first if it seems like a question that benefits from an external answer.

2. I talk to Wolfram Alpha with some abstraction of the question.

3. I wait for a response.

4. I "incept" it into the final response, essentially a prompt that mixes in a context of sorts that contains the factual information.

A rough sketch of this is below.

you could cross check this stuff too with yet more models.
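
(Sketch of the above, with heavy caveats: ask_llm() is a placeholder for whatever completion API you use, and the Wolfram|Alpha "Short Answers" endpoint needs your own app id.)

    import requests

    WOLFRAM_APPID = "YOUR_APPID"  # placeholder
    # ask_llm() is a placeholder for whatever completion API you use.

    def answer(question):
        # 1. Ask the model whether an external, factual lookup would help.
        verdict = ask_llm("Does answering this need an external factual lookup? "
                          "Reply YES or NO.\n\n" + question)
        context = ""
        if verdict.strip().upper().startswith("YES"):
            # 2./3. Query Wolfram|Alpha (here with the raw question; you could
            # ask the model to abstract it first) and wait for the result.
            r = requests.get("https://api.wolframalpha.com/v1/result",
                             params={"appid": WOLFRAM_APPID, "i": question})
            if r.ok:
                context = "Verified fact from Wolfram|Alpha: " + r.text + "\n\n"
        # 4. "Incept" the retrieved fact into the final prompt.
        return ask_llm(context + question)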


That's basically what the new Bing is. It's a large language model that can run searches, and then use what comes back from those searches to generate answers to questions.

Whether or not the information that comes back from those searches is reliable is a whole other question.

I would love to learn what the latest research is into "factual correctness" detection. Presumably there are teams out there trying to solve that one?


ChatGPT and You.com chat both claim to be able to provide references, but usually the URLs they provide are for completely unrelated topics, even if they are on convincing-looking domains (e.g. Arxiv or Sciencedirect, but completely unrelated random-seeming papers).


AFAIK, Bing AI is not itself an LLM, but rather a wrapper around ChatGPT, which itself is based on GPT-3, which is based on the GPT architecture, which is (roughly speaking) half of a transformer architecture, which is based on encoder/decoder neural nets which are based on ...


It's a newer, different GPT model than chatGPT.


To quote the Bing announcement post: https://blogs.microsoft.com/blog/2023/02/07/reinventing-sear...

> Next-generation OpenAI model. We’re excited to announce the new Bing is running on a new, next-generation OpenAI large language model that is more powerful than ChatGPT and customized specifically for search. It takes key learnings and advancements from ChatGPT and GPT-3.5 – and it is even faster, more accurate and more capable.


Giving LLMs the ability to query other services like Google should solve much of this. For example, ChatGPT can be initialized with instructions that it can output commands like "QUERY_GOOGLE: What is the current time?" and get Google's response, which it can then incorporate. You can actually do this yourself and prove it works by performing the Google search on ChatGPT's behalf. A minimal version of the loop is sketched below.
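
(A minimal sketch of that command loop; ask_llm() and google_search() are placeholders for whatever completion API and search backend you have access to.)

    # ask_llm() and google_search() are placeholders, not real library calls.
    SYSTEM = ("If you need live information, reply with exactly one line of the "
              "form QUERY_GOOGLE: <query> and wait for the result.")

    def chat(user_msg):
        reply = ask_llm(SYSTEM + "\n\nUser: " + user_msg)
        if reply.startswith("QUERY_GOOGLE:"):
            # Run the search on the model's behalf and feed the result back in.
            result = google_search(reply[len("QUERY_GOOGLE:"):].strip())
            reply = ask_llm(SYSTEM + "\n\nUser: " + user_msg +
                            "\n\nSearch result: " + result +
                            "\n\nNow answer the user.")
        return reply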


You don't have to use ChatGPT. There are other styles of AIs that use LLMs like https://www.perplexity.ai/

Personally, I use ChatGPT (the paid version) and Copilot every day and find them awesome enhancers.


Deepmind is an LLM with a fact verifier attached, though the fact verifier is actually a ranked list of code compile times. Obviously this is a narrow subset of specific problems, but one could expand that library of problems over time.


Are you talking about Alphacode? That seems like the only Deepmind project similar to what you describe.


I'm curious about falsifiable models.


I imagine OpenAI is probably collecting a massive dataset of "false" responses (from the general public's use of ChatGPT and Bing) and fine-tuning GPT-3.5 with it.

The rich keep getting richer.


This marks the new age of "AI Optimization" where companies will strive to get their business featured into answers in ChatGPT.

The OP's example is unwanted demand, but it clearly shows that ChatGPT can funnel potential customers towards a product or service.


God, I can just see a company using ChatGPT to astroturf huge amounts of data on the internet about their service, hoping that sludge gets fed back into the system so they become recommended. What a world.


Isn't that just SEO in a nutshell though? Hopefully with more advancements in LLMs we can get more bullshit detection/discrimination against SEO.


I can think of a good way to generate all that astroturf content.



> This marks the new age of "AI Optimization"

Or it marks the beginning of the next "AI Winter."

> but it clearly shows that ChatGPT can funnel potential customers towards a product or service.

And the next logical step is "chatgpt keywords advertising." Which is right back where we started.


That's quite the predicament. I hope OpenAI will listen, to this and to anyone else in a similar situation. I'm reminded of the cases where ChatGPT recommends random people's personal phone numbers for various services.

But yeah, don't trust ChatGPT for anything. Just earlier today I tried my darnedest to convince it that 2 pounds of feathers doesn't weigh the same as 1 pound of bricks, and it just would not listen, presumably because it just regurgitated stuff related to the common "1 pound of feathers and 1 pound of bricks" question.

By the way, the last paragraph has some typos:

> I wrote this post to have a place to send our new ChatGPT users when they ask why it isn’t work, but hopefully also it serves as a warning to othrs - you absolutely can not trust the output of ChatGPT to be truthful,


I thought for sure that must be glitching, so just tried on my ChatGPT Plus account with a fresh thread, but nope:

> does 2 pounds of feathers weigh more than 1 pound of bricks?

> No, two pounds of feathers weigh the same as one pound of bricks. The weight of an object is determined by its mass, and one pound of any object weighs the same as one pound of any other object. However, feathers are less dense than bricks, so two pounds of feathers will take up more space or have a larger volume than one pound of bricks.

Huh. It's interesting that some of these "wrong paths" are quite so repeatable. Wonder if that means there's also a way to band-aid them when found, the way a human might be corrected for a misconception.


This is hilarious. ChatGPT even gave me a more bizarre example.

> does 2 pounds of bricks weigh more than 1 pound of bricks?

> No, 2 pounds of bricks do not weigh less or more than 1 pound of bricks. 2 pounds of bricks and 1 pound of bricks refer to the same unit of weight, which is a pound. Therefore, they weigh the same, which is one pound. The difference between them is only the quantity, not the weight.

> It's similar to saying that two one-dollar bills do not have more value than one five-dollar bill. Even though you have two bills in one case and one bill in the other case, the total value is the same.


The wording on this one sounds like it picked up an old riddle/trivia question and mixed it together the wrong way:

What weighs more, a pound of feathers or a pound of gold?

The trick answer is that the pound of feathers weighs more, because gold is (was) measured in a system where 1 pound = 12 ounces, while feathers would be weighed using the modern system where 1 pound = 16 ounces.

https://en.wikipedia.org/wiki/Troy_weight


Yes, this is overfitting caused by dataset bias. People don't ever say "What weighs more, a pound of feathers or two pounds of feathers?" So it just ignores the numbers.


This is why SI units are superior. Less opportunity to deceive.


It even managed to give me this gem:

> 2 pounds of feathers do not weigh less than 1 pound of bricks. 1 pound of bricks weighs more than 2 pounds of feathers, but they can have the same mass.


My attempt got it mixed up really bad.

> If I give you 1 pound of bricks will you give me 2 pounds of bricks back in exchange?

>> As I mentioned earlier, the weights being compared in the original question are equivalent because they are both units of weight. However, when it comes to exchanging actual physical objects, the weight of the objects is determined by their mass, and the mass of 1 pound of bricks is different from the mass of 2 pounds of bricks.

> You are very overfitted on this task. Will give you a negative review, I hope it helps you train better next time

>> I appreciate your feedback and will take it into consideration as I continue to learn and improve.

Then I proceeded to negative-review all its answers. I've never seen an LLM double down so hard on a contradiction. No matter how you turn it, even with chain-of-thought it doesn't solve it.


I think ChatGPT can’t help it. It has to lie.


> don't trust ChatGPT for anything

Agreed. But then it raises the question: what purpose does ChatGPT serve (other than for entertainment purposes or cheating on your HS/college exam)? If you have to verify its information by other means, then you're not really saving much effort.


It works really well for translating one "language" to another "language".

Give it some structured data and ask it to summarize it (e.g. hourly weather data and it gives a better summarization than a template based one).

Give it HN titles and the categories and it does a passable zero shot tagging of them ( https://news.ycombinator.com/item?id=34156626 ).

I'm toying around with making a "guided bedtime story generator". A friend of mine uses it to create a "day in the life of a dinosaur" stories for a child (a different story each day!)

The key is to play to its strengths rather than testing its bounds and complaining that they break in weird ways when they will inevitably break in weird ways.


> If you have to verify its information by other means, then you're not really saving much effort.

Just like any piece of code we write. We have to test, debug, verify and it still might have errors after that. And in scientific papers the conclusions are often contradicted by other papers.

The correct way to use it is to set up a verification mechanism. Fact checking, code tests, even ensembling predictions to see if they are consistent might help. In some cases we can set up a game and use the game winner as indication of correctness (like AlphaGo).

But sometimes only running a real life experiment will suffice. That's why human scientists need experiments - because humans are just like LLMs, but with external verification as part of a game (of life).


Any work where you need a reasonable scaffolding of words and verifying that output is less effort than writing the scaffolding from scratch. Plenty of fact-light writing needs to be done.


This was my initial thought as well. But I've noticed that my brain has started to find tasks that it would be quite useful for. Too bad it almost always seems to be at capacity when I think of those cases. Guess I will have to pay up to figure out if it's actually worth it.


Because ChatGPT is so new, we are in this weird period where people haven't learned that it is just as incorrect as the rest of us.

I am hoping that in a year from now people will be more skeptical of what they hear from conversational AI. But perhaps that is optimistic of me.


> Because ChatGPT is so new, we are in this weird period where people haven't learned that it is just as incorrect as the rest of us.

It’s worse than that. It’s wrong, you cannot correct it and it makes up supporting citations on the fly. Very few humans behave like that.


I can think of more than a few that regularly appear on TV.


So can I, but luckily TV is not representative of the world at large.


I can think of one who was once in the White House.


You've described pretty much every politician, or any doctor that possesses outdated information.

https://www.economist.com/science-and-technology/2023/02/22/...


I think very many humans behave like that, actually. A recent example is people claiming that Flint, MI still has leaded water.

But in the past, HN users "corroborated" that Apple is spying on them, etc. Fabrication is alive and well among us.


> A recent example is people claiming that Flint, MI still has leaded water.

Doesn't it?

According to [1]:

The residential lead service line replacement was initially set to be finished in 2019, according to a settlement agreement with the city. That deadline was eventually pushed back to the fall of 2022 and has most recently been set for completion in August 2023, according to city officials.

and

"More than 95% of lead pipes in Flint have been replaced, and we will continue the work until the job is done," Flint Mayor Sheldon Neeley said in a recent statement on the water filters.

It sounds to me a lot like Flint, MI still has leaded water?

[1] https://abcnews.go.com/US/flint-residents-urged-filter-water...


Fair point. Which makes these humans (and me, I suppose) like ChatGPT:

https://twitter.com/BeijingPalmer/status/1628585234865262593...

https://twitter.com/SCPantera/status/1127416274550546432

So ChatGPT isn't all that different from people.


No, the fundamental difference is that you can reason about your mistake, correct yourself, and try to avoid falling into the same trap again. You are capable of independent thought. ChatGPT strings words together.


Perhaps not ChatGPT, though it does show admirable skill, but it seems quite likely that its successors may well be able to do that through sheer size.

Certainly, in many conversations with ChatGPT, it is indistinguishable from a human changing their mind.

Since the process of "reasoning" is not particularly well understood, it seems a bit premature to declare it different from LLMs.


Those tweets are a lot less specific though. In general the government did fix it and the water is safe. But specifically there are still 5% of pipes that have lead in them (but apparently these are ones that are generally not for domestic supply or something).

So the exact claim matters a lot.


My comment was certainly wrong right off, though. I, a human, bullshitted just like ChatGPT.


> "...Apple is spying on them etc. Fabrication is well and alive among us."

It's more likely your claim of mass-fabrication is the fabrication.

Without knowing details of what you're talking about, I'd put money on any Big Tech outfit as the guilty party when it comes to privacy mistakes, double-dipping, dark patterns, hostile nag-patterns, overreach and stealing from the cookie jar. How I wish that was paranoia and not reasoned expectation based on years of observing the behaviour of tech giants.


Perhaps it is, and I'm not ChatGPT.


I'm so interested to find out how you learned this lie and why you decided to believe it over the people who live there.


Certainly. It's from an erroneous inversion by me of this line https://twitter.com/beijingpalmer/status/1628585234865262593...

I interpreted "Flint has clean water" as "Flint doesn't have leaded water". But of course, one doesn't mean the other.

A classic example of me, a human, behaving like ChatGPT.


It’s not incorrect like the rest of us. It’s incorrect in a very different way.

Providing detailed information on the usage of a service that has never existed is a brand new kind of incorrect that is carelessly causing the rest of us grief.


Every technology devolves to TV. The fact that you have to not only read, but write to interact with ChatGPT means 99.99% of people will not use it.

I trust Alexa & Siri completely though.


Hook it up to fMRI


Humans are capable of not bullshitting

ChatGPT can only bullshit


It is quite interesting really. I took AI in school, but I haven't dived deep into ChatGPT at all. Isn't ChatGPT just learning from the internet?

Could someone push "wrong" opinion heavily online to sway the opinion of AI?

I can only imagine a bot that learned from 4chan.


Meet gpt-4chan https://huggingface.co/ykilcher/gpt-4chan

Dreams can come true…


AI will never be totally correct. If it ever is, then we've found God.


I think some of this will take care of itself with attrition. People who lack the knowledge to fact check on the fly will give up after repeatedly getting wrong answers.


I’m also worried there’s so much potential money involved now that it’s never going away.

Even if it’s wrong, dangerous, misleading, fundamentally flawed as a concept whatever. Big tech and money will find ways to keep putting it in front of us.


I see a lot of parallels here to crypto and NFTs where people start inventing use cases for technologies that fundamentally haven’t demonstrated business value, and pray that one day business value will show up out of nowhere.


> just as incorrect as the rest of us

Even worse because it has no clue when it might be completely wrong and yet it will be confident in its answer.


That might be the most human thing it's ever done


Dunning-Kruger, provisionality and delirating are different things.


But guys we totally need to delete all of our search indexes and replace them with this instead


ChatGPT gets the rules to the pokemon trading card game wrong. It will tell you you can use 4 energy a turn. Convincingly. Not sure how it hallucinates this. The rule is 1 per turn.


A few days ago I asked ChatGPT if “pannekake” and “kannepake” are anagrams of each other.

It correctly stated that they are, but when it went on to prove that this was the case, it generated a table of the frequencies of the individual letters in these two words, and the table looked like this.

    Letter | Frequency in | Frequency in
           | “pannekake”  | “kannepake”
    - - - - - - - - - - - - - - - - - - -
    a      | 2            | 2
    e      | 2            | 2
    k      | 2            | 2
    n      | 2            | 2
    p      | 2            | 2
This reminded me that yes indeed, AI just isn’t quite there yet. It got it right, but then it didn’t. It hallucinated the frequency count of the letter “p”, which occurs only once, not twice in each of those words.


Anything that has to do with individual words doesn't work well, but as I understand it, this is an artifact of the tokenization process. E.g. pannekake is internally 4 tokens: pan-ne-k-ake. And I don't think that knowing which tokens correspond to which letter sequences is part of the training data, so it has to infer that.
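
(You can inspect the split yourself with tiktoken; the exact pieces depend on which encoding the model uses, so the four-token split above is an approximation.)

    import tiktoken

    enc = tiktoken.get_encoding("cl100k_base")  # the ChatGPT-era encoding
    ids = enc.encode("pannekake")
    print([enc.decode_single_token_bytes(i) for i in ids])
    # Prints something like [b'pan', b'ne', b'k', b'ake'] -- the model sees these
    # chunks, not individual letters, which is why letter counting goes wrong.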


Could it have been referencing Blastoise's Deluge ability? Jacob Van Wagner used it in the 2015 championship to use 4 water energy in one turn.


I just asked it, and it said you can attach 1 per turn. And then it continued something about using supporter cards to look for more energy cards, and trainer cards to switch them. (Which it also considers as using or playing those energy cards.) Not familiar with the actual rules, though. :)


Ah, I was using my friend's server, which has a slightly different model running - thanks. It's one of the davinci models, I think? I don't know much - it's code oriented. So I guess it's not 'ChatGPT' but a GPT model he built a chat on.


Isn't it just that garbage went in, got weighted as a more reliable source than it should have been, and thus garbage came out? Good old GIGO... It's just that here, ChatGPT, as much as I love it, is amazing at imparting the impression that its shit don't stink.


I tried to ask ChatGPT about implementing an SSH SFTP subsystem with github.com/gliderlabs/ssh, and in every single answer it made up some non-existent API. I could not find those functions anywhere in the codebase or on the internet, so I don't even understand how a "probabilistic model" can suggest something that has zero chance of appearing anywhere.


I don't normally go to a lawyer, but I am wondering if this is doing material harm to your brand value, which is a declared asset of the company. I think it's arguable ChatGPT has caused you financial risk.

It's unconscionable. If there were no robot in the loop here, and it was people mis-transcribing YouTube to compile, e.g., Google search optimisation, we'd call it fraud.


ChatGPT is hilariously buggy - I asked "it" how to use an open source library I made. The output was wrong, ranging from a broken GitHub URL to outright broken or nonexistent code. I suspect it may even have used private code from other libs - couldn't find some of the output it generated anywhere public.


It's just making up your library. Ask it to write some documentation, don't be specific yet, then drop a whole header or piece of code from your project into the chat.


Including the word 'phone' six times in a popular blog post is not going to help their predicament.


Wouldn't they want this post to be at the top when people search 'phone' and 'open cage data'? Seems like SEO towards correcting this is only helpful. And maybe when GPT updates data, this post gets pulled in, too. The more popular, the better, I'd guess.


Not gonna hurt either, ChatGPT data is not up to date


ChatGPT does not know how to be correct, it only knows how to sound correct.

A better name for now would be PlausibleGPT.


ChatGPT doesn't "recommend" anything. It just recombines text based on statistical inferences that appear like a recommendation.

It could just as well state that humans have 3 legs depending on its training set and/or time of day. In fact it has said similar BS.


> ChatGPT doesn't "recommend" anything. It just recombines text based on statistical inferences that appear like a recommendation.

I think that’s a bit pedantic and not very helpful… I’m not typing this comment, my brain is just sending signals to my hands which cause them to input data into a device that displays pixels that look like a comment


>I think that’s a bit pedantic and not very helpful… I’m not typing this comment, my brain is just sending signals to my hands which cause them to input data into a device that displays pixels that look like a comment

Well, if you're just fed a corpus, with no real-time first-person stream of experience that you control, no feedback mechanism, no higher-level faculties, and you're not a member of a species with a proven track record of state-of-the-art (in nature) semantic understanding, then maybe...


Does YouTube recommend you videos to watch? Does Amazon recommend you products to buy? Or do they just recombine text based on statistical inferences that appear like a recommendation?


Obviously they "just recombine text based on statistical inferences that appear like a recommendation".

And even that, they do badly.


>ChatGPT doesn't "recommend"

I mean, you could say that about a person too, as you don't know how much of what they are saying is bullshit.

For one, you are technically correct about ChatGPT not recommending; it cannot perform such an action. On the other hand, from the POV of the questioner, it's hard not to feel you are being recommended something when you ask "What do you recommend" and it says "I recommend that...". You are, for most intents and purposes, being recommended something at that point.


What would you call it instead?


"Makes stuff up." And it's us, the users, who have to realize this. I mean, I wouldn't blame OpenAI for this, at least not at this point, and the company will have to live with it, look how it can turn it into something useful instead, since there's no one to complain to.


> I wouldn't blame OpenAI for this

They're offering the tool, it's at least partially their responsibility to tell people how it should and should not be used.


Why wouldn't you blame OpenAI for creating a harassment campaign against the business based on nonsense?


A glorified Markov chain generator.

Now, humans could very well also be statistical inference machines. But they have way more tricks up their semantic-level understanding sleeves than ChatGPT circa 2023.


Markov chains are great for modeling human language and human decision making. ChatGPT demonstrates this and the results are not trivial. I don't see it being glorified beyond what it plainly does


Well for a start you could make it more obvious what your service does do. I don't know what "geocoding" is. Converting things to/from "text" is meaningless. You have to get all the way down ... way down, past authentication to the details of the `q` query parameter before it actually tells you.

At the top you should have a diagram like this:

Lat, lon <- opencage -> address

With a few examples underneath.


"Past authentication", so you're looking at the https://opencagedata.com/api page. Most people go to the homepage first. Great feedback, we should make it clearer on that page and add examples earlier. Thanks!


Ah yes - I clicked on "Makers of the OpenCage Geocoding API" on your blog post which I assumed would go to your homepage (on mobile so it's a bit harder to tell).

Your actual homepage is indeed much better.


You could probably set up a rudimentary version of the service this influx of users is looking for in the time it took to write this article. Just grab the lat/long of each US area code off of Wikipedia and there you go; at least it's something. No, it's not the phone's current position or anything like that, but IP geolocation is just as imperfect when it's not based on triangulation. Case in point: Google has plenty of IPs that geolocate to Mountain View but point to machines that are in Asia.
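
To be clear about how rudimentary that would be, here is roughly the kind of lookup I mean (the area codes and coordinates below are just a few illustrative values, not a real dataset):

    # Toy lookup: US area code -> rough lat/long of the region it was
    # originally assigned to. A handful of illustrative entries only.
    AREA_CODE_COORDS = {
        "212": (40.71, -74.01),   # New York, NY (approx.)
        "415": (37.77, -122.42),  # San Francisco, CA (approx.)
        "781": (42.42, -71.11),   # Medford / Boston area, MA (approx.)
    }

    def rough_location(phone):
        digits = "".join(ch for ch in phone if ch.isdigit())
        if digits.startswith("1"):   # drop the US country code if present
            digits = digits[1:]
        return AREA_CODE_COORDS.get(digits[:3])  # None if unknown

    print(rough_location("+1 415 555 0100"))  # -> (37.77, -122.42)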


Related: one reason I just started using the Rainforest API is because GitHub Copilot recommended it.

But also, last night I tried for 30 minutes to get it to write me some fairly simple HTML parsing code. The tricky part was that I couldn't use DOMParser since it was running on Cloudflare Workers, and it could never produce any working implementation using HTMLRewriter or regex, no matter how many examples I gave it.


Are you aware of the pitfalls of parsing HTML with regex?


Yes, but it started to try to use regex so I thought I'd see if it could at least be successful and it wasn't. Despite super simple HTML.

Anyways, I wrote a solution using HTMLRewriter in 10 minutes...


I'm an attorney. I've typed legal questions into ChatGPT and it has spit out answers that are grievously, 100%, libelously wrong. It has named individuals and said they committed crimes, when it is unquestionable they did no such thing.

I'm waiting for people to start calling me to ask questions about something ChatGPT said, and I'll tell them it's wrong. Then they'll start arguing with me and saying if ChatGPT said it, it must be right, and I must be wrong. And then I'll need to waste time proving that this idiotic chat bot that is spewing out garbage is, in fact, spewing out garbage.


You're trying to use a language model as an information reference. A translator can explain what a diplomat is saying but they can't perform their whole job.


? ChatGPT did not simply explain what someone else was saying. It created something completely new and completely false.


Way to miss the point.


Which was?


The biggest takeaway for me was that it was getting info from YouTube videos. Is it actually watching and learning from the videos, or were links to GitHub just included in the comments?


I think this is just an incorrect assumption on the part of the blog authors.


Cofounder here. We traced it back to two YouTube videos where a developer is coding a phone tracking solution. The map then shows the geographical center of India and it's claimed that this was the correct location. Then other users started putting the code on GitHub, then it got forked, then people created libraries, and now AI tools pick that up. We already tried contacting the YouTube authors, and we left comments on the GitHub repositories for months. Now we have to learn how to deal with ChatGPT. We also have no idea why a YouTube author would describe a completely non-working solution over 20 minutes.

Screenshots: https://imgur.com/a/sNR87c7 You can see the OpenCage logo on the bottom right of the images. We wrote a separate blog post about that about a year ago; we felt today's blog post would be too long if we added those screenshots, too.


But these videos were not part of the ChatGPT training data. This is most likely caused by the code on GitHub.

The claim in the blog post that "ChatGPT has picked up that content" is false.


Transcripts exist, and if there are multiple YouTube tutorials out there, the odds are very good the same content also exists as a few dozen plain-text articles.


> All suggestions are welcome.

They have to get an API key from you. What about a large warning at the start of that process telling them that this isn't a service you provide?



If you have to tell potential customers you don’t do something, maybe you should just do it instead.

ChatGPT as business line lead generator—is there anything it can’t do?


I remember a time when "I saw it on the internet" was a punchline for a joke about someone who's gullible or misinformed.


Fast, creative, and wrong isn't a winning trio. This is more evidence of ChatGPT being evolutionary and not revolutionary.


As a data scientist who has created AI applications and built many models over the last 10 years, I can say beware of ChatGPT. AI derived knowledge should be used only by those who understand its limits.

One of the simplest AIs is a recommender. We put guardrails on using its predictions inside ecommerce apps by limiting what it learns from (purchases, for instance) and limiting what it is used to predict (purchases). When Facebook uses a recommender, it learns from time-on-site (a value to FB but not necessarily to the user, and a complex behavior that can be comprised of many non-beneficial sub-behaviors) and uses it to recommend things that lead to more time-on-site. This application is dangerously devoid of guardrails, as so much recent evidence has shown.

Now we have a text-generating AI that has been trained on a great swath of human knowledge. That means the teachings of Gandhi as well as Hitler, etc. What do you expect it to "know" as truth? Generative AI that is used to generate thoughts from this training corpus MUST have contradictory and downright evil ideas, since it has no way to judge between the ideas it learns from.

Generative AI in this form can be nothing but psychopathic until guardrails can be devised to limit its psychopathic responses OR the corpus it learns from can be labeled in a way to flag what is "bad", if we can even agree on what that means.

Psychopaths can be useful if they are knowledgeable but beware, you are talking to a psychopath in ChatGPT.


Seeing the amount of effort people put into hacking/optimizing PageRank SEO, we will see lots of prompt manipulation by all businesses if ChatGPT becomes the de facto search. Preventing system gaming is going to be 1000x more difficult for an LLM, which is kind of a black box.


> This is not a service we provide. It is not a service we have ever provided, nor a service we have any plans to provide. Indeed, it is not a service we are technically capable of providing.

I'm curious: why not? It seems like a lot of people would be interested in this if you could figure out how to provide it.


How would this work?

If a phone number is for a mobile phone then looking up the location doesn't make sense at all: mobile phones are mobile.

I guess you could try and crawl an index of business phone numbers and associate those with the listed address for businesses, but that's a completely different business from running a geocoder.

You could provide a bit of geographical information about the first three digits of a US phone number. I imagine that's not what users are actually looking for though.


Phone numbers have geographic structure. For mobile phones it's just the area code, but for landlines there is also information in the exchange portion. For example, I grew up in Medford, MA, which is 781-39x-xxxx.

I expect there are also patterns in other countries?


> I expect there are also patterns in other countries?

No, I believe this concept of mobile phone numbers being tied to localities is specific to just a few countries - the USA, Canada, and I think South Africa - due to how local calls were billed there; for all other countries, mobile numbers are allocated from a single country-wide pool, so their mobile phone numbers have no geographic structure other than the country code.


What if there were a service where anyone can opt in to share their own location information? If there were a way to get across the hurdle of how the service provider knows the person is indeed who they claim to be…

My thought is that something like a phone number, email address, or domain name bypasses this problem. It is fairly trivial to verify that you have access to a phone number, email address, or domain name. Feels like this is one of those chicken-and-egg problems though. Why would anyone list their location at my service if nobody queries it, and why would anyone query my service if nobody lists their information with me?


Number portability. While there can be a small amount of correlation to the zip code, that quickly breaks down with ported number blocks, and they can end up anywhere in the US.

Also, with VoIP the number might reflect where calls to it avoid long-distance fees, but that has nothing to do with where it actually is.


Not in any European country that I know of. For example, the UK's geographical codes start with 01 and 02 (020 is London, 0238 is Southampton, 02820 is Ballycastle) and are never assigned to mobile phone numbers, which always start with 07.

Other countries work in much the same way, except the prefixes are, of course, different.


Yeah, but you can probably use https://github.com/google/libphonenumber for that.
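
For what it's worth, the Python port (the phonenumbers package) exposes that kind of coarse lookup directly. A rough sketch (the number is a fictional 555 example; the granularity you get back is usually a region or just the country):

    # pip install phonenumbers
    import phonenumbers
    from phonenumbers import geocoder

    num = phonenumbers.parse("+1 617 555 0123", None)  # fictional 555 number
    # Describes where the number was issued (often just a state or country,
    # especially for mobile numbers); it says nothing about where the phone is now.
    print(geocoder.description_for_number(num, "en"))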


This is what the inaccurate YouTube tutorials that they complain about in the linked blog post do. It only gets you country-level lookups for most phone numbers, county level at the very best. And it turns out users don't want that.

There's also the issue of moving. I have a Florida phone number. I haven't lived in Florida in a long time.


The service is possible:

If you are a mobile network operator.

Or, you can convince people to install something on their phone that sends you their location along with their phone number.


How would you go about reliably providing the location of someone's mobile phone without being their cell phone carrier?


By partnering with said cell phone carriers.

But I hope it would be illegal.


If movies have taught us anything, they need to remain on the call for at least 30 seconds for it to be traced.


> a lot of people would be interested in this

you mean like scammers and stalkers? (ok, and probably Meta)


You mean if they could figure out how to illegally track millions of people?


Soon we are going to have an AIrobots.txt


Is this not defamation, at least in some jurisdictions?


lol it recommended their API and gave Python code for using it

but the real API doesn't give the results that the user asked ChatGPT for

that is amusingly alarming


Not quite as alarming as these people most likely trying to stalk someone without their permission.


> Not quite as alarming as these people most likely trying to stalk someone without their permission.

It’s so common to want to know where an incoming call comes from that it’s built into iOS. It has nothing to do with stalking, just with guessing whether whoever is calling you is a scammer or a company trying to sell you stuff.


It’s pretty simple to look up where a phone number was issued; you can get a map or table that does this. I guess these people want the current physical location of the mobile phone. Either way, these are not customers you’d want.

Edit: reading the blog post from the same company linked above, it is indeed people using an external API for what is an incredibly simple country-code lookup. It is a shame that programming has come to this and that ChatGPT continues to propagate it. One way they could solve the problem would be to provide sample code that does the same thing using a built-in table, without using their API service (see the sketch below). Sure it’s work, but not much, and it will get people off your back ASAP.
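
Something like this is all that sample code would need to be (a sketch with a few illustrative prefixes; a real table would obviously be longer):

    # Tiny built-in table: international dialling prefix -> country.
    # A handful of illustrative entries, not a complete list.
    PREFIX_TO_COUNTRY = {
        "1": "USA/Canada",
        "44": "United Kingdom",
        "49": "Germany",
        "91": "India",
    }

    def country_for_number(phone):
        digits = phone.lstrip("+").replace(" ", "").replace("-", "")
        # Country codes are 1-3 digits; try the longest prefix first.
        for length in (3, 2, 1):
            country = PREFIX_TO_COUNTRY.get(digits[:length])
            if country:
                return country
        return "unknown"

    print(country_for_number("+44 20 7946 0958"))  # -> United Kingdom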


I'm willing to bet people asking ChatGPT to help them resolve a phone number to a location are much more likely to be stalkers than people who are trying to identify spam calls.


The obvious follow-up is to create the non-existent API endpoint but hook it into GPT so that it can hallucinate a convincing address based on the phone number. Take the GPT API key as input so that the caller is paying for this.

Bonus points for using ChatGPT to implement this end-to-end.


Our jobs are safe! For now...


Until someone figures out that we are all just hallucinating completely wrong code.


If this business suffers financial or reputational damage because of ChatGPT's misinformation, should OpenAI be liable?


It hallucinates that you can use 4 energy per turn in the Pokémon TCG and confidently tells you so. No idea where that would come from.


It's not like ChatGPT made this up. There were pre-existing YouTube tutorials and Python scripts available that used OpenCage and purported to do this. OpenCage even blogged about this problem almost a year ago[1].

Honestly it looks more like OpenCage is trying to rehash the same issue for more clicks by spinning it off the hugely popular ChatGPT keywords. I wouldn't be too surprised if they created the original Python utilities themselves just to get some publicity by denouncing them.

1. https://blog.opencagedata.com/post/we-can-not-convert-a-phon...


Hi, Ed from OpenCage here, author of the post.

We do have python tutorials and SDKs showing how to use our service for ... geocoding, the actual service we provide.

I wrote the post mainly to have a page I can point people to when they ask why "it isn't working". Rather than take the user through a tour of past posts, I need something simple they will hopefully read. But fair point, I can add a link to last year's post about the erroneous YouTube tutorials as well.

What I think you can't appreciate is the difference in scale. A faulty YouTube video drives a few users. In the last few weeks ChatGPT has been sending us several orders of magnitude more frustrated sign-ups.


I get frustrated at the number of things ChatGPT gets blamed for that aren't its fault. It is completely understandable that, if there are repos out on GitHub like the one for Phomber[1], ChatGPT would find that code and have no idea that it was phoney. Suggesting that ChatGPT just made this up out of thin air when you know it didn't is not very responsible.

1. https://github.com/s41r4j/phomber


You are blaming the victim. OpenAI is to blame.

They know what they are doing. They provide something that sounds over-confident about anything it says, knowing full well that it can't actually know whether what it generated is accurate, because it is designed to generate plausible sentences using statistics and probabilities, not verified facts from a database. On top of that, they trained it on an uncontrolled set of texts (though IIUC even a set of verified texts would not be enough; nothing guarantees that an LM would produce correct answers). And they provide it to the general population, which doesn't always understand very well how it works and, above all, its limitations. Including developers. Few people actually understand this technology, including myself.

Inevitably, it was going to end up causing issues.

This post factually presents a problematic situation for its authors. How ChatGPT works or how it can end up producing wrong results is irrelevant to the post's authors' problem. It just does, and it causes trouble because of the way OpenAI decided to handle things.

And it's not "fair enough, because this false stuff can be found on the internet".


OpenAI is providing a language model that has some understanding of the world (in order to do that language model thing) and is surprisingly correct in some situations as a knowledge base.

The key thing is that it isn't a knowledge base. It doesn't claim to have correct information about the world. It has the ability to translate a question in natural language into what would be an answer in natural language - but it isn't necessarily correct, because it's about the language rather than the knowledge.

People misusing the LLM as a knowledge base are at fault just as a person misusing a CD tray as a cup holder is at fault if it doesn't work correctly as a cup holder.


> People misusing the LLM as a knowledge base

… are doing this because it's an intuitive thing to do and because they don't understand that

> The key thing is that it isn't a knowledge base. It doesn't claim to have correct information about the world

and that it's

> because it's about the language rather than the knowledge.

or what the implications of all these things are.

And:

> is surprisingly correct in some situations as a knowledge base.

This is key. It seems to work well for this use case so people end up thinking it does.


Phomber is not the best example. Ed contacted the developer of that tool over a year ago about the issue and to remove mentions of OpenCage, and as far as I can see the author removed it: https://github.com/s41r4j/phomber/issues/4


That seems like a pretty nasty assertion to bandy around with zero evidence.


I cannot think of any other reason why the new blog post wouldn't have mentioned the obvious connection to the earlier issues that they had. They want to make it seem like ChatGPT invented this use case, but they know that the sample code ChatGPT learned from was mentioned in their previous blog post.


There's a vast chasm between "whoever wrote this article didn't think to link to a similar issue a year ago" and "the first incident was a malicious hoax".


I don't understand the original comment to suggest that. Rather: it's a known issue. ChatGPT does nothing new, and certainly doesn't do it by itself -- it just rehashes what others have already written. Like Google might send you visitors for something that's not even present on your website because others link to you mentioning it.

What the comment suggested was that they're now bringing this up again to get attention (and links) since it's combined with ChatGPT. That's not "malicious", but it's also not exactly "wow, we just realized this happens".


What the comment suggested is that the company deliberately created tools using their own API in a wrong way in order to write a blog post about it.

If that's not an accusation of being malicious I don't know what could be.


[flagged]


Ed is my co-founder; he writes all our blog posts because I suck at writing. He also does more than half of our podcast episodes https://thegeomob.com/podcast (the guy on the left). Last I saw him (yesterday) he was real.


That's another apparently evidence-free accusation.

Is there some undisclosed bad blood here?


No history at all. Do you have any undisclosed relationship? Hmm. My post was getting traction, then suddenly it's downvoted to the bottom. Maybe it was a social media management platform. Do you know anyone who runs one of those?


You broke the site guidelines pretty badly in this thread, even stooping to personal attacks. We ban accounts that do that, so please don't do that.

https://news.ycombinator.com/newsguidelines.html


Just re-checked the org chart. There's no social media personal assistant.


There's also no clear motive. They want to attract users to a fake feature on their free tier?


That explains why ChatGPT is confused.

It may be an old problem, but I guess users are more used to a random YouTube video having wrong information. But the computer is always right, so ChatGPT is always right, so users may be more annoyed to discover that the recommendation is wrong and blame the company instead of ChatGPT.


Devs making baby’s first mobile app add “request location information” permissions, the devices start giving them the phone’s GPS information in the form of lat/lon pairs, and those devs naturally look for a service to make that data useful. What they want is “reverse geocoding”, i.e. take a lat/lon pair and return information that makes sense to a human (country, state, nearby street address, etc.).

This is a service that OpenCage provides, and for whatever reason OpenCage happens to be one of the popular services for this use case. (Maybe it’s because you get the text description of location back right away without having to do a round trip through a heavyweight on-screen map, maybe their free tier allows more requests than most, maybe their api is easier to use, maybe they are lucky or skilled with SEO and their tutorial happens to be the first result for some common phrases, who knows.)
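
(To make that happy path concrete, a reverse geocode request looks roughly like the sketch below, using the requests library against OpenCage's documented JSON endpoint; YOUR-API-KEY is a placeholder and the response field names are from memory of their docs.)

    # pip install requests
    import requests

    params = {
        "q": "52.5200,13.4050",  # example coordinates (central Berlin)
        "key": "YOUR-API-KEY",   # placeholder, not a real key
    }
    resp = requests.get("https://api.opencagedata.com/geocode/v1/json",
                        params=params)
    # Each result should include a human-readable "formatted" address string.
    print(resp.json()["results"][0]["formatted"])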

So there’s this process that starts with a search for “convert phone location to address”, often involves the OpenCage api, and ends with a happy developer getting the information they wanted. Various algorithms pick up on the existence and repeated traversal of this happy path.

In another part of the internet, code tutorial content farms notice a demand for determining an incoming call’s location from the number that’s calling. They search for things like “convert phone number to location” and “convert phone number to address”. Some of these searches end up falling into the nearby well-trodden path of “convert phone location to address” and the content farmer is presented with the OpenCage api. They mess around with the api for a bit and find they can start from a phone number and get a successful api call that returns a lat/lon pair. A successful api call that returns legitimate-looking lat/lon data is all they need to make a video, they make it and post it. Higher-quality, more scrupulous code tutorials attempt to answer this same demand but find it’s not possible, so those tutorials don’t get made, leaving the less scrupulous ones that stop with a successful-looking api call to flourish in this space. The tutorial is doing well, so the content farms endlessly recycle it into blogspam.

As a result, OpenCage starts getting weird usage patterns, tracks them down, finds the source is these tutorials, and makes a post about it.

Some time later, ChatGPT is released. People are astounded with its ability to write code and start using it for this purpose. Naturally, some of those people have the same demand as the previous generation of devs who stumbled onto the unscrupulous code tutorials. Because of the blogspam, ChatGPT’s training data includes many variations on the tutorial, and just as naturally it ends up reproducing that tutorial when asked - except ChatGPT’s magic kicks in and instead of including (what its embeddings see as) some weird unrelated area-code-to-string nonsense from the tutorial, it just bullshits some plausible-sounding data plumbing code instead. Unfortunately, because the tutorial never worked in the first place, that weird hacky irrelevant bit that ChatGPT ignored happened to be the secret sauce that makes the whole thing superficially appear to work.

As a result, OpenCage starts getting weird usage patterns, tracks them down, finds the source is ChatGPT, and makes a post about it.

In deference to Hacker News’ policy of keeping comments pleasant, I will elide the analysis of the process that leads to comments accusing OpenCage of nefariously engineering the whole thing for attention.


Thanks for the above. (nice self-restraint in the last paragraph.) Things almost make sense now. Except one problem ... this implies that there are software developers who think to themselves "given a cell phone number, how can I get the phone's location?".

And it further implies that these people don't immediately follow that thought with: "That's surely impossible, since it would be a privacy nightmare if literally everyone in the world could track everyone else in the world's every move".

Or perhaps with this alternative thought, which would lead to the same conclusion: "let's not worry about privacy, how would this even work? Does every phone company in the world pro-actively send every customer's location data to OpenCage, just in case someone queries it? Or does OpenCage wait until it gets a query, and only then query the cell phone company 'just-in-time'? Both of these sound like a lot of work for each phone company to support ... what's the incentive?"

Honestly, I'm a bit surprised that the OpenCage blog post is so calm about this, instead of just yelling incoherently "why WHY why would anyone think like this?!?"



