Search.chatgpt.com domain and SSL cert have been created (chatgpt.com)
127 points by daolf 9 months ago | 117 comments



Imagine being an innocent developer trying to spin up some internal dev tooling and accidentally landing on the front page of HN, misinterpreted as an attack against Google that could affect both companies' stock.


Ah well, I like your joke, and I know GPT is moving fast, but it's unlikely an innocent dev could change the DNS records of chatgpt.com.


Isn't that what DevOps is supposed to be?


well, I mean, there's a review process too, if it's done right? Something business critical (like code, keys, and I'd include DNS records) should still need two pairs of eyeballs and a process before any changes are made.

DevOps doesn't necessarily mean unrestricted control.


Only the guilty ones can?



Sama & Lex Fridman literally talked about disrupting search in their latest podcast.



I hope they don't use this level of DNS to set up internal tooling.


I'm taking this opportunity to once again ask for the widespread adoption of the Name Constraints extension in x509, and subsequent roll-out of constrained intermediate CA certs signed by a publicly trusted root.

Would be so convenient to have an intermediate CA cert constrained to *.my-name.com to avoid situations like this. Being forced either to run a private PKI or to use wildcards to avoid leaking host names is so annoying.
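
To make it concrete, here's a minimal sketch (Python with the pyca/cryptography package; the key types, names and lifetimes are all illustrative) of issuing such a constrained intermediate:

    # Intermediate CA whose Name Constraints extension limits it to
    # my-name.com and its subdomains (RFC 5280 DNS-name matching).
    import datetime
    from cryptography import x509
    from cryptography.x509.oid import NameOID
    from cryptography.hazmat.primitives import hashes
    from cryptography.hazmat.primitives.asymmetric import ec

    root_key = ec.generate_private_key(ec.SECP256R1())   # stand-in for the public root's key
    inter_key = ec.generate_private_key(ec.SECP256R1())
    now = datetime.datetime.now(datetime.timezone.utc)

    inter_cert = (
        x509.CertificateBuilder()
        .subject_name(x509.Name([x509.NameAttribute(NameOID.COMMON_NAME, "my-name.com CA")]))
        .issuer_name(x509.Name([x509.NameAttribute(NameOID.COMMON_NAME, "Example Public Root")]))
        .public_key(inter_key.public_key())
        .serial_number(x509.random_serial_number())
        .not_valid_before(now)
        .not_valid_after(now + datetime.timedelta(days=365))
        # a CA, but one that may only sign leaf certs...
        .add_extension(x509.BasicConstraints(ca=True, path_length=0), critical=True)
        # ...and only for my-name.com and names under it
        .add_extension(
            x509.NameConstraints(
                permitted_subtrees=[x509.DNSName("my-name.com")],
                excluded_subtrees=None,
            ),
            critical=True,
        )
        .sign(root_key, hashes.SHA256())
    )

A relying browser is supposed to reject any leaf this CA signs for a name outside my-name.com, which is exactly what would make handing one out safe.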


The point of certificate transparency is to have a public audit log of every certificate issued. Even if you had your own CA, you would be obliged to report every certificate you issue to the CT. This is a feature, not a bug.


Certificate Transparency needs to serve website owners, not some greater good. Knowing that someone issued a wildcard is enough.


Certificate Transparency really serves the end user.

Because the most popular browsers (at least Chrome and Safari) generally require CT logged certificates, if you want to successfully perform a MitM attack against any user, even just some individual user, even controlling a CA, you still can't do so without publishing your fraudulent certificate to a CT log.

This is the important function of the CT log. It is an effective balance against compromised CAs and governments that might abuse CAs, because it causes such attacks to become quickly tamper-evident.

I don't think it would be possible for a system like this to be effective without publishing the actual certificate to the log.


I don't follow your threat model.

Let's say the browser is fine with CT if either the leaf or the intermediate certificate is logged.

If you want to issue a fake certificate, you either need to log it, or you need to issue a fake intermediate certificate and log that.

Either way, it's visible to the website owner (and other people likely won't care anyway).


It would be completely possible to do it that way, but doing it this way ensures that at no point does certificate issuance become opaque and impossible to scrutinize. We want to ensure that CAs follow certain rules, and CT logs are one way to do this. For example, a CA should not issue a certificate with a forged "not before" time. There are certainly many more cases like this.

Public CT logs mean that the property of transparent certificate issuance extends to the entire Internet, which is good. If you want private certs, you can use a private CA and deploy it to the machines in your domain. Totally reasonable alternative in my opinion.


You can just buy a regular wildcard certificate for *.my-name.com

If your organisation is competent enough to handle an intermediate CA certificate safely, you're certainly competent to handle a wildcard cert safely which is a much easier task.

Sadly it's unlikely you'll ever see the Name Constraints extension adopted. All it takes is one model of 15 year old smart TV failing to respect it, and the CA/Browser Forum will consider it too dangerous to allow.


While I agree in principle, the neat thing about the intermediate CA is that it can be centralised and support ACME, which makes maintaining the certs so much easier.

In my current org we have hundreds of TLS termination "configuration points" (CDNs, cloud load balancers, networking appliances, k8s ingress controllers, raw VMs). We have standardised on ACME-issued certs for almost everything. Using a wildcard certificate would force us back to manual cert-updating procedures or finicky scripts, undoubtedly causing issues when certs expire.

(Not to mention the trust boundaries. An org can be competent enough to handle an in-house CA securely, and simultaneously have a bunch of quasi-sloppy vendors for stuff like the visitor badge kiosk.)

But I sadly agree that it will probably never happen…


This would be so great. It also just mirrors the DNS trust model nicely, which is what’s used for X.509 trust anyway by most CAs.


Let’s Encrypt and similar ACME-compliant services allow you to get wildcard certs through their DNS-01 challenge.
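
Under the hood, DNS-01 just asks you to publish a value in a TXT record at _acme-challenge.<domain>, which proves zone control and therefore entitles you to *.<domain>. A minimal sketch of checking that the record has propagated before triggering validation (Python with dnspython; the domain and token value are made up):

    import dns.resolver

    expected = "gfj9Xq...Rg85nM"  # value your ACME client asked you to publish
    answers = dns.resolver.resolve("_acme-challenge.example.com", "TXT")
    published = {b"".join(rdata.strings).decode() for rdata in answers}
    if expected not in published:
        raise SystemExit("TXT record not visible yet; wait for DNS propagation")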


A wildcard cert is an unnecessary risk, though.

Just because I trust a server to hold the cert for preview.example.com doesn’t mean I’d want it to be able to pose as prod.example.com, for example.


Why? As I understand it, the domain owner can assign the name you “trust” to any server already. Might as well trust all names by that domain owner.


Because if your one wildcard cert gets compromised somehow, the attacker can now impersonate every subdomain of yours. Consider what happens if there's an old test box called daves-test-box.example.com that has a copy of the wildcard cert valid for *.example.com. Dave quits and never updates his box. Eventually an unpatched CVE gets used to steal the cert. Now the attacker can phish or MitM your users of www.example.com using the stolen cert, and browsers will trust it. If, instead of wildcard certs, you'd used specific ones, then the only thing the attacker can do is MitM or phish the users of Dave's test box, which is approximately zero people.

There are certainly other strategies and best practices that can mitigate the risk in this scenario, but not using wildcards is a good one to include.


It depends on the system design. If I have an organization like Google, with many employees, *.google.com would be a horrible cert.

That's not every organization.

I maintain many groups of /related/ servers (including dynamic ones which appear and disappear at whim). There, a wildcard makes a lot of sense. If I have https://[username].[domain]/, https://[git project].[domain]/, https://[client].[domain]/ or similar, that integrates nicely with a lot of security infrastructure in ways which http://[domain]/client/ doesn't.

E.g. A network can filter https based on domains, but can't look inside the envelope. A browser will highlight domains to users who want to know where they are. Etc.

There are also many good reasons why managing one credential per server (which can map to many domains) is better practice than managing a dizzying array of credentials.

So I agree about the general best practice, but there are exceptions. Mandating (as opposed to recommending / encouraging) non-universal best practices is usually bad practice.


Yes, I could not agree more. I've been in both the dynamic-servers and the dizzying-array situations, and those are absolutely good reasons to use a wildcard. I worked for a company that white-labelled services so every customer had a vanity subdomain, and managing certs became an utter nightmare until we finally just bought a wildcard.

Recommending and encouraging non-wildcard certs is the optimal strategy. The only thing I would add is to default to non-wildcard and evaluate deviations case by case.


I don't think you realize the amount of scrutiny and approvals required to allow an employee to use a subdomain on the main domain in corp. A very, very limited number of people can make DNS changes to the main domain, with a crap ton of signatures and eyes monitoring the whole thing.

Dave can test his stuff on a newly bought testing domain or on the internal domains.


No one is saying wildcard certificates should be mandatory. An old test box shouldn't have a wildcard certificate for sure.

Yours is not an argument against wildcard certificates! Yes, like everything else ever, wildcard certificates can be misused.


Nobody is proposing to make them mandatory.

They were proposed here as an alternative to domain-restricted sub-CAs, but GP and I have given counterexamples as to why they're not (or at least not without downsides).


You're arguing against a strawman (an argument that nobody is making).

> No one is saying wildcard certificates should be mandatory.

Nor am I saying they shouldn't ever be used.

You may interpret it differently, but to me:

> Why? As I understand it, the domain owner can assign the name you “trust” to any server already. Might as well trust all names by that domain owner.

Essentially means "default to a wildcard." My example is absolutely a good reason why you should not default to a wildcard. There are situations where they make good sense. I use them myself. It's a terrible idea to use them everywhere and always, which is usually what ends up happening when wildcard certs are the default approach.


As with most things, it's a tradeoff of security vs convenience/usability. The CIA Triad comes to mind. I advocate for using separate domains for dev, staging, and prod (at least prod vs. non-prod) and for a wildcard cert for a non-prod domain, the convenience far outweighs the security risk IMHO.

But yeah generally speaking, it's best to avoid wildcards unless there's an actual benefit to using them, even when it's not a prod domain.


And the beautiful thing about domains is that they're hierarchical, so you can arbitrarily split your trust boundaries.


A cert for *.test.domain seems reasonable, for example, especially if the test infrastructure is dynamic, and you e.g. have CI/CD for a Cartesian product of:

* every branch

* several test data sets

* several feature flag / configuration sets

* ...


A server allowed to hold preview.example.com (and its associated DNS records) cannot pass dns-01 for *.example.com. Unless you have no authz on your DNS configuration, in which case this server is allowed to hold prod.example.com since it can edit that record.


I know, but what I mean is that just getting a wildcard cert and handing it to all servers that need it comes with some tradeoffs, as does requesting a single-host cert publicly for each host (mainly that I need to talk to a CA, which needs to be available, and it'll publicly log a possibly internal-only, preview etc. hostname).

Having domain-constrained sub-CA certificates granted by the exact same mechanism we use for wildcard certs today would combine the advantages of both.


The main point of DNS-01 is that it doesn't have to be the same machine requesting the cert and using it. You can easily use DNS-01 from your laptop to get a cert for prod boxes. I have a script that runs as a k8s cron job that uses DNS-01 to renew all the certs and stick them in k8s secrets automatically.


As a red teamer I agree - I always found this ridiculous


It's a clever way of getting around the accusations of stealing content. They can say that they are scraping it to make it searchable, just like Google.


It will be interesting to see what happens to copyright claims against ChatGPT. Google can just remove claimed content from its index, what will OpenAI do?


Use their legal and PR fund to fight back.


If all you have to do to beat these claims is throw money at the problem, then why haven't the other (better funded) search engines done that?


Clogged arteries, i.e. layers upon layers of risk-averse and clueless management. Top management paralysed by what the stock market may think of their ideas. You could also ask why Bing has been forever underperforming despite all the cash Microsoft has thrown at it so far.


I'm talking about the copyright claims. If Bing and Google stand down when people request their sites to be removed, how would OpenAI have the resources to push against that?


Logic need not apply. Lawyers will be paid to work around that.


I know about things like https://crt.sh but how could you be notified about something like this? Is there some service that allows you to be alerted whenever a new certificate is generated for a domain?


There is this tool from Facebook.

"Certificate Transparencyis an open framework which helps log, audit and monitor publicly-trusted TLS certificates on the Internet. This tool lets you search for certificates issued for a given domain and subscribe to notifications from Facebook regarding new certificates and potential phishing attacks."

https://developers.facebook.com/tools/ct/search/


crt.sh has RSS feeds for any search result you throw at it, e.g.: https://crt.sh/atom?identity=chatgpt.com

Most commercial offerings are about monitoring your own domain, e.g. from Cloudflare, SSLMate, etc.
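
If you don't want to deal with feeds, crt.sh also has a JSON output you can poll. A rough sketch (Python with requests; I'm assuming the field names crt.sh returns today, e.g. "id", "name_value"):

    import requests

    seen = set()  # persist this between runs in real use

    def new_entries(domain):
        resp = requests.get("https://crt.sh/",
                            params={"q": domain, "output": "json"},
                            timeout=30)
        resp.raise_for_status()
        fresh = [e for e in resp.json() if e["id"] not in seen]
        seen.update(e["id"] for e in fresh)
        return fresh

    for e in new_entries("chatgpt.com"):
        print(e["entry_timestamp"], e["issuer_name"], e["name_value"])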


You can set up your own certificate transparency listener, and get notified of every certificate created, in realtime, assuming you can handle the load. In my company we do this to scan new domains for potential phishing domains, to take them down before they become active.


And if you need a concrete tool, use something like Certstream [1].

[1] https://certstream.calidog.io/
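
For the curious, the listener side is tiny. A minimal sketch with the certstream Python client (assuming its documented callback API and message layout):

    import certstream

    def on_message(message, context):
        if message["message_type"] != "certificate_update":
            return
        for domain in message["data"]["leaf_cert"]["all_domains"]:
            if domain.endswith(".chatgpt.com"):  # your watch pattern here
                print("new cert covers:", domain)

    certstream.listen_for_events(on_message, url="wss://certstream.calidog.io/")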


The amount of JSON you get from it is terrifying. If you do play with it, I found that when piping through jq it could not keep up on my machine, but a jq clone called jaq handled it with no problem.


Interesting. What exactly are you looking for? Domain names that are similar like Micr0sotf.com and such?


There is a free service called Certstream [0]. It does not provide notifications; you need to ingest the stream, look for the patterns of interest to you, and handle notifications yourself. But it's fairly easy, and the service is commonly used by security teams.

[0] https://certstream.calidog.io/
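
On "patterns of interest": one toy approach to the Micr0sotf.com case mentioned above is fuzzy matching against a brand watchlist (Python stdlib; the watchlist and threshold are made up):

    from difflib import SequenceMatcher

    WATCHLIST = ["microsoft.com", "chatgpt.com"]

    def looks_like(domain, threshold=0.8):
        """Return the watched brand this domain resembles, if any."""
        for brand in WATCHLIST:
            if SequenceMatcher(None, domain.lower(), brand).ratio() >= threshold:
                return brand
        return None

    print(looks_like("micr0sotf.com"))  # -> "microsoft.com"

Real systems tend to use confusable/homoglyph permutations (dnstwist-style) rather than plain edit similarity, but the shape is the same: stream in, match, alert.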


Easy, just refresh the Hacker News front page 24/7.


Cloudflare provides this as a feature: you can choose to get an email every time a certificate for your domain is generated.


There is a rumour of an OpenAI event apparently next week, so that's likely what this is.

Some creators also seem to suggest they know what is going on; YouTuber MattVidPro hinted at it when talking about the gpt2-chatbot, mentioning he knew something but couldn't talk about it or he'd get sued.


Is there a source for this outside one guy on Twitter? If they were holding an event, I would imagine invites would have gone out a while ago, like for their dev day.


FWIW, that one guy on twitter is spot on more often than not in his leaks and predictions.


Search ChatGPT? Ah, but you repeat yourself: we already use LLMs to replace the antiquated notion of loading webpage snippets and being pointed in 5 different directions.

I'd say "just chatgpt it" is closer to entering the lexicon, and this URL just doesn't roll off the tongue:

>"here let me search.chatgpt that for you"

There I was, hoping sama-gpt5-chatbot had some creativity chops for naming new things, but they must have decided not to use it this time.


In my experience LLMs are terrible at naming things, usually some combination of cringe and silly


> >"here let me search.chatgpt that for you"

Why would you say it like that though? You don't say "let me google.com that for you".

In the same way you don't say "let me chat.openai it for you", likely search.chatgpt.com will just become the new default interface to chatgpt, and "to chatgpt" something will mean to look it up on search.chatgpt.com


Today's chatGPT and today's (not OpenAI) web search are very different creatures and experiences. Are you saying that today's chatGPT experience will be abandoned and replaced with a web search experience? That seems terribly unlikely to me.


Just today, I was having difficulty with printing in Autodesk software. Copilot (Preview) was right there at the bottom; I opened it and asked for steps to solve the issue. It told me, step by step, exactly what to do, and it worked as instructed, solving my issue. The original article from Autodesk was also linked. My immediate thought was: search is doomed.

Compare this to what I would have to do with Google. Open browser, type the query, guess which of the top 10 results are likely to answer the issue I was having. Click a few links to open them in new tabs. Read through them to see if the correct problem was being discussed. If not, try another link in the top 10 list. Repeat.

I can even imagine where in that interface Microsoft could place future ads. It's totally in the realm of possibility.

EDIT: At this point, ChatGPT is basically old-school "I'm Feeling Lucky" on steroids.


> Compare this to what I would have to do with Google. Open browser, type the query, guess which of the top 10 results are likely to answer the issue I was having. Click a few links to open them in new tabs. Read through them to see if the correct problem was being discussed. If not, try another link in the top 10 list. Repeat.

This is outdated. I repeated your experiment with Google query [how do I print from Autodesk] and the top result is the step by step instructions (not a link but the actual instructions, with a link to the source following). The second result is a link to Autodesk's own help docs for the print function. No ads above the fold.

But you make a point that people will keep assuming what they expect about how the search will unfold, and I admit this is in part due to the enshittified commercial and news-related (and other) results; the bias is coming from somewhere, right? But I would encourage anyone to at least verify this assumption before posting it as fact.


I do this all the time with Bing Copilot. I haven't visited Google for my daily questions in several months. It feels so outdated to hunt for a proper result in old goog.

So actually, what Copilot does is prefilter and condense results. If I like the answer and it seems legit, that's enough. Sometimes I follow up on the source. Most of the time I reprompt Copilot to list alternatives and find errors in them. That surfaces erroneous things like contradictions well enough.

But what's more useful is "how can I mix up two dictionaries with different keys one by one into a set"... 20 seconds later, I can read how, and don't have to search SO for a whole 20 minutes of reading.

But yes, it will impose the same changes as Google did. Before Google, you had to know where to get the information you needed, or already have it in your head. After Google, you don't have to. So everyone switched into stupid mode but became way more productive. So, I fear, this is the next wave of stupid :)


Well, that was me trying to simplify the query for the HN audience. Let me recall the actual conversation.

Me: How to add a missing monochrome.ctb file in Autodesk DWG TrueView?

[Up to this part Google is the same, pointing to the same link, but not showing any steps though.]

Copilot: Here are the steps. Put them here in this folder...

Me: I put it there as suggested. Still didn't work. How to make TrueView refer to another folder location?

Copilot: Here are the steps.

Me: It worked, thank you. [I really thanked it ;)]


A new search system is coming. At least I've already seen about 50 visits to my website from ChatGPT.


Why do you think this implies they are building a search service rather than just scraping your site for more training data?


That's an assumption; they could easily develop a search engine and incorporate the same advertising model used by Microsoft Bing.

I've noticed an interesting pattern: just before LLaMA 3 was released, OpenAI provided access to ChatGPT 3.5 without requiring registration.

Meta might also follow this path with its own search engine. Did you know that you can now ask questions directly on Instagram and receive answers from Meta AI using LLaMA 3?

Why go elsewhere for information when you can easily find it in the app you use every day?


Altman has said explicitly on several podcasts that they are working on search and that it was something he is particularly excited about.

I am guessing that this falls under the "we will steamroll" clause of OpenAI's gradual move towards AGI.


On the other hand, he said the "next copy of Google Search" would be boring. So let's see what they try to do differently.

https://youtu.be/jvqFAi7vkBc?t=4681


I can tell you that a piece of trillion dollar pie does not taste boring.


In an Oxford interview, he also said they weren't going to try to make their own chips. I think he is known for saying one thing while doing the other. Like the former board said, he isn't always candid.


Wasn't ChatGPT already doing searches for you when relevant, for example if you asked for recent news? I can't get it to do that anymore for some reason; right now it will share month-old publications.


That was through Bing, and Bing uses similar algorithms to Google for ranking. OpenAI can do way better.


Does this mean that the next gpt model could be online, like Gemini?


Sounds like they're building something like what Phind.com does


Or maybe something like Perplexity AI.


ChatGPT already has browsing; it's just way slower than Perplexity. I'm impressed by how quickly a Perplexity search runs and integrates the results into the LLM's response.


Perplexity is pretty amazing. I've tried asking really obscure questions and it just answers them almost immediately and with decent sources every time.


RIP Google. Yet another story of a company killed by "effective" management. The McKinsey school isn't so effective after all, is it.


Can someone tell me how all that leet coding doesn't lead to their AI having better coding ability? More fizz buzz has been spilled on Google whiteboards than in any other place on earth.


Bad organization. So what if you hire good engineers if you make them do the wrong tasks?


So does that mean my chat will be publicly searchable?


They would be opening themselves up to a class-action lawsuit, and no one would use their LLMs, so I doubt it.


I am not from the US, is a “class action lawsuit” supposed to be something scary? From the instances I’ve seen over the years, these lawsuits result mostly in tiny $22M fines and a slap on a wrist.


No


Isn't that conflicting with their Microsoft deal?


Unless this is the Microsoft deal, and they're using Bing as their indexing backend.


"chat gee pee tee search" is a terrible name. In comparison, bing is genius


I don't know about that, I always hated the name Bing...and I worked there and thought the product was actually pretty good. IMO Microsoft should have bought search.com and competed with the verb Google.


I don’t know – Microsoft has been pretty efficient at poisoning the name “Bing” for me, and I was initially neutral on it.


How about 'bong'


After a certain point of popularity, bad names become good names. Their awkwardness and abrasiveness become unique and distinctive.


I don’t know. ChatGPT has stuck.


Using Azure brings MS money. Even if Bing is not growing, the money comes in. If the search is good, it will explode and make MS buy more hardware, making them the biggest cloud for search; they can scale on it.

So the money will come in, one way or the other.


Probably uses Bing to an extent.


Ooo exciting :)


Most underwhelming product launch ever.


what's the point of this post?


Somebody has PUT options on Alphabet? :-)


[flagged]


What makes you think an AI can't be gamed in the same ways?


I think calling it AI horsepiss implies it will be just as bad, just in different ways (and I agree)


Search engine manipulation isn't really even a search problem. It's a Google problem. Google's struggle with search engine manipulation begins and ends with their advertisement based business model. As long as their main source of revenue is the very ads on the websites that pollute their search results, they will never find a solution to the problem.

Search engine spam increases the search count (apparently an important KPI), and it also increases ad impressions, both on the SERP and on the results pages.

This is a conclusion that is straight from Page and Brin themselves[1].

[1] http://infolab.stanford.edu/~backrub/google.html#a


That's overly pessimistic. It will be gamed to the same *ends*, but you'll have to game it in slightly different ways.


Lmao, what do you think the LLM is trained on? And the data set is frozen in the past because LLMs are polluting the web with generated garbage.


that's the point.


Well, if you were being sardonic, it was completely lost.


Lost on you, for sure. But here's an explanation: https://news.ycombinator.com/item?id=40235268


ChatGPT will happily sell you snake oil wrapped in your favourite flavor of BS.


But only after telling everyone a couple of times that doing so might be incredibly dangerous to humanity.


yes.


TLS, no?


There is no such thing as an "SSL certificate" or "TLS certificate". There are certificates, which are used in various protocols including SSL and TLS. You can use the same certificate for both. The name "SSL certificate" is just a shorthand indicating the intended purpose of the certificate, nothing more. As such there really is no point in being pedantic over SSL vs TLS.


More accurately they are X.509 certificates.


If we want to be pedantic, X.509 mainly defines the binary format of these certificates.


X.509


We all know what it means.


I know what "nucular" means too


tty. World Wide Web.



