Hacker News | cwsx's comments

I switched to them a few months ago; I was previously using DuckDuckGo (and Google before that). As most of you have probably noticed, Google search results have seriously dropped in quality over the last few years, especially in 2023. I'm no longer able to get meaningful results for almost any topic, especially technical ones; the only results are AI-generated (?) or obvious SEO spam websites. It takes me several different search terms and clicking through multiple results to find anything semi-relevant, and even then it's a shallow article that maybe summarises what I'm looking for. Unfortunately DDG seems to be going the same way.

Whereas Kagi reminds me of the 'old' Google search. The results are meaningful and relevant, not diluted with pages of generic article results. They also offer a lot of great customisation options, like being able to block or boost certain sites in results, and they have some built-in lists for common filler sites. I can't comment on the AI variation but I hear that's progressing well.

I wouldn't call myself a power user of Kagi, but even then I'm getting far better results than other search engines, definitely worth the price per month.

I'm not affiliated with them in any way, just thought I'd share my anecdotal experience.


> I wouldn't call myself a power user of Kagi, but even then I'm getting far better results than other search engines, definitely worth the price per month.

This only works as long as Kagi is a niche. The moment any search engine becomes commonplace I think they will inevitably succumb to SEO. Otherwise, they would have to change their methodologies every once in a while to completely flip the ecosystem.


I think it also has to do with incentives. If your business model is selling ads then you have a balancing act between user and customer satisfaction.

With Kagi as I understand it, the customer is the user since it’s a premium product that isn’t selling ads. There’s really no good reason for them not to just nuke bad actors.


Not necessarily. You'll still be able to nuke the whole domain from your results, permanently. That means you'll see the spam once, and getting a new domain promoted to the top takes time and effectively money.

I also hope that domains which get blocked by lots of people will get reviewed for global downranking, but I don't think that's happening yet?


That is true, I really wish I could tell Google to simply filter out learncpp.com and some other websites.


I managed to filter that, geeksforgeeks.org and towardsdatascience.com out with Kagi. It's quite helpful being able to slightly reduce prioritization on a per site basis so that instead of showing up as top result it'll be buried a bit but still accessible.


You can with browser extensions, fwiw.


uBlacklist can only block sites. Kagi can raise or lower sites in the ranking, and can pin sites to the top. Boosting sites in the results is more efficient than blocking spam sites one by one.


Since the browser extension only works on the front end, this just means you are hiding the site and receiving fewer results per page.
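
To illustrate the difference, here is a minimal sketch. It is not how uBlacklist or Kagi are actually implemented; the page size, the function names and the `.example` domains are all made up for the illustration. Hiding after the page is cut leaves gaps, while filtering before the page is cut backfills from lower-ranked results.

```python
from typing import List, Set

PAGE_SIZE = 5  # hypothetical page size, kept small for readability


def hide_client_side(ranked: List[str], blocked: Set[str]) -> List[str]:
    """Extension-style: the engine cuts a full page first, then blocked domains are hidden."""
    page = ranked[:PAGE_SIZE]
    return [domain for domain in page if domain not in blocked]


def filter_server_side(ranked: List[str], blocked: Set[str]) -> List[str]:
    """Engine-style: blocked domains are dropped before the page is cut."""
    kept = [domain for domain in ranked if domain not in blocked]
    return kept[:PAGE_SIZE]


# Made-up ranking, best result first.
ranked = [
    "spam-a.example", "docs.example", "spam-b.example", "blog.example",
    "wiki.example", "forum.example", "ref.example",
]
blocked = {"spam-a.example", "spam-b.example"}

client_page = hide_client_side(ranked, blocked)
server_page = filter_server_side(ranked, blocked)
print(len(client_page), client_page)  # 3 results left on the page
print(len(server_page), server_page)  # still a full page of 5
```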


> The moment any search engine becomes commonplace I think they will inevitably succumb to SEO

Thankfully when I come across an irrelevant domain in Kagi I can just remove it from any future search results completely. If enough people do that, it may show up on the "most commonly removed" list, inviting others to also ax it.

I rarely ever have an issue with spam on Kagi just by largely using the standard filters, and I'm confident this will remain the case.

Unrelated, but I really like that I can redirect all reddit URLs in search results to old.reddit.com, twitter to nitter, etc. It's very helpful when searching on mobile.
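
For anyone curious what such a redirect amounts to, here is a rough sketch of a host rewrite. It is not Kagi's rule syntax; the `HOST_REWRITES` table and `rewrite_url` are hypothetical names, and only the reddit mapping from above is filled in.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical host rewrites; add more entries as needed.
HOST_REWRITES = {
    "www.reddit.com": "old.reddit.com",
    "reddit.com": "old.reddit.com",
}


def rewrite_url(url: str) -> str:
    """Swap the host if a rewrite exists; keep path, query and fragment unchanged."""
    parts = urlsplit(url)
    new_host = HOST_REWRITES.get(parts.netloc.lower())
    if new_host is None:
        return url
    return urlunsplit(parts._replace(netloc=new_host))


print(rewrite_url("https://www.reddit.com/r/rust/comments/abc/?sort=top"))
print(rewrite_url("https://news.ycombinator.com/item?id=1"))
```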


If that SEO means removing ads and tracking from your page to get a higher rank, I'm cool with it. :)


Two-sided/platform market dynamics are really interesting to this economist.

I wonder if kagi's going to have to charge for listings some day, instead of users paying in, if they intend to grow substantially.


There is a good chance that it will remain niche due to the paid and forced-login model. This is a good thing. I hope they will manage to position themselves well as an alternative search engine with clean, unmanipulated results; and be careful about unhealthy (greedy) growth.


SEO should be called "GEO": it's Google optimization. Spam keyword blog sites only work because Google prioritizes that stuff. They're driven by ad revenue, so they're incentivized to show commercial sites over non-commercial ones, etc.


Except that the problem isn't specific to google search. The others are much the same.


Hopefully what will happen is that no single search engine will be dominant, ensuring that problem can't happen (we'll probably have other problems instead).


A paid search engine will always be a niche.


Google Search is a victim of its own success.

They are the biggest search engine; every SEO trick, every spam attack is spearheaded against them. But also, being the biggest and the inevitable, they can afford to blunt their search tool somewhat in order to show more lucrative sort-of-hits and sell more ads. The moral hazard of doing such a thing is always present for any market-dominating player.

Kagi, in comparison, is tiny, and almost nobody cares to attack their algorithms. Back in the 1990s, when Macs were a small minority in the PC-dominated world, they were the safest desktop machines, because almost nobody cared to write malware for them. Now that Macs are a sizable segment of computers in the hands of important people, they are targeted by malware all right.


> every SEO trick, every spam attack is spearheaded against them.

Sure, but also they're ignoring extremely basic issues. "every SEO trick" is one thing, "just copy the SO content and still get ranked on the first page" is them not caring. We can worry about them dealing with the complex issues after they address the low hanging fruit.


I've been curious for a while too and I've been trying to de-google myself a tiny bit each year (more or less dropped Chrome in 2023).

Once I actually grab a full time job again I wouldn't mind grabbing my own subscription here to try it out. I'm curious if 300 searches/month is truly enough for me, though. And what would happen if I go over that rate. Am I simply unable to search more for that month?


Fwiw, I initially burned through the free searches in a few days, so definitely not enough IMO. Add the fact that free searches never got refreshed for my account, and I was pretty much unable to properly test the service for months. But bangs still work after the limit, thus I kept it as default given that I heavily use bangs to search other services.

Still, I ended up subscribing, and after properly testing it, I can recommend it. The service is good and the blacklist feature is essential to me now; it's just that the free tier is shit.


The free searches aren't supposed to "refresh". They are once per account.


Look at your browser history to find out how much you search. I was surprised to find that I’m consistently nowhere near 300.


Yeah, I found the opposite for me, as I expected. I did a little under 400 queries in the last 30 days. I could definitely cut down a lot of redundant or simple searches to get under 300, but given how common it is for me to simply search random questions (or search around a lot for documentation via a search engine), I'd rather not have to worry about it.

On top of that, this is during a month without any job (where I'd search even more on the clock). I hear it's 1.5 cents per query over but I can imagine doing 600+ searches once I'm employed again.
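
Back-of-envelope version of that, as a sketch only: the 300-search allowance is the plan discussed here and the 1.5 cents per extra search is just what I've heard, not a verified price. The function name is mine.

```python
INCLUDED_SEARCHES = 300  # plan allowance discussed in the thread
OVERAGE_CENTS = 1.5      # per extra search, as reported by commenters (unverified)


def monthly_overage_dollars(searches: int) -> float:
    """Extra cost in dollars for searches beyond the included allowance."""
    extra = max(0, searches - INCLUDED_SEARCHES)
    return extra * OVERAGE_CENTS / 100


for searches in (250, 400, 600):
    print(searches, monthly_overage_dollars(searches))
```

Under those assumptions, 400 searches would be $1.50 over and 600 searches would be $4.50 over.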


> I was surprised to find that I’m consistently nowhere near 300

Per month?

My current Kagi searches from 3rd of January until today sits at 1256 searches. For sure I'd do 300 searches in a week, and on a particularly hairy day I might do it in a day.


300 per day??? That’s a search per minute for 5 hours. Are you even doing anything else?


Eh?

> 3rd of January until today sits at 1256 searches

1256 / 23 (days between today and Jan 3rd) = 54.6 searches on average per day.

Some days higher, some lower. Sometimes it can take a couple of tries to get the search right, so you do 5-10 searches in one minute maybe. Doesn't seem farfetched to me.
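
The arithmetic, spelled out as a quick sketch (the 23-day span and the 1256 total are the figures above; the variable names are just for illustration):

```python
total_searches = 1256
days = 23  # days between Jan 3rd and today, as counted above

per_day = total_searches / days
per_week = per_day * 7
hours_for_300 = 300 / 60  # hours needed at one search per minute

print(round(per_day, 1))  # 54.6
print(round(per_week))    # 382
print(hours_for_300)      # 5.0
```

So an average week is already above 300, and 300 in a single day only needs a burst of about one search per minute for five hours.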


> For sure I'd do 300 searches in a week, and on a particularly hairy day I might do it in a day.


Yeah, that's "on a particularly hairy day", not "per day" for a full month...


I didn’t say it was every day.


Ok, well, thanks for the intellectually stimulating discussion, I hope you have a nice day :)


Personally I'm fine with the 300 searches/month, however that means I don't use Kagi for searches that are extremely simple.


I just prefix "simple" searches with !g or !gi, bangs don't count against the limit


Why waste your mental energy on this? Searches cost like 1.5c once you go over the limit. It's not worth thinking about.

Also you can enable browsing history and use bookmarks to autofill stuff without having to use a search engine.


Yeah, once you hit the free limit, you get served a subscription wall when you try to do a search.


Does it have an option to exclude commercial websites? That'd be quite useful to me. Pretty much every time I try to find information about a product, all I find are sites trying to sell it to me (but I already have it and want to find information about it, damn it!).


Another Kagi user here. Yes, the customization of results is way better than any other search engine I've used. E.g., personalization can be manually set to lower or raise the weight of results from specific domains. This has become extremely useful not only to filter out bad sites, but also to increase relevant results when you regularly get information from sites like GitHub.

Stats are released about these as well, so you can easily copy heavily used filters [0].

[0] https://kagi.com/stats?stat=leaderboard


Interesting that HN is pinned way more than stackoverflow.


If they do (or have done) a user survey, it would be interesting to see where all the paying users are coming from. My guess is that a substantial number of users come from hearing about Kagi on HN or in HN comments.


There is a lens available in settings that seems like a good fit, though I haven’t tried it myself yet.

Small Web: results that favor noncommercial domains and topics.


Iirc there is an exclude (or at the very least, weights), though you'd have to do it by hand. Though I do think there is a social feature to install other people's weights.


I just signed up. You get 100 searches for free to try it out.


Small warning. If you click the "more" button at the bottom of a list of results, it silently does another search and deducts that from your remaining free searches.


I did that a couple months ago, and just signed up for a paid tier after I tried to go back to duckduckgo and started losing my mind. Kagi is better for discovering new content and mediocre places on the internet.


The question to ask about the obvious quality drop at Google is: is this intentional? Perhaps some cost-saving or ROI measure? Or was the motive always just to train their AI, and we simply helped with that?


Just perverse incentives.

Google isn't incentivized to be a good search engine.

They are incentivized to be just good enough that you don't go elsewhere while increasing the number of ads / paid results.


Commenting to follow, curious about the answer.

From what I've found through Google (with no real understanding of LLMs), 2^16 is the max tokens per minute for fine-tuning OpenAI's models via their platform. I don't believe this is the same as the training token count.

Then there's the context token limit, which is 16k for 3.5 turbo, but I don't think that's relevant here.

Though somebody please tell me why I'm wrong, I'm still trying to wrap my head around the training side.


You are right to be curious. The encoding used by both GPT-3.5 and GPT-4 is called `cl100k_base`, which immediately and correctly suggests that there are about 100K tokens.


Amazing, thanks for the reply, I'm finding some good resources after a quick search of `cl100k_base`.

If you have any other resources (for anything AI related) please share!


Their tokenizer is open source: https://github.com/openai/tiktoken

Data files that contain vocabulary are listed here: https://github.com/openai/tiktoken/blob/9e79899bc248d5313c7d...


GPT 2 and 3 used the p50K right? Then GPT-4 used cl100K



I'm a completely unknown artist with 4 songs on Spotify, mostly released during 2020. In total I'm at 54388 plays, which has earned $42.41. This is across all platforms, though Spotify is 95% of the plays.

I'm not sure if Spotify has dropped their payout per play since 2020 but I'm likely at the lowest payout rate and I'd say it's not terrible (although it's not great). You also get paid more for Spotify premium streams, which afaik was the majority of my streams.
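
For reference, the per-play rate implied by those numbers, as a quick sketch. It is a blended figure across all platforms, not an official Spotify rate, and the variable names are just for illustration.

```python
total_plays = 54388
total_earned_usd = 42.41

per_play = total_earned_usd / total_plays
per_thousand = per_play * 1000
plays_per_dollar = total_plays / total_earned_usd

print(round(per_play, 5))       # 0.00078
print(round(per_thousand, 2))   # 0.78
print(round(plays_per_dollar))  # 1282
```

That works out to roughly $0.78 per 1,000 plays, or about 1,282 plays per dollar.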


Phillip J Airfry made me smile, love the name.


Do this frequently enough and they'll start inviting you back with free travel + accommodation.


Shout-out Nebula, an alternative YT/creator platform which has no ads or sponsor segments. It's a monthly subscription but fairly cheap, and it gives you access to all videos on the platform (unlike patreon which is for a single creator). The monthly subscription cost is then split between all creators on the platform.

It's not a 1-1 alternative to YT as creators have to opt in, so most (imo low effort) videos/creators won't be on there. It's fantastic for any tech/engineering/history/news though, high quality/effort vids with no bullshit.

Note: I have no vested interest in Nebula, I'm just a user that's happy to support good creators and a platform that's actively opposed to advertising.

If this counts as an ad/spam - let me know and I'll delete this comment.


I don't think I've ever completed a training course on time, only after getting "urgent: 7 days to complete".

I've got severe ADHD so these types of assessments are near impossible due to the slow dialogue and forced wait time. Though most of these courses give you multiple (or unlimited) attempts, so I'll screenshot each slide + wrong answers and brute force until I'm done. At least I can get other stuff done in the meantime.


What I do instead is attempt to reverse engineer which JavaScript function I need to call, or which web request I need to send, to make it think I completed the test/videos.

A common easy way is to just re-enable the “next” button. Even if it takes me longer than just doing it legitimately, I find it more educational.


I was annoyed when a government security training exercise ranked "send the data using a trusted courier" as more secure than encrypting the data, and marked my answer incorrect.

My manager was impressed when I scored 110%.


That's probably a bad idea from a legal point of view.


The approach of trying to know exactly what the user does in their browser on their own computer, and from that information to conclude whether something in front of the computer happened (the learning), is nonsensical at best and a crime at worst (when done without consent or secretly). Allow the user to give deliberate signals by marking parts as done and, if necessary, analyze the datetimes of those signals.


But the OP is talking about training their work assigned them. Which is presumably done on company hardware.

You have no expectation of privacy on company hardware, they are allowed to do anything they want to do with the signal you provide them...


What expectations I have, is my own business. I definitely don't agree with being mistreated by an employer and it would probably make me look for another job, if an employer did that kind of thing.


'Expectation of privacy' is a legal term. It doesn't have much to do with your personal expectations.

In any case, what I meant is that actively lying and deceiving about whether you did the compliance training (or any other employer ordered training) can probably get you into hot water.


I do the same


The problem wasn't necessarily the A/B testing so much as what came next: Cambridge Analytica. This current headline seems to carry the same message, that being able to manipulate a user's mental health is beneficial to Meta.


Collecting the data isn't a problem, but utilization is? Seems to be the same problem.


When I lived in Sydney I had a few friends who frequently bought Rikodeine (dihydrocodeine cough syrup), which is also meant to be tracked via pharmacists recording IDs. My friends knew which pharmacies didn't bother to track it, normally if you were a 'repeat' customer and knew the pharmacist. They were able to buy ~10 bottles in a day from 8-10 pharmacies, repeating every few days. This was also in Sydney's CBD, so there's a pharmacy every other block.

I assume the situation is similar with pseudoephedrine, though the drug class/restrictions may be different.


Interesting you can get rikodeine at all, given that even mild (8mg) codeine tablets are now prescription only here. That stuff (from a quick search) appears to still be available OTC.

As someone who has always used low-dose codeine+whatever analgesics when I have had a bad cold or migraine, I resent these being removed from the market recently as well. People can tell me that paracetamol and ibuprofen are just as effective until they're blue in the face, but that little bit of opiate uplift when I'm feeling like absolute shit was very psychologically helpful... Oh well, this is the world we live in.


Relative to infinity, anything finite is small. I agree with OP, if anything it should be "finite universe".

