mateuszbuda · on June 29, 2024

I tried not buying food with added sugar but it’s surprisingly difficult. Here is an interesting analysis I did some time ago which shows that for half of the food items, sugar is the main ingredient: https://scrapingfish.com/blog/scraping-walmart

mateuszbuda · on May 18, 2024

At https://scrapingfish.com/ we have both options: usage-based (https://scrapingfish.com/buy) and subscriptions (a monthly unlimited requests plan, https://scrapingfish.com/unlimited). Despite subscriptions being the cheaper option per request, usage-based is far more popular. Fewer than 10% of our users have subscribed to the unlimited monthly plan. I guess usage-based plans give users more control over how much they spend, or maybe they simply don't want to subscribe to another service.

telepathy · on May 18, 2024

Residential IP bandwidth is a commodity priced per GB. The rest is available as open source for free. This isn't exactly a SaaS.

mateuszbuda · on April 6, 2024

I’m still working on a web scraping API (https://scrapingfish.com/). For some people it’s an evil bot, but for others it’s an enabler of public data access. I think it’s useful.

mateuszbuda · on April 1, 2024

Can anyone share their experience with https://ollama.com/ ?

bovem · on April 1, 2024

I love it. It is easy to install and containerized.

Its API is great if you want to integrate it with your code editor or create your own applications.

I have written a blog [1] on the process of deployment and integration with neovim and vscode.

I also created an application [2] to chat with LLMs by adding the context of a PDF document.

Update: I would like to add that because the API is simple and Ollama is now available on Windows I don’t have to share my GPU between multiple VMs to interact with it.

[1] https://www.avni.sh/posts/homelab/self-hosting-ollama/ [2] https://github.com/bovem/chat-with-doc
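As a rough illustration of the API integration mentioned above, here is a minimal sketch that sends a prompt to a locally running Ollama server over its REST endpoint. It assumes Ollama is listening on its default port (11434) and that a model has already been pulled; the model name `llama2` and the helper names are placeholders, not something taken from the comment.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_generate_request(prompt, model="llama2"):
    """Build a non-streaming generate request (model name is a placeholder)."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt, model="llama2"):
    """Send the prompt and return the generated text from the JSON reply."""
    with urllib.request.urlopen(build_generate_request(prompt, model)) as resp:
        return json.loads(resp.read())["response"]
```

With the server running, `generate("Why is the sky blue?")` should return the model's answer as a string; the same request shape is what an editor plugin or small app would send.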

notjulianjaynes · on April 1, 2024

Stupid easy. I was speaking with an LLM after pasting two lines into terminal.

cgopalan · on April 1, 2024

I use it on my 2015 MacBook Pro. It's amazing how quickly you can get set up, kudos to the authors. It's a dog in terms of response time for questions, but that's expected with my machine configuration.

Also they have Python and (less relevant to me) JavaScript libraries. So I assume you don't have to go through LangChain anymore.

nipponese · on April 1, 2024

Installing additional LLMs is a single command. I am currently loving Dolphin.

jdwyah · on April 1, 2024

super easy. fun to play with. fast.

we screwed around with it on a live stream: https://www.youtube.com/live/3YhBoox4JvQ?si=dkni5LY3EALnWVuE...

If you're writing something that will run on someone's local machine I think we're at the point where you can start building with the assumption that they'll have a local, fast, decent LLM.

auggierose · on April 1, 2024

> If you're writing something that will run on someone's local machine I think we're at the point where you can start building with the assumption that they'll have a local, fast, decent LLM.

I don't believe that at all. I don't have any kind of local LLM. My mother doesn't, either. Nor does my sister. My girlfriend? Nope.

mateuszbuda · on March 26, 2024

We keep working on a web scraping API with a custom-made mobile proxy pool: https://scrapingfish.com/

There is no AI in it so far, but we are considering adding support for parsing the result to extract data using an LLM.

unsupp0rted · on March 27, 2024

Suggest you let users sign up and make a few free requests. I would try this, but I'm not prepared to hand over my credit card number for no reason.

mateuszbuda · on March 12, 2024

To give you a reference point, at https://scrapingfish.com/ we charge $0.002 per API call, but in our case the API call provides the value by itself: access to mobile proxies and a cluster of browsers. For you, I would recommend either giving API access for free, as I assume users already pay for the product, or including API access only in higher plans, which could be an incentive for some users to upgrade.

pknerd · on March 13, 2024

How is scraping fish different from a million other services?

muzani · on March 13, 2024

When there's a million other services, they don't have to be different. It's like VPNs or burgers. It just has to work and not be too expensive.

mateuszbuda · on March 13, 2024

Two main differentiators. 1. Pricing. We charge per request, and the cost of each successful request is the same, as opposed to the misleading API credit systems used by others. Also, we sell request packs that are valid for up to 1 year, as opposed to monthly plans with expiring unused API credits. 2. We use our own high-quality, ethically sourced mobile proxies as opposed to shared pools from large proxy providers (https://scrapingfish.com/how-ips-for-web-scraping-are-source...).

mateuszbuda · on Feb 5, 2024

Only around 5% of the products from IndieHackers generate a monthly revenue exceeding ~$8,333 (~$100k/year).

Source: https://scrapingfish.com/blog/indie-hackers-revenue

goenning · on Feb 5, 2024

Part-time freelancing is underrated. More people should try it. That frees up a lot of time for indie hacking.

gchamonlive · on Feb 5, 2024

Do you have any tips for starting out with freelancing gigs? I think the main issue for me would be visibility. I still need to build a portfolio that speaks for my skillset, because right now I don't think I have many demonstrable skills outside of a traditional hiring interview pipeline.

philip1209 · on Feb 5, 2024

I run this free community for fractional tech workers - come check it out:

https://news.ycombinator.com/item?id=37419830

The reality is that being self-employed requires building business skills and being able to sell your skills. I previously founded a VC-funded developer marketplace, and the people that won all the jobs were the great communicators - not the most experienced (or inexpensive) developers.

Fractional work is a nice in-between where you ideally have a retained part-time contract, paid weekly or monthly, so that you aren't constantly looking for new projects.

Roark66 · on Feb 5, 2024

Exactly this. I remember back when elance was a thing I signed up and I won my first gig in 2 days. I asked my client why he chose me (charging $1k) when others were bidding to do the job for $50 and he said he liked the message I sent him... Communication is everything.

thibaut_barrere · on Feb 5, 2024

That is way higher than what I would have expected!

mateuszbuda · on Jan 23, 2024

A web scraping API: https://scrapingfish.com/

pjot · on Jan 23, 2024

I’m curious who your customers are; being tech savvy enough to use an API, but not enough to scrape the web, confuses me. Mind explaining how you make money?

fullspectrumdev · on Jan 23, 2024

They seem to handle all the more annoying parts of web scraping: bypassing anti-scraping measures such as rate limits and captchas.

nomel · on Jan 23, 2024

Being tech savvy, there's no way I'm implementing that for $0.002 per request. $150 gets you 75k scrapes.
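The arithmetic behind that figure can be checked with a short sketch. The $0.002 per-request price comes from the thread; the function name and the rounding choice are illustrative assumptions.

```python
PRICE_PER_REQUEST_USD = 0.002  # per-request price quoted in the thread


def requests_for_budget(budget_usd, price=PRICE_PER_REQUEST_USD):
    """Number of requests a budget buys at a flat per-request price."""
    # round() guards against floating-point noise in the division.
    return round(budget_usd / price)


print(requests_for_budget(150))
```

At a flat per-request price, $150 works out to 75,000 requests, which matches the "75k scrapes" quoted above.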

kube-system · on Jan 23, 2024

And if you're not doing it yourself, $150 does not get you very much labor.

mateuszbuda · on Dec 13, 2023

I've created a mobile proxy pool for a personal project of web scraping real estate data: https://scrapingfish.com/blog/byo-mobile-proxy-for-web-scrap...

Then, I expanded the infrastructure and built a web scraping API on top of it: https://scrapingfish.com

pcthrowaway · on Dec 13, 2023

This is fascinating. So does this mean that if I'm using internet via my mobile network, I can generate multiple IPs that I can use, and use them all simultaneously, or alternately?

Is there a concern of violating mobile network policies in doing so?

mateuszbuda · on Nov 5, 2023

0.2 cents is how much a single request costs for a well-protected website where web scrapers look for emails (e.g. LinkedIn): https://scrapingfish.com/#pricing

Paying an additional 0.2 cents per request, if it can significantly improve your success rate, is not really that much, and some people use LLMs for even simpler parsing tasks to save time on development effort.

axlee · on Nov 5, 2023

I don't see how that contradicts my point. With your provider, we're talking about a 100% price increase, for what can't be more than a few points of accuracy in return (and a huuuuge slowdown, because LLMs are slow). At scale, it's all about going through a lot of pages, and fast: accuracy is a bonus, and being 90% accurate is better than being 99% accurate if the throughput is divided by 10 or more.
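The throughput argument can be made concrete with a small sketch. The page rates below are hypothetical numbers chosen only to reflect the "divided by 10" scenario; the 90% and 99% accuracy figures are the ones from the comment.

```python
def correct_pages_per_hour(pages_per_hour, accuracy):
    """Expected number of correctly parsed pages per hour."""
    return pages_per_hour * accuracy


# Hypothetical rates: the LLM-based parser is assumed to be 10x slower.
fast_parser = correct_pages_per_hour(10_000, 0.90)
llm_parser = correct_pages_per_hour(1_000, 0.99)

print(fast_parser, llm_parser)
```

Under these assumptions the faster, less accurate parser still yields roughly nine times as many correct pages per hour, before even counting the doubled per-request cost.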

victorbjorklund · on Nov 5, 2023

No one uses a SaaS like that one for large-scale scraping (billions of requests).