Go Read: One Year with Money and App Engine (mattjibson.com)
189 points by kasbah on May 21, 2014 | 74 comments



I would like to thank you for creating Go Read and making it open source. It's what I use as my RSS reader, running on my own free App Engine quota. I hope I can contribute back some day.

  "30-day trial: This action cost me about 90% of my users. Many were angry and cursed at me on twitter. I agree that it is sad I did not say I was going to charge from the beginning, but I didn't know that I would be paying hundreds of dollars per month either."
Honestly I think the sense of entitlement nowadays is way too high, people use a product that is free and needed weeks or months of your own free time, and then complain about it when you change it.

If you decide to charge for it, you're a greedy bastard; if it's free instead, they say "if you aren't paying, you are the product". Others complain that the product doesn't work, when it's really a case of PEBCAK. When it's not, it means you're going to say goodbye to a couple of nights' sleep or a weekend or two, or maybe it's ONE feature away from being perfect (again).

...sometimes I hate people :(


People get to be mad (within reason) when a free service stops being free without prior warning. Because they too invested time and effort in a product, making it part of their life, only to have that wasted when terms change.

That is not to say, in any way, that you shouldn't charge - but don't expect users of a free application to change to a pay to use model en masse. Unless, of course, you don't have any competitors who do a similar thing for free...


True, but be polite about it: "Sorry, I'm disappointed because my expectations differ from yours. I'll find an alternative. Good luck with your project."


Having to pay a few dollars a month is hardly a trigger for calling something "wasted"... In my own humble opinion.


"Honestly I think the sense of entitlement nowadays is way too high, people use a product that is free and needed weeks or months of your own free time"

How is he going to recoup anything for his time if you use his product in a way he can't charge for?


Sometimes we spend time on a project for the good of the community, and that's ok, because sometimes I give back time, sometimes someone else does, and so on.

It's also ok to try to get some money out of your time, if it's desired.

Not ok: use someone else's time to gain personal profit, expect someone else to pay for your actions, etc


The transition wasn't handled very gracefully...

I was an early user, and one day I went to read my feeds, only to be greeted with a very curt "trial expired" screen.

That was the first indication of any form I'd had that the service was going pay-to-play.

I wasn't upset, but I could see how someone would be -- it was just a very abrupt, almost rude, way to communicate the change.

(Caveat: All from memory. Memory is unreliable, so take w/ salt.)


Interesting writeup, thanks.

I'm a user and have been negatively impacted by the feed fetching optimizations - daily feeds are often a few days behind and come in bunches. Two examples:

- Penny Arcade updates its comics Monday, Wednesday, and Friday, always at 7:01AM UTC, and then news other times during the week. It's Wednesday at 4:25PM UTC - 9 hours after - and goread hasn't picked it up.

- Dinosaur Comics is updated weekdays. I'll eventually get all of them, but usually two or three at a time. For example, yesterday I marked all my feeds as read; today, I have entries from Monday and Tuesday, but not from Wednesday.

I had hoped that the move to the everyone-pays model would give you the resources (either developer or quota) to fix these issues, but they've gotten no better or maybe worse.

I haven't looked at what you're doing, but I believe Google Reader used pubsubhubbub where available to reduce/eliminate polling for many popular feeds.

I honestly didn't have a great experience with my last bug report, so I haven't tried again.


goread has pubsubhubbub support. penny arcade doesn't use it, though. The penny arcade feeds are indeed giving errors. I'll look into those.


At Theneeds we use a "sliding windows" approach to deal with polling. Say you run the scraper every hour. Each feed F_i is scraped once every n_i polls. If the feed returns more news items than its average, n_i is decreased; if the feed returns no news, n_i is increased.

Perhaps with a similar trick you can run your scraper more frequently on some feeds, still keeping the cost under control.
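A minimal sketch of the scheme described above, for illustration only. The class name, the step size of 1, the interval bounds, and the smoothing factor are all assumptions; the comment does not say how Theneeds computes the average or how far n_i moves on each adjustment.

```python
from dataclasses import dataclass


@dataclass
class FeedSchedule:
    """Polling state for one feed F_i (hypothetical names and defaults)."""
    interval: int = 4        # n_i: scrape this feed once every `interval` scraper runs
    avg_items: float = 1.0   # running average of new items returned per scrape
    min_interval: int = 1    # assumed lower bound for n_i
    max_interval: int = 48   # assumed upper bound for n_i
    smoothing: float = 0.2   # assumed weight of the newest observation

    def due(self, run_number: int) -> bool:
        """Return True when the given scraper run should fetch this feed."""
        return run_number % self.interval == 0

    def record(self, new_items: int) -> None:
        """Adjust n_i after a scrape that returned `new_items` new entries."""
        if new_items > self.avg_items:
            # Busier than usual: poll more often.
            self.interval = max(self.min_interval, self.interval - 1)
        elif new_items == 0:
            # Nothing new: back off.
            self.interval = min(self.max_interval, self.interval + 1)
        # Update the running average after the comparison.
        self.avg_items += self.smoothing * (new_items - self.avg_items)
```

With an hourly scraper, a feed that keeps returning nothing drifts toward one fetch every 48 runs, while a busy feed converges on being fetched every run.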


Thanks! I don't really know when the website has a problem, when goread has a problem, and when I have a problem, so I end up assuming that it's a goread problem. I like how the desktop web version gives a "last refreshed" and "next refreshed" indication.


everyone-pays model works very well for everyone. since 2005. http://rssforward.com/


"Second, it is impossible to meet Google's terms of service with an RSS reader. They dictate that ads may not be shown along adult or copyrighted content. Determining that for external sources of data was not going to happen. I got a few emails about violations and added a system to prevent ads on certain feeds. But I was playing whack-a-mole and it would never end. Eventually I made a mistake and they banned my site. I gave up on ads at this point."

I've experienced this myself, and I'm hearing it more and more from others. Maybe this is a market need that is going unfulfilled.


There's a great company that used to be called AdSafe, now called Integral Ad Science, that offers a service to identify adult or other non-ad-friendly content.

http://integralads.com/

I don't think it'd be feasible to run an advertising company that was less strict on adult/copyrighted content. People who buy advertising are really risk-averse on that subject; they don't want to see a meme on twitter or wherever of their brand next to porn.


Two things.

1) The content that got my site kicked off adsense was text only, and it was a girl writing about going skinny dipping. It was tamer than much of cable TV, but I'm guessing some word frequency algorithm thought it was more than that. We aren't talking ads showing up next to porn. The great frustration is that if you aren't a big player, google doesn't give a shit and maybe they will turn your ads back on, maybe they won't. I'm guessing there are a lot of people out there that would jump on any ad network that didn't make you fear this all of the time.

2) My assumption, and I could be wrong about this, is that there are a lot of companies willing to advertise that aren't as paranoid. They may be able to pay less per click for the same quality of traffic, since there is less competition for ads on content that is less than PG.


RE: 2), the difficulty there is that you're now chasing the long tail of small dollars in a high-initial-touch business. Maybe a self-serve model could work, but the big corps and ad agencies representing the big money in the market are really really big on the brand safety thing. I'm not saying they're right, but as long as they're the ones signing the big checks, they don't really have to be.

I guess a self-serve business would be possible, but since your costs are the same for non-sketchy (for whatever silly definition of sketchy) content, you basically become a remarketer for the content that everyone else doesn't want to advertise on. You can sell it, sure, but it's a hard pitch.


I'm sure you're right, which is why I'm moving away from ads. The examples that give me some sliver of hope are reddit and tumblr, both of which have some form of ads.


Well, you'd think that Google could proactively not show ads on occasional 'bad' content rather than putting it on each individual website operator to avoid displaying the adsense iframe (or whatever it is) on content that might bother Google. They're already identifying it, so why not just not show the ad? And then only penalize accounts that have a really big % of unadvertisable content.


Interesting that this article was written in March, about a week before Google announced price drops of approximately 30% across its cloud offerings. I would have been interested to see the statistics for this year too. I don't think the chart tells us much about the sustained success (or otherwise) of your subscription model, since it only shows the 2 months after the introduction of subscriptions. Those 2 months show a rapid drop in income; has it stabilised this year?

I'm wondering, for the icons or other cacheable content, have you thought of using cloudflare or similar? What are your costs going to be this month, after getting onto the HN front page again?

After giving it a try, I noticed that this text is quite out of date now:

    We've just released! Read the blog post about it.


Congratulations on an income-generating project! That's an immense achievement. Perhaps the infrastructure cost is worth the headaches you spare yourself by being on GAE, but have you looked at dedicated infrastructure from the likes of OVH/Hetzner (on both providers you can get ~2TB storage/64GB mem boxes for around $100/mo)? I moved all my projects to a single OVH box and it's been a champ. It does mean a slightly higher maintenance burden (e.g., you need to roll your own backup/restore plan). Perhaps it is a poor fit, but that seems like a possible path to get those costs way under control.


Have to agree. Great article but it just confirmed my decision to stay away from app engine. What's the purpose of paying for something that's both harder to use and more costly than a traditional server? Datastore is great but I can get the benefits at a quarter of the price. At least with Heroku you have something that's easier to use.


FWIW, original lack of discussion [0] and [1]...

[0]: https://news.ycombinator.com/item?id=7402393 [1]: https://news.ycombinator.com/item?id=7408089


I actually find this very interesting. How days and times and luck play such a role. And submitter...

Having the same exact article submitted 3 different times and getting 2 votes, 4 votes, and then #1 front page. That's interesting.


I have no idea why this took off. I just thought I'd try to re-post it as I really like goread.io and the blog post. I don't think the submitter matters. Maybe a little if people see that it's self-promotion. But there is plenty of self-promotion on the front page. More of a timing and luck thing, I think.


Small bit of feedback: I'd love to see pricing for Goread before I go ahead and sign up. From the landing page, it's not even clear the project is active or that it costs any money.

That might be a good way to increase revenue.


That's a big one.

It should be part of the pitch, can even be after the screenshot on the front page. E.g.,

"Sign in and give it a try for the next 30 days. If you like what we've built, the subscription is only $3/Mo or $30/Year (we'll give you 2 mo free for making that commitment)". Or something like that.

It would have bugged me if I hadn't read that there was a subscription and came upon the Account page while perusing.


I'm actually a little surprised it costs so much to run on App Engine. Given that you've settled into a more predictable user model in terms of costs and probably growth, why not lower your costs by investing in dedicated server space or possibly reserved instances on AWS (unless there is an App Engine equivalent)?

Without knowing what your server load looks like, I would imagine you could save a couple hundred dollars a month in hosting, which would go right to your bottom line profit. A couple hundred dollars a month isn't huge, but at this point in your business that's say $2,400 a year. From the looks of it, that's at least 2-3 months worth of revenue or almost 5 months worth of profit.

I think it's at least worth considering with where your project is at right now.


A factor that's usually forgotten is the developer's time. App Engine saves you A LOT of time compared to other solutions.

* Once you write your app the App Engine way, it scales automatically. You don't need to re-write periodically to accommodate growth, add replication, shard your data, ...etc.

* Data is replicated to several servers and data centers. You don't worry about losing data due to a hard drive crash.

* If an instance crashes, or a whole datacenter dies, your app keeps running.

Basically, measure how much time you spend on non-code activities. App Engine saves you 90% of that.


"The App Engine way" involves writing a lot of things from scratch that normally you wouldn't have to, and spending a lot of time on kinds of optimization that you are unlikely ever to need. This offsets that perceived benefit in terms of developer time.

You could argue that it's good to spend this extra time because it makes your site more scalable. But if you ever reach the scale "the App Engine way" becomes relevant at, you are likely paying an order of magnitude more than you would if you bought dedicated servers. For all smaller cases, you are prematurely optimizing for scalability.

App Engine's reliability is easily overstated. App Engine has had a long history of issues with their services like datastore. When datastore does not function, there isn't any reasonable way to build a backup storage system because the platform is so restrictive. Frankly, backup services are cheap, and there are plenty of hosted database services if that's all you need. Ones which don't lock you in forever.

So, yeah, sites built on App Engine do go down for various reasons and still have to be monitored just like anything else. Except that you can't use any of the usual tools because the platform is completely unique and the underlying software is proprietary and completely secret to you as a platform user.

App Engine apps realistically only run on Google's servers. So even in the best case, you must have a lot of pure faith in Google. Because if you have reliability problems, problems with the platform's restrictions, problems with the terrible support or if the price skyrockets again, or if the platform gets wound down... you will be absolutely forced to rewrite your whole project to move it off of App Engine. You really won't have any idea that this is going to happen until man-years have already been invested.


It's great for standalone, spare-time projects that don't need 100% uptime or monitoring, though. I have low-traffic websites I haven't touched in months or years and they're still running fine. On anything else I would probably have shut them down by now due to some forced upgrade or other.

You pay a bit more developer effort up front (weeks, not years, for these projects) so you can walk away, but that's when I'm actually interested in working on the code. Migration is a non-issue since I don't plan to migrate. If I have to do any real work at all, I'll probably shut the site down anyway.


AWS doesn't have the datastore. From the article: "When you compare to AWS prices, no one mentions the datastore."


Yeah, I totally get that, but I guess I don't know what part of the Google App Engine datastore works better or is a better fit than Amazon's DynamoDB, or running your own MongoDB or CouchDB or HBase or Cassandra or whatever.

What does Google's datastore do that makes it worth sticking to and paying more for?


Excuse my naivety but isn't data store just a managed nosql database? You could switch over to Google Cloud (AWS but from Google) and still use Data Store or you could switch to any PaaS and use something like mongohq.


> Excuse my naivety but isn't data store just a managed nosql database?

Yes, so what? It's still something you need to cost out for an alternative.

> You could switch over to Google Cloud (AWS but from Google)

App Engine is part of Google Cloud (the AWS-equivalent umbrella of offerings). You probably mean "Google Compute Engine" instead of "Google Cloud" and "EC2" instead of "AWS".

> and still use Data Store

You could (Google Cloud Datastore seems to be very much the App Engine datastore outside of App Engine), but it's then a separate cost you have to include in the comparison.

> or you could switch to any PaaS and use something like mongohq

You could, but it's then a separate cost you have to include in the comparison.


They not only have a datastore, they have multiple!

RDS is mysql or postgres. DynamoDB is your nosql solution.

ElasticBeanstalk + DynamoDB is what you use in AWS to get the same "App Engine" type service.


Great read.

This is especially interesting to me. My side project http://www.longboxed.com was recently launched to a modicum of regular users (~300). My app runs on Heroku on the free tier with the 'Hobby Basic' level of the Heroku Postgres database. All told it costs me ~9 dollars a month. No big deal.

However, if I ever stepped the site up to another tier I'd be looking at ~50 bucks for the database and ~36 bucks for another process instance. These expenses can add up fast for a site that doesn't currently generate any money.

Anyway - it is nice to see examples of introducing a pay model into your app after it has launched.


Congratulations on launching. I've been following your development on tumblr and it's been great seeing your process.


Great stuff, Matt. I've always wanted to build something in the RSS space that could net me some extra cash. Write-ups like this let me know it's not a totally hopeless idea!

As for optimizing when to check for new feed items, I recently moved to the "moving average" strategy described here [1] for my personal feed reader [2]. Overall, more active feeds have really benefited from it while I still need to tweak the algorithm for less active feeds, I think.

Here's to another successful year for Go Read!

[1] http://www.rn.inf.tu-dresden.de/uploads/publikationen/feedpa...

[2] https://github.com/edavis/riverpy


I wonder if the author has costed out how much running this app would cost using a VPS provider like Linode or Digital Ocean? You can get a lot of resources for under $100 a month, and the needs of most apps are pretty basic (file storage + db + web server). Given the memory requirements and speed of Go, it shouldn't take much hardware to serve a lot of requests with, say, a few Go instances behind nginx.

I'm not convinced that being locked in to one cloud provider's services and APIs is healthy long term - it means you are locked in to that ecosystem and it's harder to consider alternatives, even if your needs are quite straightforward, so you can end up in a situation where you're paying hundreds of dollars a month for hosting when you don't need to be.


What about the datastore? I can get thousands of requests per second (which I did when I released and it hit HN), and the site was as fast as if I had 1 hit per second. Traditional databases don't scale like that. If you want it to scale like that, it gets difficult and expensive. The datastore does all of this + multi datacenter replication + predictable performance regardless of load.


Matt, you might want to look at DynamoDB, which would at least let you do a cost comparison to an AWS stack.

This tweet from former AppEngine dev Brett Slatkin caught my eye: https://twitter.com/haxor/status/411351463806263296

I still think Dynamo has a ways to go to catch up to GAE in terms of ease of use, but it provides performance, scalability, and replication like the datastore.


If you're caching feeds, would you require thousands of requests to the db per second, or just thousands of web requests?

It'd be an interesting test to try setting up a simple stack of psql/nginx/memcache and your go processes behind, and serving up the same data from it, to see what sort of performance you get, but I understand why you wouldn't bother if your app is somewhat tied in to the datastore already and works fine for you as is. Just out of interest, did you start on something else and move over to Google App Engine for performance, or start there?


Started on app engine. I've been using and happy with it for 4 years.

Yes, I could set up a simple psql stack. But I would lose automatic multi data center replication and infinite scaling, so that's not a win. My database hit 500GB in a few months. I don't want to deal with scaling up psql instances when it doubles in size over the next year again. See my first post linked below about when newsblur hit the HN frontpage and was effectively down for 3 days due to load. It's because psql doesn't scale in that direction.

http://mattjibson.com/blog/2013/06/26/go-read-open-source-go...


Thanks for elucidating - I work on sites with a different balance of users vs traffic and data (maybe more users but less data and not so spiky), so it's interesting to hear about your experience on that and why you'd go for app engine, which compared to self hosting can seem like an expensive option.


>Again, it's free software so I'm not sure what the problem is. I learned from this experience that some people will never pay, and some will. I'm not opposed to alienating people who won't support all of the time, effort, and money I put into this product.

It's always surprising to me when devs are surprised by outrage at the change/removal of a free product. There is a non-negligible cost for a user to research/choose/setup/learn a new tool. In this case, a feed reader has favorited articles, read/unread state of articles, etc. When you pull the rug out from users that have made that investment who now have to start over, they are going to be mad, regardless of what they paid.


The software is free. They can just run their own. It takes about 5 minutes of one-time setup. I wasn't taking away anything they couldn't already get for free and with minimal effort.


That's not realistic at all. Maybe for the average HN reader, but not for average users, even RSS savvy ones.

I know plenty of people who use RSS that wouldn't be able to check if Python is in their PATH as per your instructions. Unless there are some friendlier instructions that I'm not seeing.


Then maybe if it's too hard it's worth paying for.

I am always surprised (and dismayed) at the grotesque sense of entitlement engendered in people that they expect everything to be free, all the time, forever.


I agree completely, I was just offering a reason as to why people were reacting the way they were and that saying they lost nothing is just plain wrong.


Congrats Matt. Well done on testing various revenue options, instead of eating the costs just to be the "nice guy", only to have to shut it down eventually.

Ruben Gamez of Bidsketch had a similar story about switching from freemium to paid only -

http://www.softwarebyrob.com/2010/08/18/why-free-plans-dont-...


A long time user of your app, but from my own quota of appengine.

Thank you for such a lovely product. Someday I would like to contribute to your project(s) more.


I understand how app engine takes away a lot of pain wrt data storage, uptime, etc.

But, $600+ is a lot of money. You can get a good number of very decent servers for that kind of money.

If you are not afraid of managing your own data, I think hosting stuff on virtual servers is way more cost effective.


Nice article. Thanks for sharing!

btw: Lots of typos in the article, be sure to spell check!


Did you consider grandfathering existing users in?


It seemed that he was trying to move the existing users from costing him money to at least a break even situation. In this case grandfathering existing users would have propagated a negative cash flow.


Btw, what are some good options nowadays for [almost] free website launch? Until/(if ever) you get "big" of course...


For static websites, Github Pages is free, nearlyfreespeech is very cheap. For full-blown web apps, didn't you hear the guy? GAE has a pretty generous free tier to test out stuff. So does Heroku, and it's way more user friendly. I still prefer GAE because I can use Google's tools. If you want to rent instances, you're going to have to pay from the beginning afaik. AWS and GCE are the usual suspects here.

https://pages.github.com/

https://www.nearlyfreespeech.net/

https://developers.google.com/appengine/

http://heroku.com/


GAE looks interesting, but the main downside is that there is no free plan for SQL database (SQL Cloud service).


I didn't use GAE that much, I only know that whenever I needed to try something for free, Heroku's plugin architecture comes in handy, since most addons have free tiers. They even support neo4j, an amazing graph database I just started playing with (and it's in beta, so they give a shitload of space for free).


I'm using the free tier of openshift from red hat.

The performance is good enough that I'm able to cram a java application powered by jetty + mysql + phpmyadmin in a single small "gear" (they give you 3 small gears free, 1gb of hard disk).


I plan to develop in Golang, but it seems they have Go version 1.1.


AppHarbor (a YC company) is nice for .NET stuff. It's free unless you want a custom domain, in which case it's only $10/month.


Microsoft Azure is pretty easy to use and has continuous deployment available from Git repos. There is a free tier where you can have 10 websites.

I've got an MSDN subscription, so I've got free credits to bring me to a higher tier and it's a no-brainer. You could also qualify for the BizSpark program which provides you with a free MSDN subscription.


Just remember, the MSDN Azure credits are not meant for production. But BizSpark gives you a production use clause with the free Azure credits.


Thanks. I had no idea this was the case.


If you're willing to get your hands "dirty", you can self-host on a VPS; it's damn cheap these days. DigitalOcean is a favorite around here, but there are also more "professional" companies like Linode, and cheaper options like my favorite, Ramnode.


yes, but since I'm mainly "Dev", I'd like to minimize "Ops" side as much as possible :)


> number of users is not a factor for Go Read

You could charge users a fee per feed proportional to server cost (e.g., frequency of posts) and inversely proportional to the number of subscribers.

* Unpopular/infrequent feeds (like my friends' blogs) would be free

* Popular/infrequent and unpopular/frequent feeds would be cheap, maybe $1/year/user/feed

* Popular/frequent feeds would cost more, maybe $10/year/user/feed

This way you can peg your income to an exact multiple of your costs.
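As a rough sketch of the proportional rule in the first sentence (fee proportional to a feed's cost, inversely proportional to its subscriber count): the function name, the default multiple, and the cost figures are made up for illustration, and the example tiers in the bullets are not derived from this formula.

```python
def annual_fee_per_subscriber(feed_cost_per_year: float,
                              subscribers: int,
                              multiple: float = 2.0) -> float:
    """Split `multiple` times a feed's yearly cost evenly among its subscribers.

    `feed_cost_per_year` is an assumed estimate of what fetching and storing
    the feed costs, e.g. posts per year times a per-post cost.
    """
    if subscribers < 1:
        raise ValueError("a feed needs at least one subscriber to be billed")
    return multiple * feed_cost_per_year / subscribers


# Hypothetical feed: costs $12/year to serve and has 4 subscribers.
fee = annual_fee_per_subscriber(12.0, 4)
income = fee * 4  # total collected for this feed
```

Summed over all subscribers of a feed, the income is `multiple` times that feed's cost, which is the pegging property the comment is after.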


That's pretty awful from the user point of view though. It's bad enough to have to pay - much worse to pay a randomly fluctuating amount and feel like you need to count your pennies every time you add a feed to your reader.

Definitely would not advise this model for a feed reader.


Are there ways to get full articles of HN posts[1] through RSS?

[1] news.ycombinator.com/rss


From the article:

"A simple rule in computers is to make something run faster, have it do less work. I remember reading about how grep works quickly. Instead of splitting a file by lines and then searching for the string, it searches for the string, then finds the newlines on either side. Thus, if a file has no matches, it won't ever have to do the work for line splitting. Done the naive way, it would always split by line even if there was no match. Do less work."

Good observation, but I doubt that's even remotely the reason why grep works quickly.
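For reference, the search-first idea from the quoted passage can be sketched roughly as follows. This is an illustration of the idea only, not how grep is implemented; the function name is hypothetical and it works on an in-memory string.

```python
def matching_lines(text: str, needle: str):
    """Yield each line of `text` containing `needle`, without pre-splitting into lines."""
    if not needle or "\n" in needle:
        raise ValueError("needle must be non-empty and must not contain a newline")
    pos = text.find(needle)
    while pos != -1:
        # Only after a match is found do we look for the newlines around it.
        start = text.rfind("\n", 0, pos) + 1  # 0 when the match is on the first line
        end = text.find("\n", pos)
        if end == -1:
            end = len(text)                   # match is on the last line
        yield text[start:end]
        # Resume after this line so a line with several matches is reported once.
        pos = text.find(needle, end)
```

If the text has no match, the loop body never runs and no line boundaries are ever located, which is the "do less work" point of the passage.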



I have seen those, and those were what I was implicitly referring to. And if you did read them, "- Don't look for newlines in the input until after you've found a match." is just one of the points, not the major contributor to speed. That's Boyer-Moore.
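To show why a Boyer-Moore style search can skip input bytes, here is a short sketch of the simpler Boyer-Moore-Horspool variant. It is not full Boyer-Moore and not GNU grep's code; the function name is hypothetical.

```python
def horspool_find(text: bytes, pattern: bytes) -> int:
    """Return the index of the first occurrence of `pattern` in `text`, or -1."""
    m, n = len(pattern), len(text)
    if m == 0:
        return 0
    # For each byte in the pattern (except its last position), how far the window
    # may slide when that byte is aligned with the end of the window.
    shift = {b: m - 1 - i for i, b in enumerate(pattern[:-1])}
    i = 0
    while i + m <= n:
        if text[i:i + m] == pattern:
            return i
        # Bytes that never occur in the pattern allow a jump of the full length m.
        i += shift.get(text[i + m - 1], m)
    return -1
```

When the byte under the end of the window does not occur in the pattern, the search jumps ahead by the whole pattern length, so many bytes of the input are never examined.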


Hm, this is what seems to be written there:

> Moreover, GNU grep AVOIDS BREAKING THE INPUT INTO LINES. Looking for newlines would slow grep down by a factor of several times, because to find the newlines it would have to look at every byte!

(italics added, but uppercase original!) On my reading this sounds quite huge; I seem to understand the gist is that it's still a significant gain after Boyer-Moore. But, whatever.



