Scraping Recipe Websites (benawad.com)
463 points by benawad on May 11, 2020 | 196 comments



A highly useful tool in my household for dealing with the SEO/tracking scourge that recipe blogs have become is https://www.paprikaapp.com/.

Hoping someday to have some spare time to integrate this with https://grocy.info/ and have a pipeline for recipe -> preparation automation.


Big fan of this app; I love that I don't have to keep revisiting the sites. This is one of the few apps I've purchased multiple times: for my iOS ecosystem (I typically cook with my iPad), my Android phone (so I can add recipes on the go), and my partner's devices so she can add recipes to our list.

We've developed a little workflow where we put all the recipes we want to try into an "incoming" category, and then move them to one of our custom categories when we make it and decide it's worth keeping. This is a reaction to becoming recipe hoarders when using a site like Pinterest for something similar.

The iOS version has a really subtle but nice feature when you're cooking with the app: it prevents the iPad from going to sleep and locking while you have messy fingers.


It also has a timer built in that, unlike the default iOS Clock app, allows multiple timers to be run simultaneously. Just tap on any time in the recipe to start it as well.

Paprika is well worth the couple bucks.


You might want to check out an app I built called Cooklist. It has the features of Paprika + Grocy + Instacart + Pinterest all in one. https://cooklist.co


One reason I bought Paprika is that it is a one-time-purchase. Do you have any thoughts on offering that?


Looks really great! I have a pretty solid workflow and dataset built up in the existing setup, but when I'm looking to upgrade/update I'm going to try it out. Thanks for the rec.


How are you pulling purchase information from stores?


Really sharp looking website


works in Canada?


Only in the US at the moment but planning for Canada later this year


We built this at https://ultimatemealplans.com with an eye towards consistency & simplicity.

You can

- plan your meals for the week in seconds

- generate your shopping list

- exclude foods you don't like/want

- checkout online with amazon fresh/instacart and get your groceries delivered.

Happy to demo it for any HNers who want to give it a try.


I was a private yacht chef. I applied restaurant kitchen management to a boat galley. Since then, I've thought about writing a blog about the home economics of managing a kitchen. It is so much more than recipes.


and paprika exports into h-recipe http://microformats.org/wiki/h-recipe

there are quite a few sites that also put up their recipes in h-recipe format, which makes it very nice to scrape.
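To give a rough idea of why h-recipe markup is nice to scrape, here is a minimal sketch using only Python's standard library. It assumes the page contains a single h-recipe and only looks at three of the format's properties (`p-name`, `p-ingredient`, `e-instructions`); the class name `HRecipeParser`, the helper `parse_h_recipe` and the sample HTML are made up for illustration, and this is not how Paprika does it as far as I know.

```python
from html.parser import HTMLParser

# Void elements never get a closing tag, so they must not affect depth tracking.
VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}


class HRecipeParser(HTMLParser):
    """Collect p-name, p-ingredient and e-instructions text (hypothetical helper)."""

    PROPS = ("p-name", "p-ingredient", "e-instructions")

    def __init__(self):
        super().__init__()
        self.recipe = {"name": None, "ingredients": [], "instructions": []}
        self._prop = None   # property currently being captured
        self._depth = 0     # nesting depth inside that property's element
        self._buf = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self._prop:
            self._depth += 1
            return
        classes = (dict(attrs).get("class") or "").split()
        for prop in self.PROPS:
            if prop in classes:
                self._prop, self._depth, self._buf = prop, 1, []
                break

    def handle_endtag(self, tag):
        if not self._prop or tag in VOID_TAGS:
            return
        self._depth -= 1
        if self._depth == 0:
            # Join text fragments and collapse runs of whitespace.
            text = " ".join(" ".join(self._buf).split())
            if self._prop == "p-name":
                if self.recipe["name"] is None:
                    self.recipe["name"] = text
            elif self._prop == "p-ingredient":
                self.recipe["ingredients"].append(text)
            else:
                self.recipe["instructions"].append(text)
            self._prop = None

    def handle_data(self, data):
        if self._prop:
            self._buf.append(data)


def parse_h_recipe(html):
    parser = HRecipeParser()
    parser.feed(html)
    parser.close()
    return parser.recipe


# Made-up sample page, not taken from any real site.
SAMPLE_HRECIPE = """
<article class="h-recipe">
  <h1 class="p-name">Toast</h1>
  <ul>
    <li class="p-ingredient">2 slices <b>bread</b></li>
    <li class="p-ingredient">1 tbsp butter</li>
  </ul>
  <div class="e-instructions"><p>Toast the bread.</p><p>Spread the butter.</p></div>
</article>
"""

print(parse_h_recipe(SAMPLE_HRECIPE))
```

A real scraper would also want to scope the search to the h-recipe root element and handle the other properties (yield, duration, photo and so on), but the core of it is just reading class names.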


I’m a very happy paprika user too. I would like to know more about your Workflows with paprika.


How much time did/do you spend for home grocery management? I'm super interested in doing this for my family but I'm worried I won't have time to maintain it.


Not as much as I'd like to but more time on repetitive tasks than desired at the same time. I.e. I'd like to do more cooking when I know an ingredient might be getting close to expiration (those damn chicken breasts I let thaw but then forgot to cook because we ended up ordering out for the night) vs making a shopping list and doing 40 checks of how much do we have, what can we make, what did we make recently we either enjoyed and want to try again and tweak, and not repeating meals.

All of that tracking of what my partner and I are making, ingredients, tweaks, etc... is almost entirely in my head or quickly put together google sheets that are categorized but difficult to analyze in any historical fashion.

I'd rather just be better tracking my actions at point of interaction with the goods and free up more time for analytical development and more longitudinal reflection.

I really started focusing on my cooking skills about 4 or 5 years ago and have appreciated the result of those efforts. A chef skill I really appreciate is the ability to improvise and combine what's around into an incredible dish, and I'd like something that lets my data side push usage and expiration as inputs to that decision-making (the arrowroot flour has been in there for a year, I should make those fried shrimp we had when we were on Whole30).


This is also why I love thin shims over already effective solutions


I love Paprika, but what keeps frustrating me is the inability to share recipes with my partner. I can AirDrop a single recipe to her, but doing this for all of them, one by one, is super tedious, and then she makes some changes which I'd like to have too, and there seems to be no reliable way to get her changes back on my devices. It's all the more frustrating as Paprika does sync really well, but apparently just within a single Apple ID.


Instead of using iCloud sync, create an account with their service and log in on both devices with the same account. My wife and I have been using it this way for years and it works really well.


Wow, thanks a lot! I didn't realize you could do this! Just tried and it's perfect.


You should give AnyList a try. It also features a shopping list which is handy. Plus you can both use your individual accounts and sync recipes. https://www.anylist.com/


If you're interested in switching, http://www.mysaffronapp.com/ has a bulk import from Paprika and you'd be able to share an account with your partner.


We just log in to each device with the same cloud account. I don't even think it's actually an Apple account, because I'm on Android.


Just tried that and it works. I didn't realize you could do this. Thanks!


I miss punchfork. Yummly is the yelp where punchfork was the craigslist but with a modernish interface for recipes.


I'm surprised nobody has mentioned "Recipe Filter" https://addons.mozilla.org/en-CA/firefox/addon/recipe-filter...

Cuts the fluff and puts the recipe front and center. I wouldn't be able to find recipes online without this.


Paprika 3 (I use the iOS version, but I believe the Mac version has the same function) has a fantastic web scraper for recipes. I've had to correct maybe 1-2 errors across 100 recipes I've brought in from a bunch of different sites. It's super helpful to look through them in a standardized way (and you can sort by ingredient/category) to figure out what to make.


Tried this out and I have to say I'm impressed on the first recipe. Scraped it correctly (albeit from the BBC which has a reasonably sane layout) and, since I've only got 75g of desiccated coconut instead of the 85g required, I wanted to scale it by 75/85 ... which worked. I just typed in 75/85 and it worked. Amazing.
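For anyone curious what that kind of scaling amounts to, here is a small sketch. I have no idea how Paprika implements it; this just assumes the typed factor is parsed as an exact fraction and applied to every quantity. The function name `scale_recipe` and the sugar amount are made up for illustration; only the 85g of coconut and the 75/85 factor come from the comment above.

```python
from fractions import Fraction


def scale_recipe(ingredients, factor_text):
    """Scale gram amounts by a factor typed as '75/85', '2' or '0.5'."""
    factor = Fraction(factor_text)  # Fraction accepts 'a/b' and decimal strings
    return {
        name: round(float(grams * factor), 1)
        for name, grams in ingredients.items()
    }


# The coconut amount is from the comment; the sugar line is a made-up example.
original = {"desiccated coconut": 85, "sugar": 200}
print(scale_recipe(original, "75/85"))
```

Using `Fraction` rather than a float for the factor means 85g times 75/85 comes out as exactly 75, with rounding left to the final display step.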


Thank you all for the Paprika recommendation. I just grabbed it and imported the recipe I did for Mother's Day Eve, and it looks great! That recipe wasn't one of the worst offenders, but it's off to a good start for Paprika.


I think most recipes are published using a microformat that makes this pretty easy, and that's why Paprika (I use it too!) so rarely screws up.
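As a rough illustration of how easy structured recipe data is to read: one common way sites publish it is a schema.org `Recipe` object embedded as JSON-LD in a `<script type="application/ld+json">` tag. That is an assumption about the page, since the comment only says "microformat", and the names `JsonLdCollector`, `find_recipe` and the sample HTML below are made up. A minimal standard-library sketch:

```python
import json
from html.parser import HTMLParser


class JsonLdCollector(HTMLParser):
    """Collect parsed JSON-LD blocks from a page (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.blocks = []
        self._in_jsonld = False
        self._buf = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buf = []

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.blocks.append(json.loads("".join(self._buf)))
            except json.JSONDecodeError:
                pass  # skip malformed blocks rather than fail the whole page

    def handle_data(self, data):
        if self._in_jsonld:
            self._buf.append(data)


def find_recipe(html):
    """Return the first schema.org Recipe object found, or None."""
    collector = JsonLdCollector()
    collector.feed(html)
    collector.close()
    for block in collector.blocks:
        # A block may be a single object, a list, or wrap its items in "@graph".
        if isinstance(block, list):
            candidates = block
        elif isinstance(block, dict):
            graph = block.get("@graph")
            candidates = graph if isinstance(graph, list) else [block]
        else:
            continue
        for item in candidates:
            if not isinstance(item, dict):
                continue
            types = item.get("@type", [])
            if isinstance(types, str):
                types = [types]
            if "Recipe" in types:
                return item
    return None


# Made-up sample page, not taken from any real site.
SAMPLE_JSONLD = """
<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Recipe", "name": "Toast",
 "recipeIngredient": ["2 slices bread", "1 tbsp butter"]}
</script>
</head><body><p>Long story about toast...</p></body></html>
"""

recipe = find_recipe(SAMPLE_JSONLD)
print(recipe["name"], recipe["recipeIngredient"])
```

Pages that use inline microdata attributes or h-recipe classes instead would need a different extractor, but the idea is the same: the recipe is already machine-readable, so the surrounding story can be skipped.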


Yep. And if Google detects that your page contains a recipe and the microdata isn't perfect, or doesn't include all the things that Google wants so it can show your recipe to people without them clicking through to your site, Google sends you an e-mail through Webmaster Tools telling you to fix it, with the implied threat that your page won't be listed if you don't allow Google to use your work for free.


I'm a little torn with this. On the one hand, it's messed up that Google forces these companies to basically hand over their data, as you put it, but on the other hand, if they don't push companies to do things like this, single-visit webpages like lyrics and recipes inevitably become ad-infested, SEO-driven trash.

Maybe if they used a carrot in addition to the stick, it'd feel less sleazy, but I'm not sure what exactly that would look like.


>other hand, if they don't push companies to do things like this, single-visit webpages like lyrics and recipes inevitably become ad-infested, SEO-driven trash

The opposite may also be true. If Google sent visitors to these sites instead of displaying their content without compensation, the sites wouldn't be so desperate to extract every last penny out of the reduced number of people who click through to them.


But how will I read about "Dakota", an avid yoga enthusiast who just happens to be a mom, who enjoys making healthy and savory meals for her family while blogging?

Seriously, I hope this spells an end to the Google ranking imposed nonsense that makes the simple act of searching for a recipe so insufferable.


It's definitely grown worse now, but I think that this originated from recipe sites that people actually used to follow, because the blogs were interesting and we got to know the writers, and what's changed is more that we're jumping to the first Google hit and we expect them just to grant us the information we wanted.

There is a difference between opening up a recipe site, like a favorite blog, or the New York Times (which does the same kind of spiel before its recipes), just to read and find out what interesting thing they have posted, vs doing a search for "pasta carbonara," clicking on the first link, and having to read a life-story.

I never mind opening up the recipe section of the New York Times and reading about what's so interesting about this recipe, and memorable times it was served. That's because I trust the article to be vaguely interesting, and reading it is a form of entertainment. There's a reason why no newspaper's recipe section has ever simply been: "Pasta Carbonara: 1 lb pasta. 2 oz Pancetta. 5 egg yolks. Cheese. Combine as directed below."

So I feel like the in-vogue hatred of these recipe site styles is more a reflection of how expectations around consuming and searching for recipes have changed than of significant changes in how recipes have always worked.


I think it's also about different types of recipe collections.

There are cookbooks that are all recipes. This seems to be what the HN crowd is looking for when they search the internet.

There are cookbooks where each recipe is accompanied by a little story. These seem to sell well, judging by the number of them that appear on bookstore shelves.

And then there are cookbooks where all of the anecdotes are in the front of the book and the recipes are in the back. These are the ones I like because I can easily find what I'm looking for, but can still read the background about a recipe, if I choose. It's not right in front of me causing the actual recipe steps to continue on another page.

I think recipes and the internet don't mix, unless you're just looking up ingredients while shopping. It's one of the areas where a fat, old cookbook is always better, in my experience.

30 years from now, nobody is going to cherish grandma's dog-eared and tattered old iPad full of recipes.


I wouldn't mind the "recipe at the end" format if that were the actual case.

But it's not at the end. The actual structure of a modern recipe article is

1) Header

2) Blogpost

3) Click to read button

4) More blogpost, structured in a format that looks like an informal recipe - bulleted ingredients without quantities, discussion of steps.

5) Ad.

6) More blogpost

7) More ad that looks confusingly like a kind of recipe.

8) More blogpost

9) Nav that looks like the end of the blogpost.

10) More blogpost

11) Ad

12) Recipe ingredients

13) Ad

14) Recipe directions

15) 20 pages of chumbox, some of which is structured in a way that kind of looks like a recipe at a glance.

Combined with some wonky JS that doesn't show the content until a half-second after I scroll it into view.

When I've got the recipe it feels like I've found Waldo.


And that's not the worst of it. Here's what I really hate:

- Search page and eventually find the recipe.

- Start making recipe. Up to my elbows in ingredients.

- Glance over to see the next ingredient I need, but there's now a pop-over I need to dismiss before I can see the recipe.

- Look for next ingredient, and it's scrolled off the screen because one or more adverts in the page have reloaded and have different sizes.

Most recipe sites have a nearly unusable UX.


You forgot the pop-up modal asking you to subscribe. Then you close that to get subjected to the bottom-floating one while you scroll.


>And then there are cookbooks where all of the anecdotes are in the front of the book and the recipes are in the back. These are the ones I like because I can easily find what I'm looking for, but can still read the background about a recipe, if I choose.

I inherited a series of Time Life cookbooks from my grandmother. They must have been printed in the 70s. Each of them essentially comes in two parts. A full size book with lots of pictures, introductions and visual guides and a small ring-bound booklet that's essentially all recipes. What I also find interesting about them is that many recipes are rather laborious and from scratch since they were written before having all that many kitchen appliances and pre-made ingredients.


There are also cookbooks that are almost more like a textbook, with technical cooking information followed by recipes (almost like textbook information followed by exercises).


There also might be an issue that recipes themselves can not be copyrighted. Article content and writing alongside the recipe can be copyrighted, but the recipe itself is not eligible for copyright protection at all. So if you're trying to make money off of a site, if it only had recipes, there is no protection whatsoever from someone setting up an identical copy and monetizing it themselves.


Many of the recipe sites are following a cargo cult-like methodology to SEO and only have stories before their recipes because of how they perceive Google's ranking algorithm will rank their content.


It's gone to extremes now, but to be honest - if someone wants to blog and post a recipe at the same time, that's their prerogative?

Some people actually enjoy reading those things too. There's a place for straight recipe sites, and a place for personal word-vomit blogs with a recipe at the bottom.

The web would be a sad place if you were only allowed to write your recipes in LaTeX.


Yes, they can do whatever they desire. The issue is when people are required to write life stories and superfluous content not because that's the direction they want to go, and not because it's appealing to their target audience, but because it's the only way they can rank well on Google.

On the user side if you only want straightforward recipes, you're out of luck, because they're never going to be near the top of the search results.

If you are one of the people looking to read these recipe background stories, even you get a mediocre experience, because the majority of the content you're reading was written primarily for Google (how many times can I reference salmon, fish, atlantic, norwegian, protein, healthy, omega-3 and smoked in this recipe story to hit all of the important keywords?).


>The issue is when people are required to write life stories and superfluous content not because that's the direction they want to go, and not because it's appealing to their target audience, but because it's the only way they can rank well on Google.

Citation needed. When I do a search on, say, Wiener Schnitzel (which I made last night) it looks to me as if the top searches are pretty to the point. Honestly, I tend to want some context for a recipe other than just a list of ingredients and some directions.


Same for me too - especially in the Google recipe snippets at the top of the search. I find those tend to have more straight recipes, whereas the normal search list has more of the story style.

And for context, I do like reading about the history of a recipe, why it was created etc... not all the stories float my boat, but literally all it takes is a scroll.


If a recipe is too hard to find, just move on. If there's a few paragraphs of things you don't care about before the recipe, press page down. Maybe let Dakota write what she wants on her own site.

One reason I haven't seen mentioned here for that personal content: it's also a question of building context and trust. If I go to Bon Appetit for a recipe, I know they've tested it a few times and it should be more or less OK even if I don't recognize the author. If I go to a barebones anonymous just-the-recipes site, I have no faith that it ever worked and if it did, it wasn't just a fluke and it was written down right.

Having some detail around a recipe from a previously unknown source allows me to build a connection to a persona in my head, genuine or otherwise. If a recipe doesn't work I'll know to avoid the site in the future. If it does work I can remember that connection and come back to the site again with some more confidence.


Dakota also owns many beautiful bowls and whisks handmade from sustainable materials, featured in 20 photos before the recipe hidden behind another link

I hate blogger recipes, luckily there are enough cooking sites that are well curated.


Don't forget the affiliate links to said bowls.


That's something I'm fine with. It's better than targeted ads, and sometimes answers genuine questions.


My understanding of why there's a life story copy at the top of recipes is that a recipe is not copyrightable, but a story is.

And an AI can generate a one-time story relatively easily.

https://lizerbramlaw.com/2015/04/07/copyright-protect-recipe...


My understanding was that it happened more organically.

When I see a story at the top of a page, what I feel like I'm really seeing, most of the time, is an attempt by a nano-influencer, or really just an average person, to build a brand.

You could read 100 recipes from a person that included no details about themselves, then see their name on a blog and not even realize it was them.

But if the story is from the Pioneer Woman, and now you know a bit about her family, you might be more receptive to buying her cookbook, or signing up for her subscription newsletter, or watching her TV show.

Or, more realistically, in the other direction: you could sell Netflix on a show about you based on the number of recipes people view each month, that have your life story embedded in them.

So, over time, you end up with the current situation, as infinitely many people attempt to climb the ranks.


Is there evidence that the "annoying recipe sites" in question include algorithmic stories, or are you speaking hypothetically?


I don't have any direct proof that recipe sites are doing this. I only look at just how many "stories" there are, and how many recipes. There just looks like too much writing. And it is also very formulaic.

And it's also 'if I was to do a recipe site, I'd use a text generator'


Sounds cheaper than paying a freelancer $0.025 per word, too.

I suppose if you were just starting a blogspam recipe site, you could initially pay freelancers for the first articles, and use them as your training corpus. But since this sort of templated recipe site is already evil, just scrape all the other recipe sites and use their articles as your corpus.

"When we vacationed in { madlib( international_city ) }, we ate a { r.Name }, and it was delicious. When we returned home, we tried to recreate the recipe, but it was never quite right. After months of trying, it's finally perfect."

"As a child, I remember my grandmother making { r.Name } and eating it with all my cousins. It was her secret recipe. She never told anyone. When she died, we were all very sad because we thought the recipe was gone forever. But I found this in her old { madlib( noun ) }, and now I'm the most popular cousin at the reunion."

Repeat ad nauseam.
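For what it's worth, the madlib approach sketched above is only a few lines of real code. This is a toy version, assuming plain string templates and random word lists; the function name `filler_story` and the city and noun lists are made up for illustration, and nobody is claiming any actual site works this way.

```python
import random

# Templates adapted from the two examples above; {name}, {city} and {noun}
# are the fill-in slots.
TEMPLATES = [
    "When we vacationed in {city}, we ate {name}, and it was delicious. "
    "When we returned home, we tried to recreate the recipe, but it was "
    "never quite right. After months of trying, it's finally perfect.",
    "As a child, I remember my grandmother making {name} and eating it with "
    "all my cousins. It was her secret recipe. But I found this in her old "
    "{noun}, and now I'm the most popular cousin at the reunion.",
]

# Hypothetical word lists standing in for madlib(international_city) and madlib(noun).
CITIES = ["Lisbon", "Kyoto", "Marrakesh"]
NOUNS = ["sewing box", "hymnal", "toolbox"]


def filler_story(recipe_name, rng=None):
    """Return one templated 'life story' for the given recipe name."""
    if rng is None:
        rng = random.Random()
    template = rng.choice(TEMPLATES)
    # str.format ignores keyword arguments the chosen template does not use.
    return template.format(
        name=recipe_name, city=rng.choice(CITIES), noun=rng.choice(NOUNS)
    )


print(filler_story("pasta carbonara"))
```

Passing a seeded `random.Random` makes the output repeatable, which is about the only engineering decision in there.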


I've always thought of copyrighted music as "sound recipes"


My favorite: talking about how this recipe helped you cope with the September 11 attacks (although this intro is shorter than a lot that I have seen).

https://cooking.nytimes.com/recipes/1017089-maple-shortbread...


It ran in the paper 8 days after 9/11... seems pretty reasonable to me. In much the same vein as the rush by many to baking now.


The "9/11 content" is also literally four sentences long. One sentence also introduces the recipe contributor, while another describes the original source she adapted it from.

It's not particularly gratuitous, especially given the date and location.


Now we need a bot that parses the comments and applies AI to do . . . something with the people who say they substituted half the ingredients for what they had and left the others out, to reply and tell them what recipe they actually made and that they can stuff their 1 star review.


Does it pose an actual problem?

When I search for recipes, I type in the food I want + recipe and then open the top 5 or so links. I quickly scan for a list of ingredients. If I don't easily spot one in a few seconds I move on. I'll do this until I have a couple different lists of ingredients for making the item. This ends up taking less than a minute or two. That just isn't a significant portion of time compared to how long I'll spend comparing different recipes to find a common theme to follow.

Maybe it is because I never follow a single recipe but instead combine the common themes from a couple that the whole life story before the recipe shtick isn't something that bothers me.


>This ends up taking less than a minute or two.

For you to open 5 websites, dismiss the cookie permission request on each one, dismiss the notifications request, scroll down to find the ingredients list, dismiss the scrolling activated newsletter signup, read the ingredients list, and click the 'next page' button to see the instructions to work out what you can tweak, all in an average of 24s per site is very impressive.


I just tried it. Went to the top seven sites for chicken tikka masala. Exited one for not loading, had to mute another tab with an annoying video, but got to the recipe in 6 of them in under two minutes. No popups (though I do have an ad blocker that may have prevented them).

>read the ingredients list, and click the 'next page' button to see the instructions to work out what you can tweak

That wasn't included. I was talking about scanning to verify there was an ingredient list. I pointed that out when I said:

>That just isn't a significant portion of time compared to how long I'll spend comparing different recipes to find a common theme to follow.


Or just get thrown out completely because you are from the EU.


No, no problem here. I need some backstory, heavy on cutesy metadiscourse, before I'm in the proper mood to learn about Tbsp's and oven temperature.


You dismiss the challenges of finding a single recipe by mentioning you find several and combine them? That completely defeats the purpose of a recipe. You're literally creating your own recipe at that stage.


Finding several to combine should be harder than finding any single one of those represented among the several.

As far as the purpose of a recipe, it still informs me of the ingredients and amounts so I can make a reasonable approximation. Say I want to cook chili, something that there seems to be innumerable recipes for. And say I want to add beans to mine, though my personal recipe doesn't normally include beans. So how many beans should I add? And what kind of beans should I add? Well if I check the top 5 recipes for chili with beans and see that 4 of them use kidney beans and that they tend to use 1 cup per pound of meat average, I can now modify my own recipe in a more informed fashion. This also works for cooking something when I don't know where to begin.
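The chili comparison is basically a tiny aggregation: take the most common bean type and the average beans-to-meat ratio across the recipes you opened. A quick sketch of that reasoning, with entirely made-up numbers standing in for five search results (chosen so that 4 of 5 use kidney beans and the average is 1 cup per pound, matching the example above); the function names are hypothetical too.

```python
from collections import Counter
from statistics import mean

# Made-up stand-ins for the top 5 "chili with beans" results.
RECIPES = [
    {"bean": "kidney", "cups_beans": 2.0, "lbs_meat": 2.0},
    {"bean": "kidney", "cups_beans": 1.0, "lbs_meat": 1.0},
    {"bean": "pinto", "cups_beans": 1.5, "lbs_meat": 1.0},
    {"bean": "kidney", "cups_beans": 1.0, "lbs_meat": 2.0},
    {"bean": "kidney", "cups_beans": 1.0, "lbs_meat": 1.0},
]


def consensus(recipes):
    """Return (most common bean, how many recipes use it, mean cups per lb of meat)."""
    bean, votes = Counter(r["bean"] for r in recipes).most_common(1)[0]
    ratio = mean(r["cups_beans"] / r["lbs_meat"] for r in recipes)
    return bean, votes, ratio


def beans_for(my_lbs_meat, recipes):
    """Cups of beans to add to a recipe that uses my_lbs_meat pounds of meat."""
    _, _, ratio = consensus(recipes)
    return ratio * my_lbs_meat


print(consensus(RECIPES))
print(beans_for(3, RECIPES))
```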


It's a running joke in our house. I start off wanting to make some mashed potatoes, and time and time again, I have to suffer through someone's life story--the camping trip in North Dakota when Susan's husband first discovered his love of homemade sour cream--etc. Makes me wonder if a super barebones recipe site that literally just has recipes and absolutely no fluff would be something people would gravitate towards.


“Why doesn’t everyone who is putting information out there for free cater to exactly my needs”.

What an utter load of bollocks. These are people who are creating something that they enjoy doing and giving you information you apparently need for free. I don’t understand why anyone would trash their desire to write something that is personal and/or interesting to them about it.

And that’s setting aside that these usually make the recipes far more readable and interesting to the vast majority of people.


It's a sweet world view, but the truth is that the majority of these cooking sites are filling in content for SEO and ad purposes, and the "stories" are fictions written by Tom, a 22 year old freelancer who isn't a yoga enthusiast but is just trying to make ends meet at 3¢/word.


And the real farce is that Google mistakenly sees all that scrolling up and down the page, looking for the recipe, as 'engagement' and "dwell time".


Is that mistaken? Does no one click on ads on the page while doing this?


the algo thinks you're having a good time.


>Tom, a 22 year old freelancer who isn't a yoga enthusiast but is just trying to make ends meet at 3¢/word.

Or Doreen, a 54 year old freelancer who isn't a yoga enthusiast but is trying to make ends meet at something under 3¢/word.

I haven't actually written for the site in question. I'm not trying to imply that I have. More like saying "Yeah, this is absolutely a thing."

And it's a thing in part because my actual original blogging that's the real deal doesn't get enough tips and Patreon supporters. If people want to see less content marketing to get ad revenue and more quality writing aimed at providing something fresh, they should be looking for independent authors to support whose writing they actually like.

I was providing original content written entirely by me for years before I began doing freelance writing. I would have likely never become a freelance writer if people had been willing to leave tips, promote my writing, engage with me so I would have a better idea of what to provide for my audience and so forth.

If you don't like what's on the internet, go "look in the mirror" so to speak. I've been on Hacker News nearly 11 years and was literally homeless for nearly six of that while people around here told me "Go get a real job. Writing doesn't pay. Your expectation that your writing should be capable of providing a living wage is just silly talk."

It's not so-called market forces at work. It's human choice and those choices are rooted in what we value and all this. If this world isn't the world folks want to see, they can make other choices more in line with what they claim they want instead of "being traffic" while complaining about it.

(Edit: For the record: Most of my freelance writing is content for small business sites and I don't feel the tiniest bit of regret. I like working for a paid service and I blog about that too and get accused of the site being content marketing when it absolutely isn't. http://writepay.blogspot.com/)


This is so incredibly accurate.


They're being rightly trashed because it's an SEO tactic and nothing more. Between the preamble ranking them higher and AdSense requiring "substance" in order to monetise the page, there is a systemic issue that's leading to what is essentially useless information for what I would say is the majority of people.

They're very welcome to write their life story, but I bet if google changed their algorithm slightly you'd see it disappear - and I'd say that would be a good thing.


My god this level of cynicism is such an eyeroll.

HNers writhing over each other to be the most cynical and dismissive.

For fun, can you link me to a highly ranked recipe page with a bullshit SEO story on it in line with "the camping trip in North Dakota when Susan's husband first discovered his love of homemade sour cream"?


A few comments above yours is this link talking about the healing power of cooking after 9/11: https://cooking.nytimes.com/recipes/1017089-maple-shortbread...

Two days ago, I made this absolutely delicious "5 minute tiramisu": https://wishesndishes.com/5-minute-tiramasu-dip/ - That page doubles as a sponsored article, too (see disclaimer).

But just to prove your point and make you happy, and also because I'm hungry, I googled "sushi recipe idea" and clicked the first link that, without fail, talks about the very special valentine's day Tom^Wthe author and her husband spent years ago and how sushi is now a tradition.

https://www.fifteenspatulas.com/homemade-sushi/

This isn't a cynical thing. Go to fiverr.com and search for "write recipes". Here's the first result that comes up when doing that: https://www.fiverr.com/francosalzillo/write-text-around-a-re...

That's the more professional version of it. The guy's highly rated and it looks like it's his actual job.


Your first link is a recipe from 2001-09-19, which seems like strong evidence that it being written around 9/11 wasn't motivated by modern recipe SEO?


What did I claim?


Your parent asked "can you link me to a highly ranked recipe page with a bullshit SEO story". I was showing how your first link wasn't an SEO story.


Searched Beef Stroganoff Recipe. Result 1 from Betty Crocker was good. Here is the 2nd result:

https://www.gimmesomeoven.com/easy-beef-stroganoff-recipe/

You have to scroll halfway down to get your ingredients list after learning that her husband is a vegetarian, getting a history of the author cooking mostly plants before anyway, getting a lesson on what egg noodles are and, finally, the ingredients.


https://www.garlicandzest.com/julia-childs-boeuf-bourguignon...

"When I woke up that Sunday morning, it was a crisp 68°. I opened every window in the house (the first time we’d aired it out since April) and put on a pair of jeans and a light long-sleeved shirt. While Scott was busy appraising his fantasy football rosters, I was searing beef and taking pictures."


> What an utter load of bollocks. These are people who are creating something that they enjoy doing and giving you information you apparently need for free. I don’t understand why anyone would trash their desire to write something that is personal and/or interesting to them about it.

They're mostly copying recipes from other places then applying SEO. That's it. The ridiculous probably-made-up stories are for SEO.


It's not about trashing someone for doing something they love. It's about a prescribed format for food blogs that has all but taken over anything food related on the internet. I highly doubt it's a coincidence that thousands of food bloggers adopted a "five pages about me and then the recipe" template for their writing. As others are saying, it's an SEO tactic, and it's probably annoying to the bloggers too.

There's no arguing about it making recipes more "readable" when you have to scan down page after page to get to the actual ingredient list. That's super annoying. To put it into perspective, if the recipe was at the top, and the blogger wanted to spend five chapters talking about their life afterwards, more power to them, I wouldn't care a bit.


Or maybe, people are just fed up with the 'fluff'. I know I am. I don't need or want a 10 minute lead in to a news story that finally explains the issue. I don't have time to sift through drivel to get the information I am after. This isn't an academic research project. And frankly, I don't give two shits about ANY of the blog spam fluff. When I'm looking for a recipe, I want the technicals of 'how to make this', and that's it.

I see this as more a problem of the web, than the recipe sites themselves. More and more often, web sites are shoving information down your eyeballs to keep you on the page longer. I hate this trend. I want the service or information to give me what I want, and get out of my way. Google search? Get me what I want, and get out of the way. Email inbox? Same deal. Yet the exact opposite is happening with more and more invasive time wasting drivel, being injected everywhere.


That's not why every single recipe site has these stupid blurbs... it's that Google wants unique prose on the page in order to rank it highly on SERPs


Having thought about this and my own use of cookbooks as loose inspiration rather than actually to follow in detail, I have come to the conclusion that the "fluff" is what most of the recipe-reading public want. In particular, the fluff has value even if you never cook the recipe! Which saves a lot of time and inconvenience on your part while still giving the same warm fluffy feelings.

Actual "I have these things and want to cook something" could practically be automated.

The big exception is baking, where precise ratios and timing matter a lot.

(I'm also reminded of various stories of people trying to trace the origins of much loved family recipes and then discovering that rather than being an authentic traditional Calabrian whatever, their grandmother copied it off the back of a tin. I'm fairly sure my own grandmother's cookies recipe is from Tate&Lyle)


> I have come to the conclusion that the "fluff" is what most of the recipe-reading public want.

I've often heard the prevailing reason why this happens on the web is because of SEO and 'bounce rates'. More time spent on the site improves ranking, so the actual recipe is pushed below the fold so users have to scroll down thereby adding more time on the site.

Have often wondered if any SEO wonks with the inside baseball can actually validate this?


I've also read that it's to do with copyright - the recipes themselves can't be copyrighted, but the text around them can. Scraper republishes your ingredients list: not a lot you can do. Scraper republishes your fluffy anecdote and pictures: BLAMMO!


Except the whole reason this article on scraping recipe sites was written was to ignore the fluff. Forcing copyright this way doesn't sound particularly helpful when the fluffy anecdote has so little value in comparison to the recipe. Or is it really the case that the average reader wants the anecdote secondary to the recipe?

It feels like people are trying to make money around information that is fundamentally impractical to make money off of, so they're forced into doing whatever it takes to make money off of it anyway. "Whatever it takes" is defined by Google yet ruins the user experience, and so that is why recipe sites are this way.


I mostly like the pre-recipe text on Smitten Kitchen.

Sometimes it's a little rambling, but there's often useful information about the recipe that follows, like shortcuts that seem like they should work, but don't. She includes some interesting links or background, and getting a bite-sized glimpse of someone else's life isn't the worst thing.


I can't validate this but have been told this by multiple authors and SEO professionals. So it's anecdotal. One blogger apologized to me and told me she was embarrassed doing it but it's an industry practice. She explained it's because other recipe sites use automated scraping tools and republish their recipes in an effort to outrank original authors. The personal fluff helps slow them down. Also, a lot of recipes are bullshit, they manually steal them from elsewhere and change a few variables to evade copyright claims. Although, I guess changing a few variables is how cuisines evolve.


Recipes aren't copyrightable. It's bad manners to copy and republish one without attribution, but not illegal.

The introductory fluff is copyrightable, and that's one reason for it.


Photographs, videos, or drawings of recipes in progress are copyrightable, and usually more helpful. Furthermore, they provide evidence that the recipe is actually viable.

That's why you shouldn't expect to make a cent from creating a new recipe, unless you have a chef and photographer/artist lined up, or own your own restaurant chain.


Thanks. This makes sense. I was wondering what these fluff pieces have to do with SEO, because surely the search engines aren’t monitoring all users and how long they spend on each page in order to rank the usefulness of the content in their search results.

Bounce rates and how long a visitor stays on a page matter for those who own the sites and/or do some sort of marketing on them (ads, their own products, their services, etc.).


> because surely the search engines aren’t monitoring all users and how long they spend on each page in order to rank the usefulness of the content in their search results.

Probably not the search engines, but I could imagine this being a thing that goes into Google Analytics if an online recipe property is using that (or if their blogging platform has a plugin for it) maybe? Just spitballing from the hip.


Yes, search engines monitor result clicks, bounce rates, dwell times, etc. They do affect ranking. And if you have Google analytics on your recipe site, Google has even more data about it.


Are you implying that having Google Analytics on your page increases your Google ranking?


I don't know. I've always thought it should, and as a search engine CTO, I'd use the data that way. But I have no evidence either way.


I think they specifically mentioned it does not affect it; someone asked that question in a webinar. If it did affect it, then it could be abused (e.g. just send 10k bots that spend 1 hour on the site), plus a longer session doesn't necessarily mean a better site or better user experience. Google itself is a good example, where the point of Google is to spend as little time as possible on their page, as it's just a step between the user and his destination.


I have done a lot of work on cocktails (major brand), and the key is the structured mark-up. We are trying some tests on longer-form listings, i.e. >200 words.

I think that food is much more dispersed, apart from superstars like Nigella, Jamie and the BBC, so it's harder for the average food blogger.

Also don't discount the target audience for these pages might not be the average developer / nh reader. Also don't discount the


> Also don't discount the target audience for these pages might not be the average developer / nh reader. Also don't discount the

the what? They got to him, the food bloggers did.

WHAT WERE YOU TRYING TO TELL US


Some of the best stories too have bits of the art inside of the food science anecdotes that contribute to a sense of why a particular variation on a recipe worked better for the author. Some authors have more of a sweet tooth, and others live at higher altitudes or tweak their recipes for camp stoves on hiking trips up the mountains. Some authors spend days and weeks of failures trying to get a proper balance of flour to baking soda/baking powder for just the exact sort of yeast rise they want from their dough, and others just wing it and let the dough live or crumble as nature intends, as it adds a little chaos to the whimsy and art of their eventual plating.

It's also the little touches of humanity that people want if they want to follow a particular food blogger. The anecdotes add up over time to a sense of following a workplace or family sitcom to follow week to week (whether they make the recipes or not). It's a daily or weekly "soap opera" ("flour opera"?) of an acquaintance or "friend" that you also like to crib recipes from from time to time.

Where these recipe bloggers have their steadiest audiences, those stories at the top of the recipe are the real draw day-to-day, and the recipes the fun addendum to bookmark for later.


It happened to me. When I quizzed my grandmother for her beloved chocolate chip cookies recipe, I felt like I was being entrusted with a huge family secret.

Nope. Toll House cookie recipe (which is on the back of every package) with an extra quarter cup of flour XD.


I read long ago that you can/should always trust recipes printed on manufacturer's packages; after all, they've chosen them to stake their ingredients' reputations on.


mashed potatoes

Steam some potatoes. (Don't boil them. No one wants watery mash.)

Add far more butter than anyone would think is reasonable. (Like, 1/2lb of butter to 1lb of potatoes. Maybe more.)

Add salt and pepper. (No, more than that. You've under-seasoned them.)

Mash them until they're the consistency you want. (For really smooth mash use an electric hand mixer instead of a masher.)

To make them better still, put lots of wholegrain mustard or garlic in with the butter.


> Add far more butter than anyone would think is reasonable. (Like, 1/2lb of butter to 1lb of potatoes. Maybe more.)

No. The butter gets lost in the potatoes and mouthfeel of the fat is compromised. That is the reason that so much is needed if you do it this way. Much better to only add it right when serving, leaving the butter and potatoes largely unmixed. Preferably added in an amount “to-taste” by the individual. I believe this was covered by McGee’s “On Food and Cooking”, but I may be misremembering.


I would buy a cooking book with recipes explained like this one. With real world tricks and explaining "why" you do it in that way. A raw list of ingredients is, in my opinion, almost useless.


America’s Test Kitchen (TV) / Cook’s Illustrated (books) will scratch your itch.

My foodie friend and I use a lot of their recipes.

The cool thing is that sometimes you will disagree with their preferences (totally normal), so you can use one of their variants that has the attributes that you want. Or similarly, you can see how changing certain ingredients changes the outcome and personalize your version of the recipe accordingly. It definitely saves some experimental batches.


I’ve found that The New Best Recipe cookbook covers this pretty well for me. They have an introduction to each recipe talking about all the variants they tried, and then there’s a clearly marked recipe section.


Since this seems to be an OK thread to add humor to, I'll point out you can also stick 'em in a stew.


I've had good luck with recipes from https://www.taste.com.au/ and https://www.bbcgoodfood.com/

Alternatively, if you want super barebones, go buy a copy of Escoffier's https://en.wikipedia.org/wiki/Le_guide_culinaire (or an English translation thereof). It's arguably the definitive cookbook and the recipes are extremely terse, some just a couple of lines long. (e.g. "Take recipe A and recipe B, substituting X for Y.")

"Escoffier's introduction to the first edition explains his intention that Le Guide culinaire be used toward the education of the younger generation of cooks. This usage of the book still holds today; many culinary schools still use it as their culinary textbook. Its style is to give recipes as brief descriptions and to assume that the reader either knows or can look up the keywords in the description."


Makes me wonder if a super barebones recipe site that literally just has recipes and absolutely no fluff would be something people would gravitate towards.

There are apps like that, at least. I use How to Cook Everything. It's from Mark Bittman, who was with the New York Times at the time it was published. I don't know if he is anymore.


> Makes me wonder if a super barebones recipe site that literally just has recipes and absolutely no fluff would be something people would gravitate towards

In German there's chefkoch.de. It's full of ads but its core is basically a big DB of user-submitted recipes without fluff. Don't you have something like that in English?


I usually add "BBC" to my search term, which ① gets a straightforward page and ② ensures the measurements and oven temperatures will be metric.

The further from the UK one is, the less useful this is. The measurements should only be a problem in the US and Canada, but common ingredients can change -- e.g. the fat content of cream, or whether canned tomatoes are salted or sweetened. Also, if you're from the rest of the world, you might wonder why they've forgotten the herbs and spices :-)


> I usually add "BBC" to my search term

Do you mean at chefkoch- or google-search?

Yes, recipes are very hard to translate. For example I don't know what "cream" is and I never want extra sugar or salt in canned tomatoes.


I mean in Google search, or any other general web search.

It's probably less necessary from the UK, but from Denmark an English-language recipe search has a good chance of giving American websites. Americans have to suffer sugary and salty canned tomatoes.


I have one. It doesn't rank after 15 years.


Interesting! I wrote https://plainoldrecipe.com (open source!) to solve this, and inadvertently discovered many of the metadata tags described here.

The irony is that the content is required for SEO purposes, but once you’ve landed on the page you don’t want to see it. I wonder if there would be a way to write SEO that only the google bot sees and hide it from humans...


Your header says "plan old recipe"


Which is just dripping with irony? serendipity?


There is a way to present different things to the Google bot and to humans, but it can and should result in a Google ban. You're probably aware and I'm a bit too verbatim.


Are there any legal issues with scraping recipe sites in a commercial app like that?

I'm assuming ingredients and directions are "facts" so can't be copyrighted, but what about the pictures?


While a recipe isn't protected by copyright in the US (and many other countries, including the UK), the wording of the recipe could well be an original literary work, the layout of the page could attract a copyright (as it does in cookbooks) and you're right that the images would be protected.

All that said, if the import is being used for personal use only and not being edited, then it's little different to printing it out and putting it in a binder. I don't know much about US fair-use laws, but in the UK it would seem that reproducing a recipe in an app for your own use would qualify as fair dealing thanks to being personal study.

That only applies if the imports are specific to the person importing them, of course. If they're shared or published, then it's a different story. Also, if you're importing more than one recipe, so it's a significant amount of the published work, then that'd be an issue too. You can't import a whole cookbook and claim it's personal study, but one recipe out of dozens is probably fine.


I assume this would go into DMCA territory, since you're hosting user-submitted content. As long as you don't host the scraped recipes and images publicly, I imagine it would live in a legal grey area if you had a notice that you must be allowed to use the image you upload in your jurisdiction.

It'd be similar to trying to go after google because someone uploaded a copyrighted work to their google drive. I know they have to deal with it if you share the link, but they don't go out of their way to remove content you uploaded to your google drive and never shared.


Scraping is LEGAL. All search engines scrape to some degree, for example. There is a fair use component, so you can't "scrape" 100% of a site and stick it on your domain, but you can still scrape more than zero. In general, it is leaning more acceptable than less.


Yeah I understand that part, my question is about showing the scraped data to your users.

https://www.yummly.com/ used to have a paid API for recipe search and currently still lets users search their index. Did they have to go and get permission from each site that they index or is it fair use?


It looks a shade more detailed than google's recipe cards, they link back to the original source for the instructions, I would bet they didn't get permission, and that they count as a fair-use search engine. The law isn't (can't be) perfectly prescriptive here, there's some line that you have to sue about to know if it has been crossed.


If the robots.txt file has no restrictions, parsing and scraping is fine. Of course, not all scrapers respect robots.txt, but they should. But as a good internet citizen, it's better to always reference the source.
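Checking robots.txt before fetching a page can be sketched with Python's standard library. This is only a rough illustration: the robots.txt content, the `recipe-bot` user agent, and the example.com URLs are made up for the example, and the rules are inlined so the snippet runs without network access.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, inlined so the example runs offline.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Allow: /
"""


def can_scrape(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt rules allow user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


print(can_scrape(ROBOTS_TXT, "recipe-bot", "https://example.com/recipes/pancakes"))
print(can_scrape(ROBOTS_TXT, "recipe-bot", "https://example.com/private/drafts"))
```

Against a live site you would probably use `set_url()` and `read()` on the parser instead of inlining the rules.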


The simple truth is that the core recipes are fact-based and non-copyrightable, and the 1000-word blogspam recipe header is both copyrightable and garners better search result rankings.

So the business model is to take facts from the public domain, wrap it in bullshit prose, and then SEO the bullshit to have higher ranking than the naked source facts, for more unique visitors and ad revenue.

Making comments about "providing recipes for free" are exactly as useful as comments about "providing phone numbers for free" or "providing mailing addresses for free" or "providing the original text of 'Little Women' for free" or "providing the steps of the long division algorithm for free".

Obfuscating the public domain is not a valuable service. Automatically removing the obfuscation is valuable. A "Project Gutenberg" style repository of recipes would be recurringly donation-worthy.


This could also be useful for websites that do not print well. I have run into a few occasions where ads and other website elements printed with the actual recipe. The result was a small recipe divided across several pages, mostly covered with other content. There were pictures and text formatting that I could not copy out. Often for stuff like that I just pull the HTML and edit until it prints well, but I would rather have an easier way.


Here's the question... why is it so difficult to do this in Android?

Seriously, AndroidDriver for Selenium was last updated 2013... and importing it throws an HttpClient error now. Update that client and you get a class duplication hell that is impossible to exit.

All I needed was to interact with 2-3 fields on a webpage but it's been eight hours and now I hate my life.


Check out BrowserStack -- it's dead easy -- and even if you're not using their platform, their docs are good for showing the Selenium/Driver usage.


I believe that the Webdriver/Chromedriver approach is the current recommended way of doing this.


Cool, now the next interesting step would be to categorize recipes, maybe some kind of clustering algorithm, to see how similar they are and whether they have a common ancestor.

When I look at a recipe and notice some unusual proportions I usually check against Joy of Cooking or some other standard book. I've noticed that often everything old is new again.
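One simple starting point for the "how similar are they" question is Jaccard similarity over ingredient sets, before reaching for a real clustering algorithm. The sketch below assumes recipes have already been reduced to sets of normalized ingredient names; the recipe names and ingredient sets are invented for illustration.

```python
from itertools import combinations


def jaccard(a: set[str], b: set[str]) -> float:
    """Share of ingredients two recipes have in common (0.0 to 1.0)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def most_similar_pair(recipes: dict[str, set[str]]) -> tuple[str, str, float]:
    """Return the two recipe names with the highest Jaccard similarity."""
    return max(
        ((x, y, jaccard(recipes[x], recipes[y])) for x, y in combinations(recipes, 2)),
        key=lambda triple: triple[2],
    )


# Hypothetical, hand-written data.
RECIPES = {
    "package_cookies": {"flour", "butter", "sugar", "eggs", "chocolate chips", "baking soda"},
    "grandmas_cookies": {"flour", "butter", "sugar", "eggs", "chocolate chips", "baking soda", "vanilla"},
    "mashed_potatoes": {"potatoes", "butter", "salt", "pepper"},
}

print(most_similar_pair(RECIPES))
```

Pairwise scores like these could then feed a hierarchical clustering step, and comparing quantities rather than just names would be needed to spot unusual proportions.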


This is great! It's a wonderful write-up.

I've also made something almost identical - a Go library of recipe scrapers for ingredients [1] and instructions [2]. Instead of the LCA method here, in my version I try to find the longest sequence of highest scoring HTML tags and those are "ingredients" or "instructions". It works very well (although I think this one works better).

Like the article mentioned, I found that the heuristics for finding HTML elements with ingredients turn out to be surprisingly simple - they usually include just a number, a measurement, and a food! This simple heuristic worked better than other sophisticated things I tried.

[1]: https://github.com/schollz/ingredients

[2]: https://github.com/schollz/instructions
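The "number, measurement, food" heuristic can be sketched in a few lines. This is a rough Python illustration, not the code from the linked Go libraries: the unit list, the regex, and the helper name `longest_ingredient_run` are assumptions made for the example, and it works on plain text lines rather than scored HTML tags.

```python
import re

# Assumed, deliberately short list of units; a real scraper would need many more.
UNITS = r"(?:cups?|tbsp|tsp|tablespoons?|teaspoons?|oz|ounces?|lbs?|pounds?|g|kg|ml|l)"

# A quantity (2, 1/2, 1 1/2, 0.5), then a unit, then the rest of the line as the food.
INGREDIENT_RE = re.compile(
    rf"^\s*(?P<qty>\d+(?:\s+\d+/\d+|/\d+|\.\d+)?)\s*(?P<unit>{UNITS})\b\.?\s+(?P<food>\S.*)$",
    re.IGNORECASE,
)


def longest_ingredient_run(lines: list[str]) -> list[str]:
    """Return the longest run of consecutive lines that look like ingredients."""
    best: list[str] = []
    current: list[str] = []
    for line in lines:
        if INGREDIENT_RE.match(line):
            current.append(line.strip())
            if len(current) > len(best):
                best = list(current)
        else:
            current = []
    return best


# Hypothetical page text, one line per HTML element.
PAGE_LINES = [
    "When I was a child, my grandmother...",
    "2 cups flour",
    "1/2 lb butter",
    "1 1/2 tsp baking soda",
    "Mix everything and bake.",
    "1 cup milk",
]

print(longest_ingredient_run(PAGE_LINES))
```

Lines without a unit, such as "2 large eggs", are missed by this pattern, so a practical version would probably score lines rather than require a strict match.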


I saw all the terrible SEOd recipe websites and my first thought was: I should make a better recipe website that is simpler and is better SEOd.

---

FIRST EXAMPLE:

How to cook chicken on a skillet

Step 1 -- get this much chicken [picture]

Step 2 -- cook on skillet for 5 minutes

OPTIONAL -- here are seasonings you may add [pictures]

RELATED:

- How to cook a lot of chicken on a skillet [LINK]

- How to fry chicken breast [LINK]

---

But then I didn't understand how any of these websites are making money so I didn't do it.


The reason all of these websites are so terrible with the long winded intro-stories is precisely because they do better with SEO.


Only if the page is low quality.

Leaving a low quality page after 60 seconds is way better than leaving a low quality page after 5 seconds.


I just started transcribing every recipe I make. Even if you can extract all the essential information from a recipe site, some changes are needed:

- I need to convert recipes to metric. I am neither equipped nor inclined to cook in freedom units.

- A "can" or a "packet" is not a standard unit of measurement.

- Package sizes vary between countries. I often adjust recipes to avoid wasting food.

- I cook by mass, not volume. I convert the units, then round them.

- Instructions are sometimes too verbose. I make them easier to follow while my hands are busy.

- I will make my own changes and I must write them down somewhere.

Besides, sites go down and links break. Food.com broke many of my bookmarks a few years ago. Other sites went dark. My recipes are plain text. They are editable, searchable, and available offline.
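The "convert, then round" step for mass and oven temperature might look something like this. It is a minimal sketch: the function names and the choice to round to the nearest 5 are assumptions for the example, while the pound and ounce factors are the standard definitions.

```python
# Exact definitions of the avoirdupois ounce and pound in grams.
GRAMS_PER_UNIT = {"oz": 28.349523125, "lb": 453.59237}


def to_grams(amount: float, unit: str, round_to: int = 5) -> int:
    """Convert ounces or pounds to grams, rounded to the nearest `round_to` grams."""
    grams = amount * GRAMS_PER_UNIT[unit]
    return round_to * round(grams / round_to)


def fahrenheit_to_celsius(degrees_f: float, round_to: int = 5) -> int:
    """Convert an oven temperature to Celsius, rounded to the nearest `round_to` degrees."""
    degrees_c = (degrees_f - 32) * 5 / 9
    return round_to * round(degrees_c / round_to)


print(to_grams(1, "lb"))           # 455
print(to_grams(8, "oz"))           # 225
print(fahrenheit_to_celsius(350))  # 175
```

Volume units and "a can of" still need per-ingredient judgment, which is presumably why transcribing by hand remains attractive.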


I wish I had the willpower to do this consistently...


Hey Ben, thanks for that write up! You may not have time for this, but your article and the intersection of food/recipes and computer science would make a good book, at least I would read it.

I wrote [1] about 12 years ago in Clojure because for health reasons I had to track my intake of vitamin K, then decided to track all nutrients in the USDA nutrition database. I am working on a semantic web product (with another semantic product in planning) but maybe the end of this year will get to rewriting my food web app in Common Lisp and as a macOS app. I am adding a link to your article and these comments here to my notes for that project. Useful stuff.

[1] http://cookingspace.com


Neat write-up, and thanks for putting me on to jsonld.js - looks useful.

I'm building https://simplescraper.io and we're trying to create heuristics to update CSS selectors whenever a website changes. People become unhappy when a scrape task that ran smoothly on Monday suddenly returns nothing on Tuesday so while it's a tough nut to crack it's super important.

We use a combination of XPath, historical data and data type (the value may change but the type and length often remain the same or similar) to narrow down the options.

Of course there's more sophisticated methods using Machine learning etc. but it's fun to try different approaches to solve this problem.
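To make the "type and length often remain the same" signal concrete, here is a rough sketch of how it could be scored. This is not how simplescraper works internally; the function names, the three value types, the scoring rule, and the sample selectors are all assumptions for illustration.

```python
import re


def value_type(text: str) -> str:
    """Classify a scraped value very coarsely as number, url, or text."""
    text = text.strip()
    if re.fullmatch(r"[-+]?\d+(\.\d+)?", text):
        return "number"
    if re.fullmatch(r"https?://\S+", text):
        return "url"
    return "text"


def similarity(old: str, new: str) -> float:
    """Score 0.0 if the types differ, else the ratio of shorter to longer length."""
    if value_type(old) != value_type(new):
        return 0.0
    longer = max(len(old), len(new))
    return 1.0 if longer == 0 else min(len(old), len(new)) / longer


def best_candidate(last_known: str, candidates: dict[str, str]) -> str:
    """Pick the selector whose current value most resembles the last known value."""
    return max(candidates, key=lambda selector: similarity(last_known, candidates[selector]))


# Hypothetical selectors and values after a site redesign.
CANDIDATES = {
    "div.title": "Mashed potatoes",
    "span.score": "4.7",
    "a.more": "https://example.com/more",
}

print(best_candidate("4.5", CANDIDATES))  # span.score
```

In practice this would be one signal among several, combined with the XPath and historical data mentioned above.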


In 2011, Google released "Google Recipe Search", with filtering based on ingredients, cook time, and calories.

https://www.wired.com/2011/02/google-recipe-semantic/

https://latimesblogs.latimes.com/technology/2011/02/google-d...


I personally just find recipes, make it as written from the website, and then (if I actually like it), I'll convert it to be sane for actually following and output into Apple Notes.

What I mean by that is most recipes call for using wwwaaayyy more intermediary bowls/plates than actually required (e.g. if spices, chopped veggies, and minced garlic are going into the pot at the same time, there's no point in using three bowls) or list ingredients out of order of how you'd actually use them.


So far the best way I've found to search for recipes is to search in a foreign language. Translate what you're looking for, then search and translate back to English. There are still recipe blogs, but 5 instead of 5,000, and usually an authentic dish, not what Michelle The Stir Fry Queen From Michigan thinks constitutes a "Moroccan" dish because it has cinnamon and tomatoes.

Would love to see someone put together a search engine that excludes recipe blogs and penalizes SEO.


This is pretty interesting. I wonder how the recipe parsers from MyFitnessPal or Pinterest compare to this. Sometimes I think they do pretty good, but often they do miss the mark. My guess is on Pinterest they only treat something as a Recipe if it contains the metadata mentioned in the article, and do the easy parse if so. MFP seems to try something a bit more advanced, but I've never been super-impressed with its parsing abilities.


This is great. I made a similar product at No Nonsense Recipes https://nononsense.recipes because I was also tired of dealing with all the dreck on recipe sites. I did scrape some recipes to seed the site with but haven't integrated it as a feature yet.

I did ignore the photos though, since while recipes are not subject to copyright, photos are.


Off-topic, but I just wanted to mention that Ben's been one of my favorite 'teachers' in YouTube. He has some quality content on React and JS stuff. For those wanting to learn React (including some advanced stuff), check out his channel! And no he didn't pay me to post this here. Hey thanks Ben - I know a bit of React and have used it on a few projects thanks (also) to you.


Any recommendations for a js lib that does all the "easy" scraping (microdata, og tags, jsonld, etc)?


The article recommended one.

> There are libraries like https://github.com/digitalbazaar/jsonld.js/ to parse JSON-LD + Microdata for you.
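For the JSON-LD case specifically, the "easy" scraping can also be sketched without a library. The example below is in Python rather than JS and uses only the standard library; the sample page and the class and function names are made up, while `@type`, `@graph`, and `recipeIngredient` are the schema.org/JSON-LD names. It does not handle Microdata or og tags.

```python
import json
from html.parser import HTMLParser


class JsonLdCollector(HTMLParser):
    """Collect the raw contents of <script type="application/ld+json"> tags."""

    def __init__(self) -> None:
        super().__init__()
        self._in_jsonld = False
        self._buffer: list[str] = []
        self.blocks: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            self.blocks.append("".join(self._buffer))


def is_recipe(node: dict) -> bool:
    node_type = node.get("@type")
    return node_type == "Recipe" or (isinstance(node_type, list) and "Recipe" in node_type)


def find_recipes(html: str) -> list[dict]:
    """Return every JSON-LD node on the page whose @type is Recipe."""
    collector = JsonLdCollector()
    collector.feed(html)
    recipes = []
    for block in collector.blocks:
        try:
            data = json.loads(block)
        except json.JSONDecodeError:
            continue  # skip malformed markup instead of failing the whole page
        if isinstance(data, dict):
            nodes = data.get("@graph", [data])
        elif isinstance(data, list):
            nodes = data
        else:
            continue
        recipes.extend(n for n in nodes if isinstance(n, dict) and is_recipe(n))
    return recipes


# Hypothetical page for illustration.
PAGE = """
<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Recipe", "name": "Mashed potatoes",
 "recipeIngredient": ["1 lb potatoes", "1/2 lb butter"]}
</script>
</head><body><p>Five pages about me...</p></body></html>
"""

for recipe in find_recipes(PAGE):
    print(recipe["name"], recipe["recipeIngredient"])
```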


I thought that's what this blog post was going to be about but it's just an ad for their app. I just need the scraping functionality.


While they do end with pushing their product, I think they did a good job of outlining how they scrape the recipes. They inform the reader about json+ld, microdata, and how to scrape the sites that don't use those. They even link to a JS lib that handles the parsing for you. I think calling it "just an ad" is inaccurate.

> There are libraries like https://github.com/digitalbazaar/jsonld.js/ to parse JSON-LD + Microdata for you.


In Python I use the "extruct" package from the scrapy people. It's not very good with syntax errors in the markup.


A surprisingly good UX for recipes is Google Home. Ask it for a recipe, and it will ask if you want directions or ingredients. If you ask for ingredients, it will say them one by one, and pause between them until you ask it for the next one. My son has used it to great effect to make pancakes.


Really nice! I often copy and paste recipes into text files I have locally so this is a great alternative.

One feature request (if I may be so bold): it would be great to offer an imperial<->metric converter. This is predominantly one of the reasons I keep copies of recipes I find and use.


It's really the conversion to weight (grams, from cups/tbsp/tsp/hogsheads) that would be valuable. It's just so much easier to clean up when you can stick a scale under the mixing bowl.
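Converting volume to weight needs a density per ingredient, which is the hard part. A minimal sketch, assuming US cups and a tiny hand-written table: the grams-per-cup figures below are rough, commonly quoted approximations and should be treated as assumptions, as should the function name.

```python
from fractions import Fraction

# US customary volume units expressed in cups.
CUPS_PER_UNIT = {"cup": Fraction(1), "tbsp": Fraction(1, 16), "tsp": Fraction(1, 48)}

# Approximate grams per US cup; assumed values that vary by brand and packing.
GRAMS_PER_CUP = {"water": 237, "all-purpose flour": 120, "granulated sugar": 200}


def volume_to_grams(amount: Fraction, unit: str, ingredient: str) -> int:
    """Convert a volume measure of a known ingredient to whole grams."""
    cups = Fraction(amount) * CUPS_PER_UNIT[unit]
    return round(cups * GRAMS_PER_CUP[ingredient])


print(volume_to_grams(Fraction(9, 4), "cup", "all-purpose flour"))  # 270
print(volume_to_grams(2, "tbsp", "granulated sugar"))               # 25
```

Anything missing from the density table simply raises a `KeyError` here, which is arguably the honest answer for an unknown ingredient.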


this is coming soon!


I've been working on something similar for the past couple of days, but the trouble comes with wanting static types. There are a few projects out there that offer either a microdata parser or types derived from schema.org, but nothing that combines the two as yet.
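As a rough idea of what the typed side could look like, here is a small Python sketch that maps an already-parsed schema.org Recipe node onto a dataclass. The `Recipe` class and `recipe_from_schema` helper are hypothetical names; only `name`, `recipeIngredient`, `recipeInstructions`, and the `text` property of a step come from schema.org.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Recipe:
    """A deliberately small, typed subset of a schema.org Recipe."""
    name: str
    ingredients: list[str] = field(default_factory=list)
    instructions: list[str] = field(default_factory=list)


def _as_list(value) -> list:
    """schema.org properties may hold one value or many; normalize to a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def recipe_from_schema(node: dict) -> Recipe:
    """Build a Recipe from a parsed JSON-LD or microdata node."""
    steps = []
    for step in _as_list(node.get("recipeInstructions")):
        # Steps appear either as plain strings or as objects with a "text" property.
        text = step.get("text", "") if isinstance(step, dict) else str(step)
        if text:
            steps.append(text)
    return Recipe(
        name=str(node.get("name", "")),
        ingredients=[str(item) for item in _as_list(node.get("recipeIngredient"))],
        instructions=steps,
    )


# Hypothetical parsed node.
NODE = {
    "@type": "Recipe",
    "name": "Mashed potatoes",
    "recipeIngredient": ["1 lb potatoes", "1/2 lb butter"],
    "recipeInstructions": [
        {"@type": "HowToStep", "text": "Steam the potatoes."},
        "Mash with the butter.",
    ],
}

print(recipe_from_schema(NODE))
```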


I’ve been working on this and will have a recipe-specific solution up in a couple weeks. See https://rcpe.io



Ha! I should totally know better, but for a second, I mixed up .com and .net and thought, "Ben has a blog? And posted about recipes? Did he pull them into his 8 bit computer or something??" I didn't realize my mistake until I clicked.


This is pretty interesting, I wonder if this meta could be reused for tutorials of any kind (and not only of food, a.k.a. recipes). A tutorial normally has some requisites, and then step by step guide of how to achieve it, and then the final result.


I did something similar a while ago. I still have a DB with half a million recipes somewhere. I didn't continue it because I got stuck with the client side and I didn't find anyone interested in helping me.


Is there any recipe tool out there that can do at least one of the following:

1) Scale the quantity of ingredients and cooking time as number of people to be served increases?

2) Tell me what dishes I can make with the ingredients I have?


I enjoyed the recipe scaling abilities of Gourmet: https://thinkle.github.io/gourmet/
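Scaling the quantities themselves is plain arithmetic once ingredients are structured. The sketch below is not how Gourmet does it; the tuple layout and function name are assumptions, and it leaves cooking time alone because that generally does not scale linearly with servings.

```python
from fractions import Fraction

# (quantity, unit, name) tuples; a hypothetical structure for the example.
Ingredient = tuple[Fraction, str, str]


def scale(ingredients: list[Ingredient], from_servings: int, to_servings: int) -> list[Ingredient]:
    """Multiply every quantity by to_servings / from_servings, keeping exact fractions."""
    factor = Fraction(to_servings, from_servings)
    return [(qty * factor, unit, name) for qty, unit, name in ingredients]


MASH_FOR_4 = [(Fraction(1), "lb", "potatoes"), (Fraction(1, 2), "lb", "butter")]

for qty, unit, name in scale(MASH_FOR_4, 4, 6):
    print(qty, unit, name)
```

Using `Fraction` keeps results like 3/4 readable instead of drifting into long decimals.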

You can also filter and search by ingredient, but that might be somewhat simplistic depending on what you had in mind.
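A simplistic version of "what can I make with what I have" is a subset check over ingredient names. This sketch assumes recipes are already reduced to sets of normalized names; the data and function names are invented, and real matching would need to handle synonyms and quantities.

```python
def cookable(recipes: dict[str, set[str]], pantry: set[str]) -> list[str]:
    """Recipes whose ingredients are all in the pantry, sorted by name."""
    return sorted(name for name, needed in recipes.items() if needed <= pantry)


def missing(recipes: dict[str, set[str]], pantry: set[str]) -> dict[str, set[str]]:
    """For each recipe, the ingredients still to buy."""
    return {name: needed - pantry for name, needed in recipes.items()}


# Hypothetical data.
RECIPES = {
    "mashed potatoes": {"potatoes", "butter", "salt", "pepper"},
    "pancakes": {"flour", "eggs", "milk", "butter"},
}
PANTRY = {"potatoes", "butter", "salt", "pepper", "flour", "eggs"}

print(cookable(RECIPES, PANTRY))
print(missing(RECIPES, PANTRY)["pancakes"])
```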


Pleasantly surprised to learn that most recipe sites include structured metadata. Makes sense given the combination of a relatively straightforward schema, and SEO incentive from Google.


I've been using Tasty. Quick videos showing all the steps and how it's supposed to look along the way. That's the only way I can accept recipes anymore.


This is pretty awesome. I'm currently working on a data pipeline to demonstrate recipe scraping with kafka streams. This is going to be a big help in part of it.


Googling for recipes drove me to install ad-blocker. Have to say I never considered how google created recipe card featured snippets -- cool stuff!


Almost every comment on this page is helpful and from people's direct experiences. Wonderful :)

Thank you everyone for all of this information!


I wish I had this when I first started cooking! I love this concept, but wouldn't this also harm the creator's traffic???


This case is a perfect 'recipe' for reinforcement learning. Let me know if you want help here.


I considered going down the ML route, but didn't know where to start. I'd love to hear how you would approach it.


Neat! I am interested in developing REST API around it to support more functionality, wanna collaborate?


Next level : Shazam for cooking shows.


That'd be sweet for YouTube videos! I watch so many food/cooking YouTube videos.


That's the point : Similar to Shazam for music, you would build a mobile app that would use voice recognition to identify the cooking show and give you the recipe. Additionally you could find a restaurant that makes something similar and offer to have it delivered for a fee.


Hey Ben, if you read this, thanks for your helpful and entertaining youtube videos!


There is markup specifically for recipes. I wonder why it isn't more often used.

EDIT: Yes, the article mentions it, but doesn't give a clue why it isn't more prevalent.


Probably to make stuff like this less effective - the person who posted the recipe only makes money from it when people actually visit the page and see the ads.


People publish recipes without mark-up? Well, that's a waste of time.


So, Google actually encourages an open semantic web? That is news.


It's been the case for quite a while. However, it's mostly useful if you have a huge web crawl, elsewise discoverability is a bit poor.


Another tool for difficult-to-scrape sites is OCR. There are a few decent free/opensource options available:

https://source.opennews.org/articles/so-many-ocr-options/


Be careful with this, some recipes are subject to copyright law. I think you can list ingredients of a recipe with no problem, but once you get to exact measurements and prep it somehow switches over to falling under copyright law. There used to be a bunch of open sourced recipe repos/databases...but almost all of them are gone.


Among other things, I am a cookbook author, so I know a fair bit about this.

Ingredient amounts are not subject to copyright protection. Any prose (intros, descriptions, instructions) is covered by copyright the same way that any other book would be. So, yes, this kind of activity is likely in violation of copyright.

Let me also say that I find it a bit insulting that people who make a living creating IP (software) would be happy to disrespect the IP of these recipe authors. By taking the recipes from outside of the revenue source (a book, a banner ad, a cookie, whatever), you are stealing from the author and the publisher.

I make 25 cents on every book that is sold. So I don't actually care if you steal from me. The money is tiny. But it is a bit insulting when people - in my own living room, reading my copy of my book - decide that they want to take a picture of a recipe instead of buying a copy. It devalues the hundreds of hours and thousands of dollars that I sunk into the creation, cooking, and photography of the book.

So here's the moral of the story. Waiters should leave good tips because they know that waiters depend on tips. And IP creators should know better than to steal IP.


> I think you can list ingredients of a recipe with no problem, but once you get to exact measurements and prep it somehow switches over to falling under copyright law.

In the US, recipes are not protected by copyright, including precise measurements and instructions for how to combine them.

If you write sufficiently creative instructions those may be protected by copyright, but I can get around that by simply not replicating those instructions. I can give exactly the same list of ingredients and a less-creative set of instructions for the same dish without infringing your copyright.

Copyright.gov has a circular which explains this with a couple examples [1].

[1] https://www.copyright.gov/circs/circ33.pdf


Can you provide links to those open source recipe resources?

Aren't there dumps of the bygone ones anywhere?


Is this post illegal as it contains information on how to commit crimes such as copyright infringement?


Is the instruction manual for a photocopier illegal?


Even if the action was a crime, discussion about an illegal action, including instructions on how to do it, is not illegal. It might be illegal in very exact situations where you are giving information directly to a person you believe will use that information to commit a crime, but sharing the information in general as a topic of discussion is protected (at least in the US; laws do differ in other countries, but I find the US law the best one to go by).


Copyright does not protect recipes. It does more likely than not protect the images being scraped, though.


Of course not, as others have said. But, did you create this account just to post this comment?



