How I Made Porn Video Streaming More Efficient with Python and C (toptal.com)
267 points by crm416 on May 3, 2013 | hide | past | favorite | 112 comments



I sympathize with why the mods changed the title from "Porn" to "Video", but now you've made it so that HN users are unwittingly clicking into an article that talks about porn, in such a way that could be detected by a workplace firewall. That's not an ideal usability decision. Why not use good ol square brackets to maintain the integrity of the title:

How I Made [Porn] Video Streaming 20x More Efficient with Python and C


Or the other way around

How I Made Porn [Video Streaming] 20x More Efficient with Python

as technically the editor notes are bracketed.


>but now you've made it so that HN users are unwittingly clicking into an article that talks about porn, in such a way that could be detected by a workplace firewall.

If you get heat about things like that in your workplace, then run, don't walk.


If your firewall flags up the word "porn", won't it also flag up HN for having "porn" in the title of a listed article?


I think your sympathy is overapplied here... that's very clearly an important bit of information for anyone considering clicking on this link. Thanks for putting the warning up for those of us who would not otherwise have known thanks to a poor title edit.


Why do mods change titles of posts? This is frequently complained about on HN, and seems unfortunate.


Mostly they make things better, and nobody notices. Sometimes they make things worse, and people complain.


>>Mostly

I don't mean to be a hardass, but... citation needed.

Seriously, we have no data on the number of mod edits made vs. complaints voiced. As such it seems premature to conclude that mods mostly do a good job with these arbitrary edits.


Agreed that RTMP is an abomination that needs to be exorcised from the Earth as soon as possible. Unfortunately, it is probably here to stay until something like WebRTC gains critical mass.

It's not clear what the article means by "repackaging" a stream or "pointers" to tags (especially in the diagram that shows tag pointers being transported between users). While RTMP is cumbersome, shoving media data (tags) under a per-session protocol header is essentially the standard way of moving data from one session to another.

So I'm not really following this. Is this cutting out the RTMP entirely for receiving clients, and instead sending the FLV down via another transport, like HTTP or whatever? Or is it more of, "I wrote my own RTMP streaming server in C and Python", along with some implementation details which I'm not understanding? (Not that there's anything wrong with doing so. Options are limited in the streaming-server space.)


Author of the original post here.

>It's not clear what the article means by "repackaging" a stream or "pointers" to tags (especially in the diagram that shows tag pointers being transported between users).

By repackaging I meant extracting the FLV tags which pretty much travel in the same format as in an FLV stream (.flv file) if memory serves well. Pointers to tags refer to the internal implementation. I took the FLV tags out of the RTMP stream, which resulted in an almost complete FLV stream. With the proper header prepended to it you could save it as a file and play it or stream it and play it. That's just what I did, created a header for every new user, and after that was sent I could just stream the FLV tags from a common buffer. The users had pointers that pointed in this buffer so after the header is sent it was true multicasting.
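
A rough sketch of the shared-buffer idea described above, not the original code. The names `TagBuffer` and `Viewer`, the sequence-number cursor, and the fixed buffer size are all assumptions made for illustration; the real server was written in C and Python and its internals are not shown in this thread.

```python
from collections import deque


class TagBuffer:
    """Hypothetical shared buffer holding the most recent FLV tags."""

    def __init__(self, max_tags=256):
        self.tags = deque(maxlen=max_tags)  # oldest tags fall off the left
        self.next_seq = 0                   # sequence number of the next incoming tag

    def append(self, tag: bytes) -> None:
        self.tags.append(tag)
        self.next_seq += 1

    def read_from(self, seq: int):
        """Return (tags with sequence number >= seq, new cursor position)."""
        oldest = self.next_seq - len(self.tags)
        start = max(seq, oldest)            # a slow viewer skips tags already dropped
        pending = list(self.tags)[start - oldest:]
        return pending, self.next_seq


class Viewer:
    """Hypothetical per-user state: one cursor into the shared buffer."""

    def __init__(self, buffer: TagBuffer):
        self.buffer = buffer
        self.cursor = buffer.next_seq       # join at the current stream position
        self.header_sent = False

    def next_chunk(self, header: bytes) -> bytes:
        out = b""
        if not self.header_sent:            # the header is the only per-user data
            out += header
            self.header_sent = True
        tags, self.cursor = self.buffer.read_from(self.cursor)
        return out + b"".join(tags)
```

Every viewer reads the same tag bytes from one buffer; only the cursor and the "header sent" flag are per user, which is the property the comment calls multicasting.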

> So I'm not really following this. Is this cutting out the RTMP entirely for receiving clients, and instead sending the FLV down via another transport, like HTTP or whatever? Or is it more of, "I wrote my own RTMP streaming server in C and Python", along with some implementation details which I'm not understanding? (Not that there's anything wrong with doing so. Options are limited in the streaming-server space.)

Yes. From the source RTMP stream I extract the FLV tags, which I could use to multicast. Sending the same RTMP stuff to every user would not work, but I can easily send the FLV tag stream over HTTP if I send the crafted header first.
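
For readers wondering what a "crafted header" might contain: a minimal sketch of the fixed 9-byte FLV file header followed by the initial zero "previous tag size" field, as laid out in the FLV file format. The function name `flv_header` is made up here, and whatever else the original server sent per user (metadata, codec configuration tags) is not described in this thread, so this covers only the file header itself.

```python
import struct


def flv_header(has_audio: bool = True, has_video: bool = True) -> bytes:
    """Build an FLV file header plus the initial PreviousTagSize0 field."""
    # Flag bits: 0x04 means audio tags are present, 0x01 means video tags are present.
    flags = (0x04 if has_audio else 0) | (0x01 if has_video else 0)
    # Signature "FLV", version 1, flags, then the header length (9) as a big-endian uint32.
    header = b"FLV" + struct.pack(">BBI", 1, flags, 9)
    # Each tag is followed by a 4-byte size of the previous tag; before the first tag it is 0.
    return header + struct.pack(">I", 0)
```

After these 13 bytes a player expects a sequence of FLV tags, which is where the tags extracted from the RTMP stream would follow.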

I hope that helped


> I extract the FLV tags, which I could use to multicast

I assume you don't really mean IP multicast (which would, incidentally, be one approach to edge-origin mirroring for the popular marketing campaigns you mentioned, at least within a data center).

Anyway that makes sense, simply sending the FLV is clever. What's the method of delivery -- chunked HTTP? Does it play well with proxies?


No, not IP multicast. Imagine it as a web server that serves FLV files which always start at the current stream position. This is what it did.

Actually bringing up the proxy issue is interesting. I'm not sure if it does. Essentially forbidding all caching would make it play nice.
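
A sketch of what "forbidding all caching" could look like at the HTTP level, assuming a plain HTTP/1.1 response with no Content-Length whose body ends when the connection closes. The helper name `live_flv_response_head` is hypothetical, and whether the original server sent these exact headers is not known.

```python
def live_flv_response_head() -> bytes:
    """Response head for an open-ended FLV stream that intermediaries should not cache."""
    lines = [
        "HTTP/1.1 200 OK",
        "Content-Type: video/x-flv",
        # Ask browsers and proxies not to store or reuse the response.
        "Cache-Control: no-store, no-cache, must-revalidate",
        "Pragma: no-cache",   # for HTTP/1.0 caches
        "Expires: 0",
        # No Content-Length: the body is delimited by closing the connection.
        "Connection: close",
    ]
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")
```

A proxy that buffers whole responses before forwarding them could still break a live stream regardless of these headers, so this only addresses the caching half of the question.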


Isn't that pretty close to how HTTP Live Streaming works?


You mean the Apple protocol? Not sure, but I looked at that when I designed it. What I did was just the most intuitive thing.


I appreciate the effort the mods take in curating titles, I really do - but please spare articles like this at least?

I clicked on this link at work (part-time at grad-school) and now I have a "how to run a pornographic website faster" link logged in my name.


Oh, please. The article uses the word porn exactly three times in the introductory paragraphs. Four if you count the title.

The comments here have double that, including your own use. If you were genuinely that paranoid about being "logged" for having visited a page that used the word porn (really?!) I doubt you'd be using it yourself.


I apologize if it sounded that way, but I wasn't insinuating that I'm going to get into trouble.

I'm just saying that it could get someone into trouble.

As far as the "logging" goes, it had more to do with the word porn in the title of the page, because many content-filters just parse the title for blacklisted words.


Said Mr. Hard Dick.


hardik is actually a common Indian name.


Harsh.


Yes, and Bigus Dickus is also a common Roman name.

It being common just makes it even more funny.


I love reading articles about technical issues and solutions in the porn industry. It's like getting a peek inside a YouTube-scale company as they grow.


Disregarding the morality and ethics that are often center-stage, they are faced with really interesting technical hurdles. It's an old article, but I was fascinated to hear about some of the challenges they face as well: http://highscalability.com/blog/2012/4/2/youporn-targeting-2...


Are any of the porn sites larger than youtube?


The question isn't unambiguously answered, but there's a ton of good stuff here (first link in the article):

http://www.extremetech.com/computing/123929-just-how-big-are...

"In short, porn sites cope with astronomical amounts of data. The only sites that really come close in term of raw bandwidth are YouTube or Hulu, but even then YouPorn is something like six times larger than Hulu."


I would rather learn if there was a porn site bigger than Netflix, as Netflix currently uses around 30% of US bandwidth [1]

1. http://www.pcmag.com/article2/0,2817,2395372,00.asp


Probably. Netflix has the big constraint of availability that only extends to the US. Porn is global. Even though the US has an above-average bandwidth usage (guesstimated), it's still only a small(ish) drop in the global network.


EU traffic in our case is about 1.2x the US traffic.


No way. One year ago Youtube users uploaded an hour of video every second.


I'm astonished by some of the responses here, that could be summarised thusly:

>"I work at a workplace, in a modern western society, not some theocratic backwater, that monitors my web activity and would frown if I visited an article with the word porn in it. This in 2013. I find this OK, and won't quit my job or raise hell protesting this degrading treatment, but would rather complain about HN titles".

In an age where people fight for LGBT rights, this is what the American workplace has come to?


Unfortunately, the majority of jobs in the United States are run with this kind of degrading treatment. Many even require you to randomly pee into a cup on short notice, to make sure you didn't do any drugs recently. Some of them also monitor your Facebook accounts (as far as permissions allow) to see what you're up to there. A few require you to hand over your Facebook passwords (!) to the boss to make that easier. The corporate world is weird and scary, but not always easy to avoid.


>A few require you to hand over your Facebook passwords (!) to the boss

That was a poorly sourced, probably made-up story that no one could verify, yet it quickly became accepted truth. (Unless it's meant to refer to cases involving heavy security clearances, in which case it's ho-hum.)


It's easy to avoid.


Since when did workplaces become bastions of principles, rather than bureaucracies that have a business to run and have to, at some level, apply policies that suspend an employee's worktime freedom as part of the agreement that involves paying that employee's salary?

I work at a small company where no one would care. They also don't care if I spend time on Facebook, Twitter, etc. etc. But not everyone is in a company like that. Big legacy companies hire from the HN crowd too. Whether or not their policies are fair is not always a simplistic argument. However, what is as sure as rain is that trivial violations of those policies can be used as leverage to punish people, when the office politics get dirty and desperate.

It's best to let the worker -- the HN reader in this case -- make the decision whether he/she wants to pick that fight, rather than have them accidentally stumble into it.


>Since when did workplaces become bastions of principles, rather than bureaucracies that have a business to run and have to, at some level, apply policies that suspend an employee's worktime freedom as part of the agreement that involves paying that employee's salary?

Since people let them get away with it. The "agreement that involves paying that employee's salary" does not mean they should get away with treating him worse than a civilized society accepts.

Signing a contract to work at some place doesn't give the employer any inherent rights over the employee, besides those that the society is willing to accept.

Hitting an employee was once tolerated. Not so anymore. At one time, not hiring blacks or paying them less was allowed. Not so anymore (not explicitly at least). Child labour was allowed. Not anymore. Racial or sexist slurs were allowed. Not anymore. Lax safety at work was tolerated. Not anymore.

So it's not like there is some undeniable inherent right of an employer to "have the employee pee in a cup" or to "check his Facebook profile". It's just that people haven't protested enough to make it into law that it's not his fucking business what the employee does on his own time.

>It's best to let the worker -- the HN reader in this case -- make the decision whether he/she wants to pick that fight, rather than have them accidentally stumble into it.

Sure, but at least some anger should be directed against businesses having those policies, not just on HN titles, as if the policies are OK.


I wish I could up-vote this multiple times.


> In an age where people fight for LGBT rights, this is what the American workplace has come to?

Yes. With a lack of privacy expectations in 2013, any visited website is permanently added to the record. Even with no paranoia, it does add an unnecessary risk to one's career.


It's not that simple when you have a family to provide for. The job you have might be the best job for many miles.

I don't agree with the decision not to protest, and I don't think HN should really care about catering to that demographic, but I understand their motive.


Where exactly is the link between porn and LGBT rights?


The link is that sex (even gay/lesbian etc sex) is not a sin, what people like sexually is not the business of their employer, and if we can accept people doing parades for the right to fuck each other in the ass or similar, we sure should be able to accept people reading the word "porn" on a website at work.

That's the implicit link. The explicit link is that LGBT advocates and activists have frequently defended porn and freedom of sexual expression through it, too.


The overall message is stop worrying so much about the locations of other people's genitals.


>The aggregated bandwidth of the clusters was around 50 Gbps, from which they used around 10 Gbps while at peak load.

Now that is a lot of porn.

Also, the illustrations look really good! How did you make them?


Very glad you like those illustrations. First we created a couple of sketches on paper and then recreated them in Photoshop.


Hey this is awesome! Though the admin/moderator changed the title for some unknown reason from the title of the blogpost to their own.


Of course, because of snobbery. HN visitors should be protected from words which start with "p" and end with "orn".


"HN visitors should be protected from words which start with "p" and end with "orn"."

popcorn? preworn?


  % egrep '^p.*orn$' /usr/dict/words


I think you mean /usr/share/dict/words.


Yep. That's what I get for trying to be clever.


Some distros store the words file in /usr/dict/words. Debian (and Ubuntu) use /usr/share/dict/words.
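
A small Python equivalent of the egrep one-liner that tries both locations mentioned in this subthread. The function names are made up for illustration, and the list of candidate paths is an assumption that covers only the two paths named here.

```python
import re
from pathlib import Path

# Candidate locations named in this thread; other systems may differ.
CANDIDATES = ["/usr/share/dict/words", "/usr/dict/words"]
PATTERN = re.compile(r"^p.*orn$")  # same pattern as the egrep command


def find_words_file(candidates=CANDIDATES):
    """Return the first candidate path that exists as a file, or None."""
    for path in candidates:
        if Path(path).is_file():
            return Path(path)
    return None


def matching_words(lines, pattern=PATTERN):
    """Return the stripped lines that match the pattern, in input order."""
    return [word for word in (line.strip() for line in lines) if pattern.search(word)]
```

To run it against the system dictionary, open the path returned by `find_words_file()` (if it is not None) and pass the file object to `matching_words`.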


Grey Worm. From Unsullied.


Like peppercorn. Wendy Peppercorn.


"Porno" sounds so much worse.


Trigger-happy, I guess. Given that "Porn" definitely implies a much larger scale than merely "video streaming", I think the title change does a disservice.


It was kinda link-baity and vague...


jdc... it's the name of the actual article...


These two statements are independently true.


For what it's worth, the new title is more accurate and more likely to get me to click.


This happens quite a lot / has become standard on HN (since sometime in the last year).


If anyone is interested in an alternative to the usual RTMP servers (FMS, Red5, and Wowza), I highly recommend EvoStream (http://www.evostream.com). Compared to the alternatives, EvoStream is much more efficient. I believe TinyChat published a whitepaper discussing their transition from Red5 to EvoStream, which resulted in a decline in the number of required servers.

EvoStream is a highly scalable streaming media server written in C++ based off of the open source RTMPD (http://www.rtmpd.com). The commercial company, also called EvoStream, is a relatively new startup and they do great custom work for those not familiar with streaming media/RTMP.


It's easy to underestimate the power of switching to a better language by just doing it - guess at the syntax until it works, then refactor as you start to understand the language and its culture more. In fact I've found I learn faster this way than any other.


Is this similar to what Wowza (wowzamedia.com) does?

In terms of licensing, it is $55/mo/instance or $995 for a one-time license.

There are also EC2 instances that start at 15 cents per hour.

Much less expensive than FMS


You also have Red5 which is open source and rocks very much. It is just a -tad- less reliable than Wowza which you can't really sell to paying customers, but it's much more fun and customizable.

And to answer your question, Wowza is just a cheaper FMS which also supports other platforms next to RTMP. This article is basically taking the more performant HTTP download mechanism for static content (like YouTube uses) and then hacking it to put a live stream in it instead.


I got hit with a firewall and a "This event will be reported".

Thanks, HN Moderator.


Strange how 3/4 of the comments are more concerned with the title link, instead of the actual content.

Very nice post, was an interesting read!


To be fair I clicked on the article and immediately came back to the comments expecting the absolutely inevitable sprawl of comments about the title edit. :P

So not strange, methinks. :P


LOL at everyone who works at places that fear that the company firewall saw them read an article that had the word "Porn" in it.

Hopefully you get in trouble and fired, it will help your life in the long run.


As someone who built the infrastructure for serving porn for Kink.com, I'd say that this was a total waste of time. Spend the money on a third party CDN and serve from there.


The article suggests that this is for live streaming shows. Would a CDN based approach work in this use case?


Take a look at KinkLive.com. We were the first porn company to do live streaming in HD using a CDN (Bitgravity), all paid for, by the minute, with a micro currency system (kinks) that we built.


Very interesting! Thanks for your reply. Have you ever posted or discussed your infrastructure before?


I've posted about small parts of it in various places, but not the whole experience. I no longer work there (since April 2010), so while I know they still use quite a bit of the serving infrastructure that I built, my knowledge is now quite a bit out of date.

One fun bit that I built is called the cockblocker (as you can imagine, we used all sorts of fun names for internal projects). People who repeatedly attempt to hack the system (usually through various forms of abuse like failed login attempts) would automatically get their IP address routed to /dev/null.
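
To illustrate the general idea, here is a sketch of a sliding-window counter that flags an IP after repeated failed logins. It is not the Kink.com code: the class name `LoginBlocker`, the thresholds, and the in-memory storage are all assumptions, and the actual null-routing would happen in a firewall or routing table rather than in Python.

```python
import time
from collections import defaultdict, deque


class LoginBlocker:
    """Hypothetical tracker: block an IP after too many failures within a time window."""

    def __init__(self, max_failures=5, window_seconds=300):
        self.max_failures = max_failures
        self.window = window_seconds
        self.failures = defaultdict(deque)  # ip -> timestamps of recent failures
        self.blocked = set()

    def record_failure(self, ip, now=None):
        """Record one failed login; return True if the IP is blocked afterwards."""
        now = time.time() if now is None else now
        recent = self.failures[ip]
        recent.append(now)
        # Forget failures that are older than the window.
        while recent and now - recent[0] > self.window:
            recent.popleft()
        if len(recent) >= self.max_failures:
            self.blocked.add(ip)  # a real system would push this to the firewall
        return ip in self.blocked

    def is_blocked(self, ip):
        return ip in self.blocked
```

In this sketch a block is permanent once set; expiring blocks after some time would be a natural extension.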


With these amounts of bandwidth, does that still make sense? Sounds like this was a one-man job that worked.


I'm not really sure what you mean, but absolutely using a CDN product is the way to go when you want to sell your content on a global scale. Otherwise, you are going to deal with having to build / maintain your own and the cost of doing that is far higher than just letting other experts deal with the issues that surround it.


I've always been curious, how has working for a porn company affected your social life? Has it at all?


I worked behind the camera, not in front of it. Sorry, I'm not interested in satisfying your curiosities.


This title and the article's title don't quite match...


Submitted title matched the page, but it's been changed by the mods


What service do you use for flowcharts?


Is there an open source project that accomplishes something similar?


There is an open source Nginx module that streams RTMP and creates HLS segments, and it looks promising: https://github.com/arut/nginx-rtmp-module

There are also red5 (http://www.red5.org/) and rtmpd (http://www.rtmpd.com/)


I know others already said this, but I clicked on this link from work and immediately became appalled when realizing it was about porn and quickly backed out. I'd gladly read this from the comfort of my own home and I'm sure the content itself is SFW, but still the point is I was misled. Can we please [mods] not change titles in cases like this?


Don't understand. You can browse articles for leisure at work. So what was the problem with this article?


What is the association with this site toptal.com? Has anyone here ever worked with them before?


Hey Jeremy, this is our engineering blog.


Thanks. This is the only article on the blog? I was actually intrigued by the site and was wondering if anyone on HN has used this service as a dev or a client....


OP here, and Director of Community at Toptal. This is the first live article, but we've got loads more in progress covering a bunch of different projects and technologies.


Ballsy to lead with an article about porn. I'm glad you did because I know they must face scaling challenges that only a handful of services have to deal with, but rarely do I see articles about how they overcome the challenges. Thanks.


I'm glad you liked it, it was a lot of fun. I also implemented on the fly transcoding, but that's probably for a different article.

For the tech, well really at that place we used pretty much what everyone else did. Not a lot of big secrets, as far as I know the porn driving the tech revolution thing is ridiculous.


I was in talks with them as a dev. Their requirements were a bit weird, so that's why I didn't sign a contract in the end. I think there was one requirement which asked that the developer answer the employer's message within a pretty short time frame, no matter the timezone difference.


Hey Elbear, that's not in our contracts, at all. We do make sure our developers communicate within 10 hours regardless of the timezone. We feel that is more than reasonable and it's worked extremely well for Toptal. However, that's not in our contract, it's actually something we simply stay on top of as a company. Our requirements are not "weird" at all, they enforce high integrity, and in most places in the world, the concept of high integrity is "weird". The type of people who will not conform to such standards are precisely the types of individuals who we would never want to work with in a million years, and that is precisely why freelancing platforms have such a pervasively bad reputation.. because they're filled with low or even medium integrity individuals. To us, that's unacceptable. As it is to any A-player team. -Taso, CEO, Toptal


You're right that it wasn't in the contracts. It was in an internal rulebook for developers. Too bad I didn't keep it, so I could quote from it exactly.

Anyway, my final experience with TopTal was that I asked some questions about the contract that I was about to sign and never heard back.


You're lying. Your website says within 10 minutes during work hours and within 3 hours outside work hours.

http://www.toptal.com/developer/requirements


Doesn't this mean that your employees are expected to be on call for 14 hours per day though?


Your version of "on call" is not congruent with ours or most of the world's. When a doctor or DBA is "on call", they have to get up, fix something, and spend hours if not days doing it. We simply enforce communication. If you want to call that "on call" then that's your decision, and we disagree with that definition. To answer your question, yes, if you're responsible (in our subjective definition of what constitutes responsibility), you will answer within around 10 hours. In practically all communication you can reply "I got this, I'll answer you later." or something similar to show responsiveness. The word responsibility stems from "response ability" and we believe that everyone should be... responsible. That's in our DNA.


Wow, you're not wrong.

"For example, if the in-house engineers start their workday at 11am PST, our Eastern European toptal engineers will start working at 7pm EET and work through the night."

"During work hours, they are expected to respond to any client communication within 10 minutes; during non-work hours, they are expected to respond within 3 hours."

http://www.toptal.com/developer/requirements


Jeremy, also to note, we'll be putting out many in-depth articles about a variety of engineering-related topics in the coming weeks.


Great work. I worked with RTMP for some time and I recall the pain to reverse-engineer their protocol. One other comment, maybe the title should be more like: "How I improved a slow/inefficient RTMP video streaming service by 20x". Just a thought.


Would he get the same CTR if he changed the title?


Nothing open source in this article, which was the biggest drawback for me reading this.


Unfortunately that's the case. I heard really good things about wowza these days. Not open source but a lot of things are open and flexible. Also back then red5 was also an ok alternative, not sure if that's still the case.


If you're going to change the title, at least add NSFW to it...


How is it NSFW? It is an article that is about video streaming on scale. Yes it does mention it was for a porn site, but it really does not have much to do with the article. If you work somewhere where this would be viewed as inappropriate I am not sure what can be viewed at that employer.


There's no reason anyone should cater to the consequences of you working at a prudish and illiberal organization.


The network graphics are pretty sharp, anybody familiar with where they came from? Or, are they custom?


Why is the type of content relevant? Perhaps there's something peculiar about porn users?


Because, like it or not, the porn industry dealt with a lot of the technical issues of selling video, pictures, and video streams on the internet before anyone else. They needed it to be profitable / have actual revenue.


Nothing. I can talk about my work on a blog or to a technical person without mentioning once that it's porn related and I've been a programmer in the porn industry for over 10 years. They used it as a way to gain viewers.


Congratulations to whoever messed with the title; I clicked this at work.


I never expected that the story would be about porn.


More efficiently? For two minutes of video? I could see if you were trying to stream Zombieland or Spaceballs... but porn?!

I guess whatever inspires people to innovate is okay with me.


Original writer here.

Actually this was a website which streamed online broadcasts live from people, like ustream does but with adult content. Some broadcasts could be hours long.



