For their category "Sexual Violence", they include both violent terms ("woman being raped", "torture porn"), and non-violent terms ("stepsis"). Those are... very different types of pornography, and exist in very different quantities.
At the very least, you can determine that a video includes "torture porn" without listening to the audio, but you can't tell that a video includes "stepsis" without listening to the audio. Especially given the disconnect between the ages of the performers, and the roughly expected ages of their characters.
I did switch that—it's not clear if those three words are the only 3 words defining the category. I suspect not, however, because:
1. It then wouldn't include the word "rough", which is far more common and indicative of sexual violence.
2. Elsewhere on the page, the author includes "stepsis" as indicative of incest:
> Later titles are longer, and we start to observe a trend towards both incest ("Daughter", "Stepsis") and violence ("HARD FUCKING", "Fucked ROUGH", "Rough Fuck").
That last quote makes me think that the categories are larger than the 3 examples given, and "sexual violence" includes both the incest and violence terms.
I am the author; it's just those 3 terms for the t-SNE cluster. Sorry, I can tell from some of the comments here that the graphs need to be clearer. "Stepsis" is indicative of incest IMO; the "step" is a fig leaf.
I agree “stepsis” is indicative of incest, but I don’t agree “stepsis” is indicative of violence. And if you’re only using 3 words for “sexual violence”, then why did you go with “incest” instead of “rough”? Those are vastly different kinds of pornography.
The shift may be less about fundamentally new content and more about explicit labeling and marketing of elements that were previously present but implied, i.e., SEO.
SEO contributes for sure, but I would reverse the statement here: it's more about new content and less about SEO. There's a feedback-loop, race-to-the-bottom dynamic regardless.
So I'm actually kind of lost here. As I understand it, your theory is that the anticipation of SESTA/FOSTA caused there to be more professional porn and less amateur porn on Pornhub, which in turn caused a "race to the bottom" in porn titles, along both rough/violent/rapey lines and incest lines at the same time, and furthermore that the titles reflected the actual content?
So you believe SESTA/FOSTA led to an unintended increase in the actual violence you'd see if you watched random videos on Pornhub?
I don't think much of SESTA/FOSTA, but I do think that's kind of a stretch.
These are terrible graphs. Sorry, but how do I read them? There are labels for the clusters (?) provided in the text, but the legend is just called "cluster", so what does it represent? Dates? Sometimes there are more clusters than labels. A good graph is worth thousands of words, but a bad graph is worth a thousand miscommunications.
They literally just represent clusters in a learned embedding vector space that's not necessarily well understood, but is believed to map words or phrases with similar meanings to vectors that point in similar directions in a high-dimensional space. The axes themselves don't have any understandable meanings.
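To make "vectors that point in similar directions" concrete, here is a minimal sketch using cosine similarity. The three 3-dimensional vectors and their names are made up for illustration; they are not from the article's embedding model, which would use far more dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical toy embeddings, NOT real model output.
toy = {
    "title_a": [0.9, 0.1, 0.0],
    "title_b": [0.8, 0.2, 0.1],  # points roughly the same way as title_a
    "title_c": [0.0, 0.1, 0.9],  # points in a very different direction
}

sim_ab = cosine_similarity(toy["title_a"], toy["title_b"])
sim_ac = cosine_similarity(toy["title_a"], toy["title_c"])
print(f"a vs b: {sim_ab:.3f}, a vs c: {sim_ac:.3f}")
```

Titles whose vectors score close to 1.0 against each other would tend to land in the same cluster; the 2-D plot axes are only a projection of those relationships.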
I appreciate you trying to help, but I think you are misunderstanding my complaint. The issue here is that even with embeddings, things get labels. Clusters have labels (organized by color) and individual data have labels. Neither of these is well defined, so it is not super clear what is being said.
(I'm actually well aware of t-SNE. FYI, it is not a great tool to use, and people often conflate it with PCA or dimensionality reductions. Probably fine here because it is concerned with grouping.)
If those dramatic clusters are real (as opposed to t-SNE artifacts or whatever), then there have to have been large step changes in 2010 and 2017 (and not much directional change in the other years).
Just saying "SEO" doesn't explain that. Neither does "professionalization" unless it's rapid. And if either of those did change rapidly, then the interesting question is what caused that change.
A change in, say, Pornhub's policy for allowed titles, or an SEO reaction to an abrupt change in what titles Pornhub's search system tended to surface in response to the average user search, might explain the clusters better than anything else could. But the writeup doesn't mention whether anybody looked for that (correction on edit: it does mention titles stopped being truncated).
Such an abrupt change in what the users were actually looking for seems pretty unlikely.
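One way to check whether the step changes are real rather than t-SNE artifacts would be to measure, in the original embedding space, how far the average title vector moves from one year to the next. The sketch below assumes a hypothetical dict mapping each year to a list of title embeddings; the 2-D numbers are invented, not the article's data.

```python
import math

def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def yearly_shifts(embeddings_by_year):
    """Euclidean distance between centroids of consecutive years."""
    years = sorted(embeddings_by_year)
    cents = {y: centroid(embeddings_by_year[y]) for y in years}
    return {(a, b): math.dist(cents[a], cents[b])
            for a, b in zip(years, years[1:])}

# Hypothetical toy data: a small drift, then an abrupt jump.
toy = {
    2015: [[0.0, 0.0], [0.2, 0.0]],
    2016: [[0.1, 0.1], [0.3, 0.1]],
    2017: [[2.0, 2.0], [2.2, 2.0]],
}

shifts = yearly_shifts(toy)
biggest = max(shifts, key=shifts.get)
print(shifts, "largest shift:", biggest)
```

If the real data showed one or two year-pairs with shifts far larger than the rest, that would support genuine step changes; roughly even shifts would suggest the sharp cluster boundaries come from the projection.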
> Political efforts may have contributed to locking in some aspects of this monetization trend. FOSTA-SESTA and sundry efforts by payment processors to limit porn exposure probably helped improve the conditions of the supply side and prevented videos of minors and rape from being uploaded.
Citation needed. That bit about improving conditions is an extremely contentious claim.
Also, regardless of speculation about unrelated effects of the legislation, the whole correlation looks questionable. SESTA/FOSTA came into effect in 2018, but 2017 is solidly inside the "new" cluster, and in fact 2017 looks like it's the most extreme year in that cluster in terms of its distance from the earlier clusters. As for "sundry efforts by payment processors", Operation Choke Point apparently started up in 2013. It was news in 2017, but it was shut down in August of 2017.
Of course, there could have been anticipatory action on SESTA/FOSTA, or a late ramp-up on Choke Point, but any number of other things could also have happened. I mean, if we're going to pick random political events, maybe porn titles changed because Trump took office in 2017. It's a better timing match.
> That's good! But it created an unintended consequence at the expense of the demand side: professional studios started to emphasize youth and violence.
... or, alternately, if you buy the "professionalization" narrative, then maybe professionals have always used titles like that, and those titles became more dominant when amateurs, who disproportionately tended toward simple, SEO-free descriptive titles, were driven off of the platform by something. Was there an ID checking crackdown in late 2016 or early 2017?
It's not quite "random" to look at the most intentionally directed legislation of several years. What's your explanation? Genuinely curious. One way or another, the clusters do exist, and the trends do exist in both content and titling convention.
My leading guess would be that Pornhub made some technical change to what titles were allowed or promoted, for who-knows-what reason. One possible guess would be that they simply started to promote titles with more descriptors, or more uncommon descriptors, in an attempt to get an easy boost to search specificity.
The timing is wrong for SESTA/FOSTA, and if SESTA/FOSTA was the reason for Pornhub making a change, even in anticipation, then it seems strange for Pornhub to intentionally make a change that would tend to emphasize titles that would increase political heat.
[On edit: ... and as I said, the "professionalization" hypothesis might also have legs as something that happened in response to an ID crackdown... but that wouldn't have to be related to SESTA/FOSTA, and would have had to happen before passage.]
Definitely plausible but it underrates the changes in actual content. It's not just SEO and titling, it's actual videos that have "stepsis" etc. as themes.