Back in the day I worked for Atari, writing game cartridges for their line of home computers. I had a decent relationship with marketing and had developed a reputation as one of the more helpful geeks.
One fine day a marketing guy knocked on my door and asked:
"What would it take to print out every possible eight-by-eight bitmap? We want to copyright them so our competition can't use them."
Seriously.
So I told him the story about the king who wanted to reward the inventor of chess, and upon being asked what he wanted the inventor said "one grain of wheat for the first square, two for the second, four for the third..."
I thought a bit, and added "I think that printout would outweigh the planet."
He went away. I was not a helpful geek that day.
Later I realized that I had only considered black-and-white bitmaps, and that preempting copyright on color bitmaps would have meant a lot more planets.
There's got to be a Douglas Adams style tie-in here somewhere involving aliens with planet-chewer-uppers invading and taking Saturn, Mars, Jupiter and then us for some cockamamie copyright scheme in another galaxy . . .
Once, a guy at a bar told me he wanted to do a similar thing: displaying a 100x100 image that changed once a second to run through all possible bitmaps. He didn't believe me when I told him that doing so would exceed the lifespan of the planet (to say nothing of what state the universe would be in).
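A rough sketch of why the bar claim holds, assuming one bit per pixel (black and white) and one frame per second. The count is too big for a float, so the sketch works with logarithms; for comparison, the universe is on the order of 10^10 years old.

```python
import math

PIXELS = 100 * 100                      # one bit per pixel, black and white
SECONDS_PER_YEAR = 365.25 * 24 * 3600   # Julian year

# 2**PIXELS overflows a float, so track its base-10 logarithm instead.
log10_frames = PIXELS * math.log10(2)
log10_years = log10_frames - math.log10(SECONDS_PER_YEAR)

digits_in_frame_count = math.floor(log10_frames) + 1
print(f"distinct 100x100 bitmaps: a {digits_in_frame_count}-digit number")
print(f"years needed at one frame per second: roughly 10^{log10_years:.0f}")
```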
What I find useful when explaining to people how long some very long count will take is to tell them that 1 billion seconds lasts about 30 years, or that counting to a billion takes about 30 years[0]. Not sure if it would have helped with your guy, of course, but it generally provides some insight :)
[0] I wonder about estimating that time; many numbers may take longer than a second to even pronounce.
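For anyone who wants to check the rule of thumb, a minimal sketch. The `seconds_per_number` parameter is a made-up knob for the footnote's worry about pronunciation time; the value 3.0 below is purely hypothetical.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600   # Julian year, in seconds

def counting_years(limit: float, seconds_per_number: float = 1.0) -> float:
    """Years needed to count to `limit` at a fixed pace per number."""
    return limit * seconds_per_number / SECONDS_PER_YEAR

billion_at_one_second = counting_years(1e9)
billion_at_three_seconds = counting_years(1e9, seconds_per_number=3.0)  # hypothetical pace

print(f"one number per second: {billion_at_one_second:.1f} years")
print(f"three seconds per number: {billion_at_three_seconds:.1f} years")
```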
A series of 12 pairs of reduction gears, each at a 50:1 ratio. On the left, a motor spins at a high rpm; on the right, the drive shaft is embedded into a concrete block. It will be 13.7 billion years before the final gear completes a single rotation.
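The description gives the gear train but not the motor speed, so here is a sketch that relates the two. It computes the motor rpm that the quoted 13.7 billion years would imply, and the duration for a 200 rpm motor, where 200 rpm is only an assumed figure for illustration.

```python
STAGES = 12
RATIO_PER_STAGE = 50
MINUTES_PER_YEAR = 365.25 * 24 * 60

# Motor revolutions needed for one revolution of the last gear.
TOTAL_REDUCTION = RATIO_PER_STAGE ** STAGES

def years_per_final_turn(motor_rpm: float) -> float:
    """Years for the last gear to turn once at the given motor speed."""
    return TOTAL_REDUCTION / motor_rpm / MINUTES_PER_YEAR

def rpm_for_final_turn_in(years: float) -> float:
    """Motor speed at which the last gear turns once in `years`."""
    return TOTAL_REDUCTION / (years * MINUTES_PER_YEAR)

implied_rpm = rpm_for_final_turn_in(13.7e9)
years_at_200_rpm = years_per_final_turn(200.0)   # 200 rpm is an assumption

print(f"total reduction: {TOTAL_REDUCTION:.3e} to 1")
print(f"rpm implied by 13.7 billion years: {implied_rpm:,.0f}")
print(f"years per final turn at 200 rpm: {years_at_200_rpm:.2e}")
```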
I wonder what the system as a whole is rated to. I suspect the motor will give out within another decade. I doubt that the system used a super long life motor, but I could be very wrong.
This is at the MIT museum. I first saw it as a child decades ago and many times since. The motor has been visibly replaced/refurbed and I'm sure they turn it off at night. All that said, when you see the first gear or two zipping around and stand there and stare at the Nth gear trying to see any movement at all it really drives home how these things scale. As a kid I once watched it for over an hour to see if I could see ANY movement in the first gear that looked still.
An 8 by 8 bitmap has 64 pixels. There are 2^64 possible bitmaps, or 18,446,744,073,709,551,616 (about 1.8 × 10^19).
Your average printer can print at 600 dots per inch^2. A4 paper is 11.69in by 8.27in, with an area of 96.6763in^2. You can fit 58005.78 dots on an A4 paper assuming you can print on the edges, double that for both sides at 116011.56 dots.
Ignoring actually fitting the bitmaps on the paper, you can store 1812 bitmaps per sheet of paper. You can roughly fit all possible combinations of bitmaps on 10,176,500,000,000,000 sheets of paper.
Typical office paper weighs 5 grams. Ignoring the weight of ink, your total mass of paper would be 50,882,500,000,000 kilograms of paper. Pluto weighs vastly more at 13,090,000,000,000,000,000,000 kilograms.
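The chain of estimates above can be re-run as a short sketch. Every constant is taken from the comment as-is. In particular, 600 dots per square inch is the comment's assumption; a printer rated at 600 dpi usually means 600 dots per linear inch (360,000 per square inch), so swap the constant if you prefer that reading.

```python
BITMAPS = 2 ** 64
DOTS_PER_SQ_INCH = 600              # the comment's figure, taken as-is
SHEET_AREA_SQ_IN = 11.69 * 8.27     # A4, printed edge to edge
BITS_PER_BITMAP = 64
GRAMS_PER_SHEET = 5
PLUTO_KG = 1.309e22

# Both sides of the sheet are printed.
dots_per_sheet = DOTS_PER_SQ_INCH * SHEET_AREA_SQ_IN * 2
bitmaps_per_sheet = dots_per_sheet / BITS_PER_BITMAP
sheets = BITMAPS / bitmaps_per_sheet
paper_kg = sheets * GRAMS_PER_SHEET / 1000

print(f"bitmaps per sheet: {bitmaps_per_sheet:.2f}")
print(f"sheets needed:     {sheets:.4e}")
print(f"paper mass:        {paper_kg:.4e} kg")
print(f"fraction of Pluto: {paper_kg / PLUTO_KG:.1e}")
```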
You definitely don't need a planet's mass worth of paper, but maybe a couple of planets' worth of production. We'd be more likely to see planet-scale enslavement in a Douglas Adams galaxy-wide copyright mockery scheme, which a galactic court would likely throw out since they don't appreciate mockeries of law, leaving everyone a bit disappointed and with entirely too much visibly grey paper.
Now, for the colored bitmaps case, how much does the ink cost? :-)
We're also ignoring some edge effects, where bitmaps can be "shared" by sweeping an 8x8 frame across a printed page at the level of pixels to generate the patterns. I don't know how to think about the scale of that problem right now, don't even know if it's easy to do optimally or wickedly difficult, or if Knuth has a solution somewhere in Vol 4 . . .
A black ink HP 64XL cartridge can print 600 pages assuming 5% ink coverage per page. We're actually covering roughly 50% of each page, and we do that on both sides, so one HP 64XL cartridge effectively prints only 30 sheets.
The cartridge costs about $40. At 30 sheets per cartridge we'd need roughly 339,000,000,000,000 cartridges, so we can expect to spend around $13,568,700,000,000,000 on printer ink for this endeavor, but HP might consider a bulk rate at that point.
The gross world product is roughly $77,868,000,000,000. If we all unite under this noble cause (and assume no economies of scale), we can pay off the cost of the ink in only about 174 years, assuming paper is free (which it totally is, it grows on trees!). If someone works out how much carbon this captures, we might prevent further global warming to troll domestic courts.
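A sketch of the ink arithmetic using the figures quoted in this subthread (sheet count, 600 pages per cartridge at 5% coverage, about 50% coverage here, both sides printed, $40 per cartridge). Note that the total depends on dividing the sheet count by the 30 sheets each cartridge covers; charging a full $40 cartridge to every sheet gives a figure 30 times larger.

```python
SHEETS = 1.01765e16                 # sheet count from the paper estimate
RATED_PAGES = 600                   # cartridge yield at the rated coverage
RATED_COVERAGE = 0.05
ACTUAL_COVERAGE = 0.50              # about half the dots are black
SIDES_PER_SHEET = 2
CARTRIDGE_USD = 40
WORLD_PRODUCT_USD = 77_868_000_000_000

sides_per_cartridge = RATED_PAGES * RATED_COVERAGE / ACTUAL_COVERAGE
sheets_per_cartridge = sides_per_cartridge / SIDES_PER_SHEET
cartridges = SHEETS / sheets_per_cartridge
ink_usd = cartridges * CARTRIDGE_USD
years_of_world_output = ink_usd / WORLD_PRODUCT_USD

print(f"sheets per cartridge:  {sheets_per_cartridge:.0f}")
print(f"cartridges:            {cartridges:.4e}")
print(f"ink cost:              ${ink_usd:.4e}")
print(f"years of world output: {years_of_world_output:.1f}")
```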
So, just how large is it? Let's try to wrap our puny human brains around the magnitude of this number with a fun little theoretical exercise. Start a timer that will count down the number of seconds from 52! to 0. We're going to see how much fun we can have before the timer counts down all the way.
Most of the generated images would just be noise though. You could probably be more clever than just generating random images, and have a decent chance of generating something that someone would design. These days you could use neural networks, but obviously that wasn’t feasible then.
I tried calculating this number for 16-bit images in Python (8^8^16), and the interpreter just froze. Go was pretty quick though: 3.940200619639448e+115
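A likely explanation for the freeze, sketched below: Python's power operator is `**` and it groups right to left, so `8**8**16` means `8**(8**16)`, a number with hundreds of trillions of bits. The figure Go printed matches the other grouping, `(8**8)**16`, which equals 2^384. Also note that `^` in Python is bitwise XOR, not a power operator.

```python
# ** is right-associative, so 8**8**16 would be evaluated as 8**(8**16).
inner = 8 ** 16                       # equals 2**48
# 8**inner == 2**(3*inner), so its bit length is 3*inner + 1.
bits_needed = 3 * inner + 1
print(f"8**(8**16) would need about {bits_needed / 8 / 1e12:.0f} TB just to store")

# Grouping the other way is cheap and matches the value Go reported.
grouped = (8 ** 8) ** 16
print(float(grouped))

# ^ is XOR in Python: 8 ^ 8 is 0, and 0 ^ 16 is 16.
xor_result = 8 ^ 8 ^ 16
print(xor_result)
```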
You could've settled for every 4×4 bitmap, of which there are only 65,536 black and white ones. Then you could claim any 8×8 bitmap as a derivative work (by concatenation/aggregation/whatever) of up to four of your 4×4 bitmaps.
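To make the tiling idea concrete, here is a minimal sketch that treats an 8×8 black-and-white bitmap as a 64-bit integer and splits it into four 4×4 tiles, each a number below 65,536. The bit layout (row-major, least significant bit at the top-left pixel) and the function names are assumptions made for this example.

```python
def split_into_tiles(bitmap: int) -> list[int]:
    """Split a 64-bit 8x8 bitmap into four 16-bit 4x4 tiles (quadrant order)."""
    tiles = []
    for qr in range(2):              # quadrant row
        for qc in range(2):          # quadrant column
            tile = 0
            for r in range(4):
                for c in range(4):
                    bit = (bitmap >> ((qr * 4 + r) * 8 + (qc * 4 + c))) & 1
                    tile |= bit << (r * 4 + c)
            tiles.append(tile)
    return tiles

def join_tiles(tiles: list[int]) -> int:
    """Inverse of split_into_tiles: rebuild the 8x8 bitmap from four tiles."""
    bitmap = 0
    for index, tile in enumerate(tiles):
        qr, qc = divmod(index, 2)
        for r in range(4):
            for c in range(4):
                bit = (tile >> (r * 4 + c)) & 1
                bitmap |= bit << ((qr * 4 + r) * 8 + (qc * 4 + c))
    return bitmap

example = 0x0123456789ABCDEF
tiles = split_into_tiles(example)
print(tiles, join_tiles(tiles) == example)
```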
It's about fourteen kinds of ridiculous, as summarized in other threads. No rhythms, no meter, no tempo, melodies are longer than 12 notes, it's diatonic, single octave, no concept of underlying harmony, the headline is literally false, etc.
Some of the copyright lawsuits are dumb and this is effective satire or performance art but that's all it is.
It's definitely satire, but it's satire in the face of comical law. That's the point. If copyright lawyers want to argue originality based on an arrangement of notes in a 12 tone scale, and in a limited number of bars, then this is a completely valid argument against such a weak argument.
The reality is that many number one songs can be tonally compared to many classical pieces, or even pieces from the last 40 years. The current state of music copyright law is an absolute joke, and deserves to be "disrupted" (destroyed).
You're not gonna get away with copyrighting a 4-letter sequence, but the recent Katy Perry vs Flame lawsuit established that 4 notes is all it takes to have a copyrightable melody.
Usually short phrases aren’t supposed to be protectable under copyright. However, when a defendant blatantly appropriates a well-known literary phrase for a commercial purpose like selling unlicensed merchandise, courts may make an exception.
Yep. One of the arguments no sane person should countenance is that Katy Perry's songwriters happened to use a four note descending synth arpeggio in an intentional attempt to cash in on the fact a four note descending arpeggio in a different key was a motif used on one of the sixteen tracks of an album which hit number five in the Gospel Charts four years earlier. For similar reasons, Universal isn't going after most of the 3.8m websites using the phrase 'phone home', and you're probably OK using three stripes in artwork unless you're drawing them on the shoulders of sportswear or sides of shoes to make it look like Adidas.
[there actually are musicians that specialise in recording backing tracks intended to resemble a particular popular recording which aren't that recording for use in commercial products, but they tend not to get sued...]
Copyright law considers the importance of a sample to the work as a whole, in addition to just the size. Any arbitrary three-word phrase from the middle of a script is probably not copyright infringement. The most memorable line from the entire movie probably is.
Related: The Supreme Court ruled that excerpting a single paragraph from a 454-page book can be copyright infringement. The book was Gerald Ford's memoirs, the one paragraph was his reasoning for pardoning Nixon. The Court's reasoning was more-or-less that nobody cared about anything else Ford did, so excerpting the one paragraph was as good as giving the whole book away for free.
Wouldn’t that be considered something more like an implicitly-created trademark? It’s essentially the equivalent of a company motto for the movie’s SPV company.
I would note that 4 minutes and 33 seconds of silence is copyrighted by John Cage as " 4'33" ", and a single chord sustained for 20 minutes, followed by 20 minutes of silence, is copyrighted by Yves Klein as "The Monotone-Silence Symphony".
Cage's case is more complicated and less a violation than it may sound at first. It's not 4'33" of "mathematically true silence", that is, you don't violate it per se just by having the right number of zeros in your .wav file. It's 4'33" of performed silence, where the performance is actually the incidental noise of the auditorium it is in. Having a copyright on this particular piece specifically in a performance context may still be an interesting thought experiment, but it doesn't break the system as a whole.
In the latter case, there's also no real risk of accidentally stomping on that.
The claim in this particular case is that they really have generated the entire possible melody space. Legally I think it's likely to fail on multiple levels if it is ever challenged, but part of the point is that some of those failures should also be applied to some real copyright suits that have been won.
(It is somewhat ironic that the music industry continues to be so upset about copyright even as they appear to be converging on The One True Pop Song at speed. Maybe if they acted less like some sort of bizarrely over-trained AI and cranked up the exploration constant, they'd stomp on each other less.)
>It's 4'33" of performed silence, where the performance is actually the incidental noise of the auditorium it is in.
That's an incorrect way to view it, actually. The copyrightable essence of 4'33" is actually represented by the active production of its scoring, i.e. nothing. The background sound is not what makes it copyrightable. You can go ahead and sit at a piano for the length of the composition all you want, wherever you want, and you'll still be publicly performing 4'33".
The rather humorous outcome, if one asks me, is that anyone who writes 4 beats of silence into a score should be violating copyright if we're going to be consistent.
Some smartass artists have actually written their silent compositions as rhythmically structured rests. That is, something like "silence in 6/4 time, as three quarter rests and three eighth-triplet rests". It's intentionally absurd, but as the minimum-information notation would be the full-measure rest glyph and the number of measures of duration, the deviation from this is undoubtedly creative content that is copyrightable on paper.
That kind of thing is completely unenforceable with respect to performances, but in written musical notation, copying the specific notation pattern could be infringement. If you write "4/4, tempo 80, 91-measure rest", that's maybe violating the 4'33" copyright. If you write a score for a full band or orchestra that shows rests in each measure for each instrument, with key changes and tempo changes and such, you're just retelling the same joke in a different way.
> Having a copyright on this particular piece specifically in a performance context may still be an interesting thought experiment
Not even a thought experiment: it’s essentially a 4’33”-long ambient acoustic sample. There are plenty of these (though not usually that long) in sample libraries, recording e.g. traffic sounds, or diner conversation, or crickets in a marsh in summer, etc. And those are certainly copyrighted, unable to be used without license.
>Not even a thought experiment: it’s essentially a 4’33”-long ambient acoustic sample.
I would question any legal professional's authoritative standing to even advise on copyright of a work of music if they miscategorize a recording of ambient sound as a performance of a musical scoring consisting entirely of silence. The copyright doesn't apply to the ambient sound, but to the long quiescence of an artist at their instrument.
It demonstrates a complete blindness of the negative space of music, and a positivistic bias that has no place being enshrined in our legal system.
Exactly! What they're doing would otherwise seem pointless, but given how far off the rails the courts went on that lawsuit, copyright jurisprudence needs this kind of sanity check, because courts clearly aren't using enough criteria to call a melody "copied".
The whole concept of IP is absurd, and there are many absurd consequences you can derive from it.
But there are degrees of absurdity. It's one thing to do that when there are 26^100000 possible combinations; it's another when there are just 12^100 (and if you only care about melody, that's an overestimate: most songs will use a much smaller subset of that).
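To put the two figures side by side, a small sketch that counts decimal digits with logarithms rather than building the numbers. The helper name is made up for this example.

```python
import math

def decimal_digits(base: int, exponent: int) -> int:
    """Number of decimal digits in base**exponent, computed via log10."""
    return math.floor(exponent * math.log10(base)) + 1

text_digits = decimal_digits(26, 100_000)   # 100,000 letters from a 26-letter alphabet
melody_digits = decimal_digits(12, 100)     # 100 notes from 12 tones

print(f"26^100000 has {text_digits:,} decimal digits")
print(f"12^100 has {melody_digits} decimal digits")
```

Both spaces are far too large to enumerate; the practical gap only opens up once the melody length drops to a dozen notes or so.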
It's not like this absurdity was unknown to the framers of copyright. Thomas Jefferson, writing regarding patents, wrote "other nations have thought that these monopolies produce more embarrassment than advantage to society..." And yet, they persist (and in the modern era, proliferate) because it's generally agreed society derives benefit from paying people for their purely-idea creative work.
... but that doesn't mean copyright and patent aren't a perpetual battle against the "natural" arrangement of ideas, and absurdities are extremely possible when the law is misinterpreted or mis-structured.
It's weird -- legal arguments aren't protected under any sort of intellectual property regime. In fact, in common law jurisdictions, you are encouraged to base your work on that of previous practitioners. I always wonder how different our intellectual property regime would be if lawyers demanded to be paid royalties for others citing cases which they had won.
Copyright as an institution is grounded on arguments of net societal benefit, and it doesn't take much argument to demonstrate that there is net societal harm to intentionally diminishing people's ability to comply with the law.
Somewhat tangentially, in some fields and jurisdictions, the legal code which is enforced is copyrighted and paywalled by third parties or the Gov. itself.
> And yet, they persist (and in the modern era, proliferate) because it's generally agreed society derives benefit from paying people for their purely-idea creative work.
Who "generally agree"s on this? The existence of laws doesn't indicate the mood of society.
> The existence of laws doesn't indicate the mood of society.
But if laws are proliferating and regularizing instead of standing still or being abolished, and one assumes that elected representatives are acting on the will of the people, it probably does.
There's lots of controversy over how to improve copyright / patent law, but not very many people in governments in the EU, US, China, Japan, Australia, &c are talking seriously about just burning the whole copyright / patent system to the ground. At least a subset of the countries in the groups listed are generally understood to have representative governments.
Representative governments are well known for supporting powerful groups over the collective benefit of society. Just look at any of those countries' tax codes and you will find a multitude of special exceptions.
Patents are of limited duration because the tradeoffs of unlimited patents are so horrific. If we accept a billion dollar drug must enter the public domain, clearly copyright should also be limited just as it was proposed in the US constitution. However, because a tiny minority has a huge benefit and society does not really notice the difference you get the modern mess of unending copyright.
"A tiny minority has a huge benefit and society does not really notice the difference" seems like a utilitarian argument that the system is working as intended; there is net positive benefit.
If it costs 350 million people 1 cent each ($3.5 million in total) to hand me 1 million dollars they don’t notice, yet that’s a net loss. Critically, even when you include those who benefit it’s still a loss.
Only if you consider money completely fungible and use money as your utility function for determining loss here.
You haven't diminished the ability for 350 million people to do things, practically, by shaving 1 cent off of them. But adding the ability a million dollars provides one individual to do something cool with 1 million they couldn't do before has increased the overall capabilities of everyone.
In essence, you've just described Kickstarter's business model.
> "A tiny minority has a huge benefit and society does not really notice the difference" seems like a utilitarian argument that the system is working as intended; there is net positive benefit.
If tiny minority + 'society' is the entirety of the system, then that's true, but there are also plenty of players who lose out due to restrictive IP regimes—and it's hard to quantify the extent of those losses. (Whether or not the benefits they would reap from looser IP are appropriate or fair is beside the question for utilitarian computations.)
> Patents are of limited duration because the tradeoffs of unlimited patents are so horrific.
What would be the point of a patent system with unlimited duration? If we wanted that, we could just have companies not reveal their inventions in the first place.
I think IP is required to monetize valuable ideas, and monetization leads to social availability.
On the personal level, I would not like it if publishers could freely print the works of new authors, or if engineering solutions I spent years on could be copied and pasted.
Some people prefer Creative Commons, and they are free to publish their work that way. Others need or want financial compensation.
> I generally agree that society benefits from IP.
I certainly didn't mean to say that no-one agrees with it, but your personal agreement doesn't evince the general agreement that shadowgovt (an interesting username, in this context …) suggested. To be fair, neither does my skepticism provide any evidence against it.
I think we can align on the lack of evidence presented. There is also the question on what agreed to means in this sense.
In terms of public opinion, it would be interesting to know what studies have been done. I imagine you would get broad support for an author copyrighting a book, and less for patenting a pre-existing genetic sequence.
As another unsubstantiated claim, I think if you sat down with the general public and the criteria for patents, they would mostly agree.
The challenge has to do with implementing them and the legal process around them.
If a crappy patent is issued to a large corporation, it is incredibly expensive to challenge it.
They persist because they allow some people to create extreme monetization strategies, which feeds into lobbying congress for further expansion of copyright.
That doesn't explain why copyright regimes similar to the US persist in countries around the world, or why countries are finding their way to treaties that standardize international copyright enforcement that look more like the US regime than other countries' regimes.
The US has a lot of clout in the world and exports its laws and culture abroad pretty heavily. Countries that have tried to outlaw Coke or Cigarettes are usually sued into the ground by large US corporations and, when that fails, the US has used sanctions to back up the accessibility of US products.
Basically, America's crazy overreach forces our laws onto other nations. This is actually one thing that really frustrates me about corporate tax loopholes: that overreach could be trivially used to force better international standards for corporate VAT taxes, but there just isn't the political will (due to lobbying) to get it done.
At the risk of getting stuck in a loop: "No rhythms, no meter, no tempo, melodies are longer than 12 notes, it's diatonic, single octave, no concept of underlying harmony"
Some people would tell you it's an ill-posed question.
Some might say lumping trademark, copyright, patent and trade secret laws (historically and in practice very different things) under one heading called "intellectual property" is an intentional strategy to muddy the waters and cloud any argument.
And interesting to include in a study like that how much people in each field actually understood about copyright law. My guess is the general level of understanding is pretty low.
So I can write a program which can generate your name and your sexual preference (among a lot of garbage data). Does that mean this can no longer be considered private information subject to privacy laws?
You can use a ridiculous argument for many things.
The garbage data makes all of the difference. Said accidental generation would make it legal as the only way to pick out the data is to know it already. Otherwise what separates your real name from "Name: Seymoure Butts Sexual Orientation: Mayonnaise".
The fact there is no garbage data at all shows that they are doing judgement on a nakedly wrong level in music - even by the standards of copyright.
Selling "Harry Potter but with the capitalization inverted" doesn't dodge book copyright, and neither does "this key and this very long block of data which happen to decrypt to the complete works of JK Rowling, don't decrypt because that would be infringement wink wink nudge nudge".
I would presume the vast majority of the automatically generated music is basically garbage?
If we say that music (and really any information) is just numbers which can be enumerated automatically, then surely the creative action is finding and picking a number which is actually interesting out of the infinite sea of random garbage.
My point is that framing music as "just numbers" does not disprove that producing a song is a unique creative work. There may be valid arguments against copyright, but this one isn't.
Yes, of course it means that. I might randomly generate data to test with. Doesn't mean it's private and needs to be treated as such if I accidentally and unknowingly stumble across real info.
This actually reveals an important truth about the nature of information: Information is often better understood as exclusionary, rather than somehow "creative". If I have a "thing", you don't know what color it is. If I now tell you it is "red", you still don't know the exact shade, but I have excluded a lot of possibilities. How informative my statement is depends on how much is excluded. If I name an exact Pantone color, I am being much more informative.
In some sense, looking at information as being exclusionary and as being inclusive are the same thing, but there's a lot of ways in which the former actually makes more sense as a thought framework.
And in this particular context we can see how that plays out... a list of all possible melodies of a given nature actually has very little information in it, because it doesn't exclude enough. It may superficially seem to our human senses that a lot of stuff has been included/constructed, but in reality, the 'list of every possible melody' is a vapor. There's not actually anything there. It is the act of exclusion of possibilities that leads to interesting information. Such information as this list has is contained in its specification of what a "melody" is. Counterintuitively (to a lot of people's understanding), if they widened the specifications, while they would end up with a bigger list they'd end up with less information in the result.
The act of creating a song isn't a matter of creating the possibilities from the raw nothingness, it's a matter of carving them out of the exponentially-large space of possibilities and finding something there useful. The exponentially-large space is so large that it is very easy to not see it that way, because, I mean, it's huge. It doesn't feel like "removing" possibilities the way carving a 3D stone does ("I remove everything that doesn't look like my desired statue"), because the exponential space is so inexpressibly larger, and we need fundamentally different tools to address such a space, but in the end, it's the same thing.
While this isn't necessarily what the law was written for, the "creativity" requirement could very easily be pressed into service here. They've expressed very little creativity/exclusion on this list and it would be easy to argue it falls far below the threshold necessary for copyright. As a literary criticism of the system, it is successful and thought provoking... as a legal criticism of the system it would fail completely.
> ... as a legal criticism of the system it would fail completely.
The legal criticism would be that there just aren't that many unique melodies, as demonstrated by the fact that they were able to enumerate them all, so the mere fact that two songs use the same melody is not sufficient to show that one is a copy of the other. The set of melodies that are compatible with human aesthetics is even smaller. They don't actually need these auto-generated melodies to qualify for copyright for the project to succeed. It works equally well if similarity in melody is not considered sufficient evidence of copyright infringement.
Even just having the database around so that one can say that they copied the melody from here rather than from some other source might be enough. After all, unlike patents, independently producing something similar to a copyrighted work is not infringement; you have to have actually copied from the other work. If you're a musician perhaps you should listen to a few randomly-selected melodies from this program each day. Maybe it will spark something, but even if it doesn't it will at least make it harder to argue that whatever melody you come up with could only have been "subconsciously copied" from some other composer's song you may have heard decades ago.
> The legal criticism would be that there just aren't that many unique melodies—as demonstrated by the fact that they were able to enumerate them all
How does that follow? You can enumerate any finite number. And the article doesn't say how big. Is it a thousand or a trillion? "Riehl says the algorithm works at a rate of 300,000 melodies per second." The article doesn't say how many seconds it took to generate all melodies though.
Not within a fixed time period in the real world. You're limited by the matter and energy available, and by the speed of light. However we're not talking about the theoretical ideal limits of computation. The upper bound would be 300k melodies for each second since the program was written—68.7 billion in all, according to the Adam Neely interview linked from the Press page of the project site. Which is a lot, but then there are hundreds of millions of known songs, each of which is likely to contain multiple melodies, some of which are much more likely to be chosen than others. Accidental duplication is thus quite likely.
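Taking the two quoted figures at face value (68.7 billion melodies, 300,000 melodies per second), a back-of-the-envelope sketch of the run time. It ignores disk I/O and anything else that might slow a real run.

```python
MELODIES = 68.7e9            # total quoted in the comment above
RATE_PER_SECOND = 300_000    # generation rate quoted from the article

seconds = MELODIES / RATE_PER_SECOND
hours = seconds / 3600
days = hours / 24

print(f"{seconds:,.0f} seconds, about {hours:.1f} hours or {days:.2f} days")
```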
Ha! It does not work however, as a person's sexual preference is a fact, and a statement that either guesses or spits out all recognized variants is not a source of factual information -- no-one is any the wiser afterwards than before.
On the other hand if you offered it up as a fact with reckless disregard for its truthfulness...
Again we see programmers trying to understand the law in terms of ‘how can a piece of data be illegal?’ while the law is quite happily focusing on making specific actions illegal.
‘You can’t arrest me, gold bars aren’t illegal!’
‘Yes, but carrying them out of the federal reserve vault without permission is.’
"I invented this later but independently" isn't a valid defense, even if you can prove it. So it's not actions (like copying or plagiarism) that is prohibited, it's the result.
That's the problem with IP law.
In effect you give people monopoly on numbers. When the numbers are big nobody is bothered by this, because chance of arriving at the exact same one is effectively zero. But for songs the numbers are pretty small (depending on the encoding used to compare the songs), and the absurdity is evident.
That's not correct in the case of copyright. Independent invention can be used as a defense to a copyright infringement claim.
> Generally, a plaintiff proves copying through circumstantial evidence, showing that the defendant had access to the copyrighted work [...]
>
> [...] unlike in patent law, if a defendant independently creates the substantially similar work, he is not liable to the copyright holder.
Your example is barely plausible, much less demonstrable.
The connection between "original works" using an extremely limited set of notes, and your right to privacy using some theoretical predictive algorithm is not at all obvious.
I agree. I wanted to point out the absurdity of the argument used in the article. The argument is that music is just numbers and numbers are not copyrightable. But any kind of information is "just numbers" which can be enumerated given enough time.
For example, there are a number of songs that include a rendition of Tom's Diner within a larger melody. Also popular to add to a song is the Arabian riff melody (which is so old it's not copyrightable, but you get my point). I don't know of any lawsuits around Tom's Diner, but this is an example of the types of ridiculous lawsuits out there over copyright.
The code actually can produce every possible melody in MIDI. They simply have not stored every possible melody explicitly (uncompressed) on a hard drive (which is impossible, as the size is infinity).
However, if you interpret the program itself as a self-extracting compressed archive, they actually have stored every possible melody (in a compressed way).
So the question reduces to how much the type of compression matters here (is ZIP allowed? is TAR allowed? what about more sophisticated like PAQ? and what about this Rust code?). This is what I/we discussed here: https://news.ycombinator.com/item?id=22441328
> So the question reduces to how much the type of compression matters here
If you compress, you can copyright the compressed bytes.
If you don't compress, you can copyright the uncompressed bytes.
As far as that copyright extending to derivations, e.g. decompressions, the answer is indeed situation-dependent. For example, converting a copyrighted font from TTF to WOFF does not remove the copyright. But converting a copyrighted font from TTF to screen pixels to WOFF removes the copyright. (Sorry I don't have a reference; probably findable.)
The "self-extracting zip" derivation would probably fall into the latter category; that is, the copyright would not transfer.
---
But even if the copyright were maintained during your advanced decompression, one could argue that editing down to a very specific portion of that extremely large body of work was a substantive/transformational derivation, which they could then copyright themselves.
Transformative works are very common in art. The most famous example is Duchamp simply adding a mustache to a print of da Vinci's Mona Lisa, and copyrighting that. [1]
It comes down to this: Typefaces/glyphs are not copyrightable. The font code that produces those glyphs is.
> Typefaces cannot be protected by copyright in the United States (Code of Federal Regulations, Ch 37, Sec. 202.1(e); Eltra Corp. vs. Ringer)...However, there is a distinction between a font and a typeface. The machine code used to display a stylized typeface (called a font) is protectable as copyright. [1]
In software, a similar "black-box" derivation process has happened many times, e.g. UNIX/GNU. Copyright applies to software source code, but not software functionality.
Determining what is the "essential, creative work" in each case in a nuanced way is a matter for courts and armies of lawyers: Apple round corners, Oracle Java APIs, etc.
This is true of fonts, as per the copyright office's "Policy decision on copyrightability of digitized typefaces".
However the example here, and the situation with fonts seem different. It is the case that font data is viewed as utilitarian and uncopyrightable. So we have a copyrightable program, producing uncopyrightable data. The argument here seems that we have a copyrightable program producing copyrightable data.
Every possible combination of 8 notes in 12 beats is not infinite. Assuming all quarter notes it's just (8^12)-(8-1)^12 = 54,878,189,535. If you factor in rhythmic variances, the number is much much larger, but still not infinite.
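A quick check of that arithmetic, as a minimal sketch. The function names are made up for illustration, and the reading of the formula (12-note sequences over 8 pitches that use one designated pitch at least once) is an assumption about what the subtraction is meant to count.

```python
from itertools import product


def count_with_required_note(num_notes: int, length: int) -> int:
    """All sequences minus those that avoid one designated pitch."""
    return num_notes ** length - (num_notes - 1) ** length


def brute_force(num_notes: int, length: int) -> int:
    """Direct enumeration for tiny inputs; pitch 0 plays the designated role."""
    return sum(
        1 for seq in product(range(num_notes), repeat=length) if 0 in seq
    )


all_sequences = 8 ** 12
counted = count_with_required_note(8, 12)
print(f"{all_sequences:,}")  # 68,719,476,736
print(f"{counted:,}")        # 54,878,189,535
```

Either way the count is finite, and small enough to enumerate on ordinary hardware.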
More correctly, it can produce every possible monophonic sequence of tones given infinite time, which in the article was further limited to monophonic 12-tone sequences across a single octave (any other limitations I missed?).
A melody contains more than a sequence of tones. The most heartless definition would at the very least include rhythm. For every sequence of tones they output, they only produce one out of hundreds of possible melodies for that sequence.
It is of course an interesting thought to consider the definition of decompression, but on the other hand, we should also limit the contributed idiocy to the bare minimum required to break the relevant idiotic rules.
Part of me wonders if the next step with this is some sort of DL model. I wonder if, trained on one set of melodies (defined in the intuitive sense) it would generate existing copyrighted melodies not in the training set.
By that logic, a program that simply counts up from 0 (with bignums, as long as the machine has enough memory), is actually a compressed form of every single piece of data or information that has ever, will ever or can ever be created.
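To make the counting argument concrete, here is a rough sketch (helper names are hypothetical) that treats a byte string as one big-endian integer, so "where the counter reaches this data" is just that integer.

```python
def index_of(data: bytes) -> int:
    """Counter value at which a count-up-from-0 program holds `data`."""
    return int.from_bytes(data, "big")


def data_at(index: int, length: int) -> bytes:
    """Inverse of index_of: the `length`-byte string held at `index`."""
    return index.to_bytes(length, "big")


message = b"Hello, world"
position = index_of(message)

# The index round-trips back to the data...
assert data_at(position, len(message)) == message
# ...but it takes about as many bits to write down as the data itself.
print(position.bit_length(), "bits of index for", len(message) * 8, "bits of data")
```

So pointing at a position in the count conveys exactly as much information as the data would; the counter on its own conveys none.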
Music lawsuits can be based on about as much information as they're conveying. I think that's their point, too. I don't believe they were trying to be satirical, they wanted to prove a point about the nature of music itself that could be used in defending musicians against lawsuits.
One of the items they were trying to point out, often abused in lawsuits for pop music, is the idea of "Access". If you came up with an idea all by yourself, but a similar song exists that is popular enough, the court argues that just by there being the possibility that you heard it, you therefore definitely heard it and then copied it.
If this music set exists, and is freely available, shouldn't it be considered that you had reasonable access to it and therefore stole it? No, of course not, that would be a ridiculous assumption and so is the current outlook of a song being popular being enough proof that you stole the idea.
Not to mention, music is extremely formulaic. Chord progressions have a natural tendency to certain forms, with centuries of prior art, rhythm within genres of music is often the same, even melodies have a trend toward particular combinations (leading tones over chord progressions bring about lots of similar sounding solos).
Any musician trying to claim copyright for their music should remember that their song only exists on the back of centuries of musical exploration. Consider how much of the song you can say is truly novel, it's going to be nearly nothing.
The combination of lyrics + chords + melody is in my opinion, the absolute minimum you need to claim a song has been copied. Lyrics are derivative, melodies are derivative, chord progressions are derivative, but together they have the chance to be a unique combination.
>I don't believe they were trying to be satirical, they wanted to prove a point about the nature of music itself that could be used in defending musicians against lawsuits.
Sufficiently advanced “proving a point via absurdity to make a more general argument” is indistinguishable from satire.
> No rhythms, no meter, no tempo, melodies are longer than 12 notes
I am not a musician, but which of these are copied in the Tom Petty / Sam Smith case that motivated this exercise? To my untrained ear, I do hear some similarities in the relative lengths of the notes (meter?).
I don't have a great ear, but I think it's a similar melody, albeit at a different tempo and key (?). If you can't hear it, try changing the video speed to 1.5x during the Sam Smith part.
If these two songs are similar enough, then I think it could be argued that a MIDI sequence has been copied, since in both cases it requires a significant change of tempo and key. A lot of commenters seem to be missing this point: yes, the generated sequences sound different from real songs, but so do the songs involved in the ridiculous court cases. Radiohead and Ed Sheeran were sued for chord progressions, Katy Perry for a melody. The songs involved were altered about as much as the MIDI sequences would need to be to show the similarities.
I'm not incredibly familiar with the court cases, nor that of Coldplay/Satriani, but I have a hard time believing that their decisions are algorithmically binding. Like, yes, they might be similar along those particular axes but that doesn't mean those similarities are the sole reason for the court decisions. There's also matters such as - was songwriter #2 exposed to song from songwriter #1? Does the similarity in arrangement imply intent to copy? Etc.
Exactly. Harmonic progression is probably the most important part of this, as it imparts context on the melody. The same melody over I-V-IV-vi and vi-V-IV-V is not the same thing.
Courts don't care about harmony. Most US courts follow a guideline that the copyrightable parts of a composition are melody and lyrics. Nobody has ever successfully sued for stolen chord progression.
Not according to the US court system. American case law defines chord progressions as insufficiently creative for copyright purposes. Usually rhythms are too. The only copyrightable parts of a composition in precedential cases of most US courts are the melody and lyrics.
And apparently arpeggios, because the US District Court of California ruled in Flame vs Katy Perry that arpeggios are "melodic enough" for copyright protection.
Melodies are insufficient on their own in my opinion, the court rules differently I guess. Melodies are just as likely to be formulaic, and are built using similar foundational knowledge as a chord progression (only a subset of notes works in a given progression for example, and conventions lead you toward certain notes of that subset).
A combination of chords, melody, rhythm are I think the only reasonable measure that a song has been copied.
It matters a lot to the sound of the song, but does it matter to the court?
If I take the entire melody of a Beatles song, including the verse and chorus, but set it to an entirely different chord progression, would the court recognize that as an original song? What if I lifted all of the lyrics as well?
Lyrics and melody? I think that's reasonable to consider that an infringement. Melody over a new chord progression? I do think that should be considered a new work, just to limit the scope of copyright. Even if it's clear you copied the melody, I think melody alone is insufficient to call a song. Unless the original song was entirely melody. A melody is using all the same musical building blocks as the chord progressions did, why does it get special treatment?
Octave - a doubling in the frequency. Generally speaking, in twelve tone equally tempered classical music, melodies and harmonies are drawn from a scale, a subset of these twelve notes. The most notable of those scales, the diatonic scale has 7 unique notes, and the eighth note wraps back around to the beginning of the scale. Thus, moving from a note to a note double the frequency took 8 notes - hence, the octave. Diatonic is both the name of the most used of the 7-note scales mentioned above, and also a term meaning 'within the scale', i.e., 'diatonic to a (given) scale', depending on context.
Melody is the horizontal arrangement of notes for an individual voice or instrument over time. Harmony is the vertical arrangement of notes sounding at the same time, and how those transform horizontally over time as a group. Tempo is the speed in beats per minute of the background 'pulse' of the music.
Now, asking for a music theorist to give you an algorithmic definition of how to make music with any of the above? Good luck ;)
If you're looking for a book, Music: A Mathematical Offering https://www.amazon.com/Music-Mathematical-Offering-Dave-Bens... is pretty good. But you're nearly at the end of your recursion, though. Wikipedia ought to take you the rest of the way to answering those basic questions. More complex questions like why certain combinations sound good when next to one another, on the other hand...
I found this video to be an extremely helpful and concise explanation of music theory basics:
https://youtu.be/rgaTLrZGlk0
"Learn music theory in half an hour" is obviously an exaggeration, but it really comes astoundingly close to fulfilling that promise. It contains a lot of information and each part builds on the previous parts, so it requires focus and maybe a few repetitions to 'get it', but I think the approach is fantastic for showing how many ideas of music theory are deeply connected.
Not a book as you requested, but hopefully you'll find the other reply useful. As for some of your specific questions:
Frequency is the same concept as radio frequency, but in this case refers to something we can directly sense. Radios transmit electromagnetic waves, whose frequency is measured in Hertz, or cycles per second. Sound frequency refers to pressure waves moving through air, so a more accurate analogy than radio waves is waves in a pond when a rock is thrown in. Human ears are sensitive to frequencies between 20 Hz and 20,000 Hz, so any sound you hear is a combination of frequencies in that range. Natural language is helpful here, since higher frequencies sound 'higher' and lower frequencies sound 'lower'.
A tone is a sound at a specific frequency, also known as a note. For example, 440 Hz is designated as the note A4 by the ISO 16 standard, and this is what most instrument tunings are based on.
Twelve tone equal temperament is the tuning system nearly all modern Western music uses. Certain ratios of frequencies sound pleasant, especially ratios with small numbers, such as 1:2, 2:3, and 3:4. So if we know 440 Hz is a note in the system, it would be nice to also have 587 Hz, 660 Hz, and 880 Hz. However, these frequencies will only really sound good when played with that original 440 Hz, not necessarily with each other. So instead of using them exactly, we approximate them in a useful way. The 1:2 ratio, the octave, is generally considered to be the most important, so that ratio is kept, but otherwise the notes are equally spaced (human hearing is logarithmic), or equally tempered. The most popular tuning system has twelve tones. There's no note at 660 Hz, but there's one at 659 Hz, which is pretty close, and there happens to be one at 587 Hz. Other ratios are also represented reasonably well.
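A small sketch of the numbers in that paragraph, assuming the usual equal-temperament rule that each semitone multiplies the frequency by the twelfth root of 2 (the function name and the table are just for illustration):

```python
A4 = 440.0  # reference pitch in Hz


def equal_tempered(semitones_from_a4: int) -> float:
    """Frequency of the note a given number of semitones above A4."""
    return A4 * 2 ** (semitones_from_a4 / 12)


# Small-integer ratios and the equal-tempered step that approximates each.
intervals = [("3:4", 4 / 3, 5), ("2:3", 3 / 2, 7), ("1:2", 2 / 1, 12)]

for ratio_name, ratio, semitones in intervals:
    pure = A4 * ratio
    tempered = equal_tempered(semitones)
    print(f"{ratio_name}: pure {pure:.2f} Hz, tempered {tempered:.2f} Hz")
```

Only the octave comes out exact; the other two land within about 1 Hz of the pure ratios.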
A key is a collection of notes that sound good together, based on the ratios of their frequencies. The alternative would be chromatic composition, where all 12 notes are used and none is obviously 'more important'. Most music is in a specific key, but uses some chromatic notes to make the melody more interesting.
Yes, the video goes into some depth about scales/keys.
I actually had to look up what the difference between a key and a scale is, as I thought the terms were pretty much interchangeable. I've edited the last part of my other comment to reflect this:
A scale is actually an ordered set of notes belonging to a key. A key is just an unordered collection of notes. I got this wrong earlier.
So playing all the notes belonging to C major in ascending order is playing a scale, and playing the notes in any order is playing in the key of C.
Isn't it a bit of wasted space to store the generated data on archive.org (https://archive.org/download/allthemusicllc-datasets)? I mean, as we see, it's trivial to create them. The code above is kind of a self-extracting archive, so it's just the same thing but in compressed form. And I guess it should not matter (legally) whether you store it compressed or uncompressed (or less compressed).
Hm, they don't really explain this, or do they? They just say "by saving those melodies to a hard drive they have affixed them to a physical medium which is all that is necessary to copyright them".
So, by the same argument, they could copy the self-extracting archive to a hard drive, and then have the same thing, or not?
They even say that they already store it in a compressed form on the hard drive.
What if some future compressor (future ZIP format) is clever enough to see that you are going to save all possible permutations of something, and then saves this in a much better compressed way. Then suddenly when compressors do this, it means when you use such a compressor, you do not have copyright on the data anymore?
I think the focus has to be on the spirit not the technology. Saying to a lawyer/judge/jurist "hey, this script can generate every possible melody!" _feels_ quite different from saying "every single melody is already written down, on this hard-drive I am waving at you! Look!"
Take any algorithm that can generate the digits of pi. It is proven that all the sequences of digits are present somewhere in these digits. Can you claim copyright on all the books of the universe because you can show the algorithm that can generate every possible book?
"It is proven that all the sequences of digits are present somewhere in these digits"
This has never been proven for the digits of π. Of course, there are other digit-sequences for which it is trivially true. For example, just concatenating together every finite sequence of digits, in some suitable order.
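The "concatenate everything" construction can be sketched like this. It is only an illustration: the names are invented, and the search is capped at a finite prefix of the stream.

```python
from itertools import count, islice, product


def all_digit_strings():
    """Yield every finite digit string, shortest first: '0'..'9', '00'..'99', ..."""
    for length in count(1):
        for digits in product("0123456789", repeat=length):
            yield "".join(digits)


def first_position(target: str, max_blocks: int = 20_000) -> int:
    """Offset of `target` in the concatenated stream, or -1 if it is not
    found within the first `max_blocks` blocks."""
    stream = "".join(islice(all_digit_strings(), max_blocks))
    return stream.find(target)


print(first_position("2024"))
```

Every finite string appears by construction, because sooner or later it is emitted as a block of its own. No such guarantee is known for the digits of π.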
Without being any kind of copyright expert, this argument doesn't make much sense. If you have the relevant indices into pi, I guess you could claim copyright. The index will be much larger than the original book though. Essentially you have a really bad compression algorithm. I'm not going to get very far claiming that because my decompression algorithm could output any sequence, all sequences are mine.
So in other words you need to have used the compressor to compress what you are copyrighting, and that which you used it on must have been physically stored in its entirety. Is it provable that this all occurred, if there's no requirement to still have the original? It would require that you also supply the compressor to the court, so that somebody can use your self-extracting magic decompressor, recompress the result, and end up back in the same place. But you could make your compressor simply emit your decompressor, ignoring the input!
It is to avoid FUD. It ensures no one can derail the conversation into a debate about what constitutes being ‘affixed to a physical medium’.
“But are the tunes really affixed to the medium?”
“Yes, this hard drive in my hand literally contains a bunch of MP3 files” is a lot stronger than: “Yes, in a way, because this tune-generating script technically constitutes a self-extracting archive, your honour.”
> For a work to be “copyrightable,” it must be original and fixed in tangible form, such as a sound recording recorded (affixed to) on a CD or a literary work printed (affixed to) on paper.
Is a hard drive not a tangible form? Come to think of it, so is a brain, and everything else capable of storing information. That language is atrocious.
It’s about “permanence”, not being able to copyright a live performance that wasn’t recorded for example. It just means you need to be able to distribute / replay the recording, and that means using one of the current common audio technologies.
Saying 'I don't actually have that melody written down anywhere, but I easily could have, if I'd run this program' feels a bit like saying 'I don't have that melody written down anywhere, but I easily could have if I'd just thought of it first'.
You may say that in fact having the code that generates the data is effectively the same as having the data. This may convince reasonable, rational people, but the fact that we're dealing with the law that we are should tell you that we're not dealing with reasonable, rational people.
It's just the practicalities of copyright law. You don't copyright the idea of a song, a plan for a song, or code that generates a song. You register the tablature accompanied by a recording. See https://medium.com/@dawn_ellmore_employment/what-does-tangib...
You can trivially write a program which will enumerate all the possible byte sequences in the universe. So you can claim that this tiny program contains everything including god. So you can claim copyright for everything.
> You can trivially write a program which will enumerate all the possible byte sequences in the universe. So you can claim that this tiny program contains everything...
Not to get too far off topic, but even an infinitely long byte sequence cannot represent all numbers. It's possible to construct something not representable by that sequence via Cantor's diagonal argument: https://en.wikipedia.org/wiki/Cantor%27s_diagonal_argument
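A finite toy version of the diagonal argument, purely for illustration (the listing and the function name are made up): given any list of n binary sequences, flip the i-th entry of the i-th one, and the result cannot equal any row.

```python
def diagonal_escape(rows):
    """Build a sequence that differs from row i at position i."""
    return [1 - rows[i][i] for i in range(len(rows))]


listing = [
    [0, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 1, 0, 1],
    [1, 0, 1, 0],
]

escaped = diagonal_escape(listing)
print(escaped)  # [1, 0, 1, 1]
assert escaped not in listing
```

The real argument applies the same flip to an infinite listing of infinite sequences, which is where the uncountability conclusion comes from.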
It doesn't contain the floating point representation of that number, which would be infinitely long, but it does contain the UTF8-encoded constructive proof that uniquely identifies that number.
Most real numbers have infinite-length constructive proofs.
Uncountable is uncountable is uncountable. If any correspondence/encoding existed between naturals and reals, the reals would be countable. (But they are not, so it is fool's errand to search for such an encoding.)
Surely not? If there are more numbers that can be represented in any amount of bytes (as shown by the diagonal argument), then you cannot represent a constructive proof for each of them in any amount of bytes.
Kind of. A constructive proof by definition means something like you can explain exactly what the number is in a finite number of bytes.
The standard diagonalization argument is a constructive proof, identifying a specific number. Normally a constructive proof is preferable. But with a bit of hand waving you can make it into a non-constructive proof that there are "unknowable" numbers that cannot be described in any finite amount of bytes.
In practice, constructible for reals means that you can approximate them to arbitrarily small known precision, e.g. via a sequence of rational numbers q_n, each no more than 1/n apart from the actual real number.
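As one concrete instance of such a sequence, here is a sketch for the square root of 2. The choice of number and the function name are mine, not from the thread.

```python
from fractions import Fraction
from math import isqrt


def sqrt2_approx(n: int) -> Fraction:
    """A rational q_n with q_n <= sqrt(2) < q_n + 1/n, for n >= 1.

    isqrt(2 * n * n) is floor(n * sqrt(2)), computed in exact integer
    arithmetic, so dividing by n lands within 1/n below sqrt(2).
    """
    return Fraction(isqrt(2 * n * n), n)


for n in (1, 10, 1000):
    q = sqrt2_approx(n)
    print(n, q, float(q))
```

The whole infinite sequence is pinned down by this finite program text, which is why only countably many reals can be described this way.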
The point is that such a sequence needs to be constructive in the usual sense, so there are still only a countable number of them.
The argument still applies, yes. Nevertheless the string still contains all finite substrings, which includes all English sentences describing such numbers.
The implication is that human minds wouldn't be able to represent it either, at least not with language.
Perhaps I am misunderstanding the diagonal argument, but it doesn't appear to show what you claim it shows.
The diagonal argument seems to be a proof that there are uncountably many infinite byte sequences. So while it proves that it would be impossible to "enumerate" every infinite byte sequence, it doesn't prove that there exists a number that cannot be represented by some infinite byte sequence.
Indeed, I believe the opposite can be shown to be true by constructing a mapping from every finite number to an infinite byte sequence. ASCII trivially provides such a mapping for real numbers, and finite tuples of real numbers (such as complex numbers) can be mapped by alternating digits from each element of the tuple.
Edit: The key distinction is that the set of finite byte sequences is infinite, but countable, while the set of infinite byte sequences is not only infinite, but also uncountable. Which of these two sets is deemed the set of "possible byte sequences" seems to be the critical distinction and the turn of phrase "all possible byte sequences in the universe" seems to imply all byte sequences of finite length.
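The countability of finite byte sequences can be shown with an explicit numbering. This is a sketch using bijective base-256 (names are hypothetical): every natural number maps to exactly one finite byte string and back, shortest strings first.

```python
def nth_byte_string(n: int) -> bytes:
    """The n-th finite byte string in shortest-first order (0 is empty)."""
    out = bytearray()
    while n > 0:
        n -= 1                 # shift to digits 0..255 (bijective base-256)
        out.append(n % 256)
        n //= 256
    return bytes(reversed(out))


def rank_of(data: bytes) -> int:
    """Inverse of nth_byte_string: position of `data` in the enumeration."""
    n = 0
    for byte in data:
        n = n * 256 + byte + 1
    return n


print(nth_byte_string(0), nth_byte_string(1), nth_byte_string(257))
print(rank_of(b"hi"))
```

No such numbering can exist for infinite byte sequences, which is exactly what the diagonal argument rules out.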
All the possible byte sequences contain all the possible programs.
All the possible programs can receive input of infinite length and output bytes of infinite length depending on input.
So if we can supply any input to any program and treat its output as part of result, does this argument still work? Is there such a number that can't be constructed by some program receiving some input?
> All the possible byte sequences contain all the possible programs.
Correct.
> Is there such a number that can't be constructed by some program receiving some input?
Yes. Most numbers are not representable by a program...unless you allowed a program to be infinite length, in which case the first statement is no longer true.
There's nothing particularly unique about a digit-based encoding for Cantor's diagonal argument.
> in which case the first statement is no longer true.
Why is that? What would be an example of an infinite program that could not be represented by an infinite byte sequence? Indeed, it seems trivial to map an infinite series of machine code instructions to an infinite series of bytes.
Edit: It seems like the first statement is false only if you use a different understanding of "possible" for "possible byte sequences" and "possible programs" where the former excludes infinite length and the latter does not.
I internally editorialized your first statement to be "all the possible [finite] byte sequences contain all the possible programs".
Because you cannot enumerate all infinite byte sequences, according to Cantor's diagonal argument as linked earlier. [1]
Calling certain numerical representations "computer programs" in no way changes the fundamentals. There is no way -- no matter how clever your encoding -- to enumerate all real numbers.
I am not the person you were originally responding to.
You don't have to enumerate all real numbers to create a mapping from real numbers to infinite byte sequences.
I am merely pointing out that the first statement only becomes false if you editorialize it to "all the possible [finite] byte sequences contain all the [infinite] possible programs". If you treat the meaning of "possible" consistently in that sentence, I believe it remains true regardless of whether you define infinity as possible or not.
> an infinitely long byte sequence cannot represent all numbers
Okay, all natural numbers then. I don't think that changes the argument.
(Reals are a funny thing because any format you choose has numbers that are infinitely long in that format. Which is basically the same thing you said.)
Not all numbers, such as real and complex numbers. But you indeed can represent all integers, all possible programs, all digital music, digital images, digital everything. Not sure if those other music / images which cannot be digitalized are so relevant.
One could make the same argument for a mathematician with a typewriter. The fact that it could come up with a particular sequence is, I think, less important than whether or not it actually has recorded a particular sequence.
Claiming copyright of everything won't save you from producing all that child pornography and hate speech, not to mention unauthorized copies of state secrets. ;)
For the state secrets, the state would have to identify the specific strings that constitute the violation, so that's unlikely to be a problem in practice.
But is it enough to prove that it could produce any melody? After all, couldn't a binary adder produce every executable binary possible? Well, eventually, and with enough memory.
So what is the difference between a program that views some text by reading it from a huge file and a program that views the same text by generating it on the fly? Especially if the file is generated by the very same process that happens on the fly in the second case? Why would you even have to actually build one of the two programs? Enumerating all possible character sequences of some given length is a somewhat obvious idea, and actually implementing that idea, especially if the text is only generated on the fly, does not bring the text into existence substantially more than just pondering the idea.
I guess for matters of copyright you have to produce a specific artifact, just blindly enumerating all possible artifacts is not good enough. Which would also mean that the existence of all those MIDI files will not make a difference.
If I were to write a "random book generator" that just regurgitates infinite text, then I can copyright a pair of starting index and length, but not simply all possible finite sequences of letters.
I would argue writing a book, composing a song, or inventing a device should be viewed as a search process. You are wandering around in the space of all possible books, songs, or devices looking for one that is interesting to read, pleasant to listen to, or useful to use. Your work is mainly identifying special items in the vast space of possible items, making it physical once identified is only secondary, at least for the purpose of this discussion. So writing a computer program to generate all possible songs is finding a specific and - at least somewhat - useful program among all possible programs and - ignoring its triviality and the general questions about copyright - seems copyrightable to me. On the other hand using this program to turn the space of all possible songs from something in your head into a huge pile of MIDI files on a disk seems not copyrightable to me because you do not single out interesting songs. Finding a good song did not get significantly easier, searching through and listening to a sea of MIDI files instead of playing all possible combinations of notes on a piano and judging the sound of it.
I'm not sure if it would hold up legally anyway, but with it already being generated I can see it working out a bit better.
If you only provide the code as a means of generating any melody, you could just as well replace it with any music instrument and claim you can generate any melody with it.
Ad absurdum (or maybe not so much?), any universal Turing machine can be programmed to list every possible program, specifically the one generating every melody. So the copyright claim could then just as well point to Conway's game of life or rule 110.
Yep. They could've used that space for rainbowtables. I'm old enough to remember when someone from shm00 had rainbowtables available for download, but I guess the network bills added up or the feds threatened them.
Yes, mathematically. But copyright requires copying, strictly speaking, so you need a copy from which to have copied.
An algorithm that generates all possible English sentences doesn't demonstrate that I copied this sentence from it, and the balance of probabilities (used in tort cases) suggests that I came up with the sentence, run-on as it is, rather than performing that algo and then selecting that sentence. Moreover 17USC (102? sorry I don't recall) says that media needs to be "fixed" to acquire copyright; so me having to perform the algo makes it no impediment to my owning copyright of a sentence that, if performed, the algo would be expected to produce (which is sensible really, perhaps the algo is errant and can't make this sentence, maybe it only forms grammatical sentences ...
Of the "all possible melodies" dataset? Yes. Of the "all possible melodies humans would conceivably consider music, and nothing else" dataset? Not at all.
What he's calling colour is just provenance, which is a perfectly sensible concept. You can even take the position that provenance is a physical reality, if you take 4 dimensional spacetime seriously.
It's one of the more insightful things on the Internet; well worth the time to read in full. Still, a TL;DR as requested:
"Colour" is a term that encompasses provenance, chains of cause-effect, intent and related concepts. Bits are obviously colorless from compsci point of view, but are colorful from the legal POV. For instance, bits encoding a copyrighted video are just numbers, but for the judge it'll matter whether I got them from a lucky PRNG run or from a torrent site.
Not understanding that colour is real in legal systems is the core source of confusion people have wrt. intellectual property and illegal information (e.g. child porn).
Why does colour matter in some cases and not in others? No judge will care about the "colour" of child porn. Yet, they will care about the "colour" of a song.
(To clarify: Colour does not encode copyright information; it encodes provenance, intent and related ideas. Child porn is only a somewhat special case in the sense that laws are very insensitive about the shade of colour here. Usually, if it looks enough like CP, a possessor will end up in trouble, regardless of whether it's actual CP or whether they can prove they created it ex nihilo in MS Paint. But that's because child abuse is considered by societies worldwide a special kind of evil that needs extraordinary measures to combat. In almost all other cases, courts are very sensitive to the shades of colour.)
I wasn't really implying that something was bad about the article. Just pointing out that without context your comment looked like spam for some random article.
The original article is something about music. Then you say something about colors. And your article starts with some description of some game called Paranoia. Without already knowing the context this looks completely random.
It's actually very relevant, the TL;DR (from memory) is that lawyers (and judges) aren't programmers, and the source of data (referred to as the "colour of the bits") is as important as the actual bits of the data. Thus "I have this bit pattern as part of a bit enumeration" is different, to the legal system, than "I have this bit pattern because I specifically created it".
> Thus "I have this bit pattern as part of a bit enumeration" is different, to the legal system, than "I have this bit pattern because I specifically created it".
Back in the Napster days, there was a company (maybe mp3.com or something? Google tells me yes!) that would let you get high-quality encoded mp3s of your CDs by downloading their program, putting your CD into the CD drive, and it would read enough of the CD to verify that that was the CD you had, then give you access to their own encoded version of it. The idea being, you own the CD; you have a right in the US to rip it & encode it; so therefore you have a right to this mp3.
They were sued by the record labels (naturally), and lost in court; because the court made a distinction between the copy that mp3.com had of the CD and the copy that an individual could have made. mp3.com were violating copyright because they were distributing their own copy, even though the bits were identical to the copy their customers already owned.
So along the same lines: "I have this bit pattern because I copied it from mp3.com's copy of a CD" is different, legally, than "I have this bit pattern because I copied it from my own CD".
It's a bit more complicated than that. Provenance is critical... but then can you claim copyright infringement based on 8-note, 12-beat sequences? Surely e.g. Katy Perry claimed that her song is original, but courts decided (based on expert opinion) that it's too similar for this to be true. "My 8-note sequence was first, so you probably copied it!" is, unfortunately, a thing - I would have no problem if Marcus Gray proved provenance - but he didn't. The entire proof of provenance is based on "I was first, and yours is similar".
I would think that after this project is complete, at the very least arguments like those that lost the "Dark Horse" case should become not-acceptable in courts. Let Marcus Gray actually prove provenance, not infer/ speculate it - and all would be fine.
What's the legal principle that distinguishes methods of generation? Even if enumeration is definitely on one side, there is an infinite gradation from there to human generation.
The line between enumeration and a very prolific artist is blurry. What if someone instead of just enumerating them created actual songs featuring them as themata? And what if those songs were tool-assisted? But still catchy?
I see it as likely that someone will eventually try it not just as a legal strategy but because of a genuine curiosity in music and AI.
According to Wikipedia:
> There is a long tradition in classical music of writing music in sets of pieces that cover all the major and minor keys of the chromatic scale. These sets typically consist of 24 pieces, one for each of the major and minor keys (sets that comprise all the enharmonic variants include 30 pieces).
I don't think anyone would say there is any ill intent even though it is based on an enumeration.
> a very real dilemma and either decision will have drawbacks
That's bread and butter for the courts. You don't need judges and juries and lawyers to see through a simple hack; trying to use a trivial enumeration to defeat a copyright claim is just a "ha ha, nice try, but nope" issue that no reasonable layman would have a problem with. Shades of colour are of critical importance in precisely those cases that are fuzzy, where there is no obvious ruling to be made. That's what the courts are made for.
Again, there is a gradation and you've not answered the question.
What is the intent if a human provided 1 bit of input by flipping a coin and chose all the odd numbered melodies? What if a human provided 1 bit of input by flipping an unobservable mental coin? What about 2 bits? N bits?
What if a machine generated all the "interesting" melodies via a neural network?
IP is not moot because you have a random number generator. IP may become less relevant if computers ever learn to select useful bit patterns out of a sea of randomness, but even then, the legal system will handle it just fine (somebody will own the selection algorithms after all).
> What is the basis by which jury likely decides intent here? Is it any different than "arbitrarily"?
Causality. You can't determine intent from bits, which leads people to (mistakenly) believe colour doesn't exist. To determine the colour of the bits, the courts will look at the actual chain of events surrounding their creation. Because that is what matters. Not what the bits are, but how you came into possession of them, and why.
A counterpoint to that is that genetically modified organisms can be patented and those are effectively random perturbations of current 'state of the art' which are then selected for fitness.
That's not a counterpoint. That's a case in point. The selection by the human (or by any other method) is the 'creativity'. The fact that the change itself occurred randomly is not important. The fact that the permutation was chosen is actually what gives it a certain color in the sense of the article.
There's also a difference between copyright and patent law.
It also makes it clear they realise the limitations of the exercise (it isn't nearly all possible melodies), and it shows their motives a bit better too (helping protect against corporate music copyright trolls).
I am not a lawyer, but as I understand copyright law there seem to be two problems with this:
- a copyrightable work must include some human creativity. It seems to me that an enumeration of possibilities might be creative, but there is no way an individual element of that enumeration can be considered creative.
- Copyright depends on copying. If you release a song with a catchy melody stolen from another song, then you infringed its copyright, regardless of whether there is a licensable version of that melody that you could have copied. What matters is which one you copied, not the existence of alternate copies.
This is a response to the Flame v Katy Perry lawsuit, where a youtube video with 100,000 views was deemed sufficient popularity for the court to assume copying without proof.
The element that Katy Perry was found to have illegally copied is a 4-note descending line with equal spacing between the notes. In both "Joyful Noise" and "Dark Horse" the pattern was probably generated by a producer pressing the "arp" button on a minor chord in a DAW. Neither of them actually plotted out each of the notes with particular purpose. But the Court still said that the 4-note descending pattern is original and creative enough, the fact that it was composed by software notwithstanding, and illegal to reproduce.
To be fair, this program is a stupid response to a stupid problem. Even if that muddied the waters for a year or two, that would just be a patch, not a fix.
Copyright claims have been won by arguing the song was so widely available that it’s unreasonable to assume the defendant did not have access to it, which is why they’re trying to spread access to this as far as possible.
Those are precisely the objections I'd raised with Damien Riehl (submitted via email, no response).
Additionally there's the problem of releasing the works to the public domain. As discussed a couple of weeks ago when the 2016 billion-dollar infringement lawsuit against Getty Images was attempted ... and thrown out of court ... the act of putting works in the public domain also extinguishes, in the court's eyes, the rights of the author to sue for any claims including moral claims of authorship.
So: clever stunt, but legally impotent, both by law and the self-neutering actions of the actors here.
Adam Neely's video (interviewing Riehl and collaborator Noah Rubin) covers many points of copyright, though not the originality, authorship, or PD angles. (Posted elsewhere in thread, echoing here.)
I think the aim is not to claim that other pieces violate this collection's copyright, but rather to invalidate the frivolous claims that such melody sequences are copyrighted to begin with.
Either the courts decide they are not, in which case the creators will be happy.
Or they are copyrightable and therefore this collection, even if it is in the public domain, is a prior art that invalidates any claim that a newly used melody is really new/copyrightable.
> It seems to me that an enumeration of possibilities might be creative, but there is no way an individual element of that enumeration can be considered creative.
What if the author was thinking about generating a specific, original melody when he hit "run"? He just wanted that melody and didn't care about the rest, but he was lazy and it was easier to find his melody among the rest of generated melodies instead of modifying the source code.
> If you release a song with a catchy melody stolen from another song...
Copied. Not stolen. Copied.
Of course I know what you meant, but it's sad that this phrasing has entered the language. In all the tensions around limits of IP protection, this one was probably the most effective trick media companies pulled. Equating copyright violation with theft (and, through implication, their moral weights), even though one has nothing to do with the other.
> His suitcase is full of noise, but what's coming out of the stereo is ragtime. Subtract entropy from a data stream – coincidentally uncompressing it – and what's left is information. With a capacity of about a trillion terabytes, the suitcase's holographic storage reservoir has enough capacity to hold every music, film, and video production of the twentieth century with room to spare. This is all stuff that is effectively out of copyright control, work-for-hire owned by bankrupt companies, released before the CCAA could make their media clampdown stick. Manfred is streaming the music through Annette's stereo – but keeping the noise it was convoluted with. High-grade entropy is valuable, too ...
It's interesting to think about this from a more general perspective: what if we're not limited to midi, and we're talking about a more general kind of _design_ problem, where the design problem is to select a good solution to a problem within some (large) but finite and theoretically enumerable search space.
We can think of selecting a good melody to fit into a song as one example of a design problem -- searching through some finite enumerable space of melodies and selecting a melody that's a good fit. A lot of the effort in this process is testing and evaluating the melody to see if it is any good to listen to / any good in context. This effort hasn't been done if you simply brute force enumerate and list the search space without testing anything.
There are many kinds of design problems where the effort of checking if a proposed solution is any good vastly outweighs the effort of suggesting a solution.
We could do the same thing for other kinds of design problems.
In principle, what if we enumerate some (vast) finite space of digital circuits -- if we put a few limits on the amount of stuff we can put into a single circuit, and discretise any continuous parameters to produce a finite space. Does that mean we can copyright all these possible circuits, even if we put no effort into seeing if any of them are fit for any purpose?
Can we enumerate some large finite space of possible arrangements of atoms into molecules and then copyright them all, without putting any effort into analysis or experimental testing to see if any of the proposed molecules are fit for any purpose?
Snisarenko made a similar criticism when this was posted 15 days ago:
> As other commenters have pointed out, this is gimmicky, shallow and clickbaity. All that they did was count from 1 to 68 billion. Any piece of digital data can be converted to a number, and if we apply their argument then you can be "creative" by just counting numbers. But we all know that's not true. Once the search space becomes that big, you can't actually "enjoy" any of these melodies. Because you can't listen to all of them in your lifetime, the ones that you WILL hear are going to be awful 99.999 percent of the time. Hence, the "creative" process is navigating this search space, and figuring out which melodies are catchy. Better yet, trying to figure out how or why our brain decides to like or not like a melody.
From my perspective of framing this situation: this is dumb, as no design of any melodies has been performed. The design is the uniform prior distribution over some space of all possible melodies, most of which will be uninteresting / not fit for any particular purpose, which is a pretty uninspiring notion of design.
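To make the "count from 1 to 68 billion" point concrete, here is a minimal sketch of such an enumeration. It assumes melodies of 12 notes drawn from 8 pitches (my reading of the numbers mentioned in this thread, since 8^12 is about 68.7 billion); the function names are made up for illustration and this is not the project's actual code.

```python
PITCHES = 8   # assumed number of distinct pitches
LENGTH = 12   # assumed number of notes per melody
TOTAL = PITCHES ** LENGTH  # size of the whole search space


def index_to_melody(index, pitches=PITCHES, length=LENGTH):
    """Return the melody at position `index` as a tuple of pitch numbers."""
    if not 0 <= index < pitches ** length:
        raise ValueError("index out of range")
    notes = []
    for _ in range(length):
        # Peel off one base-`pitches` digit per note, least significant first.
        index, note = divmod(index, pitches)
        notes.append(note)
    return tuple(reversed(notes))


def melody_to_index(melody, pitches=PITCHES):
    """Inverse mapping: read the melody as a base-`pitches` number."""
    index = 0
    for note in melody:
        index = index * pitches + note
    return index


print(TOTAL)  # 68719476736
```

Every melody is just a 12-digit base-8 number, so "generating" one takes no more judgment than incrementing a counter. The work of deciding which indices are worth listening to is not in here anywhere.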
>Can we enumerate some large finite space of possible arrangements of atoms into molecules and then copyright them all, without putting any effort into analysis or experimental testing to see if any of the proposed molecules are fit for any purpose?
Pharma corps do this (with patents), ostensibly, but IP rights have a reasonable cost to them so they pre-filter.
Since they're not trying to profit from them, they'd probably just get cease and desist letters and take the offending ones down. Since one of the founders is a lawyer, it's probably feasible to handle most minor conflicts that come up while still protecting the vast majority of the melodies.
This reminds me a lot of the short story "The Library of Babel" by Argentinian author Jorge Luis Borges, in which he conceives "a universe in the form of a vast library containing all possible 410-page books of a certain format and character set".
I know it's many orders of magnitude more difficult, but if I got a printer to print out every possible piece of art and published it, CC licensed it, etc., wouldn't that have the same legal status as these melodies... probably none?
I showed this article to a musicologist (in the UK) and he said:
> I assume that somehow lawyers can refer to a precedent of "this is too ridiculous to be taken seriously"
The other thing is that machine-generated works almost by definition do not reach the "creative" aspect of the "original and creative" requirement for copyright to apply.
Just to quickly add, I think the project is cool, it provokes interesting discussion and it doesn't need to be legally watertight or achieve what they say they set out to, to be worthwhile.
Love the spirit of this work, but nobody is an authority on what a melody is. How many different tuning systems were used? Why 8 notes? Why not 0..n notes? [0] Is a given melody, realized at 10ms per beat a melody? How about a 17-month pause before the first note? [1] Absurd? Sure. But who decides where the boundaries are? Lest one think that there is only an appeal to extremes here, check out Polansky's beautiful Lonesome Road, and put boundaries around melodies [2].
99% of all music is pop music, and the tuning system for all pop music is 12-tone equal temperament. Would a jury even be able to tell the difference between 12TET and just intonation?
8 notes is what popular music is based on. I don't see why it matters what the tempo is. Pretty sure no one would be in danger of copyright infringement of a 17 month pause.
OK, forget everything I said but the 8 notes thing. Just, no. "Yesterday"? Great melody. Whistle just the first eight notes. "Yesterday, love was such an eas". Hmm. Wonder if there are more examples.
You wouldn't use a CNN for this because there's no spatial correlation to take advantage of. Maybe an RNN somehow, but the basic problem is scoring (i.e. how melodic a song is), not classification, so honestly I don't know how you'd do this using neural networks.
On the contrary, with a more generous reading of the previous comment, it holds some merit.
1. CNN's are used fairly commonly for sequence tasks nowadays. Convolutions can be 1D after all.
2. It's also possible the previous comment was referring to using 2D convolutions on the spectrogram of the audio, which is a common approach.
3. Neural networks are capable of more than classification. Scoring is a regression task, which is a common application of neural networks.
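A minimal sketch of points 1 and 3, in plain Python with no ML library. The melody, the kernel, and the `smoothness_score` function are all made up for illustration; in a real network the kernel weights would be learned rather than hand-picked.

```python
def conv1d(sequence, kernel):
    """Valid-mode 1D cross-correlation, the operation CNN layers call convolution."""
    k = len(kernel)
    return [
        sum(sequence[i + j] * kernel[j] for j in range(k))
        for i in range(len(sequence) - k + 1)
    ]


# Hypothetical melody as MIDI note numbers.
melody = [60, 62, 64, 65, 67, 65, 64, 62]

# The hand-picked kernel [-1, 1] responds to the interval between adjacent notes,
# and it gives the same response wherever that interval occurs in the sequence.
intervals = conv1d(melody, [-1, 1])
print(intervals)  # [2, 2, 1, 2, -2, -1, -2]


def smoothness_score(seq):
    """Toy regression-style output: one real number rather than a class label."""
    steps = conv1d(seq, [-1, 1])
    return -sum(abs(s) for s in steps) / len(steps)
```

The kernel slides along the time axis, so it picks up local patterns (here, intervals) regardless of position, which is the kind of structure a 1D convolution can exploit in a note sequence.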
I have some questions (mainly to improve my own understanding):
2. Since data is MIDI-encoded, would a convolution hold any merit here? I suppose you could render to an mp3 and analyze the audio itself but that seems very computationally expensive and prone to overfitting.
3. If we're training a scoring classifier, we would need labeled data, but getting those labels seems very challenging, not least because of how subjective our impressions of melodies can be (for instance, the opinions of a fan of atonality would be drastically different from a fan of pop). Do you have any ideas on how to mitigate this?
do people really run 2d convolutions on spectrograms? that seems rather backwards to me - why convolve in frequency space when you can just multiply in the time domain.
re regression tasks: sure I guess that's just an embedding basically right?
Yes, I meant to run 2D CNN over generated spectrograms. We do something similar to classify some specific emissions in RF with good success. As for scoring/classification, you can start by having an output of CNN say whether the song is catchy or not.
why run a CNN over a spectrogram? besides what I said (convolutions in frequency domain are multiplications in time domain) an FT is linear. if classifying using those features were effective then your CNN would've learned the DFT matrix weights from the original signal.
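The linearity claim is easy to check numerically. A small sketch in pure Python follows; the test vectors are arbitrary and the helper names are made up for illustration.

```python
import cmath


def dft_matrix(n):
    """n x n DFT matrix: entry (k, t) is exp(-2*pi*i*k*t/n)."""
    return [
        [cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n)]
        for k in range(n)
    ]


def matvec(matrix, vector):
    """Plain matrix-vector product."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


N = 8
W = dft_matrix(N)
x = [1.0, 2.0, 0.0, -1.0, 0.5, 0.0, 3.0, -2.0]
y = [0.0, 1.0, 1.0, 0.0, -1.0, 2.0, 0.0, 0.5]

# Linearity: DFT(2x + 3y) should equal 2*DFT(x) + 3*DFT(y).
lhs = matvec(W, [2 * a + 3 * b for a, b in zip(x, y)])
rhs = [2 * p + 3 * q for p, q in zip(matvec(W, x), matvec(W, y))]
max_err = max(abs(p - q) for p, q in zip(lhs, rhs))

# A dense layer representing this transform needs one weight per matrix entry.
weights = N * N
```

So the transform is exactly a matrix multiply, and a dense linear layer could in principle represent it. Whether training would actually find those weights is a separate question that this sketch does not address.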
A spectrogram has time on one axis and frequency on the other, so the ultimate result is a multiplication in one dimension and a convolution in the other. It can be used to show things like when a note starts and stops in a piece of music, which is difficult in either purely-time or purely-frequency space.
Also, it’s computationally intractable to individually train 2^N weights. What a CNN does instead is train a convolution kernel which is passed over the whole domain to produce the input for the next layer; by operating in frequency space, it’s considering the basis functions e^{j omega +- epsilon} instead of delta(x +- epsilon)
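A toy illustration of the time/frequency point, assuming the simplest possible spectrogram: non-overlapping frames and no window function (real implementations usually use both overlap and a window). The signal is synthetic: silence, then a tone, then silence.

```python
import cmath
import math


def dft_magnitudes(frame):
    """Magnitudes of the non-negative frequency bins of one frame."""
    n = len(frame)
    return [
        abs(sum(frame[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n)))
        for k in range(n // 2 + 1)
    ]


def spectrogram(signal, frame_size):
    """Rows are time frames, columns are frequency bins."""
    return [
        dft_magnitudes(signal[start:start + frame_size])
        for start in range(0, len(signal) - frame_size + 1, frame_size)
    ]


FRAME = 32
BIN = 4  # the tone completes 4 cycles per frame, so it lands in bin 4
tone = [math.sin(2 * math.pi * BIN * t / FRAME) for t in range(2 * FRAME)]
signal = [0.0] * FRAME + tone + [0.0] * FRAME  # silence, note, silence

spec = spectrogram(signal, FRAME)
active_frames = [i for i, row in enumerate(spec) if row[BIN] > 1.0]
print(active_frames)  # [1, 2]
```

The row index tells you when the note sounds and the column index tells you which pitch it is. A single DFT over the whole signal would give the pitch but not the onset and offset.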
my mistake i didn't realize spectrogram and spectrum were distinct objects.
>Also, it’s computationally intractable to individually train 2^N weights.
that's a good point - i'd forgotten for a moment (because i'm so used to cooley-tukey fft) that in principle getting the spectrum involves a matmul against the entire vector. which brings up a potentially interesting question: can you get a DNN to simulate the cooley-tukey fft (stride permutations and all).
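For reference, a short recursive radix-2 Cooley-Tukey sketch checked against the naive O(N^2) matmul-style DFT. This only shows the structure being asked about (even/odd split plus butterflies); it says nothing about whether a network can learn it.

```python
import cmath


def naive_dft(x):
    """O(N^2) DFT: equivalent to multiplying by the full DFT matrix."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def fft(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    if n % 2:
        raise ValueError("length must be a power of two")
    even = fft(x[0::2])  # the stride-2 split is the "stride permutation"
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        # Butterfly: combine the two half-size transforms with a twiddle factor.
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


x = [1.0, 2.0, 0.0, -1.0, 0.5, 0.0, 3.0, -2.0]
max_err = max(abs(a - b) for a, b in zip(fft(x), naive_dft(x)))
```

Each recursion level is a fixed permutation followed by a sparse linear map, which is why the FFT can be viewed as a product of log2(N) sparse matrices rather than one dense N x N matrix.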
Assuming we accept their premise: how does this prevent lawsuits? Wouldn't the melody-set they've created be subject to a copyright infringement claim from every melody produced before their experiment that is still under copyright?
The method of generation prevents that. They didn't listen to song A and write down the melodies from it, then listen to song B and write down those melodies. They generated them all in order using a computer program.
Presumably it prevents lawsuits because I can write a song starting with a melody from this archive knowing that it's in the public domain. (Or because it makes obvious the absurdity of having a copyright on a 12-note melody in the first place.)
It doesn’t prevent lawsuits, or do anything at all. Other comments here are close; this material is not an original work, so copyright does not subsist in it. No copyright, no ability to place in public domain or issue licenses at all. This is the equivalent of pouring a glass of water in the ocean and then declaring you own all international waters.
It certainly contains many melodies which are not original. But what about the rest? It either contains some melodies which are original or it contains no original melodies. In the former case they can be copyrighted, and in the latter case they cannot (well, that hinges on access rather than mere equality, but I'll ignore that). However, in the latter case nobody else can get copyright on any new melodies either.
In copyright law, original does not mean new. It means that a person originated the work. Here, this was done by a computer. Remember the monkey who took a selfie who couldn’t originate a work, and therefore there was no copyright work to be owned? This is a monkey on a typewriter.
Copyright law protects expression, and the labour required to produce that expression. Just like people who copy others’ work to gain from it, the “authors” have attempted to gain from doing no work and letting a computer give them all the world’s music. Imagine if they didn’t release their library into the public domain, and instead started to immediately sue over every new song that came out? It is fortunate that they cannot do either.
I hope someone Copyrights all possible Java method signatures for the Oracle v Google case. You could write a "compression/decompression" tool in the website's javascript for people to check if you own the Copyright or not. There would be many pages of data so you might want to make it work offline with a Service Worker. To prove you are on the up and up.
I'm sure this has already been remarked upon in comments, but.... Not every possible melody!
It covers a significant portion of the melodies likely to be generated by people; put another way, these are the more trivial melodies. The range of complexity is vast, so just as password cracking might use a dictionary attack, or the English alphabet, or a similarly limited scope, so it goes here.....
All very interesting discussions. Some have touched on it, but let's be clear: mechanically generated sequences are not copyrightable. The program that generates them is, of course.
It doesn't matter if there isn't enough space to enumerate them all. It doesn't matter if most of them sound like crap. It doesn't matter if some match a pre-existing song. It doesn't matter if you do or do not highlight "interesting" songs.
The very fact that they are machine generated eliminates the element of creativity that is a central requirement to copyright.
Copyright law may be stupid, but it isn't this stupid. An even more stupid argument that isn't even valid doesn't really help.
> 503.03(a) Works-not originated by a human author.
> In order to be entitled to copyright registration, a work must be the product of human authorship. Works produced by mechanical processes or random selection without any contribution by a human author are not registrable.
What an awesome effort. However, frivolous suits will still remain, especially in places where the burden of proof rests on the defendant. A lot of these suits are filed knowing that they can't win, but use the machinery to drain the defendant's resources until they are forced to submit because they have no means to fight it.
Not the greatest recording, but fantastic lightning talk kinda along the same lines but for code, "Mark Dominus - Debugging the De Bruijn Sequence (YAPC::Asia 2007)"
It wouldn't work in my country. Thing has to have individual creative quality to be copyrighted. Autogenerated stuff doesn't get copyright unless a human chose one of the autogenerated things for some particular creative reason. Then this thing gets copyright.
The idea might be to prevent lawsuits, but presumably the authors have opened themselves up to being sued by anyone who already owns a melody. If melody owners can't sue the authors then their work can't protect anyone either.
The artist suing, however, would have to worry about legitimizing this work. If the artist suing were to win, it would be setting precedent that everything else in the archive is also copyrightable, including all the melodies that are novel.
Won't they have inadvertently copyrighted tunes already under copyright? That is to say for anything written before this stunt, they've given away rights they don't own.
I hope that will not hurt them. Imagine that in a copyright lawsuit I referred to their MIDI archive as my source, and because of that they were held liable for copyright infringement.
I think it's an interesting idea; however, wouldn't this also risk generating already-copyrighted melodies? Putting archive.org at risk of getting massively sued?
It can be argued that since the melodies were generated by an algorithm, they are not a product of creative process and therefore not subject to copyright.
This reads like an Onion article. I heard about this on Lawful Masses with Leonard French, a copyright attorney who pwned Prenda Law. Also, the number of potential melodies, considering the degrees of freedom involved, is on the order of the number of games of chess. IANAL but this doesn't seem more than a stunt to call attention to the legal climate.
That is why I hope that experiments such as the one that the artists in the article do aren't successful and their copyright does not hold up in court.
If it did, an actor with more resources could feasibly blanket own any cultural good of value and automatically, quantifiably argue how close any arbitrary third party creation is to their copyrighted material.
I've often thought how strikingly easy it seems to be to produce sentences that have probably never been written before.
For example:
I am typing these words on a mobile phone whilst sat on a train to an airport, wearing a green jacket, brown shoes and a maroon hat with a black bag containing gifts for my children.
That seems a fairly prosaic sentence at first glance but I doubt has been written before.
I’m sure I’ve read (in the context of Google searches) that it’s very easy to write a sentence that’s never been written. Most sentences on HN are probably completely unique.
If you create a program which provably generates all of the members of a specified set, do you own the copyright to a member of this set regardless of the runtime of your program?
https://news.ycombinator.com/item?id=22413526
Edit: And apparently discussed even more 2 weeks prior:
https://news.ycombinator.com/item?id=22301091