Programmers generate every possible melody in MIDI to prevent lawsuits

BookPage · on Feb 28, 2020

A Vice article on the same thing was submitted 2 days ago and discussed:

https://news.ycombinator.com/item?id=22413526

Edit: And apparently discussed even more 2 weeks prior:

https://news.ycombinator.com/item?id=22301091

dr_dshiv · on Feb 28, 2020

And discussed in 1692, by Bernoulli, Mersenne and Kircher

http://articles.adsabs.harvard.edu//full/1979HisSc..17..258K...

kabdib · on Feb 28, 2020

Back in the day I worked for Atari, writing game cartridges for their line of home computers. I had a decent relationship with marketing and had developed a reputation as one of the more helpful geeks.

One fine day a marketing guy knocked on my door and asked:

"What would it take to print out every possible eight-by-eight bitmap? We want to copyright them so our competition can't use them."

Seriously.

So I told him the story about the king who wanted to reward the inventor of chess, and upon being asked what he wanted the inventor said "one grain of wheat for the first square, two for the second, four for the third..."

I thought a bit, and added "I think that printout would outweigh the planet."

He went away. I was not a helpful geek that day.

Later I realized that I had only considered black-and-white bitmaps, and that preempting copyright on color bitmaps would have meant a lot more planets.

There's got to be a Douglas Adams style tie-in here somewhere involving aliens with planet-chewer-uppers invading and taking Saturn, Mars, Jupiter and then us for some cockamamie copyright scheme in another galaxy . . .

syntheticnature · on Feb 28, 2020

Once, a guy at a bar told me he wanted to do a similar thing: displaying a 100x100 image that changed once a second to run through all possible bitmaps. He didn't believe me when I told him that doing so would exceed the lifespan of the planet (to say nothing of what state the universe would be in).

tripzilch · on Feb 28, 2020

What I find useful when explaining to people how long some very long count will take is to tell them that 1 billion seconds lasts about 30 years, or that counting to a billion takes about 30 years[0]. Not sure if it would have helped with your guy, of course, but it generally provides some insight :)

[0] I wonder about estimating that time, many numbers may take longer than a second to even pronounce.
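
The 30-year figure is easy to check, and the same arithmetic shows why the 100x100 display would never finish. A rough sketch in Python, assuming a 365.25-day year and taking 13.8 billion years as the age of the universe (that age is an assumption brought in for scale, not a figure from the thread):

```python
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60   # assumes a 365.25-day year

# One billion seconds expressed in years (a little under 32).
billion_seconds_in_years = 1e9 / SECONDS_PER_YEAR
print(f"{billion_seconds_in_years:.1f} years")

# A 100x100 one-bit image has 2**(100*100) possible states.
states = 2 ** (100 * 100)
digits_in_states = len(str(states))        # decimal digits in that count

# Seconds elapsed in 13.8 billion years (assumed age of the universe).
universe_seconds = int(13.8e9 * SECONDS_PER_YEAR)
print(digits_in_states, "digits vs", len(str(universe_seconds)), "digits")
```

At one image per second, the number of seconds required has thousands of digits, while the number of seconds elapsed so far has fewer than twenty.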

olooney · on Feb 28, 2020

Beholding the Big Bang, 2009, by Arthur Ganson

A series of 12 pairs of reduction gears, each at a 50:1 ratio. On the left, a motor spins at a high rpm; on the right, the drive shaft is embedded into a concrete block. It will be 13.7 billion years before the final gear completes a single rotation.

https://www.youtube.com/watch?v=VCA2whpMCno

nkrisc · on Feb 28, 2020

While rationally I understand how it works, it just seems so strange that so much movement can so easily be effectively nullified.

NikolaeVarius · on Feb 28, 2020

I wonder what the system as a whole is rated to. I suspect the motor will give out within another decade. I doubt that the system used a super long life motor, but I could be very wrong.

amichal · on Feb 28, 2020

This is at the MIT museum. I first saw it as a child decades ago and many times since. The motor has been visibly replaced/refurbed and I'm sure they turn it off at night. All that said, when you see the first gear or two zipping around and stand there and stare at the Nth gear trying to see any movement at all, it really drives home how these things scale. As a kid I once watched it for over an hour to see if I could see ANY movement in the first gear that looked still.

kevinventullo · on Feb 28, 2020

I recently celebrated my billionth second :)

effingwewt · on Feb 28, 2020

Haha, nice! Here's to the next billion seconds! This is now my favorite way to tally life~

mooman219 · on Feb 28, 2020

An 8 by 8 bitmap has 64 pixels. There are 2^64 possible configurations of bitmaps, or 18,446,744,073,709,551,616.

Your average printer can print at 600 dots per inch^2. A4 paper is 11.69in by 8.27in, with an area of 96.6763in^2. You can fit 58005.78 dots on an A4 paper assuming you can print on the edges, double that for both sides at 116011.56 dots.

Ignoring actually fitting the bitmaps on the paper, you can store 1812 bitmaps per sheet of paper. You can roughly fit all possible combinations of bitmaps on 10,176,500,000,000,000 sheets of paper.

Typical office paper weighs 5 grams. Ignoring the weight of ink, your total mass of paper would be 50,882,500,000,000 kilograms of paper. Pluto weighs vastly more at 13,090,000,000,000,000,000,000 kilograms.

You definitely don't need a planet's mass worth of paper, but maybe a couple of planets worth of production however. We'd be more likely to see planet scale enslavement in a Douglas Adams galaxy wide copyright mockery scheme, which galactic court would likely throw out since they don't appreciate mockeries of law, leaving everyone a bit disappointed and with entirely too much visibly grey paper.
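
The paper estimate above can be reproduced in a few lines. This is only a sketch: every constant is taken from the comment as written, including its reading of "600 dots per square inch". The second call uses 600 dots per inch along each axis, which is an alternative assumption and not the commenter's.

```python
# Constants as stated in the comment above.
BITMAPS = 2 ** 64                       # 8x8 one-bit images
A4_AREA_SQ_INCH = 11.69 * 8.27          # about 96.68 square inches
GRAMS_PER_SHEET = 5
PLUTO_KG = 13_090_000_000_000_000_000_000

def sheets_needed(dots_per_sq_inch):
    """Sheets required if each bitmap uses 64 dots and both sides are printed."""
    dots_per_sheet = A4_AREA_SQ_INCH * dots_per_sq_inch * 2
    bitmaps_per_sheet = dots_per_sheet / 64
    return BITMAPS / bitmaps_per_sheet

sheets = sheets_needed(600)             # the comment's assumption
mass_kg = sheets * GRAMS_PER_SHEET / 1000
print(f"{sheets:.3e} sheets, {mass_kg:.3e} kg, Pluto/paper = {PLUTO_KG / mass_kg:.1e}")

# Alternative assumption (not from the comment): 600 dots per inch on each axis.
dense_sheets = sheets_needed(600 * 600)
print(f"{dense_sheets:.3e} sheets at 600 x 600 dots per square inch")
```

Either way the stack of paper is far lighter than Pluto, which is the point the comment makes.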

kabdib · on Feb 29, 2020

(chuckles)

Now, for the colored bitmaps case, how much does the ink cost? :-)

We're also ignoring some edge effects, where bitmaps can be "shared" by sweeping an 8x8 frame across a printed page at the level of pixels to generate the patterns. I don't know how to think about the scale of that problem right now, don't even know if it's easy to do optimally or wickedly difficult, or if Knuth has a solution somewhere in Vol 4 . . .

mooman219 · on Feb 29, 2020

A black ink HP 64XL cartridge can print 600 pages assuming a 5% page coverage of ink per page. We're actually covering roughly 50% per page, and we do that on both sides, so we can only effectively print 30 pages with our HP 64XL cartridge.

The cartridge costs about $40. We can expect to spend around $407,060,000,000,000,000 on printer ink for this endeavor, but HP might consider a bulk rate at that point.

The gross world product is roughly $77,868,000,000,000. If we all unite under this noble cause (and assume no economies of scale), we can pay off the cost of the ink in only 5227 and a half years, assuming paper is free (which it totally is, it grows on trees!). If someone works out how much carbon this captures, we might prevent further global warming to troll domestic courts.
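
For anyone checking the ink arithmetic: the quoted total equals the sheet count times $40, which is what one cartridge per sheet would cost. Spreading each cartridge over the 30 sheets mentioned above gives a figure 30 times smaller. A sketch of both readings, using only numbers quoted in the thread:

```python
SHEETS = 10_176_500_000_000_000            # sheet count quoted earlier in the thread
SHEETS_PER_CARTRIDGE = 30                  # 600 pages at 5% coverage, 50% coverage, both sides
CARTRIDGE_PRICE = 40                       # dollars, as quoted
GROSS_WORLD_PRODUCT = 77_868_000_000_000   # dollars per year, as quoted

# Reading 1: one cartridge per sheet (reproduces the quoted total).
quoted_total = SHEETS * CARTRIDGE_PRICE

# Reading 2: each cartridge covers 30 sheets.
per_cartridge_total = SHEETS / SHEETS_PER_CARTRIDGE * CARTRIDGE_PRICE

for label, total in [("one cartridge per sheet", quoted_total),
                     ("30 sheets per cartridge", per_cartridge_total)]:
    years = total / GROSS_WORLD_PRODUCT
    print(f"{label}: ${total:.4e}, {years:.1f} years of gross world product")
```

Under the second reading the payoff time drops from thousands of years to under two centuries, still assuming free paper.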

Zacru · on Feb 29, 2020

Sounds like a 2-d superpermutation. That is going to get messy very quickly.

https://en.m.wikipedia.org/wiki/Superpermutation

kabdib · on Feb 29, 2020

Wheee! Thanks for that.

ninju · on Feb 28, 2020

A small tangent over to describing how many different permutations you can have with a deck of cards

Taken from https://czep.net/weblog/52cards.html

So, just how large is it? Let's try to wrap our puny human brains around the magnitude of this number with a fun little theoretical exercise. Start a timer that will count down the number of seconds from 52! to 0. We're going to see how much fun we can have before the timer counts down all the way.

audunw · on Feb 28, 2020

Most of the generated images would just be noise though. You could probably be more clever than just generating random images, and have a decent chance of generating something that someone would design. These days you could use neural networks, but obviously that wasn’t feasible then

jniedrauer · on Feb 28, 2020

I tried calculating this number for 16-bit images in Python (8^8^16), and the interpreter just froze. Go was pretty quick though: 3.940200619639448e+115

ericfrederich · on Feb 28, 2020

Not sure where that math came from. 16 bit 8 by 8 pixel image would be 16^(8x8) or in Python...

  print(f"{16 ** (8*8):.4E}")

  1.1579E+77

EDIT: I calculated 16 color, not 16 bit. Disregard

Python wanted to use a float to convert to exponential format but it overflowed. Need to use strings ;-)

  n=(2**16)**(8*8); s=str(n); print(s[0] + '.' + s[1:6] + 'E+' + str(len(s) - 1))

  1.79769E+308

DeusExMachina · on Feb 28, 2020

For scale, there are an estimated 10^80 atoms in the entire universe.

esoterica · on Feb 28, 2020

Neither of those numbers are correct. Firstly, exponentiation is right associative so 8^8^16 ≠ (8^8)^16 ≈ 3.94e+115.

Secondly, that number is wrong anyway. The correct number is (2^16)^(8*8).
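
Both points are quick to confirm in Python, where `**` is also right-associative. A small sketch; the variable names are only illustrative:

```python
# ** groups from the right: a ** b ** c means a ** (b ** c).
assert 2 ** 3 ** 2 == 2 ** (3 ** 2) == 512
assert (2 ** 3) ** 2 == 64

# The left-grouped reading is the value Go printed.
left_grouped = (8 ** 8) ** 16              # equals 8 ** 128
print(f"{left_grouped:.4e}")               # 3.9402e+115

# Count of 8x8 images with 16 bits per pixel.
sixteen_bit_images = (2 ** 16) ** (8 * 8)
assert sixteen_bit_images == 2 ** 1024
print(len(str(sixteen_bit_images)), "decimal digits")
```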

jniedrauer · on Feb 28, 2020

Yeah apparently each pixel of a 16-bit image can have 256 different colors. Oops.

messe · on Feb 28, 2020

That's an 8-bit image. Each pixel of a 16-bit image could have 2^16 = 65536 possible colours.

jniedrauer · on Feb 28, 2020

I give up. I shouldn't post before having coffee.

hansvm · on Feb 28, 2020

Python is probably trying to compute it exactly.
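
That would explain the freeze. Python integers are arbitrary precision, so `8 ** 8 ** 16` asks for the exact value of 8^(8^16). Its size can be estimated without computing it, as in this sketch:

```python
# 8 ** n == 2 ** (3 * n), so the result needs 3 * n + 1 bits.
assert (8 ** 5).bit_length() == 3 * 5 + 1   # sanity check on a small case

exponent = 8 ** 16                          # the right-hand part, 2 ** 48
bits_needed = 3 * exponent + 1
terabytes = bits_needed / 8 / 1e12
print(f"roughly {terabytes:.0f} TB just to hold the result")
```

A float-based calculation would overflow instead of trying to store that many bits, which fits the observation that Go returned quickly with the left-grouped value.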

yellowapple · on Feb 29, 2020

You could've settled for every 4×4 bitmap, of which there are only 65,536 black and white ones. Then you could claim any 8×8 bitmap as a derivative work (by concatenation/aggregation/whatever) of up to four of your 4×4 bitmaps.
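
A minimal sketch of that decomposition, assuming a bitmap is stored as 8 rows of 8 ints (0 or 1). The function name and encoding are made up for illustration:

```python
def tile_ids(bitmap):
    """Split an 8x8 bitmap into four 4x4 tiles, each encoded as an int.

    A tile's 16 pixels are read row by row, most significant bit first,
    so every tile maps to a number in range(2 ** 16).
    """
    ids = []
    for top in (0, 4):
        for left in (0, 4):
            value = 0
            for row in bitmap[top:top + 4]:
                for pixel in row[left:left + 4]:
                    value = (value << 1) | pixel
            ids.append(value)
    return ids

FOUR_BY_FOUR_COUNT = 2 ** (4 * 4)           # 65,536 black-and-white 4x4 bitmaps

checkerboard = [[(r + c) % 2 for c in range(8)] for r in range(8)]
print(FOUR_BY_FOUR_COUNT, [hex(t) for t in tile_ids(checkerboard)])
```

Any 8x8 bitmap is then fully described by four indices into the 65,536-entry catalogue.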

tunesmith · on Feb 28, 2020

It's about fourteen kinds of ridiculous, as summarized in other threads. No rhythms, no meter, no tempo, melodies are longer than 12 notes, it's diatonic, single octave, no concept of underlying harmony, the headline is literally false, etc.

Some of the copyright lawsuits are dumb and this is effective satire or performance art but that's all it is.

Intermernet · on Feb 28, 2020

It's definitely satire, but it's satire in the face of comical law. That's the point. If copyright lawyers want to argue originality based on an arrangement of notes in a 12 tone scale, and in a limited number of bars, then this is a completely valid argument against such a weak argument.

The reality is that many number one songs can be tonally compared to many classical pieces, or even pieces from the last 40 years. The current state of music copyright law is an absolute joke, and deserves to be "disrupted" (destroyed).

paulddraper · on Feb 28, 2020

> copyright lawyers want to argue originality based on an arrangement of notes in a 12 tone scale

Interestingly, there are writers who want to argue originality based on the arrangement of letters in 26-letter alphabet!

The world is indeed a strange place for the dogmatically logical programmer.

hannasanarion · on Feb 28, 2020

You're not gonna get away with copyrighting a 4-letter sequence, but the recent Katy Perry vs Flame lawsuit established that 4 notes is all it takes to have a copyrightable melody.

paulddraper · on Feb 28, 2020

It's not four, but eleven letters and two spaces was sufficient for Universal to win against Kamar in the "ET phone home" copyright lawsuit

bdowling · on Feb 28, 2020

> “ET phone home”

Usually short phrases aren’t supposed to be protectable under copyright. However, when a defendant blatantly appropriates a well-known literary phrase for a commercial purpose like selling unlicensed merchandise, courts may make an exception.

notahacker · on Feb 28, 2020

Yep. One of the arguments no sane person should countenance is that Katy Perry's songwriters happened to use a four note descending synth arpeggio in an intentional attempt to cash in on the fact a four note descending arpeggio in a different key was a motif used on one of the sixteen tracks of an album which hit number five in the Gospel Charts four years earlier. For similar reasons, Universal isn't going after most of the 3.8m websites using the phrase 'phone home', and you're probably OK using three stripes in artwork unless you're drawing them on the shoulders of sportswear or sides of shoes to make it look like Adidas.

[there actually are musicians that specialise in recording backing tracks intended to resemble a particular popular recording which aren't that recording for use in commercial products, but they tend not to get sued...]

lmkg · on Feb 28, 2020

Copyright law considers the importance of a sample to the work as a whole, in addition to just the size. Any arbitrary three-word phrase from the middle of a script is probably not copyright infringement. The most memorable line from the entire movie probably is.

Related: The Supreme Court ruled that excerpting a single paragraph from a 454-page book can be copyright infringement. The book was Gerald Ford's memoirs, the one paragraph was his reasoning for pardoning Nixon. The Court's reasoning was more-or-less that nobody cared about anything else Ford did, so excerpting the one paragraph was as good as giving the whole book away for free.

bdowling · on Feb 29, 2020

The amount and substantiality of the copied portion to the work as a whole is one of the factors for fair use, an affirmative defense.

Whether a work in question is even eligible for protection at all is a distinct legal question.

paulddraper · on Feb 29, 2020

Yes, but no one is disputing the legal eligibility of movies or songs for copyright protection.

The question in play is how much of the copyrighted work may be reproduced before it is infringement.

paulddraper · on Feb 28, 2020

Yes, that is very similar to the copyright argument made by Flame.

Of course not every set of four notes or eleven letters in a copyrighted work is prohibited; the significance and context of the use matters.

derefr · on Feb 28, 2020

Wouldn’t that be considered something more like an implicitly-created trademark? It’s essentially the equivalent of a company motto for the movie’s SPV company.

logfromblammo · on Feb 28, 2020

I would note that 4 minutes and 33 seconds of silence is copyrighted by John Cage as " 4'33" ", and a single chord sustained for 20 minutes, followed by 20 minutes of silence, is copyrighted by Yves Klein as "The Monotone-Silence Symphony".

jerf · on Feb 28, 2020

Cage's case is more complicated and less a violation than it may sound at first. It's not 4'33" of "mathematically true silence", that is, you don't violate it per se just by having the right number of zeros in your .wav file. It's 4'33" of performed silence, where the performance is actually the incidental noise of the auditorium it is in. Having a copyright on this particular piece specifically in a performance context may still be an interesting thought experiment, but it doesn't break the system as a whole.

In the latter case, there's also no real risk of accidentally stomping on that.

The claim in this particular case is that they really have generated the entire possible melody space. Legally I think it's likely to fail on multiple levels if it is ever challenged, but part of the point is that some of those failures should also be applied to some real copyright suits that have been won.

(It is somewhat ironic that the music industry continues to be so upset about copyright even as they appear to be converging on The One True Pop Song at speed. Maybe if they acted less like some sort of bizarrely over-trained AI and cranked up the exploration constant, they'd stomp on each other less.)

TeMPOraL · on Feb 28, 2020

> Maybe if they acted less like some sort of bizarrely over-trained AI and cranked up the exploration constant, they'd stomp on each other less.

It's not just the music industry. The whole economy starts to feel like overfitting the profit function.

salawat · on Feb 28, 2020

>It's 4'33" of performed silence, where the performance is actually the incidental noise of the auditorium it is in.

That's an incorrect way to view it actually. The copyrightable essence of 4'33" is actually represented by the active production of its scoring, i.e. nothing. The background sound is not what makes it copyrightable. You can go ahead and sit at a piano for the length of the composition all you want, wherever you want, and you'll still be publicly performing 4'33".

The rather humorous outcome, if one asks me, is that anyone who writes 4 beats of silence into a score should be violating copyright if we're going to be consistent.

logfromblammo · on Feb 28, 2020

Some smartass artists have actually written their silent compositions as rhythmically structured rests. That is, something like "silence in 6/4 time, as three quarter rests and three eighth-triplet rests". It's intentionally absurd, but as the minimum-information notation would be the full-measure rest glyph and the number of measures of duration, the deviation from this is undoubtedly creative content that is copyrightable on paper.

That kind of thing is completely unenforceable with respect to performances, but in written musical notation, copying the specific notation pattern could be infringement. If you write "4/4, tempo 80, 91-measure rest", that's maybe violating the 4'33" copyright. If you write a score for a full band or orchestra that shows rests in each measure for each instrument, with key changes and tempo changes and such, you're just retelling the same joke in a different way.
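
The "4/4, tempo 80, 91-measure rest" example does come out to the right length, as this quick check shows (assuming the tempo counts quarter-note beats per minute):

```python
BEATS_PER_MEASURE = 4      # 4/4 time
TEMPO_BPM = 80             # assumed to mean quarter-note beats per minute
MEASURES = 91

total_seconds = MEASURES * BEATS_PER_MEASURE * 60 / TEMPO_BPM
minutes, seconds = divmod(int(total_seconds), 60)
print(f"{minutes}'{seconds:02d}\"")   # 4'33"
```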

derefr · on Feb 28, 2020

> Having a copyright on this particular piece specifically in a performance context may still be an interesting thought experiment

Not even a thought experiment: it’s essentially a 4’33”-long ambient acoustic sample. There are plenty of these (though not usually that long) in sample libraries, recording e.g. traffic sounds, or diner conversation, or crickets in a marsh in summer, etc. And those are certainly copyrighted, unable to be used without license.

salawat · on Feb 28, 2020

>Not even a thought experiment: it’s essentially a 4’33”-long ambient acoustic sample.

I would question any legal professional's authoritative standing to even advise on copyright of a work of music if they miscategorize a recording of ambient sound as a performance of a musical scoring consisting entirely of silence. The copyright doesn't apply to the ambient sound, but to the long quiescence of an artist at their instrument.

It demonstrates a complete blindness of the negative space of music, and a positivistic bias that has no place being enshrined in our legal system.

jerf · on Feb 28, 2020

Yeah, it's more a thought experiment you get in copyright class. You are right, legally the matter is fairly settled.

SilasX · on Feb 28, 2020

Exactly! What they're doing would otherwise seem pointless, but given how far off the rails the courts went on that lawsuit, copyright jurisprudence needs this kind of sanity check, because courts clearly aren't using enough criteria to call a melody "copied".

ajuc · on Feb 28, 2020

The whole concept of IP is absurd, and there are many absurd consequences you can derive from it.

But there are degrees of absurdity: it's one thing to do that when there are 26^100000 possible combinations, it's another when there are just 12^100 (and if you only care about melody it's an overestimation; most songs will use a much smaller subset of that).
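
To put those two spaces side by side, here is a sketch that compares their sizes by decimal digit count. It uses logarithms rather than building the integers, partly because recent Python versions limit how large an integer can be converted to a string by default:

```python
import math

def decimal_digits(base, exponent):
    """Number of decimal digits in base ** exponent, via log10."""
    return math.floor(exponent * math.log10(base)) + 1

text_space = decimal_digits(26, 100_000)    # 26^100000: 100,000 letters
melody_space = decimal_digits(12, 100)      # 12^100: 100 notes from 12 tones
print(text_space, "digits vs", melody_space, "digits")
```

Both numbers are astronomically large, but the text space has over a thousand times as many digits as the melody space.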

shadowgovt · on Feb 28, 2020

It's not like this absurdity was unknown to the framers of copyright. Thomas Jefferson, writing regarding patents, wrote "other nations have thought that these monopolies produce more embarrassment than advantage to society..." And yet, they persist (and in the modern era, proliferate) because it's generally agreed society derives benefit from paying people for their purely-idea creative work.

... but that doesn't mean copyright and patent aren't a perpetual battle against the "natural" arrangement of idea, and absurdities are extremely possible when the law is misinterpreted or mis-structured.

JackFr · on Feb 28, 2020

It's weird -- legal arguments aren't protected under any sort of intellectual property regime. In fact in common law jurisdictions, you're encouraged to base your work off that of previous practitioners. I always wonder how different our intellectual property regime would be if lawyers demanded to be paid royalties for others citing cases which they had won.

shadowgovt · on Feb 28, 2020

Copyright as an institution is grounded on arguments of net societal benefit, and it doesn't take much argument to demonstrate that there is net societal harm to intentionally diminishing people's ability to comply with the law.

dependenttypes · on Feb 28, 2020

Sadly the Norwegian courts do not seem to agree. See this series of blog posts by the creator of CSS. https://www.wiumlie.no/2018/rettspraksis/06-11-blog

s1artibartfast · on Feb 28, 2020

Somewhat tangentially, in some fields and jurisdictions, the legal code which is enforced is copyrighted and paywalled by third parties or the Gov. itself.

https://www.aclu.org/blog/free-speech/court-tells-georgia-it...

JadeNB · on Feb 28, 2020

> And yet, they persist (and in the modern era, proliferate) because it's generally agreed society derives benefit from paying people for their purely-idea creative work.

Who "generally agree"s on this? The existence of laws doesn't indicate the mood of society.

shadowgovt · on Feb 28, 2020

> The existence of laws doesn't indicate the mood of society.

But if laws are proliferating and regularizing instead of standing still or being abolished, and one assumes that elected representatives are acting on the will of the people, it probably does.

There's lots of controversy over how to improve copyright / patent law, but not very many people in governments in the EU, US, China, Japan, Australia, &c are talking seriously about just burning the whole copyright / patent system to the ground. At least a subset of the countries in the groups listed are generally understood to have representative governments.

Retric · on Feb 28, 2020

Representative governments are well known for supporting powerful groups over the collective benefit of society. Just look at any of those countries' tax codes and you will find a multitude of special exceptions.

Patents are of limited duration because the tradeoffs of unlimited patents are so horrific. If we accept a billion dollar drug must enter the public domain, clearly copyright should also be limited just as it was proposed in the US constitution. However, because a tiny minority has a huge benefit and society does not really notice the difference you get the modern mess of unending copyright.

shadowgovt · on Feb 28, 2020

"A tiny minority has a huge benefit and society does not really notice the difference" seems like a utilitarian argument that the system is working as intended; there is net positive benefit.

Retric · on Feb 28, 2020

If it costs 350 million people 1 cent to hand me 1 million dollars they don’t notice, yet that’s a net loss. Critically, even when you include those who benefit it’s still a loss.

shadowgovt · on Feb 28, 2020

Only if you consider money completely fungible and use money as your utility function for determining loss here.

You haven't diminished the ability for 350 million people to do things, practically, by shaving 1 cent off of them. But adding the ability a million dollars provides one individual to do something cool with 1 million they couldn't do before has increased the overall capabilities of everyone.

In essence, you've just described Kickstarter's business model.

Retric · on Feb 28, 2020

You’re forgetting just how many organizations all do this, it’s adding up to a significant chunk of global GDP. It’s got real long term consequences.

Basically, even if individual bacteria are unnoticed enough of them can kill you.

JadeNB · on Feb 28, 2020

> "A tiny minority has a huge benefit and society does not really notice the difference" seems like a utilitarian argument that the system is working as intended; there is net positive benefit.

If tiny minority + 'society' is the entirety of the system, then that's true, but there are also plenty of players who lose out due to restrictive IP regimes—and it's hard to quantify the extent of those losses. (Whether or not the benefits they would reap from looser IP are appropriate or fair is beside the question for utilitarian computations.)

shawnz · on Feb 28, 2020

> Patents are of limited duration because the tradeoffs of unlimited patents are so horrific.

What would be the point of a patent system with unlimited duration? If we wanted that, we could just have companies not reveal their inventions in the first place

JadeNB · on Feb 28, 2020

> one assumes that elected representatives are acting on the will of the people

Yeah, that's the assumption I take issue with.

s1artibartfast · on Feb 28, 2020

I generally agree that society benefits from IP.

I think IP is required to monetize valuable ideas, and monetization leads to social availability.

On the personal level, I would not like it if publishers could freely print the works of new authors, or engineering solutions I spend years on could be copy and pasted.

Some people prefer Creative Commons, and they are free to publish their work that way. Others need or want financial compensation.

JadeNB · on Feb 28, 2020

> I generally agree that society benefits from IP.

I certainly didn't mean to say that no-one agrees with it, but your personal agreement doesn't evince the general agreement that shadowgovt (an interesting username, in this context …) suggested. To be fair, neither does my skepticism provide any evidence against it.

s1artibartfast · on Feb 28, 2020

I think we can align on the lack of evidence presented. There is also the question on what agreed to means in this sense.

In terms of public opinion, it would be interesting to know what studies have been done. I imagine you would get broad support for an author copyrighting a book, and less for patenting a pre-existing genetic sequence.

As another unsubstantiated claim, I think if you sat down with the general public and the criteria for patents, they would mostly agree.

The challenge has to do with implementing them and the legal process around them.

If a crappy patent is issued to a large corporation, it is incredibly expensive to challenge it.

fwip · on Feb 28, 2020

Jefferson's rich friends.

white-flame · on Feb 28, 2020

They persist because they allow some people to create extreme monetization strategies, which feeds into lobbying congress for further expansion of copyright.

shadowgovt · on Feb 28, 2020

That doesn't explain why copyright regimes similar to the US persist in countries around the world, or why countries are finding their way to treaties that standardize international copyright enforcement that look more like the US regime than other countries' regimes.

munk-a · on Feb 28, 2020

The US has a lot of clout in the world and exports its laws and culture abroad pretty heavily. Countries that have tried to outlaw Coke or Cigarettes are usually sued into the ground by large US corporations and, when that fails, the US has used sanctions to back up the accessibility of US products.

Basically, America's crazy overreach forces our laws onto other nations. This is actually one thing that really frustrates me about corporate tax loopholes: that overreach could be trivially used to force better international standards for corporate VAT taxes, but there just isn't the political will (due to lobbying) to get it done.

paulddraper · on Feb 28, 2020

> just 12^100

At the risk of getting stuck in a loop: "No rhythms, no meter, no tempo, melodies are longer than 12 notes, it's diatonic, single octave, no concept of underlying harmony"

https://news.ycombinator.com/item?id=22441366

ajuc · on Feb 28, 2020

I was overestimating, I even mentioned it.

> and if you only care about melody it's overestimation

And they did it that way because courts don't worry about the exact tempo, meter and rhythm when ruling on plagiarism.

There was a guy trying to copyright A CHORD :)

paulddraper · on Feb 28, 2020

> exact tempo, meter and rhythm

That much is true.

> a guy trying to copyright A CHORD

IIRC he failed miserably.

A better example would be the (until recently) copyrighted song "Happy Birthday." [1]

[1] https://www.nbcnews.com/business/business-news/happy-birthda...

young_unixer · on Feb 28, 2020

It would be interesting to study what % of programmers agree with IP laws vs people in other fields.

JackFr · on Feb 28, 2020

Some people would tell you it's an ill-posed question.

Some might say lumping trademark, copyright, patent and trade secret laws (historically and in practice very different things) under one heading called "intellectual property" is an intentional strategy to muddy the waters and cloud any argument.

hsitz · on Feb 28, 2020

And interesting to include in a study like that how much people in each field actually understood about copyright law. My guess is the general level of understanding is pretty low.

molmalo · on Feb 28, 2020

Well, somebody should then copyright the whole contents of https://libraryofbabel.info/

:)

thaumasiotes · on Feb 28, 2020

>> If copyright lawyers want to argue originality based on an arrangement of notes in a 12 tone scale, and in a limited number of bars

goto11 · on Feb 28, 2020

So I can write a program which can generate your name and your sexual preference (among a lot of garbage data). Does that mean this can no longer be considered private information subject to privacy laws?

You can use a ridiculous argument for many things.

Lewton · on Feb 28, 2020

“You can use a ridiculous argument for many things.”

Yes that’s actually the entire point of this

A lot of commenters seem to be missing the fact that this is a response to ridiculous arguments being used in winning court cases

Nasrudith · on Feb 28, 2020

The garbage data makes all of the difference. Said accidental generation would make it legal, as the only way to pick out the data is to know it already. Otherwise what separates your real name from "Name: Seymoure Butts Sexual Orientation: Mayonnaise"?

The fact there is no garbage data at all shows that they are doing judgement on a nakedly wrong level in music - even by the standards of copyright.

Selling "Harry Potter but with the capitalization inverted" would not dodge book copyright, nor would even "this key and this very long block of data which happen to decrypt to the complete works of JK Rowling, don't decrypt because that would be infringement wink wink nudge nudge".

goto11 · on Feb 29, 2020

I would presume the vast majority of the automatically generated music is basically garbage?

If we say that music (and really any information) is just numbers which can be enumerated automatically, then surely the creative action is finding and picking a number which is actually interesting out of the infinite sea of random garbage.

My point is that framing music as "just numbers" does not disprove that producing a song is a unique creative work. There may be valid arguments against copyright, but this one isn't.

akhosravian · on Feb 29, 2020

How about selling grape juice with a warning on how to not ferment it to avoid prohibition laws?

https://vinepair.com/wine-blog/how-wine-bricks-saved-the-u-s...

fphhotchips · on Feb 28, 2020

Yes, of course it means that. I might randomly generate data to test with. Doesn't mean it's private and needs to be treated as such if I accidentally and unknowingly stumble across real info.

simonh · on Feb 28, 2020

You could be helpful and provide a means for people to request their accidentally reproduced private data be redacted. See example below[0].

[0]https://github.com/danielmiessler/SecLists/pull/155

Nasrudith · on Feb 28, 2020

Ironically that would be a way to disclose your very real data by omission.

apetresc · on Feb 28, 2020

That's the joke.

jerf · on Feb 28, 2020

This actually reveals an important truth about the nature of information: Information is often better understood as exclusionary, rather than somehow "creative". If I have a "thing", you don't know what color it is. If I now tell you it is "red", you still don't know the exact shade, but I have excluded a lot of possibilities. How informative my statement is depends on how much is excluded. If I name an exact Pantone color, I am being much more informative.

In some sense, looking at information as being exclusionary and as being inclusive are the same thing, but there's a lot of ways in which the former actually makes more sense as a thought framework.

And in this particular context we can see how that plays out... a list of all possible melodies of a given nature actually has very little information in it, because it doesn't exclude enough. It may superficially seem to our human senses that a lot of stuff has been included/constructed, but in reality, the 'list of every possible melody' is a vapor. There's not actually anything there. It is the act of exclusion of possibilities that leads to interesting information. Such information as this list has is contained in its specification of what a "melody" is. Counterintuitively (to a lot of people's understanding), if they widened the specifications, while they would end up with a bigger list they'd end up with less information in the result.

The act of creating a song isn't a matter of creating the possibilities from the raw nothingness, it's a matter of carving them out of the exponentially-large space of possibilities and finding something useful there. The exponentially-large space is so large that it is very easy to not see it that way, because, I mean, it's huge. It doesn't feel like "removing" possibilities the way carving a 3D stone does ("I remove everything that doesn't look like my desired statue"), because the exponential space is so inexpressibly larger, and we need fundamentally different tools to address such a space, but in the end, it's the same thing.

While this isn't necessarily what the law was written for, the "creativity" requirement could very easily be pressed into service here. They've expressed very little creativity/exclusion on this list and it would be easy to argue it falls far below the threshold necessary for copyright. As a literary criticism of the system, it is successful and thought provoking... as a legal criticism of the system it would fail completely.

nybble41 · on Feb 28, 2020

> ... as a legal criticism of the system it would fail completely.

The legal criticism would be that there just aren't that many unique melodies—as demonstrated by the fact that they were able to enumerate them all—so the mere fact that two songs use the same melody is not sufficient to show that one is a copy of the other. The set of melodies that are compatible with human aesthetics is even smaller. They don't actually need these auto-generated melodies to qualify for copyright for the project to succeed. It works equally well if similarity in melody is not considered sufficient evidence of copyright infringement.

Even just having the database around so that one can say that they copied the melody from here rather than from some other source might be enough. After all, unlike patents, independently producing something similar to a copyrighted work is not infringement; you have to have actually copied from the other work. If you're a musician perhaps you should listen to a few randomly-selected melodies from this program each day. Maybe it will spark something, but even if it doesn't it will at least make it harder to argue that whatever melody you come up with could only have been "subconsciously copied" from some other composer's song you may have heard decades ago.

goto11 · on Feb 29, 2020

> The legal criticism would be that there just aren't that many unique melodies—as demonstrated by the fact that they were able to enumerate them all

How does that follow? You can enumerate any finite number. And the article doesn't say how big. Is it a thousand or a trillion? "Riehl says the algorithm works at a rate of 300,000 melodies per second." The article doesn't say how many seconds it took to generate all melodies, though.

nybble41 · on March 2, 2020

> You can enumerate any finite number.

Not within a fixed time period in the real world. You're limited by the matter and energy available, and by the speed of light. However we're not talking about the theoretical ideal limits of computation. The upper bound would be 300k melodies for each second since the program was written—68.7 billion in all, according to the Adam Neely interview linked from the Press page of the project site. Which is a lot, but then there are hundreds of millions of known songs, each of which is likely to contain multiple melodies, some of which are much more likely to be chosen than others. Accidental duplication is thus quite likely.
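
For a rough sense of scale, here is a back-of-the-envelope sketch in Python. It assumes the 68.7 billion figure corresponds to 8^12 (8 pitches, 12 positions) and takes the quoted rate of 300,000 melodies per second at face value; neither assumption is confirmed by the article, and the variable names are only illustrative.

```python
# Back-of-the-envelope estimate. All three constants are assumptions
# taken from figures quoted in this thread, not from the project itself.
PITCHES = 8                 # assumed: notes available at each position
LENGTH = 12                 # assumed: notes per melody
RATE_PER_SECOND = 300_000   # rate quoted from the article

total_melodies = PITCHES ** LENGTH
seconds = total_melodies / RATE_PER_SECOND
days = seconds / 86_400     # seconds in a day

print(f"{total_melodies:,} melodies")
print(f"{seconds:,.0f} seconds, or about {days:.2f} days")
```

Under those assumptions the whole set would take a few days of continuous generation, which is consistent with the point that this space is small enough to enumerate in practice.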

tingletech · on Feb 28, 2020

https://libraryofbabel.info

mannykannot · on Feb 28, 2020

Ha! It does not work however, as a person's sexual preference is a fact, and a statement that either guesses or spits out all recognized variants is not a source of factual information -- no-one is any the wiser afterwards than before.

jameshart · on Feb 28, 2020

On the other hand if you offered it up as a fact with reckless disregard for its truthfulness...

Again we see programmers trying to understand the law in terms of ‘how can a piece of data be illegal?’ while the law is quite happily focusing on making specific actions illegal.

‘You can’t arrest me, gold bars aren’t illegal!’ ‘Yes, but carrying them out of the federal reserve vault without permission is.’

ajuc · on Feb 28, 2020

"I invented this later but independently" isn't a valid defense, even if you can prove it. So it's not actions (like copying or plagiarism) that are prohibited, it's the result.

That's the problem with IP law.

In effect you give people monopoly on numbers. When the numbers are big nobody is bothered by this, because chance of arriving at the exact same one is effectively zero. But for songs the numbers are pretty small (depending on the encoding used to compare the songs), and the absurdity is evident.

wool_gather · on Feb 28, 2020

That's not correct in the case of copyright. Independent invention can be used as a defense to a copyright infringement claim.

> Generally, a plaintiff proves copying through circumstantial evidence, showing that the defendant had access to the copyrighted work [...]
>
> [...] unlike in patent law, if a defendant independently creates the substantially similar work, he is not liable to the copyright holder.

https://www.finnegan.com/en/insights/copying-copyright-s-wil...

paulddraper · on Feb 28, 2020

s/song/music/

i.e. sans lyrics

LMYahooTFY · on Feb 28, 2020

This is a complete non sequitur.

Your example is barely plausible, much less demonstrable.

The connection between "original works" using an extremely limited set of notes, and your right to privacy using some theoretical predictive algorithm is not at all obvious.

goto11 · on Feb 28, 2020

I agree. I wanted to point out the absurdity of the argument used in the article. The argument is that music is just numbers and numbers are not copyrightable. But any kind of information is "just numbers" which can be enumerated given enough time.

young_unixer · on Feb 28, 2020

The argument simply exposes the ridiculousness of the law.

bougiefever · on Feb 28, 2020

For example, there are a number of songs that include a rendition of Tom's Diner within a larger melody. Also popular to add to a song is the Arabian riff melody (which is so old it's not copyrightable, but you get my point). I don't know of any lawsuits around the Tom's Diner song, but this is an example of the types of ridiculous lawsuits out there over copyright.

albertzeyer · on Feb 28, 2020

The code actually can produce every possible melody in MIDI. They simply have not stored every possible melody explicitly (uncompressed) on a hard drive (which is impossible, as the size is infinity).

However, if you interpret the program itself as a self-extracting compressed archive, they actually have stored every possible melody (in a compressed way).

So the question reduces to how much the type of compression matters here (is ZIP allowed? is TAR allowed? what about more sophisticated like PAQ? and what about this Rust code?). This is what I/we discussed here: https://news.ycombinator.com/item?id=22441328

paulddraper · on Feb 28, 2020

Gotta love programmers.

> So the question reduces to how much the type of compression matters here

If you compress, you can copyright the compressed bytes.

If you don't compress, you can copyright the uncompressed bytes.

As far as that copyright extending to derivations, e.g. decompressions, the answer is indeed situation-dependent. For example, converting a copyrighted font from TTF to WOFF does not remove the copyright. But converting a copyrighted font from TTF to screen pixels to WOFF removes the copyright. (Sorry I don't have a reference; probably findable.)

The "self-extracting zip" derivation would probably fall into the latter category; that is, the copyright would not transfer.

---

But even if the copyright were maintained during your advanced decompression, one could argue that editing down to a very specific portion of that extremely large body of work was a substantive/transformational derivation, which they could then copyright themselves.

Transformative works are very common in art. The most famous example is Duchamp simply adding a mustache to a print of da Vinci's Mona Lisa, and copyrighting that. [1]

[1] https://en.wikipedia.org/wiki/L.H.O.O.Q.

paulgb · on Feb 28, 2020

> But converting a copyrighted font from TTF to screen pixels to WOFF removes the copyright.

I hope nobody takes this as legal advice; I'm not a lawyer but I'm fairly certain it's wrong. The font would still be the same work.

paulddraper · on Feb 28, 2020

Alright, you made me add references :)

It comes down to this: Typefaces/glyphs are not copyrightable. The font code that produces those glyphs is.

> Typefaces cannot be protected by copyright in the United States (Code of Federal Regulations, Ch 37, Sec. 202.1(e); Eltra Corp. vs. Ringer)...However, there is a distinction between a font and a typeface. The machine code used to display a stylized typeface (called a font) is protectable as copyright. [1]

In software, a similar "black-box" derivation process has happened many times, e.g. UNIX/GNU. Copyright applies to software source code, but not software functionality.

Determining what is the "essential, creative work" in each case in a nuanced way is a matter for courts and armies of lawyers: Apple round corners, Oracle Java APIs, etc.

[1] https://en.wikipedia.org/wiki/Intellectual_property_protecti...

jameshart · on Feb 28, 2020

Presumably copyright law makes this allowance for type because the point of typefaces is to be reproduced.

paulddraper · on Feb 28, 2020

That is, in fact, the point of all digitized information, including recorded audio.

(But yes, I do agree that typefaces are exceptionally commonly distributed.)

paulgb · on Feb 28, 2020

Thanks for supporting your point :) I have to admit that the difference in copyrightability of fonts vs. typefaces is new to me.

paulddraper · on Feb 28, 2020

Now at least the inane font-vs-typeface distinction made by modern designers has some practical distinction to it, albeit a legal one.

ratmice · on Feb 28, 2020

This is true of fonts, as per the copyright office's "Policy decision on copyrightability of digitized typefaces".

However, the example here and the situation with fonts seem different. It is the case that font data is viewed as utilitarian and uncopyrightable. So we have a copyrightable program producing uncopyrightable data. The argument here seems to be that we have a copyrightable program producing copyrightable data.

Leszek · on Feb 28, 2020

Is uncompression a derivation?

paulddraper · on Feb 28, 2020

I updated my comment to make this more clear. tl;dr depends.

TeMPOraL · on Feb 28, 2020

> So the question reduces to how much the type of compression matters here

It doesn't matter at all. As 'patio11 correctly points out elsewhere in the thread[0], this question boils down to the colour of the bits.

--

[0] - https://news.ycombinator.com/item?id=22441118

programd · on Feb 28, 2020

Here's the link to the original "What Colour are your bits?" blog post by Ansuz - very much worth reading

https://ansuz.sooke.bc.ca/entry/23

auiya · on Feb 28, 2020

Every possible combination of 8 notes in 12 beats is not infinite. Assuming all quarter notes it's just (8^12)-(8-1)^12 = 54,878,189,535. If you factor in rhythmic variances, the number is much much larger, but still not infinite.

https://plus.maths.org/content/how-many-melodies-are-there
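
The arithmetic above is easy to check. One reading of the subtraction (an assumption, since the comment does not spell it out) is "all 12-note sequences over 8 pitches, minus those that never use one chosen pitch". The sketch below checks that closed form against brute force on a small case; the function names are made up for illustration.

```python
from itertools import product

def count_with_symbol(alphabet_size, length):
    """Brute-force count of sequences that use symbol 0 at least once."""
    return sum(
        1
        for seq in product(range(alphabet_size), repeat=length)
        if 0 in seq
    )

def closed_form(alphabet_size, length):
    """All sequences minus the sequences that avoid one chosen symbol."""
    return alphabet_size ** length - (alphabet_size - 1) ** length

# Small case (3 symbols, length 4) where brute force is cheap.
assert count_with_symbol(3, 4) == closed_form(3, 4)

# The figure from the comment: 8 pitches, 12 positions.
print(f"{closed_form(8, 12):,}")
```

Either way, the count is finite and within reach of an ordinary machine.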

arghwhat · on Feb 28, 2020

More correctly, it can produce every possible monophonic sequence of tones given infinite time, which in the article was further limited to monophonic 12-tone sequences across a single octave (any other limitations I missed?).

A melody contains more than a sequence of tones. The most heartless definition would at the very least include rhythm. For every sequence of tones they output, they only produce one out of hundreds of possible melodies for that sequence.

It is of course an interesting thought to consider the definition of decompression, but on the other hand, we should also limit the contributed idiocy to the bare minimum required to break the relevant idiotic rules.

gammadens · on Feb 28, 2020

Part of me wonders if the next step with this is some sort of DL model. I wonder if, trained on one set of melodies (defined in the intuitive sense) it would generate existing copyrighted melodies not in the training set.

pingyong · on Feb 28, 2020

By that logic, a program that simply counts up from 0 (with bignums, as long as the machine has enough memory), is actually a compressed form of every single piece of data or information that has ever, will ever or can ever be created.

ehnto · on Feb 28, 2020

Music lawsuits can be based on about as much information as they're conveying. I think that's their point, too. I don't believe they were trying to be satirical, they wanted to prove a point about the nature of music itself that could be used in defending musicians against lawsuits.

One of the items they were trying to point out, often abused in lawsuits for pop music, is the idea of "Access". If you came up with an idea all by yourself, but a similar song exists that is popular enough, the court argues that just by there being the possibility that you heard it, you therefore definitely heard it and then copied it.

If this music set exists, and is freely available, shouldn't it be considered that you had reasonable access to it and therefore stole it? No, of course not, that would be a ridiculous assumption and so is the current outlook of a song being popular being enough proof that you stole the idea.

Not to mention, music is extremely formulaic. Chord progressions have a natural tendency to certain forms, with centuries of prior art, rhythm within genres of music is often the same, even melodies have a trend toward particular combinations (leading tones over chord progressions bring about lots of similar sounding solos).

Any musician trying to claim copyright for their music should remember that their song only exists on the back of centuries of musical exploration. Consider how much of the song you can say is truly novel, it's going to be nearly nothing.

The combination of lyrics + chords + melody is in my opinion, the absolute minimum you need to claim a song has been copied. Lyrics are derivative, melodies are derivative, chord progressions are derivative, but together they have the chance to be a unique combination.

SilasX · on Feb 28, 2020

>I don't believe they were trying to be satirical, they wanted to prove a point about the nature of music itself that could be used in defending musicians against lawsuits.

Sufficiently advanced “proving a point via absurdity to make a more general argument” is indistinguishable from satire.

paulgb · on Feb 28, 2020

> No rhythms, no meter, no tempo, melodies are longer than 12 notes

I am not a musician, but which of these are copied in the Tom Petty / Sam Smith case that motivated this exercise? To my untrained ear, I do hear some similarities in the relative lengths of the notes (meter?).

I found a side-by-side comparison https://www.youtube.com/watch?v=YflFw9T77FQ

rdlw · on Feb 28, 2020

I don't have a great ear, but I think it's a similar melody, albeit at a different tempo and key (?). If you can't hear it, try changing the video speed to 1.5x during the Sam Smith part.

If these two songs are similar enough, then I think it could be argued that a MIDI sequence has been copied, since in both cases it requires a significant change of tempo and key. A lot of commenters seem to be missing this point: yes, the generated sequences sound different from real songs, but so do the songs involved in the ridiculous court cases. Radiohead and Ed Sheeran were sued for chord progressions, Katy Perry for a melody. The songs involved were altered about as much as the MIDI sequences would need to be to show the similarities.

tunesmith · on March 3, 2020

I'm not incredibly familiar with the court cases, nor that of Coldplay/Satriani, but I have a hard time believing that their decisions are algorithmically binding. Like, yes, they might be similar along those particular axes but that doesn't mean those similarities are the sole reason for the court decisions. There's also matters such as - was songwriter #2 exposed to song from songwriter #1? Does the similarity in arrangement imply intent to copy? Etc.

IAmGraydon · on Feb 28, 2020

Exactly. Harmonic progression is probably the most important part of this, as it imparts context on the melody. The same melody over I-V-IV-vi and vi-V-IV-V is not the same thing.

hannasanarion · on Feb 28, 2020

Courts don't care about harmony. Most US courts follow a guideline that the copyrightable parts of a composition are melody and lyrics. Nobody has ever successfully sued for stolen chord progression.

ehnto · on Feb 28, 2020

But a melody over one chord progression, versus the same melody over another chord progression, is surely a different song.

hannasanarion · on Feb 28, 2020

Not according to the US court system. American case law defines chord progressions as insufficiently creative for copyright purposes. Usually rhythms are too. The only copyrightable parts of a composition in precedential cases of most US courts are the melody and lyrics.

And apparently arpeggios, because the US District Court of California ruled in Flame vs Katy Perry that arpeggios are "melodic enough" for copyright protection.

ehnto · on Feb 28, 2020

Melodies are insufficient on their own in my opinion, the court rules differently I guess. Melodies are just as likely to be formulaic, and are built using similar foundational knowledge as a chord progression (only a subset of notes works in a given progression for example, and conventions lead you toward certain notes of that subset).

A combination of chords, melody, and rhythm is, I think, the only reasonable measure that a song has been copied.

robrtsql · on Feb 28, 2020

It matters a lot to the sound of the song, but does it matter to the court?

If I take the entire melody of a Beatles song, including the verse and chorus, but set it to an entirely different chord progression, would the court recognize that as an original song? What if I lifted all of the lyrics as well?

ehnto · on Feb 28, 2020

Lyrics and melody? I think that's reasonable to consider that an infringement. Melody over a new chord progression? I do think that should be considered a new work, just to limit the scope of copyright. Even if it's clear you copied the melody, I think melody alone is insufficient to call a song. Unless the original song was entirely melody. A melody is using all the same musical building blocks as the chord progressions did, why does it get special treatment?

jdbernard · on Feb 28, 2020

It is from the courts' point of view.

hunter2_ · on Feb 28, 2020

From what I recall, it's exclusively melody and words which can be copyrighted in terms of songwriting. Harmony isn't copyrightable.

db48x · on Feb 28, 2020

That's not even true: https://www.youtube.com/watch?v=0ytoUuO-qvg&feature=youtu.be

irrational · on Feb 28, 2020

I don’t know what any of those words mean (octave, melody, tempo, harmony, diatonic?)

Do you have a book that you would recommend to learn about these terms?

kian · on Feb 28, 2020

Octave - a doubling in the frequency. Generally speaking, in twelve tone equally tempered classical music, melodies and harmonies are drawn from a scale, a subset of these twelve notes. The most notable of those scales, the diatonic scale has 7 unique notes, and the eighth note wraps back around to the beginning of the scale. Thus, moving from a note to a note double the frequency took 8 notes - hence, the octave. Diatonic is both the name of the most used of the 7-note scales mentioned above, and also a term meaning 'within the scale', i.e., 'diatonic to a (given) scale', depending on context.

Melody is the horizontal arrangement of notes for an individual voice or instrument over time. Harmony is the vertical arrangement of notes sounding at the same time, and how those transform horizontally over time as a group. Tempo is the speed in beats per minute of the background 'pulse' of the music.

Now, asking for a music theorist to give you an algorithmic definition of how to make music with any of the above? Good luck ;)

irrational · on Feb 28, 2020

Unfortunately that is just pushing it out. frequency? Is this the same as a radio frequency? tone? twelve tone? equally tempered? scale?

I honestly know nothing about music so this is all gibberish to me. But I'd be interested to learn if anyone has a book recommendation.

kian · on Feb 28, 2020

If you're looking for a book, Music: A Mathematical Offering https://www.amazon.com/Music-Mathematical-Offering-Dave-Bens... is pretty good. But you're nearly at the end of your recursion, though. Wikipedia ought to take you the rest of the way to answering those basic questions. More complex questions like why certain combinations sound good when next to one another, on the other hand...

rdlw · on Feb 28, 2020

I found this video to be an extremely helpful and concise explanation of music theory basics: https://youtu.be/rgaTLrZGlk0

"Learn music theory in half an hour" is obviously an exaggeration, but it really comes astoundingly close to fulfilling that promise. It contains a lot of information and each part builds on the previous parts, so it requires focus and maybe a few repetitions to 'get it', but I think the approach is fantastic for showing how many ideas of music theory are deeply connected.

Not a book as you requested, but hopefully you'll find the other reply useful. As for some of your specific questions:

Frequency is the same concept as radio frequency, but in this case refers to something we can directly sense. Radios transmit electromagnetic waves, which are photons moving at a certain rate, measured in Hertz, or cycles per second. Sound frequency refers to movement of air waves, so a more accurate analogy than radio waves is waves in a pond when a rock is thrown in. Human ears are sensitive to frequencies between 20 Hz and 20,000 Hz, so any sound you hear is a combination of frequencies. Natural language is helpful here, since higher frequencies sound 'higher' and lower frequencies sound 'lower'.

A tone is a sound at a specific frequency, also known as a note. For example, 440 Hz is designated as the note A4 by the ISO 16 standard, and this is what most instrument tunings are based on.

Twelve tone equal temperament is the tuning system nearly all modern Western music uses. Certain ratios of frequencies sound pleasant, especially ratios with small numbers, such as 1:2, 2:3, and 3:4. So if we know 440 Hz is a note in the system, it would be nice to also have 587 Hz, 660 Hz, and 880 Hz. However, these frequencies will only really sound good when played with that original 440 Hz, not necessarily with each other. So instead of using them exactly, we approximate them in a useful way. The 1:2 ratio, the octave, is generally considered to be the most important, so that ratio is kept, but otherwise the notes are equally spaced (human hearing is logarithmic), or equally tempered. The most popular tuning system has twelve tones. There's no note at 660 Hz, but there's one at 659 Hz, which is pretty close, and there happens to be one at 587 Hz. Other ratios are also represented reasonably well.
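
To make the numbers concrete, here is a small Python sketch of the usual equal-temperament formula, f = 440 · 2^(n/12), where n is the number of semitones above A4. The function name is just for illustration.

```python
A4_HZ = 440.0  # reference pitch used in the explanation above

def equal_tempered(semitones_from_a4):
    """Frequency of the note n semitones above (or below, if negative) A4."""
    return A4_HZ * 2 ** (semitones_from_a4 / 12)

# Pure-ratio targets relative to 440 Hz: 3:4, 2:3 and 1:2.
just_targets = {5: 440 * 4 / 3, 7: 440 * 3 / 2, 12: 440 * 2}

for steps, just_hz in just_targets.items():
    tempered_hz = equal_tempered(steps)
    print(f"{steps:2d} semitones: tempered {tempered_hz:.2f} Hz, pure ratio {just_hz:.2f} Hz")
```

The octave comes out exact, while the other two land within about a hertz of the pure ratios, which is the compromise described above.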

A key is a collection of notes that sound good together, based on the ratios of their frequencies. The alternative would be chromatic composition, where all 12 notes are used and none is obviously 'more important'. Most music is in a specific key, but uses some chromatic notes to make the melody more interesting.

irrational · on Feb 28, 2020

Thank you for the very detailed response. I'll take a look at that video. Does it cover what a key is?

rdlw · on Feb 29, 2020

Yes, the video goes into some depth about scales/keys.

I actually had to look up what the difference between a key and a scale is, as I thought the terms were pretty much interchangeable. I've edited the last part of my other comment to reflect this:

A scale is actually an ordered set of notes belonging to a key. A key is just an unordered collection of notes. I got this wrong earlier.

So playing all the notes belonging to C major in ascending order is playing a scale, and playing the notes in any order is playing in the key of C.

yborg · on Feb 28, 2020

It's also cleverly disguised PR for the attorney. Well played (so to speak).

albertzeyer · on Feb 28, 2020

The code is written in Rust. The core algorithm can be found here: https://github.com/allthemusicllc/atm-cli/blob/master/src/ut... (Specifically the function gen_sequences. It basically uses multi_cartesian_product to iterate over all permutations.)
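
For anyone who doesn't read Rust, here is a rough Python analogue of the idea, not the project's actual code. It borrows the name gen_sequences from the function mentioned above, but the signature is hypothetical. Python's itertools.product plays the role of multi_cartesian_product; strictly speaking this is a Cartesian product with repetition rather than permutations.

```python
from itertools import product

def gen_sequences(pitches, length):
    """Yield every sequence of `length` notes drawn from `pitches`, repeats allowed.

    Hypothetical signature for illustration; not the project's API.
    """
    return product(pitches, repeat=length)

# Three MIDI note numbers (C4, D4, E4) and two-note sequences: 3**2 = 9 results.
for seq in gen_sequences([60, 62, 64], 2):
    print(seq)
```

The generator is lazy, so nothing is stored until you write the sequences somewhere.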

Isn't it a bit of wasted space to store the generated data on archive.org (https://archive.org/download/allthemusicllc-datasets)? I mean, as we see, it's trivial to create them. The code above is kind of a self-extracting archive, so it's just the same thing but in compressed form. And I guess it should not matter (legally) whether you store it compressed or uncompressed (or less compressed).

willvarfar · on Feb 28, 2020

At the very beginning of this excellent interview https://www.youtube.com/watch?v=sfXn_ecH5Rw that I found in the comments here, they explain this.

Apparently, for this to work, they have to "affix them to a physical medium".

Strongly recommend watching that video! Excellent info.

albertzeyer · on Feb 28, 2020

Hm, they don't really explain this, or do they? They just say "by saving those melodies to a hard drive they have affixed them to a physical medium which is all that is necessary to copyright them".

So, by the same argument, they could copy the self-extracting archive to a hard drive, and then have the same thing, or not?

They even say that they already store it in a compressed form on the hard drive.

What if some future compressor (future ZIP format) is clever enough to see that you are going to save all possible permutations of something, and then saves this in a much better compressed way. Then suddenly when compressors do this, it means when you use such a compressor, you do not have copyright on the data anymore?

willvarfar · on Feb 28, 2020

Very true.

I think the focus has to be on the spirit not the technology. Saying to a lawyer/judge/jurist "hey, this script can generate every possible melody!" _feels_ quite different from saying "every single melody is already written down, on this hard-drive I am waving at you! Look!"

Tangent: reminds me of the Hutter Prize for compression (hey, the prize pot recently increased to €500K and I submitted it to HN but it didn't get any votes http://prize.hutter1.net/ https://en.wikipedia.org/wiki/Hutter_Prize).

reacweb · on Feb 28, 2020

Take any algorithm that can generate the digits of pi. It is proven that all the sequences of digits are present somewhere in these digits. Can you claim copyright on all the books of the universe because you can show the algorithm that can generate every possible book?

Chinjut · on Feb 28, 2020

"It is proven that all the sequences of digits are present somewhere in these digits"

This has never been proven for the digits of π. Of course, there are other digit-sequences for which it is trivially true. For example, just concatenating together every finite sequence of digits, in some suitable order.

reacweb · on Feb 28, 2020

I stand corrected. For pi, it is only a conjecture. My point remains true for "Champernowne sequence".
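
A minimal Python sketch of that sequence, for illustration: write out 1, 2, 3, ... and concatenate the digits. Any finite digit string shows up eventually, since it appears inside some integer. The function names and the search limit are my own choices.

```python
from itertools import count

def champernowne_digits(n_digits):
    """First n_digits of the integers 1, 2, 3, ... written out and concatenated."""
    pieces, total = [], 0
    for k in count(1):
        s = str(k)
        pieces.append(s)
        total += len(s)
        if total >= n_digits:
            break
    return "".join(pieces)[:n_digits]

def first_index(pattern, search_limit=1_000_000):
    """0-based position of `pattern` in the first `search_limit` digits, or -1."""
    return champernowne_digits(search_limit).find(pattern)

print(champernowne_digits(15))
print(first_index("1011"))
```

Note that the index needed to locate a long string is typically at least as long as the string itself, so nothing is gained as compression.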

razorunreal · on Feb 28, 2020

Without being any kind of copyright expert, this argument doesn't make much sense. If you have the relevant indices into pi, I guess you could claim copyright. The index will be much larger than the original book though. Essentially you have a really bad compression algorithm. I'm not going to get very far claiming that because my decompression algorithm could output any sequence, all sequences are mine.

fsflover · on Feb 28, 2020

It has already been done (even though it is only a conjecture):

https://news.ycombinator.com/item?id=13869691

hunter2_ · on Feb 28, 2020

So in other words you need to have used the compressor to compress what you are copyrighting, and that which you used it on must have been physically stored in its entirety. Is it provable that this all occurred, if there's no requirement to still have the original? It would require that you also supply the compressor to the court, so that somebody can use your self-extracting magic decompressor, recompress the result, and end up back in the same place. But you could make your compressor simply emit your decompressor, ignoring the input!

playpause · on Feb 28, 2020

It is to avoid FUD. It ensures no one can derail the conversation into a debate about what constitutes being ‘affixed to a physical medium’.

“But are the tunes really affixed to the medium?”

“Yes, this hard drive in my hand literally contains a bunch of MP3 files” is a lot stronger than: “Yes, in a way, because this tune-generating script technically constitutes a self-extracting archive, your honour.”

C1sc0cat · on Feb 28, 2020

Presumably musicians/composers using purely electronic tools such as Ableton Live have no problem asserting copyright already.

willvarfar · on Feb 28, 2020

The musician has to cut a CD or something physical.

https://www.tunecore.com/guides/copyrights-101 the first paragraph is:

> For a work to be “copyrightable,” it must be original and fixed in tangible form, such as a sound recording recorded (affixed to) on a CD or a literary work printed (affixed to) on paper.

sullyj3 · on Feb 28, 2020

is a hard drive not a tangible form? Come to think of it, so is a brain, and everything else capable of storing information. That language is atrocious.

jiofih · on March 1, 2020

It’s about “permanence”, not being able to copyright a live performance that wasn’t recorded for example. It just means you need to be able to distribute / replay the recording, and that means using one of the current common audio technologies.

db48x · on Feb 28, 2020

It made a lot more sense when it was originally written.

dri_ft · on Feb 28, 2020

Saying 'I don't actually have that melody written down anywhere, but I easily could have, if I'd run this program' feels a bit like saying 'I don't have that melody written down anywhere, but I easily could have if I'd just thought of it first'.

You may say that in fact having the code that generates the data is effectively the same as having the data. This may convince reasonable, rational people, but the fact that we're dealing with the law that we are should tell you that we're not dealing with reasonable, rational people.

jiofih · on March 1, 2020

It's just the practicalities of copyright law. You don’t copyright the idea of a song, a plan for a song, or code that generates a song. You register the tablature accompanied by a recording. See https://medium.com/@dawn_ellmore_employment/what-does-tangib...

vbezhenar · on Feb 28, 2020

You can trivially write a program which will enumerate all the possible byte sequences in the universe. So you can claim that this tiny program contains everything including god. So you can claim copyright for everything.
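
For illustration, a minimal Python sketch of such an enumerator. It lists every finite byte string in order of length, so any given finite string shows up after finitely many steps. The function name and the two-symbol demo alphabet are made up for this example.

```python
from itertools import count, islice, product


def all_byte_strings(alphabet=range(256)):
    """Yield every finite byte string: the empty one, then all of length 1, 2, ..."""
    symbols = list(alphabet)
    for length in count(0):
        # product() walks all combinations of the given length lazily.
        for combo in product(symbols, repeat=length):
            yield bytes(combo)


# Demo over a two-symbol alphabet (an assumption, only to keep the output short).
first = list(islice(all_byte_strings(alphabet=[0, 1]), 7))
print(first)
```

The generator never terminates on its own; it only ever produces finite strings, which is the distinction the replies below argue about.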

phonebucket · on Feb 28, 2020

> You can trivially write a program which will enumerate all the possible byte sequences in the universe. So you can claim that this tiny program contains everything...

Not to get too far off topic, but even an infinitely long byte sequence cannot represent all numbers. It's possible to construct something not representable by that sequence via Cantor's diagonal argument: https://en.wikipedia.org/wiki/Cantor%27s_diagonal_argument
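
A rough Python sketch of the diagonal step itself, under the assumption that an infinite bit sequence is modelled as a function from position to bit. Given any claimed listing of such sequences, it builds one that the listing misses. The names `diagonal` and `example_enumeration` are hypothetical.

```python
from typing import Callable

# An infinite 0/1 sequence, modelled as a function: position -> bit.
BitSequence = Callable[[int], int]


def diagonal(enumeration: Callable[[int], BitSequence]) -> BitSequence:
    """Return a sequence that differs from the n-th listed sequence at position n."""
    return lambda n: 1 - enumeration(n)(n)


def example_enumeration(k: int) -> BitSequence:
    """Hypothetical listing: sequence k is the binary expansion of k
    (least significant bit first), followed by zeros forever."""
    return lambda i: (k >> i) & 1


d = diagonal(example_enumeration)
# d disagrees with every listed sequence at that sequence's own index.
mismatches = [d(n) != example_enumeration(n)(n) for n in range(1000)]
print(all(mismatches))
```

The check only samples the first 1000 indices; the argument that it holds for every index is the proof, not the code.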

dmurray · on Feb 28, 2020

It doesn't contain the floating point representation of that number, which would be infinitely long, but it does contain the UTF8-encoded constructive proof that uniquely identifies that number.

paulddraper · on Feb 28, 2020

Incorrect.

Most real numbers have infinite-length constructive proofs.

Uncountable is uncountable is uncountable. If any correspondence/encoding existed between naturals and reals, the reals would be countable. (But they are not, so it is fool's errand to search for such an encoding.)

Anderkent · on Feb 28, 2020

Surely not? If there are more numbers than can be represented in any number of bytes (as shown by the diagonal argument), then you cannot represent a constructive proof for each of them in any number of bytes.

dmurray · on Feb 28, 2020

Kind of. A constructive proof by definition means something like you can explain exactly what the number is in a finite number of bytes.

The standard diagonalization argument is a constructive proof, identifying a specific number. Normally a constructive proof is preferable. But with a bit of hand waving you can make it into a non-constructive proof that there are "unknowable" numbers that cannot be described in any finite amount of bytes.

afiori · on Feb 28, 2020

In practice, constructible for reals means that you can approximate them to arbitrarily small known precision, e.g. by a sequence of rational numbers q_n, each no more than 1/n away from the actual real number.

The point is that such a sequence needs to be constructive in the usual sense, so there are still only a countable number of them.
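
As a concrete instance of such a sequence, here is a small Python sketch for sqrt(2). The choice of number and the helper name `sqrt2_approx` are assumptions for illustration. It uses exact integer arithmetic so that each q_n provably lies within 1/n of the target.

```python
from fractions import Fraction
from math import isqrt


def sqrt2_approx(n: int) -> Fraction:
    """Rational q_n with q_n <= sqrt(2) < q_n + 1/n."""
    # isqrt(2 * n * n) is floor(n * sqrt(2)), computed without floating point.
    return Fraction(isqrt(2 * n * n), n)


approximations = [sqrt2_approx(n) for n in (1, 10, 100)]
print(approximations)
```

The whole sequence is described by this finite program, which is why only countably many reals can be pinned down this way.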

Filligree · on Feb 28, 2020

The argument still applies, yes. Nevertheless the string still contains all finite substrings, which includes all English sentences describing such numbers.

The implication is that human minds wouldn't be able to represent it either, at least not with language.

jnbiche · on Feb 28, 2020

Well, it could contain the floating-point representation, since floating-point is a finite representation.

But it couldn't contain the real number representation.

shkkmo · on Feb 28, 2020

Perhaps I am misunderstanding the diagonal argument, but it doesn't appear to show what you claim it shows.

The diagonal argument seems to be a proof that there are uncountably many infinite byte sequences. So while it proves that it would be impossible to "enumerate" every infinite byte sequence, it doesn't prove that there exists a number that cannot be represented by some infinite byte sequence.

Indeed, I believe the opposite can be shown to be true by constructing a mapping from every finite number to an infinite byte sequence. ASCII trivially provides such a mapping for real numbers, and finite-tuples of real numbers (such as complex numbers) can be mapped by alternating digits from each element of the tuple.

Edit: The key distinction is that the set of finite byte sequences is infinite, but countable, while the set of infinite byte sequences is not only infinite, but also uncountable. Which of these two sets is deemed the set of "possible byte sequences" seems to be the critical distinction and the turn of phrase "all possible byte sequences in the universe" seems to imply all byte sequences of finite length.
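
The digit-alternating trick for tuples can be sketched on finite digit prefixes as follows. The helper names are made up, and real inputs would be infinite digit strings rather than the short ones used here.

```python
from itertools import zip_longest


def interleave(a: str, b: str) -> str:
    """Alternate digits a0 b0 a1 b1 ..., padding the shorter string with '0'."""
    return "".join(x + y for x, y in zip_longest(a, b, fillvalue="0"))


def split(s: str) -> tuple[str, str]:
    """Undo interleave(): even positions belong to a, odd positions to b."""
    return s[0::2], s[1::2]


merged = interleave("314", "27")
print(merged, split(merged))
```

Padding changes "27" into "270", which denotes the same value when the strings are read as fractional digits.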

vbezhenar · on Feb 28, 2020

All the possible byte sequences contain all the possible programs.

All the possible programs can receive input of infinite length and output bytes of infinite length depending on input.

So if we can supply any input to any program and treat its output as part of result, does this argument still work? Is there such a number that can't be constructed by some program receiving some input?

paulddraper · on Feb 28, 2020

> All the possible byte sequences contain all the possible programs.

Correct.

> Is there such a number that can't be constructed by some program receiving some input?

Yes. Most numbers are not representable by a program...unless you allowed a program to be infinite length, in which case the first statement is no longer true.

There's nothing particularly unique about a digit-based encoding for Cantor's diagonal argument.

shkkmo · on Feb 28, 2020

> in which case the first statement is no longer true.

Why is that? What would be an example of an infinite program that could not be represented by an infinite byte sequence? Indeed, it seems trivial to map an infinite series of machine code instructions to an infinite series of bytes.

Edit: It seems like the first statement is false only if you use a different understanding of "possible" for "possible byte sequences" and "possible programs" where the former excludes infinite length and the latter does not.

paulddraper · on Feb 28, 2020

I internally editorialized your first statement to be "all the possible [finite] byte sequences contain all the possible programs".

Because you cannot enumerate all infinite byte sequences, according to Cantor's diagonal argument as linked earlier. [1]

Calling certain numerical representations "computer programs" in no way changes the fundamentals. There is no way -- no matter how clever your encoding -- to enumerate all real numbers.

[1] https://en.wikipedia.org/wiki/Cantor%27s_diagonal_argument

shkkmo · on Feb 28, 2020

I am not the person you were originally responding to.

You don't have to enumerate all real numbers to create a mapping from real numbers to infinite byte sequences.

I am merely pointing out that the first statement only becomes false if you editorialize it to "all the possible [finite] byte sequences contain all the [infinite] possible programs". If you treat the meaning of "possible" consistently in that sentence, I believe it remains true regardless of whether you define infinity as possible or not.

paulddraper · on Feb 28, 2020

Yes, you're right. I did not do a clear job of explaining the flaw.

paulddraper · on Feb 28, 2020

> even an infinitely long byte sequence cannot represent all [real] numbers

You misread. The previous comment said

> you can trivially write a program which will enumerate all the possible [finite] byte sequences in the universe

No one is interested in your infinitely long musical score.

paulddraper · on Feb 28, 2020

> an infinitely long byte sequence cannot represent all numbers

Okay, all natural numbers then. I don't think that changes the argument.

(Reals are a funny thing because any format you choose has numbers that are infinitely long in that format. Which is basically the same thing you said.)

albertzeyer · on Feb 28, 2020

Not all numbers, such as real and complex numbers. But you indeed can represent all integers, all possible programs, all digital music, digital images, digital everything. Not sure if those other music / images which cannot be digitized are so relevant.

maxerickson · on Feb 28, 2020

All you have to do is enumerate an infinity of infinities. It's no problem at all.

zdragnar · on Feb 28, 2020

One could make the same argument for a mathematician with a typewriter. The fact that it could come up with a particular sequence is, I think, less important than whether or not it actually has recorded a particular sequence.

emiliobumachar · on Feb 28, 2020

Claiming copyright of everything won't save you from producing all that child pornography and hate speech, not to mention unauthorized copies of state secrets. ;)

simonh · on Feb 28, 2020

For the state secrets, the state would have to identify the specific strings that constitute the violation, so that's unlikely to be a problem in practice.

moron4hire · on Feb 28, 2020

Nah, they have no problem identifying the string, they'll just slap you with a gag order, which itself will also have a gag order.

nkrisc · on Feb 28, 2020

That's all great until the judge asks for a demonstration to show it can generate the material you're presumably suing over.

jnbiche · on Feb 28, 2020

At that point, you simply explain Cantor's diagonalization proof and the concept of countable and uncountable infinities.

nkrisc · on March 2, 2020

But is it enough to prove that it could produce any melody? After all, couldn't a binary adder produce every executable binary possible? Well, eventually, and with enough memory.

jagged-chisel · on Feb 28, 2020

No, because each thing you want to claim copyright on must be “fixed in physical form.”

nck4222 · on Feb 28, 2020

By that logic, the library of Babel owns the copyright on every piece of literature that could be written, which would be a terrible idea.

https://libraryofbabel.info/

virgilp · on Feb 28, 2020

They don't pre-generate all the "books", right? So they can't claim the copyright (it's not "affixed to a physical medium")

danbruc · on Feb 28, 2020

So what is the difference between a program that views some text by reading it from a huge file and a program that views the same text by generating it on the fly? Especially if the file is generated by the very same process that happens on the fly in the second case? Why would you even have to actually build one of the two programs? Enumerating all possible character sequences of some given length is a somewhat obvious idea, and actually implementing that idea, especially if the text is only generated on the fly, does not bring the text into existence substantially more than just pondering the idea.

I guess for matters of copyright you have to produce a specific artifact, just blindly enumerating all possible artifacts is not good enough. Which would also mean that the existence of all those MIDI files will not make a difference.

afiori · on Feb 28, 2020

There is something missing here.

If I were to write a "random book generator" that just regurgitates infinite text, then I can copyright a pair (starting index, length), but not simply all possible finite sequences of letters.
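
A toy Python version of that idea, assuming a seeded pseudo-random letter stream stands in for the "random book generator". The alphabet, seed, and function names are all hypothetical. A specific passage is then identified by the pair (start, length).

```python
import random
import string
from itertools import islice

# Assumed alphabet for the toy generator: lowercase letters plus space.
ALPHABET = string.ascii_lowercase + " "


def text_stream(seed: int = 0):
    """Endless, reproducible stream of characters for a given seed."""
    rng = random.Random(seed)
    while True:
        yield rng.choice(ALPHABET)


def excerpt(start: int, length: int, seed: int = 0) -> str:
    """The passage singled out by (start, length) in the stream."""
    return "".join(islice(text_stream(seed), start, start + length))


passage = excerpt(100, 20)
print(repr(passage))
```

The generator alone singles out nothing; the pair (start, length) is what picks one passage out of the stream.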

danbruc · on Feb 28, 2020

I would argue writing a book, composing a song, or inventing a device should be viewed as a search process. You are wandering around in the space of all possible books, songs, or devices looking for one that is interesting to read, pleasant to listen to, or useful to use. Your work is mainly identifying special items in the vast space of possible items; making them physical once identified is only secondary, at least for the purpose of this discussion. So writing a computer program to generate all possible songs is finding a specific and - at least somewhat - useful program among all possible programs and - ignoring its triviality and the general questions about copyright - seems copyrightable to me. On the other hand, using this program to turn the space of all possible songs from something in your head into a huge pile of MIDI files on a disk seems not copyrightable to me, because you do not single out interesting songs. Finding a good song did not get significantly easier: you are now searching through and listening to a sea of MIDI files instead of playing all possible combinations of notes on a piano and judging the sound of each.

young_unixer · on Feb 28, 2020

By that logic, you wouldn't be able to claim copyright on works that are distributed compressed or encrypted.

paulddraper · on Feb 28, 2020

Nice link; thanks!

hobofan · on Feb 28, 2020

I'm not sure if it would hold up legally anyway, but with it already being generated I can see it working out a bit better.

If you only provide the code as a means of generating any melody, you could just as well replace it with any musical instrument and claim you can generate any melody with it.

arketyp · on Feb 28, 2020

Ad absurdum (or maybe not so much?), any universal Turing machine can be programmed to list every possible program, specifically the one generating every melody. So the copyright claim could then just as well point to Conway's game of life or rule 110.

anonsivalley652 · on Feb 28, 2020

Yep. They could've used that space for rainbow tables. I'm old enough to remember when someone from shm00 had rainbow tables available for download, but I guess the network bills added up or the feds threatened them.

eb0la · on Feb 28, 2020

I guess you need it stored somewhere to claim it existed on a given date...

If someone is using archive.org for this purpose, please consider donating :-)

collyw · on Feb 28, 2020

> Isn't it a bit of wasted space to store the generated data on archive.org

Could the algorithm be considered a form of compression?
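
One way to look at that question is a sketch where each melody is ranked by its position in the enumeration. The melody length (8 notes) and pitch count (12) below are assumptions picked for the example, not figures taken from the project.

```python
NOTES = 8      # assumed melody length
PITCHES = 12   # assumed number of distinct pitches


def melody_to_index(melody: list[int]) -> int:
    """Position of a melody in the enumeration (read as a base-PITCHES number)."""
    index = 0
    for pitch in melody:
        index = index * PITCHES + pitch
    return index


def index_to_melody(index: int) -> list[int]:
    """Inverse of melody_to_index(): recover the melody from its position."""
    melody = []
    for _ in range(NOTES):
        index, pitch = divmod(index, PITCHES)
        melody.append(pitch)
    return melody[::-1]


total = PITCHES ** NOTES
bits_per_index = (total - 1).bit_length()
print(total, bits_per_index)
```

Under these assumptions the generator compresses the whole collection into a few lines of code, but pointing at one particular melody still takes an index of about 29 bits, roughly as much information as the 8 notes themselves.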