(This is not meant to be an anti-AI-generated-art rant. It's coming whether we like it or not. But some of the motives in this thread confuse me.)
Music producer here with an honest question to those saying "this will provide me with a simple soundtrack/background music for $PROJECT"
Have any of you checked out / made offers on music production subreddits? Or other music subreddits? Various music production Discords? Elsewhere on the internet?
If so, could you say what your experience has been?
I ask because the music production scene is like...ridiculously saturated, and it's almost a meme in the producer community how hard it is to make even a buck producing. I suspect that there are a significant number of producers who would be happy to take your "prompt" for a small fee. Yes, I understand that 1) free and 2) immediate is convenient, but isn't hiring someone 1) relatively inexpensive, and isn't 2) whatever advantage human intent in construction gives worth something too?
I'm willing to admit that I'm missing something here, but I'd love it if someone could enlighten me.
While I'm asking follow-ups, to all the folks who love digging for new music so much that they're considering turning to prompting AIs: I'd be seriously surprised if you've really checked out all the stuff that is coming out from new producers (again, reddit, soundcloud, etc). Another meme in the producer community is how one spends hundreds/thousands of hours perfecting one's craft, and dozens of hours working on a track, only for that track to get like 5 plays on soundcloud and negligible engagement elsewhere. Are music consumers really that desperate for new tunes? Frankly a lot of us just aren't seeing it....
It's not that there isn't enough electronic music being made, it's that every new track that lands on soundcloud is a drop in the ocean of mediocrity. There is _too_ much, and 99.9% is just boring to listen to, because it sounds like everything else. I listen to a LOT of electronic music (and have, since the mid 90's) and just don't have the patience anymore to sit through hours of average material to find one or two truly inspired artists.
I doubt I would turn to AI much for anything other than background noise while focusing on work. In fact, that sounds like a perfect use case for me. "Dear GPT, please compose a four-on-the-floor downtempo progressive track with soft pads, no vocals, and zero goddamned fake vinyl noise that runs for two hours straight..."
> It's not that there isn't enough electronic music being made, it's that every new track that lands on soundcloud is a drop in the ocean of mediocrity. There is _too_ much, and 99.9% is just boring to listen to, because it sounds like everything else.
Yep. This is why I don't feel like AI used in this manner moves the needle for music: people only actively listen to the best 0.1% of music anyway. The ability to create music that is firmly in the other 99.9%, as this stuff very clearly is, just means that the ocean of mediocrity has more water dumped into it.
> It's not that there isn't enough electronic music being made, it's that every new track that lands on soundcloud is a drop in the ocean of mediocrity. There is _too_ much, and 99.9% is just boring to listen to, because it sounds like everything else.
To the extent that sounding like everything else is a problem, how is ML generated music not going to have it?
And in general this isn't going to be a qualitative improvement in experience. ML algorithms for recommendation are searching the preference space in much the same way ML generation would; they're just doing it over existing material. If you really find 99.9% of existing material boring, you're probably going to find a similar proportion of generated material boring.
Though I suspect 99.9% is hyperbole. My rate of "this is listenable and interesting and I'd like to come back" on Soundcloud is better than 1 in 25 on the worst day and better than 1 in a dozen on most, and the rate is often north of 1 in 6 for curated platforms like Pandora. It's never been easier to discover good new music with not much in the way of effort.
"To the extent that sounding like everything else is a problem, how is ML generated music not going to have it?"
AI generated art has explored all sorts of weird spaces that few humans have touched.
It's not difficult to make computers create unusual, original, bizarre work. The difficulty comes in making it both original and enjoyable/interesting.
Also consider that AI-generated music is often going to actually be a collaboration between a human and an AI. The human will be acting at least as a curator, because not everything created by AI is going to be pleasing, so some selection and catering to human taste will be required.
> The human will be acting at least as a curator, because not everything created by AI is going to be pleasing, so some selection and catering to human taste will be required.
Yes, and keep in mind humans are already doing this! It's very common to do tweaking of knobs on a synth/VST while recording and create a 10-20 minute audio file, commonly called a bass jam or mud pie, then select the best bits to use in a song. And of course, people use randomization tools to tweak the knobs for them. IMO use of AI to support this type of workflow is far more promising than going directly to the finished product.
I generated a 1:20 sample using your prompt "four-on-the-floor downtempo progressive track with soft pads, no vocals" using the audiocraft-webui fork, which allows for longer generation by overlapping generations.
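If you want to do the same thing without the webui, the trick is roughly this. A minimal sketch, assuming the released audiocraft Python API; the checkpoint name and the 10s overlap are my assumptions, not the fork's exact settings:

    # Extend past the per-call cap by feeding the tail of the previous
    # generation back in as a continuation prompt (what the fork automates).
    import torch
    from audiocraft.models import MusicGen

    model = MusicGen.get_pretrained('medium')  # assumed checkpoint name
    model.set_generation_params(duration=30)   # max length per call

    prompt = 'four-on-the-floor downtempo progressive track with soft pads, no vocals'
    wav = model.generate([prompt])             # [batch, channels, samples] at 32 kHz

    sr = model.sample_rate
    overlap = 10 * sr  # last 10s becomes the next call's audio prompt
    for _ in range(3):  # each pass appends ~20s (the 30s output includes the prompt)
        tail = wav[..., -overlap:]
        cont = model.generate_continuation(tail, prompt_sample_rate=sr,
                                           descriptions=[prompt])
        wav = torch.cat([wav[..., :-overlap], cont], dim=-1)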
It was not super-discerning listeners (like you sound to be) that I meant to address in that second question; sorry if that was not clear. Rather, it sounded from some of the comments that people were desperate for original tunes (and maybe not necessarily the most highly produced). But I didn't point to a specific comment, so maybe that's my fault.
We may also disagree on how much good stuff there is coming out, but I agree there is a lot of noise.
There is an absolutely massive gulf between "free (or fixed list price) and immediate, just use these apps" and "locate a musician, bargain with them, pay, collaborate with them, wait for revisions, eventually get something usable (but not be 100% sure if I own the rights or not)." I wouldn't even know where to start; I'm too far out of my league.
Tackling the latter would likely exceed the entire effort I spent writing my little hobby game in the first place. I don't think it's even close; it was never a serious consideration. Some of these games I write in a single sitting. I do my best to piece background music together using a chord progression app, descriptions of keys and the notes they contain from Google, and premade drum loops and instrument samples. It comes out worse than if a real musician had made it, but getting a real musician was never really an option.
It's the same for the art. I don't have the time or money to pay an artist. They deserve to be paid fairly for their work just like musicians, but I don't have it and it's just a stupid hobby game. But even stupid games need art and music. So, homemade programmer art and music it is. The availability of better tools to help non-musicians hack something together is greatly appreciated. I haven't tried any AI stuff yet but I will next time.
It's not that hard, you find an email and you basically do a cold call. I was doing it in high school off Newgrounds (which is full of royalty free stuff too!)
If your project is small and free, you're not going to land The Eurythmics. But all those people posting their music online hoping to get noticed? Emailing them, even a cold call, immediately tells them you've listened to their stuff and you like it. Honesty is the best approach.
I think OP is onto something.
Edit-
> but not be 100% sure if I own the rights or not).
That's also really easy: stipulate it in writing. Preferably a proper contract but an email agreement is defensible too (IANAL).
When it's just me and some apps, I'm writing the background track in an evening after coding the game earlier in the day. If I bring someone else in, now I'm writing contracts (something I'm completely unprepared to do correctly myself, as a non-lawyer). It's too big of a jump for a one-day zero-budget hobby game that isn't very good and only my friends will play. Not when I can quickly cook up something myself using readily available tools.
For a more serious project with a budget, absolutely you find a professional producer, just like you get professional coders and artists. But this isn't that.
Dude I've literally offered to design entire websites for up and coming music artists for FREE because I love their work and am rebuilding a portfolio to have more music sector work.
I've offered this to like 15 people. Some respond in utter confusion and blow me off. Most don't even respond.
This isn't 2005. The vast majority of people, especially "music artists" are not corresponding over email and are not exactly professionals either.
> It's not that hard, you find an email and you basically do a cold call. I was doing it in high school off Newgrounds (which is full of royalty free stuff too!)
You are missing the point: most programmers are introverts and absolutely detest doing cold calls and cold emails.
Instead of turning to pre-made corporate software, you could just hire me or one of my developer friends? We’ll crank out something fine for you in no time flat. All you have to do is just:
- Find me, which is easy, just become an amateur developer yourself and scour the various places I frequent
- Think of and propose in great detail what you want. We will go back and forth over this, over the course of several days/weeks. Bonus points if we don’t speak the same language.
- Sign some form of agreement, really easy, just read this 5-page document and maybe hire a lawyer if you are unsure. All very easy.
- Fork over the cash
- Get deliverables in a few weeks, hopefully.
Now when you compare that with just firing up some website/app and getting on with your work, is that really better? I'm not seeing why you would just go to gmail.com when I could have made you a very nice, very special email reader.
I have perfected my craft over thousands of hours you know. You should pay us the respect we deserve.
—
Seriously: cranking out tunes through some prompting vs hiring people through shady channels like reddit? Are you serious?
You give an analogy to software. I suppose I feel that art differs from it in some ways, e.g. originality and creativity.
Reading many of these responses, I'm gathering that perhaps I have too strong a notion of what quality of music might be in demand. While the loops in the original article are impressive given their generative nature, I suppose I felt that there may be demand for something more (better sound design, more long-term structure), but maybe I'm naive.
I came off like an annoying neckbeard. I guess that’s expected of me and my type, but sorry for that.
You hit a nerve because software development is also art and highly creative. I have given a significant part of my life - basically my youth - to it and I feel “creatives” think they are somehow special and that their work is fundamentally different and I don’t think it is.
It’s just that we have been cornered earlier than you guys. My skills are now only profitable as boring building blocks in corporate settings because nobody else will pay for proper work. Everybody expects easy access for free instantly to whatever digital service they can get their grubby hands on. If I talk about “craftsmanship” I get laughed out the room. Nobody gives a shit.
Now I’m like, yeah guys, that’s how it feels to have your skills commoditized. Deal with it. That’s kind of childish though.
Personally, I think you underestimate access. On several occasions while developing small games, I've wanted to collaborate with someone with a musical bent to put something together.
The problem, I feel, is the expectation that I front the cost of engaging someone to work on a project with me.
Navigating a working relationship on a smaller project seems fraught with issues.
I'm rarely inclined to spend dozens of hours listening to soundcloud when I have other things to work on.
I mean, yes, people create interesting music; perhaps it's a search problem? Knowing that someone creates the kinds of music I'm interested in would help. But as someone making things, I'm trying to find someone to collaborate with who has an overlapping interest in what I make. Solving for that is not straightforward.
I've had much more luck with graphical art than music.
So yes, even though these systems are fundamentally worse, I can at least "collaborate" with them on producing something. Going from zero to one can be enough.
Music is abstract. When we talk about visual art we can almost always be on the same page. If I say I need garden gnomes parading around a Bavarian village, the amount of variation between my internal idea and what a visual artist returns will mainly come from the lack of terms I use regarding aesthetic sensibility. Will they return something abstract or neoclassical? I would then be more specific etc...
For music we could present such an image, but I'd argue it would then suggest many more possibilities. You would suppose we could narrow down by genre, but even then there are too many possibilities: genres, I would also claim, are not as strong categories as the stylized "eras" of visual art. Moreover, we can "port" a fundamental structure like a melody across all sorts of strains of music, whereas in visual art any motif is bound to change depending on the era and style we put it in. That is, with regard to a description we could give in English, some elements are stronger in visual art and weaker in music, and vice versa. It's probably more natural, and more feasible, to describe what a visual representation should be than what a piece of sound should be.
It's interesting how we can generate images with, I'd argue, stunning faithfulness to some prompts, but we don't seem to be very close to the same standard at generating music.
My first reaction to this wasn't "cool I can make the novel music I desperately crave", more along the lines of "this thing is making some wacky sounds that I'd love to see a producer craft into something more". Because I definitely agree with you that there's an abundance of fantastic music to check out, and realistically I'll never be able to check out even half of it throughout my lifetime.
The guys in Infected Mushroom will have a field day with this stuff. Their whole thing is finding weird ways to create new sounds you never heard before.
Honestly what I'm most excited about is how this technology can be used, not to arrange parts or even loops but rather in new plugins (VSTs) that implement novel approaches to digital synthesis. Think of all the awesome sounds.
If anyone knows anyone working on that, ping me. :)
> Have any of you checked out / made offers on music production subreddits? Or other music subreddits? various music production discords? Elsewhere on the internet?
When it's 3 am on a Saturday and I'm in the zone on a passion project, I'm not about to spend the rest of the weekend going back and forth with a music guy on Reddit.
I want a music robot that cranks out music on demand and responds to my every whim, and a real human being isn't going to want to fill that role no matter how impoverished they are.
If I need a background track for something, and I commission someone else, then I believe the standard contract for the commissioned work would still leave copyright with the producer (though not always), and changing it so that I have exclusive rights to the work would potentially make it more expensive.
Add to that, if I don't like something, want it tweaked, want something completely redone, or just flat out change my mind about some direction I provided later, I have to go back to them and negotiate a new contract, or find someone else to do the work. The costs add up over time, and there's an additional benefit to immediate feedback (or cost of delayed feedback, as anyone who has worked on a software project that takes forever to compile/check can attest).
I haven't used the music AI tools yet, but having played around with Dall-E a bit I can say that it's pretty enjoyable to be able to give direction, and bound it, then roll the dice and see how things turn out. I definitely feel some ownership of, and pride in, the resulting creation.
the simple answer is that your motivation for being an artist needs to change to exclusively personal fulfillment, because that was true for the pre-AI world, as you essentially described, and it's truer for this current AI world.
the real meme is about how artists have always been grasping for financial respect in every market condition ever, and yet nothing has changed. People were never going to commission you; they were never going to book you, even while they do appreciate the content. And the few that would ever actually try to commission something encountered friction after friction after friction, frictions that artists collectively have been uninterested in solving, because they're starving and preoccupied with fighting for scraps and modicums of respect at all.
The world has now solved many of these frictions.
The frictions were:
1) hoping they found the right artist to begin with
2) hoping that artist is reliable and has any work ethic or structure in their life
3) not bruising that artist's ego, in whatever communication style is preferred
4) dealing with how completely segregated many artists are from contract negotiations and any aspect of the business world, while needing to secure rights properly
5) ego in securing rights properly without the artist overplaying their hand
6) waiting for the commission
7) revisions
8) circling back to 1
9) if you ever get past 8, the issue of whether your new license can be used in an unforeseen way and medium in the future
Getting burned on altruistic commissions from living artists is simply over now. All these frictions are solved by the free and immediate way.
> the simple answer is that your motivation for being an artist needs to change to exclusively personal fulfillment, because that was true for the pre-AI world, as you essentially described, and it's truer for this current AI world.
> the real meme is about how artists have always been grasping for financial respect in every market condition ever, and yet nothing has changed. People were never going to commission you; they were never going to book you, even while they do appreciate the content. And the few that would ever actually try to commission something encountered friction after friction after friction, frictions that artists collectively have been uninterested in solving, because they're starving and preoccupied with fighting for scraps and modicums of respect at all.
For anyone trying to make money off of music, they should have already been aware that most of the effort in making a living is the non-music work. Once your music reaches an acceptable level of quality it's more about finding and managing your fanbase, industry connections, getting booked at the right shows, promotion and marketing, maintaining professionalism, etc. than anything else. Which this particular AI doesn't help with.
An extreme example is Fred Again, who came out of nowhere and is now one of the biggest names in electronic music. His music isn't bad, but it's nothing revolutionary. As it turns out, though, he grew up in one of the richest neighborhoods in England, with Brian Eno as a neighbor, and went to the most expensive private school in London.
So no, AI music generation doesn't change anything here. It's similar to the startup mistake technical people make of focusing on picking the right tech stack instead of focusing on sales and finding product-market fit. The software/music is only about 10% of the challenge of making a successful business/career.
I did want to clarify that I was posting from the angle of those of us who need music produced for our products but were never going to commission it.
I think it's important to understand that user story, because a lot of artists don't seem able to empathize with it. People are excited because they were never going to commission artists, and were also turned off by stock music licensing websites.
>the simple answer is that your motivations for being an artist need to change to be exclusively personal fulfillment.
Mine are, and the same goes for most of the artists I'm pointing to. The point wasn't that they were in it for the money, although many dream of being able to at least one day pay the rent with it (or maybe just groceries).
The rest of your response makes sense (although I think much of it could be said for all of hiring someone to do work). Anyway, thank you for providing your perspective.
Humans are just naturally hard to deal with, especially humans who you never met face-to-face.
As anti-social as it sounds, it's the conclusion I've reached after years of working with freelancers/contractors. I've been in contact with >50 artists (>300 if "they sent me a proposal on Upwork" counts) and worked with ~10 of them.
Don't get me wrong, I still choose human artists over Stable Diffusion. For now...
> I'm willing to admit that I'm missing something here, but I'd love it if someone could enlighten me.
It's basically the same as with Midjourney. Before Midjourney I'd have to spend quite some time coordinating with some human, explaining what I want, licensing terms, etc., only to have to wait a significant amount of time for an image that I might not like.
With Midjourney, for just a very small amount of money, I can instantly get images that are exactly what I want, iterating extremely quickly. Just the fact that I don't have to deal with another human saves a massive amount of time.
TL;DR
1) Faster
2) Cheaper
3) Often closer to what you want, because you can iterate quickly and get hundreds of variations
“Melody conditioning” as shown in the article seems both immediately useful and something that’s harder to find a human to do for you at the same level of quality.
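For what it's worth, the released audiocraft library exposes melody conditioning as a chroma-conditioned generate call. A minimal sketch, assuming the 'melody' checkpoint name from the release; 'my_melody.wav' is a placeholder input:

    # Condition generation on the melody of an existing recording.
    import torchaudio
    from audiocraft.models import MusicGen
    from audiocraft.data.audio import audio_write

    model = MusicGen.get_pretrained('melody')  # the chroma-conditioned checkpoint
    model.set_generation_params(duration=15)

    melody, sr = torchaudio.load('my_melody.wav')  # placeholder input file
    wav = model.generate_with_chroma(
        descriptions=['lofi hip hop with warm pads'],
        melody_wavs=melody[None],  # add a batch dimension
        melody_sample_rate=sr,
    )
    audio_write('out', wav[0].cpu(), model.sample_rate, strategy='loudness')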
I think it's about the ability of someone who has a great idea (imagination) but lacks the time, resources, or skills (execution) to make it happen. Of course a talented producer will create a more compelling song (at least for now), but if the tool is an amplifier it should also boost talented composers: their prompts, or inputs to prompts, might be much more detailed, more interesting, more creative than those of a neophyte. This makes some assumptions, but I think the draw for most people is that they can "make" something that sounds cool with almost no effort. My belief is that someone who puts more effort into a GPT and has more expertise can get a lot more out of it as well. I could of course be wrong and GPTs might be the big equalizers, but I doubt it.
I want to do this but I'm scared of the backlash of "you're being exploitative!!!!".
I know the people who say that mean well, but it totally overlooks both how much the culture does (as you say) want to provide their art for projects to use it and create value together, and... reality. Shouting at everyone isn't the way to get them onboard, but shout they do, and it's one country in particular that seems to scream the most.
I'm in the UK and I can't walk down the street without tripping over producers, so maybe the way around the angsty people is finding them in person? Or... we just use AI. The robots solving our social issues is probably a thing.
Hmm, I was there just yesterday. I needed a track to go with a very particular side-project, and I was looking for somebody to arrange/design/sing the vocals. I'm 100% sure that a musician will always do a more musical job than me. The problem is that coaxing the exact work that I need from that musician is going to be a pain (that is to say, expensive and time-consuming), because the piece does not (cannot) fall into any set genre (and set genres tend to be cheaper to produce).
I'm not going the AI route. It's frankly easier, more fun, and it sounds better to compose the music myself, and if the project takes off and makes a dime, I'll hire a pro to improve things later.
This is an insurmountable benefit. Literally the two most important things when it comes to me buying music at scale.
I made a mobile game a while back, and composing and licensing music cost me $30,000 for a free-to-play game. That was the same as 6 months of dev salary (devs in Belarus).
If I can save $30,000 and have zero delay, I’m just going to do that 100% of the time.
The only factor a real musician can beat on is quality. But let me tell you, with zero marginal cost of production, quality will inevitably be better with generative models.
The problem is minimum viable expectations and how fast these are met. In 90% of Reddit you will get flamed for offering money for anything. It wouldn't even cross my mind to go there.
I think the promise, already demonstrated with language, is that you can iterate really quickly. "Make it a bit more upbeat; ok, try more synthwave; ok, scratch that, try darker electro; ok, this is better, make the bassline more pronounced. Great, that's what I was imagining."
I don’t think it’s going to displace a dedicated composer that gets the medium they are scoring for any time soon. But then that’s not what your comp was initially.
TLDR there are cases where “good enough” is going to be provided by generative music in the medium term. Unlikely for this to be anywhere adjacent to music connoisseurs.
I just don’t have a good source of people that are guaranteed to produce something sensible, I have a shortlist of artists I’ve encountered over the years, but it’s indeed a very short list. If I ask someone to create a track and it turns out it’s garbage, I still need to pay them, and I’ve wasted a week of my time.
Another framing of this is not based on demand. Presumably most creativity and art creation isn’t to fulfill a need or demand from anyone other than the producer. This could allow the creator and even users to feel some sense of originality and creativity.
(and now for the rare take that isn't your typical cynical/jaded internet comment)
Wow, this is more than good enough to use for background music in video games, stores, commercials, etc.
You really could have super dynamic music in a video game for instance that changes based on the time of day, environment, situation, mood, etc.. all combined.
Combine it with a LLM DJ and you could get some fun radio stations.
> You really could have super dynamic music in a video game for instance that changes based on the time of day, environment, situation, mood, etc.. all combined.
Games can and do already do this; dynamic sequencing of music from a pool of stems has been common practice for a while. Maybe this could let you do it cheaper, and AI could go more granular by creating new stems on the fly, but the onus is still on the AI developers to show something which hits as hard as someone like Mick Gordon's dynamic compositions.
Infinite variety is of little value if the infinite space is full of infinitely boring, uninspired content.
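For anyone who hasn't seen it from the inside, "dynamic sequencing from a pool of stems" is conceptually just weighted mixing of pre-composed layers driven by game state. A toy sketch with synthesized stand-in stems (real games do this with recorded layers inside the audio middleware):

    # Crossfade pre-composed stems based on a game-state intensity value.
    import numpy as np

    SR = 44100
    t = np.linspace(0, 4, 4 * SR, endpoint=False)
    stems = {  # sine/square stand-ins for recorded layers
        'pads':  0.3 * np.sin(2 * np.pi * 220 * t),
        'drums': 0.3 * np.sign(np.sin(2 * np.pi * 2 * t)),
        'lead':  0.3 * np.sin(2 * np.pi * 440 * t),
    }

    def mix(intensity):
        """intensity in [0, 1] -> pads always on, drums then lead fade in."""
        weights = {
            'pads':  1.0,
            'drums': np.clip(intensity * 2, 0, 1),
            'lead':  np.clip(intensity * 2 - 1, 0, 1),
        }
        return sum(w * stems[name] for name, w in weights.items())

    calm, combat = mix(0.1), mix(0.9)  # exploration mix vs. fight mix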
Like I said, cynical/jaded internet commenter - I don't need Mick Gordon's dynamic compositions, I just need some background music for my game that's good enough.
And no you can't do this already:
const musicSpeed = inFightSequence ? 'intense' : 'chill';
const musicPrompt = `drum and bass beat with ${musicSpeed} percussion`;
playMusic(musicPrompt);
Are you deflecting by attacking my simple example instead of defending your ‘this has already been done before’ point? If you’re wrong just say you’re wrong.
When Animal Crossing blends between a different composition for every hour of the in-game clock, with variants for different weather conditions, is that not changing based on "time of day, environment, situation, mood, etc"? When Doom dynamically ramps the intensity of the music along with the intensity of combat, and inserts perfectly synchronized stings in time with the player's actions, is that not reacting to the situation? That's what I mean by this already having been done, just not with AI.
AI has the potential to consider more variables than is feasible with the current process, but my question is "at what cost". Would Doom be better if the music were slightly different depending on which weapon you were holding, if the trade-off is that instead of Mick Gordon's work it was a computer generating what may as well be royalty-free elevator music? Probably not.
Making more content for less money is only a net positive if the content is actually good.
Why do you think AI music will sound like elevator music forever, when it’s already generating English text and code at such a high level? It’s quite possible that 10 years from now, Mick Gordon will sound passé when compared to the dynamic AI generated music. Maybe not, but definitely possible. There’s a lot of money to be made with better generation of music, and it’s going to be an area of exploration for sure.
Well, I would say that AI's ability to generate objectively correct text or correct code doesn't have much bearing on its ability to create worthwhile art; those are almost polar opposite goals. There is no objective metric for what constitutes good art that you can train an AI towards; the closest thing we've come up with is teaching it that the samples of art in the training set are "objectively correct" so that it will try to make something similar. Better models achieve higher fidelity but are stuck forever imitating rather than exploring new or less common ideas.
Image generation is the most mature form of artistic generative AI, and the trend there has been towards introducing more human influence into the process to help guide the AI into creating something actually worthwhile. If the goal is to embed an unsupervised AI into a game engine and have it create consistently high quality and interesting music based on the current game state, with no human operator in the middle to curate and guide the process, we've got a hell of a long way to go.
>Wow this is more than good enough to use for background music in video games, stores, commercials, etc..
Hard disagree, and lack of copyright due to not being produced by a human becomes an issue for many video games, commercials, etc.
>You really could have super dynamic music in a video game for instance that changes based on the time of day, environment, situation, mood, etc.. all combined.
You don't need this AI for that at all.
>Combine it with a LLM DJ and you could get some fun radio stations.
You could also not, and you wouldn't know until it failed to produce anything interesting. A whole radio station filled with grocery store background music? oh wow I can't wait for the fun.
And you're not likely to make any money doing it, so what's the point, aside from showing that the human portion of music is missing from everything you suggested?
I used MusicGen yesterday to create 50 songs or so. Three of them sound pretty good [1][2][3]. MusicGen is definitely the best of the four models in the presentation. I used the prompts differently than the article did, and I think I got better results.
Suppose there were a way to measure heart rate or electrical spikes in the brain, and we configured the machine to generate music to increase or decrease heart rate, or similarly to increase or decrease the electrical activity of the brain. Then psychology might be deprecated; mood will be reduced to just a music channel.
Yes, of course they are the starting point; a good musician may take some samples and transform a generated piece into a better song, for sure. Some artists state that a painting is never complete, or a song is never complete. There is always room for innovation.
The prompts I used referenced real songwriters, and the model seems to know their songs. The article does not prompt it that way. So I guess there may be a little bit of IP infringement, but we only need that for the first batch. The next models will be trained on the best generations of previous models.
People have tried this, they're called binaural beats, and they don't seem to work for the most part. I mean, not in the sense that you could engineer sound to invoke very specific effects in the brain consistently.
I personally have more than 10 years' experience of sitting in the cold all day long with only summer clothes on. Like 0 to 5 Celsius, with only shorts on, not even socks. I am a winter swimmer as well. I do that because I can think a lot more clearly in a cold environment; it is good for the brain. Granted, in Greece there is not that much cold, maybe 1 or 2 months of 0-5 Celsius.
That can be achieved by putting music on, which speeds up the heart rate. Usually hard rock, metal, thrash metal, etc. In that case, the body starts sweating a lot, no matter the temperature. I combine that with 5 simple exercises I do all day long, which are important as well.
My point is that using music, someone can be in charge of their heart rate. But my biggest complaint has always been that these metal guys are masters of the guitar, while other kinds of music have better taste in rhythm, in melody, etc. Using programs like this we can evolve it a little bit, to be more pleasurable to listen to.
I know about binaural beats; I have tried listening to different frequencies for hours on end, and they don't work in my opinion. At least in my case.
There's a real effect, just nothing even remotely close to the actual fantastical claims being made about it. It's highly doubtful there's some sort of profound way to induce arbitrary brain states through audio input alone.
I remember vividly that this was very hyped in some circles around 2005 or thereabout, with wild claims that listening to some strange white noise for twenty minutes could induce full-blown psychedelic trips even in people with no psychedelic experience. I even tried a bunch of em, and the only clear effect was a mild headache. And I was naïve enough to think it might work back then, and yet there wasn't even really a placebo effect.
I was thinking of a scenario of mapping our brain activity during, say, reading the functions of some module, or a birthday party, or a business meeting. From then on, we have the machine generate songs that activate roughly the same brain region as the actual life experience. We do that once and generate 10 songs.
The next time that life experience takes place, we listen to the relevant songs for five or ten minutes before it happens. We do that to put ourselves in the mood, as a mental preparation tool.
That's all. Not creating worldwide Britney Spears hits or altering our consciousness. Just a mental tool.
Oh I see, you're essentially describing what I think of as aural contextual clues/associations. Sure, that's very real and I've experienced it first hand.
Though I'm sceptical how directed it can really be. There are some songs that have a bizarre effect on me for sure. Though most of the time it's because I had some strange experience involving the combination of said music with psychedelic drugs. And now the music can induce echoes of that experience. But it's just sort of an association that happened by accident.
I guess I could see people using this phenomenon in a more deliberate manner. And you certainly seem to be doing so. Though it could be that you're just somehow more able to than most people.
That happens in general; many people associate music with relevant activities in their life. They listen to songs that are more suitable for driving a car, or lounge beats for reading books.
One scenario is to record some sounds of the event once, like the laughter of a child at their birthday, put it into songs, and listen to it before the next time it happens.
Another scenario is to record the brainwaves of some difficult task, like programming, and try to activate the same region of the brain by listening to songs. When there is an automatic way to create one song which activates an area, but not exactly, and another song which activates one more area, but not exactly, the machine can try to figure out how to combine the two songs to hit the spot. It is essentially a problem of combining information, which AI statistical engines are very good at.
So, CC-BY-NC licensed model weights, and they've made sure to license the training data. And some jurisdictions are saying that copyright cannot be claimed on the output of such models.
Oh, to be a fly on the wall in RIAA corporate offices…
Sans schadenfreude, I think this (depending on inference speed) could be perfect for dynamic content in games (including IRL games: LARP, escape rooms, table top games, etc.)
>Oh, to be a fly on the wall in RIAA corporate offices…
In all likelihood, they're ok with events. Games were never anywhere near their main revenue stream. Now the labour costs on what they're actually selling are dropping to zero. RIAA's future:
1) Use AI to fake a band.
2) Use AI to write music (maybe even lyrics). Don't really care if the AI is any good.
3) Distribute output widely, note that copyright still applies to the output.
4) Use media to generate hype (the critical step). This depends only on platform control/relations, and they have that.
5) Yea, other people could technically generate the same quality of dreck with AI, but it won't be (and legally can't be) exactly like the hyped dreck. Others can replicate nearly everything except the hype.
6) Since the costs are near zero just about every sale is pure profit.
Basically, since music can be replicated, they'll sell hype and belonging to a fan group instead.
Then they'll quickly be replaced, because nothing about that is special. The RIAA exists because they were positioned to guard culturally significant intellectual property, which gave them a monopoly on it.
Making and marketing an AI band isn't even interesting. Someone will be doing it on Twitch and YouTube with an anime VTuber ensemble before the RIAA even figures out any portion of it. The media hype is because of celebrity, and AI-generated stuff can't be celebrity.
>The media hype is because of celebrity, and AI-generated stuff can't be celebrity.
With sufficient social network campaigns, media brib^W relations, and paid influencers we can get anything to be a celebrity, whether it's a paid actor or an AI avatar. That's the special step that not quite anyone can do. Plenty of K-Pop is already not that different...
Let's adopt a legal realist point of view. RIAA et al. have money, lots of lawyers with large briefcases, and lobbyists, so they basically write the law. The ordinary YouTuber doesn't have the resources to defend, and there aren't any commercial interests on the other side here.
So even if the output is technically made entirely by LLM, they'd find a way to slightly tweak the process or even the law so it applies. Someone will do a trivial low-pass filter and then claim copyright. At worst they'd find a flunky to say they 'wrote' the music.
They might; but equally, Meta or Apple might counter-lobby to end copyright as a concept entirely, (or just for music, depending on how good the LLMs are at code and script writing for commissioned TV shows and stuff).
Those two can still rely on patents and trade secrets in a way that RIAA can't. (At least, the RIAA can't rely on those as far as I can see, but what do I know…)
That's what actors + pre-recordings are for (we could try using AI to generate the music live, but that would make the actors' work more complicated and add more failure modes, so why bother?). Note that using recordings already happened in the pre-AI era.
Well, a lot of artists have sued other artists for plagiarizing. Now MusicGen will be called to testify in court and show its composing method. And if it can't prove innocence, it will be put in jail.
I installed it and everything went surprisingly smoothly. It used about 8GB of VRAM max on an Nvidia A30 and takes about 30s to generate 10s of audio. The max duration seems to be 30s in the frontend, but the quality is a lot lower at that length.
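Something like this is enough to check the timing and VRAM for yourself; a rough sketch assuming the audiocraft package on a CUDA box (the checkpoint name is an assumption and may differ in your install):

    # Time a 10s generation and report peak VRAM.
    import time
    import torch
    from audiocraft.models import MusicGen

    model = MusicGen.get_pretrained('medium', device='cuda')  # assumed name
    model.set_generation_params(duration=10)

    torch.cuda.reset_peak_memory_stats()
    start = time.time()
    wav = model.generate(['minimalist techno with a driving bassline'])
    print(f'{time.time() - start:.0f}s elapsed, '
          f'{torch.cuda.max_memory_allocated() / 2**30:.1f} GiB peak VRAM')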
Mixing genres does not really work, and the model doesn't seem to be trained on band names. However, it does perform well creating music in existing styles.
I generated some Eurovision crap and minimalist techno that were very much believable. But mixing death metal with lofi ambient wasn't the best, nor was the epic progressive rock guitar solo I asked for.
I think the examples on the website are cherry-picked, but with some experience in prompt engineering and many attempts, it should be possible to generate great samples.
It's also excellent at generating Boards of Canada-like music. The audio artifacts, the low fidelity, the weird sounds, the detuned synths: this model does all of that very well, and it does sound great to me.
(Not the original commenter) I don't know of any death metal examples, but chill-lofi-beats+djent music already exists and it's pretty good. This isn't the only example but it's my personal favorite one:
Depends on the human. If it's myself, not really. But if it's well done, yes, I would enjoy death metal sonorities in lofi ambient once in a while. But I admit the prompt is perhaps a bit challenging.
For the most part the new samples still sound like melodic nonsense — in all but one of the examples the melody doesn't fit properly with the chords underneath. It really does feel like the output of a music blender.
The style transfer is the most interesting bit IMO, as you get a sense of how it hears the source examples.
For example, when transferring the opening to the Bach Toccata all the new samples miss out the same passing note (the fifth note in the sequence). To a human ear that note is important, and could easily have been incorporated into the new samples, but it seemingly doesn't activate enough neurons for MusicGen to care.
I'm just weirded out by the fact that conversation about something AS EPIC AS THIS is so boring and rudderless here on Hacker News of all places.
I mean like YESTERDAY I did not have this superpower to summon something as majestic as, say, https://fb.watch/l4ssOD40M4/ with a simple 'A quirky and skronky Aphex Twin sample that just hits you'
Edits:
I woke up to this news delivered by Yann LeCun himself this morning on facebook[1], and my gaping mouth can still be found for onlookers to witness, I suppose!
LIKE THIS IS IT FOLKS!
Edit 2
All those back-in-my-day muzak folks lamenting the quality of contemporary music can fuck right off, because you clearly haven't explored enough of the modern music landscape.
Don't you dare blaspheme saying modern music has stagnated or some drivel like that. It is outright offensive to folks who are pushing the boundaries, like for example The Ex from the Netherlands: https://www.facebook.com/theexband https://www.theex.nl/news.html
Just because you and the other soulless people you fraternize with are ignorant of all the innovative stuff thats going on, we have to suffer through your opinion on the state of pop culture?
Because we like music for the fun of making it, and the shared emotional connection felt with the artist (whether it's real or not, knowing a human wrote and performed a piece allows you to imagine this connection).
I don't know what the point of machine generated music is. Just destroying one of the few remaining ways for people to make a living doing something creative, I guess.
The promise of automation was to have machines do the things we don't want to do, so humans could have more time to do things we enjoy.
Instead, we are automating the things humans enjoy, and still leaving humans to figure out how to feed, house and clothe ourselves through the sweat of our brow.
> Because we like music for the fun of making it, and the shared emotional connection felt with the artist... I don't know what the point of machine generated music is.
Bingo.
This is a fun toy, but in terms of what it means, you may as well ask an AI to pray. It's completely hollow in terms of the actual experience.
This could make suitable filler for idle games, ads, aquariums, and elevators. Not much else. Perhaps at best, a producer could use this to fill in the instrumentation behind a singer, but I have a feeling it's not there yet.
> The promise of automation was to have machines do the things we don't want to do, so humans could have more time to do things we enjoy... Instead, we are automating the things humans enjoy.
Damn. Never looked at it that way. It's still enjoyable to do these things, but perhaps less lucrative. I don't know, do professional musicians like arranging elevator music? I'm strictly an amateur who has never made a dime performing, so I really don't know if that would be joyful, soul-crushing, or somewhere in between. I just know what it means to me, and like I said, you may as well ask the machine to pray for all I think this amounts to.
> but in terms of what it means, you may as well ask an AI to pray
The generative process is based on a combination of learning and randomness. The random part doesn't mean anything, but it's clear that it is far from just random notes. Do you think human music always starts from a meaning? It's just lucky accidents that sound good. We even retrofit explanations post facto to our actions, we can certainly compose music first and assign a meaning later.
Around 150 years ago classical music had a big dilemma: should music be related to concrete things, or abstract? Should we put a story to music? So everyone wanted to know "what was the program?" (program == the original author's meaning). Sometimes composers would just hide it in order to push people to use their imaginations. It didn't matter what meaning the author originally assigned to it; better to try to hear it with beginner's ears.
You've misunderstood. I'm not talking about the meaning of the inputs and outputs of a creative process. I'm talking about the very experience of doing the thing. Hence the prayer comparison.
> I don't know what the point of machine generated music is.
One point is that music fans can now make their own music. I think it's great that people can express themselves and it's not limited to those who put in 10k+ hours to master a single instrument. More people creating is a good thing.
The idea that creating music still has a huge technical barrier is laughable; that barrier hasn't existed for ~20 years. Artists like Tinashe have learned to produce music themselves with programs like Ableton, without a lick of mastering instruments or graduating from this or that art school. Just a general sense of what sounds good to you. Unlike visual art, there's no mechanical barrier either, no mastering of techniques. You can genuinely fiddle around with knobs and buttons and create something that sounds great to you; soundcloud is filled with these.
So there isn't going to be an increased level of profound self-expression because of this. Quite the opposite: more pure noise for the purpose of farming ad revenue.
What's worse, and an aspect many proponents of AI generation ignore, is that by ushering people into this specific channel of caring more about prompts than anything else, we are doing a real disservice to people who could have become serious masters of their realm. After all, "why learn how that music program works when I can just generate it?"
>> After all, "why learn how that music program works when I can just generate it?"
That's how many of my friends in the music biz are thinking right now.
Also the same applies to Code and anything that could be generated by AI. I honestly lost the joy of learning a programming language with the advent of GPT.
ChatGPT still can't do visual programming, so if you use something like Unreal Engine you still have to figure out everything yourself. Sure, GPT can generate algorithms, but it can't playtest the game.
I can't ask ChatGPT to implement something like wall climbing or object throwing in Unreal. It will probably generate something, but it has no way to playtest it and check if it actually feels good to play.
That probably is a good thing, but the road to mastery is a great thing. I can't describe to you the feeling of being in the zone while making music, but I'll try.
Things will erode and decay, things will come into being, things will change. This flux is so constant that in truth there hardly are any things, just the changes; for as soon as you step in the river a second time, neither you nor the river are the same as you were. Heraclitus, maybe? One of those guys.
Likewise, music is inherently fleeting, yet it still makes sense. You can't hold music, yet there's still a sense of it being a thing that exists. Yet when it stops, it still somehow hasn't ceased to exist. The act of musical performance, even at a basic level, especially with others, brings us one step closer to something fundamental about the universe than other forms of expression.
Like I said elsewhere, if you could ask the machine to pray or meditate, it wouldn't be fulfilling for anyone. It would be hollow.
Is it really an act of personal expression if you've narrowed the "vocabulary of creation" to the Stable Diffusion equivalent of "hyper realistic, unreal engine, 8K, masterpiece, intricate details"?
At that point would not the act of creation feel rather hollow?
This feels like a straw man to me. We are continuing to automate feeding, housing and clothing ourselves as well. These two things are not mutually exclusive.
I would like to make music, video games and movies, too, and AI lets me do that. I don’t need millions of dollars or years of training to make something creative anymore.
You can go a long way with LMMS or Ardour and free sample packs. Most big sample production companies provide royalty-free samplers. The free stuff from Sonniss (GDC freebies) and Black Octopus Sound could last an entire career. Throw in the free Komplete Start (or Helm and Surge if you prefer open source) and you have all your synthesis needs covered: https://www.native-instruments.com/en/products/komplete/bund...
> Because we like music for the fun of making it, and the shared emotional connection felt with the artist (whether it's real or not, knowing a human wrote and performed a piece allows you to imagine this connection).
Speak for yourself. I like music if it sounds good, regardless of who made it.
> we are automating the things humans enjoy, and still leaving humans to figure out how to feed, house and clothe ourselves through the sweat of our brow.
Have you been to a farm before? Have you seen a textile factory? Have you seen a construction site? How could you, with a straight face, suggest we are not automating those things? There are vastly more people working on automation in those fields than are working on AI-generated music. Automation in agriculture, construction, and textiles are massive industries. There are a lot of people in the world working on a lot of things.
I mean that I still need a job to get access to those things. Housing prices go up and up, whatever automation is happening in that market is not helping ordinary people who need a place to live.
Food is a different problem. We have access to very cheap calories, but the overall quality of nutrition is way down in advanced economies, leading to an epidemic of obesity.
Textiles is pretty much a solved problem. We have so many clothes, we give them away en masse in developing countries. I think there are very few people in the world without access to adequate clothing, and if there are I suspect it's a distribution problem.
Yep. The way commissioners react when I deliver the files and they hear what I made for them for the first time tells me AI has a long way to go. I'm not sure it can replace that human connection. There's plenty of solid, cheap, and sometimes even free library music out there if you just want music of some sort for a project, and no generative music I've heard comes close to it.
> The promise of automation was to have machines do the things we don't want to do, so humans could have more time to do things we enjoy.
But why exactly should that happen? By which mechanism? Every single company automates in order to increase their monopolies and profit, to generate more shareholder value. There exists no other mechanism, so obviously we will never do anything other than that.
But at the risk of aligning mirrors to other mirrors and hollowing out the essence of it: computers have been essential to the evolution of modern music, but AI won't evolve it anywhere, because it needs to mirror human work, and without people to do that it's a sad dead end. But I doubt people will stop learning instruments and stop making music the old way, because it is too fun and meaningful. There's a possibility it will shift in magnitude in either direction, though. I hope it goes the way chess did, rather than pressing a button and a few faders and calling it music.
It's not rudderless, there's just a large amount of angst surrounding AI/ML ranging from "more ways to feed the copyright trolls" to "what should I raise my kids to do for a starter career?" and a lot of interpolated points in between.
You're totally okay not feeling this angst. But so are the folks who do.
I think it's really neat, but I also kind of go "meh". I've been into generative music stuff for a long time, but whenever I get to the end of the project I go "meh", and I don't really feel any different about this.
As I've watched the evolution of music generation with LLMs I feel like I just keep hearing drivel at greater fidelity. If you like it then by all means listen to it, but this is average or below. In some ways I think I prefer the more chaotic less coherent predecessors. They're a bit more interesting to my ear.
And as other posters have said: that doesn't really sound like Aphex Twin to me at all.
If I'm being honest, I have to agree. This is, like, the least interesting sound I've heard today. It's just a beat; I bet some things could sound cool eventually, but it's just ridiculously generic and kind of derivative. As well, I can tell it's AI-generated; it's got the same kind of stilted, just-holding-on-to-tempo quality that most voice generation has. Like it's mere moments away from entirely falling apart into machine screeching and creepy whisper sounds. Maybe there are better examples, but being introduced with this clip has really put me off the whole idea.
I get the same feeling every time I buy into the AI hype and try it for myself.
On stuff like art it's hard to judge objectively, but in things like code it's much simpler. Don't get me wrong there are cases where I find generative AI useful - but the hype machine and the unedited whole solutions are just straight garbage.
Because it doesn't sound like Aphex Twin, isn't particularly quirky, isn't skronky, and doesn't just hit me.
It sounds like output not resembling what you requested, and you're celebrating because for some random reason this particular prompt didn't sound totally horrible today. But it isn't intentionally making music, and it isn't particularly interesting music either. It's basically baby's first drum machine sort of stuff.
When it comes to music, the meme at this point is that "discovery is the problem." There is already so much music being made by so many people that the difficult part is connecting listeners to music they enjoy. It's tiring to see endless takes of "finally! we don't need artists anymore! we can just stand on their shoulders by training models on their work and then generate our own art instead!"
There's already so much art being made. Why have so much joy in ignoring all of it and focusing on generative AI instead?
Furthermore, there's another work posted on that FB account that has the caption:
"While I'm concerned about the possible impact on society - especially on the jobs front, I cant help but grin as the edifice of human exceptionalism is shred apart with every passing day."
Is that you? Why the misanthropy? "edifice of human exceptionalism"? As it applies to... making music?
Possibly. FAIR has always been doing great work and making it public though (PyTorch is so big that we forget about it sometimes). Sadly 'we sell ads' is going to remain the case unless product people ask users to pony up some cash to use this tech. To be fair, I would totally chuck some cash to play with something like this and I can easily imagine a world in which this technology is used to power some bizarre social experiences like an online drum circle or some such.
Infinite music is interesting from the angle that the music that we value is connected to our cultural and social experience. How can we cherish a song that has never been heard before and will never be heard again, which means it is deprived of social context that would give it meaning?
One answer would be to create music that shares its roots with music that the listener already knows. This music could be enjoyable, but you can't exactly sing along to a melody you're hearing for the first and last time, so it has more limited engagement potential. This is an approach to composition that you learn when you study chord progressions and other elements in music theory, and it's what I'm sensing when I listen to the MusicGen outputs.
To draw from greater cultural context, you can incorporate folk and popular melodies that are widely known. Musicians love this trick. "Immature artists copy, great artists steal." MusicGen seems capable of doing this, too.
To promote a novel melody as something that listeners deeply cherish, or to innovate at the level of the theory, the social context has to be built up around the content after it's generated. E.g., when introducing a new song on the radio, a common trick is to play it between songs that are already popular; building up co-occurrences with songs that already have cultural significance. My challenge to Meta would be: can you use your platform to transform some of the model's novel outputs into familiar popular music? It would be an important cultural milestone if an AI-generated melody became a familiar tune that would be played in the café, recognized, and enjoyed.
I've thought about this same idea before, but especially as it relates to film/television, and writing.
Think of your favourite TV show.
If, when you first watched that show, you had been told that no one else had ever watched it, or ever would, would your engagement with it be the same?
Part of our enjoyment of art is the shared cultural context. Maybe saying "art" here isn't the salient thing. Maybe it's our engagement with ideas.
I personally haven't even considered this concept as much in relation to music, because while I do love to deeply engage with music and the shared narrative behind it, both real and imagined, I also just like to put on music that sits in the background as a tool to drown out noise while I'm working, walking, etc.
> It would be an important cultural milestone if an AI-generated melody became a familiar tune that would be played in the café, recognized, and enjoyed
I can see why some might think this is silly; it seems unscientific. But what exactly is human-level performance in music, and how can we detect it?
If music is an artifact appreciated by listeners, then any metric apart from whether the music is listened to would be a proxy — though I can appreciate the perspective of creating for the artist’s own sake, without a need to share the creations.
Popularizing some of the model’s outputs would reveal their merit against human-produced music, by allowing them to succeed or fail in attaining the same quality of cultural significance.
Then again, if the artist here is the AI research team and the audience is AI enthusiasts, then the music has succeeded in being heard and attaining cultural significance. It has been remarked that Schoenberg’s music was more often defended than listened to — music written by a theorist for an audience of theorists. I am a true fan of Schoenberg’s, though I can hear that the example outputs of this work are music that is meant to be accessible to the everyday listener.
I was playing with it yesterday and it's not bad. I'd much rather use it for e.g. YouTube videos than risk getting copyright claimed for using something that already exists.
This is incredible!! For all the “AI is stealing our music” naysayers here: consider that all art is derivative (without that borrowed context it would be nonsensical), and artists learn from other artists too.
I don't know why you're mentioning art being derivative, but that's not the thing that worries me.
What worries me, is that a good enough model will take away the incentive to write music for many, and as a consequence it will also remove performers. This will reduce demand on music teaching and instruments, which will then both become nearly inaccessible. Since learning music isn't a question of following a few youtube videos, this will leave the world with just AI music.
Jazz and classical music are probably exempt, since they rely on subsidies, their audiences care about the actual performance, and AI compositions will not draw enough of a crowd to make them financially interesting.
But popular music will suffer, and that's what makes development of these models straight evil.
Except for the 2 min lo-fi demo, I found the examples to be pretty bad. Sounds like the music is being played in a cardboard box in your garage’s corner.
This is still pretty deep in the uncanny valley. None of the rock examples even sounds very much like guitar. One sounds more like trance, the others more like metal than rock (though interestingly, trance is a lot more similar to metal than you might think; it's just hard to notice at the surface level due to the very different instruments).
Then again, in this case I don't mind. I'm sure someone like Simon Posford could do some really wacky sampling based off of this.
Don't see myself using it to make music just for my own listening, though (not much of a composer). That's still a long way off.
The question that I think reveals the kind of potential ML has for music is: how good is written language at describing music? People here compare this to the development of Stable Diffusion, but the mediums are quite different. Describe your favorite shot in a movie and most likely you’ll enumerate its elements and get pretty far in conveying it to someone who hasn’t seen it, at least its physical aspect. But try describing your favorite song and the interpretations will be wildly different (outside of hard-defined genre music).
> The question that I think reveals the kind of potential ML has for music is: how good is written language at describing music?
If you took a DAW project file of a song.wav that was completely written and produced digitally using virtual instruments, and compiled into a .csv file all of the parameters the user had to set to achieve their output.wav, you might be surprised to see (1) how few parameters were used, (2) how often those parameters are unchanged from their defaults, and (3) how many of those parameters would be expected in any other project file.
When you break it down, you really only have 6 layers to parse, all of which are dynamic but within a relatively small and consistent sandbox, at least relative to image generation.
1. composition layer - the MIDI or notes of the song.
2. arrangement layer - the selection of instruments used in the song and the division of the song's MIDI among those instruments.
3. instrument layer - the parameters of each instrument, such as a synth patch or a virtual piano's room setting.
4. post-processing layer - the effects placed on the output of each instrument, such as reverb, compression, delay, etc.
5. mixing layer - the volume of each instrument + post-processing channel.
6. mastering layer - processing on the master track.
All of these things are more or less standardized. Developers always add their own flair (read: custom parameters) for their plugins, but those can be decomposed into combinations of each layer's fundamental parameters. All these parameters plus the MIDI of a song would be a few KB; a sketch of what that representation might look like is below.
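To make that concrete, here's a minimal sketch of such a representation (every class and field name here is made up for illustration; a real DAW project file is messier, but reduces to roughly this shape):

```python
from dataclasses import dataclass, field

# Hypothetical minimal encoding of the six layers above; the field names
# are illustrative, not any real DAW's schema.

@dataclass
class Note:                          # 1. composition layer
    pitch: int                       # MIDI note number, 0-127
    start_beats: float
    duration_beats: float
    velocity: int = 100

@dataclass
class Track:                         # 2-5. arrangement/instrument/post/mixing
    instrument: str                                          # e.g. "analog_synth"
    instrument_params: dict = field(default_factory=dict)    # patch settings
    effects: list = field(default_factory=list)              # e.g. [("reverb", {"wet": 0.3})]
    notes: list = field(default_factory=list)                # this track's slice of the MIDI
    volume_db: float = 0.0                                   # mixing-layer fader

@dataclass
class Project:
    tempo_bpm: float = 120.0
    tracks: list = field(default_factory=list)
    master_effects: list = field(default_factory=list)       # 6. mastering layer
```

Serialized, a structure like this really does come out to a few KB, which is why the search space looks so much smaller than raw audio.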
I feel like an LLM trained on these parameter sets, which interacts with the software used to manipulate these layers, could produce amazing tools and open the door to writing high-quality songs for everyone, just as other AI products have opened so many similar doors.
The DALL-E app for music, in my mind, probably won't be a text-description-to-.wav pipeline. Instead, it would generate the elements of each layer, with options that can be auditioned in real time using whatever VSTs were used in training. When you ask ChatGPT to write a complex Python script, it starts with an outline of all the methods in the script as placeholders and then takes you step by step until you're done, then you troubleshoot it or flesh it out. The best part of generative music like this is that it leaves the user with only having to decide whether something sounds good or not; roughly the loop sketched below.
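A rough sketch of what that audition-and-pick workflow could look like; `generate_options`, `render`, and `audition` are all hypothetical stand-ins for a model, a VST host, and an audio player. The point is the shape of the loop, not the model:

```python
# Hypothetical interactive loop: the model proposes options per layer,
# and the user just listens and picks.

LAYERS = ["composition", "arrangement", "instrument",
          "post_processing", "mixing", "mastering"]

def produce_song(prompt, generate_options, render, audition):
    project = {}
    for layer in LAYERS:
        # Model proposes a handful of candidates for this layer,
        # conditioned on everything already chosen.
        options = generate_options(prompt, layer, project, n=4)
        for i, option in enumerate(options):
            print(f"{layer} option {i}:")
            audition(render({**project, layer: option}))  # listen in real time
        choice = int(input(f"{layer}: pick an option (0-3): "))
        project[layer] = options[choice]
    return project
```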
As a mostly musically illiterate producer myself, I've produced hundreds of songs and a few albums without ever really learning how to do anything other than manipulate the parameters. When I started learning to produce music I was 15 years old and knew nothing about music production. But what I was really good at was using computers and software, so I learned to play the DAW, the plugins, and the sample packs. The only layer I couldn't learn through learning software was the composition, the writing of the MIDI. Fortunately, the MIDI of a song becomes very easy to brute-force over time, so I learned to brute-force MIDI. Once I became efficient with my workflow, producing music became a task of "make this idea sound good." And without ever really feeling like I was a musician or composer, this became an enormous passion and outlet for me that I pursued every day for a decade.
I was able to do this because, at its core, all the mechanical parts of a song are simple machines, and a song's quality is the way those machines are used together. As an outsider, this feels like a workflow that would be very machine-learning friendly. But I could be wrong!
Google has actually released Magenta [0], a plugin for Ableton that can do just that. It's pretty cool; it generates interesting melodic content at the click of a button.
I’m a musician and have spent over 20 years making music. It’s my biggest passion. I still love this. Music isn’t a competition to me so it doesn’t bother me that someone with little experience can create great sounding music. I can still make music that’s an expression of myself, or I can generate some AI music to see if something catches my ear as a starting point. It’s just another tool, and I’m excited to see how it evolves.
Pretty good at transcribing the text, but the output music feels, for lack of a better word, “safe”. For example, the kick beat is way too generic and soft.
Style transfer is a pretty cool tool for someone who likes to play around with sound design. It will be a lot of fun to drop this on parallel channels and blend it together into choruses and new instruments.
Right now the generation still sounds a lot like loop packs smashed together, which anyone could technically make. But it is practical for anyone who only cares for that style of sound and doesn't have the familiarity to make it themselves. Now they can just say what they want and hit regenerate, skipping the high-latency feedback cycle of iterating with humans or sifting through song snippets.
My opinion on this style of content is that AI generation is simply accelerating us toward the inevitable end state of generic digital content, not really changing it. It just happens to also be the optimal interface for discovery, not just generation.
These are really impressive. Provided the ‘prompter-composer’ can generate high-quality stereo mixes (these all appear to be low-bitrate mono files), you could probably use them right now for various ‘wallpaper music’ purposes, where originality is not required or is even a detriment: ‘on hold’ music, low-budget video productions, symphonic film scores, etc.
Producing this kind of anti-music is pretty soul-destroying for a musician, anyway, so the machines might as well do it. We can then spend our time working on stuff that means something, and if we’re lucky and it connects with enough people, make a living from it.
Before using it, just listen to the samples. It doesn't have any structure, even within 30 seconds. It's repetitive and boring. It could be used as background music in a low-budget game, but that's it. It's not a threat to human talent, not yet. It's like the first text generators with tiny attention windows; it took some time for those to evolve into GPT-4. Then it will have an impact. Bans and unions may delay that for some time.
Music is hard to describe well without using artist names or references to specific songs. There isn't an alternative way to really describe things - "Airy EDM with tropical feel" doesn't cut it.
This space will belong to scrappy shadowy decentralised organisations who let you type "give me a filtered french disco song using mizell brothers era johnny hammond jazz funk samples, lil uzi rapping, with a thundercat bassline and crooning"
As if the current music scene hasn't already plumbed the depths of banality. At this rate the stars of the 70s and 80s will be in business until they expire.
Music is already one of the most extremely devalued art forms given how oversupplied it is. Boggles the mind to think of the consequences of technology like this reaching quality levels where the differences between it and professionally produced music are imperceptible.
One consequence is bespoke music that changes dynamically, e.g. in games.
There's also a relative dearth of royalty free music for independent content creators to use. AI would enable them to produce better content on a limited budget.
People who enjoy creating music from scratch will be unaffected - recognition and financial rewards are tiny already for most.
Do any of these generative music ML frameworks export MIDI instead of .wav or .mp3? That would be 1000x more useful, as the quality we're reaching is good enough.
I can imagine a VSTi that just takes a prompt and generates MIDI tracks. Something like this is surely coming in the next couple of years; the plumbing could look like the sketch below.
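Even before a full VSTi exists, the plumbing is simple. Here's a minimal sketch using mido (a real MIDI library); the prompt-to-notes model call is a hypothetical stub:

```python
import mido

def notes_to_midi(notes, path, tempo_bpm=120, ticks_per_beat=480):
    """notes: list of (pitch, start_beats, duration_beats) tuples."""
    mid = mido.MidiFile(ticks_per_beat=ticks_per_beat)
    track = mido.MidiTrack()
    mid.tracks.append(track)
    track.append(mido.MetaMessage("set_tempo", tempo=mido.bpm2tempo(tempo_bpm)))

    # Expand notes into absolute-tick on/off events, then convert to deltas.
    events = []
    for pitch, start, dur in notes:
        events.append((int(start * ticks_per_beat), "note_on", pitch))
        events.append((int((start + dur) * ticks_per_beat), "note_off", pitch))
    events.sort()  # note_off sorts before note_on at equal ticks, avoiding stuck notes
    now = 0
    for tick, kind, pitch in events:
        track.append(mido.Message(kind, note=pitch, velocity=64, time=tick - now))
        now = tick
    mid.save(path)

# notes = prompt_to_notes("moody minor-key arpeggio")   # hypothetical model call
notes = [(57, 0, 1), (60, 1, 1), (64, 2, 1), (60, 3, 1)]  # placeholder A-minor arpeggio
notes_to_midi(notes, "sketch.mid")
```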
That was the way almost all ML research on music was done until recently: train for MIDI generation with input of other MIDI or, at an even more fundamental level, just notes. If you go back, tons of ML music papers were written on generating believable sequences of Bach, Mozart, etc., because it was just note prediction.
Since the advent of transformers, with text models mapping the natural-language space to tagged music samples and the music tokenizer acting directly on the sampled audio stream (the bits of a .wav file, essentially), all the cutting-edge work is going that route, because it produces high-quality, finished audio streams directly. And I think part of it is that there is way, way more training data for actual audio than for MIDI alone (there are tons of free MIDI sites out there, but a lot of it is garbage, and it pales in comparison to what is already sampled and tagged in real audio libraries).
I imagine what will happen is... within two to three years these LM-transformer music models will get so good that the audio will be damn near spotless, and additional methods will be developed to synthesize with more control directly from the models, to the point where wanting MIDI so you can use your own HQ synth isn't needed: if you want "the lead synth to sound less digital and more like a classic Minimoog Model D", you just add that to another "Music2Music" pass and out pops your sound.
For those who still want MIDI there is still work being done on traditional audio-to-MIDI modeling and I think you'd wind up just using that in the chain.
I just got excited at a mention of Summits On The Air and found a boring AI article.
It's an amateur radio thing, btw... You go up a mountain and make use of the good propagation to call other hams. You accrue points that are as valuable as HN points are!
Amazing work by FAIR! With tools like these, AI-enhanced creativity is going to give us hyper specialized output and thankfully further us even more from mainstream, one-size-fits-all art. We are living in the future!
I'm curious how this handles bass music where the lower frequencies need to be tightly mixed to sound good. Is that really something that a model can learn?
As a composer I may be biased, but AI "generating" music is just sad. The hypocrisy is that musicians have been suing each other for intellectual property reasons, while this thing is being trained on everyone's music. The law should catch up on this.
I get that it's going to improve but for now it's also just elevator/supermarket music.
You make it sound like some force of nature is causing this to occur. These things exist because people are making them, despite there no longer being a healthy reason for doing so.
(I’m not saying there’s reason for AI development in general to stop, but these generative things that are designed to slot neatly into the role of human artists specifically have no reason to be developed further beyond proving it was possible, and that happened a while ago.)
The force of nature is that our society will utilize any technology before even considering its ramifications. "Touchscreens in cars? Let's do it!"
I think you might enjoy Neil Postman's book "Technopoly", which discusses the subject of weighing the pros and cons of a subject instead of just diving in headfirst every time some new technology is developed. His YouTube talks are also great.
I think that’s more of an emergent behavior, not a force of nature. But I agree that enough people might do stupid stuff to create this kind of emergent behavior, which seems to be happening now. Like maybe 90% or more of people think these art AIs are distasteful, but enough people can’t stop themselves from filling in the blank square where something is possible to create but hasn’t been created yet, so people keep trying to make it. Even though there doesn’t seem to be any upside or goal, since I’ve never gotten an explanation of one.
And I think if you’re going to create something that has a little bit of potential to harm at least a few people, you should at least have a decent goal or reason for creating it.
Life is a force of nature. Evolution is a force of nature. The development of society (across species, not just human) is a force of nature. The development of technology is an aspect of that. In other words, macro-economic trends such as the development of automation ARE in fact an aspect of evolution, aka, a force of nature.
Doesn't matter if that's not the sense in which the phrase is used, these things are arising out of the collective unconscious, not as the result of mere individual will.
GDPR has a severely muted effect because Americans are still doing the same things they were before. It'll be even more ineffectual for AI. Nearly all AI research you hear about is being done in the United States. European regulations will only stop Europeans from using it, but you won't be able to escape it anyway because of how much American culture is continuously imported into Europe. Meanwhile, the reverse is almost completely not true; very little European culture makes it into American culture. This will just kneecap European creators and companies. I wish Europe the best of luck with this.
Since we are talking about music, maybe I'm living in a European bubble but last time I was in the U.S. people were listening to classical music (basically OG Euro music), Beatles, ABBA, Elton John or newer stuff like anything involving David Guetta whatever. Plenty of European music being listened to in my niche (metal) as well. Music knows no borders.
That's implementation details. It could be the case that no art produced in the EU can be used as training data (or similar), not necessarily that EU AI models are forbidden from being trained on art. I find the former case the most probable.
But they have tastes and preferences in a way that the models lack — unless, possibly, if you have them retrained sufficiently long by a single individual or small group, I guess.
This isn’t about specialness, it’s about the foundations of civil society. People are meat and guts humans.
We’re autonomous entities capable of higher reasoning, limited in time, attention and talent and eventually die allowing new people time to flourish.
We also pay taxes and make silly arguments about software and humans being no different from each other because it justifies our ability to play with cool toys without considering the impact on other people. Corporations aren’t people though because they aren’t cool like AI.
I don’t understand what’s motivating certain types of people to continue working on these types of AI practical implementation projects. There’s nothing good for the world this (type of AI in particular) will offer.
Maybe someone will say this will let people who aren’t musicians express themselves by creating music. That’s not true. It’s as true as hiring a musician to make a song for you, given a description. And nobody would say that the person who hired the musician was expressing themself.
Instead they seem to be all in on washing out any hope in creativity and pointing people to put all their hope in minting and munging “code”.
It’s so myopic and short sighted it hurts my soul. I don’t understand at all. All that money, all that knowledge and talent… and this and stupid headsets strapped to peoples faces is the game? God dammit.
Musicianship already isn’t a profitable enterprise for most, and yet kids continue to learn piano and get good enough to go to Juilliard. I doubt that will change.
There’s already an endless supply of free and royalty free music in every style. While an AI can now also generate that for you as well, it was not necessary to create the AI to meet your goal and requirement.
By definition, having an AI generate something for you instead of making it yourself means you can’t create exactly what you want. You can probably get it in the ballpark, though. But that’s exactly the same scenario we have today with musicians.
The same way we traded away the profession of painters for photography, it seems like we might be trading away musicians for generative music AI. Except photography is super useful for lots of things and truly benefited humankind, and generative music AI… only replaces musicians? I have no idea why we would make this trade, as a society.
> The same way we traded away the profession of painters for photography
People still paint? Even portraits, although I'll acknowledge it's far less popular than it was in the 1700s. A town full of reproduction artists in China seems a bigger threat to capitalistic Western artists than an AI that will mostly provide generic output.
Let's say that ebooks now include metadata for soundtrack generation as you read them. Something like this model generates it in real time based on the user's reading speed, etc.
That does sound cool, but you don’t need a purely generative AI to do this. Dealing with a reader who jumps around, re-reads paragraphs, flips back a few pages for a moment, etc. in a coherent way seems like the more difficult and interesting problem; see the sketch below.
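A sketch of just that non-generative part, with all names hypothetical: debounce the reader's position so the soundtrack only retargets once they actually settle on a passage, instead of thrashing on every page flip.

```python
import time

# Hypothetical controller: only retarget the soundtrack once the reader
# has settled somewhere, so page-flipping doesn't thrash the music.

SETTLE_SECONDS = 8.0   # how long a position must hold before we react

class SoundtrackController:
    def __init__(self, generate_cue, crossfade):
        self.generate_cue = generate_cue   # stub: scene_metadata -> audio
        self.crossfade = crossfade         # stub: audio -> None
        self.candidate = None              # scene the reader is currently on
        self.since = 0.0                   # when they landed on it
        self.current_scene = None          # scene the music is playing for

    def on_position(self, scene_metadata, now=None):
        now = time.monotonic() if now is None else now
        if scene_metadata != self.candidate:
            self.candidate, self.since = scene_metadata, now   # reset the timer
        settled = (now - self.since) >= SETTLE_SECONDS
        if settled and scene_metadata != self.current_scene:
            self.current_scene = scene_metadata
            self.crossfade(self.generate_cue(scene_metadata))
```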
This kind of automated "filler" music has been around for decades, and is usually used for exactly that - filler. It's pretty much the stock photos of music.
And that could be a good thing - suddenly content-creators don't have to spend money or energy on purchasing that kind of stuff.
If you've ever seen YouTube automation videos, typically those "TOP N" list vids, they always contain some kind of muzak-style soundtrack.
Sounds like the "is DJing an art form" debate. :o)
Unlike classic "hiring a musician", here it's practical to "hire" the (robot) musician 10,000 times with a feedback loop between the model and the prompt writer, iterating and picking the best output(s), which looks like a similar process to other exercises considered art forms. Something like the loop below.
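As a sketch, where `generate` is the hypothetical model and `rate` is a human listening (which is the whole point):

```python
# Hypothetical curation loop: the craft lives in the prompt edits and the picking.
def curate(prompt, generate, rate, rounds=10, n=8):
    best, best_score = None, float("-inf")
    for _ in range(rounds):
        for clip in generate(prompt, n=n):   # n candidate outputs per round
            score = rate(clip)               # a human listening, not a metric
            if score > best_score:
                best, best_score = clip, score
        tweak = input(f"best so far scored {best_score}; tweak prompt (or Enter): ")
        prompt = tweak or prompt
    return best
```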
> I don’t understand what’s motivating certain types of people to continue working on these types of AI practical implementation projects. There’s nothing good for the world this (type of AI in particular) will offer.
Man wants to dominate nature and always has. I don't think this is particularly difficult motivation to understand, as it seems omnipresent.
Maybe I'm just getting older but I feel like the quality of both music and film has seriously declined over the last 5-10 years. Maybe the good stuff is still out there but lost in a sea of average garbage that has surfaced to the top.
Something tells me AI isn't going to rescue us either. I just sampled a bunch of these generated tracks and they immediately remind me of the average, mediocre, soul-lacking content that most music and film is today.
When I was 20 I was a music snob into Aphex Twin and weird IDM. I thought all pop at the time was crap, like you seem to. But then I heard, I mean like really heard, "Bye Bye Bye" by *NSYNC and seriously that is a good song!
I'm 40 now and I think it got way better even since then. Pop is so varied now! I really don't think music as quirky and weird as, say, Billie Eilish would've made it to the top of the charts in the 90s. I'd say that music like hers (and many charting artists of her generation) is a testament to how broad and compelling pop music has become.
My generation thought their parents' music was shit, my parents' generation thought their parents' music was shit, and so on, all the way until at least the invention of Jazz. But the average Gen-Z'er thinks all the music is great! They invent new genres for every song, they wear Metallica t-shirts in 2023, and they mix 80s disco with 00's Brit rock like it's just what people do.
And don't forget there's an endless long tail of music out there. There are so many good musicians and plenty of them have a sufficiently fancy label deal to be on Spotify and the likes. And otherwise they're still on Soundcloud, Bandcamp and YouTube. It's worth a deep dive!
> They invent new genres for every song, they wear Metallica t-shirts in 2023, and they mix 80s disco with 00's Brit rock like it's just what people do.
If this appeals to you, it's worth checking out Japanese music from the Showa era to the present. They've long mixed styles in a way other music markets have not. You can hear city pop songs from the 80s with metal guitar solos, jazz progressions, a samba beat, and synths, all in the same song.
A flood of garbage film/music has always existed; we just don't remember it because it's uninteresting.
However, I think modern rec algorithms (like the Netflix home page) are recommending more mediocre stuff than the old system, and the streaming boom did produce an abnormal glut of junk.
Anyway, I think AI is going to spawn a music remixing/game modding/TV-extending renaissance. These models perform much better when pointed at a good source (as you can see with the melody-conditioning samples, and other stuff like SD img2img and fine-tuned LLMs).
I won’t speak to music, as I listen to a lot of stuff, enough to know there is good stuff out there being made.
But for movies and TV? Where do I find the good stuff? It seems Hollywood is creatively bankrupt and just milking boring franchises and cheap nostalgia through crappy remakes and sequels. My eyes rolled to the back of my head when I saw an ad for a show called “How I Met Your Father” on Hulu.
> But for movies and TV? Where do I find the good stuff?
There’s a trove of incredible foreign movies and TV shows out there. Scandinavian and Asian (Korean in particular) content has a really good hit to miss ratio for me.
For examples, check out international film festival nominations and winners.
Reelgood is a good one; sort by IMDB score (which is somehow still kinda working as a metric) or the Reelgood score, which is a popularity-among-enthusiasts kind of ranking. You will find TV gems that streaming services criminally and inexplicably never recommend.
But "old school" recommendations from TV /movie buffs (like the tvtropes community or various forums) are still a good source.
Those “boring franchises” are what bankroll the passion projects, artsy festival bound movies, and experimental content.
As far as content goes, there has been a ton of excellent stuff just this year across movies, TV, and anime. One “organic” way to start is to look for recent recommendation threads on Reddit for a movie or show you really like.
See, I've seen this stated many places but never explained. How exactly does the money made by derivative bullshit go into valuable, passionate art projects and not either directly into pockets or into the next billion-dollar derivative bullshit thing?
Generally, the bullshit costs way more to produce, market, and advertise. And at least on paper, a significant amount of the money made is only recouping the costs of the 3 hours of incredible CGI it took to make a 3rd 'Ant-Man' or a 4th 'Jurassic Park'. The majority of actual indie art films cost ridiculously less than that, because they're filming a movie, not a commercial.
Anyway, my opinions aside, are there any articles with cited money trails that prove that billion-dollar blockbusters actually fund valuable art and not just executives' yachts?
Having a good time with Trakt for discovery and rating. It has a very active app/plugin/webhook ecosystem, and I've gotten some great recommendations from it by scrobbling via Plex and following a few people with similar preferences on there.
I don't know what you like, but "prestige TV" seems to be where writers, directors, and actors who want to do something other than another retread of some studio's IP backlog end up.
It was a smaller flood though, when it required lots of money to record an album / make a movie. Gatekeepers kept most of it out. Now anybody can do it, so there is both a lot more chaff to sort through, and an outpouring of creativity.
I disagree that key changes in popular music are a great measure of complexity. For many years a key change near the end of the song was an easy way to give the sense of a climax. The article your link is based on gives a good summary of it:
> The act of shifting a song’s key up either a half step or a whole step (i.e. one or two notes on the keyboard) near the end of the song, was the most popular key change for decades. In fact, 52 percent of key changes found in number one hits between 1958 and 1990 employ this change. You can hear it on “My Girl,” “I Wanna Dance With Somebody,” and “Livin’ on a Prayer,” among many others.
To me, this just reflects one set of songwriters' cliches being replaced by another. Not necessarily better or worse.
While I do agree generally about key changes, I think the point is that a key change is just an example of something that sounds _interesting_. It's not just key changes, but all the little chances an actual artist takes during creation; the things that sound good to some and bad to others are exactly what makes art, art. The change being witnessed isn't the loss of key changes, but the loss of everything that sounds different or interesting, in favor of a sound that is generally palatable to everyone precisely because it does not contain anything interesting.
How about time signature changes, then? Not too many popular songs experiment much anymore. What was the last popular hit with a really odd meter (or various meters)? I know, not everyone can be Rush, but it’s pretty vanilla today.
That's honestly what it feels like. It feels like all music and film has regressed toward some boring mean. There's not enough range, emotion, and difference to find tracks that really stand out from the crowd.
Music especially just feels flat. Maybe that's just the style now, and I'm old and can't appreciate it.
Honestly, gaming is in a similar rut although not quite as bad thanks to VR.
There's plenty of really good creative music. If you only watch Marvel and Top 20 hits you won't know it, but there's plenty of good stuff out there. I've really enjoyed the last couple of Bon Iver releases, and my favorite artist, The Tallest Man On Earth, just released his new album Henry St., containing some super personal tracks.
You are getting older. Every generation thinks the same, that media is getting worse, discounting the survivorship bias that occurs when they look back on their favorite music and discarding all of the bad music that was present back then.
This is no doubt true, but there are a number of studies which suggest that, at least in the case of music, things really have gotten rather worse over the last two decades, thanks to corporatization and consolidation of the production model.
I don't understand this view. Heavily commercialised music has almost never been all that great anyway. Except very occasionally. Most of it is LCD garbage. Maybe the garbage has become even more garbage, I don't know. But why judge an art form by the boring average.
There's so much great new music being made every year, new genres and ideas, etc. Film music seems better than ever recently. Especially for TV series. Lots of new styles emerging there too, see Mac Quayle for instance.
The really good, modern music was almost always on the fringes, and there's more of it now than ever before.
There might also be more garbage, but there's no need to listen to it.
This is only if you listen to the most mainstream, general audience top 40 pop content-sludge.
There is an overwhelming amount of good music out there. Pick an album top 50 list from 2022, for example fantano's, or pitchfork, check out bandcamp's staff picks, listen to other musicians that are on the same label as your favourite band, keep an eye on things like NPR Tiny Desk, KEXP, la blogothèque on YouTube.
Just start listening. You are almost guaranteed to stumble upon something you like. It won't come to you algorithmically but the effort required is really low.
My favourite new album I discovered last year was Immanuel Wilkins' The 7th Hand [1]; I stumbled upon it by going through a top-20 jazz albums of 2022 list to see if I had missed anything, and it immediately jumped out at me as being exactly the shit I'm into.
The research on this topic that I'm aware of fails to account for the fact that the top 40/100 lists are less representative of what people are actually listening to than they used to be. If Drake can drop an album and have every song on it chart on the Hot 100 for a week or two, that's going to influence the analysis. That simply wasn't possible before music downloads/streaming. You can see the impact on the chart records -- artists from the past decade dominate.
ETA: And "worse" in these studies tends to be defined in terms of measurable qualities where contemporary pop music most differs from "classical" music.
There are more bedroom indie music producers than ever. EDM and "rap" are better than ever with many many good artists to choose from. One of the biggest breakout rap artist right now was just a random 20 year old working with other random bedroom producers just a couple years ago.
In the 1950s there was no metal, chiptunes, EDM and other stuff like that. We had all that by the 80s and 90s. But what can we make now that we could not have made 20-30 years ago? Seems like that we've reached "the end of history" in music. I haven't seen anything new appearing.
Computers can already generate every possible waveform that can be heard by human ears (a sample rate of 44.1 kHz already covers the full audible band, since it captures everything up to half the sample rate). There's nowhere else to go. Anything above 16 kHz is useless and can't be heard by most adults anyway.
I'm not saying that music is worse. I'm saying that we have reached the "end of history" and there will be no more new radical changes like when genres like rock, metal and edm appeared.
That seems like a different point than the one I made that you replied to, so sorry for the confusion.
I'm not too sure about end of history, there might well be other genres and styles popping up. Just because we always had the ability to paint on a canvas doesn't mean pointillism or cubism didn't emerge. Just because we can put any letters on a page doesn't mean new genres of fiction didn't emerge. The fact that we can generate any waveform is analogous to these (ie, having complete control over a medium) and doesn't indicate anything about what might emerge in the future.
We've had that capability for at least 3 decades and nothing completely new has appeared since then. Something completely new like rock, metal or edm. You could not have made those during the 30s - the necessary technology hadn't been invented yet.
Can you provide some sources? Last time I looked into statistics on these topics I found the opposite to be true.
The Rosling brothers have a nice talk about how all the big stats are improving globally (gender equality, education, health, extreme poverty, life expectancy).
Sources on these? Society has been far better than 100 years ago. And broken families? Or people marrying early due to societal pressure and not being able to divorce back then (whether legally or societally), who are now finally able to do so. The divorce rate is actually going down simply because people are marrying when they want to, not when society pressures them to.
I agree with you. When gwern investigated AI folk music in 2019, I realized it could generate a wonderful variety of music, full of soul. Be sure to listen to several tracks before making up your mind. My favorite is “crossing the channel”, since I think GPT made a mistake at the beginning, and then generated the most reasonable sounding not-mistake, which turned out to sound so cool.
My goal was strong, memorable melodies. Star Wars, not Marvel. GPT can come surprisingly close, if the input data format is right. Unfortunately I don’t think anyone except gwern has noticed that the input format is crucial: https://gwern.net/gpt-2-music
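For context, the gwern experiments (if I remember the linked page right) worked on ABC notation, which is about as LM-friendly as a music format gets. A hand-written illustration, not a model output:

```python
# A hand-written ABC-notation snippet, for illustration only (not model output).
# A few text header fields plus a bar-delimited melody make an entire tune a
# short, flat character sequence: exactly the kind of input GPT-2 tokenizes well.
abc_tune = """X:1
T:Illustrative Reel
M:4/4
L:1/8
K:D
|:d2fd Adfd|d2fd g2fe|d2fd Adfd|B2eB e2dB:|"""
print(abc_tune)
```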
Sadly no, and nowhere. And with my primary focus being gamedev for the foreseeable future, the only way I see it being resurrected is if I need some generated music. That’s fairly low on the priority list for now, but it might preempt other things. https://github.com/shawwn/noh
To be immodest for a moment, my work serves as an example that it’s possible to do it, and better than anyone else, long before they figure out how. Many examples of this pop up throughout history, and I am gratified to be a small but real one.
I don’t know if gwern realized how powerful his model was. His examples are underwhelming, because you have to prompt it in a certain way to get it to generate chords. He was showing me samples and they were neat, but boring.
One day he posted something that sounded pretty amazing, and I was blown away. “More like that, please.” It had chords in it.
He didn’t pursue it past that. I did. So it’s possible that no one is aware of how crucial the input format actually is to the success of the music that I was able to produce.
(And “produce” is a fair description here; choosing the instruments was really important, and the model didn’t do it. It wasn’t as easy as pressing a button. It felt like I was suddenly a 15x music producer, since I made all those tracks in one night. Such is the power of ML.)
Do you have a write up anywhere with samples? I'd love to hear some of the better examples you have. I agree that most of what's out there is underwhelming
Unfortunately I didn’t do a writeup (which gwern has given me a hard time about over the years, and he’s quite right!), so I have nothing to offer beyond those songs as a finished product. Maybe one day I’ll try to resurrect it for devs.
You really have to dig for good music these days. The record industry is a zombie at this point and no longer does the job of discovering good music. It just churns out utterly formulaic pop that might as well be the output of a music generator like this.
You're getting older. I'm 62 and moving air always inspires me. The shit my children listen to annoys me as much as the shit I listened to that annoyed my parents.
> remind me of the average, mediocre, soul-lacking content
This model is just the equivalent of GPT-2 for music. It's not GPT-4 yet. Music is trailing language by a few years. It used to be that language was about 5 years behind vision; now language is on top.
This is why I paid extra for services like Tidal and Roon. Their music recommendations are just better than any AI-driven stuff. You need actual human experts to curate playlists and such. I feel like the algorithm-based stuff is just a race to the middle.
This is correct. I have about 2TB of (royalty-free) samples that I pick and chop and place and resample when making music. I actually get more original the more material I have to work with.