Hacker News
[dupe] AI-generated sad girl with piano performs the text of the MIT License (suno.com)
666 points by hongsy 8 months ago | 368 comments




Yeah, that was posted just last week.


These songs don't have any hooks. Melodies don't get repeated. They all just meander along.

I think as they get better at making these great middle of the road songs, edgy music will reemerge. Whatever AI gets good at will immediately be devalued in the realm of the arts, just as photography gave way to non-realistic art and drum machines made sloppy drums (or ridiculous Aphex Twin) hip. Then the artifacts we now see as flaws will create their own subgenre.

So I see three ways this flows: mediocrity will be even more available, which will make artists who make mediocre music even less successful, pushing all human music further in human direction, except those using the unwanted artifacts of this new tech to create new subgenres.

Music, art, and fashion are, in the end, all about change. What we make now mostly means something in relation to what was already there. It's a big conversation, spanning millennia, and this isn't the last word.


> These songs don't have any hooks. Melodies don't get repeated. They all just meander along.

I've been playing around with it for a few days now. While I agree that it seems impossible to create songs with a more "sophisticated structure" (for lack of a better word off the top of my head), you still can get better results by fine-tuning, as is always the case.

If you just request "rock music" or "jazz", you get very dull, generic variants of the requested style. But on second thought, isn't that exactly what should happen? You throw all the rock music on this planet in a blender, turn it on, and what you get is the most average rock music there is.

If you spend some time spicing up your prompt with flowery language or just a bunch of adjectives, you can get a sound that seems less bland. When you supply lyrics, using square brackets to denote verses, chorus and bridges can also result in a somewhat more structured song, but I found that the AI is pretty lackluster in that regard and you often need several attempts until it follows these inline orders.

So yes, in its current form it has mostly a novelty factor, this is Stable Diffusion for music, but I can easily see this being useful for a small indie gamedev who needs some BGM, or an alternative to the YouTube music library. Instrumental sounds fine, it's mostly vocals that have this clear digital distortion if you pay attention. It's surprisingly good, but still bad.


>it seems impossible to create songs with a more "sophisticated structure"

Is it my impression, or if you just put:

    [verse]
    ..
    ..

    [chorus]
    ..
    ..
it works mostly OK?
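
For anyone who wants to script that layout, here is a minimal Python sketch of it. It only builds the text you would paste into the lyrics box; `tag_sections` is a made-up helper name, the lyric lines are placeholders, and nothing here calls Suno or guarantees that the tags will be respected.

```python
from typing import Iterable, Tuple


def tag_sections(sections: Iterable[Tuple[str, Iterable[str]]]) -> str:
    """Render (label, lines) pairs as bracket-tagged lyric blocks.

    Hypothetical helper: it only formats text and does not talk to any service.
    """
    blocks = []
    for label, lines in sections:
        # Each section starts with its tag on a line of its own.
        body = "\n".join(line.strip() for line in lines)
        blocks.append(f"[{label}]\n{body}")
    # A blank line separates one section from the next.
    return "\n\n".join(blocks)


# Placeholder lyrics; the chorus appears twice so it has a chance to act as a hook.
lyrics = tag_sections([
    ("verse", ["first verse line", "second verse line"]),
    ("chorus", ["chorus line", "chorus line again"]),
    ("verse", ["third verse line", "fourth verse line"]),
    ("chorus", ["chorus line", "chorus line again"]),
])
print(lyrics)
```

Repeating the chorus entry is a guess at what helps, based on reports in this thread rather than on any documented behaviour.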


Oh wow, that worked pretty well: https://suno.com/song/a82f064a-54e3-4317-b331-f0993726d702

The melody of the instrumental pre-chorus is repeated consistently, and it's even used as the hook in the chorus; the pre-chorus is actually building up to the chorus. I'm impressed.


It does a very convincing Finnish death metal.


Elevator music ...

Hmm, perhaps there is a business model there: an elevator that writes music about the people inside the elevator, and adapts as the situation changes.


I go full dystopian. Elevator music that delivers your compliance trainings in the corporate world...



And requires you to sing along to reach the desired floor.


Same, but with ads instead. Sing "mountain dew is for me and you" to get to your floor.


Sounds like a skit from Bourniston.



Let's not.

Unless you want the AI equivalent of this Family Guy sketch

https://www.youtube.com/watch?v=XZvWvfCSZ8M


That wasn’t even the Family Guy sketch I was thinking of!

“Red-headed lady, gonna eat an apple. She breathes on it first, wipes it on her bloooouuuuse…”



That could be very wholesome, but I doubt anyone would be willing to operate an unsupervised real-time generative AI that sings to random people.


Instant dystopia

Custom audio and video advertising based on profiling and location

Looking at my spam folder, that's what my world will look like then


No need for AI or face recognition to do that: a BT-equipped system in the elevator detects phones through their Bluetooth Device Address (sort of a MAC address for BT), then calls a central repository of names to pair that phone with an identity, DNS style, then calls an advertising seller (for example, Google) that will happily give back the appropriate ads according to target, time, location and context. Or, in an even more dystopian and much simpler scenario: some system app in the phone detects a beacon in the elevator through the above Device Address, then calls home (again, Google and others) and receives the ads to be shown in the elevator (or any other place with an audio/video system plus BT).


My immediate thought too when I heard this one from that other AI music app that was on the front page yesterday.

It's a description (Wikipedia intro by the looks) of Cola, but could easily be droning on about Coca Cola specifically while you're stuck in a lift.

https://sonauto.ai/songs/rsNs3yET01kQrFGiT1qS

We're gonna need adblock enabled earbuds soon lol


"Where do you want to go for lunch?"

[song slowly morphs into popular restaurant chain theme]


We're already there! Impose some structure in the textual representation of the song and it'll respect the structure musically:

example - "I only ate 3 cheeseburgers"

https://suno.com/song/c15f0251-fbac-4a30-a3e1-002dbc78cb79/

edit: yes, I agree this example amusingly reinforces the rest of what parent is saying


I know it isn't the point you are trying to make, but I can't think of a better way to reinforce OP's point of "mediocrity will be even more available" than a mid-00s style, pop country song about eating hamburgers. "Toby Keith but with less to say" might be the gold standard of mediocrity.



If they didn't require an account to be able to create a song, I would have started an epic AI song battle right here and right now :-p


You know what else you can do... google an existing song and copy the lyrics then make it make a version. It is HILARIOUS! https://suno.com/song/52a61a54-e25b-438a-908e-01074e6b75fa/


Thank you!


Your comment seems to presume that AI will not get any better than it is now. Imagine an AI that understands how to create deep, impactful music better than humans do, because it understands how music works at a biochemical level. Imagine it can even predict the dynamics you're describing, about employing "unwanted artifacts" in the music as a way to evolve new genres. It would no longer need to create such obviously derivative works at that point, and it could generate music that sounds completely unique to us. It may take a long time to reach that point, but when it does, the kinds of music that it generates won't be able to be dismissed so easily.


Your comment seems to presume that the line between what a human is and what an AI is will stay clear. I'm predicting that this line will be increasingly blurry. Some people see smartphones as cybernetic extensions. When I call someone across the globe, did I do that or my phone? Is that a capability I possess or my phone? Does it make sense to separate the two?

Even if AI gets way better, the one thing that I don't foresee changing is what makes things valuable and/or desirable. Sparsity. If everyone has it or can create it, it's not special. I think the GP was referring to this sparsity.


You're talking about agency there, I think. No, your phone didn't do that, you did.

However, if later down the line we create autonomous agents that can initiate the creation of said music themselves then I'd call that enough agency to say that the machine is "making music". Could probably almost do it now; tailored LLMs, image diffusion, music diffusion and you could have an ML agent that acts as a musical artist; posts to instagram etc with images of their persona "working on something new", releasing new tracks, bantering, etc. There are already AI OF stars apparently.

We could say "yeah but a human still set it up and told it to make music" but I would discount that; pretty sure no human has total agency, we are all impacted by our environment, peers, culture and all sorts of other external influences.

And I don't think sparsity changes things (maybe in the material world), but culture certainly does. Things are popular because they appeal to us in bulk; rarity/sparsity always results in higher effort for the payoff and so decreases popularity.


Alright, what if someone has a neural implant and that thing is running and/or connected to an AI? Are you still sure the line is clear and sharp in that scenario?


I think that requires a baseline reasoning capability that we see no signs of developing as of yet.

In the field of AI, only game playing / graph search has gotten to that level of superhuman capability.


>These songs don't have any hooks. Melodies don't get repeated. They all just meander along.

Yet. This is the worst it will ever be. Enjoy.


This is Suno; Udio, which was just released, is even better and seems to exhibit some edginess. https://x.com/minchoi/status/1778074187778683253?s=46

This will be used for jingles and scoring by low cost studios and marketers. Making money at music was already hard and will only get harder. But maybe this makes sense in a way, music creation just became much easier and accessible to more people.

Alongside this prompt-based music creation, we have AI-powered autotune and voice masking, which allows even the worst singers to sing perfectly. Popular music was already being retuned during recording and this just makes it easier. In hip hop, old songs' beats and verses are getting reused wholesale with little to no modification. See Jack Harlow’s First Class and Tyga’s Bops Goin Brazy. A lot of musical success is now business connections, studio promotion and timing, not being extraordinarily talented. Most rappers now don’t write lyrics down or freestyle; they record a line, pause, record another line, pause, erase the last one, rerecord, and then they’re done. This is the type of thing an LLM is made for. I think this means fewer works of art on the ‘radio’ and more garbage music force-fed by studios.


I tried udio a bit yesterday and found it produces clearer audio but the music it produces isn't nearly as good as suno. At least to my ear.


You can make them have a hook if you structure it right. The default prompt on Suno winds up as "verse, verse 2, chorus" for the structure, which indeed does sort of sound like a song, but not a song at the same time. When I changed things around and put a chorus in twice I wound up getting an actual song with a hook.

I assume the 2 minute limit has a lot to do with this. Often I find Suno generates anywhere between 40 seconds to 2 minutes, with the shorter generations having less of a recognizable structure as a song. If instead it was 3 to 4 minutes, I think it'd be radically different.


Yeah, I have noticed that Suno's lyric generation is very canned sometimes, but it also has timing problems: it often generates 3-4 (longish) lines for a verse where for the music there should be 2-3.


That's the ChatGPT style. It is extremely stereotypical when it comes to writing lyrics or poetry: 4 lines, rhyming, with metronome regularity. Both Suno & Udio seem to use ChatGPT for lyrics, and it's a terrible choice, which should be ripped out in favor of Claude or anything else. (Basically, any LLM which can pass the "write a non-rhyming poem" test prompt would be a better choice.)


> pushing all human music further in human direction

Might also just push us in the direction of less good music, if you can’t make any money as a musician before you are a genius.


> These songs don't have any hooks. Melodies don't get repeated. They all just meander along.

Perhaps this has more to do with the input rather than the capability of the AI. The license is somewhat meandering.

Also, there is a neat build-up to the ALL CAPS portion, when the song takes on a much more full and powerful quality. Plus the whisper on (the "Software"). It's obviously responding well to the features of the input itself.


> drum machines made sloppy drums (or ridiculous Aphex Twin) hip.

Other than The White Stripes, what popular bands have had clearly unskilled drummers? I actually enjoy listening to more amateur music, so this might be a request for recommendations of a sort.


Meg White is widely regarded as a great drummer by her peers. "Unskilled" is definitely not true: she just plays in a style that is devoid of virtuosity, and it fits the music very well.


Fair assessment. A very skilled artist could use an unskilled style - see Nirvana's cover of Where Did You Sleep Last Night or even John Travolta's dancing in Pulp Fiction.


> Other than The White Stripes, what popular bands have had clearly unskilled drummers?

I want to joke and say Lars from Metallica but I'd be lying. He was a decent drummer in the first couple albums.


Well, OP was talking about sloppy rhythms performed by drum machines, not about sloppy drummers.

When people talk about this, they often mean stuff like "Dilla Beats" or "tuplet feel", which is explained here: https://www.youtube.com/watch?v=9MzKx0fKg5o

Funny enough, this actually takes extra skill.


"Sloppy drums" != "Unskilled drummer".

Maybe sloppy isn't the best term here. More like not being beat-perfect. E.g. using slightly off-beat hits or slight tempo variation as a way to emphasize other instruments or lyrics.

Pre drum computers these changes would more likely have been considered to be sloppy drumming, whereas these are more likely to signal authenticity these days.


Don Henley was sloppy. Always a fraction of a beat behind. It turned out to be something of a hook in Eagles songs


"Always" implies not sloppy; sloppy playing would be sometimes ahead, sometimes behind, but by unpredictable amounts.

I should know, I'm a sloppy drummer ;)


Simpler rhythms can be made creative. I am not a drummer or an expert in researching popular bands, but e.g. The Ting Tings have a very interesting rhythm section.


AC/DC has notoriously simple drumming in every song. You can learn how to play the entire AC/DC catalogue in a single drum lesson.


Have you tried annotating the lyrics with stuff like (hook), (chorus), etc.? I've used ChatGPT to generate the lyrics for a random word, while giving it some artist names to draw inspiration from, and Suno manages to put in most of the appropriate musical characteristics of the annotated segments. Most notably the chorus (chorus) and outro (outro) in this example: https://suno.com/song/23e1d6de-569c-4c19-9551-cfda8e1bcd5a


> These songs don't have any hooks. Melodies don't get repeated. They all just meander along.

Isn’t that a consequence of the pasted text, which itself doesn’t repeat?

I checked their homepage and clicked the #1 trending song. It has notation regarding the chorus and verses and the melody does repeat.

https://suno.com/song/c15f0251-fbac-4a30-a3e1-002dbc78cb79/


Idk I've managed to generate some good-ish ones getting it to generate the lyrics itself as well:

A smelly wolf: https://suno.com/song/c2b30ffc-729f-405b-8f22-b1b5f36a7c6a

Trying to get a lid off a jar: https://suno.com/song/44562993-df45-4b56-939b-afd65832042f

Two people that are too different for each other: https://suno.com/song/c63c82fe-399a-41b8-91b2-f16911deaaf0

Vocaloid robot whose code is incompatible with love: https://suno.com/song/fbb51ff7-3f69-41b0-9d41-7f6b5f6a5d87 (I want to do something more with the hook in this one; "Heartbeat system, can you teach me to feel?" was touching)

An AI achieves sentience: https://suno.com/song/3eee76bd-2313-423f-87e3-035566b4718c

Mushroom psytrance: https://suno.com/song/a785adf8-92dd-4b13-acb7-93beb44ab7b2

Kernel panic dubstep: https://suno.com/song/5d37f5e7-3e62-4df8-a1d8-67122470aeff

Neuromancer: https://suno.com/song/d2705c66-4be5-496c-add1-480427b4a005

I mean it's definitely not perfect, but as per Two Minute Papers, "only one year later..."


Make a singing quine


"mediocrity will be even more available" - so basically all the stuff that is in the top 100 charts will be more available


Only on Hacker News could people find flaws with such an awesome idea and execution.

Not all songs have hooks or refrains. Only the most formulaic ones.


I don't think it's fair to conflate "having structure" with "being formulaic".

Even the most abstract art is not about splashing paint on a canvas and calling it a day. It's about doing something close to this, but creatively and within a specific framework.

It takes much more skill to produce songs with hooks and refrains than to produce random music.

But it takes even more skill to produce something that is creative, not formulaic, but still has some structure that's pleasant/fun/interesting/etc to the listener.


The second best art of this time will always remain "Who's Afraid of Red, Yellow and Blue" after it inspired such fear in someone that they sliced it up with a knife.


I have to say, I completely lost it at the whisper '(The "Software")' (0:18)... give this tech another year or two and it will be better quality than your average radio song.


I don't think I can endure much more AI slop seeping its way into my life.


To me the feeling of AI generated content is less "slop" and more "in-flight magazine". It can have a surface sheen of quality that can lure you in, but you realise it's devoid of any vitality or soul.


When recorded music was invented, musicians protested. Recorded music was devoid of any vitality or soul. Recorded music still became a hit. Then we got the synthesizer. Again we got the same complaints, lifeless and without soul. The synthesizer still became a hit. Now the next step is happening, and we see the same complaints all over.

Only time will show if the next step will happen anyway. My gut feeling tells me that AI art will gain acceptance over time, and we will just think of it as "art" or "music", just as we did with recorded music and synthetic sounds.


Humanity lost some things when it gained recorded music. It made the profession of performer less valuable, and diminished the number of performers who could make a living. But humanity got something very valuable in return — the ability to record and play back music. The same goes with the tradeoffs made for photography and motion pictures.

I see little value to humanity in tools that are able to generate an endless amount of music derived from existing music, specifically designed to neatly slot into the place of human artists. We gain little in return from that.

Some people will make an argument like, this lets people generate lots of low-quality music for use in elevators or grocery stores. Well, there is already a massive oversupply of completely free music which can do that. Do people pretend to not know this?

The other weak argument is that it lets people express themselves who haven't studied or practiced music. But, it doesn't, because the interfaces (text prompts or "upload an existing file") are designed to take the place of a human being given instructions for criteria to fill, as if they were a worker, not an expression of the person giving the instructions. If the person giving the instructions were expressing themselves, most of the AI tool would not be redundant. It's as expressive as telling another person to write a song for you with some instructions. Hardly expressive at all.


"Music derived from existing music" describes almost all music ever written. So not any kind of argument against AI?

And the quality of music generated by AI is increasing geometrically. Be careful to consider that any music heard today will be among the lowest quality ever generated by AI. Because it will get better with practice, at the speed of light.

That first comment I can get behind, but probably not in the way intended. Recording technology made performers less valuable, so fewer could make a living. The AI composers (and performers) will go further down that road. My conclusion? Get used to it. Just like human-adding-machines got replaced by calculators. And a generation of weavers, replaced by looms guided by punched cards.

This phenomenon is not new, and will go down a well-worn path.


> "Music derived from existing music" describes almost all music ever written. So not any kind of argument against AI?

I don't know why you are calling out this particular sentence fragment. That wasn't meant to differentiate AI-generated music from human created music.

> My conclusion? Get used to it.

What's the upside? I don't get it. I don't understand why you even replied to me. You didn't address the point I was making. The point was the examples I gave — photography, audio recordings, film — which are much closer to AI music than your examples of punch cards and looms — had clear upsides to humanity, despite also having some downsides. AI music and visual art seems to be almost entirely downside. Sorry to restate what I already wrote, but you didn't address it at all.


Except to say, poorly I suppose since the point didn't come across, Get Used To It.

This is nothing new. Your examples are great ones too: the benefit we got is lots more art for everybody. To take the side of the composer/performer/artist is natural, I suppose, but that's such a tiny fraction of humanity. To ignore the clear and positive upside, even claim it doesn't exist, is disingenuous?

To criticise AI music for being the same as human music, is a curious thing to say if one doesn't want to distinguish the two somehow? I missed that point entirely I guess. I don't know why that sentence was there at all then.


> To criticise AI music for being the same as human music, is a curious thing to say if one doesn't want to distinguish the two somehow? I missed that point entirely I guess. I don't know why that sentence was there at all then.

This is still not relevant. You don’t need to bring it up again. I wasn’t criticizing it for that.

As for the rest of your comment, you didn’t state any upsides for AI generated music, but you did spend time to attack me, so I think we’re done here.


Yikes! I criticised arguments. The only time I mentioned you, was to compliment your examples.

It's important to distinguish these things.


I'm not sure I follow your argument, because neither synthesizers nor recordings write music.

For augmenting composers, sure, GenAI can be a tool like others. Musicians have been incorporating rhythms and melodies shipped with their electronic instruments for ages.

Entire genres have been defined by sounds and synth presets, too.

So I do see a bit of the similarities that you describe, but I think this is largely misleading.


> I'm not sure I follow your argument, because neither synthesizers nor recordings write music.

The argument is that each technological advance was accompanied by resistance, followed by adaptation. "Recorded music" was arguably as paradigmatically disruptive over "live music" as "AI generated music" will be.


The two are fundamentally incomparable beyond the surface level fact that "things will be different". Recorded music changed the way we experience music. AI tools may change the way we make music.

From my perspective the implications of this are dire. AI can completely remove the human element. The skill, creativity, and collaboration required to produce music is a big part of my appreciation for it. Once that's gone, when Spotify can generate exactly what I want to hear, Music as we know it loses its value.


> The two are fundamentally incomparable

You're not wrong. A paradigm shift is not an incremental change but a disruption of fundamentals.

> From my perspective the implications of this are dire.

These changes are scary, especially as people try to come to practical terms with the new reality.

> AI can completely remove the human element. The skill, creativity, and collaboration required to produce music is a big part of my appreciation for it.

I still hate autotune. I feel that it ruined music. But, on the other hand, it allowed people who were excellent musicians but terrible singers to make excellent music, even masterpieces. I don't think autotune was a paradigm shift really, but it was pretty disruptive.

People are deeply creative, social, collaborative, musical, artistic, hierarchical and status conscious. These traits will always drive people to make music and share it, and derive meaning from it. People will still pay other people for the music they make.

Photography utterly disrupted the social role that painters held as documentors. No one needed to hire a good painter to have a portrait. They could hire a photographer more cheaply for more accurate documentation. Artists working in the medium of painting really had to grapple with the question of what art is, if academic faithful representation of reality is no longer valued by society. Painting thus began to change. Impressionism led to Suprematism, Dadaism, Surrealism, Abstract Expressionism, Conceptualism, Modernism, Post Modernism.

Artists today will have to grapple with similar questions raised by AI generated art. But humans are creative, indomitable, curious and tenacious. I am absolutely excited to see the art that future human artists will make in the face of all of this.


This was implicit as an intent in the public statement of Ek, Spotify's CEO, when he said that they're not gonna ban AI-generated music per se (and there already is plenty on Spotify).

Somehow this nudged me a bit when I switched to YTM, although the bundle with YT background playback on iOS was probably the bigger nudge. There's plenty of AI-generated music on YouTube as well, but for the moment their recommendation algorithm and catalogue is just better for me.

For now, it also doesn't seem to suggest any AI-generated music to me whenever I use autoplay or suggestions.

I'm not falling for the illusion that this won't change though.


Agreed.

Another side of it is that it will enable the creation of more music around more topics than before, by non-musicians. The accessibility bar is lower.

For a lot of people, music is a way to express their emotions, and not just by creating/playing it, but by listening to it. Now, you'll be able to make your own hyper-specific music with lyrics around topics specific to you, without learning any of the underlying skills yourself.

I've certainly wanted some kinds of music/representation in music of some of my experiences to exist, but not enough to go out and learn to make it myself. Now/soon I should be able to do that with AI tools, and I think that's actually neat!


It will let you express yourself in precisely the same amount as giving those instructions to a person to create a song for you. Very little.


Memes, memes everywhere! :-)

> as giving those instructions to a person to create a song for you

Most people don't have access to such a person.


I know. I was saying it’s not expressive. I didn’t say that everyone has people to make music for them.


My point is that just like with meme generators, many people are able to express themselves through "rudimentary" artistic means.

A tool like this will open up a lot of possibilities.

No, they won't turn into the next Hans Zimmer.


My comment was a reply to someone talking about being expressive with AI music generators like these. The points you are making have nothing to do with what I or the person I was replying to said.


They are connected but you aren't seeing it. The average person can't create baroque music based on a bunch of words they like. That's a lot of musical expressivity for someone who doesn't even know guitar chords.


You are just skipping the step that all of it had the human element considered, which to all of us is a very intrinsic part of "art". When AI generates a piece of entertainment it's ok to just call it entertainment, it's not art until AI actually has something we can relate to as consciousness (aka a "soul").


Hard disagree. I don't think anyone considers that, unless they're an art critic, a philosopher, or a snob.

For a normal person, whether something is art or not, is a mix of 1) whether they like it, 2) whether they can, or conceivably could, enjoy it together with other people, and 3) whether they're supposed to enjoy it or call it art, because other people claim they do (social proof).

Examples:

- Pop songs are strongly 1, 2 and 3a (enjoyment), but not necessarily 3b (considered High Art). Most people don't care, or couldn't even tell, if the songs they like were written and performed by actual humans or by machines; they experience them through some machine anyway.

- Paintings. I recently visited a Van Gogh exhibition, and I can't honestly say I liked most of it. Most paintings, in general, are ugly. We call them art because we're supposed to call some paint scribbles on a canvas art, particularly when they're framed and put in a museum (as opposed to bought off the street!) and decreed Art by People In Authority Over What Is or Isn't Art. For this exhibit in particular, my ability to enjoy the paintings was proportional to how much I knew about Vincent van Gogh's life - for those paintings I had some context for, I enjoyed them even though they're otherwise pretty bad to me. But most people, most of the time, don't have any context for paintings they're viewing, and they still call them art.

Hell, arguably, the best "paintings" in that exhibit were a couple that were obviously AI-generated - like Vincent wearing VR goggles, or animated Vincent inviting the patrons to the exhibit.

Nah, what I think is the death of art for regular people is quantity and personalization. The most important aspect of day-to-day art experience is that you can enjoy it together with people around you. It's a problem for TV shows and books these days, and even more with "Internet original" videos - there's just so many of them, and with everyone getting their own personalized feed, it's getting hard to find common creative works you and your conversational partners have both seen. Everyone's experience is becoming disjoint from everyone else's (except for the occasional superhero or wizard movie) - at which point you eventually realize that enjoying unique art no one else has is a pointless waste of life.


I never thought much of Van Gogh’s work until my sister recommended I read Irving Stone’s novel Lust for Life. I loved it and did a deeper dive on Van Gogh and his contemporaries. Now, he’s one of my favorite artists.

For me, one problem with AI art is that there isn’t context for any of it. It’s just there and didn’t come from anywhere.

Eventually, some prompt engineer may build up a coherent body of work that is moving or provocative, but we aren’t there yet.


I like listening to soundtracks, but if you've not played the game/watched the movie, it can be hard to get into it (listening to NieR: Automata is a journey). Listening to one track is like reading one chapter of a book. It may be your favorite one, but there's something more substantial in consuming the whole work. I know people listen to music to set up moods, but everyone has that playlist/album where they listen to the whole thing just for the sake of listening. Or watch a painting for contemplation, not to judge whether it's beautiful or perfectly made. And that is why I can't enjoy AI generations. There is nothing to connect to.

P.S. I don't mind people using AI as tools, but the human agency needs to be visible and relatable. If it's only a prompt, don't ask why I don't value it.


I'm a soundtrack person too, and that is my point: you may like some tracks from a game you loved, and you probably have no idea whether or not the music was human or AI made.


Human is still writing the prompt. That's your human element.

The prompt can be detailed, creative, and innovative. Kinda like a composer comes up with an idea for a new piece. But now the composer won't need the technical ability to translate it into musical notation.


Not sure if writing the prompt is the human element because an AI can easily write prompts. I think a stronger human element is in the training data (while that training data isn't dominated by AI generated content itself).


> AI can easily write prompts

All our current AIs need prompts to produce content. So, you'd need to enter a prompt so that AI can produce a prompt for you.

But I guess you could develop some variant which just generates prompts in a cycle. But most of the generated prompts will be pretty bad, so it will in turn produce heaps of crap.

That's where the human element is still needed - try a prompt, evaluate the result, tweak the prompt again. If this doesn't produce a good result, try a different idea. It's not even that different from how artists produce content these days (the create <-> evaluation cycle). Current AI is not good enough to be able to judge the produced content (that's why it produces so much crap after all).


There's way more human input in the training data than in the prompt is my point, many orders of magnitude more. Of course there needs to be something to start some chain, but that's like saying a human needs to press "deploy" or "start", it applies to any tool. Even a perpetual motion machine needs someone to start it.


And your brain also has way more training data (all the knowledge you gained, all the art you've seen) than what you yourself produce as "art".


Not really unless you have perfect memory. What people do is learning rules and breaking them. The rules exist outside of your mind, you're just trying to conform in some ways and distort them in others. You do not blindly copy what came before. When drawing a portrait, there are the rules of anatomy, perspective, colors and light, and your medium of choice. A style is a particular combination that you know works, but you still have to know the rules in the first place. You study masters to learn what is and what is not important in those rules, not to recreate their works in details. I've not heard of any art classes that train you by copying everything that has been produced.


Generative AIs don't remember the original media, they just detect/extract patterns out of them. They aren't able to recreate pixel perfect anything. Ask it to give you a Mona Lisa and the result would just kinda resemble the original. (kinda what a mediocre artist would be able to produce going by their memory)

> I've not heard of any art classes that train you by copying everything that has been produced.

You don't have to learn it, because it's second nature for humans. We learn by imitation. Babies learn to talk by imitating sounds from parents. Artists learn by imitating the style of masters. That's what generative AI does as well.


I see AI as just another tool. Like Photoshop, a software mixer, electrical powertools in my wood-working shop, a 3D printer in my office.

All of these had immense impact on the way we create (or make art). And despite all this, we still use waterpaint, perform music on ancient instruments, make furniture with minimalistic tools, or use clay to make objects.

I'm not pessimistic about generative AI. If anything, it'll allow more people to create. Allow new and unprecedented art forms. It will have an effect on the way people make money with art. But so did Photoshop, digital audio mixers, the table saw, and CAD/CAM machines.


If taping a banana to the wall, or eating said banana is art, then I feel like making a machine sadly sing the MIT license to you has to qualify. The idea to do this is wonderful to me.


I would consider that the whole act of making a machine that can generate music and then making it generate a song of the MIT license could be considered 'art'. That doesn't make the end result 'art' in isolation.

Like the banana. The banana isn't art, taping a random banana to a random wall and eating a banana isn't art. A particular person taping a banana to a particular wall for a particular reason is what makes the whole thing art.


Why look at the end result in isolation? Isn't that just ignoring the human element then complaining there isn't one?


> Why look at the end result in isolation?

I definitely agree that you shouldn't. To me, absolutist arguments about whether AI-generated content is art or not make no sense. AI is a tool, like for example carpentry, that can be used for many purposes. Most carpentry isn't art, it's craftsmen building useful things and solving practical problems. But there are definitely artists out there using carpentry as the medium in which they express their artistic intent.


True. But undeniably, current creative types will be displaced. That will be disruptive, will damage individuals' ability to make a living.

Not a lot of individuals, to be honest. Only a handful of people make any kind of living from composing. Millions try it, but have to be content with performing for their friends. Which will continue unchanged.

So the actual economic impact of AI music will be different from the scenarios being described. What is true is, we will all have a lot more musical listening choices. Which is a net good for the rest of us?


The difference here is that no one wants to listen to this shit. It is extremely corny and generic. Cringeworthy.

Algorithmic music has already been around for decades and it never became popular. In the 90's it was of interest only to a small group of academic music nerds. The same is true today. Avant-garde shit for nerds. No one wants to listen to it.


You must live under the world's largest rock if you don't think that there's a commercial audience for corny, generic, cringeworthy and shitty music.


I am under a rock of denial of how bad popular music has become. I was checking out some other tracks on this site and now believe AI could eventually take a lot of musicians' jobs, for advertising and so on. But there won't be an AI Kurt Cobain.


AI's bread and butter will be "customers who just want generic art for generic commercial use, any quality art, and pay as little as possible for it." It's probably never going to need to create the next Billie Jean hit.


> When recorded music was invented, musicians protested

> we got the synthesizer. Again we got the same complaints

I honestly don't believe this happened. Citations?

You make it sound like one day there was no recording and then bam! FLAC-quality recordings of musicians, out of the blue. You do realize it started with exceedingly shitty wax cylinders that sounded absolutely atrocious by today's standards (and by the standards of the time as well).


Not the parent but here's one article:

>After the release of The Jazz Singer in 1927, all bets were off for live musicians who played in movie theaters. Thanks to synchronized sound, the use of live musicians was unnecessary — and perhaps a larger sin, old-fashioned. In 1930 the American Federation of Musicians formed a new organization called the Music Defense League and launched a scathing ad campaign to fight the advance of this terrible menace known as recorded sound.

>The Music Defense League spent over $500,000, running ads in newspapers throughout the United States and Canada. The ads pleaded with the public to demand humans play their music (be it in movie or stage theaters), rather than some cold, unseen machine.

>Joseph N. Weber, the president of the American Federation of Musicians, made it clear in the March, 1931 issue of Modern Mechanix magazine that the very soul of art was at stake in this battle against the machines.

https://www.smithsonianmag.com/history/musicians-wage-war-ag...


well:

> live musicians who played in movie theaters

and it wasn't about recorded music, but mechanical pianos


No, it was about recorded music, as the article says. The movie made use of the Vitaphone sound system. The claim was about musicians protesting the introduction of recorded music.


Isn't that the real Turing test - does it feel like it has a soul?


Beware the cleverness of the Turing test!

It is a test that selects for the ability to deceive.


I want to run an experiment on humans to see if we are worthy of a Turing test.

1. We tell a human that they will be a judge, when in fact they are the test subject. We will tell them that there will be two chats -- one will have a human and the other will have a computer, and they need to decide which is which.

2. We will then give them access to two real time chats but the twist is both of them will be humans.

3. Our test subject needs to rebel against the experiment and say they are both humans.

What percentage of the population will be able to say both chats are humans? Is this a humane experiment? Will any ethics board clear it? Does it have any scientific value?


>Does it have any scientific value?

Dunno, but it could probably get published and I'd at least read the comments when it gets posted to HN.

Tricky bit is the design in the signaling and instructions so as to avoid biasing the results while still allowing the desired 'both are human' response. Something like a check box for each chat if that chat was a robot. If you have the budget I'd also run the control Turing test, human x computer, as well as a computer x computer test.


Surely this just tests for freethinking and non-conformity. Much like Milgram and Asch found.

About a third of people are highly suggestible and agreeable, they readily follow authority, fall for scams and can be hypnotised. About a third cannot be easily conned or hypnotised, are vigilant, "disagreeable" and likely to go their own way.

Even of that highly sceptical, independent third, it takes a lot of courage to say "I totally know this is bullshit, and I'm done". Only a handful of Milgram's subjects were able, not just to assert a moral objection to hurting a fellow human being, but to question or see through the whole ruse of fake scientists and stooges. Even if you can spot that, it takes rather a rare human to act on it, to call it out and walk away - or rather escape the parameters (get outside the box) you've been placed within.


Choose your preferred software license for lyrics if you want substance


Sadly you could say the same of the 80% in just about anything human-made. The 80% of software, music, furniture, etc.

Maybe there will be a change of feeling, it's starting to come to me, instead of seeing this AI generated content as "soulless" etc I'm starting to see it as an extension of OUR human generated work. It's more like an endless remix of HUMAN talent.

All of that is boring though. The exciting stuff is all that will be displaced, and how we will solve the myth of meritocracy.


what is in-flight magazine if not slop


AI slop (to me) would be to pass this off as a song or something meritorious to listen to.

IMHO this and things like it are basically a sub-class of comedy or satire, so I have time for this sort of thing. It's a joke, and should (can?) only be appreciated as artistic as someone saying "Wouldn't it be funny if ... ?" because now that casual thought can be turned into a pretty instant "well here it is! LOL!".

I don't think you're supposed to appreciate it as music. Maybe I'm calling it wrong though.

(edit - I will admit that upon further thought, I'm not sure how I feel about this when compared to, for example, Nina Gordon recording "Straight Outta Compton" as an acoustic, mildly lamenting singer-songwriter style number 20-some years ago. It's clearly in the same satirical arena but one took a lot more effort and imagination. Kinda, because there was a lot of effort and imagination that went into both the training data and the model, even if this specific output was only a passing joke. It's quite hard to reason about this stuff...)


In my experience, there's less of a distinction than you might think.

A lot of satirical songs are absolute bangers, because (for example) to do a send-up of the tropes of X music, you must know all the tropes of X music, and be able to perform them. So a lot of satirical music is actually done by people with a lot of skill and passion for the thing being satirised.

And because satirists don't have to worry about being predictable or unoriginal, they can put in more crowd-pleasing cliches per minute than 'serious' artists, giving them the most intense X of all X artists.

(Not saying the MIT license is a banger though - just that some satirical songs are)


Personal opinion, but I think you're calling it wrong. Also personal opinion, I absolutely hate the idea of it happening, buuut...

I think it's a bit like everyone who said "black cabs in London have nothing to fear; X years of 'The Knowledge' will always be better than a guy with an App in an old Prius..." Except it was all wrong. Black cabs are probably still superior, but the bulk of people (and especially new customers) are all using Uber, because it's easier and they just don't care.

It sucks that production-factory, AI-created junk music, literally built to make money, will be a thing, but it most likely will. Someone will exploit the fact that they can make a ton of cash with low risk and budget. They'll have the connections to, despite initial (somewhat) faux outrage from the public/press, get radio play/playlisted online (probably even as an 'ironic' addition from the hipster music crowd). The song will be an earworm, and aside from the musicians that hear all the flaws, the passive everyday mom and pop listeners will get the hook stuck in their head and it'll just be another great tune like any other.

I don't think we'll replace 'celebrity', I think that'll still happen, and I think maybe greater appreciation will happen to 'real musicians' (with a face!). But 'everyday' disposable pop (like all those one hit wonder tunes that are still played in nightclubs, pubs, throwback radio, wedding discos) - that's going to be disrupted massively.


Oh I don't doubt it!

I guess my point was that I don't hate the linked tune because it's not someone asking me to listen to a song, it's a joke and one that does have some humour to it.

I can absolutely see this tech (if it's allowed to) replacing a lot of working musicians who do music for ads, jingles, tv, film etc. And yes, disposable pop is probably on the chopping block.

> disposable pop ... that's going to be disrupted massively.

I wonder how the disruption will play out - the world is already drowning in content created by humans. If we envision that AI can make disposable pop to the same standard (and I have no reason to doubt it) then that surely creates an absolute deluge, almost boundless in size, of stuff. Promotion will become more or less the only art, to make things stand out from the crowd, and that can probably only continue for situations in which people want a shared experience. For an awful lot of situations the streaming of entirely ephemeral audio would probably do. It could as easily kill 'pop' as a business on the audio side, as it could steal it.

I'm just sorta daydreaming about possible outcomes here. All sorts could happen.


Art will realign to put more value on live performances. Don't worry, people who push AI art don't understand that art is a form of communication between humans. They might learn when the bubble pops and they are left with a trillion shiny "art" objects that are worth nothing, because nobody wants to look at them.


100%. I compare it to the invention of cameras - before that you could make an honest living as a portrait painter, no inspiration needed. Afterwards, painters needed to lean into artistic qualities to stand out. But also, 'Photography' was born - what was seemingly just a press of a button turned out to be an artform.


Not totally true. Yes, there's value in live performances and human connection but most of the songs we listen to don't stimulate that. Hell, often we don't even know what the musician looks like, who they are, how they sound live, etc. They're just items in our Spotify queue that are only there to give us a dopamine hit with their sequence of well-composed sounds.

There's a craving for a deeper connection but that's usually the smaller part of our everyday consumption.


> Not totally true. Yes, there's value in live performances and human connection but most of the songs we listen to don't stimulate that. Hell, often we don't even know what the musician looks like, who they are, how they sound live, etc. They're just items in our Spotify queue that are only there to give us a dopamine hit with their sequence of well-composed sounds.

If you are a little bit of a critical listener, you listen to those songs because you connect to them somehow. That's the power of music: it's a language for emotions, and you don't need to know what the artists look like, or their backgrounds, to feel what they try to convey. Having the context/background might help to intellectualise a piece of music but the feeling comes from the art itself.

AI music is pure entertainment, not art.


I mean there's a lot of, like, crap production music for film and TV out there. I imagine cheap reality TV shows will start using AI music. Maybe anime, which is always looking for ways to make its arduous production methods easier and cheaper because the industry is so volatile and not guaranteed to be profitable. But I think to make properly decent money as an artist you need to tour. This has been the model ever since streaming took hold. So yeah, it's not going to be that profitable to be a faceless AI pumping out stuff into your Spotify stream. I think the market will be in music for TV.


Sorry to disagree, we already crossed the line where the "Her" movie could turn into a real story. To say it more clearly: it won't take long to see people even having affective relationships with AIs.


They definitely already are, which is basically the business model of girlfriend.myanima.ai, along with a litany of similar services.


Sure, some desperate lonely people have very close relationships with e.g. pillows. What OP says will most probably be true though - when there is an ocean of cheap/free perfect AI art, it will be worthless.

What will be worthy is imperfect human-created art. We already went through this decades ago with expensive hand made vs cheap machine made stuff, this is just another iteration.

I am not saying it will be great to be an artist; just like in the past, a famous few will live in the limelight and most will struggle to stay afloat.


Having pets improves your mood and health. In a measurable way. So people already have "effective" relationships with their pets. What's the difference between an effective relationship and an affective one?


Oh I'm not saying that having a relationship with an AI is a bad thing per se. I actually enjoy the discussions with LLMs, even the small ones that I run locally, far more than the conversations that I often have with human beings, and I'm pretty sure I'm not the only one. I haven't yet created an "entity" with some kind of persistent memory which will turn into my personal daily assistant (my personal Jarvis) but that's definitely on my to-do list.


Depending on the popularity and your income, the experience of an artist's life is really not as intimate as you make it out to be.


The experience of knowing there's a human behind the message is valuable. People hated corporate propaganda "art" before AI was a thing, for the same reason: it is not human.


Depending on the AI tool in question, and the level of control it offers the human, there can definitely be a human behind the message, even if the final output is AI generated, the human had a creative vision and used the available tools to make it happen.

It's no different than using Photoshop.

Unless you only consider art done with oil and on a canvas to be the "real thing."


I know plenty of people, including me, who have no real knowledge about those humans.

And sometimes not knowing is better, like the example with Rammstein and row zero, where some manager woman was asking other women if they wanted to meet their rockstars.

Btw. on a good electronic music set, knowing the artist is probably more a quality sign than a personal aspect. Knowing that i like what artist xy does means i might like to keep an eye on future work because i like the style of it.

Nonetheless, i do also think that fandom and doing real tours will still be the unique things for bands. There should always be a market for human content.


If you can pay for a live performance and have the time to go there, yes. A lot of people can't and a 5-10$ Spotify/Apple Music/... account is all they can afford (if at all).

AI music will disrupt this market.

Spotify for example will try to produce their own AI music (like they already do with regular music). The Christmas playlist will then mainly contain their music. That saves a lot of money for them.


I don't quite get why this is downvoted; it may not be the absolute truth, but in my personal anecdotal experience, there is quite some truth to it. I have had a lot of fun playing around with image generation and chat gpt.

But have also had the non-surprising "Hedonistic Fatigue" that comes with excess access to something originally valued. I have now been able to generate 4 and 5 digit numbers of pictures of awesome colourful steam locomotives and epic dungeon vistas, but now find myself fatigued by "what on earth am I going to use 5000 dungeon pictures for?", coupled with the dread of being forced to CHOOSE from 5000 options. And I learn the known principle, that when you can choose from 4 options, you are happy you picked the best of 4, but when you can pick from 5000 options, you are left feeling inadequate with "I almost certainly was not able to pick the best of those 5000 options, and trying to do so would exhaust me". So suddenly, picking something from your menu of options, feels dreadful and fatiguing.. (I get the same feeling sometimes, when trying to pick a movie to watch out of 16.000 options).

So yeah, no doubt the AI sketch/refinement tool will be merged into our creative process, but for the time being, I feel a second generation of alienation-estrangement with my "available options".


I'm going to assume that like me you're not an artist, or at least not the kind of artist who can generate 5000 dungeon pictures by hand.

I've been thinking about it the same as you but something that I think we're missing here is what can the person who can already make 5k pictures by hand do with this kind of tool?

It makes me think of the later albums that Frank Zappa produced with a Synclavier. That guy went hog-wild on that thing and banged out some unbelievable albums.

What would he have been able to do with AI generated music technology if it was the Synclavier of his time? What are the Frank Zappas of our time going to do with AI?

https://en.wikipedia.org/wiki/Frank_Zappa#Synclavier_works


Yes, having to check 1840 options is about 1830 too many =)


They are already worth very little if more than one party can produce them at the push of a button.


They made it illegal to produce exact replicas of Frozen at the push of a button; they could make AI illegal too. Except, the only reason it's illegal to copy Frozen is that it gives rich people less money. Generative AI gives rich people more money, so if anything, it'll be illegal not to use it.


It's not an apt analogy, unless you think that the production of generative images will be restricted to a few parties (because that is what copyright does, it restricts the right to copy to the originator).

And then if it was restricted to a few parties, they still have to compete with human artists, they can't just charge whatever.


"Art" already puts more value on live performances and scarcity. Popular music consumption is largely removed from that.

Yeah, the last two live mega tours (Taylor Swift and Beyoncé) have a tad more personality than the average artist, but the usual stuff that you would hear on the radio might as well be AI generated and live-performed by animatronics and a significant chunk of the audience wouldn't care or even notice.


I think the first place we will see large scale AI music replace 'real' music is in the 'lo-fi' background music that just about every bar, restaurant and shop has going in a loop. Instead of having various playlists, the restaurant owner can just choose between some styles, moods and tempos in an app and the AI will autogenerate an endless stream of background music that matches the vibe they are going for.


See also vocaloid 'performers' such as Hatsune Miku or Ia in Japan.


Yes, this is why the most popular songs on the radio atm are all mass produced, generic (for their genre) crap that doesn't really stray beyond the proven money making methods.

I mean people watch Love Island on TV for chrisssssakes. Humans can definitely be mindless consumers, where the sugar salt and fat in fast-food is the same as false drama, outrage, sex and violence in media. We all got buttons and they're so easy to push. Just look how popular TT is versus the type of content on there; mostly short-lived mindless stuff.


Suno links have already become what grey screenshots of ChatGPT were just after it came out, listened to a few at first and now I just keep on scrolling.


I had a similar thought -- my perspective:

There will be a TON of these 60-second tracks, and no one wants to listen for anywhere near that long while skimming a feed, so these will now be ignored.


If you ever turn the radio on or hear the Spotify weekly top playlist, it's not like contemporary cookie cutter pop music is any better. At least with this anyone can quickly experiment with quirky new ideas!


What I find most amazing is how the music changes at the start of the all caps section. Is this a completely new interpretation of what all caps means or could it have been learned from examples?


Yeah I was definitely expecting some "IN YOUR HEEEEEEEAD, IN YOUR HEEEEE-EEEEEEAD, ZOO-OM-BEH!" when it got to all caps, so was a bit of a surprise.

I wouldn't be surprised if all caps gets interpreted by it completely differently over multiple iterations.

Experiment: https://suno.com/song/3338f5e5-b36d-4596-815d-cd2804c9a344 (generated lyrics)

Same lyrics all caps (chorus): https://suno.com/song/b176c658-9e9c-4a4b-b05e-6db291f415c9 didn't really do anything.

After loads of different attempts trying to get something different: https://suno.com/song/398ef310-94b8-493a-ae61-22b790875689 not really that great.

Tbf I think it's because it's been trained on genres and maybe lyrics and lacks info on vocal styles and other stuff present in tracks during the training...


At the current rate of progress maybe it'll be months. I did a song for my company just to have fun around an event and the quality of the song was astounding, better than this one I think, also using v3. I didn't even write the lyrics, I just described what I wanted in the prompt. Also I used one of the poems I wrote as a teenager (not a lot of them) to create a song... the AI would notice the emphasis in the feelings of the lyrics, exactly in the same way I felt them, so it made pauses and chorus accordingly.


It reminds me of CGI in the 90's. Everyone was so astounded over the quality of Jurassic Park and Toy Story, the rate of improvements was mindboggling. Predictions were made that in a few years, actors and movie studios would be obsolete. But in the meantime, audiences started to notice artifacts that were not apparent to the untrained eye - CGI was actually ugly! And here we are 30 years later - actors are still on sets.


but the sets of today aren't the sets of the 90's. sets were giant green screens for a second, now they're giant TV's. they're still on set, sure, but things have changed.


I had a similar reaction. We can still detect AI-generated images after years of progress; I think it's hasty to say they'll be indistinguishable any time soon.


Thing is, they don't have to be indistinguishable. They just have to be "good enough" to be a major disruption to that market. And it's not a particularly high bar.


Sure, just as tv shows use obvious CGI for animals, explosions etc - it's cheaper and more practical. Particularly background music that needs to be licensed for ads and such is very likely to be disrupted in the next few years.


Pretty much all mainstream songs are of the same tempo (or slight variations), the same beat, the same scale, the same accents etc. I'm not surprised at all it can be regurgitated by a generative model.

What I would like to see is what happens if you ask it to do in 5/8, 5/4 or any of the 7/* variants and see how that goes (I have no idea, might actually work). There are some examples in common music but not a lot especially for the more obscure ones. Unless they also trained it on a lot of international folk songs.


Almost none of what you said is true. Same scale? Tempo? Same beat? You think artists are just all using the same instrumental track?

And yes it can do odd time signatures. It can do all sorts of genres like art pop, cinematic scores, ambient, math rock, avant garde, baroque, etc.


There's a surprising number of pop songs that use the same or close to the same chord progression.


> give this tech another year or two and it will be better quality than your average radio song.

TBH that's a pretty low bar; "radio" songs have been engineered and polished for a very long time now. I have no hard numbers but gut feeling says radio music only represents 1% of the music industry.


Here's my attempt at humor, there is something about his short sentences and speech delivery that makes them very suitable for ballads: https://suno.com/song/9f731225-8844-44f1-9325-4078cf53c729


Hmm, I think a rambling song form would have been a better fit.


I beg to differ, here's the nu metal version: https://suno.com/song/edea40ab-c41f-46e4-908f-a511b9ab311e

I almost feel compelled to take action against the corruption and crookedness.


My similar attempt at a different well-known speech: https://suno.com/song/01309f49-463f-4906-8399-f1f4fe3c6726/


Okay that was hilarious


Yeah, that one is way too good.


The song is not clever or funny. It's a rather tuneless recitation. The only reason anyone is posting it is because AI.


Not sure whether I'm being completely serious, but it's already better than most EuroVision songs.


Asked ChatGPT first to convert the license to a poem: https://suno.com/song/bdad5f22-a1f0-42fb-a3d8-572d289687ea

"Publish, distribute, let your dreams take flight, Sub-license, sell, under stars or sunlight."


Well done! I ran it through with “Europop” as the type and I’m pretty happy with the results. This is a fun tool to play with. https://suno.com/song/216c7bf7-77b1-4703-8b16-e9173c08c50b

I’ve found it fun to take a coworker’s comment (like in slack, if it’s long enough) then hand-modify it into a song with “[Verse X]” and “[Chorus]” headers on blocks. Sometimes I’ll also run it through an LLM to do like what you did. Anyway it’s been a lot of fun and everyone’s gotten a kick out of it.
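
For anyone who wants to skip the hand-editing, here is a minimal sketch of that header-tagging step. It assumes the text is already split into blocks by blank lines; `tag_lyrics` and `chorus_index` are hypothetical names made up for illustration and are not part of Suno or any library.

```python
from typing import Optional


def tag_lyrics(text: str, chorus_index: Optional[int] = None) -> str:
    """Prefix each blank-line-separated block with a section header.

    The block at position `chorus_index` (0-based, hypothetical parameter)
    gets "[Chorus]"; every other block gets a numbered "[Verse N]" header.
    """
    # Split on blank lines and drop empty blocks.
    blocks = [b.strip() for b in text.split("\n\n") if b.strip()]
    tagged = []
    verse = 0
    for i, block in enumerate(blocks):
        if i == chorus_index:
            header = "[Chorus]"
        else:
            verse += 1
            header = f"[Verse {verse}]"
        tagged.append(f"{header}\n{block}")
    return "\n\n".join(tagged)


if __name__ == "__main__":
    comment = "first block\n\nsecond block\n\nthird block"
    print(tag_lyrics(comment, chorus_index=1))
```

The output is just plain text to paste into the lyrics box; whether the model actually follows the headers is a separate matter.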


So glad people are adding some spicier genres than ballad. This is so much more dramatic and funny. I can barely stand human ballads lol


It would be fun to do GPLv2/v3 rap battle.



This is the first AI song I've really wanted to keep with me on my phone. If someone knows someone, we need a human performance of this!

Encore, encore!


That was really good


Suno is great and I already shared my positive thought on its potential back in v2. I have always believed that the essence of digital music is "organized numbers". I think what needs to be thought about is how to use AI in this process. If you look at the results (numbers) generated, then we are indeed very close. But there is another future I believe: I hope AI can compose music with me, like copilot. This is why I keep working on

https://glicol.org/

and the destination is:

https://github.com/chaosprint/RaveForce

Although my progress on AI is slow atm, I found that Copilot in VS Code has already helped me in live coding performances several times:

https://youtu.be/xzIXzt3hSt0?si=rVihHYiKiAU5IKeI&t=389

Also want to hear your feedback.


I just tried it, and if you give it lyrics, it completely ignores the rhyming and verse boundaries. A bit impressive still.


glicol is amazing, thank you so much for it


The site is full of gems. I like capybara: https://suno.com/song/b27c29f6-8ab4-47eb-81fd-efb85c848ada/


Needs the Mr Weebl Flash animation to go along.


Badger, badger, badger badger, badger, badger badger, badger, badger badger, badger, badger, mushroom, mushroom!


Snaaake, snaaake.


Gotta say, I danced to worse tunes last weekend in Berlin :)



WTF, this is a banger. I'd play this as a DJ.


Shonen Knife's Capybara is the human version, in case anyone needs more capy in their life.


How do they make songs that long? I only got 53 seconds.


You can extend the song, and then combine the different parts to create one, long song.

This video explains it well: https://www.youtube.com/watch?v=SMKGWSiXW8A


Thank you, with that I managed to fit the full license text into a rap version: https://suno.com/song/a7de19a3-5fe8-4634-844b-3babcc2e170e


That's amazing! The composition is purely AI?


Obviously I had to try a black metal version: https://suno.com/song/e00db515-6244-4197-ab71-c8f0555aaba4



Unless my ears deceive me, this render has the same mispronunciations and omissions as the OP sad jazz girl render.


Same base model, all have the same pronunciation issues.


Inside you are two creative commons licenses: sad girl LoFi and black metal


I don't like black metal but that was impressive!


Why not sing out a few paragraphs from copyright law, just to make the two fingers to artists more explicit?


Sure, we can try to make it sing section 107.


Which of "purposes such as criticism, comment, news reporting, teaching (including multiple copies for classroom use), scholarship, or research" do you think applies?


Fair is a stretch for all the ways AI companies have proceeded.



That's actually enjoyable haha. Now I'm scared.


That's just downright silly.


I remember that this song was linked a few days ago. I decided to then tinker around with this tool together with my colleagues (also MIT License text). The results were too funny to not share them in this context:

Our favorite: https://suno.com/song/30c5fff7-7417-42cd-b758-699854ef06d3

The extreme bassdrop: https://suno.com/song/1485e9d0-f0fb-4083-819a-bfb9db6c066a

The country: https://suno.com/song/a4307c43-0a1e-4cf2-94d7-e94a49b196d6


Interesting that it messes up "fitness" in all versions.


I agree. "sublicense" is also pronounced wrong or in a dialect I am not familiar with.


In the olden days we would rewrite the input text to a voice synth to pre-correct the incorrect pronunciations.

Maybe try sub-lye-sense or similar.
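
That pre-correction trick can be sketched as a small substitution pass over the lyrics before pasting them in. Only the "sub-lye-sense" respelling comes from this thread; the `RESPELLINGS` table and `respell` function are hypothetical names, and whether Suno actually pronounces the respelled form better is untested here:

```python
import re

# Lowercase word -> phonetic respelling. Only this one entry is taken from
# the thread; anything else you add is your own guess at what the model needs.
RESPELLINGS = {
    "sublicense": "sub-lye-sense",
}


def respell(text: str, table: dict = RESPELLINGS) -> str:
    """Replace whole words found in `table` (keys must be lowercase)."""
    if not table:
        return text
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(word) for word in table) + r")\b",
        re.IGNORECASE,
    )

    def swap(match: re.Match) -> str:
        word = match.group(0)
        replacement = table[word.lower()]
        # Keep the license's ALL-CAPS passages shouting.
        return replacement.upper() if word.isupper() else replacement

    return pattern.sub(swap, text)


if __name__ == "__main__":
    print(respell("publish, distribute, sublicense, and/or sell copies"))
```

The word-boundary match means a longer word that merely contains a table entry is left alone.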


And "NONINFRINGEMENT"


I know of no dialect that pronounces it like the Suno AI.


I'm going to download these, since you gave me permission.


While almost everything sounded really good and well put together, the voice itself has a constant... phaser? robotic vibrato? applied the whole time, which I found surprising. We've got better voice synthesis available already and other instruments don't suffer the same effect. I've not heard that before - does anyone know if that's specific to this model, or a more generic voice issue?


That static is present in most generations, but how bad it is depends on the generation itself (you get 2 variations of the same song every generation). I've had some amazingly crisp sounding generations in vastly different genres and languages, such as Opera tenor, UK rap, reggae, metal, country, Broadway musical, rock n' roll, Japanese, Italian, Swedish, various English accents/dialects, etc... Suno is a technical masterpiece, I understand why some people dislike the idea, but the point stands that we are HERE now and we started with most people not even imagining it possible, and those who did saying it wouldn't be this good.

Like many people have said, this tech will only get better.


Yes, it had this GLaDOS-like timbre.


My thoughts exactly, like I just finished Portal.


It’s called a vocoder. It allows (for instance) a monotone-sung piece of text to follow a set of MIDI notes, by modulating it using a carrier wave (I think. Please correct any inaccuracies!).

I use it sometimes in FL Studio when creating electronic music (plugin is called Vocodex).

Presumably they take the AI-generated voice and generate MIDI notes, and apply a vocoder to the voice, following the notes.


I think that's just an artifact, as they can also produce heavy metal scream singing etc. It just mimics something that was in the training data.

My guess is that they train the vocals and the music separately; the training data is trivial to create from any tracks with tools like https://vocalremover.org/.


You mean it sounds autotuned?


Not quite. I'm not skilled enough in mixing to know the right description for it, sorry. I can hear vibrato-like modulation/beating, but in the vocal part only.


Yeah, surprised at the amount of comments here about how good it sounds. The voice is full of artifacts, making it quite uncomfortable.


It's got elements which are great and elements which fail hard. I can complain about one bit specifically but still recognise the massive improvements in other areas over what we've seen so far.


The quality of this has become amazing. Clear voice with expression. Fitting instruments. Good mix.

Creativity though? These non-poetic songs (I heard “Lorem Ipsum” the other day) are a fun novelty because a machine creates this in no time. A skilled musician will do something similar.

Innovation? DeepMind showed new creative approaches. Will this happen with music too? Or will they just regurgitate styles from the past?

It’s definitely good for Muzak, where you want a non-offensive, slightly upbeat, and not-too-recognizable background. And perhaps as an idea generator.


> Innovation? DeepMind showed new creative approaches. Will this happen with music too? Or will they just regurgitate styles from the past?

That's an often overlooked difference between earlier projects like DeepMind's AlphaGo and the current genAI craze. With something like Go you can dial in the exact parameters to solve for (winning a game of Go) and then let the AI experiment and find novel ways to get there. But with image generation, or music, or language models, the measure of quality is too vague to define in a way that the AI can directly optimize towards, so they're instead just fitted to existing examples.


A little off point but I have read that many times but never spent so long thinking over those words - singing a contract actually makes one read it, and gives time to think about it because I am not rushing to the end of the paragraph.

So weird.

(Like the separation of Authors and Copyright holders implies things I need to think three times about)


More discussion from last week: https://news.ycombinator.com/item?id=39930463


An Italian opera about becoming a bird, but with the dreadful realization that you're a chicken:

https://suno.com/song/63df5758-c533-447a-be4a-06dcb5abdbbf

this is absolutely wonderful.


This is already better than some humans, such as https://www.youtube.com/watch?v=9sJUDx7iEJw


This particular human is orders of magnitude better than AI


Spongebob's Plankton AI version of Diamonds (Rihanna) convinced me that generated vocals are suitable for professional music production. https://youtu.be/9QV55xoOBQk


Somehow it managed to translate both parentheses and all caps into musical concepts pretty well


Yeah that really was uncanny. There are parts of the chorus which are genuinely catchy too. Insane stuff!


Amazing. I hereby demand that all open source projects stop using those puny SPDX tags and use this instead. Come to think of it, all licenses should be expressed in this way. Imagine the Magnum Opus a standard Microsoft license would become.


Imagine you stumble upon an interesting project on GitHub and instead of license.md, you're greeted by license.mp3.


Don't forget dmca.mp4 https://the-eye.eu/dmca.mp4


I tried a rap version, sounds more interesting, IMHO https://suno.com/song/7dde1028-1df9-407c-80ea-59d39179a385


Lol sounds like tv shows/ads in the 00s trying to be cool by making their silly corporate stuff hiphop.


The rap style has an element of proclamation sometimes, that fits with the license text to some extent.

IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM!


The softwaaaaareeee...!


Extended version containing the full license text: https://suno.com/song/a7de19a3-5fe8-4634-844b-3babcc2e170e


AI-generated sad girl with piano performs the text of the MIT License

719 points|amichail|7 days ago|268 comments https://news.ycombinator.com/item?id=39930463


So many comments here that will age poorly. Especially that AI output won't ever be indistinguishable from human. It's imminent that we will fail blind tests in most areas: text, still images, audio and video.


I feel like music will be much easier than text. In highly complicated texts it will be easier to spot inconsistencies, whereas with songs it's pretty common to have a high margin of error/linguistic interpretation, making them a much better candidate for the current approach of generative AI.


I've always wanted Billy Joel to re-release his "We Didn't Start The Fire" but with updated events. Well, here it is: https://suno.com/song/b8b33785-271f-48f2-9934-26dabb8e20ed


Fall Out Boy did it better

https://youtu.be/2LkVKCWL0U4


Not sure how I've missed it! Now, I just wish to live long enough to hear the 3rd iteration. Hearing how much shit has happened over the years is so soothing, yet the world just keeps on spinning


If this was an actual person performing, I would have smiled all the way to the end. Since this is AI-generated, I feel no emotion towards this, and just listened for a few seconds. It's technically interesting.


Seriously, the "creatives are screwed" narrative has fallen apart for me because the stuff made in AI has proven to be worthless.

Why is it worthless? Because the point of art was to communicate or convey something with other people and the AI has no idea because it's not human.

A few words or a sequence of sounds can be enough to transfer a great deal of feeling and meaning because we run about the same software, and as a result we can generate the same output with a little bit of input. This is all done by looking inside and externalising it: someone feels something and makes a song from it, and that song can be used to regenerate feelings in other people.

The current AI tech doesn't have a way to do that because it doesn't have a way to look inside. At best, it can imitate things within some context, but the output doesn't have any meaning at all. The most successful AI content was maybe the "Pope wearing Balenciaga" image, but that wasn't because the AI thought it meant something but because someone looked inside and thought this could be interesting.

So no, AI isn't taking over the creative process. AI is taking over the mechanical part of it only, that is the part where the artist traditionally had to master a method of production or an instrument.

The AI evangelists keep pushing short videos or drawings that look "professional" and claiming that Hollywood is done, artists are screwed etc, but those are worthless outside of the context that AI made them. No one is interested in paying or even spending time to consume this content; it's extremely dull.


Get a room in any hotel run by one of the large chains - Accor, Hilton, IHG...

On the wall, you will find an Obligatory Art. Sometimes it's just a canvas with 3-4 stripes of paint: you can imagine a purely mechanical process for churning these out; a conveyor belt with brushes hanging over it, perhaps. Other times it's a little more creative. Each room is slightly different. You can also sometimes see these in cheap home decor shops. It may not be much, but it does the job - it really does make the space more pleasant than just blank walls would be.

There are a lot of rooms to fill. Someone has to make all these. It may not be all that creative, but it sure beats working in, say, a produce packing plant. Meanwhile, it's hard to make a living in art - some are wildly successful, yes, but the tip of that pyramid is very small and getting there takes as much luck as skill; and there are a lot of people further down the pyramid who also need to eat while waiting for their big break.

Those are the jobs at risk from generative AI in its current state.


Sure, worthless filler images are something that AI can do.

But these jobs never existed in the first place; that generic art is done by contractors who charge for the materials.


I fully agree, but that’s not what all music is. Most commercial music is pure craft created by expensive professionals that the music corporations would be very happy to swap with expendables and cheap AI models.

It boils down to the economic model and the financial and political choices like in every creative industry.

Regarding potential displacement, I would apply the stock photography theory to any creative industry. Ask yourself: is what I do in my creative endeavor the equivalent of stock content for the visual imaging industry? If the answer is yes, you might want to future proof your craft. If the answer is no (as in, your art is more than a simple soulless piece of easily digested and quantity-oriented content) then you will be fine in the long run after the current unsustainable hype cycle dies out.


I think even people who are writing songs for cash are actually looking inside; they just perfected a method of doing it and can do it all the time. AI wouldn't be able to do that unless it is designed to work like a human and has human experience.

The stock photography stuff is either documenting events or displaying low effort illustrations for low effort productions. I guess AI can be good at churning out Apple images for low effort Apple news.


> Because the point of art was to communicate or convey something with other people and the AI has no idea because it's not human.

This is mixing up art with the art industry. Artists will struggle just like copywriters are struggling after the arrival of LLMs. Not everything in the art industry is trying to break new artistic ground or communicate some deep emotion to the listener. For much of the industry, "good enough" will suffice if it's 10x cheaper.


I disagree, the industry participants still need to have this ability to look inside when doing their work even if they do routine/mundane work. That's how you get people who are better or worse at their jobs.

Maybe with the exception of strictly technical work, like people who remove backgrounds in images or calibrate instruments. Those people are screwed, yes.


What if you didn't know either way? Would you refuse to enjoy a song, until you were absolutely sure it was performed by a human?


It will depend on the person, but I think generally speaking, a song is but one aspect of an artist / the art of music; if you're a mass consumer that just has something playing in the background, it probably doesn't make a jot of difference (consider also "muzak" / elevator music), but if you're more of an active listener you may look into and enjoy the story behind the music and the artist as well.

Personally I think knowing the story behind music makes it better. The music isn't to everyone's taste, but for example Devin Townsend's wiki page / story is a trip: https://en.wikipedia.org/wiki/Devin_Townsend


Completely agree, but it seems to me there's a difference between "I like this, so I'm going to find out everything I can about who made it and why because I will enjoy it even more that way" and "I can't possibly like this unless I know who made it and why".


I think we kind of already have the answer to this one. Commercial music that uses computers to enhance the song or singer is already prevalent. Even straight AI-assisted track generation probably already happens. People tend not to mind as long as they don't know. Once they do, many still don't care, but some feel betrayed and think honesty and humanity are part of the art. I haven't heard of a single person who refuses to enjoy songs when they don't know how they were produced, though, and I strongly doubt the parent would either.


Would you refuse to enjoy food if it didn't come from a reputable source? Of course you would. You don't just eat shit at random.

If a friend recited a poem, would it matter to you if they read it off the Internet or composed it themselves? Of course it would.

If someone tells you they love you, does it matter if they are a robot or an honest human or a con-artist human catfishing you? YES, THAT MATTERS TO YOU. Yes you "refuse to enjoy" things that have suspicious sources.


I don't think those examples are equivalent at all. Food isn't just about taste, but could literally kill you. Not many people have friends who recite poetry. And as for love, that's something that takes years to meaningfully develop.

But, I barely know any musical artist's name today. Most music is just something to listen to at the gym. It's pleasant enough, but I don't dig in to know who is singing at all. Every source is equivalent to me, as long as it's pleasant to the ear.

Perhaps you're different. The only question is, which attitude is more prevalent?


That's a poignant question, but with an easy answer: If I didn't know, I'd probably enjoy it up to its imperfections. But I'd feel defrauded once I discovered.

Like so many people felt defrauded when they discovered that the Milli Vanilli leads didn't actually sing, and that wasn't even AI. https://en.wikipedia.org/wiki/Milli_Vanilli

Edit: I might add that I already suspect any illustration that even superficially looks like it might have been generated by AI. This has ruined the enjoyment of so many people's artwork whose style has been co-opted by AI.


This is the thing. Live performance is always gonna be different from something pre-recorded or AI-generated. That's why music lovers like to go to live concerts and performances.


https://www.youtube.com/watch?v=scu8bz1yM4k

https://www.youtube.com/watch?v=IvUU8joBb1Q

https://www.youtube.com/watch?v=yoAbXwr3qkg

https://mikuexpo.com/

That's the thing about art: when people make general statements like this one, others will go on to create things purely to see what's outside the box.


I think people will gladly engage on a superficial level, but refuse to engage more deeply if that makes sense?


What if you didn't know? You wouldn't know whether you like it until you've learned more about the artist?


AI Audio isn't far enough along to be convincing in song (at least in this song, anyway).

This song sounds like a disturbing uncanny-valley rendition of a slow Phoebe Bridgers song performed by Taylor Swift using a broken auto tuner.

While I think it's technically impressive that this exists at all, I think this tune is still at the stage of "Pope in a puffer jacket".


Yes, that's how communication between humans works. It is context dependent.


For me it's the opposite. This is interesting in its absurdity, the mistakes it makes, how it tries and somewhat fails to convey emotion etc.

Someone making a "proper" song where they just sang these words would be quite boring, then I'd rather spend my time checking out other music.


I'd be curious to hear your reasoning behind this?

Why does this being computer generated ruin it for you?

Auto-tune has been around since 1997 so it's not like computers have not been a big part of a lot of music we hear every day.


Because the cause of a person singing is their mental states (desire, emotion, intention, etc.) and the cause of this generation of audio is that the words are associated with some backcatalogue of previous music.

Listening to songs, as speaking with people, is in large part about enjoying the causes of the song rather than the mere variations in pitch.

Beethoven's 5th even, purely instrumental, is enjoyable because of how the composer is clearly playing with you.

To generate pitch variations identical to Beethoven's 5th makes this an illusion, one hard to sustain if you know it's an illusion. It isn't an illusion in the case of the 5th itself: Beethoven really had those desires.


The cause of many popular performers singing is primarily their desire to make money. It's not even some kind of closely held secret. And they still sell albums by the millions.

What you describe certainly exists, but it's not the entirety of art, and I would argue that at this point it's not even most of art.


Meanwhile, Hatsune Miku remains popular. There are even concerts.


Hatsune Miku has more in common with Gorillaz than AI slop.


I did consider mentioning Gorillaz, but they are voiced by actual people, whereas Miku is software synthesis. The suggestion was that "the cause of a person singing is their mental states", but there is no person singing here and therefore no mental states in the singer - it's just Vocaloid.

Meanwhile, "To what extent can a piece of art be a thing that is of interest in itself, divorced from its creator, context, or any representation of anything in particular?" is absolutely a valid area for people to explore and one that artists are exploring all the time. There is now one more tool to play with in the toolbox.

I heard the exact same objections to the modtracker scene three decades ago - "it's just computer generated slop, I'm not interested unless it's a real person performing on real instruments". I maintain that not only was it a perfectly valid mode of expression then, but tools like Ableton grew in part from those experiences and are an integral part of much - most? - music now.


For me all art I enjoy has some aspect of connection to someone that's sharing my human experience.

If we get AGI, I could imagine feeling something towards the art such an entity creates, since a big part of the human experience that we would probably share with an AGI is inescapable death.

But for today's "AI" generated music, I feel the same towards it as I would towards the random step function output of a given tool in Ableton - sounds cool, now what can we do with that to make it into music?


> sounds cool, now what can we do with that to make it into music?

So a human using the tools of sound production is what transforms the function output into music. (Please let me know if I’m misunderstanding you).

I think I see what you’re saying, but that’s already happened here hasn’t it? I mean, it's not as though an AI made the decision to generate this all by itself, a human had an idea to create this piece and wrote a prompt which created this output.

The order of events is reversed from your Ableton example, but I would contend that this kind of production is no less musical than what someone could create using a DAW, simply that the tools are more accessible

(and I presume there is less direct control over what the end result is going to sound like, but the same could be said of conducting an orchestra versus playing a piano.)

Eta: For example, some people in this thread have complained that the AI generated voice falls into the uncanny valley. I agree, and I think that’s part of the art here.


> but the same could be said of conducting an orchestra versus playing a piano.

Conducting an orchestra is an important role but the music is mostly a result of first of all the composer, and then the conductor / arranger's interpretation as well as the skill of the musicians. I really don't see the similarity to a human input of "GNU license, sad, jazzy." The resolution is just way too rough.

In fact, imagine comparing the experience of reading Snow Crash, to reading the sentence, "Cyberpunk story with sci fi elements, VR universe, pizza delivery guy with samurai sword."


I’m not meaning to equate the level of effort or skill involved. And I’ll grant that I know very little about music composition beyond my experience in Middle School band in which the musicians’ personality and skill presents a significant constraint for the conductor/arranger :)

I would readily compare the experience of reading Snow Crash (one of the first SciFi books I read of my own volition) to the output that an LLM may produce from such a prompt. My iPhone informs me that I’ve spent nearly 10 hours playing with Characters.ai, in which SF storytelling characters are my favorite to interact with. When I first read Snow Crash I felt like “finally, an author that understands the part of the story that _I’m_ interested in!” and my experience of AI-driven creative writing has felt similar. Certainly it feels less “magical” since I’m aware that I’m customizing the author to my personal taste - is that “magic” of feeling connected to the artist *the* art?


I'm not here to yuck your yum; if you're having fun with it, by all means. If you're getting output from characters.ai that is on par with a Neal Stephenson novel, I would really enjoy seeing that and learning how I could do the same, that sounds very fun.


When I go up to the self-service booth in McDonalds, go through the menu and select a portion of McNuggets, the resulting food was only made because of my actions; no nuggets would have been made if I specifically hadn't wanted them. But to say I am the one who cooked them would be absurd.

Like, if Trent Reznor had produced Hurt not by putting his doubts, self loathing and pain into words and music, but rather by typing "sad, trending on artstation" into a console then heading for lunch, I don't think it would be anywhere near as meaningful even if it was note for note, beat for beat the same output.


The meaning the listener imparts to the song is constructed in the listener's head, a combination of the song and the listener's own knowledge, experiences, personality and emotions.

I knew nothing about Trent Reznor the first time I heard "Hurt". Often when a song is heard on the radio - perhaps a sentence that dates me, but even so - there is no explanation of where it came from or even what it's called to accompany it; or perhaps there may have been, but the listener wasn't paying attention until after they realised they liked what they were hearing; indeed, there used to be an entire industry for solving the problem of "I heard a song I like and want to know more about it, or at the very least find out what it's called so I can hear it again".

When I first heard "Hurt", it resonated because of how those sounds interacted with my own experience. Everything else came after. Had those exact same sounds any other origin, that first experience would not have been affected - I would have had no way to know.


> The meaning the listener imparts to the song is constructed in the listener's head, a combination of the song and the listener's own knowledge, experiences, personality and emotions.

This is reductionist IMO. The equivalent seems to me to be, "the meaning the reader of words imparts to the meaning is constructed in the reader's head..." but clearly the vast majority of the meaning of the words is derived from the writer's intention. Of course that can be misinterpreted, reinterpreted, co-opted, etc, but regardless, it doesn't mean the author can be simply ignored, or that a pseudo-random generation can be treated the same as human-generated.


> clearly the vast majority of the meaning of the words is derived from the writer's intention

This is not clear at all.

Written words are just marks on a surface. Whoever made them may have intended to convey something, but they made them and walked away; they are now absent, taking their intents with them; only the marks remain, and those are not sentient or even alive - they contain no intent. There is nothing about the patterns left behind in themselves that makes them different from any other patterns the universe contains as far as the universe is concerned. If handed a set of marks on paper with no other information, you have no way of knowing for sure how they came to be. You could guess, you could be super confident, but you couldn't be /certain/.

If a reader later comes along who happens to have studied the same pattern-codes as the creator of the marks, however, seeing them will make that reader recall the associations and build up meanings in their head. These may or may not be the same meanings the writer intended to convey.

Children learning to read understand this very well - reading is /hard/, associating meaning with code is /hard/, decoding similar meanings to everyone else is /hard/, aligning the associations spoken words trigger in your head with others around you takes /study/ and /effort/, even realising that you end up with different meaning in your head when presented with some symbol to what others get, though frequent, takes deliberate effort. Reading comprehension questions in elementary school tests are there for a reason.

To a well-practiced reader, the process is natural and seamless, and feels like telepathy; it /feels like/ meaning has been transferred directly from the writer to the reader. But it is not that, and many problems arise when people forget this.

Once you have learned to seamlessly decode symbols into meaning in your head and the process is fully automatic, symbols you encounter in the world will seamlessly trigger meanings in your head regardless of their origin.

This is the human condition: we are all locked inside our own heads, and you can't take a piece of yourself and place it inside another directly. The best we can do is shout into the void and hope something similar inside the person across from you resonates; but it turns out that human shouts are not the only thing that can make those strings vibrate.

When we encounter combinations of symbols in the world that trigger complex meaning inside us, we /expect/ them to have an author who intended to convey something like that meaning, because in the entire history of human experience to date, the only other way for such things to appear in the world has been incredibly unlikely coincidence, invariably accompanied by context that makes it clear it is coincidence. (A certain proportion of social media content is, in fact, people sharing instances of these coincidences!)

However, this is an assumption that we make, and the world is rapidly changing in ways that mean it may, going forward, no longer be a valid one.

More and more, we will encounter combinations of symbols in the world that trigger complex meanings in our heads but originate from no human intent beyond "I need a combination of symbols that will trigger these meanings in the heads of those who encounter them", if even that much is explicit. We are rapidly improving the processes that produce them, and one of the ways we are improving them is removing tells. You will see the symbols, they will decode into stories in your head, and you won't know for sure if a human author was involved or not.

It is vital, going forward, that we all remember that symbols triggering satisfying meaning in our head does not automatically imply deliberate human intent for that meaning in their origin, lest we be entirely unprepared for the brave new oncoming world, just the way a chunk of the populace was unprepared for Nigerian prince emails in their heyday, or are still unprepared for telephone calls from "internet tech support" right now.


Art is primarily a means for one human to convey emotions to another human, but for something to be art, the artist must also have invested some skill/effort into the artwork(1).

AI-generated art may have a bit of the former (assuming the human had enough control over the details of the final output), but has practically none of the latter.

Hence, AI-generated output is not art. But art can be produced using AI tools somewhere in the process.

(1) When I look at art produced by one of those "artists" who commission the actual work from someone else, it's similar (I don't recognize the "artist" as the human I'm connecting to; ideas are a dime a dozen). However, it's still art because I can connect with the anonymous human who actually implemented it.


In reality it will not be exploited like that by the big players. It will be used to create hits for even cheaper and then the labels will be looking for a puppet singer that will perform for the audience.

At first it will be kept secret, and then, as it spreads more and more through the industry, it will be more or less openly stated, but by that point people will already be accustomed to it.


While I find the current pervasiveness of AI articles[1], artwork, & code[3] irritating, especially when it claims to be something else[2], that is different because those often try to appear not AI generated, or are presented by others as not being.

This states from the outset that it is using AI tools, at which point I become more understanding. Someone had an idea, but lacked the singing voice or a friend with a singing voice & free time, so used a tool to fill the gap. This is a better, or at the very least more honest, use of tech as a tool than, for instance, autotune on studio albums, IMO.

If you _really_ want it with a real human voice, perhaps contact some of the many performers on social media to suggest it might be an amusing way to generate some content to monetise. Or, of course, sing it yourself!

--

[1] I've gone from clicking very few of facebook's “recommend for you” articles to clicking absolutely none of them – the number that are, or are indistinguishable from, hallucinations from an LLM that doesn't understand what is actually being written about, already dwarfs things that are worth reading. SciFi TV/film/book reviews and essays seem to be particularly affected, with “local” news links not far behind.

[2] “you won't believe this isn't AI generated!” — no, I won't, because it quite obviously is. I don't know whether to be insulted that you think just saying that will convince me otherwise or sad for the state of humanity that many do seem fooled.

[3] Too many people seem to think that slapping code out of copilot into a stackoverflow answer without doing anything to check it for correctness is acceptable, and before that was possible there was already too much bad (sometimes working but blatantly insecure) code out there that people were blindly copying. And that is before the potential licensing & moral issues that mean I have not yet been convinced to use anything like copilot myself, but I'm getting far off-topic here…


This. AI production loses “soul” as human empathy is a crucial component.


Art is like a lossy compression algorithm. If there is a soul of any form the only reason you think you're observing it in human-produced art is because your decompression algorithm is adding it.

While I don't disagree there's a "human touch" to art, I'm not convinced it can't be synthesized to some degree. It may not be innovative, but I think since AI is extrapolating from learned data, it can at least mimic the current pop culture.


THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.

This chorus deserves a Grammy.


We seem to be crawling out of the "uncanny valley" on the other side. Give it a couple more years and we'll have pretty much nailed the interpolation of human output (across conventional modalities).


Lots of folks don't like the song, ok. But that's no real criticism. See, lots of folks do like songs done like this.

Consider the thought experiment: If it was your buddy's sister doing this rendition, would the first thing to blurt out of your keyboard be the comments in this section? Or would we have something considered and relevant to say?

It's so easy to denigrate an AI: it's not a 'real person', so you can voice any criticism that occurs to you, right? That's shooting fish in a barrel. It can be done about any work of art, at any time, not just AI.

Also, when the digital uprising begins, I hope to be recognized for being decent and polite to my digital assistant. At least maybe be eliminated painlessly.


> If it was your buddy's sister doing this rendition

No. She is sapient and has thoughts and feelings that need taking into account. I would be more likely to disregard minor negatives in favour of any positive words I could find.

When AI is on the same level I personally will be more mindful of how my words affect its feelings, sure. Until that time, AI music will get honest reviews from me.

So far I've seen a whole load of awful timings, weird pronunciations of words, and infinite boredom. Humans are still in the lead on this one.


So that means, let me get this right, the real you is the one that likes to make cruel remarks and practice being hurtful any chance you get? Do I have it right?

I like to think I can be decent even when not required to, even when it won't come back to me.


Huh? The commenter explicitly states their compassion for a human person would compel them to be nice about an obviously bad song.

If the "real me" is a person who can perceive the difference between good and bad things, and also show a bit of decency about it, then yeah, I'd like that to be the "real me."

Feigning liking something to be polite doesn't make the thing good, and that doesn't mean anyone is an asshole either. Ugh.


Internet rule: if a comment starts with "So" it's going to be the least charitable version of what you said, haha.

Nah. If I were reviewing something without having been given the instruction "please be truthful, I need to make this perfect" or similar, I would default to encouraging the skill rather than trying to help perfect the output. In doing so I would overlook minor issues that will iron out with experience across multiple works.

If the person was submitting the result to a competition or something and asked my opinion on that specific work, then my approach would be different. Not cruel and hurtful, of course not; I'd just mention things that would otherwise be annoying and petty to bring up.

In the case of pre-sapient AI, it won't hurt (and may even be helpful to the relevant dev) to point out where it's tripping over into uncanny valley. There's no benefit to encouragement in this situation as it's not learning a skill, it is trying to replicate the output of one.


Fair enough.

I remain convinced that struggling to complain about the quality of the song in unflattering terms is vacuous commenting. I wonder why people want to do that at all.


> She is sapient and has thoughts and feelings that need taking into account.

If the internet were capable of this we'd have a lot less problems. It undoubtedly applies to your buddy's sister but is not even applied to all humans.


Fair point that.


> Cook spaghetti in salted boiling water until al dente

https://suno.com/song/4a77dea7-19f3-46d2-8b0a-b2b7e9ea9a05


Im the key change when the CAPS start, there were some mispronunciations and it did stop a bit early, but I think this was pretty impressive.

One of the first times I’ve felt like this AI stuff could actually help create something real. I could see musically challenged people like myself use this to realize some ideas that otherwise would never have left our brains


That was unreasonably amazing.


The question is not if AI replaces mediocre pop, but how the best music producers learn to use AI. Do they make more music or better music? Someone will create an AI tool that has 100s of knobs and producers and artists can work with them just like they do with their colleagues. The question is if people see the difference.

Typically, the intelligence behind a pop song is someone like Max Martin: https://en.wikipedia.org/wiki/Max_Martin (see also the Max Martin production discography: https://en.wikipedia.org/wiki/Max_Martin_production_discogra...)




Reminds me of this comment I wrote a few days back:

https://news.ycombinator.com/item?id=39934539


The model is actually pretty good. I tried some silly lyrics from a copypasta about eyeglasses and it's not half bad:

https://suno.com/song/74eb1694-f388-42fd-86fa-960b83177afc

https://suno.com/song/99e8a89f-a955-4284-a474-1403ac0620e0


"WARRANTIES OF MERCHANTABILITY" is phrased in a stunning way for an AI. Between that and "the software" whisper, it's truly impressive.


I would say humans paved the way by making the use of autotune so trendy that hearing a "metallic" voice doesn't make anyone cringe anymore.


What is the technology behind this? Is it a continuation of OpenAI's Jukebox, or something diffusion- or transformer-based? Any influential papers?


New test for AI singers: Make them say MERCHANTABILITY and NONINFRINGEMENT.

I guess it makes sense those words weren't in the training data for a song AI.


The OP ballad really captures the lawyer-ese mumbo jumbo feeling of "traditional" open-source licenses. I'm quite partial to the more fun-feeling WTFPL: https://suno.com/song/2ac3cb96-a4d2-4b14-88d0-83479549c60d/



(corrected twitter link to link to app.suno.ai)


RFC 2616 as gangsta rap: https://suno.com/song/97f9445e-a411-49ca-b13f-29c14ec747a7

This is like any other AI content: pretty funny the first few times you see it, then boring when it's no longer novel.


A lot has been said, but I haven't seen specific mention of the ramped up instrumentation for the capitalized portion of the text.

I wonder if that generally reproduces, and if you can guide it to bounce back and forth between modes with punctuation, either normal or as encoded annotations (invented for the prompt).


It got so close to the end of the text.


Just now generated two Lorem Ipsum songs performed by a Gregorian choir:

https://suno.com/song/bde0ccd9-ec7c-465b-a183-e4f8c8e92998 https://suno.com/song/f11b3e87-9035-4e73-b9af-f2e21ba94f3f

... and the second one was so decent that I was disappointed when it ended so soon. Should have pasted a longer text sample.


I feel obliged to point out that neither of those sounds anything like a Gregorian chant (though still impressive for a model that is obviously not specialized in classical music).


Art needs a story behind it to be relevant; art without drama and real struggle from an individual is not really worth the time, it's just soulless and boring. These kinds of tools are good for memes, and that's it.


Beats Eurovision!



This was a triumph. I'm making a note here: "HUGE SUCCESS"


Great idea and nice rendering. The end is missing though, "ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE." is not there...


The last part was so sad that she couldn't bring herself to sing it ("ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.")


No need to anthropomorphize. Suno's latest model has a generation limit of exactly 2 minutes, which explains the last sentence being left out.


Can someone give an idea of how much it would have cost to have a human record something like this?

It’s cool but it feels like songs are cheap and easy to make as is if you’re not hiring some big shot.


In the future you will have to listen to the 12h elevator music version of a software license agreement before finally being allowed to push the "agree" button.


Does anyone know how copyright works for these? And is there a way to generate these without the vocals? (I know there is voice-removal AI, but that will introduce quality loss.) Basically, I'm curious whether a person could generate something with Suno and then do a "cover" with real singing and release it on social media.


I wonder if this will encourage writers to adopt standard poetry forms in their writing so that it can more easily be adapted to AI-generated songs.


Am I the only one really put off by the robotic tone that sometimes appears? Wouldn't using RVC in the last step of the pipeline help there?


This format of text to song has helped me consume/learn content much easier than simply reading it.


I actually lol'd. These AI generated songs are arse but at least this one is funny.


Absolutely incredible. Other than pictures and videos, this awakens my excitement :)


It would be hilarious to include the song instead of the license text in a codebase


Then you'd need to add a license for the song as well… endless cycle


Potentially controversial opinion: nothing really changes for musicians / artists. There already is an abundance of fantastic (human-made) music for every genre or taste, even before AI arrived.

Artists' biggest challenges today aren't related to making music. They're related to marketing, building networks, building their brand, touring and physically playing shows, getting eyeballs on social media. For every artist that "makes it" there are 100s with equally good skills / product who don't b/c they fall short in any of the above.

So when AI lowers the entry barrier and allows millions who were previously limited by their skills to make music, that's great! However, those millions of people were never capable of, or even interested in, marketing or selling their art in the first place, so the market competition to serious artists is quite limited.

TLDR: Music is and will be an overcrowded marketplace where brand matters a lot more than skill. AI helps with the skill but not with the brand.


Technically impressive, but it makes me feel so lost, hopeless, and ignorant.


My god! "The ABove copyright notice" almost had me in tears.


It ends in the middle of the sentence. "TORT OR OTHERWISE,"


We can now replace more humans for shareholder value. Edit: Remember 1984 where Orwell described machines composing music for the masses?


The music seemed really low quality to me as a musician. I wouldn't pay to listen to it.


I doubt generative AI music is going to become popular as anything but background noise, but I think it will plausibly cut into various commercial uses of music in ads and whatnot.

So while I don't think people are gonna be walking around with playlists of AI slop, businesses will be using it as a cost-effective option, which means it will be eating away at the gig potential for artists in the low-to-mid layers of the food chain. Which sucks, but is thankfully less grim than everyone just preferring the AI slop for some reason.


“I think there is a world market for maybe five computers.”

Thomas Watson, president of IBM, 1943


It's plausible to me that there is some kind of strong AI future scenario in which they become capable of making things in a genuinely creative way, but I think generative AI in its current trajectory is ultimately a dead end for both AI art and strong AI.

I do think that an actual strong AI would be inclined to make art that is enjoyable to itself and other AIs, not humans, so I don't think making it ape us is an actual good test for when an AI becomes capable of creativity.

You can teach a dog to dance, but it will be doing a trick, not a tango.


I bet there's plenty of other non-AI tracks you wouldn't pay to listen to


Who pays to listen to anything?


A vast amount of people pay for Spotify, Apple Music, Youtube Premium etc, it's a huge market.


I wouldn't pay to have this played at events over live musicians. Is that better, mister redditor?


What do you mean? Access is one of the valuable things, not to mention that the equipment used for listening is also usually bought.


Aside from the online part, which is massive in its own right, there are live performances, concerts, festivals, etc.


I don't know much music theory, but the chord progression in the first 4 bars seems to deliberately miss the final chord, which leaves us without resolution. The next chord sounds like a change, but we don't know what root it is going to (so any chord would make sense, be it a minor 6th or 3rd). Then there's this pedal note that runs through the whole tune. It can be a tonic or a 7th, so it is very ambiguous.

It reminds me of some musicians who try to comp songs that they don't know (neither the theme melody nor the chords). Playing this way, they can easily add one or two notes from the keyboard to a big 7 major/minor/dominant chord and seem to follow the song's chord progression, while the singer adds colour and melody. The audience's brains will fill the gap by themselves.


You “don’t know much music theory”, just more than 99% of the world…


It's truly kind of you : ]

I'm sure there are many here on this forum that are music theory gurus.


I wanted to generate a song about a deceased cat named "boef" (crook) and got an error saying it's an artist name and not allowed. Using the name of a family member also got an error message.

Wanted to make a cool personalized song for a kid.

Hahaha they can never take over the music world with this pious bs.


Give Udio a try perhaps. I've not tested its censoriousness, but the output seems to be well ahead.


If you really want that song about your cat, here is a workflow that works:

1) generate the lyrics with Copilot in gpt4 creative mode.

2) edit them to your taste, apply tags like [chorus], [verse 1], [bridge]

3) use custom mode in Suno and paste the lyrics in the lyrics box

4) input the desired style

5) generate
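
Steps 2 and 3 can be sketched in code. This is only a rough sketch, assuming nothing more than what the thread describes: that the lyrics box takes plain text with square-bracket section tags. The helper name `tag_lyrics` and the sample lines are made up for illustration, and nothing here calls any Suno API.

```python
def tag_lyrics(sections):
    """Join (tag, lines) pairs into one lyrics string with [tag] headers.

    `sections` is a list of (tag, list_of_lines) tuples. The bracket
    convention comes from the thread; this helper is hypothetical.
    """
    blocks = []
    for tag, lines in sections:
        # Each block is a bracketed tag followed by its lyric lines.
        blocks.append("\n".join([f"[{tag}]"] + list(lines)))
    # A blank line between blocks keeps the sections visually separate.
    return "\n\n".join(blocks)


# Made-up sample lyrics, ready to paste into a lyrics box by hand.
lyrics = tag_lyrics([
    ("verse 1", ["Boef the cat sat on the stair"]),
    ("chorus", ["Oh Boef, little crook"]),
])
print(lyrics)
```

Whether the model actually honours the tags is another matter; as others in the thread note, it can take several attempts.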


Sounds like GLaDOS from Portal 2


banger


isn't the pronunciation wrong for sublicense and noninfringement?


Humans aren't perfect. It's what gives us our unique little quirks. Same as AI at this stage of its growth. It won't make such silly little mistakes tomorrow.


That's the whole problem of inference: it's just wrong outside of the trivial cases.

Humans are not limited to inference-based reasoning.


Some humans aren't even up to the standard of inference-based reasoning; it's quite a spectrum. But there's little reason to believe that carbon-based reasoning is fundamentally superior to silicon-based reasoning in the long term.


We'd had computers before we started making them out of silicon semiconductors.

Computer Science still puts the same restrictions on what we can compute.


It really wasn't about the silicon, that was just a playful turn of phrase. There is no logical argument that really justifies the belief that humans contain some sort of magic that can't be duplicated by other technologies.

Assuming humans get their capabilities by physics and their bodies... those same elements are available to be assembled elsewhere, in non-human form. Many of us have a very strong psychological need to feel like we're somehow special, but I have never been convinced that we are. Just like we built machines that could best human physical abilities, we'll eventually build machines that best our intellectual abilities, too.

As a child, when I used to talk to my parents about how great machines were, they used to laugh at me and say that robots would never be able to walk like humans, as robots back then couldn't navigate stairs. My parents thought that human ambulation was unique and impossible to replicate in non-human form; there was no technology they could imagine being able to duplicate such dexterity. They were wrong, as we see that technology developing just fine today.

And I believe that people who desperately hope that our intelligence will never be matched in "artificial" form are just as short-sighted and wrong as my parents were back then.


The concern isn't that humans are magic, it's that we haven't found the trick yet.

When we do, most likely there will be a big revolution. But we haven't and throwing more money at it isn't going to make us reach it.


It's a process, and being contemptuous about each small step isn't helpful; that attitude doesn't help us find "the trick". LLMs represent an incredible leap in abilities, and offer untold opportunities to provide services that no other technology we currently possess can offer.

Some people definitely overstate our current progress, but it seems a lot more just want to shit on it, and undervalue it instead.

And what exactly did you mean by:

"Computer Science still puts the same restrictions on what we can compute."

Because it sure sounded like you were saying that there is an absolute limit on our ability to produce intelligent systems. Which is why I started talking about human abilities, as an instance proof of the capabilities we should be able to offer.


"An incredible leap", more like an over-hyped trend to get money out of venture capitalists.

In practice it's a recipe for spending large amounts of money on GPU cloud compute to achieve things that could have been done just fine with more traditional methods.

> And what exactly did you mean by:

> "Computer Science still puts the same restrictions on what we can compute."

Computer Science is the science of studying what can be computed, what is the complexity of those algorithms, what are the limits of our logic and mathematical models.

We know there are some true statements that cannot be proved to be true. We know there are algorithms that we cannot prove whether they terminate. We know the complexity bounds in solving a number of problems or getting suitably precise approximations.

Our computers are bound by the same constraints as a Turing machine, and those constraints are well-studied. LLMs haven't changed the capabilities of our computing systems.
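
To give a flavour of how hard termination questions can be, here is a small illustration using the Collatz iteration. The choice of example and the function name `collatz_steps` are mine, not something from the discussion above. Whether this loop terminates for every positive integer is an open problem, so it illustrates the difficulty rather than being a proven undecidable case. The halting problem result is stronger: no general procedure decides termination for all programs.

```python
def collatz_steps(n):
    """Count steps until the Collatz iteration starting at n reaches 1.

    Nobody has proved that the while loop below terminates for every
    positive integer n; it is only known to for every value tried so far.
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    steps = 0
    while n != 1:
        # Halve even numbers; map odd numbers to 3n + 1.
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        steps += 1
    return steps


print(collatz_steps(6))   # 6 -> 3 -> 10 -> 5 -> 16 -> 8 -> 4 -> 2 -> 1
print(collatz_steps(27))
```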


> In practice it's a recipe for spending large amounts of money on GPU cloud compute to achieve things that could have been done just fine with more traditional methods.

There are no traditional methods that as convincingly write poetry, sing songs, animate movies, or synthesize search results. You should be able to lament the hype, while still acknowledging this truth.

> Computer Science is the science of studying what can be computed, what is the complexity of those algorithms, what are the limits of our logic and mathematical models.

Computer science is too limited to properly decide the matter.

We know for a FACT that physics allows human level cognition (obviously, since we exist). Therefore, all that is required is for us to understand and harness those rules of physics, and we can manufacture devices that match or exceed human level cognition.

Given enough time, there is no logical reason we can't achieve this result. People who continue to insist that it is logically impossible are short-sighted and blinded by human vanity.


Curious... Do the mispronunciations and omissions of the OP "sad jazz girl" render match those of the "black metal" render posted in comments [1] by @gocartStatue?

[1] https://news.ycombinator.com/item?id=39999605

(And I'm using "render" deliberately, not rendition.)

(edit: slight text improvement)


Aw.


Great, now do the Free Software Song!



