Transcript
This transcript was autogenerated. To make changes, submit a PR.
Hey, hello, and welcome, everyone.
Thank you for joining me and tuning in to this talk on harmonizing code and melody.
I'm actually quite thrilled, because we're going to do something quite special today: we're going to create our own new songs, new electronica songs, in a manner of speaking.
And for this, we'll be using MusicAgent, which is actually a blend of multiple technologies that will enable us to create them.
But hey, before doing that, let me first introduce myself.
I am Jan van Wassenhoven.
I am the lead architect at Sopra Steria Benelux for the business line design and development.
Also, quite a fun fact: I am the creator of the Scrum programming language.
And by Scrum programming language, I don't mean the Scrum methodology.
Of course, it's inspired by that methodology, but it's an actual programming language where you can write your own Scrum code.
And by doing so, you can actually call yourself a Scrum master programmer.
So don't hesitate to check it out and compile your own Scrum code, of course.
As we're going to talk about music, you can of course find me on SoundCloud as well as on Spotify as Mighty John.
I've got my own Instagram channel, as well as my blog, where you can find any updates and news on MusicAgent as well.
Because, hey, we're going to talk about creating some new songs, some new music, electronic music in this case.
Now, I myself was always a big fan of listening to music.
I always wanted to create my own music, but yeah, I'm lacking those capabilities.
I've got my wife playing the organ and the piano; she knows how to sing.
I've got my son playing the drums and the guitar.
But yeah, from my end, I never got that worked out.
But there's something else I'm quite good at.
I know how to code.
I know how to develop and integrate multiple technologies all together.
And that's where the idea of MusicAgent came up.
Quick introduction to MusicAgent.
It's a homegrown project where the main coding is done in Python.
It's open source and it's fully available on GitHub.
So you can check it out and you can start creating your
own songs on your own, at home.
It's mostly based on using LLMs and image generation models.
So we'll be using the OpenAI or Anthropic APIs, and we'll be using LangChain for integration with those APIs.
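Just to make that concrete, here's a minimal sketch, and I stress it's a sketch, not MusicAgent's actual code, of what calling an LLM through LangChain looks like, assuming the langchain-openai package and an OPENAI_API_KEY in your environment:

```python
# Minimal sketch (not MusicAgent's actual code): one LLM call
# through LangChain, as described in the talk.
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0.7)

# Ask the model for a first song concept based on a short description.
response = llm.invoke("Propose a concept for an electronica song about summer.")
print(response.content)
```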
So what does it actually do?
It's capable of publishing a complete new song.
As an end result, you'll have your Sonic Pi song code.
If you choose to, you can also have a recording.
It's capable of integrating your own audio samples within your music.
It will generate an album cover, if you want to, of course, and it will even generate a full booklet with the lyrics and some documentation on your song: how the song was composed and generated.
But first, when we talk about creating a song, we have to take some things into account, because creating a song is not just about starting to sing or combining some instruments, no.
There are a lot of different phases involved in the process of creating a song.
On the one hand, we have the songwriting.
We need to come up with an initial melody and create some new lyrics.
We have to structure the song.
We have to play some instruments.
We have to arrange the instruments throughout the full play of the song.
And in the end, we need to capture that performance.
We need to make a real recording of our new song.
There's some polishing up to do, so some mixing, some remastering, where we maybe have some silent pauses that we want to get rid of.
Maybe we want to add an intro or an outro with a fade in or a fade out.
So there's also that final polishing coming up.
And in the end we need to generate the final product.
So that also means even including an album cover and some technical
details, some technical information about the creation of the song.
So how do we do that with MusicAgent?
Quite simply, we just pray to the AI god.
No, not fully like that.
It's a bit more complex.
But still, first things first: MusicAgent is actually based on a MAS, a multi-agent system.
And what does that actually mean?
A MAS is a system combining multiple autonomous agents, where each of your agents has an individual focus and an individual mission.
So, starting from a complex problem, which in this case is the creation of a song, we'll be splitting it into different parts, where every agent has its own individual focus and its own mission as part of the overall goal, and in the end they will work and interact with each other to achieve that common goal.
In the context of MusicAgent, of course, this means we'll be splitting up the different component tasks, like setting up the melody, defining the harmony, creating a rhythm, choosing the samples within the song, where each and every agent has its different task.
But by collaborating within those tasks, they will come up with a creative and cohesive composition: a new song, a new electronica song, at the end of the process.
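To give you a feel for what that means in code, here's a stripped-down illustration of two such agents handing work to each other, again assuming LangChain and OpenAI. The roles and prompts are made up for the example; the real setup lives in MusicAgent's configuration files, which we'll see later:

```python
# Illustrative sketch of the multi-agent idea: each "agent" is an
# LLM call with its own mission, and the output of one agent
# becomes the input of the next. Roles and prompts are examples,
# not MusicAgent's actual configuration.
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o-mini")

def run_agent(mission: str, task: str) -> str:
    """One autonomous agent: an individual mission plus a concrete task."""
    result = llm.invoke([
        ("system", mission),
        ("human", task),
    ])
    return result.content

concept = run_agent(
    "You are a composer. You design song concepts.",
    "Design a concept for an electronica song about a day at the beach.",
)
melody = run_agent(
    "You are a melody writer. You turn concepts into melodic ideas.",
    f"Propose a melody and rhythm for this concept:\n{concept}",
)
print(melody)
```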
By doing that, we had to overcome some different challenges; we had some technical hurdles.
The song code is being generated in Sonic Pi, and Sonic Pi itself has some limitations, so we had to counter those.
It's quite limited in terms of synths and in terms of samples that can be used, so we had to find some workarounds.
Also, the usage of the LLM is quite restricted, quite limited, and we need to set it in the right context.
But it's also all about prompting for the right answers, and about how we deal with token limitation, because by prompting for multiple answers, we end up with a broad context, a large amount of prompts being sent to those agents.
And of course, we have some limitations: some token and character limitations while calling the APIs, which we had to deal with.
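As an illustration of one way to deal with that, and I'm simplifying here, this isn't the exact strategy in the repository, you can trim the conversation to a rough budget before each call:

```python
# Illustrative sketch: keep only the most recent messages under a
# rough character budget before calling the API. The budget and
# the helper below are assumptions, not MusicAgent's real strategy.
MAX_CHARS = 12_000  # rough proxy for a model's token limit

def trim_context(messages: list[str], budget: int = MAX_CHARS) -> list[str]:
    """Drop the oldest messages until the remaining text fits the budget."""
    kept: list[str] = []
    total = 0
    for message in reversed(messages):
        if total + len(message) > budget:
            break
        kept.append(message)
        total += len(message)
    return list(reversed(kept))
```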
Then again, while talking about multiple phases, we need to chain the agents.
So how can we set the correct conversation context?
How do we make sure that the correct outcome is passed on to the next agent, so that at the end we come up with a full song?
And then finally, there's also the choice of AI model for the different agents, because depending on what they are doing, depending on the need, there's quite some difference between them.
We encountered that while starting with GPT-3.5 at the beginning, and now using GPT-4o mini or Claude 3.5 Sonnet, for instance.
There's a lot of difference in the outcome, where one is better at image generation and another one is better at coding, for instance.
So there was a lot of testing involved, but it's also one of the challenges that we needed to overcome.
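In code, that per-phase model choice can be as simple as a lookup table. The mapping below is just an example, not the shipped configuration, and it assumes the langchain-openai and langchain-anthropic packages:

```python
# Sketch of per-phase model selection: different models suit
# different tasks. The mapping is illustrative.
from langchain_openai import ChatOpenAI
from langchain_anthropic import ChatAnthropic

MODELS = {
    "concept": ChatOpenAI(model="gpt-4o-mini"),  # cheap, creative
    "sonicpi_coding": ChatAnthropic(
        model="claude-3-5-sonnet-20240620"       # stronger at code
    ),
}

def model_for(phase: str):
    """Pick the model for a phase, falling back to the concept model."""
    return MODELS.get(phase, MODELS["concept"])
```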
Let's take a look at the flow of MusicAgent, how it actually works.
From a user perspective, we interact with MusicAgent via a CLI or via a GUI, a web interface.
So you get the choice: you can either use the CLI or you can do it via the GUI.
That's your choice.
When doing so, we provide some initial input on how we want the song to look.
It could be a sentence; it could be a full description of your song.
But that's where the first phase, the first agents, come into play.
So we start with a design phase.
The design phase is all about coming up with a concept, describing the lyrics, setting the arrangements, deciding on the number of verses and choruses, whether or not we use a bridge, do we implement a solo, do we use an outro, but also which instruments will play the verse and which instruments will be used during the chorus.
And, also interesting: since we'll be creating electronic music, we want to include the possibility, the availability, of samples.
So we can have our own mashup of samples that can be used, but depending on the initial input we gave, the concept we came up with, we're going to choose, within the listing of samples, which one is the most suitable to use while creating that new song.
Wrapping that all up, that's our design phase.
Once we have this phase accomplished, we can actually start coding the song.
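To give you an idea, a design phase like that could boil down to one structured prompt. The JSON fields here are hypothetical, just to show the shape of the outcome, and I'm using OpenAI's JSON mode via LangChain to keep the output parseable:

```python
# Illustrative design-phase sketch: turn the user's description into
# a structured concept. The field names and sample list are made up.
import json
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o-mini").bind(
    response_format={"type": "json_object"}  # force valid JSON output
)

prompt = (
    "Design an electronica song from this description: "
    "'a couple of bananas at the beach'. "
    "Answer as JSON with keys: concept, structure (list of sections "
    "such as verse, chorus, bridge, outro), instruments_per_section, "
    "and samples (chosen from: drum_loop_90s, ocean_waves, vinyl_crackle)."
)
design = json.loads(llm.invoke(prompt).content)
print(design["structure"])
```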
So let's go.
We're getting to the creation phase.
Within the creation phase, one of our agents will come up with a first proposal of a song in Sonic Pi code, and then the process starts iterating over this code.
The code will be reviewed by other agents, and based on that review input, the code will be adapted by our initial agent, our Sonic Pi coder, again.
After some iterations, we even have the possibility to include human interaction from us ourselves as song creators.
So we can actually listen to the song, and by doing that, we can provide additional input in order to improve the song in the end.
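In its simplest form, that creation loop looks something like the sketch below; the prompts and the fixed two iterations are illustrative, not the actual MusicAgent code:

```python
# Minimal sketch of the creation-phase loop: a coder agent proposes
# Sonic Pi code, a reviewer critiques it, and the coder revises.
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o-mini")

song_code = llm.invoke(
    "Write Sonic Pi code for a short electronica track."
).content

for _ in range(2):  # e.g. two review cycles, as in the default setup
    review = llm.invoke(
        f"Review this Sonic Pi code for errors and musicality:\n{song_code}"
    ).content
    song_code = llm.invoke(
        f"Revise the Sonic Pi code based on this review.\n"
        f"Code:\n{song_code}\nReview:\n{review}\n"
        f"Return only the revised code."
    ).content

print(song_code)
```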
Then, finally, we come to a mastering phase, where we actually start to check: is everything sounding quite proper, are any fade-ins or fade-outs needed, do we have some silent pauses that need to be excluded?
So we're actually starting the polishing of the song.
In the end we come to the publishing phase, and the publishing phase means actually creating the recording, the album cover, as well as the full booklet of the song, meaning the full technical details with the lyrics, the initial input that was used, but also the concept that we, that our agents, came up with in the end.
This will all be stored on your local drive, together with the booklet, the album cover, as well as the actual song, the WAV recording file.
On a side note, you might have noticed something.
While discussing this chain of agents, it might have started you thinking about something else.
And that something else could be the software development life cycle.
Because in the end, we're essentially also developing new software.
We're also gathering requirements, like the conceptualization of a song.
We're also doing an analysis and coming up with a proper design of our application software.
In the second phase, we start to create the song in terms of MusicAgent, but in the software development cycle, this would mean implementing and coding it, reviewing it, testing it.
As we did for our song, it's actually quite the same in software development, where our application is being tested, being reviewed, and could be sent back to the developer to be adapted.
In the end we're going to deploy it; we're actually going to produce the software, and it will be deployed and set up in production.
It's the same as creating a full song, your WAV recording file as well as a full booklet, and it can still be adapted afterwards.
But hey, let's get back to MusicAgent and a quick glimpse at the architecture overview.
So as mentioned, we can interact via CLI or via the GUI.
There we'll be calling a bunch of Python scripts.
These scripts will actually launch our different AI agents, which will start doing the work, depending on the phase we're in, depending on which component of the song creation we're in.
And by doing so, they will work based on a couple of configuration files that tell the agents in what order they need to perform their tasks, and how they should conclude or continue based on the input, based on the different tasks they were given.
The agents themselves will interact with providers like OpenAI or Anthropic.
So they will be using those APIs in order to create a new song, or create new images, and so on.
Also, while exchanging between agents, at one point during the creation phase human interaction is possible, but before doing so, you need to listen to the song.
So in order to be able to listen to the song, we'll also be interacting with Sonic Pi, and we can do this by using OSC, Open Sound Control.
This enables you to have some playback of the song already, while your agents are working on it.
You'll get to hear the initial song proposition.
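For those curious, here's roughly what the Python side of that OSC interaction looks like, assuming the python-osc package and Sonic Pi's default incoming OSC cue port, 4560 on localhost. The cue name is my own choice for the example:

```python
# Sketch of triggering playback in Sonic Pi over OSC.
from pythonosc.udp_client import SimpleUDPClient

client = SimpleUDPClient("127.0.0.1", 4560)  # Sonic Pi's default cue port
client.send_message("/play_song", ["the_banana_song"])

# On the Sonic Pi side, a live_loop would sync on "/osc*/play_song"
# and start playing the generated song code when the cue arrives.
```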
And by the end, of course, we'll have the production of our song.
We have the actual booklet, the actual WAV file, as well as the album cover, and so on.
And as mentioned, throughout the process, because we can hear the feedback, there's always a possibility to return to the CLI, as well as the GUI, to interact with our agents and to add some additional information, or to provide some remarks on the song to our agents, to improve the final song that will be created in the end.
So in practice, what does it look like?
I mentioned there are a couple of config files that can be used to set up MusicAgent.
First of all, we have the configuration of our artist, in which we define how the album will look, what our styling is, but also the different types of assistants that will be cooperating together throughout the song creation: from an artist who will come up with a concept, to a coder that will actually write the code in Sonic Pi.
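To picture it, an artist configuration boils down to something like this. The exact keys and values here are illustrative, check the repository for the real file format:

```python
# Hypothetical sketch of an artist configuration: the artist, the
# cover art style, and the assistants that cooperate on the song.
artist_config = {
    "artist": "Mighty John",
    "style": "colorful 90s electronica album art",  # used for the cover
    "assistants": [
        {"name": "artist",        "mission": "come up with the song concept"},
        {"name": "composer",      "mission": "advise on melody and structure"},
        {"name": "songwriter",    "mission": "write the lyrics"},
        {"name": "sonicpi_coder", "mission": "write the Sonic Pi code"},
        {"name": "reviewer",      "mission": "review and correct the code"},
    ],
}
```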
Secondly, once we've got our assistants defined, once we have our artist configured, we need to define the different phases throughout the process.
For instance, we have a songwriting phase, where the song needs to be written, where we have our songwriter agent that will actually come up with the lyrics of the song.
It will be given some advice, some remarks, via the composer, another agent, which already collects information, as well as from us, during the composition, the conceptualization phase.
As you can see within the configuration as well, we will already be providing some initial input, like a theme, a melody, a rhythm, and as an outcome we'll have some lyrics as well as a structure for the song.
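And a phase definition, again with hypothetical keys just to show the idea, comes down to something like:

```python
# Hypothetical sketch of a single phase definition, mirroring the
# songwriting example: which agents talk, what goes in, what comes out.
songwriting_phase = {
    "name": "songwriting",
    "agents": ["composer", "songwriter"],
    "prompt": "Write lyrics matching the given theme, melody and rhythm.",
    "inputs": ["theme", "melody", "rhythm"],
    "outputs": ["lyrics", "song_structure"],
}
```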
That being said, we've got the assistants and we've got the description of the different phases, like for instance songwriting, but it might as well be the recording, the album generation, or the Sonic Pi coding of your song.
We need to bring them all together.
So finally, a third configuration is needed, and that's the actual sequence order of our different phases.
They can be defined in a sequential order, but sometimes we will also need to iterate multiple times over the same phase again.
One good example of this, of course, is the writing of the song: while we're writing Sonic Pi code, this code will be reviewed and will have to be modified afterwards.
So we can define ourselves the number of iterations that will be needed to go through the song and to correct it.
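That third file is essentially an ordered list with iteration counts, something like this illustrative sketch (again, the schema is an assumption):

```python
# Hypothetical sketch of the creation chain: the ordered phases,
# with an iteration count where a phase must repeat.
creation_chain = [
    {"phase": "conceptualization"},
    {"phase": "songwriting"},
    {"phase": "initial_song_coding"},
    {"phase": "song_code_review", "iterations": 2},  # review cycle
    {"phase": "publishing"},
]
```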
Here's just an example of MusicAgent in practice, using the basic flow.
A preconfigured file defines the role of the artist, the chaining, and the different phases.
There are multiple configurations possible; I will show more in the IDE afterwards.
But there's a basic setup available within MusicAgent.
So we've got the chaining.
We start from a user query prompt or from the GUI, and we go through the different phases, from coming up with a concept to actually writing the song and creating the final song.
For instance, within the concept phase, we've got our agents talking to one another: the artist talking to a composer, providing some initial input to come up with a final idea for the song, a concept of the song, and so on.
But for instance, when we go to the song code review, there are multiple agents connected to one another, where one is doing the code review of the song and another agent will start modifying and correcting the code based on the input of this code review.
And we can even go a bit further.
By adapting those different configuration files, we can extend them with additional possibilities.
For instance, we can add a human code review phase, where we define an agent to send out OSC commands, to make sure that we can have a playback of the song that the human can review while listening.
They can provide some review, and type in their remarks via the console or via the GUI, after which code modification can take place, and we can re-listen to the song until we completely agree with the final composition of the song.
Also, we can extend it with song recording.
Song recording can be done via Sonic Pi.
Again, we'll be using Open Sound Control to interact with Sonic Pi, where we can have a playback and generate an actual WAV recording, which is your actual song at the end.
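So the extension is really just adding entries to that chain, conceptually something like this (again, illustrative phase names, not the shipped file):

```python
# Hypothetical sketch of the extended chain with human review and
# recording phases added, as just described.
creation_chain = [
    {"phase": "conceptualization"},
    {"phase": "songwriting"},
    {"phase": "initial_song_coding"},
    {"phase": "song_code_review", "iterations": 2},
    {"phase": "human_code_review", "playback": True},  # listen, then comment
    {"phase": "song_recording"},   # WAV capture via Sonic Pi / OSC
    {"phase": "publishing"},
]
```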
That being said, it might be time for a little demo and an introduction to the code itself.
First of all, let's go to the project itself.
I've got the project open in IntelliJ.
It comes with a bunch of readme files; it's quite extensively explained how you can set it up locally and how you can use it on your own.
As a code base, it's quite limited.
There's a bunch of Python scripts and some configuration that can be set up, but it already comes with a default stack of configuration, so by simply installing all the libraries, you can be quickly up and running and start launching MusicAgent.
First I want to bring your attention to one particular folder, which is the agent configuration.
As I mentioned, we have multiple configuration files available that enable us to define the creation of the song and how we will chain up our agents.
So we start with the default one, the one I showed earlier on, which is Mighty John, but of course we've got the art, evaluation, and full setups that can be used as well.
Basically, it comes down to setting up the different assistants within your artist configuration file, where you can even define your style, which will be used for the cover art creation.
Then there is the music creation phases file, where we define the different phases of the creation of a song, like, for instance, songwriting, and the different segmentations, where we define which arrangement we have: one verse, two verses, a chorus.
There are the arrangements that are going to be defined, meaning which instruments will be used, the sampling phase, the initial song coding.
And you can also notice the different prompts being used throughout the process for those different agents, as well as the inputs and outputs of those prompts.
So these are the phases.
There's a third file, which is the creation chain, where we actually define the different phases to be used throughout the process.
So where's the difference between the different agent configurations?
Mighty John, the default configuration, consists of the simple song creation with some iterations on code review.
Whereas, for example, the art configuration only consists of creating cover art.
So you just bring in a concept; it will not create a song, but it will come up with cover art creation, a cover album creation.
The full configuration, for instance, involves human interaction and cover creation, but also the actual WAV recording in the end.
Next to that, in terms of setup, there's also the samples folder, where you can easily integrate your own samples that can be used throughout the process.
You can simply drag and drop your samples in, and they will be loaded throughout the process.
They will be used depending on your concept, depending on the direction you're going in; they can be introduced and used throughout the process.
Besides that, we have some setup folders.
More importantly, there are two ways of launching MusicAgent.
As I mentioned, we can do this via the CLI, simply by running python run.py.
And it's slowly starting up now.
It'll provide you with some choices: OK, which API provider do you want to use?
In this case, I will be choosing OpenAI.
It asks me the type of model and the type of agent configuration I'd like to use.
In this case, I'll use Mighty John.
And then we can go on to enter the name of the song.
But besides doing this via the CLI, you can also use the web application.
As I showed you, you can use the CLI for interacting with MusicAgent, but we can also use the web browser for interacting.
So there's a web application available with MusicAgent.
We can do completely the same as we did within the CLI.
It also comes with a song configuration again, where you can choose the API provider and the model to be used.
And of course, the agent type can be selected.
And while interacting with it, you can even still make some modifications to the configuration of your agents, as well as the assistants and genres that are included.
Let's start creating our own song then.
Let me come up with a title.
Let's call it "The Banana Song"; I'm almost in summer vibes.
For The Banana Song, I will choose electro, but again, the genres also depend on your configuration, your agent configuration.
I'll choose electro in this case.
Let me make it a song about a couple of bananas at the beach, maybe, as we're in a summer vibe mood, at the beach, drinking some cocktails.
And let's say the song is inspired by late 90s music, because hey, I'm a 90s guy and I tend to appreciate that type of music.
So that being said, we've got everything set up.
We can start generating some new music.
So there we go.
And it's done.
And you'll notice underneath, quite some setup has been done already.
You can have different panels activated.
So first of all, we can see the input parameters being passed through our MusicAgent.
And you'll notice along the way, it's getting filled in with everything provided by our agents.
On the right, we see our timeline.
This is based on the setup for the default agent type, which contains a timeline with the different phases included, like conceptualization, songwriting, segmentation, and so on.
But we also notice the iterations when it comes to song code review.
In this run, the default setup, there's no human interaction, but that's also because we want to speed things up a bit, because the iterations can take quite long.
So for the sake of the demo, we do the basic setup, the default stack, with a couple of iterations; as you notice, two cycles for code reviewing.
Proceeding down, you can also notice in MusicAgent that you can follow the agent conversations in between, how they interact with one another, where you can also see the amount of information being sent and outputted along the way.
And here we notice this cover being created, of a couple of bananas drinking a cocktail at the beach, of course.
Accordingly, you can follow the logs.
Also quite interesting: there's a view on the Sonic Pi code which is being created, where you also have the ability to play it back or send it to Sonic Pi.
If we return to IntelliJ, let's get back to our IDE.
You'll notice that in the meantime, a new folder has been created, which includes the banana song.
There we can also find our Sonic Pi song creation, the complete file with the code of the song.
It includes the album cover, as well as the readme, which is the full booklet of your song.
Let's have a quick look, if I can.
I'll take all the code and put it in Sonic Pi.
So it gives us something like this.
Let's give it a quick run.
Which actually sounds quite nice, doesn't it?
Now, just as a small remark: it doesn't always end up with a good result, because depending on the amount of iterations, depending on how your agent was set up, it sometimes comes up with something completely eclectic, or it messes up some of the chords, for instance, within Sonic Pi.
So there might sometimes be some modifications to be done, but then again, as you have the code and the full logs of your song creation, you can adapt and change it accordingly.
But the more you adapt your agent, the more detailed you make it in terms of prompts, in terms of setup, and also the chaining, and the more time you give it, the better the results you get in the end.
And also if you interfere, if you add some human interaction.
So if we go back to our IDE, back again to the agent config, let me take, for instance, the full setup, where you can notice that in this case we set up multiple code reviews, but also multiple human reviews.
On one hand we have the agents reviewing some code, as well as human reviews.
And when we take a look at the phase config, we can even set up the playback, if I can get there.
So we can include code validation as well, along the way.
As well, along the way.
that was for a quick example of the code, so if we get back to the
presentation now, to come to a conclusion.
There's a small remark I wanted to make in the end.
This is a quote from Nick Cave, whom you might know, quite a famous singer as well.
At the beginning, when ChatGPT and the first AI models came out, he got a lot of messages and mails coming in from fans, saying, OK, it can write songs just as you do.
And quite importantly, because of the amount of people asking him, he answered: it's a blood and guts business here at my desk that requires something of me to initiate a new and fresh idea, and it requires my humanness.
It's quite interesting, because although we're using AI in this case to create a new song, but also in terms of coding, in terms of helping us out while developing new software, it's quite important that we as humans keep the full perspective on the context of what we're doing.
And whether we talk about song creation, whether we talk about software
development, we are in control of our AI.
And you may notice also, while using MusicAgent, while creating a new song, it's quite important that we give feedback, that we set the right context, and that we direct our agents in the right direction.
Sometimes they come up with good results, but not every time.
And then we need to redirect them.
So it's a nice quote, and quite an important one to keep in mind.
So as a wrap-up, as a conclusion: the full code repository is available on GitHub, where you can find MusicAgent on my account.
There's also more information to be found on the blog, mightyjohn.com.
So go ahead and check that one out.
If you have some more questions after this talk, please connect with me.
Just ping me via LinkedIn, or via one of the other social platforms, of course.
But hey, let's get started, shall we?
Shouldn't we start creating some music?
Yes, we should.
OK, I invite you to check out MusicAgent and start creating your own music.
Thank you for listening, thank you for watching, and see you again at the next talk.
Thank you.