Video Game Music Implementation With Wwise Part 1

The very mention of game audio implementation can be rather overwhelming to some composers and sound designers.

The game audio workflow is very different from working on a film or a trailer. To put it simply, films and trailers have a linear timeline where everything goes from start to finish (left to right), but game audio is non-linear: parts can play in any order, and the music can go on for as long as the player keeps playing.

With film, you are constrained by the timeline, and every time you rewind part of the film, you get the same playback over and over again. In games, that is not the case. Everything is interactive and depends on the player’s choices, so you need to make the audio respond to the player’s actions.

Because game audio is a complex subject, this will be a series of posts in a few parts, and we will be concentrating on music only.

What we will be covering here is:

– how to prepare your music parts for implementation,

– how to set up your project inside audio middleware such as Wwise,

– importing your music parts, editing and connecting them together,

…and finally making the transitions of parts for various states in the game.

So, as you can see, there are a lot of things that need explaining here, so buckle up, here we go!


In this post, we will cover the basics of preparing your music parts for working in Wwise. It may seem like a trivial step, but trust me when I say that this is THE MOST IMPORTANT part of the whole process. This is where you will practice your audio editing skills.

Because of the way game music is implemented, you need to be very precise in your editing and settle a few things up front. In order to make the game music flow seamlessly without any pops and clicks, you need to take care of the starting points and tails of your audio clips.

Most game music is composed in parts, which are bounced to smaller clips a few bars long. For ambient music, when you are exploring a forest or some space galaxy, the clips are usually 16 bars long because it takes some time to develop the sound of all of those pads and drones.

For action music, the clips can be 8 bars long, because there are usually a lot of parts in action music and they need to change randomly to make the music more dynamic without sounding repetitive. This is the foundation of interactive music in games.

But here is the thing you need to know: you need to have one bar of pre-roll and one bar of end tail in order to make the transitions between parts smooth when they are being randomly played in Wwise. What that means is that you have 18 bars of audio for an ambient clip (1 bar of pre-roll + 16 bars of music + 1 bar of tail-out) and 10 bars for an action clip (1 bar of pre-roll + 8 bars of music + 1 bar of tail-out).
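To make the arithmetic above concrete, here is a minimal sketch that converts those bar counts into seconds and samples. The function name clip_layout, the 120 BPM tempo, the 4/4 signature (with the BPM counted in quarter notes) and the 48 kHz sample rate are all assumptions for illustration, so swap in your own values.

```python
def clip_layout(music_bars, bpm, beats_per_bar=4, sample_rate=48000,
                pre_roll_bars=1, tail_bars=1):
    """Return the length of a clip with one bar of pre-roll and one bar of tail.

    Assumes the BPM counts the same beat that beats_per_bar counts
    (quarter notes in 4/4).
    """
    seconds_per_bar = beats_per_bar * 60.0 / bpm
    total_bars = pre_roll_bars + music_bars + tail_bars
    music_start = pre_roll_bars * seconds_per_bar
    music_end = (pre_roll_bars + music_bars) * seconds_per_bar
    return {
        "total_bars": total_bars,
        "total_seconds": total_bars * seconds_per_bar,
        # Where the music proper starts and stops inside the bounced file.
        "music_start_seconds": music_start,
        "music_end_seconds": music_end,
        "music_start_sample": round(music_start * sample_rate),
        "music_end_sample": round(music_end * sample_rate),
    }


ambient = clip_layout(music_bars=16, bpm=120)
action = clip_layout(music_bars=8, bpm=120)
print(ambient["total_bars"], ambient["total_seconds"])  # 18 bars, 36.0 seconds
print(action["total_bars"], action["total_seconds"])    # 10 bars, 20.0 seconds
```

At 120 BPM in 4/4 one bar lasts two seconds, so the music proper starts two seconds into every bounced file, no matter which stem it is.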

This pre-roll is usually some sort of a whoosh sound, maybe a cymbal swell or drum roll if it’s a percussive action track, while the tail-out is just your fade-out of the clip. But not every clip needs to have a sound as a pre-roll. That’s mostly used on percussion or SFX clips.

This is where we get into stems of audio clips. Basically, for an orchestral track, you have clips of each of the instrument groups that you divided into stems. In other words, individual stems for strings, brass, woodwinds, percussion, keys, voices, solo instruments, SFX, etc.

Some developers like to save much-needed resources, so they merge stems into bigger groups (orchestra, percussion, voices, synths, etc.) or even use a full mix of the part, with all stems blended together as one clip.

But even though some of the clips don’t have an audible pre-roll sound, they should have that 1 bar of pre-roll anyway in order to be in sync with the rest of the audio clips when being played back together. All of them need to have a tail-out as well.

When editing audio clips, the best way to do it is to set your editing project to the same BPM and time signature you composed in, because your Wwise project setup will have to match. This ensures that all of your audio clips snap perfectly to the grid and are triggered properly in the Wwise audio engine. Plus, it’s good to test the clip playback inside your DAW as well, to make sure everything plays in sync.
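One way to double-check the sync after bouncing is to compare each exported file's length against the length the bar count predicts. The sketch below does that with Python's standard wave module. The function name out_of_sync_stems, the one-frame tolerance and the idea that every stem of a part has exactly the same total length are assumptions for illustration, and the wave module only reads uncompressed PCM WAV files.

```python
import wave


def out_of_sync_stems(paths, total_bars, bpm, beats_per_bar=4,
                      tolerance_frames=1):
    """Return {path: (actual_frames, expected_frames)} for mismatched files.

    total_bars includes the pre-roll and the tail-out (e.g. 18 or 10).
    """
    mismatched = {}
    for path in paths:
        with wave.open(path, "rb") as wav:
            frames = wav.getnframes()
            rate = wav.getframerate()
        # Expected length in frames at this file's own sample rate.
        expected = round(total_bars * beats_per_bar * 60.0 / bpm * rate)
        if abs(frames - expected) > tolerance_frames:
            mismatched[path] = (frames, expected)
    return mismatched
```

Calling out_of_sync_stems(list_of_wav_paths, total_bars=18, bpm=120) on the stems of one ambient part should return an empty dictionary when every file is exactly 18 bars long.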

Take a look at the screenshot above for a better explanation. If you get a feeling that your tail-ends are ending abruptly with a popping sound, try adding a small fade-out curve to them so that they fade out more naturally before you bounce all of the parts.
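You would normally draw this fade in your DAW, but the idea is simple enough to sketch in code. The example below applies a linear fade-out to the last few samples of a clip so that it ends at exactly zero instead of being cut off mid-waveform. The function name apply_fade_out and the choice of a linear curve are assumptions for illustration; a DAW will usually offer smoother curve shapes as well.

```python
def apply_fade_out(samples, fade_len):
    """Return a copy of samples with a linear fade over the last fade_len samples."""
    fade_len = min(fade_len, len(samples))
    out = list(samples)
    start = len(out) - fade_len
    for i in range(fade_len):
        # Gain steps down evenly and reaches exactly 0.0 on the final sample.
        gain = 1.0 - (i + 1) / fade_len
        out[start + i] *= gain
    return out


clip = [1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
print(apply_fade_out(clip, 2))  # [1.0, 1.0, 1.0, 1.0, 0.5, 0.0]
```

Because the last sample lands on zero, there is no sudden jump in level at the end of the file, which is what usually causes the pop.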

You can see in the example below why these pre-rolls and tail-outs matter: they let the clips transition from one to another without ending or starting abruptly.

In short, while the previous clip is fading out, the next one is fading in seamlessly, in a natural way. Notice how the transients line up with one another because of the shared BPM and time signature.


Once you have taken care of all the clip lengths, their pre-rolls and tail-ends, it’s time to export them individually and label them properly.

The naming convention of game audio assets (including music) usually depends on the developer’s audio team, so if you’re working with people who have years of experience in game audio, you will mostly be following their way of labelling the assets, but I will give an example anyway:


The m usually stands for music asset. This is how the audio designer will know that this is a music clip, not an atmosphere or a footsteps sound clip.

dungeon01 is a region or a level in the game where the music is playing. This would be the first dungeon that you’re exploring in the game. If it’s the second dungeon, it would be labelled dungeon02. This is just an example, mind you. It can be a forest, cave, star system, whatever the developers call it.

action02 means that this is the second action track that plays in that region (dungeon01 in our case). Developers usually like to trigger a few different tracks for the various encounters inside a level to get more variation.

120bpm is obviously the tempo of the track and the clip, and it’s helpful to include this in the file name because you will be syncing everything to this BPM when setting up the project in Wwise.

4-4 is the time signature of the track (4/4), and this is also helpful to have in the file name when setting up Wwise.

str stands for the stem part of the clip. This is for strings, and strings are mostly labelled str for short. It could be brs for brass or ww for woodwinds, prc for percussion, etc…

01 is for labelling each audio clip of the stem. Each track can have a couple of clips for each of the stems in order to make them play randomly and transition between each other. Think of it as a song that has an intro, verse, hook, pre-chorus, chorus, etc. This is how we name parts of our audio clips when working on games. Some like using letters like A, B, C, etc., but numbers are more common.
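Putting the pieces above together, here is a minimal sketch of a helper that builds and parses file names from those seven fields. The underscore separator, the .wav extension, the two-digit clip number and every name in the code (MusicClipName, parse_clip_name) are assumptions for illustration. Follow whatever convention your audio team actually uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MusicClipName:
    region: str        # e.g. "dungeon01"
    track: str         # e.g. "action02"
    bpm: int           # e.g. 120
    signature: str     # e.g. "4-4"
    stem: str          # e.g. "str", "brs", "ww", "prc"
    part: int          # clip number within the stem
    asset_type: str = "m"  # "m" for music

    def filename(self, ext=".wav"):
        fields = [self.asset_type, self.region, self.track,
                  f"{self.bpm}bpm", self.signature, self.stem,
                  f"{self.part:02d}"]
        # Underscore separator is an assumed convention.
        return "_".join(fields) + ext


def parse_clip_name(filename):
    """Split a file name built by MusicClipName.filename back into fields."""
    base = filename.rsplit(".", 1)[0]
    fields = base.split("_")
    if len(fields) != 7 or not fields[3].endswith("bpm"):
        raise ValueError(f"unexpected clip name: {filename!r}")
    asset_type, region, track, bpm, signature, stem, part = fields
    return MusicClipName(region, track, int(bpm[:-3]), signature, stem,
                         int(part), asset_type)


clip = MusicClipName("dungeon01", "action02", 120, "4-4", "str", 1)
print(clip.filename())  # m_dungeon01_action02_120bpm_4-4_str_01.wav
```

Keeping the tempo and time signature as separate fields means a script can read them straight back out of the file name later, for example to feed a length check on the bounced clips.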

And that’s it regarding the preparation of your audio clips, naming and exporting them for implementation inside game audio middleware.

Join us in our next post as we continue this series by going deeper into game audio implementation and setting up our project in Wwise.