Create Sound Synchronization Magic in Flash, Part 1

For basic sound synchronization needs, Flash offers two sound settings that give its users a choice in how to match audio and graphics sequences. “Event” sounds are triggered when the Flash playback head comes in contact with a frame in which the sound is placed. This is useful for short sounds and simple sound effects, for example, a ball-bounce sound placed in the frame where a bouncing ball hits the floor. Event sounds must be fully loaded before they can play. “Stream” sounds allow Flash to synchronize sound more like video software. When an animation soundtrack is set to stream, Flash will drop frames to keep the animation in sync with the sound. Stream sounds, as the name implies, can also play during download.

These options offer fine ways of handling sounds when basic running synchronization is needed, but what about situations requiring more control over audio sync? For example, what if you want to use programming to synchronize sound with animation, such as syncing a voice-over presentation with accompanying visuals, using ActionScript instead of the timeline?

A common method of accomplishing this goal is to break the audio portion into small chunks and import them into separate frame spans that correspond to changes in the visuals. Using Flash MX or later, it is also possible to use the onSoundComplete event handler to execute code when the sound finishes playing. However, chopping sounds into small pieces this way is tedious, and requires a lot of management.

While this, and similar, methods offer solutions to some of the synchronization issues that face developers, none are particularly good at handling rapid events (many changes per second). Perhaps more importantly, none of the previously discussed methods allow for more advanced audio sync techniques such as linking assets to a sound’s volume, or its bass, midrange, and treble frequencies.

New Sync Possibilities
Taking that idea a step further, what if you could pre-process a sound and get back a simple array of amplitude (volume) values, or even up to 16 bands of frequency data, to react to? If this array included a value for every frame per second that your Flash movie used, a simple enterFrame event could retrieve each corresponding value, allowing you to manipulate your file in exact sync with the sound’s sonic characteristics. That’s exactly what I’m going to show you how to do in this article.

The ease with which this can be accomplished makes it more feasible for designers with little scripting experience to code dynamic animations driven by sound. Another, equally important idea, is that these same techniques give programmers who never considered themselves to be designers a chance to create simple but compelling sound-based designs.

Just about any user can create beautiful sound visualizations simply by scaling shapes, or changing their transparency to the beat of a soundtrack. Remember the speaker cabinets in the ’70s with lights that throbbed to the music? Think of how much easier it could be to bring games or linear animations to life by driving assets with sound rather than additional programming.

Remember that electronic game from the ’70s called Simon? The programming beauty of Simon is that a reasonably limitless sequence of tones/shapes could be created just by adding a random number to each round. What if that process of creating the tones/shapes weren’t random, but instead were linked to pre-recorded music? A musical sequence with four discrete frequencies is played in any pitch the designer chooses. The pre-analyzed sound triggers the shape changes and the user is left to imitate the sequence. Adding more frequencies to the mix (up to 16) gets you close to a more sophisticated game such as the Konami arcade game Dance Dance Revolution.

A much simpler, but still impressive, example might be something similar to the popular Nintendo game, Donkey Konga. The drum sounds, and synchronized visual cues, can be triggered with a very few lines of script simply by pre-analyzing a stereo drum track. A drum in the left channel corresponds to the left on-screen drum and a drum in the right channel corresponds to the right on-screen drum. Simply by playing the songs, the audio and visual cues will be triggered, and the user has a sequence that must be matched.

Author’s Note: To follow along with this article, download the source code, including low-resolution audio files. If you want to try to analyze the sounds used herein yourself, you may also want to grab the high resolution audio files in an optional separate download.

Enter FlashAmp
Okay, okay, enough examples. All of these ideas and more can be realized with a powerful tool called FlashAmp, by Marmalade Multimedia. It allows developers to pre-process audio files to get an array of the sound’s amplitudes, or an array of the sound’s frequencies divided into 2 to 16 ranges, or bands.

The FlashAmp application is very easy to use. Its power is surprising considering its small number of settings, which I’ll discuss in detail later on. For now, it’s only important to know that FlashAmp works by analyzing a sound and storing amplitude or frequency data at regular intervals. Because you can determine the intervals, choosing an interval that matches your Flash file’s frame rate, such as 12 fps, makes synchronizing to sound playback a simple prospect.
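The arithmetic behind interval matching is worth making explicit: if the analysis interval equals the movie's frame rate, you get exactly one data value per movie frame. Here is a minimal sketch in JavaScript (ActionScript 1 shares its ECMAScript syntax); the `sampleCount` helper is my own illustration, not part of FlashAmp:

```javascript
// Number of analysis values FlashAmp produces when the
// measurement interval matches the movie frame rate.
// Illustrative helper only -- not a FlashAmp API.
function sampleCount(durationSeconds, framesPerSecond) {
  // one amplitude (or spectrum) value per movie frame
  return Math.ceil(durationSeconds * framesPerSecond);
}

// A 2-second sound at Flash's default 12 fps gives 24 values,
// one for each frame the sound plays through.
sampleCount(2, 12); // 24
```

This is why a two-second sound analyzed at 12 fps yields a 24-element array, as in the amplitude example later in this article.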

Similarly, you can choose the scale to which the data will conform. For example, if you wanted to change a MovieClip’s transparency with a sound’s amplitude, you can choose a scale of 100. This scale results in values from 0 to 100, perfect for assigning a value to the _alpha (transparency) property. If, instead, you wanted to send a MovieClip to a particular frame, a scale of 10 might be more practical.
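The scaling itself is a simple linear mapping. As a sketch (plain JavaScript again; `scaleAmplitude` is a hypothetical name used for illustration, not how FlashAmp exposes this internally), a normalized amplitude between 0.0 and 1.0 maps onto whatever scale you choose:

```javascript
// Map a normalized amplitude (0.0 - 1.0) onto a chosen scale,
// e.g. 100 for the _alpha property or 10 for frame numbers.
// Hypothetical helper for illustration only.
function scaleAmplitude(normalized, scale) {
  return Math.round(normalized * scale);
}

scaleAmplitude(0.5, 100); // 50 -- half volume on an alpha scale
scaleAmplitude(0.5, 10);  // 5  -- half volume on a frame scale
```

Choosing the scale up front, in FlashAmp, means no per-frame math is needed in your ActionScript.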

Amplitude Analysis
In the “amplitude.fla” example source file (included in the download with this article), I want to show a speaker that changes size to the beat of the music. When the music is very quiet, the speaker will be approximately normal size, whereas when the music is very loud, the speaker will be 50% bigger.

Figure 1. The amplitude source file example uses a speaker that throbs (gets larger and smaller) to the beat.

To accomplish this, I need to use FlashAmp to measure the sound’s amplitude as many times per second as my Flash file’s frame rate requires (12 times per second, in the case of Flash’s default frame rate), scaling the data to a range of 0-50.

Creating this data requires little more knowledge than what I’ve already discussed. Working in FlashAmp, I need to create an amplitude list, using a scale of 50, and a frame rate of 12. I am only displaying one speaker, so I don’t need stereo data (a value for both the left channel and the right channel at every interval), and I don’t need to worry about any other settings during this first FlashAmp use.

After choosing the mono MP3 sound file using the Input pane, and copying the data from the Output pane after processing, we end up with this array:

amplitude = [0, 23, 35, 49, 45, 27, 21, 30, 11, 0, 0, 3, 33, 12, 29, 45, 35, 38, 24, 2, 0, 22, 16, 0] 

As you can see, there are 24 amplitude values (2 seconds x 12 values per second) and each value ranges from 0 to 50. Making a picture of a speaker throb to this music would be as simple as adding the above scale values to its existing scale of 100. That is, when no sound is coming out of the speaker it will be normal size (100% + 0%), but when the amplitude is maxed, the speaker’s size will increase to 150% (100% + 50%, see Figure 1).

Author’s Note: Notice that the array index is not just _root._currentframe. This is because when a sound starts, you want the first value in the array to be used. Because ActionScript uses zero-based arrays, the first array index is zero. Thus, if you start playing a sound in frame 1, you must subtract 1 from the _currentframe property. This technique changes when you want to use external sounds, and I’ll elaborate on this subject later.
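The offset arithmetic in that note can be sketched in a line of JavaScript. The `frameToIndex` name and the second parameter are my own generalization (the article only covers sounds starting in frame 1 so far):

```javascript
// Convert the playhead's 1-based frame number to a zero-based
// array index. When the sound starts in frame 1, this is just
// currentFrame - 1, matching the scripts in this article.
// Illustrative helper; not part of FlashAmp or Flash itself.
function frameToIndex(currentFrame, soundStartFrame) {
  return currentFrame - soundStartFrame;
}

frameToIndex(1, 1); // 0 -- frame 1 reads the first array value
frameToIndex(5, 1); // 4
```

The same formula extends naturally to sounds that start later on the timeline, a case covered in part 2.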

The script on the speaker would be as simple as the code below.

onClipEvent (enterFrame) {
    this._xscale = 100 + amplitude[_root._currentframe - 1];
    this._yscale = 100 + amplitude[_root._currentframe - 1];
}

Spectrum Analysis
You can use the sound’s frequency data similarly. The frequency data of a sound is usually referred to as spectral data, or a look at the frequency spectrum of the file. FlashAmp can output this data in your choice of 2 to 16 ranges, or bands. You’ve probably seen this represented in an audio equalizer in a stereo or in a software application like iTunes. Equalizers allow you to boost the low frequencies, temper the high frequencies, or anything in between. FlashAmp doesn’t modify the sound like an equalizer, but it does show you the frequency data at any point during playback.

The principle is the same as amplitude analysis with one difference. Instead of returning a single value at each interval, it returns a value for each frequency band requested at each interval. Here is an example of a six-band spectrum analysis, for a 2-second sound, using a scale of 100, at a rate of 1 fps. (The very short sound and very slow frame rate are both impractical for this purpose, but it simplifies the example.)

spectrum = [[0, 0, 2, 10, 10, 1], [45, 53, 30, 38, 22, 2]] 
Figure 2. The spectrum source code example uses a six-band frequency visualization that also throbs to the beat, six values per interval.

As you can see, there are two array values (2 seconds x 1 value per second), six frequency bands per value (low frequencies first, moving on to higher frequencies), and each frequency value ranges from 0 to 100. If you imagine an equalizer visualization, this might represent a bass hit. The lowest frequency jumps from 0 to 45, the midrange bands each move between 12 and 53 units, and the highest frequency only changes from 1 to 2 (see Figure 2).
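Reading this nested structure is a two-step index: first the interval, then the band. Using the exact array from the example above (a plain JavaScript sketch; variable names are my own):

```javascript
// The spectrum array from the example: two intervals,
// six frequency bands each, lowest frequency first.
var spectrum = [[0, 0, 2, 10, 10, 1], [45, 53, 30, 38, 22, 2]];

// spectrum[interval][band]
var lowestAtSecondInterval  = spectrum[1][0]; // 45 -- the bass hit
var highestAtSecondInterval = spectrum[1][5]; // 2  -- barely moves
```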

Making a six-band peak meter to show this data would be as simple as setting each of six MovieClips (in this case called “bar0”, “bar1”, etc.) to the scale specified in the array. The MovieClip’s simple script is shown below. This can be seen in the “spectrum.fla” source file, included in the download with this article.

onClipEvent (enterFrame) {
    // determine frame offset outside of loop
    // to avoid recalculating 6 times
    myFrame = _root._currentframe - 1;
    // change 6 MovieClips for each enterFrame
    for (i = 0; i < 6; i++) {
        // identify each peak meter 'bar'
        myMC = eval("bar" + i);
        // set its vertical scale to the array value
        // at the index matching the sound frame
        myMC._yscale = spectrum[myFrame][i];
    }
}

Lip Sync by Jumping to Frames
For one final example, I want to look at a great lip sync technique, using a sound's amplitude. In this article, I will not examine the high quality, labor-intensive approach of matching phonetic speech sounds with specific mouth positions. For a more practical approach, suited to average needs, I will show how to play a "mouth animation" while text is being spoken. While it is not possible to match a mouth position to phonetics using FlashAmp, it is a good idea to vary the mouth positions beyond just how wide the mouth is open. This prevents your animated characters from having flapping jaws like the Team America marionettes. The example source file, "lipsync.fla," uses the 11-frame mouth animation shown in Figure 3.

Figure 3. Easy lip synchronization is achieved when this 11-frame sequence of mouth positions is manipulated by a sound's amplitude values.

Traditional lip sync methods call for breaking the audio file into sections of speech and pauses, playing the sequential or, perhaps randomized, animation during speech, stopping the animation during pauses. However, in addition to being tedious, this approach makes changes difficult. Often, changes require that entire passages be reanimated to match new audio.

Using FlashAmp, however, you can use a sound's amplitude to display different mouth positions, thus making a character speak only when the voice is loud enough to hear. Therefore, you could play a single sound as long as you needed, with one simple script that sent the mouth animation to each frame indicated by the array. To accomplish this, we'll use a scale of 10, instead of 100, because we want to send the "mouth" MovieClip to one of the positions pictured above. This example array again uses a demonstration two-second sound at Flash's default 12 fps:

amplitude = [2, 6, 6, 4, 5, 3, 1, 5, 4, 5, 3, 1, 0, 0, 0, 0, 3, 3, 3, 2, 4, 3, 3, 3] 

You can see that various mouth positions are being displayed, based on how loud the character is speaking, as the sound progresses. You can even see a pause between phrases, indicated by the four consecutive zeros toward the center of the array.

The script on the mouth MovieClip in frame 1 would read something like this:

onClipEvent (enterFrame) {
    myFrame = _root.amplitude[_root._currentframe - 1] + 1;
    this.gotoAndStop(myFrame);
}

While we still use our standard frame offset compensation to get the correct index value for the array (in this case, subtracting one because the sound starts in frame one), you may notice an additional "+ 1" at the end of the first line of script. This is because using a scale of 10 to create our array could result in values from 0 to 10, and we're working with frame numbers where there is no frame 0. By adding 1 to the frame number value, a minimum value of 0 will yield frame 1, and a maximum value of 10 will yield frame 11. This gives you more control than if you were to allow Flash to automatically default to frame 1 if you specify frame 0, or to default to the last frame of a clip if you specify a higher frame number.
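That value-to-frame mapping, including an explicit clamp rather than relying on Flash's defaulting behavior, can be sketched in JavaScript (the `amplitudeToFrame` helper is hypothetical, written only to mirror the "+ 1" logic above):

```javascript
// Map an amplitude value on a 0-10 scale to a frame number
// in an 11-frame mouth animation (Flash frames are 1-based).
// Illustrative only; mirrors the "+ 1" in the clip script.
function amplitudeToFrame(value) {
  // clamp to the valid 0-10 range before offsetting, instead
  // of letting gotoAndStop silently substitute a frame
  var clamped = Math.max(0, Math.min(10, value));
  return clamped + 1;
}

amplitudeToFrame(0);  // 1  -- silence shows the closed mouth
amplitudeToFrame(10); // 11 -- full volume shows the widest frame
```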

In part 1 of this article, I showed you how to use FlashAmp to create a simple amplitude array, and how to use amplitude and spectrum data in a variety of ways within Flash. The source code that accompanies this article includes amplitude and frequency visualizations, as well as an amplitude lip sync example. Next month, I'll discuss the remaining FlashAmp settings, how to use external as well as internal sounds in your files, how to access the FlashAmp arrays properly even when your sound playback doesn't begin in frame 1, and a few tips and tricks for optimizing your data.

