SPOKEN VOICE VARIABLE PROCESSING USING SHORT CONVOLUTIONS

in Soundwalk: Paris: Pigalle (2005)

** ORIGINAL ISSUE

This project mixes spoken vocals, ambiences and music. The decision was taken to create an original timbre for the spoken vocals. Because the final product lasts nearly one hour, this timbre must evolve slightly over time. What's more, important spoken instructions should be underlined sonically.

** SOLUTION

There are two simultaneous vocal tracks: a main one and a secondary one. The main vocal track is the "dry" track, even though it is processed with an Altiverb used as an EQ. The secondary track is processed more heavily and is used for all the effects.

Initially, the two tracks feature the same content, albeit with a different timbre. The secondary track is then edited to create different effects:
1. If both tracks are perfectly simultaneous, the main track sounds louder
2. If regions of the secondary track are manually delayed by a tiny amount (say 20 ms), the main track's timbre sounds as if it had been modified
3. If regions of the secondary track are manually delayed a bit more (say over 100 ms), they create an echo of the main track
What's more, the secondary track features quite a lot of volume automation, which constantly modifies the perceptual nature of the effects.
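
The three cases above can be reproduced numerically. The following is a minimal Python sketch, not the Pro Tools session itself: it assumes both tracks carry identical audio (in the project they differ in timbre), and the function names, the `gain` parameter (a stand-in for the volume automation) and the sample rate are illustrative assumptions. It also computes where the first comb-filter cancellation falls for a given delay, which is the usual explanation for why a very short delay is heard as a change of timbre rather than as a separate echo.

```python
def mix_with_delayed_copy(main, secondary, delay_ms, gain=1.0, sample_rate=44100):
    """Sum `main` with `secondary` shifted later by `delay_ms` and scaled by `gain`.

    Hypothetical helper: tracks are plain lists of float samples.
    """
    delay = round(delay_ms * sample_rate / 1000)  # delay in samples
    length = max(len(main), len(secondary) + delay)
    out = [0.0] * length
    for i, s in enumerate(main):
        out[i] += s
    for i, s in enumerate(secondary):
        out[i + delay] += gain * s  # delayed, gain-scaled copy
    return out


def first_comb_notch_hz(delay_ms):
    """First cancelled frequency when a signal is summed with an
    equal-level copy of itself delayed by `delay_ms`."""
    return 1000.0 / (2.0 * delay_ms)


# Case 1: no delay, the two copies simply add up (louder).
louder = mix_with_delayed_copy([1.0, 0.0, 0.0, 0.0], [1.0, 0.0, 0.0, 0.0], 0)

# Case 2: 20 ms delay, heard as coloration; first notch in Hz.
notch = first_comb_notch_hz(20)
```

Under the equal-level assumption, a 20 ms delay puts the first notch at 25 Hz, with further notches every 50 Hz above it. Lowering `gain` makes the notches shallower, which is one way to read the remark that volume automation changes the perceived nature of the effect.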

Underlining can use any of the three techniques: as long as the underlined words stand out, the desired effect is achieved.

Following is a screen capture of the vocal audio and aux tracks inside the corresponding Pro Tools session.

 

The Altiverb is used not as a reverb but as a special EQ, with very short, synthesized impulse responses.

- the "Altiverb as an EQ" on bus 1 gives the vocal's main color: a thick voice, not too heavy.
- the "Altiverb as an EQ" on bus 2 is more drastic, and gives special coloration when needed.
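
To illustrate why a convolution with a very short impulse response behaves like an EQ rather than a reverb, here is a minimal sketch in plain Python. It is not Altiverb's algorithm, and the two-tap impulse response is a made-up example, not one of the synthesized responses used in the project; `convolve` is a hypothetical helper written out by hand for clarity.

```python
def convolve(signal, impulse_response):
    """Direct (time-domain) convolution of two lists of samples."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


# Assumed example IR: a two-sample average, which acts as a gentle low-pass.
short_ir = [0.5, 0.5]

# A steady (low-frequency) signal passes through at full level...
steady = convolve([1.0, 1.0, 1.0, 1.0], short_ir)

# ...while a signal alternating at the highest frequency is cancelled
# everywhere except at the two edges of the excerpt.
alternating = convolve([1.0, -1.0, 1.0, -1.0], short_ir)
```

Because the response lasts only a few samples, it adds no audible tail: all it can do is reshape the spectrum, which is what an EQ does. A longer, denser response would start to be heard as a room.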

For instance, in purely fictional (and sexual...) scenes, bus 2 was given more importance.

Also, the regions on the track that feeds bus 2 were manually delayed (by a variable amount, 20 ms to 500 ms) so that they could be used as echoes/repeats when needed (for emphasis on a particular word).
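
The per-region delays can be pictured as a small data structure: each region keeps its original position and carries its own offset. The sketch below is only an illustration of that idea. The `Region` class, its field names and `render_delayed_regions` are hypothetical and do not correspond to any Pro Tools feature or file format.

```python
from dataclasses import dataclass


@dataclass
class Region:
    start: int       # original position on the timeline, in samples
    samples: list    # audio content of the region
    delay_ms: float  # manual offset for this region (e.g. 20 to 500)


def render_delayed_regions(regions, length, sample_rate=44100):
    """Place every region on an empty track, shifted by its own delay."""
    track = [0.0] * length
    for region in regions:
        offset = region.start + round(region.delay_ms * sample_rate / 1000)
        for i, s in enumerate(region.samples):
            if 0 <= offset + i < length:  # ignore samples past the track end
                track[offset + i] += s
    return track


# Two regions with different delays, at an assumed 1 kHz rate for readability.
bus2_track = render_delayed_regions(
    [Region(0, [1.0, 1.0], 20), Region(100, [0.5], 500)],
    length=700,
    sample_rate=1000,
)
```

Keeping the delay per region, rather than per track, is what allows one word to get a short, timbre-changing offset while another gets a clearly audible repeat.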