An implementation of the Schroeder complementary comb filters as a Csound UDO is shown below:
opcode schroeder, aa, ak
  ain, kdelay xin
  imaxdel = 30
  kdelay limit kdelay, 0.001, imaxdel
  adelay1 vdelay3 ain, kdelay, imaxdel
  adelay2 vdelay3 adelay1, kdelay, imaxdel
  aleft = -ain + (2 * adelay1) - adelay2
  aright = ain + (2 * adelay1) + adelay2
  xout aleft*0.5, aright*0.5
endop
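The complementary behaviour of the two outputs can be checked numerically. The following is a minimal Python sketch, not part of the original examples: it assumes a whole-sample delay instead of the interpolated millisecond delay of `vdelay3`, and the helper names `delay_line`, `schroeder_pair` and `magnitude` are hypothetical.

```python
import cmath
import math

def delay_line(sig, d):
    """Delay sig by d whole samples, keeping the original length."""
    return ([0.0] * d + list(sig))[:len(sig)]

def schroeder_pair(x, delay):
    """Mirror the UDO arithmetic with an integer delay in samples (assumption)."""
    d1 = delay_line(x, delay)
    d2 = delay_line(d1, delay)
    left = [0.5 * (-a + 2.0 * b - c) for a, b, c in zip(x, d1, d2)]
    right = [0.5 * (a + 2.0 * b + c) for a, b, c in zip(x, d1, d2)]
    return left, right

def magnitude(ir, w):
    """Magnitude of the frequency response of impulse response ir at w rad/sample."""
    return abs(sum(h * cmath.exp(-1j * w * n) for n, h in enumerate(ir)))

# Impulse responses of both channels for a 3-sample delay
impulse = [1.0] + [0.0] * 15
left_ir, right_ir = schroeder_pair(impulse, 3)

# The mono sum is only the once-delayed input, scaled by 2
mono_ir = [l + r for l, r in zip(left_ir, right_ir)]

# Where one channel has a peak the other has a notch
for w in (0.0, math.pi / 6, math.pi / 3):
    print(round(magnitude(left_ir, w), 3), round(magnitude(right_ir, w), 3))
```

In this sketch the two magnitude responses add up to a constant at every frequency, which is what makes the pair complementary, and the mono downmix contains no comb filtering at all.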
These techniques have two significant drawbacks, which explain why critical listeners do not favour them. Both produce strong coloration, so the signal sounds noticeably different from the original, and both introduce phasiness because of the sharp comb filters. Consequently, they are rarely used nowadays: although the effect is dramatic and noticeable, it becomes tiring to listen to after a while.
Bauer proposed using all-pass networks to produce a phase difference between the two channels [3]. He proposed this technique for loudspeaker reproduction, so his report that the spatial impression was notably affected does not seem to contradict Schroeder's findings for headphones, since the acoustic summation of both signals in the air also filters the amplitude. He does note, however, that more reverberant rooms tend to lessen the effect.
Orban proposed an enhancement to this technique that combines the all-pass networks and uses a gain control to set the amount of image spread [4]. His method is shown in Figure 4. Orban suggests setting N = 4 and M = 2, 3 or 4.
An all-pass filter can be produced in Csound using the following UDO:
opcode AllPass, a, a
  ain xin
  ; Generate stable poles
  irad exprand 0.1
  irad = 0.99 - irad
  irad limit irad, 0, 0.99
  ;iang random -$M_PI, $M_PI
  iang random -$M_PI, $M_PI
  ireal = irad * cos(iang)
  iimag = irad * sin(iang)
  print irad, iang, ireal, iimag
  ; Generate coefficients from poles
  ia2 = (ireal * ireal) + (iimag * iimag)
  ia1 = -2*ireal
  ia0 = 1
  ib0 = ia2
  ib1 = ia1
  ib2 = ia0
  printf_i "ia0 = %.8f ia1 = %.8f ia2= %.8f\n", 1, ia0, ia1, ia2
  aout biquad ain, ib0, ib1, ib2, ia0, ia1, ia2
  xout aout
endop
This UDO produces the coefficients for an IIR biquad filter by randomly generating a stable complex-conjugate pole pair inside the unit circle. The numerator coefficients are the denominator coefficients in reverse order, which guarantees an all-pass response. This UDO in particular produces a 2-pole all-pass filter.
This UDO can then be used to realize the Orban method as shown below (including selection of the number of poles for the second filter):
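The all-pass property of these coefficients can be verified outside Csound. The sketch below is an illustration only: `allpass_biquad` and `response` are hypothetical helper names, and the random draw uses an exponential distribution with a mean of 0.1, which is an assumption about how `exprand 0.1` behaves.

```python
import cmath
import math
import random

def allpass_biquad(radius, angle):
    """Return (b, a) for a 2-pole all-pass with poles at radius*exp(+-j*angle)."""
    real = radius * math.cos(angle)
    a = (1.0, -2.0 * real, radius * radius)   # a0, a1, a2
    b = a[::-1]                               # numerator is the reversed denominator
    return b, a

def response(b, a, w):
    """Complex frequency response of the biquad at w rad/sample."""
    z1 = cmath.exp(-1j * w)
    num = b[0] + b[1] * z1 + b[2] * z1 * z1
    den = a[0] + a[1] * z1 + a[2] * z1 * z1
    return num / den

# Random stable pole pair, loosely following the UDO (distribution is an assumption)
rng = random.Random(1)
radius = min(max(0.99 - rng.expovariate(10.0), 0.0), 0.99)
angle = rng.uniform(-math.pi, math.pi)
b, a = allpass_biquad(radius, angle)

# Magnitude stays at 1 while the phase changes with frequency
for w in (0.1, 1.0, 2.0, 3.0):
    h = response(b, a, w)
    print(round(abs(h), 6), round(cmath.phase(h), 3))
```

Because the pole radius is kept below 1 the filter is stable, and only the phase response depends on the random draw.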
opcode Orban, aa, akk
  ain, kwidth, kmode xin
  ;; Cascade two filters to create a 4-pole all-pass
  a1 AllPass ain
  a1 AllPass a1
  a2 AllPass ain
  if kmode == 1 then
    a2 AllPass a2
  endif
  aout1 = a1*kwidth + a2
  aout2 = - a1*kwidth + a2
  xout aout1, aout2
endop
It is important to note that since the filter coefficients are random, there are infinitely many different filters which could be used in combination. Usually, the best combination must be determined by trial and error as different filters will have different effects on different material.
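The sum and difference structure of the Orban UDO can be examined in the frequency domain. The following Python sketch is illustrative only: the pole values are arbitrary placeholders rather than values from the article, and `allpass_chain` and `orban_responses` are hypothetical names.

```python
import cmath
import math

def allpass_chain(poles, w):
    """Response at w rad/sample of cascaded 2-pole all-pass sections.

    poles is a list of (radius, angle) pairs, one per section.
    """
    z1 = cmath.exp(-1j * w)
    h = 1.0 + 0j
    for radius, angle in poles:
        a1 = -2.0 * radius * math.cos(angle)
        a2 = radius * radius
        h *= (a2 + a1 * z1 + z1 * z1) / (1.0 + a1 * z1 + a2 * z1 * z1)
    return h

def orban_responses(w, width, poles1, poles2):
    """Channel responses following aout1 = a1*kwidth + a2, aout2 = -a1*kwidth + a2."""
    h1 = allpass_chain(poles1, w)
    h2 = allpass_chain(poles2, w)
    return width * h1 + h2, -width * h1 + h2

# Placeholder poles: a 4-pole network and a 2-pole network
poles_a = [(0.90, 0.4), (0.85, 1.7)]
poles_b = [(0.95, 2.3)]

for w in (0.2, 0.8, 1.6, 2.4):
    out1, out2 = orban_responses(w, 0.7, poles_a, poles_b)
    print(round(abs(out1), 3), round(abs(out2), 3), round(abs(out1 + out2), 3))
```

In this sketch the mono sum always has a magnitude of 2, since the first all-pass path cancels, while the two channel magnitudes move in opposite directions depending on the phase difference between the networks. The width gain scales how far they move apart.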
Gerzon proposed a further refinement which allows very precise control over the characteristics of the complementary filters [5]. It uses two identical all-pass filters as the basis for constructing comb filters, as shown in Figure 5. Like the Orban method, the spread can be controlled using the gain on one of the paths to each channel. Note that this method is no longer completely mono-compatible, although it can produce fewer artifacts than the previous one.
The method can be realized in Csound using the UDO below, which has a mode parameter to select the number of poles of the all-pass filters by chaining two-pole all-pass filters:
opcode Gerzon, aa, akk
  ain, kwidth,kmode xin
  a1 AllPass ain
  a2 AllPass a1
  if kmode > 0 then
    a1 AllPass a1
    a2 AllPass a1
  endif
  if kmode > 1 then
    a1 AllPass a1
    a2 AllPass a1
  endif
  if kmode > 2 then
    a1 AllPass a1
    a2 AllPass a1
  endif
  if kmode > 3 then
    a1 AllPass a1
    a2 AllPass a1
  endif
  if kmode > 4 then
    a1 AllPass a1
    a2 AllPass a1
  endif
  if kmode > 5 then
    a1 AllPass a1
    a2 AllPass a1
  endif
  if kmode > 6 then
    a1 AllPass a1
    a2 AllPass a1
  endif
  aout1 = ain*kwidth + a1
  aout2 = -a2*kwidth + a1
  xout aout1, aout2
endop
Gerzon mentions that one of the purposes of this technique is to do "simulation of the size of a sound source" [5]. Both Schroeder and Gerzon envisioned this process as one that could also be applied separately to the individual mono components of a stereo mix.
In essence, as Gauthier states [6], any combination of inverted filtering can serve as pseudo-stereo filters for artistic purposes. This could be done with either time-domain or frequency-domain filtering.
Artificial Double Tracking (ADT) is a recording technique developed to generate a second channel of lead vocal from one take, using a second synchronized tape machine. The second machine would have a small delay and an amount of wow and flutter, effectively modulating the signal. The original signal is sent to one channel, while the second signal goes to the other. The variations in frequency between both signals create a modulated comb filter, which in turn produces variations in IACC, perceived as spatial properties if the modulation is small enough. The wow and flutter can be modeled as simple periodic modulation, or using some form of interpolated jitter:
opcode modulate, a, akkkki
  ain, kfreq, kamp, ktype, kdelay, imaxtime xin
  if ktype == 0 then
    amod jspline kamp, kfreq, kfreq
  elseif ktype == 1 then
    amod poscil kamp, kfreq, 1
  elseif ktype == 2 then
    amod poscil kamp, kfreq, 2
  endif
  aout vdelay3 ain, kdelay *(1.00001 + amod), imaxtime
  xout aout
endop
The processing would then look like:
aflutter modulate gainput, gkflutter_freq, gkflutter_amt, gkflutter_type, gkdelay_adt, 100
awow modulate aflutter, gkwow_freq, gkwow_amt, gkwow_type, gkdelay_adt, 100
aout1 = gainput*gklevel_adt
aout2 = awow*gklevel_adt
A variation of this technique, implemented by hbasm's Stereoizer plugin, adds a chorused signal to the dry signal in one channel and subtracts it from the dry signal in the other. A mono downmix therefore yields the original signal, while the chorusing is heard on each individual channel. These techniques are pictured in Figure 6.
aflutter modulate gainput, gkflutter_freq, gkflutter_amt, gkflutter_type, gkdelay_adt, 100
aout1 = (gainput + (aflutter*gkwidth_adt) )*gklevel_adt
aout2 = (gainput - (aflutter*gkwidth_adt) )*gklevel_adt
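The modulated delay behind both ADT variants can be sketched in Python as follows. This is a simplified illustration under stated assumptions: the delay is given in samples rather than milliseconds, the modulator is a plain sine instead of `jspline` or a table oscillator, and linear interpolation stands in for the cubic interpolation of `vdelay3`. The names `modulated_delay` and `mono_compatible_mix` are hypothetical.

```python
import math

def modulated_delay(x, sr, base_delay, depth, rate_hz):
    """Delay x by base_delay * (1 + depth * sin(2*pi*rate_hz*t)) samples."""
    out = []
    for n in range(len(x)):
        lfo = math.sin(2.0 * math.pi * rate_hz * n / sr)
        delay = base_delay * (1.0 + depth * lfo)
        pos = n - delay                      # fractional read position
        i = math.floor(pos)
        frac = pos - i
        s0 = x[i] if 0 <= i < len(x) else 0.0
        s1 = x[i + 1] if 0 <= i + 1 < len(x) else 0.0
        out.append((1.0 - frac) * s0 + frac * s1)
    return out

def mono_compatible_mix(dry, wet, width, level=1.0):
    """Add the modulated signal to one channel and subtract it from the other."""
    left = [level * (d + width * w) for d, w in zip(dry, wet)]
    right = [level * (d - width * w) for d, w in zip(dry, wet)]
    return left, right

sr = 1000
dry = [math.sin(2.0 * math.pi * 50.0 * n / sr) for n in range(200)]

# Plain ADT: dry signal on one channel, modulated copy on the other
wet = modulated_delay(dry, sr, 20.0, 0.05, 3.0)

# Mono-compatible variant: the modulated copy cancels in the mono sum
left, right = mono_compatible_mix(dry, wet, 0.5)
```

Summing `left` and `right` returns twice the dry signal in this sketch, which matches the mono-compatibility of the second variant.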
Artificial decorrelation is defined by Kendall as the "process whereby an audio source signal is transformed into multiple output signals with waveforms that appear different to each other but which sound the same as the source" [7]. In the same article, Kendall proposes a method to spread a monophonic source over multiple loudspeakers using artificial decorrelation. He proposed using FIR filters derived from the inverse FFT of a spectrum which is flat in magnitude, but whose phases are randomized between +π and -π. Additionally, the amount of decorrelation can be reduced by mixing the phase vectors before doing the IFFT. Alternatively, instead of designing a filter this way, a set of coefficients which can provide decorrelation can be obtained from maximum length sequences (MLS) or Golay code sequences.
In this technique there is a trade-off between transient preservation and the representation of low frequencies. To produce an adequate frequency response these filters must have at least 1024 points, which might smear transients for some material. Another issue, as Kendall notes, is that the method has side effects: although the points of the calculated filter lie at an amplitude of 1, the points in between (when the filter is upsampled, or during digital-to-analog conversion) do not, so the filters are not in effect flat. Additionally, since the phases are different, they produce different cancellations and reinforcements that vary with frequency. For this reason, Kendall states that a "good-sounding" pair of filter coefficients must be chosen through subjective evaluation. He does mention that this timbral coloration is less noticeable if the filtering is applied to individual tracks instead of the entire mix.
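The filter design and the effect of mixing phase vectors can be illustrated with a small standard-library Python sketch. Several points are assumptions rather than details from Kendall's article: the phase vectors are mixed by linear interpolation, the zero-lag normalized cross-correlation is used as the measure, and the filter is only 64 points long so that a naive inverse DFT stays fast. The function names are hypothetical.

```python
import cmath
import math
import random

def random_phases(half, rng):
    """Phases for bins 0 .. half-1; the DC bin is fixed at 0."""
    return [0.0] + [rng.uniform(-math.pi, math.pi) for _ in range(half - 1)]

def fir_from_phases(phases):
    """Real FIR from a unit-magnitude spectrum with the given phases (naive inverse DFT)."""
    half = len(phases)
    n = 2 * half
    spectrum = [cmath.exp(1j * p) for p in phases]
    spectrum.append(1.0 + 0j)                                     # Nyquist bin, real
    spectrum += [s.conjugate() for s in spectrum[half - 1:0:-1]]  # Hermitian mirror
    return [sum(X * cmath.exp(2j * math.pi * m * k / n)
                for m, X in enumerate(spectrum)).real / n
            for k in range(n)]

def mix_phases(ph1, ph2, amount):
    """Linear blend of two phase vectors (assumed mixing rule): 0 -> ph1, 1 -> ph2."""
    return [(1.0 - amount) * a + amount * b for a, b in zip(ph1, ph2)]

def correlation(h1, h2):
    """Normalized cross-correlation at zero lag."""
    num = sum(a * b for a, b in zip(h1, h2))
    return num / math.sqrt(sum(a * a for a in h1) * sum(b * b for b in h2))

rng = random.Random(3)
ph1 = random_phases(32, rng)
ph2 = random_phases(32, rng)
h1 = fir_from_phases(ph1)

for amount in (0.0, 0.25, 0.5, 1.0):
    h2 = fir_from_phases(mix_phases(ph1, ph2, amount))
    print(amount, round(correlation(h1, h2), 3))
```

With `amount` at 0 both filters are identical and fully correlated. As it approaches 1 the second filter uses its own independent phases, so the correlation between the pair drops.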
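The deviation from a flat response between the design points can be demonstrated numerically. The sketch below uses only the Python standard library and a deliberately short 32-point filter, which is an assumption made for speed and is far below the 1024 points mentioned above. `random_phase_fir` and `dtft_magnitude` are hypothetical helper names.

```python
import cmath
import math
import random

def random_phase_fir(n, rng):
    """n-point real FIR whose DFT has unit magnitude and random phase in every bin."""
    half = n // 2
    spectrum = [1.0 + 0j]                                         # DC bin
    spectrum += [cmath.exp(1j * rng.uniform(-math.pi, math.pi))
                 for _ in range(half - 1)]
    spectrum.append(1.0 + 0j)                                     # Nyquist bin
    spectrum += [s.conjugate() for s in spectrum[half - 1:0:-1]]  # Hermitian mirror
    return [sum(X * cmath.exp(2j * math.pi * m * k / n)
                for m, X in enumerate(spectrum)).real / n
            for k in range(n)]

def dtft_magnitude(ir, w):
    """Magnitude response of ir at an arbitrary frequency w in rad/sample."""
    return abs(sum(h * cmath.exp(-1j * w * k) for k, h in enumerate(ir)))

n = 32
ir = random_phase_fir(n, random.Random(7))

# Magnitude exactly on the design bins and halfway between them
on_bins = [dtft_magnitude(ir, 2.0 * math.pi * m / n) for m in range(n)]
between = [dtft_magnitude(ir, 2.0 * math.pi * (m + 0.5) / n) for m in range(n)]

print("largest deviation on bins:     ", max(abs(v - 1.0) for v in on_bins))
print("largest deviation between bins:", max(abs(v - 1.0) for v in between))
```

The response is flat only at the frequencies where it was specified. Between those frequencies the magnitude ripples, which is one source of the coloration described above.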
Kendall also proposed constructing dynamic filters from IIR all-pass filters, interpolating between random coefficients to have a constantly varying phase response to produce a similar effect.
Although these techniques are a natural evolution of previous techniques using all-pass filters, particularly those proposed by Bauer, since they attempt to leave the amplitude spectrum unchanged, they are the first to include the concept of decorrelation as an important attribute in the perception of ASW. It is also significant that these techniques, since they do not involve complementary filters, can be used on a greater number of channels, and can therefore go beyond pseudo-stereo.
To implement artificial decorrelation filters in Csound, it is necessary to use Python, since plain inverse FFTs (without a phase vocoder) are required, a low-level operation that is not available in Csound. To use the implementation presented here, the Python opcodes must be installed, along with the NumPy library for FFT and IFFT computation.
The Python functions that generate the filter coefficients and exchange data with Csound are presented below:
#! /usr/bin/env python
import sys
from numpy import *
from numpy.fft import ifft
import numpy.random as random

## global configuration and variables
max_chnls = 16
Xn = []
yn = []
for i in range(max_chnls):
    Xn.append(array([]))
    yn.append(array([]))

## Csound functions
def new_seed(seed_in = -1):
    # print "New seed: ", int(seed_in)
    if seed_in == -1:
        random.seed() #can put seed here, leaving blank uses system time
    else:
        random.seed(longlong(seed_in)) #can put seed here, leaving blank uses system time
    return float(seed_in);

def get_ir_length(channel = -1):
    index = int(channel) if channel != -1 else 1
    return float(len(Xn[index]))

def get_ir_point(channel, index):
    global yn
    return float(yn[int(channel - 1)][int(index)].real)

def new_ir_for_channel(N, channel = 1, max_jump=-1):
    global Xn, yn
    [Xfinal, y] = new_ir(N, max_jump)
    Xn[int(channel) - 1] = Xfinal
    yn[int(channel) - 1] = y

def new_ir(N, max_jump=-1):
    # ''' new_ir(N)
    # N - is the number of points for the IR,
    # max_jump - is the maximum phase difference (in radians) between bins
    # if -1, the random numbers are used directly (no jumping).
    # '''
    if N < 16:
        print "Warning: N is too small."
    print "Generate new IR size=", N
    n = N//2 # before mirroring
    Am = ones((n))
    Ph = array([])
    limit = pi
    Ph = append(Ph, 0)
    old_phase = 0
    for i in range(1,n):
        if max_jump == -1:
            Ph = append(Ph, (random.random()* limit) - (limit/2.0))
        else:
            # make phase only move +- limit
            delta = (random.random() * max_jump * 2* pi) - (max_jump * pi)
            new_phase = old_phase + delta
            Ph = append(Ph, new_phase)
            old_phase = new_phase
    #pad DC to 0 and double last bin
    Am[0] = 0
    Xreal = multiply(Am, cos(Ph))
    Ximag = multiply(Am, sin(Ph))
    X = Xreal + (1j*Ximag)
    Xsym = conj(X[1:n])[::-1] # reverse the conjugate
    X = append(X, X[0])
    Xfinal = append(X,Xsym)
    y = ifft(Xfinal)
    return [Xfinal, y.real]
The decorrelation algorithm in Csound, using the above Python script, is:
sr = 44100
ksmps = 256
0dbfs = 1
nchnls = 2

#define MAX_SIZE #4096#

gisize init 1024
gkchange init 0

pyinit
pyexeci "decorrelation.py"

giir1 ftgen 100, 0, $MAX_SIZE, 2, 0
giir2 ftgen 101, 0, $MAX_SIZE, 2, 0
gifftsizes ftgen 0, 0, 8, -2, 128, 256, 512, 1024, 2048, 4096, 0

opcode getIr, 0, ii
  ichan, ifn xin
  index init 0
  idummy init 0
  isize pycall1i "get_ir_length", -1
  isize = gisize
doit:
  ival pycall1i "get_ir_point", ichan, index
  tableiw ival, index, ifn
  loop_lt index, 1, isize, doit
endop

instr 1 ; Inputs and control
  gainput inch 1
  ; gainput diskin2 "file.wav", 1
  gkon_decorrelation invalue "on_decorrelation"
  gklevel_decorrelation invalue "level_decorrelation"
endin

instr 2 ; Generate new IR and load it to ftable giir
  inew_seed = p4
  kfftsize invalue "fftsize"
  gisize table i(kfftsize), gifftsizes
  ; Clear tables
  giir1 ftgen 100, 0, $MAX_SIZE, 2, 0
  giir2 ftgen 101, 0, $MAX_SIZE, 2, 0
  kmax_jump1 init -1
  kmax_jump2 init -1
  icur_seed pycall1i "new_seed", inew_seed ;-1 means use system time
  ;Chan 1
  imax_jump = i(kmax_jump1)
  ichan = 1
  pycalli "new_ir_for_channel", gisize, ichan, imax_jump
  getIr 1, giir1
  ;Chan 2
  imax_jump = i(kmax_jump2)
  ichan = 2
  pycalli "new_ir_for_channel", gisize, ichan, imax_jump
  getIr 2, giir2
  gkchange init 1
  turnoff
endin

instr 17 ; Kendall method
  if gkchange == 1 then
    reinit dconvreinit
    gkchange = 0
  endif
dconvreinit:
  prints "Reinit ftconv\n"
  if gkon_decorrelation == 1 then
    aout1 ftconv gainput, giir1, 1024, 0, gisize
    aout2 ftconv gainput, giir2, 1024, 0, gisize
    aout1 = aout1*gklevel_decorrelation
    aout2 = aout2*gklevel_decorrelation
  else
    aout1 = gainput*gklevel_decorrelation
    aout2 = gainput*gklevel_decorrelation
  endif
  outch 1, aout1, 2, aout2
endin
Other similar techniques have been proposed, such as decorrelation using feedback delay networks, sub-band decorrelation, random shifting of critical bands and artificial decorrelation in the frequency domain.
Examples from this article can be downloaded here.
[1] H. Lauridsen, "Nogle forsog med forskellige former rum akustik gengivelse," Ingenioren, vol. 47, p. 906, 1954.
[2] M. R. Schroeder, "An artificial stereophonic effect obtained from using a single signal," in AES Convention 9, 1957.
[3] B. B. Bauer, "Some techniques toward better stereophonic perspective," IEEE Transactions on Audio, vol. 11, pp. 88–92, May 1963.
[4] R. Orban, "A rational technique for synthesizing pseudo-stereo from monophonic sources," Journal of the Audio Engineering Society, vol. 18, pp. 157–164, April 1970.
[5] M. A. Gerzon, "Signal processing for simulating realistic stereo images," in AES Convention 93, 1992.
[6] P.-A. Gauthier, "A review and an extension of pseudo-stereo for multichannel electroacoustic compositions: Simple DIY ideas," eContact!, vol. 8.3, 2005.
[7] G. Kendall, "The decorrelation of audio signals and its impact on spatial imagery," Computer Music Journal, vol. 19:4, pp. 71–87, 1995.