Báo cáo hóa học: " Research Article A Fast Mellin and Scale Transform Antonio De Sena1 and Davide Rocchesso2"

Chia sẻ: Linh Ha | Ngày: | Loại File: PDF | Số trang:9

Thêm vào BST

Báo xấu

33
lượt xem 4
download

Download Vui lòng tải xuống để xem tài liệu đầy đủ

Tuyển tập báo cáo các nghiên cứu khoa học quốc tế ngành hóa học dành cho các bạn yêu hóa học tham khảo đề tài: Research Article A Fast Mellin and Scale Transform Antonio De Sena1 and Davide Rocchesso2

Chủ đề:

Bình luận(0) Đăng nhập để gửi bình luận!

Lưu

Nội dung Text: Báo cáo hóa học: " Research Article A Fast Mellin and Scale Transform Antonio De Sena1 and Davide Rocchesso2"

Hindawi Publishing Corporation EURASIP Journal on Advances in Signal Processing Volume 2007, Article ID 89170, 9 pages doi:10.1155/2007/89170 Research Article A Fast Mellin and Scale Transform Antonio De Sena1 and Davide Rocchesso2 1 Dipartimento di Informatica, Universit` di Verona, Strada Le Grazie, 15-37134 Verona, Italy a 2 Dipartimento di Arti e Disegno Industriale, Universit` Iuav di Venezia, Dorsoduro 2206, 30123 Venezia, Italy a Received 24 August 2006; Revised 30 December 2006; Accepted 5 March 2007 Recommended by Jar-Ferr Kevin Yang A fast algorithm for the discrete-scale (and β-Mellin) transform is proposed. It performs a discrete-time discrete-scale approx- imation of the continuous-time transform, with subquadratic asymptotic complexity. The algorithm is based on a well-known relation between the Mellin and Fourier transforms, and it is practical and accurate. The paper gives some theoretical background on the Mellin, β-Mellin, and scale transforms. Then the algorithm is presented and analyzed in terms of computational complexity and precision. The eﬀects of diﬀerent interpolation procedures used in the algorithm are discussed. Copyright © 2007 A. De Sena and D. Rocchesso. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. 1. INTRODUCTION variance to shift, scale, and rotation. In [3], various tech- niques have been presented for the implementation of the The Mellin transform, and the particular version called scale Fourier-Mellin transform, including a polar-log coordinates transform, can represent a signal in terms of scale. The scale remapping. In [4], the problem of estimation of scale and can be interpreted, similarly to frequency, as a physical at- orientation diﬀerences between objects in images has been tribute of signals. The proposed fast (subquadratic) imple- approached using the analytical Fourier-Mellin transform mentation allows this transform to be used in practical ap- [3]. plications. The algorithm can compute the Mellin transform Other approaches to the Mellin transform implementa- tion have been taken by J. Bertrand et al. [5–7]. In their ∞ f (t )t p−1 dt , M f ( p) = (1) studies, the authors tackled the transform in the frequency 0 domain by considering analytic signals. An implementation in the complex variable p = − jc + β, with β ∈ R ﬁxed pa- based on exponential resampling in the time domain should rameter and c ∈ R independent variable. We call this family give a solution to a few practical problems. Namely, a starting of transforms the β-Mellin transform. It is indeed a restric- point near 0 implies an impossible exponential resampling, and if the signal support in time is very small compared to tion of the Mellin transform, as the real part of the complex variable p is parameterized. The β parameter allows to se- the starting point of the signal, the exponential sampling lect among: (i) a scale-invariant transform (β = 1/ 2, scale becomes a quasiuniform sampling. An implementation that follows this idea has been made by Goncalv´ s and Lemoine ¸e transform); (ii) a compression/expansion invariant trans- form (β = 0); (iii) a shape-invariant transform (β = −1, the (http://gdr-isis.org/tftb/refguide/node32.html), but the algo- rithm appears to be quadratic in complexity. The authors ratio between the maximum of the function and its extension have searched for other implementation of J. Bertrand et al. is a constant). fast Mellin transform idea, but no sub-quadratic implemen- The proposed algorithm is based on the well-known re- tation has been found. lation between the Mellin and Fourier transforms. While In our work, that proceeds in the time domain by cre- methods that exploit such relation have been proposed long ago [1, 2], eﬃciency and practicality are still remarkable ob- ating parallels with Fourier-based theories, we tried to ﬁnd practical solutions for exponential sampling, while simulta- jectives to be achieved. neously keeping the whole framework as simple as possible. Mellin and scale transforms are important in vision and In particular, the resampling process does not pose problems image processing. In particular, a so-called Fourier-Mellin regarding the relative small length of the signal because there transform can be used for pattern recognition for its in-
2 EURASIP Journal on Advances in Signal Processing c ∈ R, just as the Fourier transform can be seen as a restric- is no prebuilt exponential grid, and the exponential warping is adapted to the signal under analysis. tion of the Laplace transform on the imaginary axis. Thus, the scale transform is deﬁned1 as We are mainly interested in applications of the scale transform in the realm of speech and audio processing, ∞ 1 f (t )e(− jc−1/2) ln t dt. D f (c ) = √ where it can be used for various purposes, like scale normal- (3) 2π 0 ization, signal analysis in the scale domain [8] (scale can be considered as a joint time-frequency attribute), audio ma- The scale inverse transform is given by nipulation in scale domain [9], and vowel recognition [10]. In Section 2, an introduction to the Mellin and scale ∞ 1 D f (c)e( jc−1/2) ln t dc. f (t ) = √ (4) transforms will be given along with the deﬁnitions of scale pe- 2π −∞ riodicity and an interpretation of the scale transform. Analo- gies and relations with the Fourier transform will also be pro- The key property of the scale transform is its scale in- variance. This means that if f is a function and g is a scaled vided. An exponential sampling theorem (an extension of the version of f , the transform magnitude of both functions is one provided in [11]) will be presented. This section is the collection of known concepts and new deﬁnitions and ex- the same. A scale modiﬁcation is a compression or expan- tensions useful as support for the β-Mellin transform imple- sion of the time axis of the original function that preserves signal energy. Thus, a function g (t ) can be obtained with a mentation. In Section 3, the base theory for the implemen- √ scale modiﬁcation from a function f (t ) if g (t ) = α f (αt ), tation of the fast Mellin transform will be provided. Expo- with α ∈ R+ . When α < 1, we get a scale expansion, when nential sampling will be introduced and sinc, cardinal spline, α > 1 we get a scale compression. Given a scale modiﬁcation and spline interpolations will be discussed. In Section 4, the with parameter α, the scale transforms of the original and implemented algorithm, its computational cost, and an error analysis will be described. scaled signals are related by Dg (c) = α jc D f (c). (5) 2. THE MELLIN TRANSFORM This property derives from a similar property of the Mellin Originally developed by Robert Hjalmar Mellin (1854–1933) transform. In fact, if h(t ) = f (αt ), then for the study of the gamma function, hypergeometric func- tions, Dirichlet series, the Riemann zeta function and for the Mh ( p) = α− p M f ( p). (6) solution of partial diﬀerential equations, the Mellin trans- form was also used in electrical engineering, for example In both (5) and (6), scaling is reﬂected by a multiplicative for studying motor control systems [12]. In [8], Cohen in- factor for the transforms, and for (5) such factor reduces to a troduced the “scale transform.” This transform is said to pure phase contribution. So, the scale transform magnitudes be scale-invariant (the Fourier transform is shift-invariant), of the original and scaled signals are the same, thus meaning that the signals diﬀering just by a scale trans- formation (compression or expansion with energy preserva- Dg ( c ) = D f ( c ) . (7) tion) have the same transform magnitude distribution. Co- hen showed that the scale transform is a restriction of the Mellin transform on the vertical line p = − jc + 1/ 2, with 2.3. Relation with the Fourier transform c ∈ R. From its deﬁnition and interpretation, the Mellin transform provides a tight correspondence with the Fourier transform 2.1. Deﬁnition and existence of Mellin transform [10]. More precisely, the Mellin transform with parameter p = − jc can be interpreted as a logarithmic-time Fourier The Mellin transform of a function f is deﬁned as in (1), transform: where p ∈ C is the Mellin variable. ∞ The existence of the Mellin transform (1) depends on f (t )e− jc(ln t) d(ln t ). M f (c ) = (8) convergence of the transform integral, −∞ ∞ Similarly, we can deﬁne the scale transform of a function f (t ) f (t ) t p−1 dt < ∞. (2) using the Fourier transform of a function g (t ) [8] with g (t ) 0 obtained from f (t ) by time-warping (g (t ) = f (et )): This is a general suﬃcient condition for the existence of the transform. Further considerations [5] can be made using the ∞ g (t )e− jct dt. M f (c ) = fact that p = − jc + β, and diﬀerent or simpler forms of (2) (9) −∞ can be derived. This result can be generalized for any p deﬁned as p = − jc + β, with β ∈ R, by using g (t ) = f (et )eβt . 2.2. Deﬁnition of scale transform The scale transform [8] is a particular restriction of the √ Mellin transform on the vertical line p = − jc + 1/ 2, with The heading 1/ 2π is for energy normalization purpose. 1
A. De Sena and D. Rocchesso 3 2.4. Scale periodicity and scale is ⎧ transform interpretation ⎨e jc ln α , | c | ≤ C0 , Γ(c) = ⎩ (11) A parallel can be drawn between the properties of the Fourier 0 elsewhere. and scale transforms. In particular, we can deﬁne scale pe- riodicity [11] as follows: a function f (t ) is √ to be scale said β The β-Mellin transform of f (t ), indicated with M f (c), is periodic with period τ if it satisﬁes f (t ) = τ f (tτ ), where a support-limited function by assumption. Then an expan- τ = b/a, with a and b starting and ending points of the scale β period. C0 = 2π/ ln τ is the “fundamental scale” associated sion of M f (c) using Fourier series representation can be per- β formed. The period (in the Fourier sense) of M f (c) is sup- with the scale periodic function. By analogy with the Fourier theory, we can deﬁne a “scale series” and Parseval theorem. β posed to be T = 2C0 (the whole support of M f (c), i.e., the Of particular importance is the “exponential sampling the- bandwidth in β-Mellin domain of f (t )), orem” [11] that, like the Nyquist-Shannon theorem, allows the reconstruction of a scale-band limited signal from its ∞ β m am e jc ln τ , samples. These samples must be distributed exponentially in M f (c ) = (12) time according to positions pk = τsk , with k ∈ Z, τs = eπ/Cm , m=−∞ and Cm is the signal maximum scale. with am deﬁned as A more general theorem can be formulated by working on β-Mellin (rather than scale) band-limited signals. 1 −m am = √ ln(τ )eβ ln τ f τ −m . (13) 2π 2.5. Exponential sampling theorem Now, starting from the deﬁnition of inverse β-Mellin trans- form, and using (10)–(13), the reconstruction equation for Starting from what has been done for the scale transform exponential sampling can be obtained: [11], an extension/generalization of the exponential sam- pling theorem can be provided for all types of β-Mellin trans- C0 1 β M f (c)e( jc−β) ln t dc f (t ) = √ forms. 2π −C0 (14) Deﬁnition 1. The β-Mellin band of a function f (t ) is the sup- ∞ 1 f τ m ψ t τ −m . = ln τ √ port of F (c), where F (c) is the β-Mellin transform of f (t ). 2π m=−∞ Deﬁnition 2. A function f (t ) is β-Mellin band-limited to C0 Equation (14) allows a perfect (in the Nyquist-Shannon when F (c) = 0 for all c ∈ (−C0 , C0 ), where F (c) is the β- / sense) reconstruction of the signal starting from its (expo- Mellin transform of f (t ). nentially spaced) samples. Furthermore, (14) can be shown to be very close to the Nyquist-Shannon interpolation for- Now the exponential sampling theorem for β-Mellin mula, in fact it can be rewritten as band-limited functions can be stated. ∞ −β Theorem 1. A function f (t ) ∈ L2 (R), β-Mellin band-limited2 f τ m t τ −m sinc C0 ln t τ −m . f (t ) = (15) to C0 , can be exactly reconstructed from its samples in the time m=−∞ domain if the samples are spaced exponentially along the time The reconstruction function (tτ −m )−β sinc(C0 ln (tτ −m )) is axis as in {τ n }∞ , where τ = e2π/2C0 . −∞ composed by a logarithmic-time sine cardinal function mod- ulated by a power function. The summation (15) is made by A quick outline of proof can be given. The proof is similar summing β-dilatocyclic3 versions of the reconstruction func- to the one shown in [11] for the scale transform. We need to rebuild the equation chains using the β-Mellin related equa- tion weighted by each sample taken exponentially in time. tions. Let ψ (t ) be the following function: 3. THE FAST MELLIN TRANSFORM Computing a discrete Mellin transform is relatively straight- 2 sin C0 ln t −β ψ (t ) = √ t. (10) forward. For example, we can do an approximation of the ln t 2π transform integral using the Riemann sum. Unfortunately, doing this would give us algorithms exhibiting quadratic complexity, thus meaning that they are not usable in most The β-Mellin transform of γα (t ) = αβ ψ (αt ) (i.e., γ is a β- scaled version of ψ ), where α = τ m , τ = e2π/2C0 and m ∈ Z, Similar to the deﬁnition given in [5], a β-dilatocyclic signal is a sig- 3 nal composed by expanded/compressed replicas of a base signal, mod- ulated/ampliﬁed by a function of β. This concept is in some way similar Obviously, the theorem assumes that the β-Mellin transform of f (t ) ex- 2 to the concept of periodic signal. ists.
4 EURASIP Journal on Advances in Signal Processing f (t) x (n) tially sampled signal view in which we know that the signal must be considered along a warped time axis. In that view, the signal is a Mellin (more precisely β-Mellin) band-limited signal. In this case, for example, a single-cycle sinusoid can Exponential Spline interpolation and still be plotted with the same shape as the original, but with time warping exponential resampling a higher sample density near the signal start. The other inter- pretation is the time-warped uniformly sampled signal view. In that view, the warped signal is Fourier band-limited. In this case, for example, the shape of a single-cycle sinusoid will Exponential Point-by-point be heavily distorted. The assumptions underlying our imple- multiplication exponential multiplication mentation are that the signal is (i) time-limited because it is saved in a ﬁnite-dimensional storage system; (ii) Fourier band-limited because it results from uniform sampling un- der Nyquist-Shannon conditions; (iii) β-Mellin band-limited Fourier FFT algorithm to have a ﬁnite number of points in the Mellin transform rep- transform resentation. These conditions are possible only if the original signal is thought as a single period of an inﬁnitely long pe- riodic signal (to have a Fourier band-limited signal) or as a M f (c ) Mx (c ) single scale-period of a inﬁnitely long β-dilatocyclic signal. (a) (b) 3.2. Exponential resampling Figure 1: Block diagram of the fast Mellin transform idea (a) and Several problems arise when making an exponential resam- the relative implementation blocks (b). pling starting from an unknown uniformly sampled signal: how many samples are needed, how they should be dis- tributed over time, how the signal start time alters this in- practical applications. The basic idea of the fast Mellin trans- formation, how can we reconstruct the signal, and so forth. form (FMT) algorithm comes from (9), in particular when β = 1/ 2 (scale transform). While presented in prior works While being aware of prior answers [11, 13], we address these questions in this section. (i.e., [1, 2, 13]), this idea is here used to build a practical and eﬃcient computer program (in particular a Matlab toolbox). First of all, we must fulﬁll the Nyquist-Shannon sampling condition, so the distance between two adjacent samples in The algorithm approximates the exponential resampling cannot exceed the distance of the ∞ original uniform sampling step. This means that the sam- f et eβt e− jct dt M f (c ) = (16) pling period Ts is the upper limit for the distance between −∞ the last two contiguous samples in exponential resampling. by taking a uniformly sampled function, warping it exponen- The second constraint is that the resampling process must tially, multiplying it by an exponential, and performing a fast cover the entire signal, from its starting point to its ending Fourier transform (see Figure 1). Naturally, all the problems point. The two constraints force us to have more samples in come from the warping operation. Once digitized, the signal the exponential resampled signal. The original signal starting must be resampled from a uniform to an exponential sam- point t0 is very important: in fact, the more t0 is close to zero, pling grid. A resampling-based approach has been already the more samples are needed in the exponential resampling studied in vision and image processing. In particular in [3], process. Thus, if we let t0 = 0, we need an inﬁnite number an implementation of the Fourier-Mellin transform of im- of exponential samples. So, for using this algorithm we need ages has been presented, based on the idea of log warping, a starting point strictly greater than zero. We can write the which can be dated back to [1, 2]. Conversely, in the imple- exponential sampling like a sequence: mentation of the fractional Mellin transform [14], warping is done logarithmically instead of exponentially. τs k∞ k=−∞ , (17) 3.1. Sampled signal where k is the sample index and τs can be called the exponen- tial base step. So, using the ﬁrst4 constraint (τske − τske −1 = In practical applications, the original analog signal is hardly available, because a uniform-sampling stage is inherent in Ts ), we can ﬁnd the acquiring process. So, the raw material is the Shannon- t0 + nTs sampled version of a Fourier band-limited signal. The τs = t , (18) Nyquist-Shannon theorem tells us that in this condition, we + (n − 1)Ts 0 can reconstruct the original (analog) signal from the sampled version. This implies that we can resample the original (ana- τk τk 1 log) signal in a diﬀerent way. In particular, after resampling − In theory, we should write s e − s e ≤ Ts , but if we want to use as few 4 ττ τ k k −1 k the signal exponentially (see Section 3.2), two interpretations samples as possible, we can use s e − s e = Ts . s e is the last sample and ke is the last sample index (sample at the ending temporal point te ). can be used. The ﬁrst interpretation is based on an exponen-
A. De Sena and D. Rocchesso 5 where n is the number of samples of the uniformly sampled signal. Now, using the second constraint (τsk0 = t0 ∧ τske = Uniform sampling t0 + nTs = te ), the number of needed exponential samples is ln t0 + nTs / t0 eN = + 1. (19) t0 + nTs t0 + (n − 1)Ts ln If we use Ts as the starting point (i.e., t0 = Ts ), we can obtain the lighter approximate expression Exponential resampling ln (n + 1) eN = . (20) ln (n + 1) n Furthermore, if we use a high number of samples (in prac- 0 1 2 3 4 5 6 7 8 9 tice, e.g., a number greater than 16 is suﬃcient) we can ap- Samples proximate (20) in a very simple form [13] using a known important limit (limx→∞ (1 + 1/x)x = e): Figure 2: Uniform sampling and (critical) exponential resampling. eN = n ln n. (21) Now we have the exponential sampling step and the number of samples needed, so we can proceed with resampling (see Figure 2). The use of cardinal spline interpolation is, from a the- oretical point of view, a good choice. In fact, the cardinal spline, generated by cardinal B-spline [19], has a behavior 3.3. Sinc, cardinal spline, and spline interpolation similar to the sinc function. Like the sinc function, each car- Resampling (in a nonuniform way) an already sampled sig- dinal spline vanishes at all integers except the origin, and the nal is not trivial. In theory, the Nyquist-Shannon sampling value at 0 is 1. Furthermore, at limit, the cardinal spline con- theorem tells that a signal, under well-known conditions, can verges to the sinc function. be reconstructed from its samples using a sinc interpolation. Eventually, it is an experimental analysis of errors that Unfortunately, fast sinc interpolation on an exponential grid guides the choice of the interpolation method, as presented is cumbersome, even using lookup table [15]. In [16] (but in Section 4.2, for diﬀerent oversampling factors. also [17, 18] are important for more stable algorithms), an idea for reducing the theoretically inﬁnite computation of a sinc interpolation to a ﬁnite summation has been presented, 4. IMPLEMENTATION AND EXPERIMENTS but the computation still requires a quadratic algorithm. A fast interpolation technique that can approximate sinc inter- polation is cardinal spline interpolation [19]. This interpo- Using the ideas presented in Section 3, a fast Mellin trans- lation is a modiﬁed version of the cubic Hermite spline in- form has been developed. The algorithm takes a signal uni- terpolation. The Hermite spline is a third-degree spline with formly sampled and performs an exponential resampling. each polynomial of the spline in Hermite form. The Hermite The signal is considered to be sampled at Nyquist frequency and, to obtain a good tradeoﬀ between accuracy and speed, form consists of two control points and two control tangents the number of new (exponential) samples used is 2eN . Here, for each polynomial. On each subinterval, the interpolating polynomial depends on the starting point pi and an ending the starting point of the signal is considered to be Ts (the uni- point pi+1 , with starting and ending tangents mi and mi+1 , re- form sampling step), but in Section 4.2 a solution for com- puting the Mellin transform with a diﬀerent starting point is spectively. A cardinal spline is a cubic Hermite spline whose tangents are deﬁned by the points and a tension parame- given. The algorithm can use a natural spline interpolation ter c. The tension allows the computation of the tangents. or cardinal spline interpolation. Either solutions has a linear A general-purpose tension value can be 0.5 and the cardinal computational cost (natural spline interpolation is embed- spline using this value is called Catmull-Rom spline. In the ded in Matlab): more precisely, the asymptotic complexity is O (N ), where N is the number of exponential samples. After FMT algorithm, various values of tension have been tested along with other types of spline interpolation, in particular, resampling, an exponential point-by-point multiplication is performed (the eβt component of (16)) with a computational natural cubic spline interpolation [19]. A natural cubic spline cost of O (N ). Then a fast Fourier transform is computed. is a spline constructed of piecewise third-order polynomials which pass through a set of m control points. The second The FFT has a subquadratic computational cost, more pre- cisely O (N ln N ) (see Figure 1). At last, an energy normaliza- derivative of each polynomial is set to zero at the endpoints, tion is performed, again a linear operation (O (N )). So the and this provides a boundary condition that completes the system of m − 1 equations. This interpolation is simpler than whole asymptotic complexity depends only on the FFT and cardinal spline, yet oﬀering the same goodness of approxi- is O (N ln N ). Written in terms of n (the initial number of uniform samples), the asymptotic complexity is O (n ln2 n). mation.
6 EURASIP Journal on Advances in Signal Processing 105 100 100 10−5 Error Error 10−5 10−10 10−10 10−15 10−15 10−20 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0 0 Time (s) Time (s) 2eN samples; maximum absolute error: 3.19e-002 0.5eN samples; maximum absolute error: 1.83e+000 1eN samples; maximum absolute error: 6.01e-001 2eN samples; maximum absolute error: 3.19e-002 Figure 3: Reconstruction error for a white noise using twice as 3eN samples; maximum absolute error: 1.15e-002 many samples as strictly needed (2eN ). Figure 4: Error trend (in time) for a white noise. When taking twice as many samples as required (2eN ), the maximum error of these sig- nals goes towards 10−2 . Each curve is derived from piecewise-linear regression of the actual error curve, as the one shown in Figure 3. 4.1. Assumptions and approximations The algorithm works using the assumptions and approxima- smaller, the displacement of the samples on the exponential tions presented in the previous sections and are summarized grid is dramatically less accurate and this introduces an even here. larger error in the computation of the transform. First, there are errors due to quantization and ﬁnite- So, the preferred solution for error reduction is oversam- precision arithmetics. Then we can mention all the approxi- pling. Using more samples than those strictly required by mations bound to the algorithmic realization. Namely, spline the sampling theory allows the implemented transform to be interpolation introduces errors; (21) is a limit approxima- more precise. This oversampling can be tuned with respect to tion; signals are supposed to start at t0 = Ts , where Ts is the user needs. A good choice is to use twice as many samples the sampling period of the uniformly sampled signal; no in- as those required by theory. In this way, the maximum inter- formation on Mellin bandwidth is typically available before- polation error goes towards 10−2 on amplitude-normalized hand.5 signals. The “worst-case scenario” (shown in Figure 4) is when using signals that, in the ﬁnal part of them, have fre- quency components near the Nyquist limit. Violet noise and 4.2. Errors and reversibility white noise are simple examples that maximize the error, but it is suﬃcient that only the ﬁnal part of the signal has high The algorithm is clearly based on subblocks: the interpola- tion block, the FFT block, and the multiplication and nor- frequency components. In fact, at the end of the resampling malization blocks. In the case of complexity analysis, all the grid, the samples are spaced as in the original uniformly sam- focus was on the FFT and on the relation between the num- pled signal. So, in that region of the signal the exponential ber of uniform samples and the number of exponential sam- sampling is very close to the uniform and then you do not ples. The error analysis, instead, is all focused on the interpo- have the beneﬁt of the oversampling and errors can be big- lation block. Other computational errors are negligible. As it ger. In the ﬁnal part of the signal, the interpolator is an ap- was explained in Section 3.3, the exponential distribution of proximation of the theoretical sinc interpolator. The closer samples and the need of a fast interpolation algorithm force the frequencies are to the Nyquist limit, the more the dif- us to choose an approximation for the sinc interpolation and ference between spline and sinc interpolators is noticeable this introduces errors.6 Alternative distributions of interpo- (see Figures 6 and 7). In Figure 3, the reconstruction error lation nodes have been tried to reduce error, like Chebyshev is shown, while in Figure 4 the curves show the trends of the or Leja nodes, but although the interpolation error becomes reconstruction errors as obtained from piecewise linear re- gression on log-scale plots. From a computational point of view, the oversampling introduces a multiplication by a con- 5 Indeed, the unknown Mellin bandwidth can be approximately computed stant to the number of exponential sampling points, so the after exponential warping. asymptotic complexity remains O (n ln2 n). 6 Actually, true sinc interpolation would also introduce errors, due to the In Figure 5, a short-time SNR plot has been drawn. The intrinsic problem of the noninﬁniteness of the computer-computed sinc plot shows how the SNR varies over time for a white noise function.
A. De Sena and D. Rocchesso 7 10−2 350 Short-time SNR (dB) (size = 1024 samples) 300 10−3 250 10−4 200 Error 150 10−5 100 10−6 50 10−7 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0 0.02 0.04 0.06 0.08 0.1 0.12 0.14 0 Time (s) Time (s) 3eN samples; overall SNR: 123, last SNR: 105 Sinc interpolator; maximum absolute error: 5.86e-004 2eN samples; overall SNR: 100, last SNR: 84 1eN samples; maximum absolute error: 7.87e-003 1eN samples; overall SNR: 49, last SNR: 28 2eN samples; maximum absolute error: 1.05e-003 0.5eN samples; overall SNR: 15, last SNR: 3 3eN samples; maximum absolute error: 5.62e-004 Figure 5: SNR over time for four diﬀerent oversampling factors. Figure 6: Error trend over time for three diﬀerent oversampling Test signal: white noise, 16 bits, 44100 Hz, 65536 samples. factors and a sinc interpolator. Test signal: sparrow chirp, 16 bits, 16000 Hz, 2048 samples. Table 1: Elapsed times in seconds. Times for complete operation (including loading ﬁle from disk). Oversampling recorded elapsed times only for the FMT algorithm without Number of samples 3eN 2eN 1eN 0.5eN any other operation. The ﬁrst and the last rows of each table are aﬀected by machine limitations. In fact for n = 218 , the 218 306.594 56.703 45.297 8.297 machine begins to paginate memory to disk so the perfor- 216 9.953 6.735 3.531 2.328 mance is heavily aﬀected. For n = 28 , secondary operations 214 2.218 1.312 0.656 0.359 unbound to the algorithm result to be heavier than the algo- 212 0.782 0.297 0.140 0.110 rithm itself. The machine that has been used for the tests was 210 0.515 0.094 0.047 0.046 a notebook with 3 Ghz Pentium 4 processor, with 512 Mb of 208 0.453 0.047 0.031 0.031 RAM running Matlab 7R14 for WindowsXP. Theoretically, the most accurate interpolation is sinc in- terpolation. So, we compared cardinal spline interpolation Table 2: Elapsed times in seconds. Times for FMT algorithm only. and sinc interpolation. Results can be viewed in Figures 6 and 7, computed using real recording (sparrow chirp) containing Oversampling Number of samples frequencies near Nyquist limit (signal sampled at 16 kHz). 3eN 2eN 1eN 0.5eN The experiments showed that in every case, a non-eﬃcient 218 167.735 39.625 17.157 6.890 interpolation (i.e., O (n2 ) complexity) is too slow and not 216 7.172 4.985 2.640 1.640 practical. For example, analyzing 8192 samples required 202 214 1.890 1.078 0.547 0.390 minutes. Conversely, the factor-3 oversampling with spline 212 0.484 0.203 0.093 0.062 interpolation is almost as accurate as sinc interpolation. 210 0.250 0.047 0.032 0.016 The FMT is reversible (if the Mellin transform of a func- 208 0.219 0.015 0.016 0.016 tion exists, then the inverse of the transformed function also exists) and an IFMT algorithm has been implemented. The IFMT is entirely based on the FMT. The only caveat is in the computation of the inverse of the equation N = n ln n, signal. In this example, the overall SNR for three-times over- which can be performed with a bisection method. The inter- sampling (3eN ) is 123 dB, for 2eN is 100 dB, for 1eN is 49 dB, polation scheme is the same as the one used in the FMT, and and for 0.5eN is 15 dB. the process of interpolation is simply reversed. Alternatively, Tables 1 and 2 give a short snapshot of the algorithm per- backward interpolation can proceed by warping linear time formance. Data of the ﬁrst table are recorded elapsed times to logarithmic time. Again, the error is totally due to inter- for entire operations (from loading a wave ﬁle to the end polation. The pairs transform and antitransform allow us to of the transform computation, including separation of phase work in the Mellin domain and then go back to the original and magnitude, etc.), while data from the second table are time domain [9].
8 EURASIP Journal on Advances in Signal Processing 1 140 Short time SNR (dB) (size = 512 samples) 0. 8 130 0. 6 120 0. 4 110 0. 2 Amplitude 100 0 90 − 0. 2 80 − 0. 4 70 − 0. 6 − 0. 8 60 −1 50 0.02 0.04 0.06 0.08 0. 1 0.12 0.14 0 14 16 64 Time (s) Time (s) 3eN samples; overall SNR: 117, last SNR: 128 2eN samples; overall SNR: 113, last SNR: 117 Figure 8: Scale shifts tuned with the scale period of the original 1eN samples; overall SNR: 84, last SNR: 79 signal (the original signal starts at 1 second and ends at 4 seconds). Sinc interpolator; overall SNR: 117, last SNR: 130 One scale-compressed version with the same scale period has been reproduced from 0.25 second to 1 second, and two scale-expanded Figure 7: SNR over time for three diﬀerent oversampling factors versions with the same scale period are reproduced from 4 seconds to 16 seconds, and from 16 seconds to 64 seconds. and a sinc interpolator. Test signal: sparrow chirp, 16 bits, 16000 Hz, 2048 samples. signal starting from Ts . In conclusion, the algorithm can be 4.3. Scale shifting and hybrid transform extended to aﬀord the choice of the starting point, possi- bly setting it to multiples of Ts . Nevertheless, if the trans- The FMT works under the assumption that the signal starts from Ts , where Ts is the uniform sampling interval. The im- form is used only for scale normalization or for ﬁltering pact of this hypothesis can be important, especially when the or recognition applications, the starting point looses impor- transform is used for scale analysis. In fact, the starting point tance. of the signal changes the associated Mellin distribution, be- cause the Mellin is not shift-invariant. If the objective is to 4.4. Availability of the code analyze just the Mellin magnitude, a simple scale shifting can be done. This means that the signal in the original domain A matlab implementation of the FMT and some process- must be shifted and scaled according to its scale period. The ing examples are freely available at http://profs.sci.univr.it/. scale period, in the case of an unknown ﬁnite-length signal, ˜desena/FMT. is the ratio between the ending instant and the starting in- stant. When shifting the signal to a new starting point (Ts 5. CONCLUSIONS for our purposes), the ratio must be still the same, so the signal must be scaled, that is, compressed or expanded pre- This paper proposed a fast algorithm for the discrete-scale serving the total energy of the signal. However, this solution (and β-Mellin) transform. The idea is based on the well- presents problems if phase analysis is needed or if the orig- known relation between the Mellin and Fourier transforms, inal signal starts near zero, as in the limiting zero case, the and has been developed to be practical and accurate. As op- scale-periodicity is not computable. To avoid these problems, posed to other implementations, this work tries to solve the the scale shift must be done with a granularity computed problem entirely in the time domain by choosing an eﬃcient, according to the scale period (see Figure 8), thus implying yet accurate, exponential resampling process. The proposed that zero padding will be necessary to compensate the dif- algorithm has been analyzed in terms of computational com- ferences between the obtained point and the wanted starting plexity and precision. In particular, the fast algorithm has point. Moreover, if the starting point is far from Ts , the re- been compared with a nonapproximated interpolation solu- quired sampled frequency becomes too high, thus becoming tion. unpractical. If the signal starts exactly at zero, a hybrid approach can be pursued: the part of signal from 0 to Ts can be trans- ACKNOWLEDGMENTS formed directly. For example, we can consider the signal con- stant in the one-sample interval between 0 and Ts , and pro- The authors would like to thank Stefano De Marchi for his ceed by explicit area computation. This initial contribution help with interpolation methods, and Carlo Drioli for the can be summed with the FMT of the remaining part of the discussions on nonuniform sampling theory.
A. De Sena and D. Rocchesso 9 REFERENCES [18] S. R. Dooley and A. K. Nandi, “Notes on the interpolation of discrete periodic signals using sinc function related ap- [1] D. Casasent and D. Psaltis, “Position, rotation, and scale in- proaches,” IEEE Transactions on Signal Processing, vol. 48, variant optical correlation,” Applied Optics, vol. 15, no. 7, pp. no. 4, pp. 1201–1203, 2000. 1795–1799, 1976. [19] M. Unser, “Splines: a perfect ﬁt for signal and image process- ing,” IEEE Signal Processing Magazine, vol. 16, no. 6, pp. 22–38, [2] D. Casasent and D. Psaltis, “New optical transforms for pattern 1999. recognition,” Proceedings of the IEEE, vol. 65, no. 1, pp. 77–84, 1977. [3] S. Derrode and F. Ghorbel, “Robust and eﬃcient Fourier- Antonio De Sena received the Laurea de- Mellin transform approximations for gray-level image recon- gree in computer science in 2004 from the struction and complete invariant description,” Computer Vi- University of Verona, Department of Com- sion and Image Understanding, vol. 83, no. 1, pp. 57–78, 2001. puter Science, where he is now a Ph.D. [4] S. Derrode and F. Ghorbel, “Shape analysis and symmetry student. He worked at the University of detection in gray-level objects using the analytical Fourier- Verona under a research contract between Mellin representation,” Signal Processing, vol. 84, no. 1, pp. 25– May 2004 and December 2004. In 2007, he 39, 2004. has been visiting the Hunter College, City [5] J. Bertand, P. Bertrand, and J. P. Ovarlez, “The Mellin trans- University of New York, for several months form,” in The Transforms and Applications Handbook, A. D. of studies. His works are related to sound Poularikas, Ed., The Electrical Engineering Handbook, pp. 11- processing and analysis. In particular, he is interested in the Mellin 1–11-68, CRC Press LLC, Boca Raton, Fla, USA, 1995. transform and the scale transform applied to digital audio ﬁltering [6] J. Bertrand, P. Bertrand, and J. P. Ovarlez, “Discrete Mellin and eﬀects, speech recognition, and time-frequency analysis. transform for signal analysis,” in Proceedings of IEEE Interna- tional Conference on Acoustics, Speech, and Signal Processing Davide Rocchesso received the Laurea de- (ICASSP ’90), vol. 3, pp. 1603–1606, Albuquerque, NM, USA, gree in electrical engineering and the Ph.D. April 1990. degree from the University of Padova, Italy, [7] J. P. Ovarlez, J. Bertrand, and P. Bertrand, “Computation in 1992 and 1996, respectively. In 1994 of aﬃne time-frequency distributions using the fast Mellin and 1995, he was a Visiting Scholar at the transform,” in Proceedings of IEEE International Conference on Center for Computer Research in Music Acoustics, Speech, and Signal Processing (ICASSP ’92), vol. 5, and Acoustics (CCRMA), Stanford Univer- pp. 117–120, San Francisco, Calif, USA, March 1992. sity. Since 1991, he has been collaborating [8] L. Cohen, “The scale representation,” IEEE Transactions on with the Center of Computational Sonol- Signal Processing, vol. 41, no. 12, pp. 3275–3292, 1993. ogy (CCS), University of Padova, as a Re- searcher and Live-Electronic Designer. Between 1998 and 2006, he [9] A. De Sena and D. Rocchesso, “A fast Mellin transform with has been with the University of Verona, Italy, as an Assistant and As- applications in DAFx,” in Proceedings of the 7th International Conference on Digital Audio Eﬀects (DAFx ’04), pp. 65–69, sociate Professor. At the Computer Science Department of the Uni- versity of Verona, he coordinated the EU Project Sounding Object. Napoli, Italy, October 2004. He is now Associate Professor at the Department of Art and Indus- [10] T. Irino and R. D. Patterson, “Segregating information about trial Design of the IUAV University of Venice. He launched the EU the size and shape of the vocal tract using a time-domain au- COST Action Sonic Interaction Design (SID). His main interests ditory model: the stabilised wavelet-Mellin transform,” Speech are in audio signal processing, physical modeling, and interaction Communication, vol. 36, no. 3-4, pp. 181–203, 2002. design. [11] H. Sundaram, S. D. Joshi, and R. K. P. Bhatt, “Scale period- icity and its sampling theorem,” IEEE Transactions on Signal Processing, vol. 45, no. 7, pp. 1862–1865, 1997. [12] F. Gerardi, “Application of Mellin and Hankel transforms to networks with time-varying parameters,” IRE Transactions on Circuit Theory, vol. 6, no. 2, pp. 197–208, 1959. [13] E. J. Zalubas and W. J. Williams, “Discrete scale transform for signal analysis,” in Proceedings of the 20th IEEE Interna- tional Conference on Acoustics, Speech, and Signal Processing (ICASSP ’95), vol. 3, pp. 1557–1560, Detroit, Mich, USA, May 1995. [14] E. Biner and O. Akay, “Digital computation of the fractional Mellin transform,” in Proceedings of the 13th European Sig- nal Processing Conference (EUSIPCO ’05), Antalya, Turkey, September 2005. [15] J. O. Smith, Digital Audio Resampling Home Page, January 2002. [16] T. Schanze, “Sinc interpolation of discrete periodic signals,” IEEE Transactions on Signal Processing, vol. 43, no. 6, pp. 1502– 1503, 1995. [17] F. Candocia and J. C. Principe, “Comments on “sine interpola- tion of discrete periodic signals”,” IEEE Transactions on Signal Processing, vol. 46, no. 7, pp. 2044–2047, 1998.