Window function

A popular window function, the Hann window. Most popular window functions are similar bell-shaped curves.

In signal processing and statistics, a window function (also known as an apodization function or tapering function[1]) is a mathematical function that is zero-valued outside of some chosen interval, normally symmetric around the middle of the interval, usually near a maximum in the middle, and usually tapering away from the middle. Mathematically, when another function or waveform/data-sequence is "multiplied" by a window function, the product is also zero-valued outside the interval: all that is left is the part where they overlap, the "view through the window". Equivalently, and in actual practice, the segment of data within the window is first isolated, and then only that data is multiplied by the window function values. Thus, tapering, not segmentation, is the main purpose of window functions.

The reasons for examining segments of a longer function include detection of transient events and time-averaging of frequency spectra. The duration of the segments is determined in each application by requirements like time and frequency resolution. But that method also changes the frequency content of the signal by an effect called spectral leakage. Window functions allow us to distribute the leakage spectrally in different ways, according to the needs of the particular application. There are many choices detailed in this article, but many of the differences are so subtle as to be insignificant in practice.

In typical applications, the window functions used are non-negative, smooth, "bell-shaped" curves.[2] Rectangle, triangle, and other functions can also be used. A rectangular window does not modify the data segment at all. It's only for modelling purposes that we say it multiplies by 1 inside the window and by 0 outside. A more general definition of window functions does not require them to be identically zero outside an interval, as long as the product of the window multiplied by its argument is square integrable, and, more specifically, that the function goes sufficiently rapidly toward zero.[3]

The Fourier transform of the function cos ωt is zero, except at frequency ±ω. However, many other functions and waveforms do not have convenient closed-form transforms. Alternatively, one might be interested in their spectral content only during a certain time period.

In either case, the Fourier transform (or a similar transform) can be applied on one or more finite intervals of the waveform. In general, the transform is applied to the product of the waveform and a window function. Any window (including rectangular) affects the spectral estimate computed by this method.

Figure 2: Windowing a sinusoid causes spectral leakage, even if the sinusoid has an integer number of cycles within a rectangular window. The leakage is evident in the 2nd row, blue trace. It is the same amount as the red trace, which represents a slightly higher frequency that does not have an integer number of cycles. When the sinusoid is sampled and windowed, its discrete-time Fourier transform also suffers from the same leakage pattern. But when the DTFT is only sampled, at a certain interval, it is possible (depending on your point of view) to: (1) avoid the leakage, or (2) create the illusion of no leakage. For the case of the blue sinusoid (3rd row of plots, right-hand side), those samples are the outputs of the discrete Fourier transform (DFT). The red sinusoid DTFT (4th row) has the same interval of zero-crossings, but the DFT samples fall in-between them, and the leakage is revealed.

Windowing of a simple waveform like cos ωt causes its Fourier transform to develop non-zero values (commonly called spectral leakage) at frequencies other than ω. The leakage tends to be worst (highest) near ω and least at frequencies farthest from ω.

If the waveform under analysis comprises two sinusoids of different frequencies, leakage can interfere with the ability to distinguish them spectrally. If their frequencies are dissimilar and one component is weaker, then leakage from the stronger component can obscure the weaker one's presence. But if the frequencies are similar, leakage can render them unresolvable even when the sinusoids are of equal strength. The rectangular window has excellent resolution characteristics for sinusoids of comparable strength, but it is a poor choice for sinusoids of disparate amplitudes. This characteristic is sometimes described as low dynamic range.

At the other extreme of dynamic range are the windows with the poorest resolution and sensitivity, which is the ability to reveal relatively weak sinusoids in the presence of additive random noise. That is because the noise produces a stronger response with high-dynamic-range windows than with high-resolution windows. Therefore, high-dynamic-range windows are most often justified in wideband applications, where the spectrum being analyzed is expected to contain many different components of various amplitudes.

In between the extremes are moderate windows, such as Hamming and Hann. They are commonly used in narrowband applications, such as the spectrum of a telephone channel. In summary, spectral analysis involves a trade-off between resolving comparable strength components with similar frequencies and resolving disparate strength components with dissimilar frequencies. That trade-off occurs when the window function is chosen.

When the input waveform is time-sampled, instead of continuous, the analysis is usually done by applying a window function and then a discrete Fourier transform (DFT). But the DFT provides only a sparse sampling of the actual discrete-time Fourier transform (DTFT) spectrum. Figure 2, row 3 shows a DTFT for a rectangularly-windowed sinusoid. The actual frequency of the sinusoid is indicated as "13" on the horizontal axis. Everything else is leakage, exaggerated by the use of a logarithmic presentation. The unit of frequency is "DFT bins"; that is, the integer values on the frequency axis correspond to the frequencies sampled by the DFT. So the figure depicts a case where the actual frequency of the sinusoid coincides with a DFT sample, and the maximum value of the spectrum is accurately measured by that sample. In row 4, it misses the maximum value by ½ bin, and the resultant measurement error is referred to as scalloping loss (inspired by the shape of the peak). For a known frequency, such as a musical note or a sinusoidal test signal, matching the frequency to a DFT bin can be prearranged by choices of a sampling rate and a window length that results in an integer number of cycles within the window.
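The on-bin and half-bin cases described above can be reproduced numerically. The following is a minimal Python sketch, not a reference implementation: it assumes a complex sinusoid (to avoid interference from a negative-frequency image), a hypothetical length N = 64, and a direct O(N²) DFT. The helper names are illustrative only.

```python
import cmath
import math

def dft_magnitudes(x):
    """Magnitudes of the N-point DFT of sequence x (direct O(N^2) sum)."""
    N = len(x)
    return [abs(sum(x[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N)))
            for k in range(N)]

def complex_sinusoid(freq_bins, N):
    """N samples of exp(i*2*pi*f*n/N), with the frequency f given in DFT bins."""
    return [cmath.exp(2j * math.pi * freq_bins * n / N) for n in range(N)]

N = 64  # assumed window length (rectangular window: data used as-is)
on_bin = dft_magnitudes(complex_sinusoid(13.0, N))    # integer number of cycles
off_bin = dft_magnitudes(complex_sinusoid(13.5, N))   # halfway between bins 13 and 14

# Peak DFT sample of the off-bin case relative to the on-bin case, in dB.
scalloping_loss_db = 20 * math.log10(max(off_bin) / max(on_bin))

# Largest response away from the peak: negligible on-bin, substantial off-bin.
leak_on = max(m for k, m in enumerate(on_bin) if k != 13)
leak_off = max(m for k, m in enumerate(off_bin) if k not in (13, 14))
```

With these assumptions the half-bin offset costs a little under 4 dB at the peak, and the DFT samples of the off-bin sinusoid no longer fall on the zero-crossings of the DTFT.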

Figure 3: This figure compares the processing losses of three window functions for sinusoidal inputs, with both minimum and maximum scalloping loss.

The concepts of resolution and dynamic range tend to be somewhat subjective, depending on what the user is actually trying to do. But they also tend to be highly correlated with the total leakage, which is quantifiable. It is usually expressed as an equivalent bandwidth, B. It can be thought of as redistributing the DTFT into a rectangular shape with height equal to the spectral maximum and width B.[note 1][5] The more the leakage, the greater the bandwidth. It is sometimes called noise equivalent bandwidth or equivalent noise bandwidth, because it is proportional to the average power that will be registered by each DFT bin when the input signal contains a random noise component (or is just random noise). A graph of the power spectrum, averaged over time, typically reveals a flat noise floor, caused by this effect. The height of the noise floor is proportional to B. So two different window functions can produce different noise floors.
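For a sampled window, the equivalent bandwidth is commonly computed in units of DFT bins as B = N·Σw² / (Σw)². The sketch below assumes that discrete form and a symmetric Hann window written as 0.5 − 0.5·cos(2πn/(N−1)); the function names and the length N = 1024 are illustrative.

```python
import math

def noise_bandwidth_bins(w):
    """Equivalent noise bandwidth of window w in DFT bins: N*sum(w^2)/sum(w)^2."""
    N = len(w)
    return N * sum(v * v for v in w) / sum(w) ** 2

def hann_symmetric(N):
    """Symmetric Hann window, 0 <= n <= N-1 (assumed form)."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * n / (N - 1)) for n in range(N)]

N = 1024
b_rect = noise_bandwidth_bins([1.0] * N)          # rectangular window
b_hann = noise_bandwidth_bins(hann_symmetric(N))  # close to 1.5 bins

# Ratio of the two bandwidths expressed in dB.
bandwidth_ratio_db = 10 * math.log10(b_hann / b_rect)
```

The rectangular window gives exactly one bin, the smallest value this expression can take, while the Hann window comes out near 1.5 bins.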

In signal processing, operations are chosen to improve some aspect of quality of a signal by exploiting the differences between the signal and the corrupting influences. When the signal is a sinusoid corrupted by additive random noise, spectral analysis distributes the signal and noise components differently, often making it easier to detect the signal's presence or measure certain characteristics, such as amplitude and frequency. Effectively, the signal to noise ratio (SNR) is improved by distributing the noise uniformly, while concentrating most of the sinusoid's energy around one frequency. Processing gain is a term often used to describe an SNR improvement. The processing gain of spectral analysis depends on the window function, both its noise bandwidth (B) and its potential scalloping loss. These effects partially offset, because windows with the least scalloping naturally have the most leakage.

The figure at right depicts the effects of three different window functions on the same data set, comprising two equal strength sinusoids in additive noise. The frequencies of the sinusoids are chosen such that one encounters no scalloping and the other encounters maximum scalloping. Both sinusoids suffer less SNR loss under the Hann window than under the Blackman–Harris window. In general (as mentioned earlier), this is a deterrent to using high-dynamic-range windows in low-dynamic-range applications.

Window functions are sometimes used in the field of statistical analysis to restrict the set of data being analyzed to a range near a given point, with a weighting factor that diminishes the effect of points farther away from the portion of the curve being fit. In the field of Bayesian analysis and curve fitting, this is often referred to as the kernel.

When analyzing a transient signal in modal analysis, such as an impulse, a shock response, a sine burst, a chirp burst, or noise burst, where the energy vs time distribution is extremely uneven, the rectangular window may be most appropriate. For instance, when most of the energy is located at the beginning of the recording, a non-rectangular window attenuates most of the energy, degrading the signal-to-noise ratio.[8]

One might wish to measure the harmonic content of a musical note from a particular instrument or the harmonic distortion of an amplifier at a given frequency. Referring again to Figure 2, we can observe that there is no leakage at a discrete set of harmonically-related frequencies sampled by the DFT. (The spectral nulls are actually zero-crossings, which cannot be shown on a logarithmic scale such as this.) This property is unique to the rectangular window, and it must be appropriately configured for the signal frequency, as described above.

Figure 4: Two different ways to generate an 8-point Hann window sequence for spectral analysis applications. MATLAB calls them "symmetric" and "periodic". The latter is also historically called "DFT Even".

Window functions generated for digital filter design are symmetrical sequences, usually an odd length with a single maximum at the center. Windows for DFT/FFT usage, as in spectral analysis or time-frequency filtering, are even-length sequences, usually created by deleting the right-most coefficient of an odd-length, symmetrical window. These are known as periodic,[9][note 2] or DFT-even.[10] Such a window is generated by the MATLAB function hann(512,'periodic') for instance. To generate it with the formula in section Hann window, the window length (N) is 513, and the 513th coefficient of the generated sequence is discarded. With N=512, the same formula is equivalent to hann(512,'symmetric').
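The two constructions can be sketched in Python as follows. This is an illustration under the assumption that the symmetric Hann formula is 0.5 − 0.5·cos(2πn/(N−1)) for 0 ≤ n ≤ N−1, which matches the N = 513 and N = 512 cases described above; the function names are hypothetical and are not MATLAB's.

```python
import math

def hann_symmetric(N):
    """Symmetric Hann window: 0.5 - 0.5*cos(2*pi*n/(N-1)), 0 <= n <= N-1."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * n / (N - 1)) for n in range(N)]

def hann_periodic(N):
    """Periodic (DFT-even) Hann window: symmetric window of length N+1
    with its right-most coefficient deleted."""
    return hann_symmetric(N + 1)[:-1]

sym8 = hann_symmetric(8)   # zero at both ends, double maximum at n = 3 and n = 4
per8 = hann_periodic(8)    # zero at n = 0 only, single maximum at n = 4
```

In the periodic sequence the samples at n and N − n are equal, which is the DFT-even symmetry discussed in the text.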

The DFT of an N-length DFT-even window has zero-valued imaginary components, because the n=0 and n=N/2 samples do not contribute to them, and the other samples are all symmetric about N/2, causing their contributions to cancel each other; i.e. $e^{-i2\pi kn/N}+e^{-i2\pi k(N-n)/N}=2\cos(2\pi kn/N)$ is real-valued. The inverse DFT is always N-periodic, but when the input is a DFT-even window, the inverse is symmetrical around the origin, which is the classic indicator of a real-valued frequency-domain representation. See section Cosine-sum windows for an example of how this characteristic can be beneficial.
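A quick numerical check of this property, as a sketch: the code assumes the symmetric Hann formula 0.5 − 0.5·cos(2πn/(N−1)), a hypothetical length N = 16, and a direct DFT. It compares the largest imaginary DFT component of a DFT-even window with that of a symmetric window of the same length.

```python
import cmath
import math

def dft(x):
    """Direct N-point DFT of sequence x."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N))
            for k in range(N)]

def hann_symmetric(N):
    """Symmetric Hann window, 0 <= n <= N-1 (assumed form)."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * n / (N - 1)) for n in range(N)]

N = 16
periodic = hann_symmetric(N + 1)[:-1]   # DFT-even: symmetric about N/2
symmetric = hann_symmetric(N)           # symmetric about (N-1)/2 instead

max_imag_periodic = max(abs(c.imag) for c in dft(periodic))    # rounding error only
max_imag_symmetric = max(abs(c.imag) for c in dft(symmetric))  # clearly non-zero
```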

For a window function with zero-valued end-points, deleting one or both end-points has no effect on its DTFT (spectral leakage). But the function designed for N+1 samples, in anticipation of deleting an end point, typically has a slightly narrower main lobe, slightly higher sidelobes, and a slightly lower noise bandwidth. Similarly, deleting both zeros from a function designed for N+2 samples further amplifies those effects.

There is also a cosmetic result of truncating an N+1 sample symmetric window. It happens when we sample the DTFT only at intervals of $1/N$ cycles/sample, which is the effect of an N-point DFT. For example, the N-point DFT of the sequence generated by hann(N,'periodic') has only 3 non-zero values (see DFT-even Hann window). All the other samples coincide with zero-crossings of the DTFT, which creates an illusion of little or no spectral leakage. Such a sparse sampling only reveals the leakage into the DFT bins from a sinusoid whose frequency is also an integer DFT bin. The unseen sidelobes reveal the leakage to expect from sinusoids at other frequencies.[11] That is why it's important to sample the DTFT more densely (as we do in the subsequent sections) and choose a window that suppresses the sidelobes to an acceptable level.
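The effect of sparse versus dense sampling of the DTFT can be illustrated with a short sketch. It assumes a periodic Hann window written directly as 0.5 − 0.5·cos(2πn/N), a hypothetical length N = 16, and an arbitrary oversampling factor of 8; the helper `dtft` is illustrative.

```python
import cmath
import math

def dtft(w, f):
    """DTFT of sequence w at frequency f, in cycles per sample."""
    return sum(v * cmath.exp(-2j * math.pi * f * n) for n, v in enumerate(w))

N = 16
w = [0.5 - 0.5 * math.cos(2 * math.pi * n / N) for n in range(N)]  # periodic Hann

# The N-point DFT is the DTFT sampled at intervals of 1/N cycles/sample.
dft_mags = [abs(dtft(w, k / N)) for k in range(N)]
nonzero_bins = [k for k, m in enumerate(dft_mags) if m > 1e-9]

# Sampling 8 times more densely exposes the sidelobes between the zero-crossings.
dense_mags = [abs(dtft(w, k / (8 * N))) for k in range(8 * N)]
# Largest sample strictly between bins 2 and 3, relative to the peak, in dB.
hidden_sidelobe_db = 20 * math.log10(max(dense_mags[8 * 2 + 1: 8 * 3]) / dense_mags[0])
```

Only bins 0, 1 and N−1 are non-zero in the sparse sampling, while the dense sampling shows a first sidelobe roughly 30 dB below the peak that the DFT samples miss.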

N represents the width, in samples, of a discrete-time, symmetrical window function $w[n],\ 0\leq n\leq N-1.$ When N is an odd number, the non-flat windows have a single maximum point. When N is even, they have a double maximum.

It is sometimes useful to express $w[n]$ as a sequence of samples of the lagged version of a zero-phase function:

The rectangular window (sometimes known as the boxcar or Dirichlet window) is the simplest window, equivalent to replacing all but N values of a data sequence by zeros, making it appear as though the waveform suddenly turns on and off:

$w(n)=1.$

Other windows are designed to moderate these sudden changes, which reduces scalloping loss and improves dynamic range, as described above (see Spectral analysis).

The rectangular window is the 1st order B-spline window as well as the 0th power Power-of-sine window.

B-spline windows can be obtained as k-fold convolutions of the rectangular window. They include the rectangular window itself (k = 1), the triangular window (k = 2) and the Parzen window (k = 4).[15] Alternative definitions sample the appropriate normalized B-spline basis functions instead of convolving discrete-time windows. A kth order B-spline basis function is a piece-wise polynomial function of degree k−1 that is obtained by k-fold self-convolution of the rectangular function.
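The convolution construction can be sketched in a few lines of Python. The code follows the discrete-time route (convolving rectangular windows) rather than sampling B-spline basis functions, normalizes each result to a peak of 1 as an illustrative choice, and uses hypothetical function names and a width of 4 samples.

```python
def convolve(a, b):
    """Full discrete convolution of two sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def bspline_window(k, width):
    """Convolution of k rectangular windows of `width` samples, peak-normalized."""
    rect = [1.0] * width
    w = rect
    for _ in range(k - 1):
        w = convolve(w, rect)
    peak = max(w)
    return [v / peak for v in w]

rectangular = bspline_window(1, 4)   # k = 1: the rectangle itself
triangular = bspline_window(2, 4)    # k = 2: triangle of length 2*4 - 1 = 7
fourth_order = bspline_window(4, 4)  # k = 4: Parzen-like shape, length 4*4 - 3 = 13
```

Each additional convolution lengthens the sequence by width − 1 samples and raises the polynomial degree of the pieces by one.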

where L can be N,[10][16] N+1,[17] or N−1.[18] The last one is also known as the Bartlett window or Fejér window. All three definitions converge at large N.

The triangular window is the 2nd order B-spline window and can be seen as the convolution of two N/2 width rectangular windows. The Fourier transform of the result is the squared values of the transform of the half-width rectangular window.

In most cases, including the examples below, all coefficients ak ≥ 0. The popular periodic form has only 2K + 1 non-zero N-point DFT coefficients, and they are all real-valued (see Symmetry).[note 4] These properties make periodic cosine-sum windows a natural choice for real-time applications that require both windowed and non-windowed (rectangularly windowed) transforms, because the windowed transforms can be efficiently derived from the non-windowed transforms by convolution.[22] The formulas below are symmetric. As discussed earlier, choosing an odd value of N and dropping the last coefficient also produces a periodic (DFT-even) window.
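As an illustration of deriving a windowed transform from a non-windowed one, the sketch below uses the periodic Hann window, written as 0.5 − 0.5·cos(2πn/N), whose N-point DFT has three non-zero coefficients. Dividing those coefficients by N gives the circular convolution kernel 0.5 at bin 0 and −0.25 at bins ±1. The test data and N = 16 are arbitrary assumptions.

```python
import cmath
import math

def dft(x):
    """Direct N-point DFT of sequence x."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N))
            for k in range(N)]

N = 16
# Arbitrary test data: an off-bin sinusoid plus a constant offset.
x = [math.cos(2 * math.pi * 3.3 * n / N) + 0.25 for n in range(N)]
hann = [0.5 - 0.5 * math.cos(2 * math.pi * n / N) for n in range(N)]  # periodic Hann

X = dft(x)                                          # non-windowed transform
direct = dft([xv * wv for xv, wv in zip(x, hann)])  # window first, then transform

# Three-term circular convolution applied to the non-windowed transform.
derived = [0.5 * X[k] - 0.25 * X[(k - 1) % N] - 0.25 * X[(k + 1) % N]
           for k in range(N)]

max_error = max(abs(d - e) for d, e in zip(direct, derived))  # rounding error only
```

Because the kernel has so few terms, both transforms can be obtained from a single DFT at the cost of a few multiply-adds per bin.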

named after Julius von Hann, and sometimes referred to as Hanning, presumably due to its linguistic and formulaic similarities to the Hamming window. It is also known as raised cosine, because the zero-phase version, $w_{0}(n),$ is one lobe of an elevated cosine function.

Setting $a_{0}$ to approximately 0.54, or more precisely 25/46, produces the Hamming window, proposed by Richard W. Hamming. That choice places a zero-crossing at frequency 5π/(N − 1), which cancels the first sidelobe of the Hann window, giving it a height of about one-fifth that of the Hann window.[10][25][26]

Approximation of the coefficients to two decimal places substantially lowers the level of sidelobes,[10] to a nearly equiripple condition.[26] In the equiripple sense, the optimal values for the coefficients are a0 = 0.53836 and a1 = 0.46164.[26][27]

By common convention, the unqualified term Blackman window refers to Blackman's "not very serious proposal" of α = 0.16 (a0 = 0.42, a1 = 0.5, a2 = 0.08), which closely approximates the "exact Blackman",[28] with a0 = 7938/18608 ≈ 0.42659, a1 = 9240/18608 ≈ 0.49656, and a2 = 1430/18608 ≈ 0.076849.[29] These exact values place zeros at the third and fourth sidelobes,[10] but result in a discontinuity at the edges and a 6 dB/oct fall-off. The truncated coefficients do not null the sidelobes as well, but have an improved 18 dB/oct fall-off.[10][30]

Considering n as a real number, the Nuttall window function and its first derivative are continuous everywhere, as with the Hann window. That is, the function goes to 0 at n = 0, unlike the Blackman–Nuttall and Blackman–Harris windows, which have a small positive value at zero (a "step" from the zero outside the window), like the Hamming window. The Blackman window defined via α is also continuous with continuous derivative at the edge, but the described "exact Blackman window" is not.
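The edge behaviour of these cosine-sum windows can be checked with exact arithmetic. The sketch assumes the usual cosine-sum form a0 − a1·cos(2πn/(N−1)) + a2·cos(4πn/(N−1)), so that the value at n = 0 is a0 − a1 + a2, and assumes a1 = 1 − a0 for the Hann and Hamming windows. The function name is illustrative.

```python
from fractions import Fraction

def edge_value(a0, a1, a2=Fraction(0)):
    """Value of a0 - a1*cos(2*pi*n/(N-1)) + a2*cos(4*pi*n/(N-1)) at n = 0."""
    return a0 - a1 + a2

hann_edge = edge_value(Fraction(1, 2), Fraction(1, 2))
hamming_edge = edge_value(Fraction(25, 46), Fraction(21, 46))
blackman_edge = edge_value(Fraction(42, 100), Fraction(50, 100), Fraction(8, 100))
exact_blackman_edge = edge_value(Fraction(7938, 18608),
                                 Fraction(9240, 18608),
                                 Fraction(1430, 18608))
```

The Hann and α = 0.16 Blackman coefficients give an edge value of exactly zero, whereas the Hamming and "exact Blackman" coefficients leave a small positive step (2/23 and 8/1163 respectively), consistent with the slower sidelobe fall-off attributed to the edge discontinuity.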

A flat top window is a partially negative-valued window that has minimal scalloping loss in the frequency domain. That property is desirable for the measurement of amplitudes of sinusoidal frequency components.[14][33] Drawbacks of the broad bandwidth are poor frequency resolution and high noise bandwidth.

Flat top windows can be designed using low-pass filter design methods,[33] or they may be of the usual cosine-sum variety:

The Fourier transform of a Gaussian is also a Gaussian (it is an eigenfunction of the Fourier transform). Since the Gaussian function extends to infinity, it must either be truncated at the ends of the window, or itself windowed with another zero-ended window.[37]

The confined Gaussian window yields the smallest possible root mean square frequency width σω for a given temporal width Nσt.[40] These windows optimize the RMS time-frequency bandwidth products. They are computed as the minimum eigenvectors of a parameter-dependent matrix. The confined Gaussian window family contains the sine window and the Gaussian window in the limiting cases of large and small σt, respectively.

for any even $p$. At $p=2$, this is a Gaussian window, and as $p$ approaches $\infty$, it approaches a rectangular window. The Fourier transform of this window does not exist in closed form for general $p$. However, it retains the other benefits of being smooth and having an adjustable bandwidth. Like the Tukey window discussed later, this window naturally offers a "flat top" to control the amplitude attenuation of a time series (which cannot be controlled with a Gaussian window). In essence, it offers a good (controllable) compromise, in terms of spectral leakage, frequency resolution and amplitude attenuation, between the Gaussian window and the rectangular window.
See also [42] for a study on the time-frequency representation of this window (or function).

The so-called "Planck-taper" window is a bump function that has been widely used[44] in the theory of partitions of unity in manifolds. It is smooth (a $C^{\infty}$ function) everywhere, but is exactly zero outside of a compact region, exactly one over an interval within that region, and varies smoothly and monotonically between those limits. Its use as a window function in signal processing was first suggested in the context of gravitational-wave astronomy, inspired by the Planck distribution.[45] It is defined as a piecewise function:

where $I_{0}$ is the zeroth-order modified Bessel function of the first kind. The variable parameter α determines the tradeoff between main lobe width and side lobe levels of the spectral leakage pattern. The main lobe width, in between the nulls, is given by $2\sqrt{1+\alpha^{2}},$ in units of DFT bins,[51] and a typical value of α is 3.

Sometimes the formula for w(n) is written in terms of a parameter $\beta\ \stackrel{\text{def}}{=}\ \pi\alpha.$[50]
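As a small worked example of these two relations, the sketch below evaluates the main lobe width and the equivalent β for the typical value α = 3. The function names are illustrative only.

```python
import math

def kaiser_main_lobe_width_bins(alpha):
    """Null-to-null main lobe width, 2*sqrt(1 + alpha^2), in DFT bins."""
    return 2 * math.sqrt(1 + alpha ** 2)

def kaiser_beta(alpha):
    """Alternative parameterization: beta = pi * alpha."""
    return math.pi * alpha

width_alpha3 = kaiser_main_lobe_width_bins(3)   # 2*sqrt(10), a little over 6.3 bins
beta_alpha3 = kaiser_beta(3)                    # 3*pi, a little over 9.4
width_alpha0 = kaiser_main_lobe_width_bins(0)   # 2 bins
```

At α = 0 the expression reduces to 2 bins, the null-to-null width of the rectangular window's main lobe, so increasing α widens the main lobe in exchange for lower sidelobes.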

The Ultraspherical window was introduced in 1984 by Roy Streit[54] and has application in antenna array design,[55] non-recursive filter design,[54] and spectrum analysis.[56]

Like other adjustable windows, the Ultraspherical window has parameters that can be used to control its Fourier transform main-lobe width and relative side-lobe amplitude. Unlike other windows, it has an additional parameter which can be used to set the rate at which side-lobes decrease (or increase) in amplitude.[56][57]

The Poisson window, or more generically the exponential window, increases exponentially towards the center of the window and decreases exponentially in the second half. Since the exponential function never reaches zero, the values of the window at its limits are non-zero (it can be seen as the multiplication of an exponential function by a rectangular window[58]). It is defined by

where τ is the time constant of the function. The exponential function decays by a factor of e ≃ 2.71828, or approximately 8.69 dB, per time constant.[59]
This means that for a targeted decay of D dB over half of the window length, the time constant τ is given by $\tau = \frac{N}{2}\,\frac{8.69}{D}.$
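A minimal Python sketch of this calculation follows. It assumes the window has the form exp(−|n − N/2|/τ) sampled at n = 0…N, so that the edge lies N/2 samples from the centre, and it uses the exact value 20·log10(e) in place of the rounded 8.69 dB. The function names, N = 64 and the 60 dB target are hypothetical.

```python
import math

DB_PER_TIME_CONSTANT = 20 * math.log10(math.e)   # approximately 8.69 dB

def poisson_time_constant(N, decay_db):
    """Time constant (in samples) giving `decay_db` of decay over N/2 samples."""
    return (N / 2) * DB_PER_TIME_CONSTANT / decay_db

def poisson_window(N, tau):
    """Assumed form: exp(-|n - N/2| / tau) for n = 0..N, centred at N/2."""
    return [math.exp(-abs(n - N / 2) / tau) for n in range(N + 1)]

N = 64
tau = poisson_time_constant(N, 60.0)     # target: 60 dB down at the window edges
w = poisson_window(N, tau)
edge_db = 20 * math.log10(w[0] / w[N // 2])   # measured edge level relative to centre
```

The measured edge level reproduces the requested decay, which confirms that the time constant and the window are consistent under the stated assumptions.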

This window is the product of a Planck-taper window and a Kaiser window, the latter being defined in terms of a modified Bessel function. This hybrid window function was introduced to decrease the peak side-lobe level of the Planck-taper window while still exploiting its good asymptotic decay.[60] It has two tunable parameters, ε from the Planck-taper and α from the Kaiser window, so it can be adjusted to fit the requirements of a given signal.

When selecting an appropriate window function for an application, this comparison graph may be useful. The frequency axis has units of FFT "bins" when the window of length N is applied to data and a transform of length N is computed. For instance, the value at frequency ½ "bin" (third tick mark) is the response that would be measured in bins k and k+1 to a sinusoidal signal at frequency k+½. It is relative to the maximum possible response, which occurs when the signal frequency is an integer number of bins. The value at frequency ½ is referred to as the maximum scalloping loss of the window, which is one metric used to compare windows. The rectangular window is noticeably worse than the others in terms of that metric.

Other metrics that can be seen are the width of the main lobe and the peak level of the sidelobes, which respectively determine the ability to resolve comparable strength signals and disparate strength signals. The rectangular window (for instance) is the best choice for the former and the worst choice for the latter. What cannot be seen from the graphs is that the rectangular window has the best noise bandwidth, which makes it a good candidate for detecting low-level sinusoids in an otherwise white noise environment. Interpolation techniques, such as zero-padding and frequency-shifting, are available to mitigate its potential scalloping loss.
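Zero-padding as a remedy for scalloping loss can be sketched as follows. The example assumes a rectangular window, a complex sinusoid exactly half a bin away from the nearest DFT frequency, a hypothetical length N = 64, and padding to twice the window length; the helper name is illustrative.

```python
import cmath
import math

def dft_magnitudes(x, size):
    """Magnitudes of the `size`-point DFT of x, implicitly zero-padded to `size`."""
    return [abs(sum(v * cmath.exp(-2j * math.pi * k * n / size)
                    for n, v in enumerate(x)))
            for k in range(size)]

N = 64
# Worst case for scalloping: frequency halfway between bins 13 and 14.
x = [cmath.exp(2j * math.pi * 13.5 * n / N) for n in range(N)]

peak_plain = max(dft_magnitudes(x, N))        # DTFT sampled at integer bins only
peak_padded = max(dft_magnitudes(x, 2 * N))   # DTFT sampled every half bin
```

Padding does not change the DTFT itself; it only samples it more densely, so one of the new samples lands on the true peak and the full amplitude N is recovered.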

When the length of a data set to be transformed is larger than necessary to provide the desired frequency resolution, a common practice is to subdivide it into smaller sets and window them individually. To mitigate the "loss" at the edges of the window, the individual sets may overlap in time. See Welch method of power spectral analysis and the modified discrete cosine transform.
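The segmentation with overlap can be sketched as below. The example assumes a periodic Hann window, 50% overlap, and a constant input so that the total weight each sample receives is easy to inspect; the segment length, hop size and function names are illustrative and are not taken from the Welch method itself.

```python
import math

def hann_periodic(N):
    """Periodic Hann window: 0.5 - 0.5*cos(2*pi*n/N), 0 <= n <= N-1."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * n / N) for n in range(N)]

def overlapped_segments(data, seg_len, hop):
    """Windowed segments of `data`, each `seg_len` long, starting every `hop` samples."""
    w = hann_periodic(seg_len)
    return [[d * wv for d, wv in zip(data[start:start + seg_len], w)]
            for start in range(0, len(data) - seg_len + 1, hop)]

seg_len, hop = 16, 8                 # 50% overlap
data = [1.0] * 64                    # constant input makes the weights visible
segments = overlapped_segments(data, seg_len, hop)

# Total weight applied to each input sample across all segments.
total_weight = [0.0] * len(data)
for i, seg in enumerate(segments):
    for n, v in enumerate(seg):
        total_weight[i * hop + n] += v
```

Without overlap, samples near the segment edges would be weighted close to zero; with this particular window and hop size every interior sample receives a total weight of 1, and only the first and last half-segments of the record remain tapered.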

Two-dimensional windows are used in, e.g., image processing. They can be constructed from one-dimensional windows in either of two forms.[62]

The separable form, $W(m,n)=w(m)w(n)$, is trivial to compute. The radial form, $W(m,n)=w(r)$, which involves the radius $r=\sqrt{(m-M/2)^{2}+(n-N/2)^{2}}$, is isotropic, independent of the orientation of the coordinate axes. Only the Gaussian function is both separable and isotropic.[63] The separable forms of all other window functions have corners that depend on the choice of the coordinate axes. The isotropy/anisotropy of a two-dimensional window function is shared by its two-dimensional Fourier transform. The difference between the separable and radial forms is akin to the result of diffraction from rectangular vs. circular apertures, which can be visualized in terms of the product of two sinc functions vs. an Airy function, respectively.
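Both constructions can be sketched from a single one-dimensional function. The example assumes a zero-phase Hann lobe, 0.5 + 0.5·cos(2πt/L) for |t| ≤ L/2 and zero elsewhere, as the underlying w, a grid of (M+1)×(N+1) points centred at (M/2, N/2), and the smaller of M and N as the radial support. All names and sizes are illustrative.

```python
import math

def hann_zero_phase(t, length):
    """Zero-phase Hann lobe: 0.5 + 0.5*cos(2*pi*t/length) for |t| <= length/2, else 0."""
    if abs(t) > length / 2:
        return 0.0
    return 0.5 + 0.5 * math.cos(2 * math.pi * t / length)

def separable_window(M, N):
    """W(m, n) = w(m) * w(n), each factor centred on its own axis."""
    return [[hann_zero_phase(m - M / 2, M) * hann_zero_phase(n - N / 2, N)
             for n in range(N + 1)] for m in range(M + 1)]

def radial_window(M, N):
    """W(m, n) = w(r), with r the distance from the centre (M/2, N/2)."""
    length = min(M, N)
    return [[hann_zero_phase(math.hypot(m - M / 2, n - N / 2), length)
             for n in range(N + 1)] for m in range(M + 1)]

M = N = 16
sep = separable_window(M, N)
rad = radial_window(M, N)
```

Along the coordinate axes through the centre the two forms agree, but towards the corners they differ: at (2, 2) the radial form is already zero while the separable form is still positive, which is the axis-dependent behaviour described in the text.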

Mathematically, the noise equivalent bandwidth of transfer function H is the bandwidth of an ideal rectangular filter with the same peak gain as H that would pass the same power with white noise input. In the units of frequency f (e.g. hertz), it is given by:

where the limits of integer n ensure an odd number of coefficients and symmetry about n=0, whether N is even or odd valued.

The N-point DFT of an N-sample DFT-even Hann or Hamming window, for example, has only 3 DTFT samples that do not coincide with zero-crossings. An illustration, for N=16, can be viewed at DFT-even Hann window.

"Overlap-Add (OLA) STFT Processing | Spectral Audio Signal Processing". www.dsprelated.com. Retrieved 2016-08-07. The window is applied twice: once before the FFT (the "analysis window") and secondly after the inverse FFT prior to reconstruction by overlap-add (the so-called "synthesis window"). ... More generally, any positive COLA window can be split into an analysis and synthesis window pair by taking its square root.

Welch, P. (1967). "The use of fast Fourier transform for the estimation of power spectra: A method based on time averaging over short, modified periodograms". IEEE Transactions on Audio and Electroacoustics. 15 (2): 70. doi:10.1109/TAU.1967.1161901.

"Matlab for the Gaussian Window". ccrma.stanford.edu. Retrieved 2016-04-13. Note that, on a dB scale, Gaussians are quadratic. This means that parabolic interpolation of a sampled Gaussian transform is exact. ... quadratic interpolation of spectral peaks may be more accurate on a log-magnitude scale (e.g., dB) than on a linear magnitude scale

Kaiser, James F.; Kuo, Franklin F. (1966). System Analysis by Digital Computer. John Wiley and Sons. pp. 232–235. This family of window functions was "discovered" by Kaiser in 1962 following a discussion with B. F. Logan of the Bell Telephone Laboratories. ... Another valuable property of this family ... is that they also approximate closely the prolate spheroidal wave functions of order zero