ORGANIZING SECRETARIAT

D.G.M.P. srl
Via di Scornigiana, trav. B
56014 Ospedaletto, Pisa - ITALY
Phone: +39 050 989.310
Fax: +39 050 981.264
E-mail: info@dgmpincor.it

# 14th European Signal Processing Conference Program

Room: Auditorium

## 9:00 AM - 11:00 AM

### Tue.2.1: Image Segmentation - 6 papers

Chair: Paulo Correia (Instituto Superior Técnico, Portugal)
Image segmentation with a class-adaptive spatially constrained mixture model
Christophoros Nikou (University of Ioannina, Greece); Nikolaos Galatsanos (University of Ioannina, Greece); Aristidis Likas (University of Ioannina, Greece); Konstantinos Blekas (University of Ioannina, Greece)
We propose a hierarchical and spatially variant mixture model for image segmentation where the pixel labels are random variables. Distinct smoothness priors are imposed on the label probabilities and the model parameters are computed in closed form through maximum a posteriori (MAP) estimation. More specifically, we propose a new prior for the label probabilities that enforces spatial smoothness of a different degree for each cluster. By taking spatial information into account, adjacent pixels are more likely to belong to the same cluster (which is intuitively desirable). Also, all of the model parameters are estimated in closed form from the data. Experiments indicate that our approach compares favorably to both standard and previous spatially constrained mixture model-based segmentation techniques.
Supervised evaluation of synthetic and real contour segmentation results
Helene Laurent (Laboratoire Vision et Robotique, France); Sebastien Chabrier (Laboratoire Vision et Robotique, France); Christophe Rosenberger (ENSI de Bourges - Université d'Orléans, France); Yu-Jin Zhang (Tsinghua University, P.R. China)
This article presents a comparative study of 14 supervised evaluation criteria for image segmentation results. A preliminary study on synthetic segmentation results allows us to globally characterise the behaviours of the selected criteria. This first analysis is then extended to a selection of 300 real images extracted from the Corel database. Ten segmentation methods based on threshold selection are used to generate real segmentation results covering various situations of under- and over-segmentation. Experimental results reveal the advantages and limitations of the studied criteria when faced with these situations.
Enhanced Spatial-Range Mean Shift Color Image Segmentation by Using Convergence Frequency and Position
Nuan Song (Chalmers University of Technology, Sweden); Irene Y.H. Gu (Chalmers University of Technology, Sweden); ZhongPing Cao (Chalmers University of Technology, Sweden); Mats Viberg (Chalmers University of Technology, Sweden)
Mean shift is robust for image segmentation through local mode seeking. However, like most segmentation schemes it suffers from over-segmentation due to the lack of semantic information. This paper proposes an enhanced spatial-range mean shift segmentation approach, where over-segmented regions are reduced by exploiting the positions and frequencies at which mean shift filters converge. Based on our observation that edges are related to spatial positions with low mean shift convergence frequencies, merging of over-segmented regions can be guided away from the perceptually important image edges. Simulations have been performed and results have shown that the proposed scheme is able to reduce the over-segmentation while maintaining sharp region boundaries for semantically important objects.
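As a generic illustration of the mode-seeking iteration underlying the mean-shift papers above (a minimal one-dimensional sketch with a Gaussian kernel; it is not the spatial-range variant the papers use, and all names, bandwidths, and data here are illustrative assumptions):

```python
import numpy as np

def mean_shift_1d(samples, starts, bandwidth=1.0, n_iter=50, tol=1e-6):
    """Generic 1-D mean shift mode seeking with a Gaussian kernel.

    Each starting point is iteratively moved to the kernel-weighted
    mean of the samples until it converges to a local density mode.
    """
    modes = []
    for x in np.atleast_1d(starts).astype(float):
        for _ in range(n_iter):
            w = np.exp(-0.5 * ((samples - x) / bandwidth) ** 2)  # kernel weights
            x_new = np.sum(w * samples) / np.sum(w)              # weighted mean
            if abs(x_new - x) < tol:
                break
            x = x_new
        modes.append(x)
    return np.array(modes)

# Two well-separated clusters: the two starts converge to the two modes.
rng = np.random.default_rng(1)
data = np.concatenate([rng.normal(0, 0.3, 200), rng.normal(5, 0.3, 200)])
print(mean_shift_1d(data, starts=[-1.0, 6.0], bandwidth=0.5))  # modes near 0 and 5
```

The bandwidth plays the role the papers' spatial-range kernel scale plays: too large a bandwidth merges distinct modes (under-segmentation), too small a one fragments them (over-segmentation).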
Multi-scale Image Segmentation in a Hierarchy of Partitions
Olivier Lezoray (University of Caen, France); Cyril Meurie (University of Caen, France); Philippe Belhomme (University of Caen, France); Abderrahim Elmoataz (University of Caen, France)
In this paper, we propose a new multi-scale image segmentation method relying on a hierarchy of partitions. First of all, we review morphological methods based on connections which produce hierarchical image segmentations, and introduce a new way of generating fine segmentations. From connection-based fine partitions, we focus on producing hierarchical segmentations. Multi-scale segmentations obtained by scale-space or region-merging approaches have both benefits and drawbacks; we therefore propose to integrate a scale-space approach in the production of a hierarchy of partitions which merges similar regions. Starting from an initial over-segmented fine partition, a region adjacency graph is alternately simplified and decimated. The simplification of graph node models tends to make them similar, and this is used to merge them. The algorithm can be used to simplify the image at low scales and to segment it at high scales.
Identification of image structure by the mean shift procedure for hierarchical MRF-based image segmentation
Raffaele Gaetano (Università "Federico II" di Napoli, Italy); Giovanni Poggi (Università "Federico II" di Napoli, Italy); Giuseppe Scarpa (Università "Federico II" di Napoli, Italy)
Tree-structured Markov random fields have been recently proposed in order to model complex images and to allow for their fast and accurate segmentation. By modeling the image as a tree of regions and subregions, the original K-ary segmentation problem can be recast as a sequence of reduced-dimensionality steps, thus reducing computational complexity and allowing for higher spatial adaptivity. Up to now, only binary tree structures have been considered, which simplifies matters but also introduces an unnecessary constraint. Here we use a more flexible structure, where each node of the tree is allowed to have a different number of children, and also propose a simple technique to estimate such a structure based on the mean shift procedure. Experiments on synthetic images prove the structure estimation procedure to be quite effective, and the ensuing segmentation to be more accurate than in the binary case.
Color Image Segmentation In RGB Using Vector Angle And Absolute Difference Measures
Sanmati Kamath (Georgia Institute of Technology, USA); Joel Jackson (Georgia Institute of Technology, USA)
This paper introduces a multi-pronged approach for the segmentation of color images of man-made structures such as cargo containers and buildings. A combination of vector angle computation of the RGB data and the absolute difference between the intensity pixels is used to segment the image. This method has the advantage of removing intensity-based edges that occur where the saturation is high, while preserving them in areas of the image where saturation is very low and intensity is high. Thus, unlike previous implementations of vector angle methods and edge detection techniques, relevant objects are better segmented and unnecessary details are left out. A novel method for connecting broken edges after segmentation using the Hough transform is also presented.

### Tue.5.1: Distributed signal processing in sensor networks (Invited special session) - 6 papers

Chair: Sergio Barbarossa (University of Rome "La Sapienza", Italy)
Chair: Ananthram Swami (Army Research Lab., USA)
Hypothesis Testing Over a Random Access Channel in Wireless Sensor Networks
Elvis Bottega (University of Padova, Italy); Petar Popovski (Aalborg University, Denmark); Michele Zorzi (Università degli Studi di Padova, Italy); Hiroyuki Yomo (CTIF, Aalborg University, Denmark); Ramjee Prasad (Aalborg University, Denmark)
In the design of communication protocols for wireless sensor networks, a specific requirement emerges from the fact that the data contained in an individual sensor is not important per se; its significance is instantiated by its contribution to the overall sensing task and the decision fusion. Therefore, the communication protocols should be application-aware and operate by reckoning the utility of the carried data. In this paper we consider the problem of hypothesis testing at the fusion center (sink) when all the sensors communicate with the sink via a random access channel. Each sensor holds one bit of information: 0 (event occurred) or 1 (event did not occur). In a traditional protocol design, an existing random-access protocol is used by which the sink collects the data from all sensors and subsequently makes the decision through majority voting over the received data. In this paper, we propose approaches for joint design of the communication and the decision fusion for the application of hypothesis testing. The fusion center terminates the data gathering through the random access channel as soon as it can make a sufficiently reliable decision based on the data received so far. We describe two instances of the protocols, where the total number of sensors $N$ is known and unknown, respectively. Our results show that the proposed approaches provide optimized performance in terms of time, energy and reliability.
Reducing Power Consumption in a Sensor Network by Information Feedback
Mikalai Kisialiou (University of Minnesota, USA); Zhi-Quan Luo (University of Minnesota, USA)
We study the role of information feedback for the problem of distributed signal tracking/estimation using a sensor network with a fusion center. Assuming that the fusion center has sufficient energy to reliably feed back its intermediate estimates, we show that the sensors can substantially reduce their power consumption by using the feedback information in a manner similar to the stochastic approximation scheme of Robbins-Monro. For the problem of tracking an autoregressive source or estimating an unknown parameter, we quantify the total achievable power saving (as compared to the distributed schemes with no feedback), and provide numerical simulations to confirm the theoretical analysis.
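The Robbins-Monro stochastic approximation scheme referenced in this abstract can be sketched generically as estimating an unknown constant from noisy observations with decreasing step sizes (a toy illustration of the recursion, not the paper's feedback protocol; the target value and step-size rule below are illustrative assumptions):

```python
import numpy as np

def robbins_monro_mean(noisy_samples):
    """Robbins-Monro recursion for an unknown constant theta*:
    theta_{n+1} = theta_n + a_n * (y_n - theta_n), with steps a_n = 1/(n+1).

    With this particular step-size sequence the recursion reproduces
    the running sample mean, so it converges to theta* as n grows.
    """
    theta = 0.0
    for n, y in enumerate(noisy_samples):
        theta += (y - theta) / (n + 1)  # decreasing step sizes satisfy the RM conditions
    return theta

rng = np.random.default_rng(0)
samples = 3.0 + rng.normal(0.0, 1.0, 10000)  # noisy observations of theta* = 3
print(robbins_monro_mean(samples))           # approximately 3
```

The step sizes satisfy the classical conditions (they sum to infinity while their squares are summable), which is what guarantees convergence despite the observation noise.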
Distributed Estimation with Dependent Observations in Wireless Sensor Networks
Sung-Hyun Son (Princeton University, USA); Sanjeev Kulkarni (Princeton University, USA); Stuart Schwartz (Princeton University, USA)
A wireless sensor network with a fusion center is considered to study the effects of dependent observations on the parameter estimation problem. The sensor observations are corrupted by Gaussian noise with geometric spatial correlation. From an energy point of view, sending all the local data to the fusion center is the most costly, but leads to optimum performance results since all the dependencies are taken into account. From an estimation accuracy point of view, sending only parameter estimates is the least accurate, but is the most parsimonious in terms of communication costs. Hence, the tradeoff between energy efficiency and estimation accuracy is explored by comparing the performance of the maximum likelihood estimator (MLE) and the sample average estimator (SAE) under various topologies and communication protocols. We start by reviewing the results from the one-dimensional case and continue by extending those results to various two-dimensional topologies. Surprisingly, we discover a class of regular polygon topologies where the MLE under spatial correlation reduces to the SAE.
Distributed Estimation with Ad Hoc Wireless Sensor Networks
Ioannis Schizas (University of Minnesota, USA); Alejandro Ribeiro (University of Minnesota, USA); Georgios B. Giannakis (University of Minnesota, USA)
We consider distributed estimation of a deterministic parameter vector using an ad hoc wireless sensor network. The estimators derived are obtained as solutions of constrained convex optimization problems. Using the method of multipliers in conjunction with a block coordinate descent approach we demonstrate how the resultant algorithms can be decomposed into a set of simpler tasks suitable for distributed implementation. We show that these algorithms have guaranteed convergence to properly defined optimum estimators, and further exemplify their applicability to solving estimation problems where the signal model is completely or partially known at individual sensors. Through numerical experiments we illustrate that our algorithms outperform existing alternatives.
Distributed Detection and Estimation in Decentralized Sensor Networks: An Overview
Sergio Barbarossa (University of Rome, Italy); Gesualdo Scutari (University of Rome "La Sapienza", Italy); Ananthram Swami (Army Research Laboratory, USA)
In this work we review some of the most recent in-network computation capabilities that can be used in sensor networks to alleviate the information traffic from the sensors towards the sink nodes. More specifically, after briefly reviewing distributed average consensus techniques, we will concentrate on consensus mechanisms based on self-synchronization of coupled dynamical systems, initialized with local measurements. We will show how to achieve globally optimal distributed detection and estimation through minimum exchange of information between nearby nodes in the case where the whole network observes one common event.
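A minimal sketch of the distributed average consensus iteration reviewed in this abstract: each node repeatedly replaces its value with a weighted average of its neighbors' values, x(k+1) = W x(k), with W doubly stochastic. The four-node ring topology and its weights below are illustrative assumptions, not the paper's setup:

```python
import numpy as np

def consensus(x0, W, n_iter=100):
    """Linear average consensus: iterate x <- W x.

    If W is doubly stochastic and the graph is connected (with the
    second-largest eigenvalue magnitude below 1), every node's value
    converges to the average of the initial measurements.
    """
    x = np.array(x0, dtype=float)
    for _ in range(n_iter):
        x = W @ x  # each node mixes only with its graph neighbors
    return x

# 4-node ring with symmetric, doubly stochastic weights.
W = np.array([
    [0.50, 0.25, 0.00, 0.25],
    [0.25, 0.50, 0.25, 0.00],
    [0.00, 0.25, 0.50, 0.25],
    [0.25, 0.00, 0.25, 0.50],
])
x0 = [1.0, 2.0, 3.0, 4.0]  # local measurements
print(consensus(x0, W, n_iter=200))  # each entry -> 2.5, the network average
```

Only neighbor-to-neighbor exchanges are needed, which is the "minimum exchange of information between nearby nodes" the abstract alludes to; the self-synchronization mechanisms it surveys generalize this linear iteration to coupled dynamical systems.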
Spatio-Temporal Sampling and Distributed Compression of the Sound Field
Thibaut Ajdler (EPFL, Switzerland); Robert Konsbruck (Swiss Federal Institute of Technology - EPFL, Switzerland); Olivier Roy (Ecole Polytechnique Federale de Lausanne (EPFL), Switzerland); Luciano Sbaiz (Ecole Polytechnique Federale de Lausanne (EPFL), Switzerland); Emre Telatar (EPFL, Switzerland); Martin Vetterli (EPFL, Switzerland)
We investigate the spatio-temporal characteristics of the sound field. Spatial sampling using a set of microphones is studied for different array topologies. The reconstruction problem is also discussed. Distributed compression is then addressed using an information-theoretic point of view. In particular, optimal rate-distortion tradeoffs are derived for a linear network setup and a hearing aids configuration.

### Tue.1.1: OFDM and Multicarrier Systems - 6 papers

Room: Auditorium
Chair: Shahram Shahbazpanahi (University of Ontario Institute of Technology, Canada)
Balanced allocation strategy in multi-user OFDM with Channel State Information at the transmitter
Antonio Cipriano (ENST, France); Philippe Ciblat (ENST, France); Sophie Gault (Motorola Labs, France); Walid Hachem (Supélec, France)
In powerline or quasi-static wireless systems, the use of multi-user OFDM based communication is advocated. It is reasonable to consider that the channel is known at the transmitter and receiver. Moreover, in the downlink, a spectral mask constraint is usually imposed. Furthermore, to guarantee fairness between all the active users, a balanced-rate optimization criterion is recommended. In this context, we investigate the achievable rate region in the general case of MC-DS-CDMA, of which OFDMA is a particular case. We then propose a simplified algorithm to calculate an approximate balanced-rate solution for the OFDMA case. The loss of the OFDMA solution with respect to the MC-DS-CDMA solution is shown to be acceptable. Comparisons with other OFDMA allocation algorithms have also been performed.
Group-wise Blind OFDM ML Detection for Complexity Reduction
Tsung-Hui Chang (National Tsing Hua University, Taiwan); Wing-Kin Ma (National Tsing Hua University, Taiwan); Chong-Yung Chi (National Tsing Hua University, Taiwan)
This paper presents a low-complexity blind Maximum-Likelihood (ML) detector for Orthogonal Frequency Division Multiplexing (OFDM) systems in block fading channels. We reduce the receiver complexity by subcarrier grouping (SG), in which the OFDM block is partitioned into smaller groups and the data are then detected on a group-by-group basis. An identifiability analysis is also provided, which shows that the data in each group can be identified under a more relaxed condition than that in [1], enabling us to use a small group size for implementation efficiency. Our simulation results show that the proposed detector can provide good symbol error performance even when the group size is much smaller than the discrete Fourier transform size.
Joint Compensation of OFDM Transmitter and Receiver IQ Imbalance in the Presence of Carrier Frequency Offset
Deepaknath Tandur (Katholieke Universiteit Leuven, Belgium); Marc Moonen (Katholieke Universiteit Leuven, Belgium)
Zero-IF based OFDM transmitters and receivers are gaining a lot of interest because of their potential to enable low-cost, low-power and less bulky terminals. However, these systems suffer from In-phase/Quadrature-phase (IQ) imbalances in the front-end analog processing, which may have a huge impact on performance. We also consider the case where the local oscillator suffers from carrier frequency offset. As OFDM is very sensitive to carrier frequency offset, this distortion needs to be taken into account in the derivation and analysis of any IQ imbalance estimation/compensation scheme. In this paper, the effect of both transmitter and receiver IQ imbalance under carrier frequency offset in an OFDM system is studied, and algorithms are developed to compensate for such distortions in the digital domain.
First arrival detection based on channel estimation for positioning in wireless OFDM systems
Ali Aassie-Ali (University of Magdeburg, Germany); Van Duc Nguyen (International University Bremen, Germany); Kyandoghere Kyamakya (Department of Informatics Systems, University of Klagenfurt, Austria); Abbas S. Omar (University of Magdeburg, Germany)
Based on the estimated channel, this paper presents a new method for first-arrival estimation for positioning applications in OFDM mobile communication systems. In the new method, the characteristics of information theoretic criteria are exploited to estimate the time of arrival (TOA). The information theoretic criteria are established on the basis of the different statistical characteristics of noise and the mobile channel. In the proposed algorithm, the calculation of the autocorrelation matrix and its eigenvalues is not required; the complexity of the proposed method is therefore low. Simulation results show that the performance of the system in terms of detection rate is very high. An accurate estimate of the first arrival path (or the time of arrival) can be obtained even though the first arrival path is strongly attenuated and the system suffers from strong additive noise.
Frequency-Domain IQ-Imbalance and Carrier Frequency Offset Compensation for OFDM over Doubly Selective Channels
Imad Barhumi (KULeuven-ESAT/SCD, Belgium); Marc Moonen (Katholieke Universiteit Leuven, Belgium)
In this paper we propose a frequency-domain IQ-imbalance and carrier frequency offset (CFO) compensation and equalization for OFDM transmission over doubly selective channels. IQ-imbalance and CFO arise due to imperfections in the receiver and/or transmitter analog front-end, whereas user mobility and CFO give rise to channel time-variation. In addition to IQ-imbalance and the channel time-variation, the cyclic prefix (CP) length may be shorter than the channel impulse response length, which in turn gives rise to inter-block interference (IBI). While IQ-imbalance results in a mirroring effect, the channel time-variation results in inter-carrier interference (ICI). The frequency-domain equalizer is proposed to compensate for the IQ-imbalance taking into account ICI and IBI. The frequency-domain equalizer is obtained by transferring a time-domain equalizer to the frequency-domain resulting in the so-called per-tone equalizer (PTEQ).
Low Complexity Post-Coded OFDM Communication System : Design and Performance Analysis
Syed Faisal Shah (University of Minnesota, USA); Ahmed Tewfik (University of Minnesota, USA)
Orthogonal frequency division multiplexing (OFDM) provides a viable solution for communicating over frequency selective fading channels. However, in the presence of frequency nulls in the channel response, uncoded OFDM faces serious symbol recovery problems. As an alternative to previously reported error correction techniques in the form of pre-coding for OFDM, we propose the use of post-coding of OFDM symbols in order to achieve frequency diversity. The proposed post-coded OFDM (PC-OFDM) comprises two steps: 1) upsampling of OFDM symbols and 2) subsequent multiplication of each symbol with unit-magnitude complex exponentials. Note that PC-OFDM introduces redundancy in OFDM symbols, whereas precoded OFDM introduces redundancy in data symbols before performing the IFFT operation. The main advantages of this scheme are reduced system complexity through a simple encoder/decoder, smaller IFFT/FFT (inverse fast Fourier transform/fast Fourier transform) modules, and lower clock rates in the receiver and transmitter, leading to lower energy consumption. The proposed system is found to be equally good over Gaussian and fading channels, where it achieves the maximum diversity gain of the channel. Simulation results show that PC-OFDM performs better than existing precoded OFDM and pulse OFDM systems.

### Tue.6.1: Filter Bank Design and Analysis - 6 papers

Room: Room 4
Chair: Elio Di Claudio (University of Rome "La Sapienza", Italy)
Fast windowing technique for designing discrete wavelet multitone transceivers exploiting spline functions
Fernando Cruz-Roldán (Universidad de Alcalá, Spain); Manuel Blanco-Velasco (Universidad de Alcalá, Spain); Pilar Martín-Martín (Universidad de Alcalá, Spain); Tapio Saramaki (Tampere University of Technology, Finland)
A very fast technique for designing discrete wavelet multitone (DWMT) transceivers without using time-consuming nonlinear optimization is introduced. In this method, the filters in both the transmitting and receiving filter banks are generated from a single linear-phase finite-impulse response prototype filter and a cosine-modulation scheme, and the prototype filter is optimized by using the windowing technique. The novelty of the proposed technique lies in exploiting spline functions in the transition band of the ideal filter, instead of using the conventional brick-wall filter. In this approach, a simple line search is used for finding the passband edge of the ideal filter that minimizes a predetermined cost function. The resulting DWMT transceivers closely satisfy the perfect reconstruction property, as illustrated by means of examples.
Oversampled complex-modulated causal IIR filter banks for flexible frequency-band reallocation networks
This paper introduces a class of oversampled complex-modulated causal IIR filter banks for flexible frequency-band reallocation networks. In the simplest case, they have near perfect magnitude reconstruction (NPMR), but by adding a phase equalizer they can achieve near-PR.
Basis orthonormalization procedure impacts of the basic quadratic non-uniform spline space on the scaling and wavelet functions
Anissa Zergainoh (LSS, Supelec, France); Pierre Duhamel (LSS SUPELEC, France)
This paper investigates the mathematical framework of the multiresolution approach under the assumption that the sequence of knots is irregularly spaced. The study is based on the construction of nested non-uniform quadratic spline multiresolution spaces. We focus on the construction of suitable quadratic orthonormal spline scaling and wavelet bases. If no conditions beyond the multiresolution ones are imposed, the orthonormal basis of the quadratic spline space is represented, on each bounded interval of the sequence, by three discontinuous scaling functions. Therefore, the quadratic spline wavelet basis, closely related to the scaling basis, is also defined by a set of discontinuous wavelet functions on each bounded interval of the sequence. We show that a judicious orthonormalization procedure of the basic quadratic spline space basis allows us to (i) satisfy the continuity conditions of the scaling and wavelet functions, (ii) reduce the number of wavelet functions to a single function, and (iii) reduce the complexity of the filter bank.
Quaternionic Approach to the One-Regular Eight-Band Linear Phase Paraunitary Filter Banks
Marek Parfieniuk (Bialystok Technical University, Poland); Alexander Petrovsky (Bialystok Technical University, Poland)
Besides perfect reconstruction and linear phase, regularity is a desirable property of filter banks for image coding, as it is associated with the smoothness of the related wavelet basis. This paper shows how to constrain quaternionic factorizations of eight-band linear phase paraunitary filter banks to have the first regularity structurally imposed. The result is not very general, but some facts make it notable. Firstly, these systems are a direct extension of the standard eight-point discrete cosine transform (DCT), which facilitates practical applications. Secondly, the first regularity eliminates the DC leakage which causes visually annoying checkerboard artifacts. Finally, our solution offers clear advantages over the known ones, as the regularity conditions are formulated directly in terms of quaternionic lattice coefficients. Namely, both regularity and losslessness can be easily preserved regardless of the coefficient quantization unavoidable in finite-precision implementations.
The Design of Low-Delay Nonuniform Pseudo QMF Banks
Ying Deng (University of Utah, USA); V. John Mathews (University of Utah, USA); Behrouz Farhang-Boroujeny (University of Utah, USA)
This paper presents a method for designing low-delay nonuniform pseudo QMF banks. The method is motivated by the work of Li, Nguyen and Tantaratana, in which the nonuniform filter bank is realized by combining an appropriate number of adjacent subbands of a uniform pseudo QMF filter bank. In prior work, the prototype filter of the uniform pseudo QMF bank was constrained to have linear phase, and the overall delay associated with the filter bank was often unacceptably large for filter banks with a large number of subbands. By relaxing the linear phase constraints, this paper proposes a pseudo QMF filter bank design technique that significantly reduces the delay. An example that experimentally verifies the capabilities of the design technique is presented.
Theory and Lattice Structures for Oversampled Linear Phase Paraunitary Filter Banks with Arbitrary Filter Length
Zhiming Xu (Nanyang Technological University, Singapore); Anamitra Makur (Nanyang Technological University, Singapore)
This paper presents the theory and lattice structures of a large class of oversampled linear phase paraunitary filter banks. We deal with FIR filter banks with real-valued coefficients in which all analysis filters have the same arbitrary filter length and share the same symmetry center. Necessary existence conditions on the symmetry polarity of the filter banks are first derived. Lattice structures are developed for type-I oversampled linear phase paraunitary filter banks. Furthermore, these lattice structures are proven to be complete. Finally, several design examples are presented to confirm the validity of the theory and lattice structures.

### Tue.4.1: Beamforming - 6 papers

Room: Sala Onice
Chair: Andreas Jakobsson (Karlstad University, Sweden)
On the Efficient Implementation and Time-Updating of the Linearly Constrained Minimum Variance Beamformer
Andreas Jakobsson (Karlstad University, Sweden); Stephen Alty (King's College London, United Kingdom)
The linearly constrained minimum variance (LCMV) method is an extension of the classical minimum variance distortionless response (MVDR) filter, allowing for multiple linear constraints. Depending on the spatial filter length and the desired frequency grid, a direct computation of the resulting spatial beampattern may be prohibitive. In this paper, we exploit the rich structure of the LCMV expression to find a non-recursive computationally efficient implementation of the LCMV beamformer with fixed constraints. We then extend this implementation by means of its time-varying displacement structure to derive an efficient time-updating algorithm of the spatial spectral estimate. Numerical simulations indicate a dramatic computational gain, especially for large arrays and fine frequency grids.
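The closed-form LCMV solution that this paper implements efficiently is the textbook expression w = R^{-1}C (C^{H}R^{-1}C)^{-1} f. A direct numerical sketch follows (deliberately non-recursive and unoptimized, so it is not the paper's fast algorithm; the uniform linear array and white-noise covariance are illustrative assumptions):

```python
import numpy as np

def lcmv_weights(R, C, f):
    """LCMV beamformer: minimize w^H R w subject to C^H w = f.

    Closed form: w = R^{-1} C (C^H R^{-1} C)^{-1} f.
    Solving linear systems avoids forming explicit inverses.
    """
    Ri_C = np.linalg.solve(R, C)                         # R^{-1} C
    return Ri_C @ np.linalg.solve(C.conj().T @ Ri_C, f)  # apply (C^H R^{-1} C)^{-1} f

# MVDR as the single-constraint special case: unit gain toward one direction.
M = 8                                  # number of array elements
theta = np.deg2rad(20)                 # look direction
a = np.exp(1j * np.pi * np.arange(M) * np.sin(theta))  # ULA steering vector, half-wavelength spacing
R = np.eye(M)                          # white-noise covariance, for illustration only
w = lcmv_weights(R, a[:, None], np.array([1.0]))
print(np.abs(w.conj() @ a))            # distortionless response magnitude, = 1
```

With multiple columns in C (e.g. nulls toward interferers plus the distortionless look direction), the same closed form applies, which is the multi-constraint extension of MVDR the abstract describes.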
Generalization and Analysis of the Conventional Beamformer for Localization of Spatially Distributed Sources
Mikael Coldrey (Chalmers University of Technology, Sweden); Mats Viberg (Chalmers University of Technology, Sweden)
In this paper, we generalize the point source-based conventional beamformer (CBF) to localization of multiple distributed sources that appear in sensor array processing. A distributed source is commonly parameterized by its mean angle and spatial spread. The generalized CBF uses the principal eigenvector of the parameterized signal covariance matrix as its optimal weight vector, which is also shown to be a matched filter. The desired parameter estimates are taken as the peaks of the generalized 2-dimensional beamforming spectrum. Further, the performance of the algorithm is compared numerically to a generalized Capon estimator [1]. Finally, an asymptotic performance analysis of the proposed algorithm is provided and numerically verified.
Multichannel fast QR-decomposition RLS algorithms with explicit weight extraction
Mobien Shoaib (Helsinki University of Technology, Finland); Stefan Werner (Helsinki University of Technology, Finland); José Apolinário Jr. (IME, Brazil); Timo Laakso (Helsinki University of Technology, Finland)
Multichannel fast QR decomposition RLS (MC-FQRD-RLS) algorithms are well known for their good numerical properties and low computational complexity. However, these algorithms have been restricted to problems seeking an estimate of the output error signal. This is because their transversal weights are embedded in the algorithm variables, and are not explicitly available. In this paper we present a novel technique that can extract the filter weights associated with the MC-FQRD-RLS algorithm at any time instant. As a consequence, the range of applications is extended to include problems where explicit knowledge of the filter weights is required. The proposed weight extraction technique is used to identify the beampattern of a broadband adaptive beamformer implemented with an MC-FQRD-RLS algorithm. The results verify our claim that the extracted coefficients of the MC-FQRD-RLS algorithm are identical to those obtained by any RLS algorithm such as the inverse QRD-RLS algorithm.
Downlink beamforming under EIRP constraint in WLAN OFDM systems
Alex Kuzminskiy (Bell Laboratories, Lucent Technologies, United Kingdom)
Downlink beamforming for a wireless local area network (WLAN) orthogonal frequency-division multiplexing (OFDM) system is addressed, where an access point (AP) is equipped with multiple antennas and a terminal uses a single antenna. Conventional beamforming solutions with the total power (TP) and equivalent isotropic radiated power (EIRP) constraints are compared in typical propagation conditions. A joint optimization problem for the beamforming weights over all the sub-carriers subject to the EIRP constraint is formulated and a sub-optimal solution is proposed, which is based on grouping of non-adjacent sub-carriers. Its efficiency is illustrated in the IEEE 802.11 environment.
Interference Suppression in Multiuser Downlink MIMO Beamforming using an iterative optimization approach
Vimal Sharma (Cardiff University, United Kingdom); Sangarapillai Lambotharan (Cardiff University, United Kingdom)
Multiple antennas at the transmitter and the receiver have the potential to either increase the data rate through spatial multiplexing or enhance the quality of transmission through exploitation of diversity. In this paper, we address the problem of multi-user multiplexing using spatial diversity techniques, so that a base station can serve multiple users in the same frequency band, yielding substantial savings in bandwidth utilization. In particular, we propose several techniques that substantially improve the performance of a recently proposed signal-to-leakage maximization based algorithm. Our simulation results reveal a lower error floor and more than 10 dB improvement in BER performance.
Wiener solution for OFDM pre and post-FFT beamforming
Daniele Borio (Politecnico di Torino, Italy); Laura Camoriano (Istituto Superiore Mario Boella, Italy); Letizia Lo Presti (Politecnico di Torino, Italy)
Orthogonal Frequency Division Multiplexing (OFDM) is one of the most promising techniques for high-speed transmission over severe multipath fading channels. However, when the delays of secondary multipath rays exceed the guard interval duration, intersymbol interference causes a severe degradation in transmission performance. To solve this problem, a multiple-antenna array can be used at the receiver, not only for spectral efficiency or gain enhancement, but also for interference suppression. In this paper we analyze the asymptotic behavior of two beamforming algorithms: a low-complexity pre-FFT method and a more efficient post-FFT system. The optimum weight set for the beamformers is derived on the basis of the minimum mean square error (MMSE) criterion, and the Wiener solution is studied under different working conditions.
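As a generic illustration of the MMSE/Wiener weights the abstract refers to (not the paper's pre/post-FFT derivation; the toy setup and all variable names below are invented), the optimal beamformer solves Rw = p, with R the snapshot covariance matrix and p the cross-correlation between the array snapshots and the desired signal:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: M-element array, N snapshots X, and a desired signal d
# generated by an (unknown to the estimator) weight vector plus noise.
M, N = 4, 2000
X = rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))
true_w = rng.standard_normal(M) + 1j * rng.standard_normal(M)
d = true_w.conj() @ X + 0.1 * (rng.standard_normal(N) + 1j * rng.standard_normal(N))

# Wiener solution: w = R^{-1} p, minimizing E|d - w^H x|^2.
R = X @ X.conj().T / N          # sample covariance, M x M
p = X @ d.conj() / N            # cross-correlation vector, M x 1
w = np.linalg.solve(R, p)

y = w.conj() @ X                # beamformer output
mse = np.mean(np.abs(d - y) ** 2)
```

With enough snapshots the residual MSE approaches the noise floor and w approaches the generating weights.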

### Tue.3.1: Reverberant Environment and Human Audition - 6 papers

Room: Sala Verde
Chair: Giovanni Sicuranza (University of Trieste, Italy)
An evaluation measure for reverberant speech using tail decay modelling
Jimi Wen (Imperial College London, United Kingdom); Patrick Naylor (Imperial College London, United Kingdom)
An objective measure for the perceived effect of reverberation is an essential tool for research into dereverberation algorithms and acoustic space modelling. Two different effects contribute to the total perceived reverberation: colouration and the reverberation decay tail effect. This paper presents an objective reverberation decay tail measure, RDT, which quantifies the perceptual effect of the decay tail directly from speech. RDT uses the perceptually weighted Bark Spectral Distortion (BSD) in conjunction with an end-point search algorithm and decay modelling. The measure was evaluated against T60 and BSD for colourless and constant-colouration reverberation impulse responses.
Bernard Mulgrew (Institute for Digital Communications, The University of Edinburgh, United Kingdom)
An alternative mechanism for audio masking is postulated. This mechanism is derived as a solution to the classic problem of representing a signal as a linear combination of basis functions which are only approximately orthogonal and hence are prone to leakage. This mechanism involves augmenting each basis function or filter with an auxiliary filter. In this combined detection/estimation process the instantaneous amplitude output of the auxiliary filter sets the masking threshold for the basis filter. No interconnection between basis functions is required to compute this masking threshold. For a gammatone filter bank the auxiliary filter is formed from the cascade of the gammatone itself and a single zero notch filter. The single zero (in the z-plane) has the same frequency as the centre frequency of the gammatone filter and is at a radius dependent on its bandwidth.
Blind Estimation of Reverberation Time in Occupied Rooms
Yonggang Zhang (Cardiff University, United Kingdom); Jonathon Chambers (Cardiff University Wales, United Kingdom); Francis Li (Manchester Metropolitan University, United Kingdom); Paul Kendrick (University of Salford, United Kingdom); Trevor Cox (University of Salford, United Kingdom)
A new framework is proposed in this paper to solve the reverberation time (RT) estimation problem in occupied rooms. In this framework, blind source separation (BSS) is combined with an adaptive noise canceller (ANC) to remove the noise from the passively received reverberant speech signal. A polyfit preprocessing step is then used to extract the free decay segments of the speech signal. RT is extracted from these segments with a maximum-likelihood (ML) based method. An easy, fast and consistent method to calculate the RT via the ML estimation method is also described. This framework provides a novel method for blind RT estimation with robustness to ambient noise within an occupied room, and extends the ML method for RT estimation from noise-free cases to more realistic situations. Simulation results show that the proposed framework can provide a good estimate of RT in simulated low-RT occupied rooms.
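The decay-fitting step can be sketched generically (a synthetic free-decay segment, not the paper's BSS/ANC front end or ML estimator; sampling rate, noise level and names are invented): fit a line to the log-energy decay and extrapolate to a 60 dB drop.

```python
import numpy as np

rng = np.random.default_rng(0)
fs = 8000
rt_true = 0.4                         # reverberation time in seconds
t = np.arange(int(0.2 * fs)) / fs

# Synthetic free-decay segment: log-energy falls 60 dB in rt_true seconds,
# plus measurement noise.
env_db = -(60.0 / rt_true) * t + 0.5 * rng.standard_normal(t.size)

# Polyfit step: first-order fit to the log-energy decay; the slope is the
# decay rate in dB/s, and RT is the time needed to fall by 60 dB.
slope, intercept = np.polyfit(t, env_db, 1)
rt_est = -60.0 / slope
```

The slope of the fitted line is robust to moderate noise on the decay curve, which is why a polyfit preprocessing stage is a natural choice.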
Analysis of a Multichannel Filtered-X Partial-Error Affine Projection Algorithm
Alberto Carini (University of Urbino, Italy); Giovanni Sicuranza (University of Trieste, Italy)
The paper provides an analysis of the transient and steady-state behavior of a filtered-x partial-error affine projection algorithm suitable for multichannel active noise control. The analysis relies on energy conservation arguments; it does not apply independence theory, nor does it impose any restrictions on the signal distributions. The paper shows that, in the presence of stationary input signals, the partial-error filtered-x affine projection algorithm converges to a cyclostationary process, i.e., the mean value of the coefficient vector, the mean-square error and the mean-square deviation tend to periodic functions of the sample time.
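The affine projection update at the core of such algorithms can be sketched in its plain single-channel form (a system-identification toy, without the filtered-x and partial-error structure of the paper; lengths, step size and names are invented):

```python
import numpy as np

rng = np.random.default_rng(1)
L, P, mu, delta = 8, 4, 0.5, 1e-4     # filter length, projection order, step, regularization

w_true = rng.standard_normal(L)       # unknown system to identify
w = np.zeros(L)
x = rng.standard_normal(5000)

for n in range(L + P, len(x)):
    # U collects the last P input regressors as columns (L x P).
    U = np.column_stack([x[n - p - L + 1 : n - p + 1][::-1] for p in range(P)])
    d = U.T @ w_true                  # desired outputs for those regressors
    e = d - U.T @ w                   # a-priori error vector
    # Affine projection update: step toward the intersection of P hyperplanes.
    w += mu * U @ np.linalg.solve(U.T @ U + delta * np.eye(P), e)

err = np.linalg.norm(w - w_true)
```

In the noiseless case the coefficient error decays geometrically; the paper's contribution is characterizing this behavior for the multichannel filtered-x, partial-error variant.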
On Robust Inverse Filter Design for Room Transfer Function Fluctuations
Takafumi Hikichi (NTT Communication Science Laboratories, Japan); Marc Delcroix (Hokkaido University, Japan); Masato Miyoshi (NTT Communication Science Laboratories, Japan)
Dereverberation methods based on the inverse filtering of room transfer functions (RTFs) are attractive because high deconvolution performance can be achieved. Although many methods assume that the RTFs are time-invariant, this assumption is not guaranteed in practice. This paper deals with the sensitivity of a dereverberation algorithm based on inverse filtering. We evaluate the effect of RTF fluctuations caused by source position changes on the dereverberation performance. We focus on the filter energy with a view to making the filter less sensitive to these fluctuations. By adjusting three design parameters, namely filter length, modeling delay, and regularization parameter, the signal-to-distortion ratio was improved by up to 15 dB when the source position was changed by one-eighth of a wavelength, whereas conventional investigations have claimed that such a variation would cause a large degradation.
Masahiro Yukawa (Tokyo Institute of Technology, Japan); Isao Yamada (Tokyo Institute of Technology, Japan)
Adaptive Projected Subgradient Method (APSM) serves as a unified guiding principle of various set-theoretic adaptive filtering algorithms including NLMS/APA. APSM asymptotically minimizes a sequence of non-negative convex functions in a real-Hilbert space. On the other hand, the exponentially weighted stepsize projection (ESP) algorithm has been reported to converge faster than APA in the acoustic echo cancellation (AEC) problem. In this paper, we first clarify that ESP is derived by APSM in a real Hilbert space with a special inner product. This gives us an interesting interpretation that ESP is based on iterative projections onto the same convex sets as APA with a special metric. We can thus expect that a proper choice of metric will lead to improvement of convergence speed. We then propose an efficient adaptive algorithm named adaptive quadratic-metric parallel subgradient projection (AQ-PSP). Numerical examples demonstrate that GEM-PSP with a very simple metric achieves even better echo canceling ability than ESP, proportionate NLMS, and Euclidean-metric version of AQ-PSP, while keeping low computational complexity.

Room: Auditorium

## 11:20 AM - 1:00 PM

### Tue.2.2: Filter Design and Analysis - 5 papers

Chair: Wolfgang Mecklenbraeuker (TU Wien, Austria)
The Design of Multi-Dimensional Complex-Valued FIR Digital Filters by Projections Onto Convex Sets
Wail Mousa (University of Leeds, United Kingdom); Des McLernon (The University of Leeds, United Kingdom); Said Boussakta (University of Leeds, United Kingdom)
We propose the design of multi-dimensional (m-D) complex-valued FIR digital filters using the theory of Projections Onto Convex Sets (POCS). The proposed design algorithm generalizes the one-dimensional (1-D) real-valued FIR filter design case to m-D complex-valued FIR filters. Simulation results show that the resulting frequency responses possess an approximately equiripple nature. They also illustrate the superiority of the POCS designs compared with the complex Remez filter design method.
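The POCS iteration itself can be illustrated on a 2-D toy problem (a line and a disc, not the paper's m-D filter-design constraint sets; centres, radii and names are invented): alternating orthogonal projections converge to a point in the intersection of the convex sets.

```python
import numpy as np

def project_line(p):
    # Orthogonal projection onto the line {y = x}.
    m = (p[0] + p[1]) / 2.0
    return np.array([m, m])

def project_disc(p, c=np.array([2.0, 0.0]), r=1.5):
    # Orthogonal projection onto the disc of radius r centred at c.
    d = p - c
    n = np.linalg.norm(d)
    return p if n <= r else c + r * d / n

# POCS: alternate the projections; since the two sets intersect,
# the iterates converge to a point of the intersection.
p = np.array([0.0, 5.0])
for _ in range(200):
    p = project_line(project_disc(p))
```

In filter design, each constraint (e.g. a frequency-response bound) plays the role of one convex set, and the projections are applied in the coefficient or frequency domain.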
On the Time Invariance of Linear Systems
Valentina Cellini (University of Padova, Italy); Gianfranco Cariolaro (University of Padova, Italy); Francesco De Pellegrini (Universita di Padova, Italy)
In this paper we revisit the concept of time invariance of a system, using exclusively a time-domain approach. To this aim, we adopt a group-theoretical formulation which lets us recognize that the concept of time invariance is truly confined to a few possible cases. This is because, for a general multirate linear system, the shift invariance property depends not only on the kernel, but also on the input and output domains. We illustrate the concept with an example of application in one-dimensional domains, indicating that our definition has a useful impact on the analysis and synthesis of multirate linear systems. Furthermore, the proposed approach permits us to extend the concept of system invariance to multidimensional domains.
Variable Digital Filter Design with Least Square Criterion subject to Peak Gain Constraints
Hai Huyen Dam (Western Australian Telecommunications Research Institute, Australia); Antonio Cantoni (University of Western Australia, Australia); Kok Teo (Curtin University of Technology, Australia); Sven Nordholm (Western Australian Telecommunications Research Institute, Australia)
Variable digital filters (VDFs) are useful for various signal processing and communication applications where the frequency characteristics, such as fractional delays and cutoff frequencies, can be varied online. In this paper, we present a formulation that allows the trade-off between the integral squared error and the maximum deviation from the desired response in the passband and stopband. With this formulation, the maximum deviation can be reduced below the least square solution with a slight change in the performance of the integral squared error. Similarly, the total square error can be reduced below the minmax solution with a minor change in the maximum deviation from the minmax solution. Efficient numerical schemes with adaptive grid size are proposed for solving the optimization problems.
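The least-squares side of such designs can be sketched generically (a fixed, non-variable linear-phase lowpass here, without the paper's variable parameters or peak-gain constraints; the specifications below are arbitrary): the amplitude response is linear in the coefficients, so minimizing the integral squared error reduces to a linear least-squares problem on a frequency grid.

```python
import numpy as np

N = 31                                   # filter length (odd: type-I linear phase)
M = (N - 1) // 2
w = np.linspace(0.0, np.pi, 512)
D = np.where(w <= 0.3 * np.pi, 1.0, 0.0)           # ideal lowpass response
band = (w <= 0.25 * np.pi) | (w >= 0.35 * np.pi)   # exclude a transition band

# Amplitude of a type-I filter: A(w) = a0 + 2 * sum_k a_k cos(k w),
# with a_k = h[M - k]; linear in the a_k, so least squares applies.
C = np.cos(np.outer(w[band], np.arange(M + 1)))
C[:, 1:] *= 2.0
a, *_ = np.linalg.lstsq(C, D[band], rcond=None)
h = np.concatenate([a[:0:-1], a])                  # symmetric impulse response

peak_err = np.max(np.abs(C @ a - D[band]))         # maximum deviation on the grid
```

The paper's formulation adds explicit constraints on this peak deviation, trading a slight increase in the squared error for a smaller maximum ripple.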
Parametric Higher-Order Shelving Filters
Martin Holters (Helmut-Schmidt-University, Germany); Udo Zölzer (Helmut-Schmidt-University, Germany)
The main characteristic of shelving filters, as commonly used in audio equalization, is to amplify or attenuate a certain frequency band by a given gain. For parametric equalizers, a filter structure is desirable that allows independent adjustment of the width and center frequency of the band, and the gain. In this paper, we present a design for arbitrary-order shelving filters and a suitable parametric digital filter structure. A low-shelving filter design based on Butterworth filters is decomposed such that the gain can be easily adjusted. Transformation to the digital domain is performed, keeping gain and denormalized cut-off frequency independently controllable. Finally, we obtain band- and high-shelving filters using a simple manipulation, providing the desired parametric filter structure.
Reconstruction of M-Periodic Nonuniformly Sampled Signals Using Multivariate Polynomial Impulse Response Time-Varying FIR Filters
This paper introduces multivariate polynomial impulse response time-varying FIR filters for reconstruction of M-periodic nonuniformly sampled signals. The main advantages of these reconstruction filters are that 1) they do not require on-line filter design, and 2) most of their multipliers are fixed and can thus be implemented using low-cost dedicated multiplier elements. This is in contrast to existing filters that require on-line design as well as many general multipliers in the implementation. By using the proposed filters, the overall implementation cost may therefore be reduced in applications where the sampling pattern changes now and then. Design examples are included demonstrating the usefulness of the proposed filters.

### Tue.5.2: Channel Estimation - 5 papers

Chair: Antonio Napolitano (Universita di Napoli Parthenope, Italy)
Joint MIMO Channel Tracking and Symbol Decoding for Orthogonal Space-Time Block Codes
Balasingam Balakumar (McMaster University, Canada); Shahram Shahbazpanahi (University of Ontario Institute of Technology, Canada); Kiruba Kirubarajan (McMaster University, Canada)
In this paper, the problem of channel tracking is considered for multiple-input multiple-output (MIMO) communication systems where the MIMO channel is time-varying. We consider a MIMO system where an orthogonal space-time block code is used as the underlying space-time coding scheme. For such a system, a two-step MIMO channel tracking algorithm is proposed. As the first step, Kalman filtering is used to obtain an initial channel estimate for the current block based on the channel estimates obtained for previous blocks. Then, in the second step, the so-obtained initial channel estimate is refined using a decision-directed iterative method. We show that due to specific properties of orthogonal space-time block codes (OSTBCs), both the Kalman filter and the decision-directed algorithm can be significantly simplified. Simulation results show that the proposed tracking method can provide sufficiently accurate channel estimates.
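The Kalman tracking step can be illustrated generically for a scalar random-walk channel observed through known symbols (a toy model, not the paper's OSTBC-specific block filter; the noise levels and names are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(2)
T, q, r = 500, 1e-4, 1e-2             # steps, process and measurement noise variances

# Time-varying scalar channel following a random walk (a common tracking model).
h = np.cumsum(np.sqrt(q) * rng.standard_normal(T)) + 1.0
s = rng.choice([-1.0, 1.0], T)        # known (pilot or decided) symbols
y = h * s + np.sqrt(r) * rng.standard_normal(T)

h_hat, P = 0.0, 1.0                   # initial estimate and error variance
est = np.empty(T)
for k in range(T):
    P += q                             # predict: random-walk state model
    K = P * s[k] / (s[k] ** 2 * P + r) # Kalman gain
    h_hat += K * (y[k] - s[k] * h_hat) # correct with the innovation
    P *= (1 - K * s[k])
    est[k] = h_hat

rmse = np.sqrt(np.mean((est[50:] - h[50:]) ** 2))
```

In a decision-directed scheme, the decoded symbols take the place of s once the filter has locked on, which is the refinement stage the abstract describes.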
Wavelet domain channel estimation for Multiband OFDM UWB communications
Sajad Sadough (Ecole Natonale Supérieure de Techniques Avancées, France); Emmanuel Jaffrot (Ecole Nationale Supérieure de Techniques Avancées, France); Pierre Duhamel (LSS SUPELEC, France)
This paper presents a receiver that combines semi-blind channel estimation with the decoding process for multiband OFDM UWB communications. We focus in particular on reducing the number of estimated channel coefficients by taking advantage of the sparsity of UWB channels in the wavelet domain. The EM algorithm is used to estimate the channel without any need for pilot symbols inside the data frame. Channel estimation performance is enhanced by integrating a thresholding/denoising scheme within the EM algorithm, leading at the same time to a reduction in estimator complexity. Simulation results using IEEE UWB channel models show a 3 dB SNR improvement at a BER of 10^-3 compared to training-sequence-based channel estimation.
Blind subspace-based channel identification for quasi-synchronous MC-CDMA systems employing improper data symbols
Giacinto Gelli (University of Napoli - Federico II, Italy); Francesco Verde (Università degli Studi di Napoli Federico II, Italy)
The problem of blind channel identification in quasi-synchronous (QS) multicarrier code-division multiple-access (MC-CDMA) systems is considered. When improper modulation schemes are adopted, improved subspace-based algorithms, which process both the received signal and its complex-conjugate version, must be employed in order to also exploit the channel information contained in the conjugate correlation function of the channel output. An improved subspace-based algorithm for QS-MC-CDMA systems is devised herein which, compared with a recently proposed subspace-based identification method [1], allows one to achieve improved performance. The identifiability issues concerning the proposed method are addressed in detail, and translated into explicit conditions on the maximum number of users, their corresponding channels, and their spreading codes. Finally, numerical simulations are provided to assess the performance of the considered algorithm, in comparison with that of [1].
Low Complexity MIMO Channel Estimation for FEXT Precompensation in Vectored xDSL Systems
Pawel Turcza (AGH University of Science and Technology, Poland)
Far-end crosstalk (FEXT) is the major factor limiting further increases in data rate in xDSL systems. Fortunately, FEXT can be removed by modelling the transmission channel as a MIMO channel and applying vectored transmission. Given knowledge of the MIMO channel transfer function, FEXT can be completely cancelled by appropriate pre-distortion of the transmitted signals. In this paper we therefore propose a low-complexity MIMO channel estimator employing Set-Membership Adaptive Recursive Techniques (SMART) with additional DFT-based interpolation. As shown, the proposed method is very accurate, enabling systems with FEXT pre-compensation to approach the performance of systems operating on a FEXT-free channel. It is also efficient in terms of the required amount of training data.
A Low Complexity Iterative Channel Estimation and Equalisation Scheme for (Data-Dependent) Superimposed Training
Syed Moosvi (The University of Leeds, United Kingdom); Des McLernon (The University of Leeds, United Kingdom); Enrique Alameda-Hernandez (The University of Leeds, United Kingdom); Aldo Orozco-Lugo (CINVESTAV-IPN, Mexico); M. Mauricio Lara (CINVESTAV-IPN, Mexico)
Channel estimation/symbol detection methods based on superimposed training (ST) are known to be more bandwidth efficient than those based on traditional time-multiplexed training. In this paper we present an iterative version of the ST method where the equalised symbols obtained via ST are used in a second step to improve the channel estimation, approaching the performance of the more recent (and improved) data dependent ST (DDST), but now with less complexity. This iterative ST method (IST) is then compared to a different iterative superimposed training method of Meng and Tugnait (LSST). We show via simulations that the BER of our IST algorithm is very close to that of the LSST but with a reduced computational burden of the order of the channel length. Furthermore, if the LSST iterative approach (originally based on ST) is now implemented using DDST, a faster convergence rate can be achieved for the MSE of the channel estimates.
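The first-order-statistics idea behind superimposed training can be sketched as a baseline, non-iterative estimate (not the paper's iterative IST refinement; the period, training sequence and channel below are invented): averaging the received signal over the training period removes the zero-mean data and leaves the known training circularly convolved with the channel.

```python
import numpy as np

rng = np.random.default_rng(0)
L, P, Nsym = 3, 8, 8000                 # channel length, training period, symbols
h = np.array([1.0, 0.5, -0.2])          # unknown channel

c = np.array([1.0, 1, -1, 1, 1, -1, -1, 1])   # known periodic training sequence
data = rng.choice([-1.0, 1.0], Nsym)          # zero-mean data symbols
s = data + np.tile(c, Nsym // P)              # superimposed (added) training
r = np.convolve(s, h)[:Nsym] + 0.05 * rng.standard_normal(Nsym)

# Average the received signal over the period: the data averages out,
# leaving (approximately) the training circularly convolved with h.
m = r.reshape(-1, P).mean(axis=0)

# Solve the P x L circular-convolution system m[i] = sum_j h[j] c[(i - j) mod P].
C = np.array([[c[(i - j) % P] for j in range(L)] for i in range(P)])
h_est, *_ = np.linalg.lstsq(C, m, rcond=None)
```

The residual error of this estimate comes from the finite averaging of the data term, which is what the iterative equalize-and-re-estimate step described in the abstract reduces.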

### Tue.1.2: MIMO: Prototyping, VLSI and Testbeds I (Special session) - 5 papers

Room: Auditorium
Chair: Steffen Paul (Infineon AG, Germany)
Chair: Markus Rupp (TU Wien, Austria)
System Level Design Considerations for HSUPA User Equipment
Moritz Harteneck (Aeroflex Inc, United Kingdom)
Within Release 6 of the 3GPP standards, one of the most important features is High Speed Uplink Packet Access (HSUPA), or enhanced DCH (E-DCH), which is the uplink counterpart of High Speed Downlink Packet Access (HSDPA). The most notable improvements over the R99 specification are the achievable peak data rate of 5.76 Mbps, reduced latency due to a shortened transmission time interval, and increased uplink cell throughput. This has been achieved by the use of multi-code transmission on the uplink, together with an improved forward error correction scheme including the use of hybrid automatic repeat request operating between the UE and the nodeB, and tighter (nodeB-based) control of the uplink resources. In this paper, system level design considerations are derived which point out the design problems one faces when designing an HSUPA-compliant UE. First, the HSUPA system is explained; then the receiver is analysed in more detail; and finally, considerations for the RF transmitter block are presented.
IEEE 802.11n MIMO-Prototyping with Dirty RF Using the Hardware-in-the-Loop Approach
Matthias Stege (Signalion GmbH, Germany); Tim Hentschel (Signalion GmbH, Germany); Michael Löhning (Signalion GmbH, Germany); Gerhard Fettweis (Technische Universitaet Dresden, Germany); Marcus Windisch (Technische Universität Dresden, Germany)
Modern wireless systems employ highly integrated hardware. Especially for processing at radio frequencies, this high integration causes many undesired effects of signal distortion and degradation that must be simulated comprehensively before finalizing the system design. However, model accuracy is often not sufficient to obtain sound simulation results, and with sufficiently accurate models the simulation times become immense. A way out is to use real radio-frequency hardware and digital physical-layer simulations together in a hardware-in-the-loop system. Short simulation times and real-world radio characteristics are the unbeatable advantages of the hardware-in-the-loop approach.
Real-Time Implementation of a Sphere Decoder-Based MIMO Wireless System
Mikel Mendicute (University of Mondragon, Spain); Luis Barbero (University of Edinburgh, United Kingdom); Gorka Landaburu (University of Mondragon, Spain); John. S Thompson (University of Edinburgh, United Kingdom); Jon Altuna (University of Mondragon, Spain); Vicente Atxa (University of Mondragon, Spain)
This contribution analyzes the integration of the sphere decoder (SD) in a complete field-programmable gate array (FPGA)-based real-time multiple input-multiple output (MIMO) platform. The algorithm achieves the performance of the maximum likelihood detector (MLD) with reduced complexity. However, its non-deterministic complexity, depending on the noise level and the channel conditions, hinders its integration process. This paper evaluates the performance and limitations of the SD in a real-time environment where signal impairments, such as symbol timing, imperfect channel estimation or quantization effects are considered.
Real-Time Experiments on Channel Adaptive Transmission in the Multi-User Up-link at very high Data Rates using MIMO-OFDM
Thomas Haustein (Heinrich Hertz Institut Berlin, Germany); Andreas Forck (HHI, Germany); Holger Gäbler (FhG-HHI, Germany); Volker Jungnickel (Fraunhofer Institut für Nachrichtentechnik (Heinrich-Hertz-Institut) Berlin, Germany); Stefan Schiffermueller (FhG-HHI, Germany)
In this paper we focus on channel adaptive transmission in the multi-user OFDM uplink where the base station uses multiple antennas. The additional degree of freedom in space requires extra signal processing effort, which becomes challenging especially for a high-data-rate implementation in real time. Our MIMO-OFDM experimental system, which is capable of transmitting data rates beyond 1 Gbit/s, was enhanced by adaptive resource allocation, where the modulation on each antenna and each sub-carrier was controlled by a narrow-band feedback channel. We present experimental results for the total rate achieved at the base station and the individual rates per user terminal in line-of-sight and non-line-of-sight scenarios. We compare the rates expected from theory on the measured indoor channels with the rates achieved in the experiments.
An 8x8 RLS based MIMO detection ASIC for broadband MIMO-OFDM wireless transmissions
Jingming Wang (Marvell Semiconductor, USA); Babak Daneshrad (University of California, Los Angeles, USA)
This paper presents the architecture and VLSI implementation of a highly flexible MIMO detection engine which supports a wide array of configurations ranging from 1x1 to 8x8 square, as well as all possible non-symmetric MIMO configurations. The chip is specifically designed to work with an underlying OFDM modulation scheme, and can cover the range of 64 to 1024 subchannels. The chip implements an RLS based MIMO solution which provides a good balance between hardware complexity and overall system performance. To further reduce the complexity, frequency domain linear interpolation is also used. The actual implementation is based upon the highly scalable inverse QR decomposition based systolic array architecture. A single systolic array is time-multiplexed for all OFDM subchannels. This naturally overcomes the pipelining difficulty in traditional single channel systolic arrays without doubling the array size. In conjunction with the array design, a unique input tagging scheme is incorporated to allow dynamic reconfiguration of the ASIC on a per packet basis, and also to reduce power consumption when only a sub-array is needed for the operation. The final implementation of the MIMO detection engine supports up to an 8x8 configuration in 12.5 MHz of bandwidth. A 4x4 or any smaller array can also be supported at up to 25 MHz of bandwidth. The chip was fabricated using a 3.3 V/1.8 V 0.18 um CMOS technology. The resulting core layout measures 29.2 mm^2 and clocks at a maximum frequency of 58 MHz. The power consumption of the chip in a 2x2, 25 MHz configuration is 360 mW, whereas the 8x8, 12.5 MHz mode consumes 830 mW.

### Poster: Feature extraction in image and video - 10 papers

Room: Poster Area
Chair: Sevket Gumustekin (Izmir Institute of Technology, Turkey)
Morphological Image Processing for Echo Detection on Ultrasonic Signals: An application to foreign bodies detection in the alimentary industry
Ramón Miralles-Ricós (Universidad Politécnica de Valencia, Spain); Maria Jover-Andreu (Universidad Politécnica de Valencia, Spain); Ignacio Bosch-Roig (Universidad Politécnica de Valencia, Spain)
Echo detection in time-varying signals is a typical signal processing problem. A less typical application is the detection of foreign bodies in the alimentary industry. In this work we present some results of a project whose objective was to develop an automatic ultrasonic system for the detection of foreign bodies. The presented algorithm merges ideas from time-frequency representations (TFR) and morphological image processing to obtain an easy-to-implement and highly customizable algorithm that can be applied to many different products and situations.
Impact of noise on the polarimetric imaging based shape recovery of transparent objects
Romain Roux (Ecole supérieure de physique de Strasbourg, France); Jihad Zallat (Ecole supérieure de physique de Strasbourg, France); Alex Lallement (Ecole supérieure de physique de Strasbourg, France); Ernest Hirsch (Ecole supérieure de physique de Strasbourg, France)
In the field of computer vision, specifically for applications aiming at the accurate 3D reconstruction of either natural or hand-made objects, only a few methods are able to recover the shape of transparent objects. Among these, polarimetric imagery has already proved its ability to deal with such objects. In this paper, after a short presentation of the theoretical background leading to the proposed approach for recovering the shape of transparent surfaces, different processing techniques for polarimetric data are described. In particular, the advantages implied by the use of measures relying on the whole so-called Stokes vector are shown. Secondly, the method used for denoising the multichannel polarimetric image data, in order to improve the subsequent surface recovery process, is introduced. Lastly, the efficiency of the suggested method is demonstrated through experimental results obtained using simulated surface data.
Short Reference Image Quality Rating Based on Angular Edge Coherence
Licia Capodiferro (Fondazione Ugo Bordoni, Italy); Elio Di claudio (University of Rome La Sapienza, Italy); Giovanni Jacovitti (INFOCOM Dpt. University of Rome, Italy)
A very concise comparison technique for objective image quality assessment is described. The method is based on an angular edge coherence measure defined by local image expansion into a set of harmonic angular functions. Using the angular edge coherence, it is possible to estimate the relative quality of a reproduced image with respect to the original one by comparing only the values of a single statistical index extracted separately from each image. The present contribution briefly illustrates the mathematical support of this technique and provides some significant experimental examples.
Objective Video Quality Metrics: A Performance Analysis
Jose Luis Martinez Martinez (University of Castilla-La Mancha, Spain); Pedro Cuenca (Universidad Castilla La Mancha, Spain); Francisco Delicado (University of Castilla la Mancha, Spain)
In recent years, the development of novel video coding technologies has spurred interest in digital video communications. The definition of evaluation mechanisms to assess video quality will play a major role in the overall design of video communication systems. It is well known that simple energy-based metrics such as the Peak Signal-to-Noise Ratio (PSNR) are not suitable to describe the subjective degradation perceived by a viewer. Recently, new video quality metrics have been proposed in the literature that emulate human perception of video quality, producing results similar to those obtained from subjective methods. The new models have higher prediction accuracy than the PSNR method, produce consistent results over the range of data from the subjective tests, and are stable across a varying range of video sequences. In this paper, we analyze the capabilities of these new quality measures when they are applied to the most popular Hypothetical Reference Circuits (HRC), such as video compression standards, bit-error transmissions and packet losses.
Application of advanced image processing techniques to automatic Kikuchi lines detection
Rafal Fraczek (AGH, University of Science and Technology, Poland); Tomasz Zielinski (AGH University of Science and Technology, Cracow, Poland)
Automated crystal orientation measurement (ACOM) in the scanning electron microscope (SEM) is a standard technique of texture analysis (pattern recognition) used in materials science. The measurement is carried out by interpreting backscatter Kikuchi patterns, in particular by extracting the position of so-called Kikuchi bands, i.e. pairs of parallel lines. Their detection strongly depends on appropriate processing of the source image, which is usually highly corrupted by noise and has uneven background illumination. Such advanced processing is addressed in this paper. It exploits wavelet-transform-based de-noising as well as curve modification and curvelet-transform-based contrast enhancement methods. Additionally, directional, ridge-detection-type 2D filters are used to search for lines missing from pairs.
Interactive Detection of 3D Models of Building's Roofings for the Estimation of the Solar Energy Potential
Sergio Brofferio (Politecnico Di Milano, Italy); Mariaelena Gaetani (Politecnico Di Milano, Italy); Daniele Passoni (Politecnico Di Milano, Italy); Massimo Spertino (CNR-IEIIT, Italy)
The paper presents work in progress on the design and implementation of an interactive system for the detection of building roofing characteristics. For each roof pitch, data concerning height, shape, orientation, slope and useful area are estimated at different precision levels. The system operates on a cartography and two pre-processed aerial photographs that are aligned for building selection, image segmentation and 3D modelling. Each building roofing is automatically classified, and its characteristics are used for disparity computation in the two stereoscopic views and the final quantitative 3D modelling. Different disparity measurement algorithms are being experimented with to measure their precision against reference test buildings. The 3D model is the input to standard roofing solar energy potential software packages.
Points of interest extraction using pairwise clustering and spatial features
Dimitrios Besiris (University of Patras, Greece); Nikolaos Laskaris (Artificial Intelligence & Information Analysis Laboratory, Department of Informatics, Aristotle Univ, Greece); Evagellos Zigouris (University of Patras, Greece); George Economou (University of Patras, Greece); Spiros Fotopoulos (University of Patras, Greece)
In this work, the idea of local feature extraction from image data based on points of interest is revisited. The method is based on a nonparametric pairwise clustering algorithm and the application of Hubert's test statistic. The clustering algorithm iteratively partitions the input image data until it finally converges to 2 classes. On the other hand, the use of Hubert's test guarantees that the 2 classes in the feature space are associated with a well-organized structure in the image plane. Both algorithms utilize the dissimilarity matrix of the input data. The validity of the approach is demonstrated by applying the method to an image retrieval system.
A blind image super-resolution algorithm for pure translational motion
Fatih Kara (TUBITAK-UEKAE, Turkey); Cabir Vural (Sakarya University, Turkey)
In almost all super-resolution methods, the blur operator is assumed to be known. However, in practical situations this operator is unavailable, or available only to a finite extent. In this paper, a super-resolution algorithm is presented in which the blur parameters need not be known. It is a two-dimensional, single-input multiple-output extension of the well-known constant modulus algorithm, which is widely used for blind equalization in communication systems. The algorithm consists of determining a set of deconvolution filters to be applied to interpolated low-resolution, low-quality images, and is suitable only for pure translational motion and shift-invariant blur. Experimental results have shown that the proposed method can satisfactorily reconstruct the high-resolution image and remove the blur, especially for images of five bits or fewer.
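For orientation, the blind-equalization principle the paper extends to 2-D can be sketched in one dimension. This is a minimal toy of the constant modulus algorithm (CMA), not the authors' 2-D multi-image method; the channel, step size and filter length are illustrative assumptions.

```python
# Toy 1-D constant modulus algorithm (CMA): blindly equalize an unknown
# mild blur on a +/-1 (constant-modulus) source by minimizing
# J = E[(|y|^2 - 1)^2] with a stochastic gradient.
import random

random.seed(0)

# Source: +/-1 symbols (constant modulus 1)
src = [random.choice((-1.0, 1.0)) for _ in range(5000)]

# Unknown channel blur (assumed for the demo)
h = [1.0, 0.25]
rx = [sum(h[k] * src[n - k] for k in range(len(h)) if n - k >= 0)
      for n in range(len(src))]

# Adapt a 3-tap equalizer by stochastic gradient on the CM cost
w = [1.0, 0.0, 0.0]                           # centre-spike initialization
mu = 0.005
for n in range(len(w) - 1, len(rx)):
    x = rx[n - len(w) + 1:n + 1][::-1]        # regressor, most recent first
    y = sum(wi * xi for wi, xi in zip(w, x))  # equalizer output
    e = y * (y * y - 1.0)                     # CM error term
    w = [wi - mu * e * xi for wi, xi in zip(w, x)]

# Apply the converged equalizer to the whole received signal
eq = [sum(w[k] * rx[n - k] for k in range(len(w)) if n - k >= 0)
      for n in range(len(rx))]

def cm_cost(sig):
    """Average constant-modulus dispersion of a signal."""
    return sum((s * s - 1.0) ** 2 for s in sig) / len(sig)
```

After adaptation the CM cost of the equalized signal is far below that of the received signal, i.e. the blur has been (blindly) undone up to sign and delay.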
Automatic Matching of Aerial Images with Map Data using Dynamic Programming
Metin Kahraman (Izmir Institute of Technology, Turkey); Sevket Gumustekin (Izmir Institute of Technology, Turkey)
Matching aerial images with map data is an important task in several remote sensing applications such as autonomous navigation, cartography and oceanography. The unique and distinctive shapes of coastlines can be effectively exploited to solve this problem. In this study a completely automatic scheme is proposed to detect coastlines using multi-resolution texture analysis and to match the detected coastlines to a map database. A shape matching method using dynamic programming is used and tested on real satellite images of the western coast of Turkey.
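Dynamic-programming shape matching of this kind can be sketched as follows. This toy aligns two 1-D coastline signatures (e.g. curvature sampled along the contour) with a warping-path recursion; the cost function and sequences are illustrative assumptions, not the paper's exact scheme.

```python
# Dynamic-programming (DTW-style) matching of two coastline signatures.
def dp_match(a, b):
    """Minimum warping cost between two 1-D shape signatures."""
    INF = float("inf")
    n, m = len(a), len(b)
    D = [[INF] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # stretch a, stretch b, or advance both
            D[i][j] = cost + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    return D[n][m]

coast = [0.0, 0.2, 0.9, 0.4, 0.1, 0.0]            # detected coastline
map_a = [0.0, 0.2, 0.2, 0.9, 0.4, 0.1, 0.0]       # same shape, stretched
map_b = [0.9, 0.9, 0.0, 0.0, 0.5, 0.5, 0.5]       # different shape
best = min((dp_match(coast, m), name)
           for m, name in ((map_a, "A"), (map_b, "B")))
# The map candidate with the lowest DP cost is declared the match.
```

The DP recursion tolerates local stretching of the contour, which is why `map_a` (the same shape with a repeated sample) matches at zero cost.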
A novel eye-detection algorithm utilizing edge-related geometrical information
Stylianos Asteriadis (Aristotle University of Thessaloniki, Greece); Nikos Nikolaidis (Aristotle University of Thessaloniki, Greece); Andras Hajdu (Aristotle University of Thessaloniki, Greece); Ioannis Pitas (Aristotle University of Thessaloniki, Greece)
A novel method for eye detection and eye center localization, based on geometrical information is described in this paper. First, a face detector is applied to detect the facial region, and the edge map of this region is extracted. A vector pointing to the closest edge pixel is then assigned to every pixel. Length and slope information for these vectors is used to detect the eyes. For eye center localization, intensity information is used. The proposed method can work on low-resolution images and has been tested on two face databases with very good results.
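The central data structure described here, a vector field pointing from every pixel to its closest edge pixel, can be sketched by brute force. The toy edge map below is an assumption; a real system would obtain it from a face detector and edge detector, and would use a distance transform rather than this O(N·E) search.

```python
# For every pixel, compute the vector (dy, dx) to the closest edge pixel.
def nearest_edge_vectors(edges):
    """edges: 2-D list of 0/1. Returns a 2-D list of (dy, dx) vectors."""
    pts = [(y, x) for y, row in enumerate(edges)
           for x, v in enumerate(row) if v]
    field = []
    for y in range(len(edges)):
        row = []
        for x in range(len(edges[0])):
            ey, ex = min(pts,
                         key=lambda p: (p[0] - y) ** 2 + (p[1] - x) ** 2)
            row.append((ey - y, ex - x))   # vector to nearest edge pixel
        field.append(row)
    return field

edges = [[0, 0, 0],
         [1, 0, 0],
         [0, 0, 1]]
field = nearest_edge_vectors(edges)
# The lengths and slopes of these vectors are the geometric features
# that the eye detector thresholds.
```

Note that an edge pixel maps to itself, so its vector is (0, 0).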

### Poster: Image Restoration and Enhancement - 12 papers

Room: Poster Area
Chair: Stephen Marshall (University of Strathclyde, United Kingdom)
Image Sharpening by Coupled Nonlinear Diffusion on the Chromaticity-Brightness Color Representation
Takahiro Saito (Kanagawa University, Japan); Reina Nosaka (Kanagawa University, Japan); Takashi Komatsu (Kanagawa University, Japan)
Previously we presented a selective image sharpening method based on a coupled nonlinear diffusion process, which sharpens only blurred edges without enhancing noise. Our prototypal color-image sharpening methods were formulated on linear color models, namely the channel-by-channel model and the 3D vector model. These prototypal methods sharpen blurred color edges, but they do not necessarily enhance contrasts of signal variations in complex texture regions as well as in simple step-edge regions. To remedy this drawback, we extend our coupled nonlinear-diffusion sharpening method to a nonlinear non-flat color model, namely the chromaticity-brightness model, which is known to be closely related to human color perception. We modify our time-evolution PDEs for the non-flat space of the chromaticity vector and present digital implementations.
Stereo-Polarimetric Measurement of a Pair of Mueller Images for Three-Dimensional Partial Reconstruction
Jawad Elsayed Ahmad (Université Louis Pasteur, France); Yoshitate Takakura (Université Louis Pasteur, Strasbourg, France)
Stereoscopy is a well-adapted tool for performing three-dimensional partial reconstruction. Classical stereoscopy uses conventional scalar images to represent three-dimensional objects, so all the necessary tasks (segmentation, classification, edge detection, correspondence matching) are performed on classical scalar images. In some particular cases, such as improperly illuminated scenes, camera blindness caused by a bright edge response, or the detection of transparent objects, scalar images do not provide a reliable foundation from which precise three-dimensional partial reconstruction can be performed (bad segmentation, hidden contours, undetected regions, false classification). The objective of this paper is to show how, very simply, by controlling the polarization state of the imaging system we can overcome the above-mentioned problems. On the conceptual level, the contribution of polarimetry to stereoscopy is highlighted by a quantitative analysis of the precision of three-dimensional reconstruction of objects.
Improving Quality of Autofluorescence Images Using Non-rigid Image Registration
Libor Kubecka (Brno University of Technology, Czech Republic); Jiri Jan (Brno University of Technology, Czech Republic); Radim Kolar (Brno University of Technology, Czech Republic); Radovan Jirik (Brno University of Technology, Czech Republic)
This work concerns quality improvement of autofluorescence retinal images by averaging of non-rigidly registered images. The necessity of using an elastic spatial transformation model is documented, as is the need for a similarity criterion capable of dealing with the non-homogeneous and variable illumination of retinal images. The presented multilevel registration algorithm provides parameters of first affine and then B-spline free-form spatial transformations that are optimal with respect to the mutual information similarity criterion. The registration was tested on three modeled image sets of 100 images each. The difference between the artificially introduced pre-deformation displacement field and the displacement field found by our algorithm clearly showed the ability to compensate for the diverse modeled distortions. Further, the registration algorithm was used to improve the quality of realistic retinal images by averaging registered frames of image sequences. The whole method was verified by processing 16 time series of real images. The gain in signal-to-noise ratio in the averaged registered images with respect to individual frames reaches the expected value of about 4 dB without introducing visible blur. The final image was substantially less blurred than the non-registered averaged image, as documented by comparing the autocorrelation functions of both images.
An Iterative Super-Resolution Reconstruction of Image Sequences using a Bayesian Approach and Affine Block-Based Registration
Vorapoj Patanavijit (Chulalongkorn University, Thailand)
Due to their reliance on translational registration, traditional super-resolution reconstructions can be applied only to sequences with simple translational motion. This paper reviews super-resolution algorithms of the last two decades and proposes a novel super-resolution reconstruction that can be applied to real sequences with complex motion. The proposed reconstruction uses a high-accuracy registration algorithm, the fast affine block-based registration [42], within the stochastic regularization technique of Bayesian MAP estimation used to compensate for the missing measurement information. The experimental results show that the proposed reconstruction can be applied to real sequences such as Suzie, Mobile Calendar and Foreman.
Automated Removal of Overshoot Artefact from Images
Naomi Harte (Trinity College Dublin, Ireland); Anil Kokaram (Trinity College Dublin, Ireland)
This paper presents a simple but robust technique for overshoot removal from video sequences. The overshoot artefact is modelled as a filter which distorts the step response of edges within the affected image. By detecting strong vertical edges in an image and measuring the distorted step response, an equalising filter can be designed to restore the step response closer to the ideal response and hence minimise the overshoot visible in the picture. The algorithm is presented along with results for both real archived images and images with known distortions, which allows performance to be assessed in terms of Peak Signal to Noise Ratio. The performance is shown to be good both quantitatively and qualitatively. The limitation of the approach is shown to be a lack of preservation of finer detail in some cases. Performance could be enhanced by first identifying areas of fine detail in the image that should not be processed. Application to video sequences is also considered, where the equaliser update is weighted between the current and previous frame to reduce flicker artefacts.
Resolution Enhancement of Digital Images Based on Interpolation with Biasing
Yasuaki Okamoto (Musashi Institute of Technology, Japan); Akira Taguchi (Musashi Institute of Technology, Japan)
Enlargement of digital images is a fundamental technique in the digital image processing field. Generally, image enlargement is realized by linear interpolation, but the enlarged image appears blurred. Blurring is heaviest in areas that contain high-frequency components, such as step edges and peaks. In order to realize edge-preserving and peak-creating interpolation, we previously introduced the concepts of warped distance among pixels and biasing of the signal amplitude, respectively. This paper presents a novel interpolation method that uses only biasing of the signal amplitude. The proposed method can create the peak point of concavo-convex shaped signals. Furthermore, the proposed biasing method can preserve signal edges, so it is not necessary to combine biasing and warping.
Fuzzy Metrics and Peer Groups for Impulsive Noise Reduction In Color Images
Samuel Morillas (Technical University of Valencia, Spain)
A new method for removing impulsive noise pixels in color images is presented. The proposed method applies the peer group concept, defined by means of fuzzy metrics, in a novel way to detect the noisy pixels. Then, a filter that switches between the identity operation and the Arithmetic Mean Filter (AMF) is defined to perform a computationally efficient filtering operation over the noisy pixels. Comparisons with classical and recent vector filters show that the presented approach achieves a very good trade-off between noise suppression and detail preservation.
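The peer-group switching idea can be sketched on a grayscale toy image (the paper works on color vectors with fuzzy metrics; the grayscale distance and the thresholds `d` and `m` below are illustrative assumptions).

```python
# Peer-group switching filter: a pixel with too few similar neighbours
# (a small "peer group") is declared impulsive and replaced by the
# arithmetic mean of its window; all other pixels pass unchanged.
def peer_group_filter(img, d=30, m=2):
    """img: 2-D list of gray levels. Replace a pixel by the mean of its
    neighbours when fewer than m neighbours lie within distance d."""
    H, W = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(H):
        for x in range(W):
            nb = [img[j][i]
                  for j in range(max(0, y - 1), min(H, y + 2))
                  for i in range(max(0, x - 1), min(W, x + 2))
                  if (j, i) != (y, x)]
            peers = sum(1 for v in nb if abs(v - img[y][x]) <= d)
            if peers < m:                       # impulsive: switch to AMF
                out[y][x] = sum(nb) // len(nb)
    return out

img = [[10, 10, 10],
       [10, 255, 10],   # isolated impulse
       [10, 10, 10]]
clean = peer_group_filter(img)
```

The switching structure is what makes the filter cheap: the mean is computed only at pixels flagged as noisy, and uncorrupted detail is left untouched.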
Adaptive-3D-Wiener for hyperspectral image restoration: influence on detection strategy
Jean-Michel Gaucel (Institut Fresnel, France); Mireille Guillaume (Institut Fresnel, France); Salah Bourennane (Institut Fresnel, France)
In this paper we consider the problem of multichannel restoration. Current multichannel least squares restoration filters rely on the assumption that the signal autocorrelation, describing the between-channel and within-channel relationships, is separable. We propose a Wiener solution for a multichannel restoration scheme, the Adaptive-3D-Wiener filter, based on a local signal model that does not assume spectral and spatial separability. Moreover, when the number of channels is greater than three, restoration is in many cases a preprocessing step for a given application such as classification, segmentation or detection, so it is important to perform a restoration suited to the final application. To this end, the proposed filter is developed to be used as a preprocessing step for detection in hyperspectral imagery. Tests on real data show that the proposed filter enhances performance in target detection and anomaly detection applications with two well-known detection algorithms in hyperspectral imagery.
Grain Noise Reduction using the Human Visual System characteristics
Valery Naranjo Ornedo (Polytechnic University of Valencia, Spain); Soledad Gómez (Polytechnic University of Valencia, Spain); Antonio Albiol (Polytechnic University of Valencia, Spain); Jose M. Mossi (Polytechnic University of Valencia, Spain)
This paper deals with grain noise reduction in archived films. Concretely, our research has focused on finding a technique that lets us not only reduce the grain noise but also preserve the original image quality as accurately as possible. The paper first investigates the influence of different types of noise reduction filters in order to introduce the problem, and draws conclusions for each technique. Finally, a linear spatio-temporal technique is presented and described. This technique adapts a spatio-temporal filter response according to the behavior of the human visual system. Taking advantage of this, we obtain good results in reducing the grain noise of several image sequences while preserving their original quality as much as possible. The filter does not need motion estimation and is implemented using separable filters, resulting in a computationally efficient implementation.
Restoration of astrophysical images - The case of Poisson Gamma data
Celine Theys (Université de Nice Sophia Antipolis, France); Henri Lanteri (Université de Nice Sophia Antipolis, France)
Very recent technology proposes to acquire astrophysical data with L3CCD cameras in order to avoid the read-out noise of classical CCD acquisition. The physical process leading to the data has previously been described by a "Poisson-Gamma" density. We propose to discuss this model and to derive an iterative algorithm for the deconvolution of such data. Simulation results are given on synthetic astrophysical data, pointing out the interest of L3CCD cameras for the acquisition of very low intensity images.
Image Interpolation Using an Adaptive Invertible Approach
Olivier Crave (Ecole Nationale Supérieure des Télécommunications, France); Gemma Piella (Universitat Pompeu Fabra, Spain); Béatrice Pesquet-Popescu (Ecole Nationale Supérieure des Télécommunications, France)
In this paper we present a new image interpolation algorithm based on the adaptive update lifting scheme described in the article "Building nonredundant adaptive wavelets by update lifting". This scheme allows us to build adaptive wavelets that take into account the characteristics of the underlying signal. Inspired by this technique, we propose an image resizing scheme that can adapt itself to discontinuities such as edges and that ensures perfect reconstruction when going from low to high resolution and back to low resolution. Such a feature is highly desirable, for example, for back-and-forth conversion between the two existing High Definition Television formats, in order to preserve the integrity of the original image in a chain of successive transformations. The proposed algorithm adaptively updates one polyphase component of the original image and then computes the remaining components of the output image by means of a gradient-driven interpolation method.
Color Histogram Equalization using Probability Smoothing
Nikoletta Bassiou (Aristotle Univ. of Thessaloniki, Greece); Constantine Kotropoulos (Aristotle University of Thessaloniki, Greece)
A novel color histogram equalization approach is proposed that not only takes into consideration the correlation between color components in the color space, but is also enhanced by a multi-level smoothing technique adopted from the field of language modeling. In this way, the correlation between color components is taken into account and the problem of unseen values for a color component, considered either independently or in combination with others, is efficiently dealt with. The proposed method operates in the HSI color space on the intensity (I) component and on the saturation (S) component given the I component. The quality of the visually appealing equalized images was confirmed by means of entropy and Kullback-Leibler divergence estimates between the resulting color histogram and the multivariate uniform probability density function.
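The role of smoothing here can be sketched on a single channel. The toy below applies additive (Laplace-style) smoothing to the empirical histogram before building the equalizing map, which is a drastic simplification of the paper's multi-level, conditional smoothing, and all parameter values are assumptions.

```python
# Histogram equalization with additive smoothing: unseen intensity
# levels still receive non-zero probability, so the cumulative mapping
# is strictly monotone even for sparse histograms.
def equalize(values, levels=256, alpha=1.0):
    hist = [0] * levels
    for v in values:
        hist[v] += 1
    # Laplace-style smoothing: add alpha pseudo-counts to every level
    total = len(values) + alpha * levels
    prob = [(h + alpha) / total for h in hist]
    cdf, acc = [], 0.0
    for p in prob:
        acc += p
        cdf.append(acc)
    # Map each input level through the smoothed CDF
    return [int(round(cdf[v] * (levels - 1))) for v in values]

vals = [10, 10, 11, 12, 200, 200, 201]   # sparse, bimodal toy data
out = equalize(vals)
```

Because the smoothed CDF is strictly increasing, the relative ordering of intensities survives equalization, which an unsmoothed mapping with empty bins cannot guarantee in the multivariate, conditional setting the paper addresses.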

### Poster: Signal Processing Education - 3 papers

Room: Poster Area
Chair: Marek Domański (Poznań University of Technology, Poland)
Teaching Signal Processing Applications With joPAS: Java To Octave Bridge
Javier Vicente (University of Deusto, Spain); Begonya Garcia (University of Deusto, Spain); Amaia Mendez (University of Deusto, Spain); Ibon Ruiz (University of Deusto, Spain); Oscar Lage (University of Deusto, Spain)
This paper introduces the joPAS programming API, developed by the PAS research team at the University of Deusto. joPAS enables the use of Octave variables and functions from a Java program. This API therefore makes it possible both to develop signal processing applications quickly, by implementing the application's graphical interfaces in Java, and to carry out the scientific calculations in Octave. Students can easily learn to implement digital signal processing applications with this API.
E-Learning on Multimedia
Marek Domański (Poznań University of Technology, Poland)
This paper presents an e-learning course on multimedia prepared within the European Commission Leonardo da Vinci Programme Invocom: Internet-based vocational training of communication students, engineers, and technicians. The Invocom project has already finished, but the prepared courses are still available. The basic aims, target groups and prepared interactive lessons of the Multimedia Course are described, as is the application of ActiveX technology for the interactive exercises. The project proves that low-cost, efficient distance learning tools may be prepared even for highly resource-demanding lessons on image, video and audio processing and compression.
A Simulink and Texas Instruments C6713 based Digital Signal Processing Laboratory
Sharon Gannot (Bar-Ilan University, Israel); Vadim Avrin (Bar-Ilan University, Israel)
In this contribution a digital signal processing educational lab, established at the School of Electrical and Computer Engineering at Bar-Ilan University, Israel, is presented. A unique educational approach is adopted, in which sophisticated algorithms can be implemented in an intuitive top-level design using Simulink. Simultaneously, our approach gives the students the opportunity to conduct hands-on experiments with real signals and hardware, using Texas Instruments (TI) C6713 evaluation boards. With this combined approach, we try to focus the students' efforts on the DSP problems themselves rather than on the actual programming. A comprehensive ensemble of experiments, which exposes the students to a wide spectrum of DSP concepts, is introduced in this paper. The experiments were designed to illustrate and demonstrate theoretical aspects already acquired in several DSP courses in the curriculum.

### Poster: Speech Enhancement - 9 papers

Room: Poster Area
Chair: John Mathews (The University of Utah, USA)
Speech Enhancement with Kalman Filtering the Short-Time DFT Trajectories of Noise and Speech
Esfandiar Zavarehei (Brunel University, United Kingdom); Saeed Vaseghi (Brunel University, United Kingdom); Qin Yan (Brunel University, United Kingdom)
This paper presents a time-frequency estimator for enhancement of noisy speech in the DFT domain. The time-varying trajectories of the DFT of speech and noise in each channel are modelled by low-order autoregressive processes incorporated in the state equation of Kalman filters. The parameters of the Kalman filters are estimated recursively from the signal and noise in the DFT channels. The issue of convergence of the Kalman filters to noise statistics during noise-dominated periods is addressed, and a method is incorporated for restarting the Kalman filters after long periods of noise-dominated activity in each DFT channel. The performance of the proposed method is compared with cases where the noise trajectories are not explicitly modelled. The sensitivity of the method to the voice activity detector is evaluated. Evaluations show that the proposed method results in substantial improvement in the perceived quality of speech.
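The core idea for a single DFT channel can be sketched with a scalar Kalman filter tracking a complex bin trajectory modelled as an AR(1) process. The AR coefficient and variances below are illustrative assumptions; the paper estimates such parameters recursively and uses higher AR orders.

```python
# Scalar complex Kalman filter tracking one DFT-bin trajectory.
import random

random.seed(1)
a, q, r = 0.95, 0.1, 0.5   # AR(1) coefficient, process var, measurement var

# Simulate a complex AR(1) bin trajectory plus observation noise
s, states, obs = 0j, [], []
for _ in range(500):
    s = a * s + complex(random.gauss(0, q ** 0.5), random.gauss(0, q ** 0.5))
    states.append(s)
    obs.append(s + complex(random.gauss(0, r ** 0.5),
                           random.gauss(0, r ** 0.5)))

# Kalman recursion: predict with the AR model, correct with the noisy bin
x, P, est = 0j, 1.0, []
for z in obs:
    x, P = a * x, a * a * P + q           # predict
    K = P / (P + r)                       # Kalman gain
    x, P = x + K * (z - x), (1 - K) * P   # update
    est.append(x)

def mse(u, v):
    return sum(abs(ui - vi) ** 2 for ui, vi in zip(u, v)) / len(u)
```

Because the AR model captures the temporal correlation of the bin trajectory, the filtered estimate tracks the clean trajectory more closely than the raw noisy observations do.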
Predictive deconvolution and kurtosis maximization for speech dereverberation
David Fee (The Queen's University of Belfast, United Kingdom); Colin Cowan (The Queen's University of Belfast, United Kingdom); Stefan Bilbao (The University of Edinburgh, United Kingdom); Izzet Ozcelik (Queens University of Belfast, United Kingdom)
A predictive deconvolution based on the LP residual of speech is used to extract an estimate of the inverse of the minimum-phase component of a room impulse response. This inverse is applied as a prefiltering stage to a kurtosis-maximizing (KM) adaptive filter that equalises the remaining non-minimum-phase component. This was found to improve the stability and performance of the KM filter for male speech; when the LP order of the first stage was increased, the performance improved for both male and female speech.
A Novel Voiced Speech Enhancement Approach Based on Modulated Periodic Signal Extraction
Mahdi Triki (CNRS - Eurecom Institute, France); Dirk Slock (Eurecom Institute, France)
Most existing speech coding and speech enhancement techniques are based on the AR model and hence apply well to unvoiced speech. The same techniques are then applied to the voiced case by extrapolation. However, voiced speech is highly structured, so a proper approach allows one to go further than for unvoiced speech. We model a voiced speech segment as a periodic signal with slow global variation of amplitude and frequency (limited time warping). The bandlimited variation of global amplitude and frequency is expressed through a subsampled representation and parameterization of the corresponding signals. Assuming additive white Gaussian noise, a maximum likelihood approach is proposed for the estimation of the model parameters, and the optimization is performed in an iterative (cyclic) fashion that leads to a sequence of simple least-squares problems. Particular attention is paid to the estimation of the basic periodic signal, which can have a non-integer period, and to the estimation of the amplitude signal with guaranteed positivity.
Parametric Approach for Speech Denoising Using Multitapers
Werayuth Charoenruengkit (IBM, USA)
Spectral estimation is a major component of obtaining high-quality speech in many speech denoising techniques. Autoregressive spectral estimation using multitaper autoregressive data (ARMT) is a parametric approach that generates AR filter coefficients from multitaper autocorrelation estimates. The ARMT proves to be a best-fit smooth curve to the multitaper spectral estimate (MTSE); hence it has very low bias at high frequencies and even less variance than the standard MTSE. As such, ARMT is a smoother and less computationally intensive alternative to wavelet-domain reduction (denoising) of the MTSE error. In this paper, the ARMT is used to derive the optimal gain parameters in the signal subspace approach to reducing environmental noise. Objective measures and informal listening tests demonstrate that the results are indistinguishable from those of its successful predecessor, which uses the non-parametric approach for speech denoising.
Subband Particle Filtering for Speech Enhancement
Ying Deng (University of Utah, USA); V. John Mathews (University of Utah, USA)
Particle filters have recently been applied to speech enhancement when the input speech signal is modeled as a time-varying autoregressive process with stochastically evolving parameters. This type of modeling results in a nonlinear and conditionally Gaussian state-space system that is not amenable to analytical solutions. Prior work in this area involved signal processing in the fullband domain and assumed white Gaussian noise with known variance. This paper extends such ideas to subband domain particle filters and colored noise. Experimental results indicate that the subband particle filter achieves higher segmental SNR than the fullband algorithm and is effective in dealing with colored noise without increasing the computational complexity.
A Harsh Noise Assessment Measure for Speech Enhancement
Kohei Yamashita (Saitama University, Japan); Tetsuya Shimamura (Saitama University, Japan)
In this paper, we derive a new assessment measure suitable for harsh noise such as musical noise. The measure uses an image processing model and a filter based on auditory loudness. Its aim is to evaluate the harshness of noise without subjective experiments. The effectiveness of the proposed measure is shown through experiments on real speech data.
Computation of common acoustical poles in subbands by means of a clustering technique
Pedro Zuccarello (Universidad Politécnica de Valencia, Spain); Alberto Gonzalez (Universidad Politecnica de Valencia, Spain); Juan Domingo (Universitat de Valencia, Spain); Guillermo Ayala (Universitat de Valencia, Spain)
This paper presents a new and simple method for calculating the common acoustical poles of a room. A set of impulse responses is measured inside a room, and ARMA models are computed for each measured impulse response. Since the measured linear systems are stable, the poles of the ARMA models are located inside the unit circle of the Z-plane. A clustering technique can be used to group these points into clusters, whose centroids can be interpreted as the common poles of the room. The analysis was carried out on a subband basis, considering only the poles inside the pass-band of the bandpass filter. The clustering technique shows a considerable improvement compared to the averaging method proposed by Haneda et al., and the stability of the new common acoustical pole models can be theoretically assured.
Spectral Peaks Enhancement For Extracting Robust Speech Features
Babak Nasersharif (Iran University of Science and Technology, Iran); Ahmad Akbari (Iran University of Science and Technology, Iran); M. Mehdi Homayounpour (Amirkabir University of Technology, Iran)
It is generally believed that external noise added to a speech signal corrupts the speech spectrum and thus the speech features. This feature corruption degrades the performance of speech recognition systems. One solution is to reduce the effects of noise on the speech spectrum. In this paper, we propose to filter the speech spectrum to enhance its spectral peaks in the presence of noise, and then to extract robust features from the peak-enhanced spectrum. In addition, we apply the proposed filtering to another form of speech spectral representation known as the modified group delay function (GDF). Phone and word recognition results show that MFCC features extracted from the peak-enhanced spectrum are more robust to noise than MFCC derived from the plain noisy spectrum. Moreover, MFCC features extracted from the filtered GDF are more robust to noise than the other MFCC features, especially at low SNR values.
Measurement of the Effects of Nonlinearities on the Network-Based Linear Acoustic Echo Cancellation
Ted Wada (Georgia Institute of Technology, USA); Fred Juang (Georgia Institute of Technology, USA); Rafid Sukkar (Tellabs, Inc., USA)
It is well known that an over-driven loudspeaker produces a nonlinearity that limits the performance of an acoustic echo canceler (AEC). In contrast, only a handful of studies have documented the effect of speech coding nonlinearity on the AEC. This paper investigates the combined effect of both types of nonlinearities in the network-based AEC framework, as opposed to when the AEC is performed at the source of the echo, such as in a cellular handset. The simulation results show that while a mild saturation-type loudspeaker nonlinearity causes the echo return loss enhancement (ERLE) to drop significantly, it is the nonlinear speech coding distortion of the acoustic echo signal that ultimately limits the achievable ERLE. The results also show that a low bit-rate speech codec is capable of synthesizing a perceptually acceptable speech signal, but does so in a way that is intractable for traditional linear AEC algorithms.

### Tue.6.2: Transform Analysis and Implementation - 5 papers

Room: Room 4
Chair: Ljubisa Stankovic (University of Montenegro, Serbia and Montenegro)
Relations between Gabor Transforms and Fractional Fourier Transforms and Their Applications for Signal Processing
Soo-Chang Pei (National Taiwan University, Taiwan); Jian-Jiun Ding (Department of Electrical Engineering, National Taiwan University, Taiwan)
Many useful relations between the Gabor transform and the fractional Fourier transform (FRFT), a generalization of the Fourier transform, are derived. First, we find that, as with the Wigner distribution function (WDF), the FRFT is equivalent to a rotation operation on the Gabor transform. We also derive the shifting, projection, power integration, and energy sum relations between the Gabor transform and the FRFT. Since the Gabor transform is closely related to the FRFT, it can be used to analyze the effect of the FRFT. Compared with the WDF, the Gabor transform does not suffer from cross terms, which makes it a very powerful tool for fractional sampling and for designing filters in the FRFT domain. Moreover, we show that any combination of the WDF and the Gabor transform also has the rotation relation with the FRFT.
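For reference, one common form of the FRFT definition and of the rotation property invoked above is the following (sign and normalization conventions vary across the literature, so this is a representative form rather than necessarily the paper's exact one):

```latex
% FRFT of order a (rotation angle \alpha = a\pi/2), for \alpha not a
% multiple of \pi:
X_a(u) = \sqrt{\frac{1 - j\cot\alpha}{2\pi}}
         \int_{-\infty}^{\infty}
         \exp\!\Big( j\,\frac{t^2 + u^2}{2}\cot\alpha
                     - j\,u t\,\csc\alpha \Big)\, x(t)\, dt,
\qquad \alpha = \frac{a\pi}{2}.

% Rotation property: the time-frequency representation of the FRFT
% output is a rotated copy of that of the input, e.g. for the WDF
W_{X_a}(u, v) = W_x(u\cos\alpha - v\sin\alpha,\; u\sin\alpha + v\cos\alpha),

% and the paper establishes the analogous rotation relation for the
% (cross-term-free) Gabor transform.
```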
Efficient Computation Of The Discrete Pascal Transform
Athanassios Skodras (Hellenic Open University, Greece)
The recently proposed discrete Pascal transform has a computational complexity of order N² in both multiplications and additions for an N-point vector. In the present work an efficient structure is proposed which eliminates the multiplications and halves the number of additions.
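The direct O(N²) form that the paper speeds up can be sketched as a matrix product with the signed lower-triangular Pascal matrix. Under one common sign convention this matrix is an involution (P·P = I), so the transform is its own inverse:

```python
# Direct discrete Pascal transform: y[i] = sum_j (-1)^j * C(i, j) * x[j].
from math import comb

def pascal_transform(x):
    """O(N^2) direct form of the discrete Pascal transform."""
    n = len(x)
    return [sum((-1) ** j * comb(i, j) * x[j] for j in range(i + 1))
            for i in range(n)]

x = [3, 1, 4, 1, 5]
y = pascal_transform(x)
# Applying the transform twice recovers the original vector, since the
# signed Pascal matrix satisfies P @ P = I.
```

The triangular, binomial structure of the matrix is exactly what a fast, multiplierless implementation can exploit, since each row is obtained from the previous one by additions alone.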
No Minimum Rate Multisampling of a Fourier Series
Michael Grotenhuis (University of Minnesota, USA)
I examine the possibility of sampling a Fourier series with multiple uniform rates that are not required to be larger than any particular frequency. This is possible because convolution of a Fourier series with a train of delta functions in the Fourier domain causes overlap only in isolated cases. Furthermore, this overlap can be restricted so that it does not occur in more than one sampled transform. I use three different sampling rates, not required to be greater than any particular frequency, yet satisfying certain irrational relationships, which I specify. The three separate Fourier domains from each rate are compared, and a filter is used which outputs only those terms common to all three. In some cases, it might be necessary to introduce a fourth sampling rate. The result is that the original Fourier series is obtained from the filter and an inverse transform.
Extending Laplace and z Transform Domains
Michael Corinthios (Ecole Polytechnique de Montreal, Canada)
A generalisation of the Dirac delta function and its family of derivatives, recently proposed as a means of introducing impulses on the complex plane in the Laplace and z transform domains, is shown to extend the applications of the bilateral Laplace and z transforms. Transforms of two-sided signals and sequences are made possible by extending the domain of distributions to cover generalized functions of complex variables. The domains of the bilateral Laplace and z transforms are shown to extend to two-sided exponentials and fast-rising functions which, without such generalized impulses, have no transform. Applications include generalized forms of the sampling theorem, a new type of spatial convolution on the s and z planes, and solutions of differential and difference equations with two-sided, infinite-duration forcing functions and sequences.
On Improving the Performance of Multiplierless DCT Algorithms
Raymond Chan (The Chinese University of Hong Kong, Hong Kong); Moon-Chuen Lee (The Chinese University of Hong Kong, Hong Kong)
This paper investigates a number of issues having an impact on the performance of an approximated multiplierless DCT, which include: types of inverse transforms; types of normalizations; algorithm structures; and assignment of signed digits for approximating constants. Based on our experimental results, we have the following findings: (1) a transform based on a reversible inverse generally performs better than a version based on a traditional inverse; (2) a transform with a delayed or post normalization can achieve a much better performance; (3) uniform normalization can be considered a useful feature; (4) a lifting structure transform can achieve better accuracy than a non-lifting structure version; (5) an optimal configuration for the assignment of signed digits to the constants could help to boost the performance of the approximated DCT. It is believed that such findings should provide useful insights into making the proper design choices when converting a fast DCT into a multiplierless version.

### Tue.4.2: Speech and Audio Source Separation - 5 papers

Room: Sala Onice
Chair: Augusto Sarti (DEI - Politecnico di Milano, Italy)
Blind source separation of anechoic mixtures in time domain utilizing aggregated microphones
Mitsuharu Matsumoto (Waseda University, Japan); Shuji Hashimoto (Waseda University, Japan)
This paper introduces a blind source separation (BSS) algorithm in the time domain based on the amplitude gain difference of two directional microphones located at the same place, namely aggregated microphones. A feature of our approach is to treat the BSS problem of anechoic mixtures in the time domain. The sparseness approach is an attractive method for solving the sound separation problem. If the signal is sparse in the frequency domain, the sources rarely overlap. Under this condition, it is possible to extract each signal using time-frequency binary masks. In this paper, we treat non-stationary, partially disjoint signals. In other words, most of the signals overlap in the time and frequency domains, though there exist some intervals where the sound is disjoint. We first show that the source separation problem can be described not as a convolutive model but as an instantaneous model, in spite of the anechoic mixing, when aggregated microphones are assumed. We then state the necessary conditions and present the algorithm together with experimental results. In this method, we can treat the problem not in the time-frequency domain but in the time domain, owing to the characteristics of the aggregated microphones. In other words, we can consider the problem not in complex space but in real space. The mixing matrix can be identified directly from the observed signals, without estimating the intervals where the signal is disjoint at any stage of the process.
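As background for the time-frequency binary masking mentioned above, a generic sketch (not the authors' aggregated-microphone algorithm; the per-source gain ratios here are assumed known) assigns each time-frequency bin to the source whose inter-microphone level ratio best matches the observed one:

```python
import numpy as np

def level_ratio_masks(S1, S2, ratios):
    """Binary time-frequency masks from the inter-microphone level ratio.

    S1, S2 : STFTs (2-D arrays, possibly complex) of the two microphone signals.
    ratios : assumed per-source amplitude-gain ratios |S1|/|S2|.
    Each TF bin goes to the source whose assumed ratio is closest to the
    observed ratio in that bin (valid when sources are disjoint in TF)."""
    eps = 1e-12
    r = np.abs(S1) / (np.abs(S2) + eps)
    dist = np.stack([np.abs(r - rho) for rho in ratios])  # one distance map per source
    winner = np.argmin(dist, axis=0)                      # nearest assumed ratio wins
    return [(winner == k).astype(float) for k in range(len(ratios))]
```

Each source $k$ would then be estimated as `masks[k] * S1` followed by an inverse STFT.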
Blind Source Separation for Convolutive Mixtures Using Spatially Resampled Observations
Johan Synnevåg (University of Oslo, Norway); Tobias Dahl (University of Oslo, Norway)
We propose a new technique for separation of sources from convolutive mixtures based on independent component analysis (ICA). The method allows coherent processing of all frequencies, in contrast to the traditional treatment of individual frequency bands. The use of an array enables resampling of the signals in such a way that all frequency bands are effectively transformed onto the centre frequency. Subsequent separation is performed "all-bands-in-one". After resampling, a single matrix describes the mixture, allowing use of standard ICA algorithms for source separation. The technique is applied to the cocktail-party problem to obtain an initial estimate of the separating parameters, which may be further processed using cross-talk removal or filtering. Experiments with two speech sources and a four-element microphone array show that the mixing matrix found by ICA is close to the theoretical prediction, and that 15 dB separation of the sources is achieved.
Reducing reverberation effects in convolutive blind source separation
Radoslaw Mazur (University of Oldenburg, Germany); Alfred Mertins (Signal Processing Group, Dept. of Physics, University of Oldenburg, Germany)
In this paper, we propose a new method for reducing the reverberation effects in convolutive blind source separation that lead to reduced intelligibility of the separated sources (e.g., speech signals). Existing methods mainly try to maximize separation performance without paying much attention to the linear distortion in the separated signals. We propose a modification to the existing algorithms that reduces the distortions introduced by the demixing filters. In particular, we investigate the possibility of modifying the frequency responses of the demixing filters to have no spectral peaks, which leads to a near-allpass character of the overall system. The good performance of the modified algorithm is demonstrated on real-world data.
Source Extraction from Two-Channel Mixtures by Joint Cosine Packet Analysis
Andrew Nesbit (Queen Mary, University of London, United Kingdom); Mike Davies (Queen Mary, University of London, United Kingdom); Mark Plumbley (Queen Mary, University of London, United Kingdom); Mark Sandler (Queen Mary University of London, United Kingdom)
This paper describes novel, computationally efficient approaches to source separation of underdetermined instantaneous two-channel mixtures. A best basis algorithm is applied to trees of local cosine bases to determine a sparse transform. We assume that the mixing parameters are known and focus on demixing sources by binary time-frequency masking. We describe a method for deriving a best local cosine basis from the mixtures by minimising an $l^1$ norm cost function. This basis is adapted to the input of the masking process. Then, we investigate how to increase sparsity by adapting local cosine bases to the expected output of a single source instead of to the input mixtures. The heuristically derived cost function maximises the energy of the transform coefficients associated with a particular direction. Experiments on a mixture of four musical instruments are performed, and results are compared. It is shown that local cosine bases can give better results than fixed-basis representations.
Parametric Pearson approach based independent component analysis for frequency domain blind speech separation
Hiroko Kato (NTT Communication Science Laboratories, Japan); Yuichi Nagahara (Meiji University, Japan); Shoko Araki (NTT communication Science Laboratories, Japan); Hiroshi Sawada (NTT communication Science Laboratories, Japan); Shoji Makino (NTT Communication Science Laboratories, Japan)
In frequency-domain blind source separation (BSS) of speech with an independent component analysis (ICA) system, a parametric Pearson distribution system improved separation performance. ICA adaptation rules have a score function determined by the approximated source distribution, and better approximation positively affects separation performance. Previously, a conventional hyperbolic tangent (tanh) or generalized Gaussian distribution (GGD) was uniformly applied to the score function for all frequency bins, despite the fact that a wide-band speech signal has different distributions at different frequencies. To obtain better score functions, we propose a parametric Pearson distribution system approach to the ICA learning rules. The score function is estimated by appropriate Pearson distribution parameters for each frequency bin. We concretely consider three methods for Pearson distribution parameter estimation and conducted separation experiments with real speech signals convolved with actual room impulse responses. The signal-to-interference ratio (SIR) performance of the proposed methods improved significantly, by over 3 dB, compared to conventional methods.

### Tue.3.2: Signal Synthesis and Reconstruction - 5 papers

Room: Sala Verde
Chair: Abdelhak Zoubir (Darmstadt University of Technology, Germany)
Improving the Accuracy of Noise Estimation when Using Decision Feedback
Stefan Edinger (University of Mannheim, Germany); Markus Gaida (University of Mannheim, Germany); Norbert Fliege (University of Mannheim, Germany)
In discrete multitone transmissions, decision feedback is an attractive and simple method to gain information about transmission channel conditions without having to use pilot subcarriers or training sequences. Decision feedback, however, is prone to errors caused by wrong decisions. Misestimation of the true channel conditions often occurs when the signal-to-noise ratio is suddenly decreased by a disturbance. In this paper, we present novel measures to increase the overall accuracy of channel estimation. In particular, we identify a special statistic whose evaluation yields reliable noise estimates even for very low signal-to-noise ratios.
Spectral correlations and synthesis of multidimensional fractional Brownian motion
Tor Oigard (University of Tromsø, Norway); Louis Scharf (Colorado State, USA); Alfred Hanssen (University of Tromsø, Norway)
Fractional Brownian motions (fBm) provide important models for a wide range of physical phenomena whose empirical spectra obey power laws of fractional order. Extensions of fBm to higher dimensions have become increasingly important. In this paper we study isotropic $d$-dimensional fBm in the framework of inhomogeneous random fields, and we derive exact expressions for the dual-wavenumber spectrum of fractional Brownian fields (fBf). Based on the spectral correlation structure of fBf, we develop an algorithm for synthesizing fBf. The proposed algorithm is accurate and allows us to generate fractional Brownian motions of arbitrary dimension.
Optimal Estimation of a Random Signal from Partially Missed Data
Anatoli Torokhti (University of South Australia, Australia); Phil Howlett (University of South Australia, Australia); Charles Pearce (The University of Adelaide, Australia)
We provide a new technique for random signal estimation under the constraints that the data are corrupted by random noise and, moreover, some data may be missing. We utilize nonlinear filters defined by multi-linear operators of degree $r$, the choice of which allows a trade-off between the accuracy of the optimal filter and the complexity of the corresponding calculations. A rigorous error analysis is presented.
New fast recursive algorithms for simultaneous reconstruction and identification of AR processes with missing observations
Rawad Zgheib (Ecole Supérieure d'Electricité, France); Gilles Fleury (Ecole Supérieure d'Electricité, France); Elisabeth Lahalle (Ecole Supérieure d'Electricité, France)
This paper deals with the problem of adaptive reconstruction and identification of AR processes with randomly missing observations. The performances of a previously proposed real time algorithm are studied. Two new alternatives, based on other predictors, are proposed. They offer an unbiased estimation of the AR parameters. The first algorithm, based on the $h$-step predictor, is very simple but suffers from a large reconstruction error. The second one, based on the incomplete past predictor, offers an optimal reconstruction error in the least mean square sense.
A Real-Time Algorithm for Time Decoding Machines
Aurel Lazar (Columbia University, USA); Ernő K. Simonyi (Ministry of Defense Electronics, Hungary); Laszlo T. Toth (Budapest Univ. of Technology and Economics, Hungary)
Time-encoding is a real-time asynchronous mechanism of mapping the information contained in the amplitude of a bandlimited signal into a time sequence. Time decoding algorithms recover the signal from the time sequence. Under an appropriate Nyquist-type rate condition the signal can be perfectly recovered. The algorithm for perfect recovery calls, however, for the computation of a pseudo-inverse of an infinite dimensional matrix. We present a simple algorithm for local signal recovery and construct a stitching algorithm for real-time signal recovery. We also provide a recursive algorithm for computing the pseudo-inverse of a family of finite-dimensional matrices.

Room: Auditorium

## 2:10 PM - 3:10 PM

### Plenary: Distributed signal processing for sensor networks

Room: Auditorium
Chair: Martin Vetterli (EPFL, Switzerland)

## 3:10 PM - 4:50 PM

### Tue.2.3: Wavelet Image Processing - 5 papers

Chair: Ian Jermyn (INRIA, France)
Full Exploitation of Wavelet Coefficients in Radar Imaging for Improving the Detection of a Class of Sources in the Context of Real Data
Mohamed Tria (University of Paris XI, France); Jean-Philippe Ovarlez (ONERA, France); Luc Vignaud (ONERA, France); Juan-Carlos Castelli (ONERA, France); Messaoud Benidir (Laboratoire des Signaux et Systèmes, France)
As a time-frequency tool, the Continuous Wavelet Transform (CWT) was applied in radar imaging to reveal that the reflectors' response varies as a function of frequency and aspect angle (orientation of the wave vector). To do so, we constructed a hyperimage, expressed as the squared modulus of the wavelet coefficients, giving access to the energy distribution of each reflector in the frequency-angle plane. Exploiting this hyperimage, our recent research was devoted to classifying the reflectors as a function of their energy distributions, with the objective of discriminating a type of target in the radar image. Although acceptable results were obtained, the method is not reliable in some cases. The purpose of this paper is to show that exploiting not only the modulus but also the argument of the wavelet coefficients can improve the detection of a certain class of reflectors. Results are presented at the end of this article.
A geometrical wavelet shrinkage approach for image denoising
Bruno Huysmans (University of Ghent, Belgium)
In this paper a denoising technique for digital gray value images corrupted with additive Gaussian noise is presented. We studied a recently proposed hard thresholding technique which uses a two stage selection procedure in which coefficients are selected based on their magnitude, spatial connectedness and interscale dependencies. We construct a shrinkage version of the algorithm which outperforms the original one. We also present a new hard thresholding algorithm which incorporates the spatial connectivity information in a more simple and efficient way and construct a shrinkage version of it. The new algorithms are faster and lead to better denoising performances compared to the original one, both visually and quantitatively.
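For context, wavelet shrinkage in its simplest form transforms the image, soft-thresholds the detail coefficients, and inverts the transform. A minimal one-level Haar sketch (a generic baseline, not the two-stage spatially adaptive algorithm studied in the paper) is:

```python
import numpy as np

def soft(x, t):
    """Soft thresholding (shrinkage): shrink magnitudes toward zero by t."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def haar_shrink_denoise(img, t):
    """One-level 2-D Haar transform, soft-threshold the detail bands, invert.
    img must have even dimensions; t is the shrinkage threshold."""
    a = (img[0::2] + img[1::2]) / 2.0   # row averages
    d = (img[0::2] - img[1::2]) / 2.0   # row details
    ll = (a[:, 0::2] + a[:, 1::2]) / 2.0
    lh = (a[:, 0::2] - a[:, 1::2]) / 2.0
    hl = (d[:, 0::2] + d[:, 1::2]) / 2.0
    hh = (d[:, 0::2] - d[:, 1::2]) / 2.0
    # Shrink only the detail bands; keep the approximation intact.
    lh, hl, hh = soft(lh, t), soft(hl, t), soft(hh, t)
    # Inverse transform.
    a = np.empty_like(img[0::2], dtype=float)
    d = np.empty_like(img[0::2], dtype=float)
    a[:, 0::2], a[:, 1::2] = ll + lh, ll - lh
    d[:, 0::2], d[:, 1::2] = hl + hh, hl - hh
    out = np.empty_like(img, dtype=float)
    out[0::2], out[1::2] = a + d, a - d
    return out
```

With `t = 0` the transform is inverted exactly; a positive `t` suppresses small (noise-dominated) detail coefficients.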
Nonlinear models for the statistics of adaptive wavelet packet coefficients of texture
Johan Aubray (INRIA, France); Ian Jermyn (INRIA, France); Josiane Zerubia (INRIA, Sophia Antipolis, France)
Probabilistic adaptive wavelet packet models of texture provide new insight into texture structure and statistics by focusing the analysis on significant structure in frequency space. In very adapted subbands, they have revealed new bimodal statistics, corresponding to the structure inherent to a texture, and strong dependencies between such bimodal subbands, related to phase coherence in a texture. Existing models can capture the former but not the latter. As a first step towards modelling the joint statistics, and in order to simplify earlier approaches, we introduce a new parametric family of models capable of modelling both bimodal and unimodal subbands, and of being generalized to capture the joint statistics. We show how to compute MAP estimates for the adaptive basis and model parameters, and apply the models to Brodatz textures to illustrate their performance.
A wavelet domain filter for correlated speckle
Stian Solbø (Norut IT, Norway); Torbjørn Eltoft (University of Tromsø, Norway)
In this paper we assume that both the radar cross section and the speckle noise in SAR images are spatially correlated. We develop a wavelet domain linear minimum mean square error filter to remove the speckle noise. The autocorrelation functions in the wavelet domain are estimated from the single look complex SAR image. Preliminary studies show that the proposed filter introduces less bias than performing the filtering in the image domain, but does not achieve the same level of smoothing. The proposed filter does not rely on sliding windows to avoid boundary effects.
Construction of Triangular Biorthogonal Wavelet Filters for Isotropic Image Processing
Susumu Sakakibara (Tokyo Denki University, Japan); Oleg Vasilyev (University of Colorado, USA)
We propose new non-separable two-dimensional biorthogonal wavelets whose filters are defined on the regular triangular lattice in a plane. Their construction uses lifting in the polyphase representation of the filters. While filters of arbitrary orders may be obtained in the same way as in the 1D case, in this paper we focus our attention mainly on the construction method, using a few simple examples. Applying one of the example filters to simple images, we show that the isotropy of the images is better preserved in the wavelet transform than with the usual tensor product wavelet transform.

### Tue.5.3: Signal Detection - 5 papers

Chair: Anthony Quinn (Trinity College, Dublin, Ireland)
Low-Complexity Gain and Phase I/Q Mismatch Compensation using Orthogonal Pilot Sequences
Luca Giugno (WISER srl, Italy); Vincenzo Lottici (University of Pisa, Italy); Marco Luise (University of Pisa, Italy)
In up-to-date receiver architectures the gain and phase mismatch between the I and Q signal paths may significantly degrade the overall link performance. An alternative cost-effective solution to expensive analog components with small tolerances, which make the I/Q mismatch effect negligible, consists in estimating and compensating it through appropriate digital signal processing techniques. In this paper we derive a novel low-complexity data-aided scheme for jointly estimating the carrier phase offset, the I/Q phase mismatch, and the gain of the I and Q branches, following the maximum likelihood criterion and adopting an orthogonal sequence as training symbols. The performance analysis proves that the proposed estimator is low-complexity, asymptotically efficient and capable of compensating considerable I/Q mismatch values in the demanding scenario of both uncoded and coded multi-level QAM transmissions.
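As background, gain and phase I/Q mismatch is commonly written as the widely-linear model $r = a\,x + b\,x^*$. A hedged sketch (a generic least-squares pilot fit, not the paper's joint ML estimator; the pilot values and names are hypothetical) of estimation and compensation:

```python
import numpy as np

def estimate_iq_mismatch(pilots, received):
    """Least-squares fit of the widely-linear model r = a*p + b*conj(p)
    over known pilot symbols; returns the estimated (a, b)."""
    A = np.column_stack([pilots, np.conj(pilots)])
    coef, *_ = np.linalg.lstsq(A, received, rcond=None)
    return coef[0], coef[1]

def compensate(r, a, b):
    """Invert r = a*x + b*conj(x):
    x = (conj(a)*r - b*conj(r)) / (|a|^2 - |b|^2)."""
    return (np.conj(a) * r - b * np.conj(r)) / (abs(a) ** 2 - abs(b) ** 2)
```

In a noiseless fit the model parameters are recovered exactly; with noise, the least-squares estimate averages over the pilot block.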
A robust signal detection scheme
Jean-Jacques Fuchs (irisa/université de Rennes, France)
We consider the detection of a signal in the presence of both interference that lies in a known subspace and white noise. The signal to be detected is the product of an unknown amplitude and a known signature vector that is itself subject to additive white Gaussian noise. We develop the maximum likelihood estimates (MLE) of the problem in order to apply the generalized likelihood ratio test (GLRT) to our detection problem. The performance of the proposed detector is illustrated by means of numerical simulations, and it is compared to the standard matched subspace detector.
Soft-Output Detection of Differentially Encoded M-PSK Over Channels with Phase Noise
Alan Barbieri (University of Parma, Italy); Giulio Colavolpe (University of Parma, Italy)
We consider a differentially encoded M-PSK signal transmitted over a channel affected by phase noise. For this problem, we derive the exact maximum a posteriori (MAP) symbol detection algorithm. By analyzing its properties, we demonstrate that it can be implemented by a forward-backward estimator of the phase probability density function, followed by a symbol-by-symbol completion to produce the a posteriori probabilities of the information symbols. To practically implement the forward-backward phase estimator, we propose a couple of schemes with different complexity. The resulting algorithms exhibit an excellent performance and, in one case, only a slight complexity increase with respect to the algorithm which perfectly knows the channel phase. The application of the proposed algorithms to repeat and accumulate codes is assessed in the numerical results.
Detection of Unknown Signals Based on Spectral Correlation Measurements
The problem of detecting an unknown signal embedded in white Gaussian noise is addressed. A CFAR detector based on combining the information of the whole frequency-cycle frequency plane is proposed. An analytical characterization of the detector is provided, and its detection capability is evaluated. The FFT Accumulation Method (FAM) is used to measure the spectral correlation function (SCF). Approximate analytic expressions for the probability of false alarm are provided.
Blind Phase Noise Estimation and Data Detection Based on SMC Technique and Unscented Filtering
Erdal Panayirci (Bilkent University, Turkey); Hakan Cirpan (Istanbul University, Turkey); Marc Moeneclaey (Ghent University, Belgium); Nele Noels (Ghent University, Belgium)
In this paper, a computationally efficient algorithm is presented for jointly tracking phase noise with linear drift and performing blind data detection, based on a sequential Monte Carlo (SMC) method. Tracking of the phase noise is achieved by a Kalman filter, and the nonlinearity of the observation process is handled by an unscented filter rather than an extended Kalman technique. The SMC method, in turn, treats the transmitted symbols as "missing data" and draws samples of them sequentially based on the observed signal samples up to time $t$. In this way, the Bayesian estimates of the phase noise and the incoming data are obtained through these sequentially drawn samples, together with their importance weights. The proposed receiver structure is ideally suited for high-speed parallel implementation using VLSI technology.

### Tue.1.3: MIMO: Prototyping, VLSI and Testbeds II (Special session) - 5 papers

Room: Auditorium
Chair: Steffen Paul (Infineon AG, Germany)
Chair: Markus Rupp (TU Wien, Austria)
Design of WARP: A Wireless Open-Access Research Platform
Patrick Murphy (Rice University, USA); Ashutosh Sabharwal (Rice University, USA); Behnaam Aazhang (Rice University, USA)
This paper presents the design of WARP, a custom platform for research in advanced wireless algorithms and applications. The platform consists of both custom hardware and FPGA implementations of key communications blocks. The hardware consists of FPGA-based processing boards coupled to wideband radios and other I/O interfaces; the algorithm implementations already include a flexible OFDM physical layer. Both the hardware design and algorithm implementations will be freely available to academic researchers to enable the development of a widely disseminated, highly capable platform for wireless research.
High-throughput multi-rate LDPC decoder based on architecture-oriented parity check matrices
Predrag Radosavljevic (Rice University, USA); Alexandre de Baynast (Rice University, USA); Marjan Karkooti (Rice University, USA); Joseph Cavallaro (Rice University, USA)
A high-throughput pipelined LDPC decoder that supports multiple code rates and codeword sizes is proposed. In order to increase memory throughput, irregular block-structured parity-check matrices are designed with the constraint of equally distributed odd and even nonzero block-columns in each horizontal layer for the pre-determined set of code rates. The designed decoder achieves a data throughput of approximately 1 Gb/s without sacrificing the error-correcting performance of capacity-approaching irregular block codes. The prototype architecture is implemented on FPGA and synthesized for the ASIC design.
Fast Prototyping of Digital Signal Processing Systems by Means of a Model-based Codesign Environment
Leonardo Reyneri (Politecnico di Torino, Italy); Fabio Ancona (Sundance Italia s.r.l., Italy)
This paper presents a novel tool, based on Simulink, for model-based high-level HW/SW codesign of high-performance digital signal processing systems. The tool has been tailored to support HW/SW configurable platforms, in particular those from Sundance Microprocessor Technology Ltd. targeting DSP and FPGA components. A Software Defined Radio (SDR) demo based on Sundance's SMT8036 or SMT8096 hardware platforms will be shown.
MIMO Signal Processing on a reconfigurable architecture
Klaus Hueske (University of Dortmund, Germany); Juergen Goetze (University of Dortmund, Germany)
In this paper the implementation of multiple-input multiple-output (MIMO) signal processing on a reconfigurable hardware architecture is discussed. The implementation of MIMO systems is usually determined by the parameters of the application at hand, e.g. the number of input signals, number of output signals, number of users or word length. Furthermore, there is also a flexibility in terms of the algorithms, which are used for computing the required task. We will present the implementation of the linearly constrained MVDR beamformer on a reconfigurable hardware architecture. A virtual parallel implementation of the used algorithms is mapped to a reconfigurable hardware architecture, where the used processor elements can execute different modes (configuration modes). We will discuss the configuration in terms of change of parameters and change of algorithm, respectively. Furthermore, bit true simulations of the BER for different configurations are presented for various word lengths. Finally, the trade-off between performance and reconfiguration effort is discussed.
VLSI Implementation of Pipelined Sphere Decoding with Early Termination
Andreas Burg (ETHZ, Switzerland); Markus Wenk (IIS/ETH-Zurich, Switzerland); Wolfgang Fichtner (ETHZ, Switzerland)
The sphere decoding algorithm allows MIMO detection to be implemented with maximum likelihood error rate performance at a complexity far below that of an exhaustive search. This paper addresses two important problems associated with the practical implementation of sphere decoding: the mitigation of the bit error rate performance loss when the runtime of the decoder is constrained, and the introduction of pipelining into the recursive depth-first sphere decoding algorithm. The result of this work is a sphere decoder implementation for a 4x4 system with 16-QAM modulation in a 0.13 um technology that achieves a guaranteed minimum throughput of 761 Mbps.

### Poster: BSS and ICA - 16 papers

Room: Poster Area
Chair: Ana Maria Tomé (University of Aveiro, Portugal)
Separation of Correlated Signals Using Signal Canceler Constraints in a Hybrid CM Array Architecture
Vishwanath Venkataraman (University of California, Santa Barbara, USA); John Shynk (University of California, Santa Barbara, USA)
In this paper, we present a hybrid implementation of the multistage constant modulus (CM) array for separating correlated signals. Using a cascade architecture of the CM array with a series of adaptive signal cancelers, we derive a parallel set of constrained beamformers. The canceler weights provide estimates of the direction vectors of the captured signals across the cascade stages, which are used in the parallel implementation of a linearly constrained CM (LCCM) array. Since the direction vectors are obtained directly from the canceler weights, the hybrid implementation does not require prior knowledge of the array response matrix and is independent of the type of antennas used in the receiver. If the source signals are sufficiently separated in angle, then they can be captured separately across the parallel stages. When the sources are correlated, the cascade CM array does not completely cancel the captured signals, and previous versions of the parallel CM array do not always capture different sources across the stages. These problems are solved by the proposed hybrid LCCM architecture based on the signal canceler constraints. Computer simulations for example cochannel scenarios are provided to illustrate various properties of the system.
ICA-Based Semi-Blind Spatial Equalization for MIMO Systems
Zhiguo Ding (Queen's University Belfast, United Kingdom); Tharmalingam Ratnarajah (Queens University of Belfast, United Kingdom); Colin Cowan (The Queen's University of Belfast, United Kingdom); Yu Gong (Queen's University of Belfast, United Kingdom)
Blind equalization is a challenging problem for multiple-input multiple-output systems, to which independent component analysis (ICA) is applicable. However, direct application of ICA can yield low convergence speed and poor performance. In this paper we propose two semi-blind ICA-based algorithms, which incorporate information from both the training and unknown sequences. During each iterative/adaptive step, the training information is utilized to supervise the unconstrained blind ICA-based measure. Simulation results show that the two proposed semi-blind approaches can outperform both the training-based and conventional ICA methods. Furthermore, we report a special case of MIMO systems which does not require a source separation algorithm; a proof is also provided.
The Effectiveness of ICA-based Representation: Application to Speech Feature Extraction for Noise Robust Speaker Recognition
Xin Zou (University of Birmingham, United Kingdom); Peter Jancovic (University of Birmingham, United Kingdom); Ju Liu (Shandong University, P.R. China)
In this paper, we present a mathematical derivation demonstrating that a feature representation obtained by using the Independent Component Analysis (ICA) is an effective representation for non-Gaussian signals when being both clean and corrupted by Gaussian noise. Our findings are experimentally demonstrated by employing the ICA for speech feature extraction; specifically, the ICA is used to transform the logarithm filter-bank-energies (instead of the DCT which provides MFCC features). The evaluation is presented for a GMM-based speaker identification task on the TIMIT database for clean speech and speech corrupted by white noise. The effectiveness of ICA is analysed individually for signals corresponding to each phoneme. The experimental results show that the ICA-based features can provide significantly better performance than traditional MFCCs and PCA-based features in both clean and noisy speech.
Contributions to ICA of natural images
Ruben Martin-Clemente (University of Sevilla, Spain); Susana Hornillo-Mellado (University of Sevilla, Spain)
In this paper we analyze the results provided by the popular algorithm FastICA when it is applied to natural images, using the kurtosis as non-linearity. In this case we show that the so-called ICA filters can be expressed in terms of the eigenvectors associated with the smallest eigenvalues of the data correlation matrix, meaning that these filters are all high-pass. From this property the sparse distribution of the independent components emerges. On the other hand, the use of the kurtosis as contrast function causes the appearance of spikes in the independent components that make the ICA bases very similar to patches of the analyzed images. Some experiments are included to illustrate the results.
A Learning Algorithm with Distortion Free Constraint and Comparative Study for Feedforward and Feedback BSS
Akihide Horita (Kanazawa University, Japan); Kenji Nakayama (Kanazawa Univ., Japan); Akihiro Hirano (, ? ); Yasuhiro Dejima (Kanazawa University, Japan)
Source separation and signal distortion are theoretically analyzed for FF-BSS systems implemented in both the time and frequency domains and for the FB-BSS system. The FF-BSS systems have some degrees of freedom and cause some signal distortion. The FB-BSS system has a unique solution giving complete separation with no signal distortion. Next, the condition for complete separation without signal distortion is derived for the FF-BSS systems. This condition is applied to the learning algorithms. Computer simulations using speech signals are carried out for the conventional methods and for the new learning algorithms employing the proposed distortion-free constraint. The proposed method can drastically suppress signal distortion while maintaining high separation performance. The FB-BSS system also demonstrates good performance. The FF-BSS systems and the FB-BSS system are compared based on the transmission time difference in the mixing process. The locations of the signal sources and the sensors are rather limited in the FB-BSS system.
A DOA estimation method for 3D multiple source signals using independent component analysis
Hao Yuan (KDDI R&D Laboratories Inc., Japan); Makoto Yamada (KDDI R&D Laboratories Inc., Japan); Hisashi Kawai (KDDI R&D Laboratories Inc., Japan)
A new DOA (direction of arrival) estimation method is proposed for 3D (three-dimensional) multiple source signals using independent component analysis (ICA). The multiple source signals travel and mix in a reverberant environment and are observed at a sensor array. These observed signals are separated by independent component analysis, and the DOAs of the source signals are estimated. This method can deal with as many source signals as there are sensors, while conventional methods based on subspace analysis, such as the well-known MUSIC algorithm, can only be applied when the number of source signals is less than the number of sensors. A two-step estimation method is also proposed to improve the estimation accuracy and reduce the dispersion of the estimated DOAs. Experimental results reveal that the proposed method outperforms the MUSIC algorithm in terms of small dispersion.
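As background for the comparison in this abstract, a minimal sketch of the classic narrowband MUSIC pseudospectrum for a uniform linear array (the baseline, not the paper's ICA-based method; the array geometry, spacing, and function name are illustrative assumptions):

```python
import numpy as np

def music_spectrum(R, n_src, angles, d=0.5):
    """Classic narrowband MUSIC pseudospectrum for a uniform linear array.
    R: sensor covariance matrix, n_src: number of sources (must be fewer
    than the sensors), angles: grid of candidate DOAs in radians,
    d: element spacing in wavelengths."""
    m = R.shape[0]
    _, vecs = np.linalg.eigh(R)           # eigenvalues in ascending order
    En = vecs[:, :m - n_src]              # noise-subspace eigenvectors
    spec = []
    for th in angles:
        # steering vector of the ULA toward angle th
        a = np.exp(-2j * np.pi * d * np.arange(m) * np.sin(th))
        spec.append(1.0 / np.linalg.norm(En.conj().T @ a) ** 2)
    return np.array(spec)
```

Peaks of the returned spectrum mark the DOAs; the `n_src < m` requirement in the eigendecomposition is exactly the limitation the ICA-based method above removes.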
Multi-Speaker Voice Activity Detection using ICA and Beampattern Analysis
Suneeth Maraboina (IIT Guwahati, India); Dorothea Kolossa (TU Berlin, Germany); Prabin Kumar Bora (IIT Guwahati, India); Reinhold Orglmeister (TU Berlin, Germany)
Voice activity detection is a necessary preprocessing step for many applications like channel identification or speech recognition. The problem can be solved even under noisy conditions by exploiting characteristics of speech and noise signals. However, when more than one speaker is active simultaneously, these methods are generally unreliable, since multiple speech signals may overlap completely in the time-frequency plane. Here, a new approach is suggested which is applicable in multi-speaker scenarios as well, owing to its incorporation of higher-order statistics. Independent component analysis is used to obtain estimates of the clean speech and the angles of incidence for each speaker. Subsequently, these estimates help to correctly identify the active speaker and perform voice activity detection. The suggested approach is robust to noise as well as to interfering speech and can detect the presence of single speakers in mixtures of speech and noise, even under highly reverberant conditions at 0 dB SIR.
Iterative blind source separation by decorrelation: algorithm and performance analysis
Abdeldjalil Aïssa-El-Bey (ENST, France); Karim Abed-Meraim (Dept TSI, Télécom Paris, France); Yves Grenier (ENST-Paris, TSI department, France)
This paper presents an iterative blind source separation method using second order statistics (SOS) and natural gradient technique. The SOS of observed data is shown to be sufficient for separating mutually uncorrelated sources provided that the considered temporal coherence vectors of the sources are pairwise linearly independent. By applying the natural gradient, an iterative algorithm is derived that has a number of attractive properties including its simplicity and 'easy' generalization to adaptive or convolutive schemes. Asymptotic performance analysis of the proposed method is performed. Several numerical simulations are presented to demonstrate the effectiveness of the proposed method and to validate the theoretical expression of the asymptotic performance index.
Performance indexes of BSS for real-world applications
Ali Mansour (ENSIETA, Brest, France); Ayman Alfalou (ISEN Brest, France)
This paper deals with the independence measure problem. Over the last decade, many Independent Component Analysis (ICA) algorithms have been proposed to solve the blind source separation (BSS) of convolutive mixtures. However, few performance indexes can be found in the literature. The most widely used performance indexes are described hereafter, and three new performance indexes are proposed.
A Simple Decoupled Estimation of DoA and Angular Spread for Spatially Dispersed Sources
Anne Ferreol (Thales Communications, France); Eric Boyer (ENS de Cachan, France); Xuefeng Yin (Aalborg University, Denmark); Pascal Larzabal (ENS de Cachan, France); Bernard Fleury (Aalborg University, Denmark)
In wireless communications, local scattering in the vicinity of the mobile station results in angular spreading. A few estimators of direction of arrival (DoA) and angular spread have already been developed, but they suffer from high computational load or are developed in very specific contexts. In this paper, we present a new simple low-complexity decoupled estimator of DoA and angular spread for a spatially dispersed source. The proposed MDS (MUSIC for Dispersed Sources) algorithm assumes neither a particular sensor array geometry nor a temporal independence hypothesis. Moreover, the estimation of DoA and angular spread does not require knowledge of the angular and temporal distribution shapes of the sources. In addition, the low computational cost of this method makes it attractive.
Source Separation Using Multiple Directivity Patterns Produced by ICA-based BSS
Takashi Isa (Waseda University, Japan); Toshiyuki Sekiya (Waseda University, Japan); Tetsuji Ogawa (Waseda University, Japan); Tetsunori Kobayashi (Waseda University, Japan)
In this paper, we propose a multistage source separation method constructed by combining blind source separation (BSS) based on independent component analysis (ICA) and segregation using multiple directivity patterns (SMDP), introduced in our previous paper. We obtain the directivity patterns needed in SMDP by ICA-based BSS. In SMDP, simultaneous equations in the amplitudes of the sound sources are generated using these multiple directivities. The solution of these equations gives good disturbance estimates. We apply spectral subtraction using these disturbance estimates, and speech enhancement of the target source is performed. We conducted experiments in a real room under a source-number-given condition with no a priori information about the sound sources or the characteristics of the room acoustics. The experimental results of double-talk recognition show that the proposed technique is effective, reducing the error rate by 30% compared to frequency-domain BSS.
A Method for Solving the Permutation Problem of Frequency-Domain BSS Using Reference Signal
Takashi Isa (Waseda University, Japan); Toshiyuki Sekiya (Waseda University, Japan); Tetsuji Ogawa (Waseda University, Japan); Tetsunori Kobayashi (Waseda University, Japan)
This paper presents a method for solving the permutation problem, a problem specific to frequency-domain blind source separation within the framework of independent component analysis. For this problem, we propose a method which uses reference signals. For each frequency bin, the permutation alignment is fixed by calculating correlation coefficients between the reference signal and the separated signal. Reference signals are obtained as signals corresponding to each individual original source. The reference signals are chosen or obtained subjectively and do not need to be well separated; for example, the conventional beamforming technique gives suitable reference signals. To show the effectiveness of this method, we conducted an experiment on continuous speech recognition in a real room. The experimental results of double-talk recognition with a 20K vocabulary show that the proposed method achieves a 20% error reduction rate compared with the established DOA-based approach.
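The per-bin correlation matching described in this abstract can be sketched as follows (a simplified illustration, not the authors' implementation; the envelope representation and function name are assumptions):

```python
import numpy as np
from itertools import permutations

def align_permutations(separated, reference):
    """For each frequency bin, reorder the separated components so that
    each best correlates with its reference signal.

    separated: array (bins, sources, frames) of separated amplitude envelopes
    reference: array (bins, sources, frames) of reference envelopes
    """
    bins, n_src, _ = separated.shape
    aligned = np.empty_like(separated)
    for f in range(bins):
        best_perm, best_score = None, -np.inf
        for perm in permutations(range(n_src)):
            # sum of correlation coefficients between matched pairs
            score = sum(np.corrcoef(separated[f, p], reference[f, i])[0, 1]
                        for i, p in enumerate(perm))
            if score > best_score:
                best_score, best_perm = score, perm
        aligned[f] = separated[f, list(best_perm)]
    return aligned
```

The exhaustive search over permutations is affordable for the small source counts typical of speech separation; the references need only correlate with the right source, not be well separated, which is why rough beamformer outputs suffice.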
Flexible ICA Approach to Nonlinear Blind Signal Separation in the Complex Domain
Daniele Vigliano (University of Rome "La Sapienza", Italy); Michele Scarpiniti (University of Rome "La Sapienza", Italy); Raffaele Parisi (University of Rome "La Sapienza", Italy); Aurelio Uncini (University of Rome "La Sapienza", Italy)
This paper introduces an Independent Component Analysis (ICA) approach to the separation of nonlinear mixtures in the complex domain. Source separation is performed by a complex INFOMAX approach. Nonlinear complex functions involved in the processing are realized by pairs of spline neurons called splitting functions, working on the real and the imaginary part of the signal respectively. A simple adaptation algorithm is derived, and some experimental results that demonstrate the effectiveness of the proposed method are shown.
Optimal Separation of Polarized Signals by Quaternionic Neural Networks
Sven Buchholz (University of Kiel, Germany); Nicolas Le Bihan (LIS, France)
A statistical description of polarized signals is proposed in terms of proper quaternionic random processes. With this description, it is possible to separate polarized signals by means of a quaternionic neural network. Simulation results show the ability of the quaternionic approach (statistical model and processing) to achieve better separation of polarized signals than real-valued neural networks.
Two-Stage Blind Separation of Moving Sound Sources with Pocket-Size Real-Time DSP Module
Yoshimitsu Mori (Graduate School of Information Science, Nara Institute of Science and Technology, Japan); Tomoya Takatani (Graduate School of Information Science, Nara Institute of Science and Technology, Japan); Hiroshi Saruwatari (Graduate School of Information Science, Nara Institute of Science and Technology, Japan); Kiyohiro Shikano (Graduate School of Information Science, Nara Institute of Science and Technology, Japan); Takashi Hiekata (Kobe Steel, Ltd, Japan); Takashi Morita (Kobe Steel, Ltd, Japan)
A new real-time two-stage blind source separation (BSS) method for convolutive mixtures of speech is proposed, in which a single-input multiple-output (SIMO)-model-based independent component analysis (ICA) and a new SIMO-model-based binary masking are combined. SIMO-model-based ICA can separate the mixed signals, not into monaural source signals, but into SIMO-model-based signals from independent sources in their original form at the microphones. Thus, the separated signals of SIMO-model-based ICA can maintain the spatial qualities of each sound source. Owing to this attractive property, the novel SIMO-model-based binary masking can be applied to efficiently remove the residual interference components after SIMO-model-based ICA. In addition, the performance deterioration due to the latency problem in ICA can be mitigated by introducing real-time binary masking. We developed a pocket-size real-time DSP module implementing the new BSS method, and report an experimental evaluation showing the proposed method's superiority to conventional BSS methods for moving-sound separation.
Global Signal Elimination from ELF Band Electromagnetic Signals by Independent Component Analysis
Motoaki Mouri (Nagoya Institute of Technology, Japan); Arao Funase (Nagoya Institute of Technology, Japan); Andrzej Cichocki (RIKEN BSI, Laboratory for Advanced Brain Signal Processing, Japan); Ichi Takumi (Nagoya Institute of Technology, Japan); Hiroshi Yasukawa (Aichi Prefectural University, Japan); Masayasu Hata (Chubu University, Japan)
Anomalous radiation of environmental electromagnetic (EM) waves has been reported as a portent of earthquakes. We have been measuring the Extremely Low Frequency (ELF) range all over Japan in order to predict earthquakes. The observed signals contain global noise whose power is stronger than that of the local signals; this global noise therefore distorts the results of earthquake prediction. To overcome this distortion, it is necessary to eliminate global noise from the observed signals. In this paper, we propose a method of global noise elimination by Independent Component Analysis (ICA) and evaluate its effectiveness.

### Poster: Multicarrier and OFDM Systems - 10 papers

Room: Poster Area
Chair: Geert Leus (Delft University of Technology, The Netherlands)
Chien-Chang Li (National Chiao Tung University, Hsinchu, Taiwan); Yuan-Pei Lin (National Chiao Tung University, Taiwan)
The DMT (discrete multitone) based VDSL (very high speed digital subscriber line) system is susceptible to interference from radio frequency transmissions. It is known that windowing at the receiver can reduce radio frequency interference (RFI). In this paper, we formulate the interference of individual tones and minimize the total interference. The optimal window can be obtained in closed form. The proposed windows have faster roll-off at low frequencies; as a result, fewer tones will be dominated by RFI. Simulations are given to demonstrate the usefulness of the proposed design. We also see that not knowing the statistics of the interference source leads to only minor degradation. Therefore, we can obtain a very good suppression effect with interference-independent windows, which have the advantage that the window need not be redesigned when the interference changes.
A Simple Adaptive Filter Method for Cancellation of Coupling Wave in OFDM Signals at SFN Relay Station
Hideaki Sakai (Kyoto University, Japan); Tetsuya Oka (Kyoto University, Japan); Kazunori Hayashi (Kyoto University, Japan)
In recent years, digital terrestrial broadcasting systems have been developed in which OFDM signals are used for data transmission in a single frequency network (SFN). In an SFN relay station, however, the effect of the coupling wave from the transmitter to the receiving antenna is significant and needs to be cancelled. In this paper, a simple adaptive filter method is applied to this problem. The stationary point of the conventional LMS algorithm is first derived and its local stability is examined using the averaging method. It is found that this algorithm has a bias; a modified algorithm is then proposed to remove this bias. Simulation results show the validity of the theoretical findings.
Closed-form approximation for the outage capacity of OFDM-STBC and OFDM-SFBC systems
Jesus Perez (University of Cantabria, Spain); Ignacio Santamaria (University of Cantabria, Spain); Jesus Ibañez (University of Cantabria, Spain); Luis Vielva (University of Cantabria, Spain)
The combination of orthogonal frequency-division multiplexing (OFDM) and space-time or space-frequency block coding (STBC or SFBC) has been shown to be a simple and efficient means to exploit the spatial diversity in frequency-selective fading channels. From a general broadband multiple-input-multiple-output (MIMO) channel model, we derive a tight analytical approximation for the outage capacity of such systems assuming that the channel is known at the receiver and unknown at the transmitter. This expression is a simple function of the channel and system parameters. Numerical results are provided to demonstrate the excellent accuracy of the derived approximation in all cases.
Slepian Pulses For Multicarrier OQAM
Ivana Raos (UPM, Spain); Santiago Zazo (Universidad Politecnica Madrid, Spain); Igor Arambasic (Politechnical University of Madrid, UPM, Spain)
OFDM/OQAM is a spectrally efficient multicarrier (MC) system, robust to mobile channel (time and frequency) variability thanks to pulse shaping. In this contribution, pulse shaping with a discrete non-orthogonal pulse based on the Discrete Prolate Spheroidal Sequence (DPSS) is proposed. The DPSS, being the most concentrated pulse of all sequences of the same length, is the natural choice for pulse shaping. The novelty of the analysis lies in the discovery of the quasi-orthogonality of the DPSS to its time and frequency shifts. DPSS pulse analysis shows that this property holds for a short pulse of a single MC symbol length. MC/OQAM system performance with the DPSS and other pulse types is analyzed by simulation, confirming the quasi-orthogonality of the DPSS. The advantage of using the most concentrated pulse is shown in terms of the interference that two MC/OQAM systems in adjacent frequency bands cause to one another. Simulations also show that the DPSS permits smaller frequency separation between systems allocated adjacent frequency bands.
EM-based Enhancement of the Wiener Pilot-aided Channel Estimation in MIMO-OFDM Systems
Pilot Aided Channel Estimation (PACE) in OFDM systems uses training sequences to estimate the channel in a subset of the frequency bins, followed by interpolation to recover the remaining non-pilot subchannels. In particular, the Wiener interpolation method relies on knowledge of the channel statistics to find the optimal channel estimate, in the linear MMSE sense, for a frequency bin without pilot symbols. Nevertheless, Wiener interpolation does not utilize other information available at the receiver, such as the received signals and knowledge of the transmitted symbol alphabet. This paper presents a novel Expectation Maximization (EM) algorithm which optimally utilizes all available information at the receiver and thus enhances the Wiener interpolation. The proposed method is tested successfully on MIMO-OFDM systems under realistic channel conditions.
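The Wiener interpolation step that the EM algorithm enhances can be sketched as follows (a simplified, noise-free LMMSE version; the correlation model and function name are assumptions, not the paper's):

```python
import numpy as np

def wiener_interpolate(h_pilot, pilot_idx, n_bins, r):
    """LMMSE (Wiener) interpolation of pilot channel estimates to all bins,
    assuming a known frequency-correlation function r(delta) and
    negligible estimation noise.

    h_pilot: channel estimates at the pilot bins; pilot_idx: their indices;
    r: callable giving the channel correlation as a function of bin distance.
    """
    pilot_idx = np.asarray(pilot_idx)
    # correlation among the pilots and between every bin and the pilots
    R_pp = r(np.abs(pilot_idx[:, None] - pilot_idx[None, :]))
    R_dp = r(np.abs(np.arange(n_bins)[:, None] - pilot_idx[None, :]))
    # h_hat = R_dp R_pp^{-1} h_pilot, the linear MMSE estimate
    return R_dp @ np.linalg.solve(R_pp, h_pilot)
```

Note that the estimate uses only the pilot observations and second-order channel statistics; the received data symbols are ignored, which is precisely the information the EM algorithm additionally exploits.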
NP-Hardness of Bit Allocation in Multiuser Multicarrier Communications
Manish Vemulapalli (The University of Iowa, USA); Soura Dasgupta (The University of Iowa, USA)
In this paper, we consider the problem of optimal bit allocation for multiuser multicarrier communications. Some existing papers comment, without proof, on the intractability of the problem and provide algorithms yielding suboptimal bit allocations to reduce the computational complexity. A formal proof that this problem is NP-hard is presented in this article.
A phase noise compensation scheme for OFDM wireless systems
Qiyue Zou (University of California, Los Angeles, USA); Alireza Tarighat (University of California, Los Angeles (UCLA), USA); Nima Khajehnouri (University of California, Los Angeles, USA); Ali Sayed (University of California, Los Angeles, USA)
Phase noise causes significant degradation in the performance of Orthogonal Frequency Division Multiplexing (OFDM) based wireless communication systems. In the proposed compensation scheme, the communication between the transmitter and receiver blocks consists of two stages. In the first stage, block-type pilot symbols are used and the channel coefficients are jointly estimated with the phase noise in the time domain. In the second stage, comb-type OFDM symbols are transmitted such that the receiver can jointly estimate the data symbols and the phase noise. It is shown by computer simulations that the proposed scheme can effectively mitigate the inter-carrier interference caused by phase noise and improve the channel estimation and bit error rate of OFDM systems.
Novel Efficient Weighting Factors for PTS-based PAPR Reduction in Low-Power OFDM Transmitters
Thodoris Giannopoulos (University of Patras, Greece); Vassilis Paliouras (University of Patras, Greece)
Increased Peak-to-Average Power Ratio (PAPR) in an Orthogonal Frequency Division Multiplexing (OFDM) signal poses serious challenges in wireless telecom system design, since increased PAPR stresses the performance of the analog and RF parts of a wireless modem. The Partial Transmit Sequence (PTS) algorithm is a flexible and distortionless peak power reduction scheme of low computational complexity. In this paper we derive a new set of weighting factors which improves the performance of the PTS algorithm. Furthermore, this paper examines the complexity of the VLSI implementation of PTS versus the power savings in the analog part due to the PAPR reduction. It is shown that with the new set of weighting factors the power consumption of the power amplifier (PA) is reduced by 21.1% in comparison to OFDM without PAPR reduction.
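The PTS mechanism that the paper improves can be sketched as follows, using the standard {+1, -1} candidate weights rather than the paper's proposed set (function names and the contiguous block partitioning are illustrative assumptions):

```python
import numpy as np
from itertools import product

def papr_db(x):
    """Peak-to-average power ratio of a time-domain block, in dB."""
    p = np.abs(x) ** 2
    return 10 * np.log10(p.max() / p.mean())

def pts_search(X, n_blocks=4, weights=(1, -1)):
    """Exhaustive PTS: partition the subcarriers into sub-blocks, weight
    each sub-block's time-domain signal, and keep the combination with
    the lowest PAPR. X: frequency-domain OFDM symbol (length divisible
    by n_blocks). Returns the best time-domain signal and its PAPR in dB."""
    N = len(X)
    # disjoint contiguous sub-blocks, each IFFT'd separately
    parts = [np.fft.ifft(np.where(np.arange(N) // (N // n_blocks) == b, X, 0))
             for b in range(n_blocks)]
    best_x, best = None, np.inf
    for w in product(weights, repeat=n_blocks):
        x = sum(wb * pb for wb, pb in zip(w, parts))
        if (val := papr_db(x)) < best:
            best, best_x = val, x
    return best_x, best
```

Because the all-ones weight vector is in the search set, the selected PAPR can never exceed that of the unmodified symbol; the receiver only needs the chosen weights as side information.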
Coupling Channel Identification for SFN OFDM Relay Station
Akira Sano (Keio University, Yokohama, Japan); Lianming Sun (The University of Kitakyushu, Japan)
This paper is concerned with identification of the coupling channel and multipath channel for SFN relay stations in OFDM transmission systems. To design a coupling canceller that operates in a stable manner, identification of the channels is needed; since transmitted OFDM signals are band-limited, identifying the overall relay transfer function over the full band is an important but difficult issue. The purpose of this paper is to propose a new scheme for identifying the relay transfer function from the transmitted signals of a key station to the re-transmitted signal of a relay station, by efficiently exploiting the cyclic prefix (CP) property of the OFDM signals. Moreover, an upper bound on the transfer function estimation error is also evaluated.
Low Complexity Zero-Padding Zero-Jamming DMT Systems
Yuan-Hwui Chung (National Taiwan University, Taiwan); See-May Phoong (National Taiwan University, Taiwan)
Discrete multitone (DMT) systems have been widely adopted in broadband communications. When the transmission channel is frequency selective, there will be interblock interference (IBI). IBI can be avoided by *zero-padding* (ZP) [1]. Another solution is to allow IBI during transmission and, at the receiver, remove the samples that contain IBI by *zero-jamming* (ZJ) [2]. The commonly used ZP DMT system employs the ZP technique, whereas the widely adopted CP DMT system, where a cyclic prefix is added at the transmitter, uses the ZJ technique. In both the ZP DMT and CP DMT systems, the number of redundant samples is larger than or equal to the channel order. In this paper, we propose a ZP-ZJ DMT system. By combining the ZP and ZJ techniques, we are able to reduce the number of redundant samples needed for IBI elimination by as much as one half. The transmitter of the ZP-ZJ DMT system involves only one IFFT operation, and its receiver can be implemented efficiently using a small number of FFT/IFFT operations. Simulations show that the bandwidth-efficient ZP-ZJ DMT system can sometimes outperform the CP DMT system.
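The role of redundancy in avoiding IBI can be illustrated with a one-block CP-DMT round trip, where discarding the prefix (the ZJ step) turns linear channel convolution into circular convolution (a textbook sketch, not the paper's ZP-ZJ scheme):

```python
import numpy as np

def cp_ofdm_roundtrip(X, h, cp_len):
    """One CP-DMT block over an FIR channel h: the cyclic prefix makes the
    channel act as a circular convolution, so every tone k sees a flat
    gain H[k]. This holds only when cp_len >= len(h) - 1, i.e. the
    redundancy is at least the channel order."""
    x = np.fft.ifft(X)
    tx = np.concatenate([x[-cp_len:], x])            # prepend cyclic prefix
    rx = np.convolve(tx, h)[cp_len:cp_len + len(x)]  # drop the prefix (ZJ step)
    return np.fft.fft(rx)
```

With sufficient redundancy the demodulated tones satisfy `Y[k] = H[k] * X[k]` exactly, which is what makes per-tone equalization trivial; the ZP-ZJ proposal above is about achieving IBI elimination with fewer such redundant samples.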

### Poster: EEG Signal Analysis - 5 papers

Room: Poster Area
Chair: Saeid Sanei (Cardiff University, United Kingdom)
Parallel Space-Time-Frequency Decomposition of EEG Signals For Brain Computer Interfacing
Kianoush Nazarpour (Cardiff University, United Kingdom); Saeid Sanei (Cardiff University, United Kingdom); Leor Shoker (Cardiff University, United Kingdom); Jonathon Chambers (Cardiff University Wales, United Kingdom)
This paper proposes a hybrid parallel factor analysis-support vector machine (PARAFAC-SVM) method for the classification of left and right index imagery movements, where the spatial-temporal-spectral characteristics of the single-trial electroencephalogram (EEG) signal are considered. The proposed scheme develops a parallel EEG Space-Time-Frequency (STF) decomposition in the mu band (8-13 Hz) at the preprocessing stage of BCI systems. Results of using PARAFAC show that two distinct factors in the mu band can be extracted for each EEG trial, and the factor localized in the contralateral hemisphere effectively indicates the subject's intention of left or right index motion imagination. We can reliably distinguish between left and right index movements using the developed hybrid PARAFAC-SVM method.
Integer Sub-optimal Karhunen-Loeve Transform For Multi-channel Lossless EEG Compression
Yodchanan Wongsawat (University of Texas at Arlington, USA); Soontorn Oraintara (University of Texas at Arlington, USA)
In this paper, a method for approximating the Karhunen-Loeve Transform (KLT) for the purpose of multi-channel lossless electroencephalogram (EEG) compression is proposed. The approximation yields a near-optimal transform for the case of a Markov process, but significantly reduces the computational complexity. The sub-optimal KLT is further parameterized by a ladder factorization, rendering a structure, called IntSKLT, that is reversible under quantization of the coefficients. The IntSKLT is then used to compress multi-channel EEG signals. Lossless coding results show that the degradation in compression ratio using the IntSKLT is about 3% while the computational complexity is reduced by more than 60%.
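The exact floating-point KLT that IntSKLT approximates can be sketched as an eigendecomposition of the inter-channel covariance (illustrative only; the paper's integer ladder factorization and its reversibility under quantization are not reproduced):

```python
import numpy as np

def klt(channels):
    """Exact Karhunen-Loeve transform across channels: decorrelate a
    multi-channel signal by projecting it onto the eigenvectors of its
    covariance matrix.

    channels: array (n_channels, n_samples)
    Returns (transformed, basis), where transformed is the mean-removed
    signal projected onto the basis columns.
    """
    cov = np.cov(channels)
    # eigenvectors sorted by decreasing eigenvalue pack the energy into
    # the first components, which is what helps compression
    eigval, eigvec = np.linalg.eigh(cov)
    basis = eigvec[:, np.argsort(eigval)[::-1]]
    centered = channels - channels.mean(axis=1, keepdims=True)
    return basis.T @ centered, basis
```

After the transform the inter-channel covariance is diagonal, so each output channel can be coded independently without losing the redundancy that existed between the original EEG channels.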
A Novel Space-Time-Frequency Masking Approach for Quantification of EEG Source Propagation With an Application to Brain Computer Interfacing
Leor Shoker (Cardiff University, United Kingdom); Saeid Sanei (Cardiff University, United Kingdom); Kianoush Nazarpour (Cardiff University, United Kingdom); Alex Sumich (Institute of Psychiatry, United Kingdom)
A robust space-time-frequency signal extraction algorithm has been developed with an application to brain computer interfaces (BCI). The algorithm is based on extending time-frequency masking methods to accommodate the spatial domain. The space-time-frequency masks are then clustered in order to extract the desired source. Then the motion of the extracted source is tracked over the scalp. Finally, the trials are classified based on their directionality and locations over the scalp. The proposed method outperforms traditional systems by exploiting the motion of the sources.
Coherence Estimation between EEG signals using Multiple Window Time-Frequency Analysis compared to Gaussian Kernels
Johan Sandberg (Lund University, Sweden); Maria Hansson (Lund University, Sweden)
It is believed that neural activity evoked by cognitive tasks is spatially correlated in certain frequency bands. The electroencephalogram (EEG) is highly affected by noise of large amplitude, which calls for sophisticated time-local coherence estimation methods. In this paper we investigate different approaches to estimating time-local coherence between two real-valued signals. Our results indicate that the method using two-dimensional Gaussian kernels has a slightly better average SNR than the multiple window approach. On the other hand, the multiple window approach has a narrower SNR distribution and seems to perform better in the worst case.
Ranking Features of Wavelet-Decomposed EEG Based On Significance in Epileptic Seizure Prediction
Pedram Ataee (University of Tehran, Iran); Alireza Nasiri Avanaki (University of Tehran, Iran); Hadi Fatemi Shariatpanahi (University of Tehran, Iran); Seyed Mohammadreza Khoee (University of Tehran, Iran)
A method for ranking features of wavelet-decomposed EEG in order of their importance in the prediction of epileptic seizures is introduced. Using this method, the four most important features (extracted from each level of wavelet decomposition) are selected from ten features. The proposed set of features is then used to recognize the pre-seizure signal, thus predicting a seizure. Our feature set outperforms previously used sets by achieving a higher class separability index and correct classification rate.

### Poster: Cryptography, Watermarking, and Steganography - 10 papers

Room: Poster Area
Chair: Mauro Barni (University of Siena, Italy)
Dual Watermarking Algorithm Exploiting The Logarithmic Characteristic Of The Human Eye's Sensitivity To Luminance
Zhe Wang (University of Strathclyde, United Kingdom); Stephen Marshall (University of Strathclyde, United Kingdom)
The proposed colour image watermarking algorithm exploits the human visual system to optimise the trade-off between the visibility and the robustness of the watermark. A less sensitive colour channel is selected to be watermarked, and central watermarking is used to give the algorithm extra robustness to image cropping. The DWT and DCT transforms are combined to pack most of the energy into a few coefficients. The logarithmic sensitivity of the human eye to luminance is exploited by assigning pixels to groups with different luminance ranges and watermarking them with different insertion powers. Test results show that the proposed algorithm performs very well and is robust to several common image manipulations.
Asymmetric cryptography as subset of digital hologram watermarking
Michele De Santis (University of Roma Tre, Italy); Giuseppe Schirripa Spagnolo (University of Roma Tre, Italy)
In this paper we propose an asymmetric cryptographic scheme as a subset of digital hologram watermarking, which is able to detect malicious tampering while tolerating some incidental distortions. It is a fragile watermark; in fact, the mark is readily altered or destroyed when the host image is modified through a linear or nonlinear transformation. The proposed technique can be applied to colour images as well as to grey-scale ones. Using digital hologram watermarking, the embedded mark can be easily recovered by means of a Fourier transform; because of this, the host image could be tampered with and re-watermarked with the same holographic pattern. To prevent this, we introduce an encryption method using asymmetric cryptography. The proposed scheme relies on the Authentication Entity's knowledge of the original mark, which is compared with the extracted one by image correlation.
A Multidimensional Map For A Chaotic Cryptosystem
Rhouma Rhouma (École Nationale d'Ingénieurs de Tunis, Tunisia); Safya Belghith (École Nationale d'Ingénieurs de Tunis, Tunisia)
In this paper, we propose a new cryptosystem that is faster than Baptista's and whose generated ciphertext has a uniform distribution. To increase security, we use the logistic map and a 3-dimensional piecewise linear chaotic map in the generation of the association tables.
Benchmarking Image Watermarking Algorithms with OpenWatermark
Benjamin Michiels (Université catholique de Louvain, Belgium); Benoit Macq (Universite catholique de Louvain, Belgium)
In this paper, we present the OpenWatermark framework, an open-source web-based system dedicated to the benchmarking of watermarking algorithms. We show how to use this system with practical examples and demonstrate how it may significantly improve collaboration between members of the watermarking community, as it does not impose the use of a specific operating system or programming language on the user.
Substitutive Watermarking Algorithms based on Interpolation
Vincent Martin (IRIT/ENSEEIHT, Toulouse, France); Marie Chabert (IRIT/ENSEEIHT, France); Bernard Lacaze (IRIT/INSA, France)
Imperceptibility is a concern in all watermarking techniques. Consequently, most algorithms use a psychovisual mask. Interpolation techniques offer interesting perceptual properties and have been abundantly studied in image processing. This article aims at defining a class of watermarking algorithms that take advantage of this property. This class generalizes previous work on bilinear interpolation. A theoretical performance study is proposed. Moreover, optimal decoding as well as objective imperceptibility and security measures are provided for the whole class. An application to spline interpolation is studied.
Halftone Visual Cryptography via Direct Binary Search
Gonzalo Arce (University of Delaware, USA); Zhongmin Wang (University of Delaware, USA); Giovanni Di Crescenzo (Telcordia Technologies, USA)
This paper considers the problem of encoding a secret binary image into n shares of meaningful halftone images. The secret image can be visually decoded by stacking together the transparencies associated with shares from a qualified subset. Secret pixels encoded into the shares introduce noise into the halftone images. We extend our previous work on halftone visual cryptography and propose a new method that encodes the secret pixels into the shares via direct binary search (DBS). The perceptual errors between the halftone shares and the continuous-tone images are minimized with respect to an alpha-stable human visual system (HVS) model. Simulation results show that our proposed method can significantly improve the halftone image quality of the encoded shares compared with previous algorithms.
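The basic (2,2) visual cryptography construction that halftone visual cryptography extends can be sketched as follows (the classic Naor-Shamir scheme, not the paper's DBS method; the function name is an assumption):

```python
import numpy as np

def vc_shares(secret, rng):
    """Classic (2,2) visual cryptography: each secret pixel becomes a 2x2
    block in each share. White pixels get identical blocks in both shares,
    black pixels get complementary blocks, so stacking the transparencies
    (logical OR of black sub-pixels) reveals the secret.

    secret: binary array (1 = black). Returns two shares, each 2x the size.
    """
    patterns = [np.array([[1, 0], [0, 1]]), np.array([[0, 1], [1, 0]])]
    h, w = secret.shape
    s1 = np.zeros((2 * h, 2 * w), dtype=int)
    s2 = np.zeros_like(s1)
    for i in range(h):
        for j in range(w):
            p = patterns[rng.integers(2)]   # random pattern hides the secret
            s1[2*i:2*i+2, 2*j:2*j+2] = p
            # black pixel: complementary pattern; white pixel: same pattern
            s2[2*i:2*i+2, 2*j:2*j+2] = 1 - p if secret[i, j] else p
    return s1, s2
```

Each share alone is uniformly random; only the stack shows the secret as fully black blocks against half-black ones. Halftone VC additionally shapes the shares so they look like meaningful halftone images.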
A Fibonacci LSB Data Hiding Technique
Diego De Luca Picione (University of Roma TRE, Italy); Federica Battisti (University of Roma TRE, Italy); Karen Egiazarian (Tampere University of Technology, Finland); Marco Carli (University of Roma TRE, Italy); Jaakko Astola (Tampere University of Technology, Finland)
In this paper, a novel data-hiding technique based on the Fibonacci representation of digital images is presented. A generalization of the classical Least Significant Bit (LSB) embedding method is performed. The Fibonacci representation of grey-level images requires 12 bit planes instead of the usual 8 planes of the binary representation. Experimental results show that such a redundant scheme outperforms the classical LSB method, resulting in marked images with less perceptual distortion even when planes other than the LSB are selected for embedding. Furthermore, the robustness of the LSB scheme against some types of attacks, which is generally very weak, is increased. The computational cost of the embedding scheme is comparable to that of the classical LSB method.
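The Fibonacci (Zeckendorf) representation underlying the scheme, whose 12 planes cover grey levels 0-255, can be sketched as follows (a minimal illustration; the paper's full embedding rules, which keep the representation valid after modification, are not reproduced):

```python
# The first 12 Fibonacci numbers (1, 2, 3, 5, ...) suffice for values 0-255,
# hence the 12 "bit planes" of the Fibonacci representation.
FIB = [1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233]

def to_zeckendorf(v):
    """Greedy Zeckendorf encoding: 12 binary digits, no two adjacent ones."""
    digits = [0] * len(FIB)
    for i in range(len(FIB) - 1, -1, -1):
        if FIB[i] <= v:
            digits[i], v = 1, v - FIB[i]
    return digits

def from_zeckendorf(digits):
    return sum(f for f, d in zip(FIB, digits) if d)

def embed_bit(pixel, bit, plane=0):
    """Overwrite one Fibonacci digit of a pixel with a payload bit
    (illustrative only: the result may violate the no-adjacent-ones
    constraint, which the paper's scheme handles explicitly)."""
    d = to_zeckendorf(pixel)
    d[plane] = bit
    return from_zeckendorf(d)
```

Since the lowest Fibonacci digit has weight 1, embedding in plane 0 changes a pixel by at most 1, mirroring the imperceptibility of classical LSB embedding while offering 12 planes to choose from.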
Video Watermarking in 3D DCT Domain
Marco Carli (University of Roma TRE, Italy); Riccardo Mazzeo (University of Roma TRE, Italy); Alessandro Neri (University of ROMA TRE, Italy)
In this work a novel scheme for embedding data in a video sequence is presented. The method is based on a modified Quantization Index Modulation (QIM) embedding algorithm in the 3D-DCT domain. Different zones of the 3D-DCT cube have been analyzed in order to obtain better performance in terms of robustness and imperceptibility. Analysis of the results shows the effectiveness of the proposed scheme.
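The QIM building block the abstract refers to can be sketched as two interleaved scalar quantizers — a generic illustration of plain QIM on a single coefficient, not the paper's modified 3D-DCT variant:

```python
import numpy as np

def qim_embed(x, bit, step=2.0):
    """Quantize coefficient x onto the lattice selected by `bit`:
    multiples of `step` for bit 0, shifted by step/2 for bit 1."""
    offset = 0.0 if bit == 0 else step / 2.0
    return np.round((x - offset) / step) * step + offset

def qim_decode(y, step=2.0):
    """Decode by the nearest-lattice rule; tolerates noise below step/4."""
    d0 = abs(y - qim_embed(y, 0, step))
    d1 = abs(y - qim_embed(y, 1, step))
    return 0 if d0 <= d1 else 1
```

The step size trades robustness (larger decoding margin) against imperceptibility (larger coefficient distortion), which is exactly the trade-off the choice of 3D-DCT zones is tuning.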
Data hiding in fingerprint images
Lahouari Ghouti (Queen's University Belfast, United Kingdom); Ahmed Bouridane (Queen's University, United Kingdom)
Following the emergence and success of biometric identification systems as a powerful and irrefutable tool for security-aware architectures, it became evident that ensuring the authenticity of the biometric data itself is a challenging research problem. Furthermore, the "open" nature of this type of data makes it extremely vulnerable to the typical processing attacks present in biometric acquisition hardware and software. In this paper, we propose a robust watermarking system to authenticate biometric data. To ensure watermark robustness, we embed the watermark payload using the most important features of the biometric (fingerprint and shoeprint) images. We propose image edges, determined by the wavelet maxima points, to quantify the robust feature points pertaining to the biometric images. The feature points, wavelet maxima, are modified so as to ensure watermark imperceptibility while achieving high robustness. Also, geometric invariance is achieved because the wavelet maxima representations are capable of adjusting the sampling grid when the host images are translated. Finally, performance results are reported to illustrate the robustness of the proposed watermarking system.
A DCT-Based Data-Hiding Method to Embed the Color Information in a JPEG Grey Level Image
Marc Chaumont (University of Montpellier, LIRMM, France); William Puech (University of Montpellier, France)
In this paper, we propose an original method to embed the color information of an image in a corresponding compressed grey-level image. The objective of this work is to allow free access to the compressed grey-level image and to grant access to the color image only to owners of a secret key. The method consists of three major steps: color quantization, ordering, and DCT-based data hiding. The novelty of this paper is to build an indexed image which is, at the same time, a semantically intelligible grey-level image. In order to obtain this particular indexed image, which should be robust to data hiding, we propose an original K-color ordering algorithm. Finally, the DCT-based data-hiding method benefits from the use of a hybrid JPEG coder, which allows images to be compressed in a World Wide Web standard format while at the same time providing a data-hiding functionality.

### Tue.6.3: Signal Representation and Filter Analysis - 5 papers

Room: Room 4
Chair: Gaetano Giunta (University of Roma Tre, Italy)
On Signal Representation From Generalized Samples: Minmax Approximation with Constraints
Hagai Kirshner (Technion - Israel Institute of Technology, Israel); Tsvi Dvorkind (Technion, Israel); Yonina Eldar (Technion---Israel Institute of Technology, Israel); Moshe Porat (Technion- Israel Institute of Technology, Israel)
Signal representation plays a major role in many DSP applications. In this paper we consider the task of calculating representation coefficients of an analog signal, where the only available data is its samples obtained from practical non-ideal acquisition devices. We adopt a minmax approach and further incorporate regularity constraints on the original continuous-time signal. These constraints stem from the nature of applications where smooth signals serve as input data. The ensuing solution is shown to consist of an orthogonal projection within a Sobolev space. Illustrative examples are given, utilizing this constrained minmax approach. Our conclusion is that this new approach to signal representation could improve presently available systems, especially in non-ideal situations.
Structural Equivalences in Wave Digital Systems based on Dynamic Scattering Cells
Augusto Sarti (Politecnico di Milano, Italy)
In this article we prove that a computable tree-like interconnection of parallel/series Wave Digital adaptors with memory (which are characterized by reflection filters instead of reflection coefficients) is equivalent to a standard (instantaneous) multi-port adaptor whose ports are connected to mutators (2-port adaptors with memory). We prove this by providing a methodology for extracting the memory from a macro-adaptor, which simplifies the implementation of WD structures.
Parametric Phase Equalizers for Warped Filter-Banks
Heinrich Loellmann (RWTH Aachen University, Germany); Peter Vary (RWTH Aachen University, Germany)
In this paper, phase equalizers that aim to compensate the non-linear phase response of a frequency-warped filter-bank are investigated, and a new design is proposed. The frequency resolution of a warped filter-bank based on an allpass transformation can be adjusted by a single allpass coefficient. Thus, parametric phase equalizers are of special interest as their coefficients are given by a closed-form expression as a function of this allpass coefficient. The approximation error for parametric FIR phase equalizers is analyzed. They can achieve a low phase error but introduce magnitude distortions. These distortions can be avoided by using allpass phase equalizers. A new parametric allpass phase equalizer is proposed. It has a lower complexity than a general allpass phase equalizer and leads to an equiripple approximation error for the desired phase response and group-delay.
Fast reconstruction from non-uniform samples in shift-invariant spaces
Laurent Condat (Laboratory of Images and Signals, France); Annick Montanvert (Laboratory of Images and Signals, France)
We propose a new approach for signal reconstruction from non-uniform samples, without constraints on their locations. We look for a function that belongs to a linear shift-invariant space, and minimizes a variational criterion that is a weighted sum of a least-squares data term and a quadratic term penalizing the lack of smoothness. This leads to a resolution-dependent solution, that can be computed exactly by a fast non-iterative algorithm.
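The variational criterion described above — a weighted least-squares data term plus a quadratic smoothness penalty, minimized over a shift-invariant space — can be illustrated with a toy fit using shifted hat functions (linear B-splines). This is only a sketch of the criterion under simplifying assumptions (uniform weights, a discrete second-difference penalty), not the authors' fast non-iterative solver; function names are ours.

```python
import numpy as np

def hat(t):
    """Linear B-spline (hat function): the generator of the space."""
    return np.maximum(0.0, 1.0 - np.abs(t))

def fit_shift_invariant(x, y, n_knots, lam=0.1):
    """Penalized least squares over span{hat(. - k)}: minimize
    ||A c - y||^2 + lam * ||D c||^2 for the coefficient vector c."""
    A = hat(x[:, None] - np.arange(n_knots)[None, :])   # basis at sample sites
    D = np.diff(np.eye(n_knots), n=2, axis=0)           # second-difference penalty
    return np.linalg.solve(A.T @ A + lam * D.T @ D, A.T @ y)
```

Because hat functions reproduce linear signals exactly and the second-difference penalty vanishes on them, a linear input is recovered exactly regardless of the (non-uniform) sample locations.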
Design Trade-Offs for Linear-Phase FIR Decimation Filters and Sigma-Delta-Modulators
In this paper we examine the relation between signal-to-noise-ratio, oversampling ratio, transition bandwidth, and filter order for some commonly used sigma-delta-modulators and corresponding decimation filters. The decimation filters are equi-ripple finite impulse response filters and it is demonstrated that, for any given filter order, there exists an optimum choice of the stopband ripple and stopband edge which minimizes the signal-to-noise-ratio degradation.

### Tue.4.3: Bayesian Methods for Inverse Problems in Image and Signal Processing I (Special session) - 5 papers

Room: Sala Onice
Chair: Nikolaos Galatsanos (University of Ioannina, Greece)
Analysis versus Synthesis in Signal Priors
Ron Rubinstein (Technion, Israel Institute of Technology, Israel); Michael Elad (Technion, Israel); Peyman Milanfar (University of California, Santa Cruz, USA)
The concept of prior probability for signals plays a key role in the successful solution of many inverse problems. Much of the literature on this topic can be divided between analysis-based and synthesis-based priors. Analysis-based priors assign probability to a signal through various forward measurements of it, while synthesis-based priors seek a reconstruction of the signal as a combination of atom signals. In this paper we describe these two prior classes, focusing on the distinction between them. We show that although the two become equivalent in the complete and under-complete formulations, they depart in the more interesting over-complete formulation. Focusing on the L1 denoising case, we present several ways of comparing the two types of priors, establishing the existence of an unbridgeable gap between them.
Adaptive Bayesian/Total-Variation Image Deconvolution: A Majorization-Minimization Approach
José Bioucas-Dias (Instituto Superior Técnico, Portugal); Mario Figueiredo (Instituto Superior Técnico, Portugal); João Oliveira (Instituto Superior Técnico, Portugal)
This paper proposes a new algorithm for total variation (TV) image deconvolution under the assumptions of linear observations and additive white Gaussian noise. By adopting a Bayesian point of view, the regularization parameter, modeled with a Jeffreys' prior, is integrated out. Thus, the resulting criterion adapts itself to the data, and the critical issue of selecting the regularization parameter is sidestepped. To implement the resulting criterion, we propose a {\em majorization-minimization} approach, which consists in replacing a difficult optimization problem with a sequence of simpler ones. The computational complexity of the proposed algorithm is O(N) for finite support convolutional kernels. The results are competitive with recent state-of-the-art methods.
A Perceptual Bayesian Estimation Framework and its Application to Image Denoising
Javier Portilla (Consejo Superior de Investigaciones Científicas (CSIC), Spain)
We present a generic Bayesian framework for signal estimation that incorporates into the cost function a perceptual metric. We apply this framework to image denoising, considering additive noise of known density. Under certain assumptions on the way local differences in visual responses add up into a global perceptual distance, we obtain analytical solutions that exhibit interesting theoretical properties. We demonstrate through simulations, using an {\em infomax} non-linear perceptual mapping of the input and a local Gaussian model, that in the absence of a prior the new solutions provide a significant improvement on the visual quality of the estimation. Furthermore, they also improve in Mean Square Error terms w.r.t. their non-perceptual counterparts.
Adaptive regularization of noisy linear inverse problems
Lars Kai Hansen (Technical University of Denmark, Denmark); Kristoffer Madsen (Technical University of Denmark, Denmark); Tue Lehn-Schioler (Technical University of Denmark, Denmark)
In the Bayesian modeling framework there is a close relation between regularization and the prior distribution over parameters. For prior distributions in the exponential family, we show that the optimal hyperparameter, i.e., the optimal strength of regularization, satisfies a simple relation: the expectation of the regularization function takes the same value in the posterior and prior distributions. We present three examples: two simulations and an application in fMRI neuroimaging.
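For a Gaussian prior, the stated expectation-matching condition reduces to a simple fixed point for the prior precision. A sketch for Bayesian ridge regression with regularizer R(w) = ||w||² (our illustration under these specific assumptions, with illustrative names; the paper treats general exponential-family priors):

```python
import numpy as np

def adapt_alpha(X, y, sigma2=0.1, n_iter=50):
    """Fixed-point search for the prior precision alpha in Bayesian ridge
    regression (prior w ~ N(0, I/alpha)).  The update enforces
    E_posterior[||w||^2] = E_prior[||w||^2] = d/alpha."""
    d = X.shape[1]
    alpha = 1.0
    for _ in range(n_iter):
        S = np.linalg.inv(X.T @ X / sigma2 + alpha * np.eye(d))  # posterior cov
        m = S @ X.T @ y / sigma2                                  # posterior mean
        alpha = d / (m @ m + np.trace(S))   # match the two expectations
    return alpha, m
```

At convergence E_posterior[||w||²] = m·m + tr(S) equals the prior expectation d/alpha, which is the relation stated in the abstract specialized to this quadratic regularizer.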
Parameter Estimation and Order Selection for Linear Regression Problems
Yngve Selén (Uppsala University, Sweden); Erik G. Larsson (Royal Institute of Technology, Sweden)
Parameter estimation and model order selection for linear regression models are two classical problems. In this article we derive the minimum mean-square error (MMSE) parameter estimate for a linear regression model with unknown order. We call the so-obtained estimator the Bayesian Parameter estimation Method (BPM). We also derive the model order selection rule which maximizes the probability of selecting the correct model. The rule is denoted BOSS---Bayesian Order Selection Strategy. Our estimators have several advantages: they satisfy certain optimality criteria, they are non-asymptotic, and they have low computational complexity. We also derive "empirical Bayesian" versions of BPM and BOSS, which require neither prior knowledge nor the choice of any "user parameters". We show that our estimators outperform several classical methods, including the AIC and BIC for order selection.
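A toy version of the order-averaged MMSE idea can be written down directly — this is our sketch with a uniform prior over nested orders and a Gaussian coefficient prior, not the paper's exact BPM/BOSS derivation or its fast implementation:

```python
import numpy as np

def bpm_estimate(X, y, sigma2=1.0, tau2=10.0):
    """MMSE-style parameter estimate under an unknown model order.

    Assumptions (illustrative): nested regressors, uniform prior on the
    order k, and an N(0, tau2 I) prior on the k active coefficients.
    Returns the order-averaged estimate, the MAP order, and the
    posterior order weights."""
    n, p = X.shape
    log_ev, means = [], []
    for k in range(1, p + 1):
        Xk = X[:, :k]
        # marginal (evidence) covariance of y under order k
        S = sigma2 * np.eye(n) + tau2 * Xk @ Xk.T
        _, logdet = np.linalg.slogdet(S)
        log_ev.append(-0.5 * (logdet + y @ np.linalg.solve(S, y)))
        # posterior mean of the k active coefficients, zero-padded to p
        P = np.linalg.inv(Xk.T @ Xk / sigma2 + np.eye(k) / tau2)
        m = np.zeros(p)
        m[:k] = P @ Xk.T @ y / sigma2
        means.append(m)
    log_ev = np.asarray(log_ev)
    w = np.exp(log_ev - log_ev.max())
    w /= w.sum()                                       # posterior over orders
    theta = sum(wi * mi for wi, mi in zip(w, means))   # BPM-style average
    return theta, int(np.argmax(w)) + 1, w             # +1: orders start at 1
```

Averaging the per-order posterior means by their evidence weights is what makes the estimate MMSE-optimal under order uncertainty, while the argmax of the weights plays the role of a BOSS-style order selection.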

### Tue.3.3: Signal Processing for Music - 5 papers

Room: Sala Verde
Chair: Jesper Jensen (Delft University of Technology, NL)
A spectral difference approach to downbeat extraction in musical audio
Matthew Davies (Queen Mary, University of London, United Kingdom); Mark Plumbley (Queen Mary, University of London, United Kingdom)
We introduce a method for detecting downbeats in musical audio given a sequence of beat times. Using the musical knowledge that lower frequency bands are perceptually more important, we find that the spectral difference between band-limited, beat-synchronous analysis frames is a robust downbeat indicator. Initial results are encouraging for this type of system.
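The indicator can be sketched as follows — a simplified illustration of the band-limited spectral-difference idea (our own toy phase-picking step and parameter choices, not the authors' system):

```python
import numpy as np

def spectral_difference(frames, sr=44100, cutoff_hz=200.0):
    """L1 difference between the low-band magnitude spectra of consecutive
    beat-synchronous frames (sd[0] is defined as 0)."""
    spectra = np.abs(np.fft.rfft(frames, axis=1))
    keep = max(1, int(cutoff_hz / (sr / frames.shape[1])))  # low bins only
    d = np.sum(np.abs(np.diff(spectra[:, :keep], axis=0)), axis=1)
    return np.concatenate([[0.0], d])

def downbeat_phase(sd, beats_per_bar=4):
    """Choose the bar phase whose beats accumulate the most spectral change."""
    scores = [sd[p::beats_per_bar].sum() for p in range(beats_per_bar)]
    return int(np.argmax(scores))
```

If the low-frequency content (e.g. the bass line or harmony) changes mostly at bar boundaries, the spectral difference spikes at the downbeat phase, which the argmax then recovers.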
New methods in structural segmentation of musical audio
Mark Levy (Queen Mary, University of London, United Kingdom); Mark Sandler (Queen Mary University of London, United Kingdom)
We describe a simple model of musical structure and two related methods of extracting a high-level segmentation of a music track from the audio data, including a novel use of hidden semi-Markov models. We introduce a semi-supervised segmentation process which finds musical structure with improved accuracy given some very limited manual input. We give experimental results compared to existing methods and human segmentations.
Evaluation of MFCC estimation techniques for music similarity
Jesper Jensen (Aalborg University, Denmark); Mads Christensen (Aalborg University, Denmark); Manohar Murthi (University of Miami, USA); Søren Holdt Jensen (Aalborg University, Denmark)
Spectral envelope parameters in the form of mel-frequency cepstral coefficients are often used for capturing timbral information of music signals in connection with genre classification applications. In this paper, we evaluate mel-frequency cepstral coefficient (MFCC) estimation techniques, namely the classical FFT- and linear-prediction-based implementations and an implementation based on the more recent MVDR spectral estimator. The performance of these methods is evaluated in genre classification using a probabilistic classifier based on Gaussian Mixture Models. MFCCs based on fixed order, signal independent linear prediction and MVDR spectral estimators did not exhibit any statistically significant improvement over MFCCs based on the simpler FFT.
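The classical FFT-based pipeline the paper uses as its baseline follows well-known steps: power spectrum, triangular mel filterbank, log, DCT. A compact single-frame sketch (our simplified implementation with typical parameter choices, not the authors' code):

```python
import numpy as np

def mel(f):
    """Hz -> mel."""
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_inv(m):
    """mel -> Hz."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mfcc(frame, sr=16000, n_filters=26, n_ceps=13):
    """Classical FFT-based MFCCs for a single windowed frame."""
    power = np.abs(np.fft.rfft(frame)) ** 2
    freqs = np.fft.rfftfreq(len(frame), 1.0 / sr)
    # triangular filters equally spaced on the mel scale
    edges = mel_inv(np.linspace(mel(0.0), mel(sr / 2.0), n_filters + 2))
    fb = np.zeros((n_filters, len(freqs)))
    for i in range(n_filters):
        lo, mid, hi = edges[i], edges[i + 1], edges[i + 2]
        up = (freqs - lo) / (mid - lo)
        down = (hi - freqs) / (hi - mid)
        fb[i] = np.clip(np.minimum(up, down), 0.0, None)
    log_energy = np.log(fb @ power + 1e-10)
    # DCT-II decorrelates the log filterbank energies
    n = np.arange(n_filters)
    basis = np.cos(np.pi * np.outer(np.arange(n_ceps), n + 0.5) / n_filters)
    return basis @ log_energy
```

The LP- and MVDR-based variants studied in the paper differ only in how the power spectrum is estimated; the filterbank, log, and DCT stages are shared.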
Speech/music discrimination using a Warped LPC-based feature and a fuzzy expert system for intelligent audio coding
Nicolas Ruiz Reyes (University of Jaen, Spain); Sebastian Garcia Galan (University of Jaen, Spain); Jose Enrique Muñoz Exposito (University of Jaen, Spain); Pedro Vera Candeas (University of Jaen, Spain); Fernando Rivas Peña (University of Jaen, Spain)
Automatic discrimination of speech and music is an important tool in many multimedia applications. This paper presents an evolutionary fuzzy rules-based speech/music discrimination approach for intelligent audio coding. A low-complexity but effective feature, called the Warped LPC-based Spectral Centroid (WLPC-SC), is defined for the analysis stage of the discrimination system. The final decision is made by a fuzzy expert system, which improves the accuracy rate provided by a Gaussian Mixture Model (GMM) classifier by taking into account the audio labels assigned by the GMM classifier to past audio frames. A comparison between WLPC-SC and most of the timbral features proposed in [8] is performed, aiming to assess the discriminatory power of the proposed feature. The accuracy rate improvement due to the fuzzy expert system is also reported. Experimental results reveal that our speech/music discriminator is robust and fast, making it suitable for intelligent audio coding.
Use of Continuous Wavelet-Like Transform in Automated Music Transcription
Aliaksandr Paradzinets (Ecole Centrale de Lyon, France)
This paper describes an approach to the problem of automated music transcription. The Continuous Wavelet-like Transform is used as the basic time-frequency analysis of a musical signal due to its flexibility in time-frequency resolution. The signal is then sequentially modeled by a number of tone harmonic structures; on each iteration a dominant harmonic structure is considered to be a pitch candidate. The transcription performance is measured on test data generated by MIDI wavetable synthesis, both from MIDI files and played on a keyboard. Three cases are examined: monophonic, polyphonic, and complicated polyphonic.

Room: Auditorium

## 5:10 PM - 6:30 PM

### Tue.2.4: Bioinformatics - 4 papers

Chair: Ercan Kuruoglu (CNR, Pisa, Italy)
A Method For Analysing A Gene Expression Data Sequence Using Probabilistic Boolean Networks
Stephen Marshall (University of Strathclyde, United Kingdom); Le Yu (University of Strathclyde, United Kingdom)
This paper describes a new method for analysing a gene expression data sequence using Probabilistic Boolean Networks. Switch-like phenomena within biological systems result in difficulty in the modelling of gene regulatory networks. To tackle this problem, we propose an approach based on so-called purity functions to partition the data sequence into sections each corresponding to a single model with fixed parameters, and introduce a method based on reverse engineering for the identification of predictor genes and functions. Furthermore, we develop a new model extending the PBN concept for the inference of gene regulatory networks from gene expression time-course data under different biological conditions. We then apply it for inferring Macrophage gene regulation in the interferon pathway. In conjunction with this, a new approach based on constrained prediction and the Coefficient of Determination to identify the model from real expression data is presented in the paper. In addition, we also include a discussion of network switching probability, selection probabilities and mutation rates.
On Probe-Level Interference and Noise Modeling in Gene Expression Microarray Experiments
Paul Flikkema (Northern Arizona University, USA)
This paper describes a signal processing model of gene expression microarray experiments using oligonucleotide technologies. The objective is to estimate the expression transcript concentrations, modeled as an analog signal vector. This vector is received via a cascade of two noisy channels that model noise (uncertainty) before, during, and after hybridization. The second channel is also mixing, since transcript-probe hybridization is not perfectly specific. The gene expression levels are estimated based on a second-order statistical model that incorporates biological, sample preparation, hybridization, and optical detection noises. A key feature is the explicit modeling of gene-specific and non-specific hybridization, in which both have deterministic and random components. The model is applied to the processing of probe pairs as used in Affymetrix arrays, and currently used methods are compared with the optimum Gauss-Markov estimator. In general, the estimation performance is a function of the hybridization noise characteristics, the probe set design, and the number of experimental replicates, with implications for integrated design of the experimental process.
Exactly Periodic Subspace Decomposition Based Approach for Identifying Tandem Repeats in DNA Sequences
Ravi Gupta (Indian Institute of Technology Roorkee, India); Divya Sarthi (Indian Institute of Technology, India); Ankush Mittal (Indian Institute of Technology Roorkee, India); Kuldip Singh (Indian Institute of Technology Roorkee, India)
The identification and analysis of tandem repeats is an active area of biological and computational research. Tandem repetitive structures in telomeres play a role in cancer, and hypervariable trinucleotide tandem repeats are linked to over a dozen major degenerative diseases. They also play a crucial role in DNA fingerprinting. In this paper we present an algorithm to identify exact and inexact tandem repeats in DNA sequences based on an orthogonal exactly periodic subspace decomposition technique. The algorithm uses a sliding window approach to identify the location of tandem repeats and other patterns that are present in a DNA sequence due to the repetition of individual nucleotides. Our algorithm also resolves the problems that were present in the periodicity explorer algorithm for identifying tandem repeats. The time complexity of the algorithm, when searching for repeats with window size W in a DNA sequence S of length N, is O(NW lg W). We present some experimental results concerning the sensitivity of our algorithm.
Identifying Inverted Repeat Structure in DNA Sequences using Correlation Framework
Ravi Gupta (Indian Institute of Technology Roorkee, India); Ankush Mittal (Indian Institute of Technology Roorkee, India); Kuldip Singh (Indian Institute of Technology Roorkee, India)
The detection of inverted repeat structures is important in biology because they have been associated with important biological functions. This paper presents a framework for identifying inverted repeat structures present in DNA sequences. Based on the correlation framework, the algorithm is divided into two stages. In the first stage, the positions and lengths of contiguous inverted repeats are identified using a correlation function, based on the input parameters. In the second stage, maximal inverted repeats are constructed by merging contiguous inverted repeats. The advantage of the framework is that it can be successfully used for identifying both exact and inexact inverted repeats, returning maximal inverted repeats. Additionally, the framework does not require the user to specify parameters that demand knowledge of system details. Experiments were performed on various chromosomes of the Saccharomyces cerevisiae (baker's yeast) genome data available at the NCBI website, and some typical results are presented in this paper.
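What an inverted repeat is, and what the first stage must find, can be shown with a toy exact search: an even-length inverted repeat reads as its own reverse complement around a centre. This direct character-by-character sketch (ours; the paper instead uses a correlation function and also handles mismatches) finds such maximal exact repeats:

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def arm_length(seq, centre):
    """Arm length of the contiguous (even-length) inverted repeat whose
    centre lies between positions `centre` and `centre + 1`."""
    k = 0
    while (centre - k >= 0 and centre + 1 + k < len(seq)
           and seq[centre - k] == COMPLEMENT[seq[centre + 1 + k]]):
        k += 1
    return k

def find_inverted_repeats(seq, min_arm=3):
    """Report (start, end, arm) for every even-length inverted repeat
    with arms of at least `min_arm` bases."""
    hits = []
    for c in range(len(seq) - 1):
        k = arm_length(seq, c)
        if k >= min_arm:
            hits.append((c - k + 1, c + k, k))
    return hits
```

For example, the EcoRI restriction site GAATTC is its own reverse complement, so it is detected as a 3-base-arm inverted repeat wherever it appears.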

### Tue.5.4: Spread Spectrum, CDMA, and MC-CDMA - 4 papers

Chair: Filippo Giannetti (University of Pisa, Italy)
Jacek Leskow (Wyzsa Szkola Biznesu WSB-NLU, Poland); Antonio Napolitano (Universita di Napoli Parthenope, Italy)
In this paper, a new technique to design signals for secure transmissions is proposed. The proposed technique does not allow an unauthorized third party to discover the modulation format and, hence, to demodulate the signal. The signal-design technique consists in adopting a non relatively measurable sequence as the spreading sequence of a direct-sequence spread-spectrum signal. Non relatively measurable sequences are such that the appropriate time averages do not converge as the data-record length approaches infinity. Thus, none of their statistical functions defined in terms of infinite-time averages is convergent. Therefore, the modulated continuous-time signal also exhibits non-convergent statistical functions. Consequently, all modulation classification methods based on measurements of statistical functions such as the autocorrelation function, moments, and cumulants fail to identify the characteristics of the modulation format. Simulation results are provided to show the lack of convergence, as the data record is increased, of the estimators of the second-order cyclic statistics of the designed signal.
Subband Adaptive Array for MIMO-CDMA Space-Time Block Coded System
Nordin Ramli (University of Electro-communications, Japan); Tetsuki Taniguchi (University of Electro-Communications, Japan); Yoshio Karasawa (University of Electro-Communications, Japan)
This paper presents an interference suppression scheme using a subband adaptive array for space-time block coded (STBC) code division multiple access (CDMA) under frequency selective fading (FSF) channels. The proposed scheme combines CDMA with STBC and a receive antenna array with subband adaptive array (SBAA) processing at the receiver. The received signal is converted into the frequency domain before despreading, and adaptive processing is done in each subband. A novel construction of SBAA is introduced to process CDMA signals based on STBC. Furthermore, to improve the performance of the proposed scheme, we also introduce STBC-SBAA adopting a spreading-code cyclic prefix (CP). Simulation results demonstrate improved performance of the proposed system in both single-user and multi-user environments compared to competing alternatives.
Minimizing the Delay Estimation Error of a Spread Spectrum Signal for Satellite Positioning
Giacomo Bacci (University of Pisa, Italy); Marco Luise (University of Pisa, Italy)
As is known, satellite positioning is based on measuring the delay experienced by a Spread Spectrum (SS) signal that propagates from the satellite to the receiver. In such a scenario, the more accurate the delay estimation is, the more precise the user position computation will be. This paper derives a criterion to improve position accuracy, based on minimizing the variance of time-delay estimation at the receiver. In particular, it focuses on designing sequences with specified constraints on the aperiodic auto-correlation sequence. The techniques used to meet such constraints are based on difference sets obtained from power residue classification.
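The aperiodic auto-correlation that these constraints target is straightforward to compute; a minimal sketch (our illustration, checked here against the classical Barker-13 code rather than one of the paper's difference-set sequences):

```python
import numpy as np

def aperiodic_autocorrelation(seq):
    """Aperiodic (non-cyclic) autocorrelation r[k] = sum_n s[n] s[n+k]
    of a +/-1 spreading sequence, for lags k = 0 .. N-1."""
    s = np.asarray(seq, dtype=float)
    n = len(s)
    return np.array([np.dot(s[:n - k], s[k:]) for k in range(n)])
```

Low aperiodic sidelobes sharpen the correlation peak used for delay estimation; the Barker-13 code, for instance, attains the ideal unit-magnitude sidelobe level.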
Performance Analysis of a MC-CDMA Forward Link over Nonlinear Channels
Filippo Giannetti (University of Pisa, Italy); Vincenzo Lottici (University of Pisa, Italy); Ivan Stupia (University of Pisa, Italy)
This paper presents an analytical framework for the performance evaluation of an MC-CDMA forward-link in the presence of nonlinear distortions induced by transmitter's high-power amplifiers. The statistical characterization of the decision variable used for data detection is carried out through the application of the Bussgang theorem. Simulation results validate the accuracy of the proposed method under typical operating conditions.

### Tue.1.4: Color Image Processing (Invited special session) - 4 papers

Room: Auditorium
Chair: Eli Saber (Rochester Institute of Technology, USA)
Chair: Mark Shaw (Hewlett Packard Company, Boise, USA)
A Kernel Approach to Gamut Boundary Computation
Joachim Giesen (Swiss Federal Institute of Technology in Zurich, Switzerland); Eva Schuberth (Swiss Federal Institute of Technology in Zurich, Switzerland); Klaus Simon (EMPA, Switzerland); Peter Zolliker (EMPA, Switzerland)
We present a kernel based method to associate an image gamut given as a point cloud in three-dimensional Euclidean space with a continuous shape. The shape we compute is implicitly given as the zero-set of a smooth function that we compute from the point cloud using an efficient optimization method. The feasibility of our approach is demonstrated on a couple of examples.
Thin-plate splines for printer data interpolation
Gaurav Sharma (University of Rochester, USA); Mark Shaw (Hewlett Packard Company, Boise, USA)
Thin-plate spline models have been used extensively for data-interpolation in several problem domains. In this paper, we present a tutorial overview of their theory and highlight their advantages and disadvantages, pointing out specific characteristics relevant in printer data interpolation applications. We evaluate the accuracy of thin-plate splines for printer data interpolation and discuss how available knowledge of printers' physical characteristics may be beneficially exploited to improve performance.
HDR CFA Image Rendering
David Alleysson (University Pierre Mendes-France, France); Sabine Susstrunk (EPFL, Switzerland); Laurence Meylan (EPFL, Switzerland)
We propose a method for high dynamic range (HDR) mapping that is applied directly on the color filter array (CFA) image instead of the already demosaiced image. This rendering is closer to retinal processing, where an image is acquired by a mosaic of cones and adaptive non-linear functions apply before interpolation. Thus, in our framework, demosaicing is the final step of the rendering. Our method, inspired by retinal sampling and adaptive processing, is very simple, fast because only one third of the operations are needed, and gives good results as shown by experiments.
Recent advances in acquisition and reproduction of multispectral images
Jon Hardeberg (Gjøvik University College, Norway)
Conventional color imaging science and technology is based on the paradigm that three variables are sufficient to characterize a color. Color television uses three color channels, and silver-halide color photography uses three photo-sensitive layers. However, in particular due to metamerism, three color channels are often insufficient for high quality imaging e.g. for museum applications. In recent years, a significant amount of color imaging research has been devoted to introducing imaging technologies with more than three channels - a research field known as multispectral color imaging. This paper gives an overview of this field and presents some recent advances concerning acquisition and reproduction of multispectral images.

### Tue.6.4: Deconvolution Methods - 4 papers

Room: Room 4
Chair: Mats Viberg (Chalmers University of Technology, Sweden)
Frequency domain blind deconvolution in multiframe imaging using anisotropic spatially-adaptive denoising
Vladimir Katkovnik (Tampere University of Technology, Finland); Dmitriy Paliy (Tampere University of Technology, Finland); Karen Egiazarian (Tampere University of Technology, Finland); Jaakko Astola (Tampere University of Technology, Finland)
In this paper we present a novel method for multiframe blind deblurring of noisy images. It is based on minimization of an energy criterion formulated in the frequency domain using a recursive gradient-projection algorithm. For filtering and regularization we use the local polynomial approximation (LPA) of both the image and blur operators, and the intersection of confidence intervals (ICI) paradigm, applied to select adaptively varying scales (window sizes) for the LPA. The LPA-ICI algorithm is nonlinear and spatially adaptive with respect to the smoothness and irregularities of the image and blur operators. Simulation experiments demonstrate the efficiency and good performance of the proposed deconvolution technique.
Comparison of supergaussianity and whiteness assumptions for blind deconvolution in noisy context
Anthony Larue (National Polytechnic Institute of Grenoble, France); Dinh Tuan Pham (National Polytechnic Institute of Grenoble, France)
We propose a frequency-domain blind deconvolution algorithm based on the mutual information rate as a measure of whiteness. In the case of seismic data, the algorithm of Wiggins based on kurtosis, which is a supergaussianity criterion, is often used. We study the robustness of these two algorithms in a noisy context and compare them with Wiener filtering. We provide some theoretical explanations of the effect of the additive noise. The theoretical arguments are illustrated with a simulation of seismic signals. For such signals, the supergaussianity criterion appears more robust to noise contamination than the whiteness criterion.
A Channel Deflation Approach for the Blind Deconvolution of a Complex FIR Channel with Real Input
Konstantinos Diamantaras (Technological Education Institute of Thessaloniki, Greece); Theophilos Papadimitriou (Democritus University of Thrace, Greece); Efthimios Kotsialos (Technological Education Institute of Thessaloniki, Greece)
In this paper we present a novel second-order statistics method for the blind deconvolution of a real signal propagating through a complex channel. The method is computationally very efficient since it involves only one SVD computation for the time-delayed output covariance matrix. No other optimization is involved. The subspaces corresponding to the left and right singular value matrices can be used to "deflate" the channel: projecting the output signal on either subspace reduces the filter length to 1, thus source reconstruction is simplified. The method is suitable for any channel length and it offers performance improvement compared to well established methods.
Iterative Image Deconvolution Using Overcomplete Representations
Caroline Chaux (Univ. Marne la Vallée, France); Patrick Combettes (Université Paris 6, France); Jean-Christophe Pesquet (Univ. Marne la Vallée, France); Valérie Wajs (Université Paris 6, France)
We consider the problem of deconvolving an image with a priori information on its representation in a frame. Our variational approach consists of minimizing the sum of a residual energy and a separable term penalizing each frame coefficient individually. This penalization term may model various properties, in particular sparsity. A general iterative method is proposed and its convergence is established. The novelty of this work is to extend existing methods on two distinct fronts. First, a broad class of convex functions are allowed in the penalization term which, in turn, yields a new class of soft thresholding schemes. Second, while existing results are restricted to orthonormal bases, our algorithmic framework is applicable to much more general overcomplete representations. Numerical simulations are provided.
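The separable penalization described above leads, in the best-known special case of an l1 penalty, to the classical iterative soft-thresholding scheme. As a minimal illustration of that generic idea only (not the paper's more general algorithm or its overcomplete frames), here is a 1-D circular deconvolution sketch in the identity basis, with a made-up kernel and regularization weight:

```python
import numpy as np

def soft_threshold(x, t):
    # Proximity operator of t*||.||_1: shrinks each coefficient toward zero.
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def ista_deconvolve(y, h, lam=0.05, n_iter=200):
    """Iterative soft-thresholding for 1-D circular deconvolution.

    Minimizes 0.5*||y - h*x||^2 + lam*||x||_1, with x expressed in the
    identity basis (a stand-in for a general frame of coefficients).
    """
    H = np.fft.fft(h, len(y))
    L = np.max(np.abs(H)) ** 2                # Lipschitz constant of the data term
    x = np.zeros_like(y)
    for _ in range(n_iter):
        r = np.fft.ifft(H * np.fft.fft(x)).real - y        # residual h*x - y
        grad = np.fft.ifft(np.conj(H) * np.fft.fft(r)).real  # gradient of data term
        x = soft_threshold(x - grad / L, lam / L)          # gradient step + shrinkage
    return x
```

Replacing the soft threshold with the proximity operator of another convex penalty gives the broader class of thresholding schemes the abstract refers to.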

### Tue.4.4: Bayesian Methods for Inverse Problems in Image and Signal Processing II (Special session) - 4 papers

Room: Sala Onice
Chair: Nikolaos Galatsanos (University of Ioannina, Greece)
Variational Bayesian Blind Image Deconvolution Based on a Sparse Kernel Model for the Point Spread Function
Dimitris Tzikas (University of Ioannina, Greece); Aristidis Likas (University of Ioannina, Greece); Nikolaos Galatsanos (University of Ioannina, Greece)
In this paper we propose a variational Bayesian algorithm for the blind image deconvolution problem. The unknown point spread function (PSF) is modeled as a sparse linear combination of kernel basis functions. This model offers an effective mechanism to estimate for the first time both the support and the shape of the PSF. Numerical experiments demonstrate the effectiveness of the proposed methodology.
Hierarchical Bayesian Super Resolution Reconstruction of Multispectral Images
Rafael Molina (Universidad de Granada, Spain); Miguel Vega (University of Granada, Spain); Javier Mateos (University of Granada, Spain); Aggelos K. Katsaggelos (Northwestern University, USA)
In this paper we present a super resolution Bayesian methodology for pansharpening of multispectral images which: a) incorporates prior knowledge on the expected characteristics of the multispectral images, b) uses the sensor characteristics to model the observation process of both panchromatic and multispectral images, c) includes information on the unknown parameters in the model, and d) allows for the estimation of both the parameters and the high resolution multispectral image. Using real data, the pansharpened multispectral images are compared with the images obtained by other pansharpening methods and their quality assessed both qualitatively and quantitatively.
Hierarchical Markovian Models for 3D Computed Tomography in Non Destructive Testing Applications
Ali Mohammad-Djafari (Centre national de la recherche scientifique (CNRS), France); Lionel Robillard (EDF, France)
Computed Tomography (CT) has become a standard technique in Non Destructive Testing (NDT) applications, in particular for the detection and characterization of defects in metallic objects. One characteristic of such applications is that, in general, the number and angles of projections are very limited; on the other hand, we know a priori the kinds of materials we may find, mainly metal and air, or metal, air and a composite material. In this work, we first propose a particular hierarchical Markov-Potts prior model which accounts for the specificity of NDT CT. Then, we give details of a Bayesian estimation computation based on MCMC and EM techniques. Finally, we show the performance of the proposed 3D CT reconstruction method with a very limited number and range of projection angles and a very low signal-to-noise ratio, simulating a real NDT application in the power plant industry.
Bayesian Inference for Multidimensional NMR image reconstruction
Ji Won Yoon (University of Cambridge, United Kingdom); Simon Godsill (University of Cambridge, United Kingdom)
Reconstruction of an image from a set of projections has been adapted to generate multidimensional nuclear magnetic resonance (NMR) spectra, which have discrete features that are relatively sparsely distributed in space. For this reason, a reliable reconstruction can be made from a small number of projections. This new concept is called Projection Reconstruction NMR (PR-NMR). In this paper, the multidimensional NMR spectra are reconstructed by Reversible Jump Markov Chain Monte Carlo (RJMCMC). This statistical method generates samples under the assumption that each peak consists of a small number of parameters: position of peak centres, peak amplitude, and peak width. In order to find the number of peaks and shape, RJMCMC has several moves: birth, death, merge, split, and invariant updating. The reconstruction schemes are tested on a set of six projections derived from the three-dimensional 700 MHz HNCO spectrum of a protein HasA.

### Tue.3.4: Speech Recognition - 4 papers

Room: Sala Verde
Chair: Jaakko Astola (Tampere University of Technology, Finland)
Neural Networks with Random Letter Codes for Text-To-Phoneme Mapping and Small Training Dictionary
Eniko Bilcu (Tampere University of Technology, Finland); Jaakko Astola (Tampere University of Technology, Finland)
In this paper we address the problem of text-to-phoneme (TTP) mapping implemented by neural networks. One important disadvantage of neural networks is the convergence interval, which can be very large in some situations. Even when the neural networks are trained in offline mode, a shorter convergence interval is of interest for various reasons. In TTP mapping, decreasing the number of necessary iterations is equivalent to relaxing the requirements on the dictionary size. In this paper, we show that proper letter encoding can increase the convergence speed of the multilayer perceptron neural network for the task of TTP mapping. Experimental results that compare the performance of several techniques for speeding up the convergence of the multilayer perceptron in the context of TTP mapping are also presented.
Multimodal speaker localization in a probabilistic framework
Mihai Gurban (Swiss Federal Institute of Technology (EPFL), Switzerland); Jean-Philippe Thiran (Swiss Federal Institute of Technology (EPFL), Switzerland)
A multimodal probabilistic framework is proposed for the problem of finding the active speaker in a video sequence. We localize the current speaker's mouth in the image by using the video and the audio channels together. We propose a novel visual feature that is well-suited for the analysis of the movement of the mouth. After estimating the joint probability density of the audio and visual features, we can find the most probable location of the current speaker's mouth in a sequence of images. The proposed method is tested on the CUAVE audio-visual database, yielding improved results, compared to other approaches from the literature.
Optimal Selection of Bitstream Features for Compressed-Domain Automatic Speaker Recognition
Matteo Petracca (Politecnico di Torino, Italy); Antonio Servetti (Politecnico di Torino, Italy); Juan Carlos De Martin (Politecnico di Torino, Italy)
Low-complexity compressed-domain automatic speaker recognition algorithms are directly applied to the coded speech bitstream to avoid the computational burden of decoding the parameters and resynthesizing the speech waveform. The objective of this paper is to further reduce the complexity of this approach by determining the smallest set of bitstream features that has the maximum effectiveness on recognition accuracy. For this purpose, recognition accuracy is evaluated with various sets of medium-term statistical features extracted from GSM AMR compressed speech coded at 12.2 kb/s. Over a database of 14 speakers the results show that, using 20 seconds of active speech, a recognition ratio of 100% can be achieved with only nine of the 18 statistical features under analysis. This is a complexity reduction by a factor of two with respect to previous works. Moreover, the robustness of the proposed system has been assessed using test samples of different lengths and varying levels of frame losses, and proved to be the same as that of previous approaches.
Hybrid Approach for Unsupervised Audio Speaker Segmentation
Hachem Kadri (Ecole Nationale d'Ingénieurs de Tunis, Tunisia); Zied Lachiri (INSAT, Tunisia); Noureddine Ellouze (ENIT, Tunisia)
This paper deals with a new technique, DIS-T2-BIC, for audio speaker segmentation when no prior knowledge of the speakers is assumed. The technique is based on a hybrid concept organized in two steps: the detection of the most probable speaker turns and the validation of the turns already detected. For the detection, our technique uses a new distance-measure algorithm based on Hotelling's T2-statistic criterion. The validation is obtained by applying the Bayesian Information Criterion (BIC) segmentation algorithm to the detected speaker turns. To measure performance, we compare the segmentation results of the proposed method with those of recent hybrid techniques. Results show that the DIS-T2-BIC method has the advantage of highly accurate speaker change detection at a low computational cost.

## 8:40 AM - 11:00 AM

### Wed.2.1: Image and Video Quality Evaluation (Invited special session) - 7 papers

Chair: Alessandro Neri (Università degli Studi "Roma TRE", Italy)
H.264 Coding Artifacts And Their Relation To Perceived Annoyance
Tobias Wolff (Harman Becker Automotive Systems, Germany); Hsin-Han Ho (University of California Santa Barbara, USA); John Foley (University of California Santa Barbara, USA); Sanjit K. Mitra (UCSB, USA)
In this study we investigate coding artifacts in H.264 baseline profile. A psychophysical experiment was conducted that collected data about the subjectively perceived annoyance of short video sequences as well as the perceived strength of three coding artifacts. The data provided by 52 subjects is analyzed with respect to bitrate and intra period of the encoded sequences. A new data analysis method is presented which is based on a granular data representation and enables the detection of multidimensional functional dependencies in data sets. This method is employed to establish a model for the perceived annoyance as a function of artifact strength.
Task impact on the visual attention in subjective image quality assessment
Alexandre Ninassi (University of Nantes, France); Olivier Le Meur (Thomson R&D, France); Patrick Le Callet (University of Nantes, France); Dominique Barba (Institut de Recherche en Communications et Cybernétique de Nantes, France); Arnaud Tirel (University of Nantes, France)
Visual attention is a key feature of the human visual system. Knowing and using the mechanisms of visual attention could help improve image quality assessment. But which kind of saliency should be taken into account: free-task visual selective attention or quality-oriented visual selective attention? We recorded and evaluated the discrepancy between these two types of visual attention. Results are given to show the impact of the viewing task on visual strategy.
No-Reference perceptual quality assessment of colour images
Benjamin Bringier (SIC, Université de Poitiers, France); Noël Richard (Université de Poitiers, France); Chaker Larabi (SIC, Université de Poitiers, France); Christine Fernandez-Maloigne (SIC, Université de Poitiers, France)
Image quality assessment plays an important role in various image processing applications. In recent years, several objective image quality metrics correlated with perceived quality have been developed. Two categories of metrics can be distinguished: full-reference and no-reference. Full-reference metrics measure the decrease in image quality from some reference or ideal. No-reference approaches attempt to model the judgment of image quality without the reference. Unfortunately, a universal image quality model is not on the horizon, and empirical models established through psychophysical experimentation are generally used. In this paper, we present a new algorithm for quality assessment of colour reproduction based on human visual system modeling. A local contrast definition is used to assign quality scores. Finally, a good correlation is obtained between human evaluations and our method.
Estimation of accessible quality in noisy image compression
Nikolay Ponomarenko (National Aerospace University, Kharkov, Ukraine); Mikhail Zriakhov (National Aerospace University, Ukraine); Vladimir Lukin (National Aerospace University, Kharkov, Ukraine); Jaakko Astola (Tampere University of Technology, Finland); Karen Egiazarian (Tampere University of Technology, Finland)
The task of lossy compression of noisy images providing accessible quality is considered. By accessible quality we mean the minimal distortions of a compressed image with respect to the corresponding noise-free image that are observed in the case of the optimal operation point (OOP). Ways of reaching the OOP for noisy images are discussed. It is shown that this can be done in automatic mode with appropriate accuracy. Investigations are performed for the efficient DCT-based AGU coder on a set of test images. We also demonstrate that the proposed approach can be applied to automatic selection of the compression ratio for lossy compression of noise-free images.
No reference quality assessment of Internet multimedia services
Alessandro Neri (University of ROMA TRE, Italy); Marco Carli (University of Roma TRE, Italy); Marco Montenovo (HP C&I, Italy); Francesco Comi (University of Roma TRE, Italy)
In this paper an objective no-reference metric for assessing the quality degradations introduced by transmission over a heterogeneous IP network is presented. The proposed approach is based on the analysis of the interframe correlation measured at the output of the rendering application. It does not require information about the kind of errors, delays and latencies that affected the link, or about the countermeasures introduced by decoders to face the potential quality loss. Experimental results show the effectiveness of the proposed algorithm in approximating the assessments obtained with full-reference metrics.
Intelligent Sharpness Enhancement for Video Post-Processing
Jorge Caviedes (Intel Corporation, USA)
Sharpness enhancement is one of the post-processing stages in the consumer electronics video chain that operates in an open-loop mode. Although adaptive behavior is possible, in general there is no feedback system aimed at maximizing perceived quality. In this paper we introduce a control system and metric for sharpness enhancement algorithms. We also discuss the options of implementing an internal or local control loop, i.e., to control the basic sharpness enhancement engine at the pixel or region level, and an external or global control loop for the sharpness enhancement module.
Application Dependent Video Segmentation Evaluation - A Case Study for Video Surveillance
Evaluation of the performance of video segmentation algorithms is important for both theoretical and practical reasons. This paper addresses the problem of video segmentation assessment, through both subjective and objective approaches, for the specific application of video surveillance. After an overview of the state of the art in objective evaluation metrics for video segmentation, a general framework is proposed to cope with application-dependent evaluation. Finally, the performance of the proposed scheme is compared to state-of-the-art techniques and various conclusions are drawn.

### Wed.2.1: Transceiver Processing for Fast Time-Varying Channels (Invited special session) - 7 papers

Chair: Franz Hlawatsch (Vienna University of Technology, Austria)
Time-Varying Communication Channels: Fundamentals, Recent Developments, and Open Problems
Gerald Matz (Vienna University of Technology, Austria); Franz Hlawatsch (Vienna University of Technology, Austria)
In many modern wireless communication systems, the assumption of a locally time-invariant (block-fading) channel breaks down due to increased user mobility, data rates, and carrier frequencies. Fast time-varying channels feature significant Doppler spread in addition to delay spread. In this tutorial paper, we review some characterizations and sparse models of time-varying channels. We then discuss several models and methods recently proposed for communications over time-varying wireless channels, and we point out some related open problems and potential research directions.
Estimation of Doubly-Selective Channels in Block Transmissions
Mounir Ghogho (University of Leeds, United Kingdom); Ananthram Swami (Army Research Lab., USA)
We propose to estimate time-varying frequency-selective channels using data-dependent superimposed training (DDST) and a basis expansion model (BEM). The proposed method is an extension of the DDST-based method recently proposed for time-invariant channels. The superimposed training consists of the sum of a known sequence and a data-dependent sequence, which is unknown to the receiver. The data-dependent sequence cancels the effects of the unknown data on channel estimation. Symbol detection is performed using MMSE equalization.
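As background for the superimposed-training idea only (not the paper's DDST scheme or its BEM for time-varying channels), the following hypothetical sketch shows plain superimposed training for a time-invariant FIR channel: a known periodic sequence is added to the data, the receiver averages over periods so the zero-mean data fades out, and the channel taps come from a small least-squares fit. All values (m-sequence, taps, power split) are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)
P, K, L = 7, 2000, 3                             # training period, periods, channel taps
c = np.array([1., 1., 1., -1., 1., -1., -1.])    # known m-sequence period (good autocorrelation)
s = rng.choice([-1.0, 1.0], P * K)               # unknown zero-mean data symbols
h = np.array([1.0, 0.5, -0.2])                   # channel taps (assumed for the demo)
x = s + 0.5 * np.tile(c, K)                      # superimpose scaled training on the data
y = np.convolve(x, h)[:P * K] + 0.01 * rng.standard_normal(P * K)
ybar = y.reshape(K, P).mean(axis=0)              # average over periods: data term -> 0
# ybar[n] ~ 0.5 * sum_l h[l] * c[(n - l) % P], a small least-squares system in h
C = 0.5 * np.array([[c[(n - l) % P] for l in range(L)] for n in range(P)])
h_hat, *_ = np.linalg.lstsq(C, ybar, rcond=None)
```

The DDST refinement in the paper additionally subtracts a data-dependent sequence at the transmitter so the data contribution cancels exactly rather than only on average.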
Minimum-Energy Bandlimited Time-Variant Channel Prediction With Dynamic Subspace Selection
Thomas Zemen (ftw. Forschungszentrum Telekommunikation Wien, Austria); Christoph Mecklenbräuker (FTW, Austria); Bernard Fleury (Aalborg University, Denmark)
In current cellular communication systems the time-selective fading process is highly oversampled. We exploit this fact for time-variant flat-fading channel prediction by using dynamically selected predefined low dimensional subspaces spanned by discrete prolate spheroidal (DPS) sequences. The DPS sequences in each subspace exhibit a subspace-specific bandwidth matched to a certain Doppler frequency range. Additionally, DPS sequences are most energy concentrated in a time interval matched to the channel observation interval. Both properties enable the application of DPS sequences for minimum-energy (ME) bandlimited prediction. The dimensions of the predefined subspaces are in the range from one to five for practical communication systems. The subspace used for ME bandlimited prediction is selected based on a probabilistic bound on the reconstruction error. By contrast, time-variant channel prediction based on non-orthogonal complex exponential basis functions needs Doppler frequency estimates for each propagation path which requires high computational complexity. We compare the performance of this technique under the assumption of perfectly known complex exponentials with that of ME bandlimited prediction augmented with dynamic subspace selection. In particular we analyze the mean square prediction error of the two schemes versus the number of discrete propagation paths.
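A small numerical sketch of the subspace idea above, with assumed parameters rather than the paper's system model: discrete prolate spheroidal (DPS) sequences from `scipy.signal.windows.dpss` span a low-dimensional subspace that captures an oversampled band-limited fading process. The toy channel below is a hypothetical sum of Doppler-shifted exponentials well inside the band:

```python
import numpy as np
from scipy.signal.windows import dpss

M, NW = 256, 3.0                       # block length, time-bandwidth product
D = int(2 * NW) + 1                    # subspace dimension ~ 2NW + 1
U = dpss(M, NW, Kmax=D)                # D orthonormal DPS sequences, shape (D, M)

rng = np.random.default_rng(0)
# toy flat-fading channel: a few Doppler frequencies inside the band W = NW/M
nu = rng.uniform(-0.8 * NW / M, 0.8 * NW / M, 8)
h = np.exp(2j * np.pi * np.outer(np.arange(M), nu)) @ (
    rng.standard_normal(8) + 1j * rng.standard_normal(8)) / 4.0

coeff = U @ h                          # D subspace coefficients (U is real-valued)
h_hat = U.T @ coeff                    # rank-D reconstruction of the length-M channel
rel_err = np.linalg.norm(h - h_hat) / np.linalg.norm(h)
```

Prediction then amounts to extrapolating within this small subspace instead of estimating each Doppler frequency, which is the complexity advantage the abstract describes.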
Direct Equalization of Multiuser Doubly-Selective Channels Based on Superimposed Training
Shuangchi He (Auburn University, USA); Jitendra Tugnait (Auburn University, USA)
Design of doubly-selective linear equalizers for multiuser frequency-selective time-varying communications channels is considered using superimposed training and without first estimating the underlying channel response. Both the time-varying channel as well as the linear equalizers are assumed to be described by a complex exponential basis expansion model (CE-BEM). User-specific periodic (non-random) training sequences are arithmetically added (superimposed) to the respective information sequences at the transmitter before modulation and transmission. There is no loss in information rate. Knowledge of the superimposed training specific to the desired user and properties of the other training sequences are exploited to design the equalizers. An illustrative simulation example is presented.
Banded Equalizers for MIMO-OFDM in fast Time-Varying Channels
Luca Rugini (University of Perugia, Italy); Paolo Banelli (University of Perugia, Italy)
We propose low-complexity equalizers for multiple-input multiple-output (MIMO) orthogonal frequency-division multiplexing (OFDM) systems in frequency-selective time-varying channels, by extending the approach we formerly proposed for single-antenna OFDM systems. Specifically, by neglecting the intercarrier interference (ICI) coming from faraway subcarriers, we design minimum mean-squared error (MMSE) block linear equalizers (BLE) and MMSE block decision-feedback equalizers (BDFE) that employ a band LDL factorization algorithm. The complexity of the proposed banded equalizers is linear in the number of subcarriers, unlike the conventional MMSE-BLE and MMSE-BDFE, which have cubic complexity. We also consider a receiver window designed to minimize the power of the undesired ICI. Simulation results show that windowing is beneficial in controlling the complexity of the proposed equalizers with acceptable performance loss with respect to the conventional MMSE-BLE and MMSE-BDFE.
Spatial Multiplexing with Linear Precoding in Time-Varying Channels with Limited Feedback
Geert Leus (Delft University of Technology, The Netherlands); Claude Simon (Delft University of Technology, The Netherlands); Nadia Khaled (Interuniversity Micro-Electronics Center (IMEC), Belgium)
Combining spatial multiplexing with linear unitary precoding allows for high data rates, but requires a feedback link from the receiver to the transmitter. We focus on quantizing and feeding back the precoder itself, since it outperforms quantized channel feedback. More specifically, we propose a modified precoder quantization approach that outperforms the conventional one. We investigate both the linear minimum mean square error (LMMSE) detector, which minimizes the mean square error (MSE) between the transmitted and estimated symbols, and the singular value decomposition (SVD) detector, which is a unitary detector that aims at diagonalizing the channel matrix. In this context, we illustrate that the LMMSE detector performs slightly better than the SVD detector. We also study precoder extrapolation, when the precoder is only fed back at a limited number of time instances, as well as a related detector extrapolation scheme for the LMMSE and SVD detector, when the channel is only known at some specific time instances. Simulation results illustrate the efficiency of the proposed extrapolation methods.
Blind CFO estimation for OFDM with constant modulus constellations: performance bounds and algorithms
Timo Roman (Helsinki University of Technology, Finland); Andreas Richter (Helsinki University of Technology, Finland); Visa Koivunen (Helsinki University of Technology, Finland)
In this paper, we derive the Cramer-Rao bound for blind carrier frequency offset (CFO) estimation in orthogonal frequency division multiplexing (OFDM) with constant modulus constellations. A blind maximum likelihood CFO estimator is also proposed. It achieves highly accurate frequency synchronization with a single OFDM block, regardless of multipath fading and without the need for null-subcarriers. The approach is thus very attractive for time and frequency selective channels where the CFO may be time varying. If additional information is available, such as a single pilot symbol, maximum likelihood estimates of channel parameters and transmitted data can be obtained as a byproduct. Finally, performance bounds are evaluated for several commonly encountered scenarios.

### Wed.1.1: Cultural Heritage (Invited special session) - 7 papers

Room: Auditorium
Chair: Vito Cappellini (University of Florence, Italy)
Opportunities and issues of Image Processing for Cultural Heritage Applications
Alessandro Piva (University of Florence, Italy); Vito Cappellini (University of Florence, Italy)
The application of image processing techniques for the analysis, diagnostics and restoration of artworks remains a very uncommon practice. Recently, however, there has been greater interest in acquiring and processing image data of artworks: the efforts in this application field have been characterized by promising results, which prove the advantages that the use of digital image processing may bring to several issues. In this paper the peculiarities and the state of the art of this application field are described.
Using Spanning Trees for Reduced Complexity Image Mosaicing
Nikos Nikolaidis (Aristotle University of Thessaloniki, Greece); Ioannis Pitas (Aristotle University of Thessaloniki, Greece)
Image mosaicing, i.e., reconstruction of an image from a set of overlapping sub-images, has numerous applications that include high resolution image acquisition of works of art. Unfortunately, optimal mosaicing has very large computational complexity that soon becomes prohibitive as the number of sub-images increases. In this paper, two methods which achieve significant computational savings by applying mosaicing to pairs of sub-images at a time, without significant reconstruction losses, are proposed. Simulations are used to verify the computational efficiency and good performance in terms of matching error of the proposed techniques.
Automated Investigation of Archeological Vessels
Martin Kampel (Vienna University of Technology, Austria); Hubert Mara (Vienna University of Technology, Austria); Robert Sablatnig (Vienna University of Technology, Austria)
Motivated by the requirements of present-day archaeology, we are developing an automated system for the archaeological classification and reconstruction of ceramics. This paper presents a method to answer archaeological questions about the manufacturing process of ancient ceramics, which is important for determining the technological advancement of an ancient culture. The method is based on the estimation of the profile lines of ceramic fragments and can also be applied to complete vessels. With the enhancements shown in this paper, archaeologists get a tool to determine ancient manufacturing techniques.
Damages of Digitized Historical Images as Objects for Content Based Applications
Edoardo Ardizzone (Università degli Studi di Palermo, Italy); Haris Dindo (Università degli Studi di Palermo, Italy); Umberto Maniscalco (Istituto per le Applicazioni del Calcolo (I.A.C.) M. Picone - C.N.R, Italy); Giuseppe Mazzola (Università degli Studi di Palermo, Italy)
This work presents the preliminary results achieved within a FIRB project aimed at developing innovative support tools for the automatic or semi-automatic restoration of damaged digital images concerning the archaeological and monumental heritage of the Mediterranean coast. In particular, this paper focuses on a methodology for describing image degradation and its meta-representation for content-based storage and retrieval. Our innovative idea is to decompose and store the image content in a conventional RDBMS, treating the damages as objects of the images. Moreover, a set of descriptors (a subset of the MPEG-7 descriptors) is used for the damage meta-representation aimed at content-based applications. Finally, we developed a user-friendly database management tool for manipulating the contents of the database.
The image processing system for art specimens: Nephele
Miroslav Benes (Institute of Information Theory and Automation & Charles University, Czech Republic); Barbara Zitova (Institute of Information Theory and Automation, Czech Republic); Jan Flusser (Institute of Information Theory and Automation, Czech Republic); Janka Hradilova (Academic Laboratory of Materials Research of Paintings, Czech Republic); David Hradil (Academic Laboratory of Materials Research of Paintings, Czech Republic)
In this paper we introduce Nephele, a comprehensive solution for processing and archiving information about artwork specimens used in the course of art restoration. The information processing, based on image data, is used in the identification of the pigments and binders present in the artwork, which is a very important issue for restorers. The proposed approach geometrically aligns images of microscopic cross-sections of artwork color layers using an image registration method based on mutual information, and then creates a preliminary color-layer segmentation using modified k-means clustering. The archiving part of Nephele enables the creation of database entries for a painting-materials research database, their storage, and text-based queries. In addition to these traditional database functions, advanced report retrieval is supported, based on the similarity of image data, comparing either the ultraviolet and visual spectra images (using co-occurrence matrices and color similarity functions) or the energy-dispersive X-ray images (using features computed from the wavelet decomposition of the data).
Multispectral UV Fluorescence Analysis of Painted Surfaces
Anna Pelagotti (INOA, Italy); Luca Pezzati (INOA, Italy); Alessandro Piva (University of Florence, Italy); Andrea Del Mastio (University of Florence - Media Integration and Communication Center, Italy)
A novel system has been developed to acquire digital multispectral ultraviolet (UV) induced visible fluorescence images of paintings. We present here the image processing needed to understand and further process the acquired multispectral UV fluorescence images.
Analysis of Multispectral Images of Paintings
Philippe Colantoni (Jean Monnet University, France); Ruven Pillay (C2RMF, France); Christian Lahanier (C2RMF, France); Denis Pitzalis (C2RMF, France)
One hundred paintings conserved in several museums have been scanned by the C2RMF using the multi-spectral CRISATEL camera. These high resolution images allow us to not only generate an accurate colour image under any chosen illuminant, but also to reconstruct the reflectance spectrum at each pixel. Such images can be used for a visual qualitative as well as a measurement-based quantitative scientific analysis of the work of art. Several image processing tools have been developed to allow us to perform these analyses. The IIPImage system enables us to visualize high resolution multi-spectral 16 bit images, view image details in colour or for each spectral channel, and super-impose and compare different wavelengths. A complementary viewing system uses an innovative 3D graphics hardware-accelerated viewer to allow us to reconstruct the resulting colour dynamically while interactively changing the light spectrum. The system also allows us to perform segmentation, view the colour distribution for a particular colour-space and perform dynamic spectral reconstruction.

### Wed.6.1: Parameter Estimation - 7 papers

Room: Room 4
Chair: Jean-Yves Tourneret (ENSEEIHT/TeSA, France)
Bayesian estimation of mixtures of skewed alpha stable distributions with an unknown number of components
Alpha stable distributions are widely accepted models for impulsive data. Despite their flexibility in modelling varying degrees of impulsiveness and skewness, they fall short of modelling multimodal data. In this work, we present the alpha-stable mixture model which provides a framework for modelling multimodal, skewed and impulsive data. We describe new parameter estimation techniques for this model based on numerical Bayesian techniques which not only can estimate the alpha-stable and mixture parameters, but also the number of components in the mixture. In particular, we employ the reversible jump Markov chain Monte Carlo technique.
Parameter Estimation For Multivariate Gamma Distributions
Florent Chatelain (University of Toulouse, France); Jean-Yves Tourneret (IRIT/ENSEEIHT/TéSA, France); Jordi Inglada (CNES, France); André Ferrari (Nice-Sophia-Antipolis University, France)
Determining similarity measures between two images is an interesting problem for image registration and change detection. Bivariate gamma distributions are good candidates for radar images since their marginals are known to be univariate gamma distributions. This paper addresses the problem of estimating the parameters of these bivariate gamma distributions using the maximum likelihood method and the method of moments. The performance of both estimators is compared. Asymptotic expressions for the estimation variances are also derived.
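The bivariate estimators are the paper's contribution; as a minimal illustration of the method of moments in the simplest related case, the univariate gamma marginals mentioned above can be fitted by matching the first two sample moments (a textbook sketch, not the paper's bivariate procedure):

```python
import numpy as np

def gamma_moments(x):
    """Method-of-moments fit of a univariate gamma distribution.

    From E[X] = k*theta and Var[X] = k*theta**2, matching the sample
    mean m and variance v gives k = m^2/v and theta = v/m.
    """
    m, v = x.mean(), x.var()
    return m * m / v, v / m   # (shape k, scale theta)
```

Maximum likelihood for the gamma shape has no closed form (it involves the digamma function), which is one reason moment estimators remain a common, cheap alternative or initializer.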
ML approach to radial acceleration estimation and CRLB computation
Luciana Ortenzi (SELEX-SI, Italy); Luca Timmoneri (SELEX-SI, Italy); Alfonso Farina (SELEX-SI, Italy)
Usually only the ambiguous radial speed is extracted from a Moving Target Detector, with an accuracy depending on the width of the filter bank's filters; the ambiguity may then be eliminated using several algorithms. The aim of this paper is to illustrate an algorithm, based on the Maximum Likelihood Estimator (MLE), to also estimate radial acceleration. This kinematic parameter of the target may be exploited, for instance, to discriminate an ABT (Air Breathing Target) from a BT (Ballistic Target). The target radial acceleration is estimated on the basis of the phase evolution of the received signals. The accuracy of the estimation has been analyzed theoretically by means of the CRLB (Cramer-Rao Lower Bound) and by Monte Carlo simulation.
Consistent Signal Parameter Estimation with 1-Bit Dithered Sampling
Onkar Dabeer (Tata Institute of Fundamental Research, India); Aditya Karnik (University of Waterloo, Canada)
We consider the problem of estimating a parameter \theta of a signal s(x;\theta) corrupted by noise when only 1-bit precision samples are allowed. We propose and analyze a new estimator based on dithered 1-bit samples. Our estimate is consistent and satisfies an asymptotic CLT for a wide class of dither distributions. In particular, uniformly distributed dither leads to only a logarithmic rate loss compared to the case of full precision samples.
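The basic mechanism of dithered 1-bit estimation can be sketched for the simplest case of a constant parameter with uniform dither and no additional noise (an illustrative simplification, not the paper's general setting of a signal s(x;\theta) in noise):

```python
import numpy as np

# Unknown constant theta, uniform dither on [-A, A] with A >= |theta|.
# Then P(bit = 1) = (A + theta) / (2A), so theta = A * (2 * P(bit=1) - 1),
# and the empirical bit frequency gives a consistent estimate.
rng = np.random.default_rng(0)
theta, A = 0.3, 1.0
dither = rng.uniform(-A, A, 200_000)
bits = (theta + dither > 0).astype(float)   # the only data the estimator sees
theta_hat = A * (2.0 * bits.mean() - 1.0)
```

Despite observing only one bit per sample, the estimate concentrates around the true value as the sample size grows, which is the consistency property the paper establishes for a much wider class of signals and dither distributions.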
On the Influence of Detection Tests on Deterministic Parameters Estimation
Eric Chaumette (Thales Naval France, France); François Vincent (ENSICA Toulouse, France); Jérôme Galy (LIRMM Montpellier, France); Pascal Larzabal (SATIE ENS-Cachan, France)
In non-linear estimation problems three distinct regions of operation can be observed. In the asymptotic region, the Mean Square Error (MSE) of Maximum Likelihood Estimators (MLE) is small and, in many cases, close to the Cramer-Rao bound (CRB). In the a priori performance region, where the number of independent snapshots and/or the SNR are very low, the MSE is close to that obtained from the prior knowledge about the problem. Between these two extremes there is an additional transition region where the MSE of estimators deteriorates with respect to the CRB. The present paper provides examples of improved MSE prediction by the CRB, not only in the transition region but also in the a priori region, resulting from the introduction of a detection step, which proves that this refinement in the derivation of MSE lower bounds is worth investigating.
Estimation of Parameters of Input Traffic Streams From Statistically Multiplexed Output
Rajesh Narasimha (Georgia Tech, USA); Raghuveer Rao (Rochester Institute of Technology, USA); Sohail Dianat (RIT, USA)
This paper examines the problem of determining the degree of mixing of two independent and different types of traffic streams from observations of their statistically multiplexed stream. A common example of a pair of such different stream types in networks would be one conforming to the conventional Poisson model and the other obeying long-range dependence characterized by a heavy-tailed distribution. We provide an expression for the probability density function of the inter-arrival time of the mixed stream in terms of those of the input streams for the general case. An approach is provided to estimate input parameters from the first and second order statistics of the output traffic for the specific case of multiplexing Poisson and heavy-tailed processes.
Efficient time delay estimation based on cross-power spectrum phase
Marco Matassoni (ITC-irst, Italy); Piergiorgio Svaizer (ITC-irst, Italy)
Accurate Time Delay Estimation for acoustic signals acquired in noisy and reverberant environments is an important task in many speech processing applications. The Cross-power Spectrum Phase analysis is a popular method that has been demonstrated to perform well even in moderately adverse conditions. This paper describes an efficient approach to apply it in the case of static sources. It exploits the linearity of the generalized cross-correlation to accumulate information from a plurality of frames in the time domain. This translates into a reduced computational load and a more robust estimation. Several examples drawn from real and simulated data in typical applications are discussed.
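The Cross-power Spectrum Phase analysis referred to here is widely known as GCC-PHAT: the cross-power spectrum is normalized to unit magnitude so that only the phase (and hence the delay) survives the inverse transform. A minimal single-frame sketch (the paper's contribution, accumulating many frames in the time domain, is not reproduced):

```python
import numpy as np

def gcc_phat_delay(x, y, n_fft):
    # Cross-power spectrum, PHAT-weighted: keep only the phase information.
    X = np.fft.rfft(x, n_fft)
    Y = np.fft.rfft(y, n_fft)
    R = np.conj(X) * Y
    R = R / (np.abs(R) + 1e-12)          # PHAT normalization (guarded)
    cc = np.fft.irfft(R, n_fft)          # generalized cross-correlation
    shift = int(np.argmax(cc))
    # Lags above n_fft/2 correspond to negative delays (circular wrap-around).
    if shift > n_fft // 2:
        shift -= n_fft
    return shift

rng = np.random.default_rng(1)
x = rng.standard_normal(1024)
y = np.roll(x, 5)                        # y lags x by 5 samples (circularly)
delay = gcc_phat_delay(x, y, n_fft=1024)
```

Because the PHAT weighting whitens the spectrum, the inverse transform is sharply peaked at the true lag even for broadband signals.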

### Wed.4.1: Speech Analysis and Synthesis - 7 papers

Room: Sala Onice
Chair: Renato De Mori (LIA - University of Avignon, France)
Robust F0 estimation based on a multi-microphone periodicity function for distant-talking speech
Federico Flego (ITC-irst (Centro per la Ricerca Scientifica e Tecnologica), Italy); Maurizio Omologo (ITC-irst (Centro per la Ricerca Scientifica e Tecnologica), Italy)
This work addresses the problem of deriving F0 from distant-talking speech signals acquired by a microphone network. The method proposed here exploits the redundancy across the channels by jointly processing the different signals. To this purpose, a multichannel periodicity function is derived from the magnitude spectra of all the channels. This function allows F0 to be estimated reliably, even under reverberant conditions, without the need for any post-processing or smoothing technique. Experiments conducted on real data showed that the proposed frequency-domain algorithm is more suitable than other time-domain based ones.
A multi-decision sub-band voice activity detector
Alan Davis (Western Australian Telecommunications Research Institute, Australia); Sven Nordholm (Western Australian Telecommunications Research Institute, Australia); Siow Yong Low (Curtin University of Technology, Australia); Roberto Togneri (University of Western Australia, Australia)
In this paper, a new paradigm for voice activity detection (VAD) is introduced. The idea is to exploit the spectral nature of speech to make independent voice activity decisions in separate sub-bands, resulting in multiple decisions for any frame. A potential method to perform multi-decision sub-band VAD is proposed and then evaluated on a small test set. The evaluations illustrate the concept and potential benefit of multi-decision sub-band VAD.
Noise Reduction Using Reliable A Posteriori Signal-To-Noise Ratio Features
Cyril Plapous (France Telecom, France); Claude Marro (France Télécom R&D, France); Pascal Scalart (University of Rennes, France)
This paper addresses the problem of single-microphone speech enhancement in noisy environments. State-of-the-art short-time noise reduction techniques are most often expressed as a spectral gain depending on the Signal-to-Noise Ratio (SNR). The well-known decision-directed approach drastically limits the level of musical noise, but the estimated a priori SNR is biased since it depends on the speech spectrum estimated in the previous frame. The consequence of this bias is an annoying reverberation effect. We propose a method, called the Reliable Features Selection Noise Reduction (RFSNR) technique, capable of classifying the a posteriori SNR estimates into two categories: reliable features corresponding to speech components, and unreliable ones corresponding to musical noise only. It is then possible to directly enhance speech using these reliable components, thus obtaining an unbiased estimator.
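The decision-directed a priori SNR estimate this abstract refers to combines the previous frame's clean-speech estimate with the current a posteriori SNR. A per-bin sketch of that standard update followed by a Wiener gain (the proposed RFSNR classification itself is not reproduced here):

```python
import numpy as np

def decision_directed_gain(noisy_power, noise_power, prev_clean_power, alpha=0.98):
    """One decision-directed update of the a priori SNR, then a Wiener gain.

    gamma is the a posteriori SNR; xi blends the previous frame's clean-speech
    power (the source of the bias discussed in the abstract) with the current
    instantaneous SNR excess.
    """
    gamma = noisy_power / noise_power
    xi = alpha * prev_clean_power / noise_power \
         + (1.0 - alpha) * np.maximum(gamma - 1.0, 0.0)
    gain = xi / (1.0 + xi)               # Wiener spectral gain
    return gain, xi

# Example bin: noisy power 4, noise power 1, no previous clean estimate.
gain, xi = decision_directed_gain(4.0, 1.0, 0.0)
```

With no previous clean-speech estimate, xi reduces to (1 - alpha) times the instantaneous SNR excess, which is exactly the smoothing that suppresses musical noise at the cost of the one-frame bias.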
Theoretical and experimental bases of a new method for accurate separation of harmonic and noise components of speech signals
Laurent Girin (Institut National Polytechnique de Grenoble, France)
In this paper, the problem of separating the harmonic and aperiodic (noise) components of speech signals is addressed. A new method is proposed, based on two specific processes designed to better take into account the non-stationarity of speech signals: first, a period-scaled synchronous analysis of spectral parameters (amplitudes and phases) is performed, based on the Fourier series expansion of the signal, as opposed to the typically used Short-Term Fourier Transform (STFT). Second, the separation itself is based on low-pass time-filtering of the parameter trajectories. In addition to presenting the theoretical basis of the method, preliminary experiments on synthetic speech are provided. These experiments show that the proposed method has the potential to significantly outperform a reference method based on the STFT: signal-to-error ratio gains of 5 dB are typically obtained in the presented experiments. Conditions for going beyond the theoretical framework towards more practical applications on real speech signals are discussed.
Dynamic Selection of Acoustic Features in an Automatic Speech Recognition System
Loïc Barrault (University of Avignon, France); Driss Matrouf (LIA - University of Avignon, France); Renato De Mori (LIA - University of Avignon, France); Roberto Gemello (Loquendo - Italy, Italy); Franco Mana (Loquendo - Italy, Italy)
A general approach for integrating different acoustic feature sets and acoustic models is presented. A strategy for using one feature set as a reference and for scheduling the execution of the other feature sets is introduced. The strategy is based on the introduction of feature variability states. Each phoneme of a word hypothesis is assigned one such state. The probability that a word hypothesis is incorrect given the sequence of its variability states is computed and used for deciding on the introduction of new features. Significant WER reductions have been observed on the test sets of the AURORA3 corpus. Using the CH1 portions of the test sets of the Italian and Spanish corpora, word error rate reductions of 16.42% for Italian and 29.4% for Spanish were observed.
Singing Voice Recognition Considering High-Pitched and Prolonged Sounds
Akira Sasou (National Institute of Advanced Industrial Science and Technology, AIST, Japan)
A conventional Large Vocabulary Continuous Speech Recognition (LVCSR) system has difficulty recognizing singing voices accurately because both the high-pitched and prolonged sounds of singing voices tend to degrade its recognition accuracy. We previously described an Auto-Regressive Hidden Markov Model (AR-HMM) and an accompanying parameter estimation method. We demonstrated that the AR-HMM accurately estimated the characteristics of both articulatory systems and excitation signals from high-pitched speech. In this paper, we describe an AR-HMM applied to feature extraction from singing voices and propose a prolonged-sound detection and elimination method.
Natural Sounding TTS Based on Syllable-Like Units
Samuel Thomas (Indian Institute of Technology, Madras, India); Nageshwara Rao (Indian Institute of Technology, Madras, India); Hema Murthy (Indian Institute of Technology, Madras, India); C.S. Ramalingam (Indian Institute of Technology, Madras, India)
In this work we describe a new "syllable-like" speech unit that is suitable for concatenative speech synthesis. These units are automatically generated using a group delay based segmentation algorithm and acoustically correspond to the form C*VC* (C: consonant, V: vowel). The effectiveness of the unit is demonstrated by synthesizing natural-sounding speech in Tamil, a regional Indian language. Significant quality improvement is obtained if bisyllable units are also used, rather than just monosyllables, with results far superior to the traditional diphone-based approach. An important advantage of this approach is the elimination of prosody rules. Since f0 is part of the target cost, the unit selection procedure chooses the best unit from among the many candidates. The naturalness of the synthesized speech demonstrates the effectiveness of this approach.

### Wed.3.1: Advances in Monte Carlo methods for target tracking (Special session) - 7 papers

Room: Sala Verde
Chair: Petar Djuric (State University of New York at Stony Brook, USA)
Chair: Monica Bugallo (Stony Brook University, USA)
Controlling particle filter regularization for GPS/INS hybridization
Audrey Giremus (University of Toulouse, France); Jean-Yves Tourneret (IRIT/ENSEEIHT/TéSA, France)
Coupling GPS with Inertial Navigation Systems (INS) is an interesting way of improving navigation performance in terms of accuracy and continuity of service. This coupling is generally performed by using GPS pseudorange measurements to estimate INS estimation errors and sensor biases. Particle filtering techniques are good candidates for solving the corresponding estimation problem because of the nonlinear measurement equation. However, classical particle filter algorithms tend to degenerate in this application because of the small state noise. Regularized particle filters overcome this limitation at the expense of noisy state estimates. A recent regularized particle filter was proposed to control the regularization process via a Metropolis-Hastings step. The method was shown to increase particle filter robustness while decreasing the variance of the estimates. This paper goes further by introducing an appropriate criterion that measures the degeneracy of the particle cloud. This criterion is used to control the regularization, which is then not applied systematically, reducing the algorithm's computational cost. The main idea of the proposed strategy is to monitor on line the mean jumps of the predicted measurement likelihood by means of a CUSUM algorithm. Simulation results are proposed to validate the relevance of the criterion and the performance of the overall algorithm.
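The CUSUM algorithm used here to monitor mean jumps in the predicted measurement likelihood is a standard change-detection tool. A generic one-sided sketch (parameter names are illustrative, not taken from the paper):

```python
def cusum_detect(samples, target_mean, drift, threshold):
    """One-sided CUSUM: accumulate positive deviations beyond `drift`;
    declare a change when the cumulative statistic exceeds `threshold`."""
    s = 0.0
    for i, x in enumerate(samples):
        s = max(0.0, s + (x - target_mean - drift))
        if s > threshold:
            return i  # index at which the change is declared
    return None

# A mean jump from 0 to 2 at index 10 is flagged a couple of samples later.
samples = [0.0] * 10 + [2.0] * 10
change_idx = cusum_detect(samples, target_mean=0.0, drift=0.5, threshold=3.0)
```

The drift term suppresses false alarms from small fluctuations, while the threshold trades detection delay against robustness, which is the same trade-off that governs when regularization is triggered in the proposed filter.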
Efficient Variable Rate Particle Filters For Tracking Manoeuvring Targets Using An MRF-based Motion Model
William Ng (University of Cambridge, United Kingdom); Sze Kim Pang (University of Cambridge, United Kingdom); Jack Li (University of Cambridge, United Kingdom); Simon Godsill (University of Cambridge, United Kingdom)
In this paper we describe an efficient real-time tracking algorithm for multiple manoeuvring targets using particle filters. We combine independent partition filters with a Markov Random Field motion model to enable efficient and accurate tracking for interacting targets. A Poisson model is also used to model both targets and clutter measurements, avoiding the data association difficulties associated with traditional tracking approaches. Moreover, we present a variable rate dynamical model in which the states change at different and unknown rates compared with the observation process, thereby being able to model parsimoniously the manoeuvring behaviour of an object even though only a single dynamical model is employed. Computer simulations demonstrate the potential of the proposed method for tracking multiple highly manoeuvrable targets in a hostile environment with high clutter density and low detection probability.
A Particle Filter for Beacon-Free Node Location and Target Tracking in Sensor Networks
We address the problem of tracking a maneuvering target that moves along a region monitored by a sensor network, whose nodes, including both the sensors and the data fusion center (DFC), are located at unknown positions. Therefore, the node locations and the target track must be estimated jointly without the aid of beacons. We assume that, when the network is started, each sensor is able to detect the presence of other nodes within its range and transmit the resulting binary data to the DFC. After this startup phase, the sensor nodes just measure some physical magnitude related to the target position and/or velocity and transmit it to the DFC. At the DFC, a particle filtering (PF) algorithm is used to integrate all the collected data and produce on-line estimates of both the (static) sensor locations and the (dynamic) target trajectory. The validity of the method is illustrated by computer simulations of a network of power-aware sensors.
On Low-Power Analog Implementation of Particle Filters for Target Tracking
Rajbabu Velmurugan (Georgia Tech, USA); Shyam Subramanian (Georgia Tech, USA); Volkan Cevher (University of Maryland, USA); David Abramson (Monash University, Australia); Kofi Odame (Georgia Tech, USA); Jordan Gray (Georgia Tech, USA); Haw-Jing Lo (Georgia Tech, USA); James McClellan (Georgia Tech, USA); David Anderson (Georgia Tech, USA)
We propose a low-power, analog and mixed-mode, implementation of particle filters. Low-power analog implementation of nonlinear functions such as exponential and arctangent functions is done using multiple-input translinear element (MITE) networks. These nonlinear functions are used to calculate the probability densities in the particle filter. A bearings-only tracking problem is simulated to present the proposed low-power implementation of the particle filter algorithm.
A Joint Radar-Acoustic Particle Filter Tracker with Acoustic Propagation
Volkan Cevher (University of Maryland, USA); Milind Borkar (Georgia Institute of Technology, USA); James McClellan (Georgia Institute of Technology, USA)
In this paper, a novel particle filter tracker is presented for target tracking using collocated radar and acoustic sensors. Real-time tracking of the target's position and velocity in Cartesian coordinates is performed using batches of range and direction-of-arrival estimates. For robustness, the filter aligns the radar and acoustic data streams to account for acoustic propagation delays. The filter proposal function uses a Gaussian approximation to the full tracking posterior for improved efficiency. To incorporate the aligned acoustic data into the tracker, a two-stage weighting strategy is proposed. Computer simulations are provided to demonstrate the effectiveness of the algorithm.
Monica Bugallo (Stony Brook University, USA); Petar Djuric (State University of New York at Stony Brook, USA)
Recently, we have proposed a particle filtering-type methodology, which we refer to as cost-reference particle filtering (CRPF). Its main feature is that it is not based on any particular probabilistic assumptions regarding the studied dynamic model. The concepts of particles and particle streams, however, are the same in CRPF as in standard particle filtering (SPF), but the probability masses of the particles are replaced with user defined costs. In this paper we propose some modifications of the original CRPF methodology. The changes allow for development of simpler algorithms, which may also be less computationally intensive and possibly more robust. We investigate several variants of CRPF and compare them with SPF. The advantages and disadvantages of the considered algorithms are illustrated and discussed through computer simulations of tracking of multiple targets which move along a two-dimensional space.
Tracking Multiple Acoustic Sources using Particle Filtering
Fabio Antonacci (Politecnico di Milano, Italy); Davide Riva (Politecnico di Milano, Italy); Diego Saiu (Politecnico di Milano, Italy); Augusto Sarti (DEI - Politecnico di Milano, Italy); Marco Tagliasacchi (Politecnico di Milano, Italy); Stefano Tubaro (Politecnico di Milano, Italy)
In this paper we deal with the problem of localizing and tracking multiple acoustic sources by means of microphone pairs. We assume that the propagation takes place in a reverberating environment such as an office room. The problem is tackled by combining two well-known techniques. First, for each pair of microphones, source de-mixing is carried out using the TRINICON algorithm. TRINICON exploits the fact that the original sources are statistically independent in order to estimate appropriate de-mixing filters. The impulse responses of such filters exhibit peaks related to the TDOA (Time Difference of Arrival) of each microphone pair. In the second step, such observations are combined using a particle filter with a dynamic model representing the positions and the velocities of the sources. Simulations demonstrate that the proposed system enables accurate tracking of moving acoustic sources in reverberating environments (±10 cm in a 5 m × 5 m room with T60 < 0.450 s).

Room: Auditorium

## 11:20 AM - 1:00 PM

### Wed.2.2: Wireless Sensors Networks - 5 papers

Chair: Mounir Ghogho (University of Leeds, United Kingdom)
Distributed particle filter for target tracking in wireless sensor network
Zhenya Yan (Nanjing University of Posts & Telecommunications, P.R. China); Baoyu Zheng (Nanjing Univ. of Posts and Telecommunications, P.R. China); JingWu Cui (Nanjing University of Posts And Telecomm, P.R. China)
In this paper, a distributed particle filter for target tracking in wireless sensor networks is proposed. The key issue in distributed particle filtering is sensor selection. Because of the limited resources, the fusion of the observation from the selected sensor must reduce the uncertainty of the target state distribution without receiving any actual sensor observations. Our sensor selection is computationally much simpler than other sensor selection approaches. Simulation results show that our approach achieves very good target tracking performance.
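For readers unfamiliar with the underlying machinery, the bootstrap particle filter on which such distributed schemes build follows a predict-weight-resample cycle. A minimal 1-D sketch with an assumed random-walk state model (the paper's distributed sensor-selection logic is not reproduced):

```python
import numpy as np

def bootstrap_particle_filter(observations, n_particles=1000,
                              proc_std=0.1, obs_std=0.5, rng=None):
    """Bootstrap particle filter for a 1-D random-walk state observed in
    Gaussian noise: predict, weight by the likelihood, estimate, resample."""
    rng = rng if rng is not None else np.random.default_rng(0)
    particles = rng.normal(0.0, 1.0, n_particles)     # initial particle cloud
    estimates = []
    for z in observations:
        particles = particles + rng.normal(0.0, proc_std, n_particles)  # predict
        w = np.exp(-0.5 * ((z - particles) / obs_std) ** 2)             # likelihood
        w = w / w.sum()
        estimates.append(float(np.dot(w, particles)))                   # MMSE estimate
        idx = rng.choice(n_particles, size=n_particles, p=w)            # resample
        particles = particles[idx]
    return np.array(estimates)

# Track a static target at position 1.0 from repeated observations.
observations = np.full(50, 1.0)
estimates = bootstrap_particle_filter(observations)
```

Each cycle concentrates the particle cloud around states consistent with the data, so the weighted mean converges towards the true position.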
Ring Based Wavelet Transform with Arbitrary Supports in Wireless Sensor Networks
Siwang Zhou (Hunan University, P.R. China); Yaping Lin (College of Computer and Communication, Hunan University, P.R. China); Yonghe Liu (UT Arlington, USA)
In this paper, we propose a general transform for wavelet-based data compression in wireless sensor networks. By employing a ring topology, our transform is capable of supporting a broad scope of wavelets rather than only specific ones. At the same time, the scheme is capable of simultaneously exploiting the spatial and temporal correlations among the sensory data. Furthermore, the ring-based topology is particularly effective in eliminating the "border effect" generally encountered by wavelet-based schemes. Theoretically and experimentally, we show that the proposed wavelet transform can effectively exploit the spatial and temporal correlation in the sensory data and provide a significant reduction in energy consumption compared to other schemes.
Locally Optimum Estimation in Wireless Sensor Networks
Stefano Marano (University of Salerno, Italy); Vincenzo Matta (University of Salerno, Italy); Peter Willett (University of Connecticut, USA)
A locally optimum approach for estimating a nonrandom parameter lying in some small neighborhood of a known nominal value is considered. Reference is made to a decentralized estimation problem in the context of wireless sensor networks, and particular attention is paid to the design of the quantizers used by the remote sensors.
Training a SVM-based Classifier in Distributed Sensor Networks
Kallirroe Flouri (University of Crete, Greece); Baltasar Beferull-Lozano (Universidad de Valencia, Spain); Panagiotis Tsakalides (University of Crete, Greece)
The emergence of smart low-power devices (motes), which have micro-sensing, on-board processing, and wireless communication capabilities, has impelled research in distributed and on-line learning under communication constraints. In this paper, we show how to perform a classification task in a wireless sensor network using distributed algorithms for Support Vector Machines (SVMs), taking advantage of the sparse representation that SVMs provide for the decision boundaries. We present two energy-efficient algorithms involving distributed incremental learning for the training of an SVM in a wireless sensor network, for both stationary and non-stationary sample data (concept drift). Through analytical studies and simulation experiments, we show that the two proposed algorithms exhibit performance similar to traditional centralized SVM training methods, while being much more efficient in terms of energy cost.
Accurate Sequential Weighted Least Squares Algorithm for Wireless Sensor Network Localization
Kit Wing Chan (City University of Hong Kong, Hong Kong); Hing Cheung So (City University of Hong Kong, Hong Kong)
Estimating the positions of sensor nodes is a fundamental and crucial problem in ad hoc wireless sensor networks (WSNs). In this paper, an accurate node localization method for WSNs is devised based on the weighted least squares technique with the use of time-of-arrival measurements. Computer simulations are included to evaluate the performance of the proposed approach by comparing with the classical multidimensional scaling method and Cramer-Rao lower bound.
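A common starting point for TOA-based localization, before any weighting, is a linearized least-squares solver: subtracting one range equation from the others cancels the quadratic term in the unknown position. An illustrative sketch with hypothetical anchor positions (the paper's weighted, sequential refinement is not shown):

```python
import numpy as np

def toa_localize(anchors, ranges):
    """Linearized least-squares position from time-of-arrival ranges.

    From ||p - a_i||^2 = r_i^2, subtracting the first equation removes
    ||p||^2 and leaves the linear system A p = b, solved in the LS sense.
    """
    a = np.asarray(anchors, dtype=float)
    r = np.asarray(ranges, dtype=float)
    A = 2.0 * (a[1:] - a[0])
    b = (np.sum(a[1:] ** 2, axis=1) - np.sum(a[0] ** 2)
         - r[1:] ** 2 + r[0] ** 2)
    return np.linalg.lstsq(A, b, rcond=None)[0]

# Hypothetical anchors and a node at (1, 2); ranges are exact here.
anchors = np.array([[0.0, 0.0], [4.0, 0.0], [0.0, 3.0], [5.0, 6.0]])
true_p = np.array([1.0, 2.0])
ranges = np.linalg.norm(anchors - true_p, axis=1)
p_hat = toa_localize(anchors, ranges)
```

With noisy ranges the unweighted solution degrades, which is precisely what motivates the weighted least-squares refinement proposed in the paper.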

### Wed.5.2: Video Analysis and Understanding - 5 papers

Chair: A. Murat Tekalp (Koc University, Turkey)
Hierarchical video summaries by dendrogram cluster analysis
Sergio Benini (University of Brescia, Italy); Aldo Bianchetti (University of Brescia, Italy); Riccardo Leonardi (University of Brescia, Italy); Pierangelo Migliorati (University of Brescia, Italy)
In the current video analysis scenario, effective summarization of video sequences through shot clustering facilitates access to the content and helps in understanding the associated semantics. This paper introduces a generic scheme to produce hierarchical summaries of a video document starting from a dendrogram representation of clusters of shots. The evaluation of the cluster distortions, and the exploitation of the dependency relationships between clusters in the dendrograms, make it possible to obtain only a few semantically significant summaries of the whole video. Finally, the user can navigate through the summaries and decide which one best suits his/her needs for eventual post-processing. The effectiveness of the proposed method is demonstrated by testing it on a collection of video data from different kinds of programmes, using and comparing different visual features based on color information. Results are evaluated in terms of metrics that measure the content representational value of the summarization technique.
A geometric segmentation approach for the 3D reconstruction of dynamic scenes in 2D video sequences
Sebastian Knorr (Technische Universität Berlin, Germany); Evren Imre (Middle East Technical University, Turkey); Aydin Alatan (Middle East Technical University, Turkey); Thomas Sikora (Technische Universität Berlin, Germany)
In this paper, an algorithm is proposed to solve the multi-frame structure from motion (MFSfM) problem for monocular video sequences with multiple rigid moving objects. The algorithm uses the epipolar criterion to segment feature trajectories belonging to the background scene and each of the independently moving objects. As a large baseline length is essential for the reliability of the epipolar geometry, the geometric robust information criterion is employed for a key-frame selection within the sequences. Once the features are segmented, corresponding objects are reconstructed individually using a sequential algorithm that is capable of prioritizing the frame pairs with respect to their reliability and information content. The experimental results on synthetic and real data demonstrate that our approach has the potential to effectively deal with the multi-body MFSfM problem.
Dense Optical Flow Field Estimation using Recursive LMS Filtering
Mejdi Trimeche (Nokia Research Center, Finland); Marius Tico (Nokia Research Center, Finland); Moncef Gabbouj (Tampere University of Technology, Finland)
In this paper, we present a novel recursive method for pixel-based motion estimation. Assuming small displacements, we use an adaptive LMS filtering scheme to match the intensity values and calculate the displacement between two adjacent video frames. Using a sliding window from the template image, the proposed algorithm employs a simple 2-D LMS filter to adapt the corresponding set of coefficients in order to match the pixel value in the reference frame. The peak value in the resulting coefficient distribution points to the displacement between the frames at each pixel position. The experiments demonstrate good results because the filtering takes advantage of the localized correlation of image data in adjacent frames, and produces refined estimates of the displacements at sub-pixel accuracy. One particular advantage is that the proposed method is flexible and well suited for the estimation of small displacements within video frames. The proposed method can be applied in several applications such as super-resolution, video stabilization and denoising of video sequences.
Spatiotemporal Algorithm for Joint Video Segmentation and Foreground Detection
Sevket Derin Babacan (Northwestern University, USA); Thrasyvoulos Pappas (Northwestern University, USA)
We present a novel algorithm for segmenting video sequences into objects with smooth surfaces. The segmentation of image planes in the video is modeled as a spatial Gibbs-Markov random field, and the probability density distributions of temporal changes are modeled by a Mixture of Gaussians approach. The intensity of each spatiotemporal volume is modeled as a slowly varying function distorted by white Gaussian noise. Starting from an initial spatial segmentation, the pixels are classified using the temporal probabilistic model and moving objects in the video are detected. This classification is updated by Markov random field constraints to achieve smoothness and spatial continuity. The temporal model is updated using the segmentation information and local statistics of the image frame. Experimental results show the performance of our algorithm.
Contour Based Smoke Detection in Video Using Wavelets
Behcet Toreyin (Bilkent University, Turkey); Yigithan Dedeoglu (Bilkent University, Turkey); A. Enis Cetin (Bilkent University, Turkey)
This paper proposes a novel method to detect smoke in video. It is assumed that the camera monitoring the scene is stationary. Smoke is semi-transparent at the early stages of a fire; therefore edges present in image frames start losing their sharpness, and this leads to a decrease in the high-frequency content of the image. The background of the scene is estimated, and the decrease of the high-frequency energy of the scene is monitored using the spatial wavelet transforms of the current and background images. Edges of the scene produce local extrema in the wavelet domain, and a decrease in the energy content of these edges is an important indicator of smoke in the viewing range of the camera. Moreover, the scene becomes grayish when there is smoke, and this leads to a decrease in the chrominance values of pixels. Periodic behavior in smoke boundaries is also analyzed using a Hidden Markov Model (HMM) mimicking the temporal behavior of the smoke. In addition, the boundaries of smoke regions are represented in the wavelet domain, and the high-frequency nature of these boundaries is also used as a clue to model the smoke flicker.

### Wed.1.2: Image restoration and denoising - 5 papers

Room: Auditorium
Chair: Giovanni Ramponi (University of Trieste, Italy)
Signal and Image Denoising via Scale-Space Atoms
Vittoria Bruni (National Council of Researches - C.N.R, Italy); Benedetto Piccoli (C.N.R., Italy); Domenico Vitulano (C.N.R., Italy)
In this paper a scale-space model for signal and image denoising is presented. The wavelet coefficients are split into overlapping atoms and their evolution law through scales is formally derived. This formalism accounts for both inter- and intra-scale dependencies of wavelet coefficients and can therefore be exploited for denoising. Experimental results show that the proposed algorithm outperforms the most effective wavelet-based denoising techniques.
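As background, the classical baseline such wavelet denoisers are compared against is coefficient thresholding. A one-level Haar soft-thresholding sketch (a standard baseline, not the atom-based scheme of the paper):

```python
import numpy as np

def haar_soft_denoise(x, threshold):
    # One analysis level of the orthonormal Haar transform (even-length input).
    x = np.asarray(x, dtype=float)
    approx = (x[0::2] + x[1::2]) / np.sqrt(2.0)
    detail = (x[0::2] - x[1::2]) / np.sqrt(2.0)
    # Soft-threshold the detail coefficients, then invert the transform.
    detail = np.sign(detail) * np.maximum(np.abs(detail) - threshold, 0.0)
    out = np.empty_like(x)
    out[0::2] = (approx + detail) / np.sqrt(2.0)
    out[1::2] = (approx - detail) / np.sqrt(2.0)
    return out

# A constant signal has zero detail coefficients and passes through unchanged.
flat = haar_soft_denoise(np.full(8, 5.0), threshold=0.5)
```

Thresholding treats each coefficient independently; the scale-space atom model above improves on this by exploiting how coefficients evolve jointly across scales.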
Maximum a Posteriori Super Resolution Based on Simultaneous Non-Stationary Restoration, Interpolation and Fast Registration
Giannis Chantas (University of Ioannina, Greece); Nikolaos Galatsanos (University of Ioannina, Greece); Nathan Woods (Binary Machines, Inc, 1320 Tower Rd., Schaumburg, IL 60173, USA)
In this paper we propose a maximum a posteriori (MAP) framework for the super resolution problem, i.e. reconstructing high-resolution images from shifted, low-resolution degraded observations. In this framework the restoration, interpolation and registration subtasks of this problem are performed simultaneously. The main novelties of this work are the use of a new hierarchical non-stationary edge-adaptive prior for the super resolution problem, and an efficient implementation of this methodology in the discrete Fourier transform (DFT) domain. We present examples with real data that demonstrate the advantages of this methodology.
Psycho-visual Quality Assessment of state-of-the-art Denoising Schemes
Ewout Vansteenkiste (Ghent University, Belgium)
In this paper we compare the quality of 7 state-of-the-art denoising schemes based on human visual perception. 3 of those are wavelet-based filter schemes, 1 is Discrete Cosine Transform-based, 1 is Discrete Fourier Transform-based, 2 are Steerable Pyramid-based and 1 is Fuzzy Logic-based. A psycho-visual experiment was set up in which 37 subjects were asked to score and compare denoised images coming from 3 different scenes. A Multi-Dimensional Scaling framework was then used to process the data of this experiment. This led to a ranking of the filters in perceived overall image quality. In a follow-up experiment, ratings of other attributes, such as the noisiness, blurriness and artefacts present in the denoised images, also allowed us to determine why people prefer one filter over another.
A Novel Technique For Reducing Demosaicing Artifacts
Daniele Menon (University of Padova, Italy); Stefano Andriani (University of Padova, Italy); Giancarlo Calvagno (University of Padova, Italy)
Demosaicing is the process of reconstructing the full-dimension representation of an image captured by a digital camera with a color filter array. The color filter array allows only one color measurement per pixel, and the two missing colors have to be estimated. In the literature many demosaicing techniques have been proposed, but the reconstructed images are affected by visible and annoying artifacts. In this paper we propose a new effective algorithm to reduce these artifacts. This algorithm improves the performance of the demosaicing reconstruction, increasing the visual quality of the resulting images. It can be applied directly after the color interpolation, or as an off-line post-processing step to improve the image provided by the digital camera.
Virtual Restoration of Faded Photographic Prints
Vittoria Bruni (National Council of Researches - C.N.R, Italy); Giovanni Ramponi (University of Trieste, Italy)
Antique photographic prints are subject to fading due to the action of time and of diverse chemical agents. A method for the automated virtual restoration of digital images obtained from scanned photographic prints is proposed in this paper. The effects of film grain noise are also taken into consideration. Experimental results on archive photographic material demonstrate the performance of the proposed technique.

### Poster: Hardware and Implementation - 9 papers

Room: Poster Area
Chair: Kutluyil Dogancay (University of South Australia, Australia)
A New Efficient Implementation of TDAC Synthesis Filterbank Based On Radix-2 FFT
Anup K.c (Emuzed - A Flextronics Company, India); Ajay Kumar Bangla (Emuzed-A Flextronics Company, India)
An improved implementation of the TDAC synthesis filterbank is proposed, which is well suited for fixed-point implementation on embedded platforms. The new implementation is based on computation of the IMDCT by a radix-2 FFT. A new fast and regular structure has been derived by modifying the pre-processing, FFT and post-processing stages of the implementation. An efficient implementation of windowing for embedded platforms is also suggested.
An architecture for real-time design of the system for multidimensional signal analysis
Veselin Ivanovic (University of Montenegro, Serbia and Montenegro); Radovan Stojanovic (University of Montenegro, Serbia and Montenegro); Srdjan Jovanovski (University of Montenegro, Serbia and Montenegro); Ljubisa Stankovic (University of Montenegro, Serbia and Montenegro)
A multiple clock cycle hardware implementation (MCI) of a flexible system for space/spatial-frequency signal analysis is proposed. The designed special-purpose hardware can realize almost all commonly used two-dimensional space/spatial-frequency distributions (2-D S/SFDs) based on 2-D short-time Fourier transform (2-D STFT) elements. The flexibility and the ability to share the functional kernel, known as the STFT-to-SM gateway [1], across S/SFD executions represent the major advantages of this approach. These abilities enable one to optimize critical design parameters of the multidimensional system, such as hardware complexity, energy consumption, and cost.
Analysis of Steady-State Excess Mean-Square-Error of the Least Mean Kurtosis Adaptive Algorithm
Junibakti Sanubari (Satya Wacana University, Indonesia)
In this paper, the average steady-state excess mean-square error (ASEMSE) of the least mean kurtosis (LMK) adaptive algorithm is theoretically derived. This is done by applying the energy conservation behavior of adaptive filters, based on $n$-th order correlation and cumulant theory. In this way, the behavior of the recently proposed LMK algorithm can be predicted, so that it can be widely applied. The behavior is compared with that of various adaptive algorithms. Our study shows that it is possible to tune the performance of the LMK: when the step size $\mu$ is carefully selected, the LMK can outperform the LMS.
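For context, the LMS baseline that the LMK is compared against follows the update w ← w + μ e(n) x(n). A minimal noiseless system-identification sketch (this is plain LMS, not the kurtosis-based LMK update; the channel taps and step size are illustrative):

```python
import random

def lms_identify(x, d, num_taps=2, mu=0.05):
    """Plain LMS system identification: w <- w + mu * e(n) * x_vec(n)."""
    w = [0.0] * num_taps
    for n in range(num_taps - 1, len(x)):
        x_vec = [x[n - k] for k in range(num_taps)]   # current tap inputs
        y = sum(wi * xi for wi, xi in zip(w, x_vec))  # filter output
        e = d[n] - y                                  # estimation error
        w = [wi + mu * e * xi for wi, xi in zip(w, x_vec)]
    return w

random.seed(0)
true_w = [0.7, -0.3]                                  # unknown 2-tap channel
x = [random.uniform(-1, 1) for _ in range(5000)]
d = [0.0] + [true_w[0] * x[n] + true_w[1] * x[n - 1] for n in range(1, 5000)]
w = lms_identify(x, d, mu=0.05)
assert abs(w[0] - 0.7) < 0.05 and abs(w[1] + 0.3) < 0.05
```

The LMK differs in the error nonlinearity of the update term, which is what the paper's steady-state analysis characterizes.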
Bandpass/Wideband ADC architecture using parallel Delta Sigma modulators
Ali Beydoun (Supelec, France); Philippe Benabes (Supelec, France)
This paper presents a new method for digitizing wideband signals. It is based on the use of parallel analog delta-sigma modulators, where each modulator converts a part of the spectrum of the input signal. A major benefit of the architecture is that it widens the conversion band of the input signal and increases its dynamic range. Two solutions are proposed to reconstruct the signal: the first uses bandpass filters without demodulation, while the second demodulates the signal of each modulator and then processes it with a lowpass filter. This paper focuses essentially on the digital part of the system, and the overall performance is compared using simulation results.
On the Equivalence of a Reduced-Complexity Recursive Power Normalization Algorithm and the Exponential Window Power Estimation
Kutluyil Dogancay (University of South Australia, Australia)
The transform-domain least-mean-square (TD-LMS) algorithm provides significantly faster convergence than the LMS algorithm for coloured input signals. However, a major disadvantage of the TD-LMS algorithm is the large computational complexity arising from the unitary transform and power normalization operations. In this paper we establish the equivalence of a recently proposed recursive power normalization algorithm and the traditional exponential window power estimation algorithm. The proposed algorithm is based on the matrix inversion lemma and is optimized for implementation on a digital signal processor (DSP). It reduces the number of divisions from N to one for a TD-LMS adaptive filter with N coefficients. This provides a significant reduction in computational complexity for DSP implementations. The equivalence of the reduced-complexity algorithm and the exponential window power estimation algorithm is demonstrated in simulation examples.
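The exponential window power estimate referred to above is the standard recursion P(n) = λP(n−1) + (1−λ)x²(n), used to normalize the step size per transform bin in TD-LMS. A minimal sketch (the forgetting factor λ = 0.9 is an arbitrary choice, not taken from the paper):

```python
def power_estimate(samples, lam=0.9, p0=0.0):
    """Exponential-window power estimation:
    P(n) = lam * P(n-1) + (1 - lam) * x(n)^2."""
    p = p0
    history = []
    for x in samples:
        p = lam * p + (1 - lam) * x * x
        history.append(p)
    return history

# For a constant input of amplitude 2 the estimate converges to 4.
est = power_estimate([2.0] * 200, lam=0.9)
assert abs(est[-1] - 4.0) < 1e-6
```

In a TD-LMS filter, each of the N transform-domain coefficients would be updated with step μ / P_k(n), which is where the N divisions the paper eliminates come from.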
Efficient implementations of operations on runlength-represented images
Øyvind Ryan (University of Oslo, Norway)
Performance enhancements which can be obtained from processing images in runlength-represented form are demonstrated. A runlength-based image processing framework, optimized for handling bilevel images, was developed for this. The framework performs efficiently on modern computer architectures, and can apply both raster-based and runlength-based methods. Performance enhancements for runlength-based image processing in terms of memory access and cache utilization are explained.
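As a small illustration of why run-length form can pay off for bilevel images (a generic sketch, not the paper's framework), a row can be encoded as (value, run) pairs, and statistics such as foreground pixel counts can then be computed per run rather than per pixel:

```python
def rle_encode(row):
    """Encode a bilevel row as (value, run_length) pairs."""
    runs = []
    for pixel in row:
        if runs and runs[-1][0] == pixel:
            runs[-1][1] += 1          # extend the current run
        else:
            runs.append([pixel, 1])   # start a new run
    return [(v, n) for v, n in runs]

def count_foreground(runs):
    """Count foreground (1) pixels directly from the runs,
    without decoding back to a raster row."""
    return sum(n for v, n in runs if v == 1)

row = [0, 0, 1, 1, 1, 0, 1, 0, 0, 0]
runs = rle_encode(row)
assert runs == [(0, 2), (1, 3), (0, 1), (1, 1), (0, 3)]
assert count_foreground(runs) == sum(row) == 4
```

The cache benefit comes from touching one short run list per row instead of every pixel.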
Tradeoff Between Complexity and Memory Size in the 3GPP enhanced aacPlus Decoder: Speed-Conscious and Memory-Conscious Decoders on a 16-bit Fixed-Point DSP
Osamu Shimada (NEC Corporation, Japan); Toshiyuki Nomura (NEC Corporation, Japan); Akihiko Sugiyama (NEC Corporation, Japan); Masahiro Serizawa (NEC, Japan)
This paper investigates the tradeoff between complexity and memory size in the 3GPP enhanced aacPlus decoder based on a 16-bit fixed-point DSP implementation. In order to investigate this tradeoff, speed-conscious and memory-conscious decoders are implemented. The maximum number of operations for the implemented speed-conscious decoder is 29.3 million cycles per second (MCPS) for a 32 kb/s bitstream. The maximum number of operations for the memory-conscious decoder, where 70% of the data are allocated to an external memory area, increases by 5.7 MCPS (19%) for the same bitstream. The investigation of this tradeoff provides an actual relationship between the computational complexity and the internal memory size of the 3GPP enhanced aacPlus decoder. The implemented decoders enable music download and streaming services on next-generation mobile terminals.
Realization of Block Robust Adaptive Filters using Generalized Sliding Fermat Number Transform
Hamze Alaeddine (University of Brest, France); El-Houssain Baghious (Université de Bretagne Occidentale, France); Guillaume Madre (Université de Bretagne Occidentale, France); Gilles Burel (Université de Bretagne Occidentale, France)
This paper presents an efficient implementation of adaptive filtering for echo cancelers. First, a realization of an improved Block Proportionate Normalized Least Mean Squares (BPNLMS++) filter using the Generalized Sliding Fermat Number Transform (GSFNT) is presented. Unfortunately, during the double-talk mode, echo cancelers often diverge. We cope with this problem by employing a double-talk detector formed by two Voice Activity Detectors (VADs). We propose a general system based on the Robust-Block-PNLMS++ (RBPNLMS++) adaptive filter combined with a post-filter. The general system was implemented with the GSFNT, which can significantly reduce the computational complexity of the filter implementation on a digital signal processor (DSP).
FPGA Architecture for Object Segmentation in Real Time
Jozias de Oliveira (Genius Institute of Technology, Brazil); Andre Printes (Genius Institute of Technology, Algeria); Raimundo Freire (Federal University of Campina Grande, Brazil); Elmar Melcher (Federal University of Campina Grande, Algeria); Ivan Sebastião (Federal University of Para, Algeria)
Object segmentation from a video sequence is a function necessary for many applications of artificial vision systems such as: video surveillance, traffic monitoring, detection and tracking for video teleconferencing, video editing, etc. In this paper, we present an architecture for object segmentation, taking advantage of the data and logical parallel opportunities offered by a field programmable gate array (FPGA) architecture. At a clock rate of 40 MHz, the architecture can process 30 frames per second, where the image resolution is 240 × 120.

### Poster: Video and Image Coding - 19 papers

Room: Poster Area
Chair: Murat Tekalp (Koc University, Istanbul, Turkey)
QoS Mapping for Fine Granular Scalability with Base Layer Scaling
Masayuki Inoue (NTT Corporation, Japan)
Scalable video coding is receiving increasing attention. However, few papers address QoS mapping with regard to scalable video coding. This paper uses Fine Granular Scalability with base-layer scaling, which offers application-level QoS control. An experiment is conducted that uses the DSIS method to subjectively assess Fine Granular Scalability. The results show that the maximum of the mean grading point fell as the base-layer size decreased. Our results also show that Fine Granular Scalability provides only SNR scalability, not subjective image-quality scalability. A multiple regression analysis shows that we can establish QoS mapping between the user and application levels by using both application-level QoS parameters and a feasible human factor. Furthermore, we show that FGS can adapt not only to a wide range of channel capacities but also to a wide variety of users by using the indicated QoS mapping.
Quality evaluation model using local features of still picture
Yuukou Horita (University of Toyama, Japan)
An objective quality evaluation model for coded still pictures that does not use a reference is very useful for quality-oriented image compression. In this paper, a new objective no-reference (NR) picture quality evaluation model for JPEG is presented, which is easy to calculate and applicable to various image coding applications. The proposed model is based on local features of the picture such as edge, flat and texture areas, and also on the blockiness, activity measures, and zero-crossing rate within blocks of the picture. Our experiments on various picture distortion types indicate that it performs significantly better than the conventional model.
An Adaptive Error Concealment Mechanism for H.264 Encoded Low-Resolution Video Streaming
Olivia Nemethova (Vienna University of Technology, Austria); Ameen Al-Moghrabi (Vienna University of Technology, Austria); Markus Rupp (TU Wien, Austria)
The H.264 video codec is well suited to real-time, error-resilient transport over packet-oriented networks. In real-time communications, lost packets at the receiver cannot be avoided. Therefore, it is essential to design efficient error concealment methods that visually reduce the degradation caused by the missing information. Each method has its own reconstruction quality. We implemented several efficient error concealment techniques and investigated their performance in different scenarios. As a result, we propose and evaluate an adaptive error concealment mechanism that achieves both good performance and low complexity, enabling deployment in mobile video streaming applications. This mechanism selects a suitable error concealment method according to the amount of instantaneous spatial and temporal information in the video sequence and according to the type of the frame.
An Adaptive Color Transform Approach and its Application in 4:4:4 Video Coding
Detlev Marpe (Fraunhofer HHI, Germany); Heiner Kirchhoffer (Fraunhofer HHI, Germany); Valeri George (Fraunhofer HHI, Germany); Peter Kauff (Fraunhofer HHI, Germany); Thomas Wiegand (HHI/FhG, Germany)
This paper deals with an approach that extends block-based video-coding techniques by an adaptive color space transform. The presented technique allows the encoder to switch between two given color space representations with the objective of maximizing the overall rate-distortion gain. Simulations based on the current draft of the H.264/MPEG4-AVC 4:4:4 extensions demonstrate that our technique guarantees a rate-distortion performance equal to or better than that obtained when coding in either of the two fixed color spaces.
Spatio-Temporal Filter for ROI Video Coding
Linda Karlsson (Mid Sweden University, Sweden); Mårten Sjöström (Mid Sweden University, Sweden); Roger Olsson (Mid Sweden University, Sweden)
Reallocating resources within a video sequence to the regions-of-interest increases the perceived quality at limited bandwidths. In this paper we combine a spatial filter with a temporal filter, which are both codec and standard independent. This spatio-temporal filter removes resources from both the motion vectors and the prediction error with a computational complexity lower than the spatial filter by itself. This decreases the bit rate by 30-50% compared to coding the original sequence using H.264. The released bits can be used by the codec to increase the PSNR of the ROI by 1.58-4.61 dB, which is larger than for the spatial and temporal filters by themselves.
Pdf sharpening for multichannel predictive coders
Predictive coders that split the prediction decision into contexts depending on local image behaviour have proved to be practically useful and successful in image coding applications. Such predictive coders can be called multichannel. LOCO is a simple yet successful example of such coders. Owing to its success, a fair amount of attention has been paid to the improvement of multichannel predictive coders. The common task for these coders is to split the pixel layout around the pixel of interest into a list of contexts, or prediction rules, each of which succeeds in predicting the value in a reasonable way. The improvement proposed in this work stems from the well-known observation that the prediction-error pdfs are not identically or evenly distributed across channel outputs. Although several methods have been proposed to compensate for this, they mostly perturb the low-complexity behaviour. In this work, it is shown that a two-pass coder is a simple yet efficient improvement that exactly determines the channel pdf bias amounts, and the adjustment yields up to 5% compression improvement on the test images.
An Improved Error Concealment Strategy Driven by Scene Motion Properties for H.264/AVC Decoders
Susanna Spinsante (Polytechnic University of Marche, Italy); Ennio Gambi (Università Politecnica delle Marche, Italy); Franco Chiaraluce (Universita' Politecnica delle Marche, Italy)
This paper deals with the possibility of improving the concealment effectiveness of an H.264 decoder by means of the integration of a scene change detector. This way, the selected recovery strategy is driven by the detection of a change in the scene, rather than by the coding features of each frame. The scene detection algorithm under evaluation has been chosen from the technical literature, but a thorough analysis of its performance, over a wide range of video sequences with different motion properties, has suggested simple but effective modifications, which provide better results in terms of final perceived video quality.
H.264 Encoding of Videos with Large Number of Shot Transitions Using Long-Term Reference Pictures
Nukhet Ozbek (Ege University, Turkey); A. Murat Tekalp (Koc University, Turkey)
Long-term reference prediction is an important feature of the H.264 standard, which provides a trade-off between gain and complexity. A simple long-term reference selection method is presented for videos with frequent shot/view transitions in order to optimize compression efficiency at the shot boundaries. Experimental results show up to 50% reduction in the number of bits, at the same PSNR, for frames at the border of transitions.
Embedded Image Processing/Compression For High-Speed CMOS Sensor
Romuald Mosqueron (University of Burgundy, France); Julien Dubois (university of Burgundy, France); Michel Paindavoine (Université de Bourgogne, France)
High-speed video cameras are powerful tools for investigating, for instance, biomechanics or the movements of mechanical parts in manufacturing processes. In recent years, the use of CMOS sensors instead of CCDs has made possible the development of high-speed video cameras offering digital outputs, readout flexibility and lower manufacturing costs. In this paper, we propose a high-speed camera based on a CMOS sensor with embedded processing. Two types of algorithms have been implemented. The compression algorithm represents the first class for our camera and allows images to be transferred over a serial output link. The second type is dedicated to feature extraction such as edge detection, marker extraction, image analysis, wavelet analysis and object tracking. These image processing algorithms have been implemented in an FPGA embedded inside the camera. This FPGA technology allows us to process 500 images per second in real time at a resolution of 1,280 × 1,024. Keywords: CMOS Image Sensor, FPGA, Image Compression, High-speed Video.
Improving Wyner-Ziv Video coding by block-based distortion estimation
Zouhair Belkoura (Technische Universität Berlin, Germany); Thomas Sikora (Technische Universität Berlin, Germany)
Conventional video coding uses motion estimation to perform adaptive linear predictive encoding. Wyner-Ziv coding does not use predictive coding but performs motion estimation at the decoder. Recent work uses a difference signal at the encoder to estimate the prediction quality at the decoder. In this paper, we recognise that this operation constitutes a step in the motion estimation process. We exploit this information by omitting suitable blocks, effectively implementing linear predictive coding with deadzone quantisation for parts of the input signal. This modified Wyner-Ziv coding results in large bitrate reductions as well as significant decoding complexity decrease. At certain bitrates, our modified Wyner-Ziv codec outperforms conventional hybrid coding in an I-B-I-B setup.
A low-complexity multiple description video coder based on 3D-transforms
Andrey Norkin (University of Tampere, Finland); Atanas Gotchev (Tampere University of Technology, Finland); Karen Egiazarian (Tampere University of Technology, Finland); Jaakko Astola (Tampere University of Technology, Finland)
The paper presents a multiple description (MD) video coder based on three-dimensional (3D) transforms. The coder has low computational complexity and high robustness to transmission errors and is targeted at mobile devices. The encoder represents the video sequence as a coarse sequence approximation (shaper), included in both descriptions, and a residual sequence (details), split between the two descriptions. The shaper is obtained by block-wise pruned 3D-DCT. The residual sequence is coded by 3D-DCT or a hybrid 3D transform. The coding scheme is simple, yet it outperforms some plain MD coders based on H.263 in lossy environments, especially in the low-redundancy region.
Adaptive Interpolation Algorithm for Fast and Efficient Video Encoding in H.264
Gianluca Bailo (University of Genova, Italy); Massimo Bariani (University of Genova, Italy); Chiappori Andrea (University of Genova, Italy); Riccardo Stagnaro (University of Genova, Italy)
H.264/MPEG-4 AVC is the latest video-coding standard, jointly developed by the VCEG (Video Coding Experts Group) of ITU-T and the MPEG (Moving Picture Experts Group) of ISO/IEC. It uses state-of-the-art video signal processing algorithms providing enhanced efficiency, compared with previous standards, for a wide range of applications including video telephony, video conferencing, video surveillance, storage, streaming video, digital video editing and creation, digital cinema and others. In order to reduce the bitrate of the video signal in H.264, the ISO and ITU coding standards use a ¼-pel displacement resolution. A 6-tap Wiener filter is used to obtain half-pel samples, which are then averaged to achieve the quarter-pel interpolation. H.264 saves 50% bit-rate while maintaining the same quality compared with existing video coding standards, but this result demands additional computational complexity. In this paper, we propose an algorithm for reducing the interpolation computation time. The goal is to adapt the H.264 ¼-pel interpolation to the complexity of the video stream to be encoded, on the basis of our motion detection algorithm. The proposed solution decreases the overall encoder complexity for both low- and high-complexity sequences. This paper illustrates the integration of the H.264 encoder with our motion detection algorithm for the development of an adaptive interpolation. The obtained results are compared with the jm86 standard interpolation using different quantization values.
Efficient Image Registration With Subpixel Accuracy
Irene Karybali (University of Patras, Greece); Emmanouil Psarakis (University of Patras, Greece); Kostas Berberidis (University of Patras, Greece); Georgios Evangelidis
The contribution of this paper is twofold. First, a new spatial domain image registration technique with subpixel accuracy is presented. This technique is based on a double maximization of the correlation coefficient and provides a closed-form solution to the subpixel translation estimation problem. Second, an efficient iterative scheme for integer registration is proposed, which reduces significantly the number of searches, as compared to the exhaustive search. This scheme can be used as a pre-processing step in the sub-pixel accuracy technique, leading to lower computational complexity. Extensive simulation results have shown that the performance of the proposed technique compares very favorably with respect to existing ones.
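A one-dimensional sketch of the integer-registration idea, assuming matching is done by exhaustively maximizing the correlation coefficient (the paper's contribution is precisely avoiding this exhaustive search and refining to subpixel accuracy, neither of which is shown here; the signals are illustrative):

```python
def corr_coeff(a, b):
    """Pearson correlation coefficient between two equal-length windows."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5

def best_integer_shift(template, patch):
    """Exhaustive integer search: slide the patch along the template
    and keep the offset maximizing the correlation coefficient."""
    w = len(patch)
    scores = [corr_coeff(template[s:s + w], patch)
              for s in range(len(template) - w + 1)]
    return max(range(len(scores)), key=scores.__getitem__)

template = [1, 2, 5, 9, 5, 2, 1, 0, 0, 0]
patch = template[3:7]                      # known true offset: 3
assert best_integer_shift(template, patch) == 3
```

The correlation coefficient, unlike plain SSD, is invariant to brightness and contrast changes between the two images, which is why it is a popular matching measure.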
H.264 video coding for low bit rate error prone channels: an application to Tetra systems
Stefania Colonnese (Università "La Sapienza" di Roma, Italy); Alessandro Piccoli (University of Rome "La Sapienza", Italy); Claudio Sansone (University of Rome "La Sapienza", Italy); Gaetano Scarano (Università "La Sapienza" di Roma, Italy)
This work investigates an H.264 coding scheme for video transmission over packet networks characterized by heavy packet losses and low available bitrate, such as those encountered on channels that were originally designed for voice and limited data services. The H.264 resilient coding tools, such as Flexible Macroblock Ordering, Redundant Slices and Arbitrary Slice Ordering, are here tuned in order to adapt the application-layer parameters to the physical-layer characteristics. Due to the limited bandwidth, the tools are differentiated over a Region Of Interest (ROI). Moreover, the Redundant Slices tool is integrated with suitable application-level interleaving to counteract the bursty nature of the errors. The performance of the codec design choices is assessed on a TETRA communication channel, which is quite challenging due to both limited bandwidth and severe error conditions. However, the illustrated codec design criteria can be adopted in different low bit-rate, error-prone channels.
Spatio-temporal selective extrapolation for 3-D signals applied to concealment in video communications
Katrin Meisinger (University of Erlangen-Nuremberg, Germany); Sandra Martin (University of Erlangen-Nuremberg, Germany); André Kaup (University of Erlangen-Nürnberg, Germany)
In this paper we derive a frequency-selective extrapolation method for three-dimensional signals. Extending a signal beyond a limited number of known samples is commonly referred to as signal extrapolation. We provide an extrapolation technique that estimates image areas by simultaneously exploiting spatial and temporal correlations of the video signal. Lost areas caused by transmission errors are concealed by extrapolation from their surroundings. Conventionally, the missing areas in a video sequence are estimated from either the spatial or the temporal surroundings alone. Our approach approximates the known signal by a weighted linear combination of 3-D basis functions spanning both the spatial and temporal directions and extrapolates it into the missing area. The algorithm is able to extrapolate smooth and structured areas and to inherently compensate motion and changes in luminance from frame to frame.
A scalable SPIHT-based multispectral image compression
Fouad Khelifi (Queen's University Belfast, United Kingdom); Ahmed Bouridane (Queen's University, United Kingdom); Fatih Kurugollu (Queen's University Belfast, United Kingdom)
This paper addresses the compression of multispectral images, which can be viewed, at the encoder side, as a three-dimensional (3D) data set characterized by high correlation across successive bands. Recently, the celebrated 3D-SPIHT (Set Partitioning In Hierarchical Trees) algorithm has been widely adopted in the literature for the coding of multispectral images because of its proven state-of-the-art performance. In order to exploit the spectral redundancy in the 3D wavelet transform domain, a new scalable SPIHT-based multispectral image compression technique is proposed. The rationale behind this approach is that image components in two consecutive transformed bands are significantly dependent in terms of zerotree locations in the 3D-DWT domain. Therefore, by joining the trees with the same location into the List of Insignificant Sets (LIS), a considerable number of bits can be saved in the sorting pass in comparison with the separate encoding of the transformed bands. Numerical experiments on two sample multispectral images show a markedly better performance of the proposed technique when compared to the conventional 3D-SPIHT.
Temporal And Spatial Scaling For Stereoscopic Video Compression
Anil Aksay (Middle East Technical University, Turkey); Cagdas Bilen (Middle East Technical University, Turkey); Engin Kurutepe (Koc University, Turkey); Tanir Ozcelebi (Koc University, Turkey); Gozde Bozdagi Akar (Middle East Technical University, Turkey); Reha Civanlar (Koc University, Turkey); A. Murat Tekalp (Koc University, Turkey)
In stereoscopic video, it is well-known that compression efficiency can be improved, without sacrificing PSNR, by predicting one view from the other. Moreover, additional gain can be achieved by subsampling one of the views, since the Human Visual System can perceive high frequency information from the other view. In this work, we propose subsampling of one of the views by scaling its temporal rate and/or spatial size at regular intervals using a real-time stereoscopic H.264/AVC codec, and assess the subjective quality of the resulting videos using DSCQS test methodology. We show that stereoscopic videos can be coded at a rate about 1.2 times that of monoscopic videos with little visual quality degradation.
Linear and Nonlinear Temporal Prediction Employing Lifting Structures for Scalable Video Coding
Behcet Toreyin (Bilkent University, Turkey); Maria Trocan (ENST, France); Béatrice Pesquet (Ecole Nationale Supérieure des Télécommunications, France); Enis Çetin (Bilkent University, Turkey)
Scalable 3D video codecs based on wavelet lifting structures have recently attracted a lot of attention, due to their compression performance, comparable with that of state-of-the-art hybrid codecs. In this work, we propose a set of linear and nonlinear predictors for the temporal prediction step in the lifting implementation. The predictor uses pixels on the motion trajectories of the frames in a window around the pixel to be predicted to improve the quality of prediction. Experimental results show that the video quality as well as PSNR values are improved with the proposed prediction method.
An H.264-Based Video Encoding Scheme for 3D TV
This paper presents an H.264-based scheme for compressing 3D content captured by 3D depth-range cameras. Existing MPEG-2-based schemes take advantage of the correlation between the 2D video sequence and its corresponding depth map sequence, and reuse the 2D motion vectors (MVs) for the depth video sequence. This improves the speed of encoding the depth map sequence, but it results in an increase in the bitrate or a drop in the quality of the reconstructed 3D video. This is found to be due to the MVs of the 2D video sequence not being the best choice for encoding some parts of the depth map sequence containing sharp edges or corresponding to distant objects. To solve this problem, we propose an H.264-based method which re-estimates the MVs and re-selects the appropriate modes for these regions. Experimental results show that the proposed method enhances the quality of the encoded depth map sequence by an average of 1.77 dB. Finding the MVs of the sharp-edged regions of the depth map sequence amounts to 30.64% of the computational effort needed to calculate MVs for the whole depth map sequence.

### Poster: Source Localization - 6 papers

Room: Poster Area
Chair: Marco Luise (University of Pisa, Italy)
A Novel Signal Subspace Approach for Mobile Positioning with Time-of-Arrival Measurements
Hing Cheung So (City University of Hong Kong, Hong Kong); Kit Wing Chan (City University of Hong Kong, Hong Kong)
The problem of locating mobile terminals has recently received considerable attention, particularly in the field of wireless communications. In this paper, a simple signal-subspace-based algorithm is devised for mobile positioning with the use of time-of-arrival (TOA) measurements received at three or more reference base stations. Computer simulations are included to contrast the estimator performance with the Cramér-Rao lower bound and with computationally attractive TOA-based localization methods from the literature.
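For contrast with the subspace approach, the classical linearized least-squares TOA solution with three base stations can be sketched as follows; the station layout and noiseless ranges are illustrative assumptions, not the paper's setup:

```python
def toa_position(stations, ranges):
    """Linearized TOA localization (2-D, three stations).
    Subtracting the first range equation from the others yields a
    linear system A [x, y]^T = b, solved here by Cramer's rule."""
    (x1, y1), (x2, y2), (x3, y3) = stations
    r1, r2, r3 = ranges
    a11, a12 = 2 * (x2 - x1), 2 * (y2 - y1)
    a21, a22 = 2 * (x3 - x1), 2 * (y3 - y1)
    b1 = r1**2 - r2**2 + x2**2 - x1**2 + y2**2 - y1**2
    b2 = r1**2 - r3**2 + x3**2 - x1**2 + y3**2 - y1**2
    det = a11 * a22 - a12 * a21
    return ((b1 * a22 - b2 * a12) / det, (a11 * b2 - a21 * b1) / det)

stations = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
true_pos = (3.0, 4.0)
ranges = [((sx - true_pos[0])**2 + (sy - true_pos[1])**2) ** 0.5
          for sx, sy in stations]
x, y = toa_position(stations, ranges)
assert abs(x - 3.0) < 1e-9 and abs(y - 4.0) < 1e-9
```

With noisy ranges the same linear system would be solved in the least-squares sense over more than three stations; subspace methods aim to do better than this simple linearization.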
Bayesian Mobile Location in Cellular Networks
Mohamed Khalaf-Allah (University of Hannover, Germany); Kyandoghere Kyamakya (Department of Informatics Systems, University of Klagenfurt, Austria)
Determining the location of mobile stations can be achieved by collecting signal strength measurements and correlating them with pre-calculated signal strength values at reference locations. This method is advantageous because no LOS conditions are needed, it can work even with a single base station (BS), and its implementation costs are low. However, the correlation process needs an appropriate likelihood function such as that provided by Bayesian statistical estimation approaches, which use all available information surrounding candidate hypotheses to determine their likelihoods. In this paper, we present a Bayesian mobile location algorithm and show its performance using field measurements in a working GSM network.
Position and Velocity Recovery from Independent Ultrasonic Beacons
Michael McCarthy (University of Bristol, United Kingdom); Henk Muller (University of Bristol, United Kingdom); Andrew Calway (University of Bristol, United Kingdom); R. Wilson (University of Bristol, United Kingdom)
In this paper we present a study on how to estimate the position of a mobile receiver using ultrasonic beacons fixed in the environment. Unlike traditional approaches, the ultrasonic beacons are independent, and positioning is performed by measuring the Doppler shift within their observed periods. We show that this approach allows us to deduce both position and velocity, but an analysis of the solution space indicates that we can recover the direction of velocity very well, the magnitude of velocity less well, and that location estimation is the least accurate. Based on the characteristics of the solution space, we suggest a method for improving positioning accuracy.
Spectral Separation Coefficients for digital GNSS receivers
Daniele Borio (Politecnico di Torino, Italy); Letizia Lo Presti (Politecnico di Torino, Italy); Paolo Mulassano (Istituto Superiore Mario Boella, Italy)
The extreme weakness of a GNSS (Global Navigation Satellite System) signal makes it vulnerable to almost every kind of interference, which can be radically different in terms of time and frequency characteristics. For this reason, a consistent theory allowing comparative analysis was needed, and the concepts of effective C/N0 and SSCs (Spectral Separation Coefficients) were introduced as reliable measures of interference-induced degradation. However, these parameters were defined only in the analog domain, without considering specific features due to digital synthesis. In this article, an alternative derivation for the analog case and the extension to digital devices are provided. The analysis focuses in particular on the acquisition block, the first element of a GNSS receiver, which provides a rough estimate of code delay and Doppler shift. The innovative approach presented in the paper interprets effective C/N0 and SSCs in terms of ROCs (Receiver Operating Characteristics), showing how system performance strictly depends on these parameters.
Cramer-Rao Bounds for Source Localization in Shallow Ocean with Generalized Gaussian Noise
Pramod N. C. (Indian Institute of Science, India); Anand G. V. (Indian Institute of Science, India)
Localization of underwater acoustic sources is a problem of great interest in ocean acoustics, and several algorithms based on array signal processing exist for it. It is of interest to know the theoretical performance limits of these estimators for a given task. In this paper we develop a general expression for the Cramer-Rao bound (CRB) for DOA and range-depth estimation of underwater acoustic sources in a shallow range-independent ocean in the case of generalized Gaussian noise. The CRB provides the theoretical performance limit of any estimator for source localization in a shallow ocean. We then study, through simulations, the performance of some popular source localization techniques for DOA and range-depth estimation of underwater acoustic sources by comparing their estimates with the corresponding CRBs.
Unscented Kalman Filter For Location In Non-Line-Of-Sight
Marc Bosch (UPC, Spain); Montse Najar (UPC, Spain)
This paper deals with the problem of Non-Line-Of-Sight (NLOS) propagation in wireless communication systems devoted to location purposes. It is well known that the NLOS condition, mainly due to blocking of the transmitted signal by obstacles, biases Time-Of-Arrival (TOA) estimates and therefore yields biased position estimates. The objective of this paper is to analyze the improvement in positioning accuracy obtained by tracking the TOA bias with the Unscented Kalman Filter (UKF) proposed for location estimation. The approach is evaluated in Impulse Radio Ultra-Wideband (IR-UWB) communication systems.

### Wed.6.2: Radar Detection - 5 papers

Room: Room 4
Chair: Maria Greco (University of Pisa, Italy)
GLRT-Based Detection-Estimation of Gaussian Signals in Under-Sampled Training Conditions
Ben Johnson (JORN Technical Director, Australia); Yuri Abramovich (Cooperative Research Centre for Sensor Signal and Information Processing, Australia)
A likelihood ratio test has recently been developed for detection-estimation in under-sampled scenarios where the number of training data samples T is less than the number of antenna elements M. This test can be applied in a GLRT detection-estimation framework to many problems not satisfactorily addressed by conventional techniques. In particular, we consider the under-sampled MUSIC performance-breakdown phenomenon for independent sources, and use the under-sampled likelihood ratio to detect the presence of MUSIC outliers.
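For readers unfamiliar with MUSIC, the estimator whose breakdown the paper studies can be sketched for a uniform linear array with half-wavelength spacing; this toy scenario and the function name are ours, not the authors', and it ignores the under-sampled regime (here the sample count exceeds the array size).

```python
import numpy as np

def music_spectrum(R, n_sources, angles, m):
    """MUSIC pseudospectrum for an m-element half-wavelength ULA (sketch)."""
    w, V = np.linalg.eigh(R)             # eigenvalues in ascending order
    En = V[:, :m - n_sources]            # noise subspace: smallest eigenvalues
    p = []
    for th in angles:
        a = np.exp(1j * np.pi * np.arange(m) * np.sin(th))  # steering vector
        p.append(1.0 / np.linalg.norm(En.conj().T @ a) ** 2)
    return np.array(p)

m, n = 8, 200
theta = np.deg2rad(20.0)                 # one source at 20 degrees
a = np.exp(1j * np.pi * np.arange(m) * np.sin(theta))
rng = np.random.default_rng(0)
s = rng.standard_normal(n) + 1j * rng.standard_normal(n)
noise = 0.1 * (rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n)))
X = np.outer(a, s) + noise
R = (X @ X.conj().T) / n                 # sample covariance matrix
grid = np.deg2rad(np.linspace(-90, 90, 361))
spec = music_spectrum(R, n_sources=1, angles=grid, m=m)
est = np.rad2deg(grid[np.argmax(spec)])  # DOA estimate from the peak
```

When T < M the sample covariance is rank deficient, the noise-subspace estimate degrades, and spurious peaks (the "outliers" the paper detects) appear.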
On a SIRV-CFAR detector with radar experimentations in impulsive noise
Frederic Pascal (ONERA, France); Jean-Philippe Ovarlez (ONERA, France); Philippe Forster (Université Paris X - GEA, France); Pascal Larzabal (SATIE ENS-Cachan, France)
This paper deals with radar detection in impulsive noise. Its aim is twofold. Firstly, assuming a Spherically Invariant Random Vector (SIRV) model for the noise, the corresponding unknown covariance matrix is estimated by a recently introduced algorithm [1, 2]. A statistical analysis (bias, consistency, asymptotic distribution) of this estimate is summarized, allowing us to establish the GLRT properties: the SIRV-CFAR (Constant False Alarm Rate) property, i.e. texture-CFAR and covariance-matrix-CFAR, and the relationship between the Probability of False Alarm (PFA) and the detection threshold. Secondly, one of the main contributions of this paper is to give results obtained with real non-Gaussian data. These results demonstrate the value of the proposed detection scheme and show an excellent correspondence between experimental and theoretical false alarm rates.
Permutation Detectors under Correlated K-Distributed Clutter in Radar Applications
In this paper, we analyze the performance of some permutation tests (PTs) under a correlated K-distributed clutter model and nonfluctuating and Swerling II target models. Also, we compare the PTs results against their parametric counterparts under the same conditions. We analyze the detector performances in terms of detection probability (Pd) versus signal-to-clutter ratio (SCR) for different parameter values: the number of integrated pulses (N), the clutter reference samples (M), the false alarm probability (Pfa), the shape parameter (v) of the K-distributed clutter and the clutter correlation coefficients.
GLRT detection for range and doppler distributed targets in non-Gaussian clutter
Nicolas Bon (Laboratoire E3I2-EA 3876, France); Ali Khenchaf (Laboratoire E3I2-EA3876, France); Jean Michel Quellec (Thales Airborne Systems, France); René Garello (GET-ENST Bretagne, France)
A generalized likelihood ratio test (GLRT) is derived for adaptive detection of range- and Doppler-distributed targets. The clutter is modelled as a Spherically Invariant Random Process (SIRP) whose texture component is range dependent (heterogeneous clutter). We assume that the speckle-component covariance matrix is known or estimated from a secondary data set. The unknown parameters to be estimated are therefore the local texture values and the complex amplitudes and frequencies of all scattering centers. The proposed detector assumes *a priori* knowledge of the spatial distribution of the target and has the valuable Constant False Alarm Rate (CFAR) property under the assumption of a known speckle covariance matrix or with the use of frequency agility.
A Hybrid STAP Approach for Radar Target Detection in Heterogeneous Environments
Elias Aboutanios (University of Edinburgh, United Kingdom); Bernard Mulgrew (The University of Edinburgh, United Kingdom)
We address the problem of radar target detection under clutter heterogeneity. Traditional approaches, or two-data set (TDS) algorithms, require a training data set in order to estimate the interference covariance matrix and implement the adaptive filter. When the training data exhibit statistical heterogeneity with respect to the test data, TDS detectors suffer a degradation in performance. Single-data set (SDS) detectors have been proposed to deal with this problem by operating solely on the test data. In this paper, we propose a novel hybrid approach that combines the SDS and TDS algorithms, taking the degree of heterogeneity into account. We derive the hybrid detectors and propose the use of the generalised inner product as a heterogeneity measure. We also give expressions for their probabilities of false alarm and detection under heterogeneous assumptions. Simulation results show that the new detectors outperform the other algorithms in both homogeneous and heterogeneous interference.

### Wed.4.2: Equalization I - 5 papers

Room: Sala Onice
Chair: Kutluyil Dogancay (University of South Australia, Australia)
Two Nonequivalent Structures for Widely-Linear Decision-Feedback MMSE Equalization over MIMO Channels
Fabio Sterle (University of Naples Federico II, Italy); Davide Mattera (Università degli Studi di Napoli Federico II, Italy); Luigi Paura (Università di Napoli Federico II, Italy)
The paper deals with the design of an equalizer based on widely linear processing combined with the decision-feedback (DF) procedure, operating over a time-dispersive multiple-input multiple-output channel. A basic issue concerns the choice between two widely-linear/widely-linear decision-feedback structures: the former is based on the complex-valued representation of the involved signals, whereas the latter utilizes the real-valued representation. In previous contributions, both structures have been used interchangeably since, in the scenarios considered there, they turned out to be equivalent. In this paper, we show that there is an important scenario in which the two structures are not equivalent. To compare them fairly, the issue of decision-error propagation has been addressed. An extensive set of experimental results shows that the equalizer based on the real-valued signal representation significantly outperforms both its complex-valued counterpart and the conventional DF equalizer when the effects of decision errors in the feedback filters are taken into account.
Tiziano Bianchi (University of Florence, Italy); Fabrizio Argenti (University of Florence, Italy)
In this paper, we address the problem of equalization for filterbank transceivers in the presence of a dispersive time-variant channel. Filterbank transceivers can be adapted to the channel transfer function to achieve intersymbol interference (ISI) cancellation; however, when the channel is time-variant, the transceiver must be changed as the channel evolves. Here, we allow both the transmitter and the receiver to change in order to satisfy the interference-free condition, under the assumption of zero-padded block transmission. Two transmitter-receiver pairs are proposed using a singular value decomposition (SVD) of the channel matrix, periodically adapted to the channel state by means of an SVD tracking algorithm. Simulation results show that the proposed approach achieves minimal performance loss with respect to the ideal receiver, while clearly outperforming systems based on a constant transmitter.
Widely-linear fractionally-spaced blind equalization of frequency-selective channels
Angela Sara Cacciapuoti (Università degli Studi di Napoli Federico II, Italy); Giacinto Gelli (University of Napoli - Federico II, Italy); Luigi Paura (Università di Napoli Federico II, Italy); Francesco Verde (Università degli Studi di Napoli Federico II, Italy)
This paper deals with the problem of designing widely-linear (WL) fractionally-spaced (FS) finite-impulse response equalizers for both real- and complex-valued improper modulation schemes. Specifically, the synthesis of both WL-FS minimum mean-square error and zero-forcing equalizers is discussed, by deriving the mathematical conditions assuring perfect symbol recovering in the absence of noise. In a general framework, the feasibility of designing WL-FS blind equalizers based on the constant modulus criterion is also investigated, and both unconstrained and constrained designs are provided. The effectiveness of the proposed equalizers is corroborated by means of computer simulation results.
Low Complexity Turbo Space-Frequency Equalization for Single-carrier MIMO Wireless Communications
Ye Wu (The University of Liverpool, United Kingdom); Xu Zhu (University of Liverpool, United Kingdom); Asoke Nandi (The University of Liverpool, United Kingdom); Yi Huang (University of Liverpool, United Kingdom)
Turbo equalization and frequency-domain equalization (FDE) have both been proved effective against frequency-selective fading channels. By combining the two techniques, we propose a low-complexity Turbo space-frequency equalization (TSFE) structure for single-carrier (SC) multiple-input multiple-output (MIMO) systems, which provides performance close to its full-complexity version at a much lower complexity. It is shown that TSFE outperforms the previously proposed Turbo space-time equalization (TSDE), especially at a high delay spread, with much lower complexity. TSFE also provides better performance than its Turbo orthogonal frequency division multiplexing (TOFDM) counterpart as the number of iterations increases, at a comparable complexity.
Gaussian processes for regression in channel equalization
Sebastián Caro (Universidad de Sevilla, Spain); Fernando Pérez-Cruz (Universidad Carlos III de Madrid, Spain); Juan José Murillo-Fuentes (Departamento de Teoría de la Señal y Comunicaciones, Universidad de Sevilla, Spain)
In linear channels with additive white noise, linear symbol-decision equalizers underperform. In this paper we present Gaussian Processes (GPs) for regression as a new non-linear equalizer for digital communications. GPs can be cast as a non-linear MMSE, a common criterion in digital communications. Unlike similar non-linear kernel-based methods such as the kernel adaline (KA) or SVMs, the solution of GPs is analytical, and the hyperparameters can easily be learnt by maximum likelihood. Hence, by using GPs we avoid cross-validation and noise estimation, and improve convergence speed. We present experimental results on linear and non-linear channels showing that GPs clearly outperform linear and non-linear state-of-the-art solutions.

### Wed.3.2: Channel Coding and Decoding - 5 papers

Room: Sala Verde
Chair: Luc Vandendorpe (Université catholique de Louvain, Belgium)
New Full-Diversity High-Rate Space-Time Block Codes Based on Selective Power Scaling
Sushanta Das (University of Texas at Dallas, USA); Naofal Al-Dhahir (University of Texas at Dallas, USA); A. Robert Calderbank (Princeton University, USA); Jimmy Chui (Princeton University, USA)
We design a new rate-5/4 full-diversity orthogonal STBC for QPSK and 2 transmit antennas by enlarging the signalling set from the set of quaternions used in the Alamouti code. Selective power scaling of information symbols is used to guarantee full-diversity while maximizing the coding gain and minimizing the transmitted signal peak-to-minimum power ratio. The optimum power scaling factor is derived analytically and shown to outperform schemes based on only constellation rotation while still enjoying a low-complexity ML decoding algorithm. Finally, we extend our designs to the case of 4 transmit antennas by enlarging the set of Quasi-Orthogonal STBC with power scaling. Extensions to general M-PSK constellations are straightforward.
Serial LDPC Decoding on a SIMD DSP Using Horizontal-Scheduling
Marco Gomes (University of Coimbra, Portugal); Vitor Silva (Institute of Telecommunications, Portugal); Cláudio Neves (University of Coimbra, Portugal); Ricardo Marques (University of Coimbra, Portugal)
In this paper we propose an efficient vectorized low-density parity-check (LDPC) decoding scheme based on the min-sum algorithm and the horizontal scheduling method. The well-known forward-backward algorithm used in the check-node message update is also improved. Results are presented for 32- and 16-bit log-likelihood ratio message representations on a modern high-performance fixed-point DSP. The single instruction multiple data (SIMD) feature was exploited in the 16-bit case. Both regular and irregular codes are considered.
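As a reference point for the min-sum algorithm the decoder builds on, here is the standard two-minima check-node update in plain NumPy (a didactic sketch, unrelated to the paper's DSP implementation or its improved forward-backward variant):

```python
import numpy as np

def check_node_min_sum(msgs):
    """Min-sum check-node update.

    msgs : LLR messages arriving at one check node.
    The outgoing message on edge j has the product of the other edges'
    signs and the minimum of the other edges' magnitudes; computing the
    two smallest magnitudes once serves every edge.
    """
    msgs = np.asarray(msgs, dtype=float)
    signs = np.sign(msgs)
    total_sign = np.prod(signs)
    mags = np.abs(msgs)
    order = np.argsort(mags)
    min1, min2 = mags[order[0]], mags[order[1]]
    # The edge holding the overall minimum receives the second minimum
    out_mag = np.where(np.arange(len(msgs)) == order[0], min2, min1)
    # total_sign * signs[j] equals the product of signs excluding edge j
    return total_sign * signs * out_mag

out = check_node_min_sum([2.0, -0.5, 1.5])   # -> [-0.5, 1.5, -0.5]
```

The two-minima trick is what makes horizontal (row-by-row) scheduling cheap: each check node needs only one pass over its incoming messages.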
Building Space Time Block Codes with set partitioning for three transmit antennas: application to STTC Codes
Guillaume Ferré (University of Limoges, France); Jean Pierre Cances (University of Limoges, France); Vahid Meghdadi (University of Limoges, France); Jean-Michel Dumas (University of Limoges, France)
This paper introduces new variations on the codes recently introduced by Jafarkhani et al., named Super-Orthogonal Space-Time Trellis Codes (SOSTTC). Using powerful set-partitioning rules, these codes combine the coding advantage of STTCs with the diversity advantage of STBCs. The partitioning is based mainly on the determinant criterion first introduced by Tarokh. In this paper we propose a new field of application for these codes in the difficult context of a three-transmit-antenna system. The newly obtained STTC codes significantly improve on the performance of the best existing STTC codes.
Error Performance Of Super-Orthogonal Space-Time Trellis Codes with Transmit Antenna Selection
Özgür Özdemir (Selcuk University, Turkey); Ibrahim Altunbas (Istanbul Technical University, Turkey); Mehmet Bayrak (Selcuk University, Turkey)
A space-time coding scheme with transmit antenna selection that employs conventional and improved super-orthogonal space-time trellis codes is considered. The transmit antenna selection criterion is given, and the error performance of the proposed structure is tested by computer simulations for systems with two active transmit antennas and a single receive antenna in quasi-static flat Rayleigh fading channels. It is shown that the proposed scheme outperforms previous space-time trellis coding structures with transmit antenna selection given in the literature.
High-Precision LDPC Codes Decoding at the Lowest Complexity
Massimo Rovini (University of Pisa, Italy); Francesco Rossi (Dept. of Information Engineering - University of Pisa, Italy); Nicola L'Insalata (University of Pisa, Italy); Luca Fanucci (University of Pisa, Italy)
This paper presents a simplified, low-complexity check-node processor for an LDPC decoder. It is conceived as the combination of modified Min-Sum decoding with a reduction of the number of computed messages to only P+1 different values. Simulations with a random code used as a test bench show that this approximation performs excellently even when only two different values are propagated (P=1). This result forms the basis for the design of an optimised serial architecture. Logic synthesis in a 0.18 um CMOS technology shows that our design outperforms similar state-of-the-art solutions in complexity and makes the check-node operations no longer critical to the decoder implementation.

## 2:10 PM - 3:10 PM

### Plenary: Signal Processing in Maternal-Fetal Medicine

Room: Auditorium
Chair: V. John Mathews (University of Utah, USA)

## 3:10 PM - 4:50 PM

### Wed.2.3: Biomedical Signal Processing - 5 papers

Chair: Jiri Jan (Brno University of Technology, Czech Republic)
Embedded Wavelet Packets--based Algorithm for ECG compression
Manuel Blanco-Velasco (University of Alcala, Spain); Fernando Cruz-Roldán (Universidad de Alcalá, Spain); Juan Ignacio Godino-Llorente (Universidad Politécnica de Madrid, Spain); Kenneth Barner (University of Delaware, USA)
The conventional Embedded Zerotree Wavelet (EZW) algorithm takes advantage of the hierarchical relationship among subband coefficients of the pyramidal wavelet decomposition. Nevertheless, it performs worse when used with Wavelet Packets, as the hierarchy becomes more complex. To address this problem, we propose a new technique that assumes no relationship among coefficients and is therefore suitable for use with Wavelet Packets. Accordingly, an embedded ECG compression scheme using Wavelet Packets is presented that shows better ECG compression performance than the conventional EZW.
A Geometrically Constrained Anechoic Model for Blind Separation of Temporomandibular Joint Sounds
Clive Cheong Took (Cardiff University, United Kingdom); Saeid Sanei (Cardiff University, United Kingdom); Jonathon Chambers (Cardiff University, United Kingdom); Stephen Dunne (Kings College London, United Kingdom)
Extraction of temporomandibular joint (TMJ) sound sources is attempted in this paper. A priori knowledge of the geometrically constrained medium (i.e. the head) is exploited to extract the TMJ sound sources from their anechoic mixtures. This is achieved by estimating the delay of the contralateral (opposite-side) source within the anechoic mixtures. Subsequently, we consider the mixing medium as a multichannel filter of constrained length, whereby the instantaneous mixing hypothesis can be assumed for each lag. Finally, we utilize mutual information as a selection criterion to pick the correct independent components. Successful reconstruction of the TMJ sources, free from the artefacts present in the estimates obtained by other well-known signal processing techniques, was achieved.
Nonlinear Projective Techniques To Extract Artifacts in Biomedical Signals
Ana Teixeira (DETUA/IEETA, University of Aveiro, Portugal); Ana Maria Tomé (University of Aveiro, Portugal); Kurt Stadlthanner (Institute of Biophysics, University of Regensburg, Germany); E. W. Lang (Institute of Biophysics, University of Regensburg, Germany)
Biomedical signals are generally contaminated with artifacts and noise. When the artifacts dominate, the useful signal can easily be extracted with projective subspace techniques. Biomedical signals, which often represent one-dimensional time series, must then be transformed into multi-dimensional signal vectors for these techniques to be applicable. The transformation can be achieved by embedding an observed signal in its delayed coordinates. Using this embedding, we propose to cluster the resulting feature vectors and to apply singular spectrum analysis (SSA) locally in each cluster to recover the undistorted signals. We also compare the reconstructed signals to results obtained with kernel PCA. Both nonlinear subspace projection techniques are applied to artificial data, to demonstrate the suppression of random noise, as well as to an electroencephalogram (EEG) signal recorded in the frontal channel, to extract its prominent electrooculogram (EOG) interference.
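The embedding-plus-projection step can be sketched as a global (single-cluster) SSA pass in NumPy; the paper applies this locally per cluster, and the function name and parameter choices here are illustrative only.

```python
import numpy as np

def ssa_denoise(x, window, rank):
    """Projective subspace denoising via singular spectrum analysis (sketch).

    Embed a 1-D series in delay coordinates (trajectory matrix), keep the
    leading singular directions, then average the entries that map back to
    the same time index (Hankelization) to recover a 1-D series.
    """
    n = len(x) - window + 1
    X = np.column_stack([x[i:i + n] for i in range(window)])  # trajectory matrix
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    X_low = (U[:, :rank] * s[:rank]) @ Vt[:rank]              # rank-r projection
    y = np.zeros(len(x))
    counts = np.zeros(len(x))
    for j in range(window):                                   # Hankelization
        y[j:j + n] += X_low[:, j]
        counts[j:j + n] += 1
    return y / counts

t = np.linspace(0, 1, 400)
clean = np.sin(2 * np.pi * 5 * t)          # stand-in for the useful signal
rng = np.random.default_rng(0)
noisy = clean + 0.3 * rng.standard_normal(400)
denoised = ssa_denoise(noisy, window=40, rank=2)
```

A pure sinusoid occupies a two-dimensional subspace of the delay coordinates, so rank 2 suffices here; real biomedical signals need the clustered, locally adapted variant the abstract describes.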
A time-frequency adaptive signal model-based approach for parametric ECG compression
Nicolas Ruiz Reyes (University of Jaen, Spain); Pedro Vera Candeas (University of Jaen, Spain); Pedro Jesus Reche Lopez (University of Jaen, Spain)
A preliminary investigation of an atomic model-based algorithm for the compression of single-lead ECG is presented. The paper proposes a novel coding scheme for ECG signals based on time-frequency atomic signal representations: signal-adaptive parametric models over overcomplete dictionaries of time-frequency atoms, with the expansions derived using the matching pursuit algorithm. The compression algorithm has been evaluated on the MIT-BIH Arrhythmia Database. Our algorithm was compared with several well-known ECG compression algorithms and was found to be superior at all tested bit rates. An average rate of approximately 140 bps (a compression ratio of about 28:1) has been achieved with good reconstructed signal quality (PRD of about 7%).
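The matching pursuit step the coding scheme relies on can be sketched generically; a toy orthogonal cosine dictionary stands in for the paper's overcomplete time-frequency atoms, and the variable names are ours.

```python
import numpy as np

def matching_pursuit(x, dictionary, n_atoms):
    """Greedy matching pursuit over a dictionary of unit-norm column atoms.

    At each step, pick the atom most correlated with the residual,
    record its coefficient, and subtract its contribution.
    """
    residual = x.astype(float).copy()
    idx, coef = [], []
    for _ in range(n_atoms):
        corr = dictionary.T @ residual
        k = int(np.argmax(np.abs(corr)))
        c = corr[k]
        residual = residual - c * dictionary[:, k]
        idx.append(k)
        coef.append(c)
    return idx, np.array(coef), residual

# Toy dictionary: unit-norm cosines at integer frequencies 1..8
N = 64
freqs = np.arange(1, 9)
D = np.column_stack([np.cos(2 * np.pi * f * np.arange(N) / N) for f in freqs])
D /= np.linalg.norm(D, axis=0)
x = 3.0 * D[:, 2] + 0.5 * D[:, 5]          # two-atom synthetic signal
idx, coef, res = matching_pursuit(x, D, n_atoms=2)
```

Compression then amounts to quantizing and entropy-coding the selected indices and coefficients; the number of atoms retained controls the rate-distortion trade-off.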
Extracting the Haemodynamic Response Function from fMRI Time Series using Fourier-Wavelet Regularised Deconvolution with Orthogonal Spline Wavelets
Alle Meije Wink (University of Cambridge, United Kingdom); Jos Roerdink (University of Groningen, The Netherlands)
We describe a method to extract the haemodynamic response function (HRF) from functional magnetic resonance imaging (fMRI) time series based on Fourier-wavelet regularised deconvolution (ForWaRD), and introduce a simple model for the HRF. The HRF extraction algorithm extends the ForWaRD algorithm by introducing a more efficient computation in the case of very long wavelet filters. We compute shift-invariant discrete wavelet transforms (SI-DWT) in the frequency domain, and apply ForWaRD using orthogonal spline wavelets. Extraction and modelling of subject-specific HRFs is demonstrated, as well as the use of these HRFs in a subsequent brain activation analysis. Temporal responses are predicted by using the extracted HRF coefficients. The resulting activation maps show the effectiveness of the proposed method.

### Wed.5.3: Image Compression - 5 papers

Chair: Nick Kingsbury (University of Cambridge, United Kingdom)
Lossless Compression of Bayer Mask Images Using An Optimal Vector Prediction Technique
Stefano Andriani (University of Padova, Italy); Giancarlo Calvagno (University of Padova, Italy); Daniele Menon (University of Padova, Italy)
In this paper a lossless compression technique for Bayer pattern images is presented. The common way to store these images has been to colour-reconstruct them and then code the full-resolution images using a lossless or lossy method. This solution is useful for displaying the captured images at once, but it is not suitable for source coding: the resulting full-colour image is three times larger than the Bayer pattern image, and compression algorithms are not able to remove the correlations introduced by the reconstruction algorithm. However, Bayer pattern images pose new problems for the coding step, since adjacent pixels belong to different colour bands, mixing up different kinds of correlations. In this setting we present an optimal vector predictor, where the Bayer pattern is divided into non-overlapping 2x2 blocks, each of which is predicted as a vector. We show that this solution is able to exploit the existing correlation, giving a good improvement in compression ratio with respect to other lossless compression techniques, e.g., JPEG-LS.
Representing Laplacian Pyramids with Varying Amount of Redundancy
Gagan Rath (IRISA-INRIA, France); Christine Guillemot (IRISA-INRIA, France)
The Laplacian pyramid (LP) is a useful tool for obtaining spatially scalable representations of visual signals such as image and video. However, the LP is overcomplete, or redundant, and has lower compression efficiency than critically sampled representations such as wavelets and subband coding. In this paper, we propose to improve the rate-distortion (R-D) performance of the LP by varying its redundancy through decimation of the detail signals. We present two reconstruction algorithms, based on frame theory and on coding theory, and then show them to be equivalent. Simulation results with various standard test images suggest that, with suitable quantization parameters, it is possible to obtain better R-D performance than with the usual or the dual-frame-based reconstruction.
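The analysis/synthesis structure of the Laplacian pyramid, whose detail signals the paper proposes to decimate, can be sketched in one dimension (sample-and-hold predictors for brevity; the actual LP uses smoothing filters, and images use the 2-D analogue):

```python
import numpy as np

def lp_analysis(x, levels):
    """One-dimensional Laplacian pyramid analysis (sketch).

    Each level stores a full-length detail d = x - upsample(decimate(x));
    the coarsest approximation is kept at the end. The full-length details
    at every level are what makes the representation redundant.
    """
    details = []
    for _ in range(levels):
        coarse = x[::2]                       # decimate (no anti-alias filter here)
        pred = np.repeat(coarse, 2)[:len(x)]  # crude sample-and-hold upsampling
        details.append(x - pred)
        x = coarse
    return details, x

def lp_synthesis(details, coarse):
    """Exact reconstruction by reversing the analysis steps."""
    x = coarse
    for d in reversed(details):
        pred = np.repeat(x, 2)[:len(d)]
        x = pred + d
    return x

sig = np.arange(16, dtype=float)
details, coarse = lp_analysis(sig, levels=3)
rec = lp_synthesis(details, coarse)
```

Counting samples (16 + 8 + 4 detail samples plus a 2-sample coarse band for a 16-sample input) makes the overcompleteness concrete; decimating the details, as the paper proposes, trades away some of that redundancy.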
Region based compression of multispectral images by classified KLT
Marco Cagnazzo (University of Napoli, Italy); Raffaele Gaetano (Università "Federico II" di Napoli, Italy); Sara Parrilli (University of Napoli, Italy); Luisa Verdoliva (University of Napoli, Italy)
A new region-based algorithm is proposed for the compression of multispectral images. The image is segmented into homogeneous regions, each of which is subjected to a spectral KLT, a spatial shape-adaptive DWT, and SPIHT encoding. We propose to use a dedicated KLT for each class rather than a single global KLT. Experiments show that the classified KLT guarantees a significant increase in energy compaction; hence, despite the need to transmit more side information, it provides a valuable performance gain over reference techniques.
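The per-class KLT idea can be illustrated in a few lines on synthetic spectra (an illustrative sketch; the helper name and test data are ours, not the authors'):

```python
import numpy as np

def classified_klt(pixels, labels):
    """Per-class KLT (i.e. PCA) of spectral vectors, instead of one global KLT.

    pixels : (n, bands) spectral vectors; labels : (n,) class indices.
    Returns, per class, the mean and the eigenvector basis that decorrelates
    and energy-compacts that class's spectra.
    """
    transforms = {}
    for c in np.unique(labels):
        X = pixels[labels == c]
        mu = X.mean(axis=0)
        cov = np.cov(X - mu, rowvar=False)
        w, V = np.linalg.eigh(cov)
        order = np.argsort(w)[::-1]        # sort by decreasing eigenvalue
        transforms[c] = (mu, V[:, order])
    return transforms

rng = np.random.default_rng(1)
# Two synthetic "classes" of 4-band spectra with different band correlations
a = rng.standard_normal((200, 1)) @ np.array([[1.0, 0.9, 0.8, 0.7]])
b = rng.standard_normal((200, 1)) @ np.array([[0.2, 0.5, 1.0, 0.4]])
pixels = np.vstack([a, b]) + 0.05 * rng.standard_normal((400, 4))
labels = np.r_[np.zeros(200, int), np.ones(200, int)]
T = classified_klt(pixels, labels)
```

Since each class here is essentially rank one, its dedicated KLT packs nearly all the energy into the first transformed component, which is the compaction effect the abstract reports; a single global KLT would have to compromise between the two correlation structures.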
A Rate-Distortion Approach to Optimal Color Image Compression
Evgeny Gershikov (Technion - Israel Institute of Technology, Israel); Moshe Porat (Technion- Israel Institute of Technology, Israel)
Most image compression systems today deal with color images, although coding theory and algorithms are still based on gray-level imaging. The performance of such algorithms has not so far been analyzed and optimized for color images, especially with regard to the selection of color components. In this work we introduce a rate-distortion approach to color image compression and employ it to find the optimal color components and optimal bit allocation for the compression. We show that the DCT (Discrete Cosine Transform) can be used to transform the RGB components into an efficient set of color components suitable for subband coding. The optimal rates can be used to design adaptive quantization tables in the coding stage, with results superior to fixed quantization tables. Based on the presented results, our conclusion is that the new approach can improve presently available methods for color compression.
Pointwise Shape-Adaptive DCT for high-quality deblocking of compressed color images
Alessandro Foi (Tampere University of Technology, Finland); Vladimir Katkovnik (Tampere University of Technology, Finland); Karen Egiazarian (Tampere University of Technology, Finland)
We present a high-quality image deblocking algorithm based on the shape-adaptive DCT (SA-DCT). The SA-DCT is a low-complexity transform which can be computed on a support of arbitrary shape. It has been adopted by the MPEG-4 standard and is implemented in modern video hardware. The use of this shape-adaptive transform for denoising and deblurring has recently been proposed, showing remarkable performance. In this paper we discuss and analyze its use for the deblocking of block-DCT compressed images, with particular emphasis on highly compressed color images. Extensive simulation experiments attest to the strong performance of the proposed filtering method. The visual quality of the estimates is high, with sharp detail preservation and clean edges; blocking and ringing artifacts are suppressed while salient image features are preserved. The SA-DCT filtering applied to the chrominance channels faithfully reconstructs their missing structural information, thus correcting color-bleeding artifacts.

### Wed.1.3: MIMO Channel Modelling, Emulation and Sounding I (Special session) - 5 papers

Room: Auditorium
Chair: Peter Grant (Edinburgh School of Engineering and Electronics, United Kingdom)
A review of radio channel sounding techniques
David Laurenson (The University of Edinburgh, United Kingdom); Peter Grant (Edinburgh School of Engineering and Electronics, United Kingdom)
This short paper will introduce the key approaches that have been adopted for channel sounding and describe systems that have been reported to date for measuring indoor and outdoor radio channels in the 1-5 GHz range of operating frequencies.
Performance Verification of MIMO Concepts using Multi-Dimensional Channel Sounding
Christian Schneider (Technische Universität Ilmenau, Germany); Uwe Trautwein (TeWiSoft, Germany); Reiner Thomae (University of Ilmenau, Germany); Walter Wirnitzer (MEDAV GmbH, Germany)
The advances in multi-dimensional channel-sounding techniques make it possible to evaluate the performance of radio multiple-access and signal processing schemes under realistic propagation conditions. This paper focuses on the methodology by which impulse-response data recorded in multidimensional channel-sounding field measurements can be used to evaluate link- and system-level performance of multiple-input multiple-output (MIMO) radio access schemes. The method relies on offline simulations and lies between performance evaluation using predefined channel models and evaluation in field experiments using a set of prototype hardware. New aspects of the simulation setup are discussed that are frequently ignored in simpler model-based evaluations. Example simulations are provided for an iterative ("turbo") MIMO equalizer concept. The dependency of the achievable bit error rate performance on the spatial-temporal propagation characteristics and on the variation of some system design parameters is shown. Although turbo MIMO equalization appears feasible in many of the considered real field scenarios, there are also cases with poor performance, indicating that in practical applications link adaptation of the transmitter and receiver processing to the environment is necessary.
A Simple Approach to MIMO Channel Modelling
Rafal Zubala (Warsaw University of Technology, Poland); Hubert Kokoszkiewicz (Warsaw University of Technology, Poland); Martijn Kuipers (IT / IST-TUL, Portugal); Luis Correia (IST - Tech. Univ. Lisbon, Portugal)
A semi-statistical MIMO radio channel model is described, adequate for analysing multi-user environments, by simulating the channels between different users at the radio propagation level. The model is capable of simulating MIMO links between users, by allowing multiple antennas at mobile terminals and/or base stations. Results are shown for the influence of antenna spacing on MIMO capacity gain. For pico- and micro-cells, an increase in the number of antennas has a larger impact on capacity gain compared to macro-cells. Using the Geometrically Based Single Bounce Channel Model for micro-cell scenarios, a 20% variation in performance is obtained, depending on the orientation of antennas of both transmitter and receiver. For the macro-cell, a similar variation is seen, but only for the orientation of base station antennas.
Enhanced Tracking of Radio Propagation Path Parameters Using State-Space Modeling
Jussi Salmi (Helsinki University of Technology, Finland); Andreas Richter (Helsinki University of Technology, Finland); Visa Koivunen (Helsinki University of Technology, Finland)
Future wireless communication systems will exploit the rich spatial and temporal diversity of the radio propagation environment. This requires new advanced channel models, which need to be verified by real-world channel sounding measurements. In this context the reliable estimation and tracking of the model parameters from measurement data is of particular interest. In this paper, we build a state-space model, and track the propagation parameters with the Extended Kalman Filter in order to capture the dynamics of the channel parameters in time. We then extend the model by considering first order derivatives of the geometrical parameters, which enhances the tracking performance due to improved prediction and robustness against shadowing and fading. The model also includes the effect of distributed diffuse scattering in radio channels. The issue of varying state variable dimension, i.e., the number of propagation paths to track, is also addressed. The performance of the proposed algorithms is demonstrated using both simulated and measured data.
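The tracking idea in this abstract can be illustrated with a generic Kalman recursion. The sketch below is a linear special case of the Extended Kalman Filter the authors use, with a constant-velocity state (a path parameter and its first derivative, mirroring the paper's first-order-derivative extension) standing in for the full geometrical state model; all names and the toy model are our assumptions.

```python
import numpy as np

def kalman_step(x, P, z, F, H, Q, R):
    """One predict/update cycle of a (linear) Kalman filter.
    x: state estimate, P: state covariance, z: new measurement,
    F: state transition, H: observation matrix, Q/R: noise covariances."""
    # Predict: propagate state and covariance through the dynamics.
    x_pred = F @ x
    P_pred = F @ P @ F.T + Q
    # Update: blend the prediction with the new measurement.
    S = H @ P_pred @ H.T + R                 # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)      # Kalman gain
    x_new = x_pred + K @ (z - H @ x_pred)
    P_new = (np.eye(len(x)) - K @ H) @ P_pred
    return x_new, P_new
```

For a nonlinear measurement model, F and H would be the Jacobians evaluated at the current estimate, which is exactly the EKF step.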
Characterization of MIMO Channels for Handheld Devices in Personal Area Networks at 5 GHz
Johan Karedal (Lund Univ., Sweden); Anders Johansson (Lund University, Sweden); Fredrik Tufvesson (Lund University, Sweden); Andreas Molisch (Mitsubishi Electric Research Laboratory, USA)
In this paper we analyze the properties of MIMO channels for personal area networks (PANs). Such channels differ from propagation channels in wide-area networks due to several reasons: (i) the environments in which the systems operate are different, (ii) the mobility models and ranges are different, (iii) the influence from human presence in the environment is different. In this paper, we present results from a measurement campaign for PAN channels between two handheld devices. The measurements are conducted over distances of 1-10 m using two handheld four-element antenna devices. For each distance, a number of channel realizations are obtained by moving the devices over a small area, and by rotating the persons holding the devices. We find that the correlation between the antenna elements is low. The small-scale statistics of the amplitude are well described by the Rayleigh distribution in many cases, but the effects of shadowing by the body of the operator can lead to different statistics.

### Poster: Audio Processing and Enhancement - 18 papers

Room: Poster Area
Chair: Graziano Bertini (ISTI-CNR, Italy)
Low-Delay Nonuniform Oversampled Filterbanks for Acoustic Echo Control
Bogdan Dumitrescu (Tampere University of Technology, Finland); Robert Bregovic (Tampere University of Technology, Finland); Tapio Saramaki (Tampere University of Technology, Finland); Riitta Niemisto (Nokia Corporation, Finland)
We propose an algorithm for designing nonuniform oversampled filterbanks with arbitrary delay. The filterbank has uniform sections obtained by generalized DFT modulation; between the uniform sections there are transition filters. Unlike previous publications, no a priori constraint is placed on the widths of the transition-filter channels. The design algorithm consists of three steps, in each of which one bank (analysis or synthesis) is optimized by solving convex optimization problems for the prototypes of the uniform sections and the transition filters. In the first step an orthogonal filterbank is designed, while in the subsequent steps one bank is held fixed and the other is optimized. We present a design example suited to subband processing of wideband speech signals.
Multiband Source/Filter Representation of Multichannel Audio for Reduction of Inter-Channel Redundancy
Athanasios Mouchtaris (Foundation for Research and Technology-Hellas, Greece); Kiki Karadimou (Foundation for Research and Technology-Hellas, Greece); Panagiotis Tsakalides (University of Crete, Greece)
In this paper we propose a model for multichannel audio recordings that reveals the underlying inter-channel similarities. This is important for achieving low bitrates for multichannel audio, and it is especially suitable for applications where a large number of microphone signals must be transmitted (such as remote mixing or distributed musicians' collaboration). Using this model, we can encode a multichannel audio signal using only one full audio channel plus side information on the order of a few kbit/s per channel, from which the multiple channels can be decoded at the receiving end. We apply objective and subjective measures to evaluate the performance of our method.
Tracking Behaviour of Acoustic Echo Canceller Using Multiple Sub-filters
Ravinder Sharma (Indian Institute of Technology, Kanpur, India)
Acoustic echo cancellation requires an adaptive filter with a large number of coefficients, but such a long adaptive filter converges slowly and tracks poorly. In this paper a multipath acoustic echo model is used as the basis for an adaptive echo canceller. The convergence and tracking performance of this new structure is studied, compared with the conventional structure, and found to be better.
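For context, the baseline single long adaptive filter that this paper improves on can be sketched with the normalized LMS algorithm; the paper's multi-subfilter structure is not reproduced here, and the function name is ours.

```python
import numpy as np

def nlms_echo_canceller(x, d, L, mu=0.5, eps=1e-6):
    """Baseline NLMS adaptive echo canceller of length L.
    x: far-end (loudspeaker) signal, d: microphone signal with echo.
    Returns the adapted weights and the error (echo-cancelled) signal."""
    w = np.zeros(L)
    e = np.zeros(len(x))
    for n in range(L - 1, len(x)):
        u = x[n - L + 1:n + 1][::-1]         # most recent L samples
        y = w @ u                             # echo estimate
        e[n] = d[n] - y                       # residual after cancellation
        w += mu * e[n] * u / (eps + u @ u)    # normalized LMS update
    return w, e
```

The longer L is, the slower the convergence of this loop, which is the motivation the abstract gives for splitting the echo path into sub-filters.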
Frequency domain simultaneous equations method for active noise control systems
Kensaku Fujii (University of Hyogo, Japan); Mitsuji Muneyasu (Kansai University, Japan); Yusuke Fujita (Catsystem corp., Japan)
This paper presents a new method for feedforward active noise control systems. The method, named the frequency domain simultaneous equations method, is based on a principle different from that of the filtered-x algorithm, which requires a filter modelled on the secondary path from the loudspeaker to the error microphone. Instead of that filter, the method uses an auxiliary filter identifying the overall path consisting of the primary path, the noise control filter and the secondary path. The paper first presents a computer simulation demonstrating that the convergence speed of the proposed method is much higher than that of the filtered-x algorithm, and then, using an experimental system, verifies that the proposed method can automatically recover the noise reduction effect degraded by path changes.
On the use of linear prediction for acoustic feedback cancellation in multi-band hearing aids
Papichaya Chaisakul (Chulalongkorn University, Thailand); Nisachon Tangsangiumvisai (Chulalongkorn University, Thailand); Parinya Luangpitakchumpon (Chulalongkorn University, Thailand); Akinori Nishihara (Tokyo Institute of Technology, Japan)
An efficient approach to mitigating the howling effect in hearing aids is the use of Acoustic Feedback Cancellation (AFC). In this paper, the use of a Forward Linear Predictor (FLP) is investigated to improve the performance of an AFC system in multi-band hearing aids. The FLP is used to predict the speech input signal before eliminating it from the error signal of the AFC system. Computer simulations demonstrate that the proposed system estimates the acoustic feedback signal more accurately than the conventional AFC system. In addition, the maximum usable gain of hearing aids required by users can be achieved when employing the proposed multi-band AFC system.
Multi-Resolution Partial Tracking with Modified Matching Pursuit
Pierre Leveau (Laboratoire d'Acoustique Musicale (UPMC Paris), France); Laurent Daudet (Laboratoire d'Acoustique Musicale (UPMC Paris), France)
The widely used Matching Pursuit algorithm processes the input signal as a whole and, as such, does not build relationships between the atoms selected at each iteration. For audio signals, variants of this algorithm have been introduced that capture structured sets of atoms (molecules) sharing common properties: harmonic relationship, time-frequency proximity. However, they are limited by the use of a single scale, hence a fixed time-frequency resolution, within a molecule. In this study, we propose a modified Matching Pursuit that groups atoms at different scales within a given frequency line, allowing molecules with an optimized time and frequency resolution. Results on simple signals, as well as real audio recordings, show that the extra flexibility provided by multiresolution comes at a small computational cost.
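The plain Matching Pursuit that this paper modifies can be sketched as below, assuming a dictionary whose columns are unit-norm atoms; the molecule grouping and multiresolution extensions are not shown.

```python
import numpy as np

def matching_pursuit(signal, dictionary, n_iter):
    """Plain Matching Pursuit.
    dictionary: (N, K) matrix whose columns are unit-norm atoms.
    Returns a list of (atom_index, coefficient) pairs and the residual."""
    r = signal.astype(float).copy()
    decomposition = []
    for _ in range(n_iter):
        corr = dictionary.T @ r            # correlate residual with every atom
        i = int(np.argmax(np.abs(corr)))   # pick the best-matching atom
        c = corr[i]
        r = r - c * dictionary[:, i]       # subtract its contribution
        decomposition.append((i, c))
    return decomposition, r
```

Each iteration greedily removes the single atom most correlated with the residual; structured variants instead extract whole groups of related atoms per iteration.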
A Double-talk Detection Algorithm Using a Psychoacoustic Auditory Model
Successful adaptive echo cancellation in telecommunications depends on a control device called a double-talk (DT) detector. DT refers to the situation when signals from both ends of an echo cancellation system are simultaneously active. In the presence of a DT condition, the role of a DT detector is to prevent divergence of the adaptive filter in an echo cancellation system. This paper presents a novel double-talk detection (DTD) algorithm using a psychoacoustic auditory model. The model exploits the frequency masking properties of the human auditory system. It performs an analysis of the far-end signal and removes spectral components below a perceptual threshold, to create spectral holes without affecting the perceptual quality of the signal. A DT condition can be detected by monitoring the energy level in the created holes. Simulations with real speech data and comparisons with other DTD algorithms are presented to show the performance of the proposed algorithm.
Statistical Properties of the FXLMS-Based Narrowband Active Noise Control System
Yegui Xiao (Prefectural University of Hiroshima, Japan)
Noise signals generated by rotating machines such as diesel engines, cutting machines, fans and so on may be modeled as noisy sinusoidal signals which can be successfully suppressed by narrowband active noise control (ANC) systems. In this paper, the statistical performance of such a conventional filtered-X LMS (FXLMS) based narrowband ANC system is investigated in detail. First, difference equations governing the dynamics of the system are derived in terms of convergence of the mean and mean square estimation errors for the discrete Fourier coefficients (DFC) of the secondary source. Steady-state expressions for DFC estimation mean square error (MSE) as well as the remaining noise power are then developed in closed forms. A stability bound for the FXLMS in the mean sense is also derived. Extensive simulations are performed to demonstrate the validity of the analytical findings.
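The FXLMS recursion whose statistics this paper analyzes can be sketched in a minimal time-domain form; the secondary path and its estimate are assumed to be short FIR filters, and the variable names are ours (the paper's discrete-Fourier-coefficient formulation is not reproduced).

```python
import numpy as np

def fxlms(x, d, s, s_hat, L, mu):
    """Minimal filtered-x LMS loop for active noise control.
    x: reference noise, d: primary noise at the error microphone,
    s: true secondary path (FIR), s_hat: its estimate,
    L: control filter length, mu: step size. Returns the residual."""
    w = np.zeros(L)                       # noise control filter
    y_hist = np.zeros(len(s))             # recent anti-noise samples, newest first
    xf = np.convolve(x, s_hat)[:len(x)]   # reference filtered by path estimate
    e = np.zeros(len(x))
    for n in range(L - 1, len(x)):
        u = x[n - L + 1:n + 1][::-1]      # regressor for the control filter
        y_hist = np.roll(y_hist, 1)
        y_hist[0] = w @ u                 # anti-noise sent to the loudspeaker
        e[n] = d[n] - s @ y_hist          # residual at the error microphone
        uf = xf[n - L + 1:n + 1][::-1]    # filtered-x regressor
        w += mu * e[n] * uf               # LMS update on the filtered reference
    return e
```

With a sinusoidal reference, as in the narrowband ANC setting of the abstract, the loop drives the residual at the error microphone towards zero.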
Logarithmic temporal processing applied to accurate empirical transfer function measurements in vocal sound propagation
Masanori Morise (Wakayama University, Japan); Toshio Irino (Wakayama University, Japan); Hideki Kawahara (Wakayama University, Japan)
A new procedure is proposed to improve the accuracy of empirical transfer function measurements for investigating speech sound propagation. In our previous work, vowel-dependent behavior of the empirical transfer functions from a lip reference point to observation points around the speaker's head was found. The accuracy of the method was also evaluated using references obtained with a HATS and an M-sequence, revealing significant accuracy degradation in the higher frequency range due to low speech energy. The proposed method alleviates this problem by introducing a logarithmic temporal manipulation and lowpass filtering. It was tested on 186 vocalizations of sustained Japanese vowels. The test results indicated that, in the frequency region above 10 kHz, the proposed method reduced the standard deviations of the measurements down to 80% in gain estimation, 33.8% in weighted group delay estimation and 20% in duration estimation, respectively. Detailed implementation aspects are also discussed.
A time improvement over the Mycielski algorithm for predictive signal coding: Mycielski-78
The Mycielski algorithm is commonly known for applications requiring high quality predictions due to its infinite-past rule based prediction method. Since it repeatedly searches from the beginning of the data source, the complexity becomes non-polynomial, resulting in limited practical use in multimedia applications, including coding. In this work, we present a time improvement over the Mycielski method by incorporating a codebook for the search which is dynamically constructed during the coding process. The construction method is symmetrical in the encoder and decoder parts, therefore reconstruction is assured. The idea and strategy resemble the time improvement of the celebrated LZ-78 method over the LZ-77 compression method, where the non-polynomial search is shortened by incorporating a codebook. Analogously, we call the faster method proposed here the Mycielski-78 method.
Sound Reproduction System with Simultaneous Perturbation Method
Kazuya Tsukamoto (Kansai University, Japan); Yoshinobu Kajikawa (Kansai Univ., Japan); Yasuo Nomura (Kansai University, Japan)
In this paper, we propose a novel sound field reproduction system using the simultaneous perturbation (SP) method, together with a fast-convergence version of it. In conventional sound reproduction systems, the preprocessing filters are generally determined in advance and fixed, based on the transfer functions from the loudspeakers to the control points. However, movements of the control points result in severe localization errors. We therefore propose a sound field reproduction system using the SP method, which updates the filter coefficients using only the error signal. The SP method can track the movements of the control points, but suffers from slow convergence. Hence, we also propose a method to improve the convergence speed, which compensates only for the delay by using delay control filters. Simulation results demonstrate that the proposed methods can track the movements of control points with reasonable convergence speed.
A multi-channel maximum likelihood approach to de-reverberation
Massimiliano Tonelli (Queen Mary, University of London, United Kingdom); Maria Jafari (Queen Mary, University of London, United Kingdom); Michael Davies (Queen Mary, University of London, United Kingdom)
Reverberation can severely degrade the intelligibility of speech. Blind de-reverberation aims at restoring the original signal by attenuating the reverberation without prior knowledge of the surrounding acoustic environment or of the source. In this paper, single-channel and multi-channel de-reverberation structures are compared and the advantages of the multi-channel approach are discussed. We propose an adaptive multi-channel blind de-reverberation algorithm based on a maximum likelihood approach that exploits results relating to the multiple input/output inverse theorem (MINT). The performance of the algorithm is illustrated using an eight-channel linear microphone array placed in a real room. Simulation results show that the algorithm can achieve very good de-reverberation when the channels are time-aligned.
A Time Domain System for Transient Enhancement in recorded music
Graziano Bertini (ISTI-CNR, Italy); Massimo Magrini (ISTI-CNR, Italy); Tommaso Giunti (ISTI-CNR, Italy)
The audio treatment methods commonly used in the mastering of commercial music tend to increase the perceived loudness of the final product, obtaining a fat sound that can be played with sufficient quality on low-end audio devices, such as small radios or computer loudspeakers. The side effect of these processes is the loss of transients and dynamic variations, resulting in a flat sound. The widely diffused compressed audio formats (MP3, WMA) introduce further degradation of the recorded music. In this paper we describe a method for time-domain transient enhancement of recorded music, which can easily be implemented with low-cost Digital Signal Processors in a stand-alone device or included in hardware (iPod-like) or software (Winamp-like) audio player applications.
Testing supervised classifiers based on non-negative matrix factorization to musical instrument classification
Emmanouil Benetos (Aristotle University of Thessaloniki, Greece); Constantine Kotropoulos (Aristotle University of Thessaloniki, Greece); Thomas Lidy (Vienna University of Technology, Austria); Andreas Rauber (Vienna University of Technology, Austria)
In this paper, a class of algorithms for automatic classification of individual musical instrument sounds is presented. Two feature sets were employed, the first containing perceptual features and MPEG-7 descriptors and the second containing rhythm patterns developed for the SOMeJB project. The features were measured for 300 sound recordings consisting of 6 different musical instrument classes. Subsets of the feature set are selected using branch-and-bound search, obtaining the most suitable features for classification. A class of supervised classifiers is developed based on the non-negative matrix factorization (NMF). The standard NMF method is examined as well as its modifications: the local and the sparse NMF. The experiments compare the two feature sets alongside the various NMF algorithms. The results demonstrate an almost perfect classification for the first set using the standard NMF algorithm (classification error 1.0%), outperforming the state-of-the-art techniques tested for the aforementioned experiment.
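The standard NMF variant mentioned in this abstract can be sketched with the classical multiplicative updates for the Frobenius cost (the local and sparse variants, and the classifier built on top of the factorization, are not shown; the function name is ours).

```python
import numpy as np

def nmf(V, r, n_iter=200, seed=0):
    """Standard NMF via Lee-Seung multiplicative updates (Frobenius cost).
    Factorizes a nonnegative matrix V (m x n) as W (m x r) @ H (r x n)."""
    rng = np.random.default_rng(seed)
    m, n = V.shape
    W = rng.random((m, r)) + 0.1          # nonnegative random init
    H = rng.random((r, n)) + 0.1
    eps = 1e-9                            # avoids division by zero
    for _ in range(n_iter):
        # Multiplicative updates keep W and H nonnegative by construction.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H
```

In a classification setting like the paper's, V would hold the feature vectors of the training sounds, and the learned basis W defines the subspace against which test sounds are matched.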
On the efficiency of time-varying channel models
Scott Rickard (University College Dublin, Ireland); Konstantinos Drakakis; Nikolaos Tsakalozos (University College Dublin, Algeria)
We consider the form of a physically motivated simple one path time-varying channel in the time-varying impulse response, time-frequency characterization and time-scale characterization settings. The focus, in general, is to determine which setting allows for the most efficient (i.e., sparse) discrete channel representation. We measure how well the discrete representation reconstructs the channel when a limited number of discrete coefficients are available and compare the result among the channel models.
Noise Robust Relative Transfer Function Estimation
Markus Schwab (Technical University of Berlin, Germany); Peter Noll (TU Berlin, Germany); Thomas Sikora (Technische Universität Berlin, Germany)
Microphone arrays are suitable for a wide range of applications; two important ones are speaker localization and speech enhancement. Both require the transfer functions from one microphone to the other microphones. In this paper we present a new transfer function estimator optimized for speech sources in a noisy environment. To achieve this, we integrate a new covariance matrix estimation algorithm for the noisy speech as well as for the adaptive and correlated noise signals received by the microphones. Results are given which illustrate the superior performance of this algorithm over traditional algorithms.
An identification system of monophonic musical tones
Vincenzo Di Salvo (CONSIGLIO NAZIONALE DELLE RICERCHE, Italy); Graziano Bertini (ISTI-CNR, Italy)
The present paper describes the problem of defining a system that recognizes musical tones in the context of automatic identification systems. This study arose from the need for a tool, based on a comparison criterion, to measure the distance between the tones of a just-executed monophonic sequence and a reference database of musical tones of the same instrument. The algorithm consists of two main steps. First, digital processing techniques are used to obtain pattern vectors from the original waveform. These patterns are subsequently processed using Least Squares Optimal Filtering. The algorithm is relatively simple and may be implemented efficiently, with low latency, on DSP processors.
Digital Sound Effects Echo and Reverb Based on Non-Exponentially Decaying Comb Filter
Vitezslav Kot (Brno University of Technology, Czech Republic)
The paper presents algorithms for two digital sound effects based on the NEDCF (Non-Exponentially Decaying Comb Filter). The first part of the paper describes the NEDCF structure and an echo effect algorithm with easily controllable parameters. The second part extends the previous algorithm to obtain a new multichannel digital reverberation algorithm. The presented reverberation algorithm produces an impulse response with a controllable decay curve, reverberation time and frequency-dependent reverberation time. The decay curve can consist of an arbitrary number of increasing or decreasing linear segments, which makes it possible to create interesting reverberation effects.
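For reference, the classic feedback comb filter, whose fixed exponential echo decay the NEDCF generalizes into an arbitrary piecewise-linear decay curve, can be sketched as:

```python
import numpy as np

def feedback_comb(x, delay, g):
    """Classic feedback comb filter: y[n] = x[n] + g * y[n - delay].
    An impulse input produces an echo train with taps 1, g, g^2, ...,
    i.e. an exponentially decaying response; the NEDCF of the paper
    replaces this fixed decay with a controllable decay curve."""
    y = np.zeros(len(x))
    for n in range(len(x)):
        fb = g * y[n - delay] if n >= delay else 0.0
        y[n] = x[n] + fb
    return y
```

The `delay` parameter sets the echo spacing and `g` the decay rate; `|g| < 1` keeps the filter stable.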

### Poster: Sensor Array Processing - 10 papers

Room: Poster Area
Chair: Fabrizio Lombardini (Univ. of Pisa, Italy)
Efficiency of subspace-based estimators
Jean-Pierre Delmas (INT, France); Habti Abeida (INT, Algeria)
This paper addresses subspace-based estimation and its purpose is to complement previously available theoretical results generally obtained for specific algorithms. We focus on asymptotically (in the number of measurements) minimum variance (AMV) estimators based on estimates of orthogonal projectors obtained from singular value decompositions of sample covariance matrices associated with the general linear model ${\bf y}_t={\bf A}(\Theta){\bf x}_t+{\bf n}_t$ where the signals ${\bf x}_t$ are complex circular or noncircular and dependent or independent. Using closed-form expressions of AMV bounds based on estimates of different orthogonal projectors, we prove that these AMV bounds attain the stochastic Cramer-Rao bound (CRB) in the case of independent circular or noncircular Gaussian signals.
A Novel Technique for Broadband Subspace Decomposition
John McWhirter (QinetiQ ltd, United Kingdom); Paul Baxter (QinetiQ Ltd, United Kingdom); Tom Cooper (QinetiQ Ltd, United Kingdom); Soydan Redif (QinetiQ Ltd, United Kingdom)
A generalisation of the eigenvalue decomposition (EVD) is proposed for symmetric polynomial matrices. A novel technique for computing this polynomial matrix EVD is outlined. It involves applying a sequence of elementary paraunitary matrices and is referred to as the second order sequential best rotation (SBR2) algorithm. An application of the SBR2 algorithm to broadband subspace identification is briefly illustrated.
Stochastic ML Estimation under Misspecified Number of Signals
Pei-Jung Chung (The University of Edinburgh, United Kingdom)
The maximum likelihood (ML) approach for estimating direction of arrival (DOA) plays an important role in array processing. Its consistency and efficiency have been well established in the literature. A common assumption made is that the number of signals is known. In many applications, this information is not available and needs to be estimated. However, the estimated number of signals does not always coincide with the true number of signals. Thus it is crucial to know whether the ML estimator provides any relevant information about DOA parameters under a misspecified number of signals. In the previous study, we focused on the deterministic signal model and showed that the ML estimator under a misspecified number of signals converges to a well defined limit. Under mild conditions, the ML estimator converges to the true parameters. In the current work, we extend those results to the stochastic signal model and validate our analysis by simulations.
Nonparametric Method for Detecting the Number of Narrowband Signals without Eigendecomposition in Array Processing
Jingmin Xin (Fujitsu Laboratories Limited, Japan); Yoji Ohashi (Fujitsu, Japan); Akira Sano (Keio University, Yokohama, Japan, Japan)
A computationally simple and efficient nonparametric method for estimating the number of narrowband signals impinging on a uniform linear array (ULA), without eigendecomposition, is proposed. When finite array data are available, a new detection criterion is formulated in terms of the row elements of the QR upper-triangular factor of the auto-product of a matrix formed from the cross-correlations between some sensor data. The number of signals is then determined as the value that maximizes this ratio criterion, where QR decomposition with column pivoting is also used to improve detection performance. The proposed estimator is asymptotically consistent, and it is superior in detecting closely spaced signals with a small number of snapshots and/or at low signal-to-noise ratio (SNR).
Adaptive array detection algorithms with steering vector mismatch
Chin Heng Lim (University of Edinburgh, United Kingdom); Aboutanios Elias (University of Edinburgh, United Kingdom); Bernard Mulgrew (Institute for Digital Communications, The University of Edinburgh, United Kingdom)
The problem of detecting signals with known templates in coloured Gaussian interference is especially relevant to radar applications. The optimum detector requires knowledge of the true interference covariance matrix, which is usually unknown. In practice, the matrix is estimated from target-free training data that must be statistically homogeneous with the test data. Traditional approaches, like the GLRT and AMF, require the availability of such training data. These conditions are not always satisfied, which degrades detection performance. Recently, two single-data-set approaches that use only the test data, namely the GMLED and MLED, were introduced. In this paper, we examine the performance of these detection algorithms and their reduced-dimension counterparts under steering vector mismatches, investigating mismatches in both the spatial and temporal steering vectors.
Subspace-based Estimation of DOA with Known Directions
Guillaume Bouleux (LSS, France); Rémy Boyer (CNRS, Université Paris-Sud (UPS), Supelec, France)
Estimation of Directions-Of-Arrival is an important problem in various applications. {\em A priori} knowledge of the signal location is sometimes available, and previous works have exploited this prior knowledge. The principle is to "orthogonally" deflate the signal subspace and thereby cancel the known part of the steering matrix. Our solution is based on a simple modification of the well-known MUSIC criterion: the classical Moore-Penrose pseudo-inverse is replaced with the {\em obliquely weighted} pseudo-inverse. The latter is in fact an efficient way to introduce prior knowledge into subspace fitting techniques.
Undersea buried object detection using Stoneley-Scholte waves: application in coherent noise
Cyril Kotenkoff (Laboratoire des Images et des Signaux, France); Jean-Louis Lacoume (Laboratory of Images and Signals, France); Jerome Mars (Laboratoire des Images et des Signaux, France)
A new system for detecting objects buried at the sea floor is presented. It is an alternative to SONAR systems and uses Stoneley-Scholte surface waves. The general processing method, presented in a previous publication, is multicomponent beamforming with an array of four-component sensors set on the floor to detect echoes reflected by objects. In this paper we extend the optimal reception to waves in a correlated noise field. From results in the literature we derive an empirical model including spatial and inter-component correlation, and we present simulations based on it. Output SNR comparisons are made in the optimal and non-optimal cases. Finally, we discuss the pertinence of the introduced model.
Robust Adaptive Beamformer with Feasibility Constraint on the Steering Vector
Wenyi Zhang (University of California, San Diego, USA); Bhaskar Rao (University of California, San Diego, USA)
The standard MVDR beamformer has high resolution and interference rejection capability when the array steering vector is accurately known, but it is known to degrade in the presence of steering vector errors. Motivated by recent work in robust adaptive beamforming, we develop variants of the constrained robust adaptive beamformer that limit the search in the underlying optimization problem to a feasible set of steering vectors, thereby achieving improved performance. Robustness against steering vector error is provided through a spherical uncertainty-set constraint, while a set of magnitude constraints is enforced on each element of the steering vector to better constrain the search within the space of feasible steering vectors. By an appropriate change of variables, the optimization problem is modified so that the magnitude constraints are no longer needed. The developed algorithm is tested in the context of speech enhancement using a microphone array and shown to be superior to existing algorithms.
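The standard MVDR beamformer that these robust variants build on can be sketched as below; a uniform linear array is assumed for the steering vector, and the paper's uncertainty-set and magnitude constraints are not shown.

```python
import numpy as np

def ula_steering(theta, M, d=0.5):
    # Steering vector of an M-element uniform linear array,
    # element spacing d in wavelengths, angle theta in radians.
    m = np.arange(M)
    return np.exp(2j * np.pi * d * m * np.sin(theta))

def mvdr_weights(R, a):
    """MVDR weights w = R^{-1} a / (a^H R^{-1} a): unit gain towards
    the steering vector a, minimum output power from all other directions."""
    Ri_a = np.linalg.solve(R, a)          # R^{-1} a without explicit inverse
    return Ri_a / (a.conj() @ Ri_a)
```

When the assumed steering vector `a` deviates from the true one, this solution starts nulling the desired signal itself, which is the degradation the abstract's constraints are designed to prevent.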
Localization of Tactile Interactions through TDOA analysis: geometric vs. inversion-based methods
Augusto Sarti (DEI - Politecnico di Milano, Italy); Giovanni De Sanctis (Politecnico di Milano, Italy); Diego Rovetta (Politecnico di Milano, Italy); Gabriele Scarparo (DEI - Politecnico di Milano, Italy); Stefano Tubaro (Politecnico di Milano, Italy)
In this paper we compare three different methods for the localization of tactile interactions with a planar surface, all based on the analysis of the time differences of arrival (TDOA) from a single source to a set of contact sensors located around the area of interest. We tested these methods with both synthetic and real data, obtained by recording a finger touch with four sensors placed over an MDF (Medium Density Fiberboard) board, to prove their accuracy and robustness against time-delay estimation errors. The aim is to create tangible interfaces without intrusive sensors or devices placed inside the interaction area.
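For illustration, a minimal TDOA-based localization can be sketched as follows; the sensor layout, wave speed, and brute-force grid search are all assumptions for this toy example, not the geometric or inversion-based methods compared in the paper:

```python
import numpy as np

# Hypothetical 2-D setup: four sensors at the corners of a unit surface.
sensors = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
c = 1.0  # wave speed, normalized for this toy example

def tdoas(src):
    """Arrival-time differences relative to sensor 0."""
    d = np.linalg.norm(sensors - src, axis=1) / c
    return d[1:] - d[0]

def locate(measured, step=0.01):
    """Brute-force grid search minimizing the TDOA residual norm."""
    best, best_err = None, np.inf
    for x in np.arange(0.0, 1.0 + step, step):
        for y in np.arange(0.0, 1.0 + step, step):
            err = np.linalg.norm(tdoas(np.array([x, y])) - measured)
            if err < best_err:
                best, best_err = np.array([x, y]), err
    return best

true_src = np.array([0.30, 0.70])
est = locate(tdoas(true_src))
print(np.allclose(est, true_src, atol=0.02))
```

With noise-free delays the residual is zero at the true position, so the grid search recovers it up to the grid spacing; the paper's methods replace this search with closed-form or inversion-based solutions.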
New Recursive Adaptive Beamforming Algorithms for Uniform Concentric Circular Arrays with Frequency Invariant Characteristics
Haihua Chen (The University of Hong Kong, Hong Kong)
This paper proposes new recursive adaptive beamforming algorithms for uniform concentric circular arrays (UCCAs) with nearly frequency-invariant (FI) characteristics. Using a fixed compensation network, the far-field pattern of a UCCA frequency-invariant beamformer (FIB) is determined by a set of weights and is approximately invariant over a wide range of frequencies. New recursive adaptive beamforming algorithms based on the least mean square (LMS) and recursive least squares (RLS) algorithms and the Generalized Sidelobe Canceller (GSC) structure are proposed to address the high computational complexity of the sample matrix inversion (SMI) method proposed previously by the authors. Simulation results show that the proposed adaptive FI-UCCA beamformer requires far fewer variable taps than the conventional UCCA for the same steady-state performance, while offering much faster convergence.

### Poster: Detection of Digital Data Signals - 5 papers

Room: Poster Area
Chair: Javier Fonollosa (Technical University of Catalonia (UPC), Spain)
Joint deregularization and box-constraint in multiuser detection
Yuriy Zakharov (University of York, United Kingdom); Jie Luo (University of Maryland, USA); Christos Kasparis (University of Bristol, United Kingdom)
Multiuser detection can be described as a quadratic optimization problem with a binary constraint. Many techniques are available to find approximate solutions to this problem, and they can be characterized in terms of complexity and detection performance. The "efficient frontier" of known techniques includes the decision-feedback (DF), branch-and-bound (BB) and probabilistic data association (PDA) detectors. We propose a novel iterative multiuser detection technique based on a joint deregularized and box-constrained solution to quadratic optimization, with iterations similar to those used in the nonstationary Tikhonov iterated algorithm. The deregularization maximizes the energy of the solution; this is the opposite of Tikhonov regularization, where the energy is minimized. Combined with box constraints, however, the deregularization forces the solution to be close to the binary set. Our development improves the "efficient frontier" in multiuser detection, as illustrated by simulation results.
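A minimal sketch of the box-constrained idea under illustrative parameter choices (the deregularization weight `delta`, step size `mu`, and the toy correlation matrix are assumptions for this example, not the authors' algorithm or values):

```python
import numpy as np

def box_projected_iterations(R, y, delta=0.1, mu=0.05, n_iter=200):
    """Projected gradient steps on 0.5 b^T R b - y^T b - 0.5*delta*||b||^2,
    clipping each iterate to the box [-1, 1]^K.  The negative (deregularizing)
    quadratic term pushes the solution toward the box corners, i.e. toward
    binary decisions, as the abstract describes."""
    b = np.zeros(len(y))
    for _ in range(n_iter):
        grad = R @ b - y - delta * b
        b = np.clip(b - mu * grad, -1.0, 1.0)
    return b

# Toy 3-user synchronous example with a cross-correlation matrix R.
R = np.array([[1.0, 0.2, 0.1], [0.2, 1.0, 0.2], [0.1, 0.2, 1.0]])
bits = np.array([1.0, -1.0, 1.0])
y = R @ bits                          # noiseless matched-filter output
b_hat = box_projected_iterations(R, y)
print(np.array_equal(np.sign(b_hat), bits))
```

In this noiseless case the iterates settle exactly on the correct corner of the box, since the clipped update has the transmitted bit vector as its unique fixed point.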
The Demodulation of M-PSK and M-QAM Signals Using Particle Filtering
Maoge Xu (Nanjing University of Science & Technology, P.R. China)
In this paper, the use of particle filtering to demodulate uncoded M-PSK and M-QAM signals over Rayleigh flat-fading channels is investigated. Based on the Jakes model, the channel state is modelled as a first-order autoregressive (AR) process, and the observation noise is assumed complex Gaussian. It is shown that, in the demodulation of uncoded PSK signals, the particle filter has no advantage over the decision-directed Kalman filter due to the M-ary phase ambiguity of the PSK signals, but this is not the case when detecting uncoded QAM signals: at the same pilot symbol rate, the particle filter outperforms the decision-directed Kalman filter in the demodulation of uncoded M-QAM signals, and it performs well even at a low pilot symbol rate.
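For reference, the Kalman-filter baseline mentioned above can be sketched with a scalar filter tracking an AR(1) channel through known symbols; all parameter values below are illustrative, not the paper's:

```python
import numpy as np

rng = np.random.default_rng(0)

# AR(1) fading channel h[k] = a h[k-1] + w[k], observed as
# y[k] = h[k] s[k] + v[k] with known (here BPSK) symbols s[k].
a, q, r = 0.99, 1 - 0.99**2, 0.01          # illustrative AR/noise parameters
n = 500
h = np.zeros(n, complex)
for k in range(1, n):
    w = (rng.standard_normal() + 1j * rng.standard_normal()) / np.sqrt(2)
    h[k] = a * h[k - 1] + np.sqrt(q) * w
s = rng.choice([1.0, -1.0], size=n).astype(complex)
v = (rng.standard_normal(n) + 1j * rng.standard_normal(n)) / np.sqrt(2)
y = h * s + np.sqrt(r) * v

# Scalar Kalman filter: predict, then correct using the known symbol.
h_hat, P = 0j, 1.0
err = []
for k in range(n):
    h_pred, P_pred = a * h_hat, a * a * P + q
    K = P_pred * s[k].conjugate() / (abs(s[k])**2 * P_pred + r)
    h_hat = h_pred + K * (y[k] - s[k] * h_pred)
    P = (1 - K * s[k]).real * P_pred
    err.append(abs(h[k] - h_hat)**2)

print(np.mean(err[100:]) < np.var(h[100:]))  # tracking beats a zero estimate
```

In a true decision-directed mode the known `s[k]` would be replaced by the detector's hard decision, which is exactly where the M-ary phase ambiguity discussed in the abstract enters.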
Soft-Output Detection of CPM Signals Transmitted Over Channels Affected by Phase Noise
Alan Barbieri (University of Parma, Italy); Giulio Colavolpe (University of Parma, Italy)
We consider continuous phase modulations (CPMs) and their transmission over a typical satellite channel affected by phase noise. By modeling the phase noise as a Wiener process and adopting a simplified representation of an M-ary CPM signal based on the principal pulses of its Laurent decomposition, we derive the MAP symbol detection strategy. Since it is not possible to derive the exact detection rule by means of probabilistic reasoning, the framework of factor graphs (FGs) and the sum-product algorithm is used. By pursuing the principal approach to managing continuous random variables in a FG, i.e., the canonical distribution approach, two algorithms are derived which do not require the presence of known (pilot) symbols, thanks to the differential encoder intrinsically embedded in the CPM modulator.
An MMSE Based Weighted Aggregation Scheme for Event Detection using Wireless Sensor Network
Bhushan Jagyasi (IIT Bombay, India); Bikash Dey (Indian Institute of Technology Bombay, India); S. n. Merchant (IIT Bombay, India); Uday Desai (IIT Bombay, India)
We consider the design of a wireless sensor network for event detection and propose an MMSE-based weighted aggregation scheme for this application. Accuracy and network lifetime are the two performance metrics considered here. We compare the performance of the proposed scheme with previously known schemes.
Blind Bayesian Multiuser Detection for Impulse Radio UWB Systems with Gaussian and Impulsive Noise
Ahmet Ay (Bogazici University, Turkey); Hakan Delic (Bogazici University, Turkey); Mutlu Koca (Bogazici University, Turkey)
In this paper, we address the blind parameter estimation and multiuser detection problems for impulse radio ultra-wide band (UWB) systems under frequency selective fading. We consider that the ambient and impulsive noise parameters as well as the UWB channel, characterized by a large number of taps, are unknown. We propose a blind Bayesian multiuser detector employing Gibbs sampling. Because a Gibbs sampler is a soft-input soft-output module, capable of exchanging probabilistic information, the proposed detector is also employed within a turbo multiuser detection structure for coded UWB systems. The simulation results show that the Gibbs sampler is effective in estimating the system parameters and that the proposed receiver provides significant performance gains after a few detection/decoding iterations.

### Poster: Signal Processing Applications in Engineering - 8 papers

Room: Poster Area
Chair: Jose Luis Sanz (University of Santander, Spain)
Gabor feature vectors to detect changes amongst time series: a geoacoustic example
Patrick Oonincx (Netherlands Defense Academy, The Netherlands)
The Matching Pursuit algorithm is used to decompose Green's functions of varying systems. Feature vectors representing the Green's functions are constructed from the pursuit approximation. Increasing distances among these vectors are related to changing parameters in the systems. The quality of the entries of the feature vectors is discussed and the distances between these vectors are measured following an adaptive approach. Results of the method are illustrated using a geophysical example.
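A minimal sketch of the Matching Pursuit decomposition underlying the method (the identity dictionary below is only a toy stand-in for the Gabor dictionary used in the paper):

```python
import numpy as np

def matching_pursuit(signal, dictionary, n_atoms=5):
    """Greedy Matching Pursuit: at each step pick the dictionary atom
    (columns, assumed unit-norm) most correlated with the residual and
    subtract its contribution.  The resulting coefficients can serve as
    a feature vector for the decomposed signal."""
    residual = np.asarray(signal, dtype=float).copy()
    coeffs = np.zeros(dictionary.shape[1])
    for _ in range(n_atoms):
        corr = dictionary.T @ residual
        k = np.argmax(np.abs(corr))
        coeffs[k] += corr[k]
        residual -= corr[k] * dictionary[:, k]
    return coeffs, residual

# Toy check with an orthonormal (identity) dictionary: MP recovers the
# signal support exactly in two steps.
D = np.eye(8)
x = np.zeros(8)
x[2], x[5] = 3.0, -1.5
c, res = matching_pursuit(x, D, n_atoms=2)
print(np.allclose(c, x), np.allclose(res, 0))
```

With a redundant (non-orthogonal) Gabor dictionary the residual does not vanish after a few atoms, and it is the pattern of selected atoms and coefficients that forms the feature vector.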
A new approach for roughness analysis of soil surfaces
Edwige Vannier (Centre d'étude des Environnements Terrestres et Planétaires, France); Antoine Gademer (CETP/IPSL, France); Valérie Ciarletti (CETP/IPSL, France)
We propose a new method for roughness analysis of soil surfaces based on a multiscale approach. Tilled surfaces are decomposed into a large-scale oriented structure due to the tillage practice and a non-orderly spatial distribution of clods. Multiresolution approximations of the soil surface allow us to characterize these two components: the approximation at level 4 enhances the large-scale oriented structure, while the spatial distribution of clod sizes is modeled after the clods are detected by a dedicated algorithm. The discrimination between different types of soil tillage practices and the evolution of a soil surface under controlled rainfall are considered. The method can be applied to other types of natural relief studies.
Histogram-Based Orientation Analysis for Junctions
Jan Hendrik Moltz (University of Luebeck, Germany); Ingo Stuke (University of Luebeck, Germany); Til Aach (RWTH Aachen University, Germany)
This article presents an algorithm for multiple orientation estimation at junctions which can be used as a first step towards a complete description of the junction structure. The algorithm uses the structure tensor approach to determine the orientations of the edges or lines that meet at a junction and then extracts the principal orientations from a histogram of the orientation angles in a circular region around the junction. In contrast to previous solutions it uses only first-order derivatives and is suited for junctions with an arbitrary number of orientations without increasing the runtime.
Frequency Selective Detection of NQR Signals in the Presence of Multiple Polymorphic Forms
Samuel Somasundaram (King's College London, United Kingdom); Andreas Jakobsson (Karlstad University, Sweden); John Smith (King's College London, United Kingdom)
Nuclear quadrupole resonance (NQR) is a radio frequency (RF) technique that detects compounds in the solid state and is able to distinguish between different polymorphic forms of certain compounds. For example, a typical sample of trinitrotoluene (TNT) will contain at least two polymorphic forms with rather different NQR properties. In this paper, we propose a frequency selective hybrid detector that exploits the presence of such polymorphic forms. The presented detector offers both improved probability of detection, as compared to recently proposed detectors, and allows for an estimation of the relative proportions of the multiple polymorphic forms.
A high resolution pursuit-based approach for detecting ultrasonic flaw echoes
Nicolas Ruiz Reyes (University of Jaen, Spain); Pedro Vera Candeas (University of Jaen, Spain); Jose Curpian Alonso (University of Jaen, Spain)
A new NDT method to detect ultrasonic flaw echoes close to the surface in strongly scattering materials is proposed. The method is based on High Resolution Pursuit (HRP), a version of Matching Pursuit (MP) that emphasizes local fit over global fit. Since HRP produces representations that resolve closely spaced features, it is a very valuable signal processing tool for the goal of this work. Furthermore, HRP has the same order of complexity as MP. The good performance of the method is experimentally verified using ultrasonic traces acquired from a Carbon Fibre Reinforced Plastic (CFRP) material.
Abrupt noise rejection using wavelet transform for electromagnetic wave
Akitoshi Itai (Aichi Prefectural University, Japan); Hiroshi Yasukawa (Aichi Prefectural University, Japan); Ichi Takumi (Nagoya Institute of Technology, Japan); Masayasu Hata (Chubu University, Japan)
This paper presents a method for rejecting abrupt noise in an observed electromagnetic (EM) wave signal using the wavelet transform (WT). Our goal is to better process the EM waves that radiate from the earth's crust in order to predict earthquakes. The proposed method works in the multi-scale wavelet transform domain with a second-order derivative property. Typical WT-based noise rejection methods set the threshold according to the noise power; unfortunately, this approach can suppress the precursor radiation, since its radiated energy and period are similar to those of the abrupt noise. The method proposed herein uses local maximum points of the WT coefficients (wavelet maxima), which means that no threshold is needed to suppress abrupt noise. It is shown that the proposed method can reject abrupt noise without dropping the precursor signal.
Modelling Elastic Wave Propagation In Thin Plates
Diego Rovetta (Politecnico di Milano, Italy); Augusto Sarti (DEI - Politecnico di Milano, Italy); Giovanni De Sanctis (Politecnico di Milano, Italy); Marco Fabiani (Politecnico di Milano, Italy)
In this work we propose an in-depth study of elastic wave propagation in thin plates. We show that at the frequency range of interest and for modest plate thicknesses, the only waves that can be excited and propagate in the structure are guided waves (also called Lamb waves). This propagation modeling approach is based on the theory of Viktorov. The elastic properties of the panel and the finger touch signature are usually unknown, therefore we propose a method for estimating them through a simple experimental procedure. The obtained estimates are then used to simulate the propagation in the boards. Our approach is to implement the general solution of the elastic wave equation for infinite plates, and introduce the boundary conditions afterwards using a real-time beam tracer. We finally prove the effectiveness of the approach by comparing the predicted response of a finger touch with the measured one on materials such as MDF (Medium Density Fiberboard) and PLX (Plexiglass).
HOS-Based Method for Power Quality Events Classification
Danton Ferreira (UFJF, Brazil); Augusto Cerqueira (UFJF, Brazil); Moises Ribeiro (Federal University of Juiz de Fora, Brazil); Carlos Duque (UFJF, Brazil)
This paper presents a novel method for the classification of power quality events in voltage signals, which uses a higher-order-statistics-based technique to extract a reduced and representative event signature vector. The signature vectors, composed of samples from the diagonal slices of the second- and fourth-order cumulants selected with Fisher's discriminant ratio (FDR), provide enough separability among classification regions, yielding a classification rate as high as 100% when the voltage signals are corrupted by isolated events. A performance comparison between the proposed method and two others found in the literature reveals that the proposed method not only achieves good performance but also surpasses the previous techniques.

### Wed.6.3: Sonar signal processing - 5 papers

Room: Room 4
Chair: Stefano Coraluppi (NATO Undersea Research Centre, Italy)
Signal processing for an active sonar system suitable for advanced sensor technology applications and environmental adaptation schemes
Alberto Baldacci (NATO Undersea Research Centre, Italy); Georgios Haralabus (NATO Undersea Research Centre, Italy)
An overview of the basic elements of an active sonar system is presented, together with a description of the signal processing chain used at the NATO Undersea Research Centre for the detection and localization of undersea targets. As the Navy's focus has shifted to complex and volatile littoral environments characterized by high background interference, emphasis is given to the capability of advanced sensor technology to resolve left/right ambiguity in the direction of arrival, and to broadband adaptation methods that enhance detection performance by incorporating in situ environmental information into the processing chain. This processing was used in a series of sea trials organized and executed by the Centre, a few of which included submarine targets.
Sonar Signal Processing Based On COTS Components
Dirk Maiwald (ATLAS ELEKTRONIK GmbH, Germany)
This paper addresses the application of commercial off-the-shelf (COTS) components to sonar signal processing. The Active Towed Array Sonar System (ACTAS) is introduced, the COTS components used for setting up the signal processing are described, and key experiences with the use of COTS are illustrated. Furthermore, the main software components necessary for system set-up are described. CORBA is used for communication and data transfer between software components.
Use of statistical hypothesis test for mines detection in SAS imagery
Lionel Bombrun (Images and Signals Laboratory, France); Frédéric Maussang (Images and Signals Laboratory, France); Eric Moisan (Images and Signals Laboratory, France); Alain Hétet (DGA/DCE/GESMA, France)
Synthetic Aperture Sonar (SAS) imagery is currently used to detect underwater mines lying on or buried in the seabed. However, the low signal-to-noise ratio characterizing these images leads to a high number of false alarms. In this paper, a new detection method based on a statistical hypothesis test is presented. The proposed method can be divided into two main steps: first, a statistical model of the speckle noise is described; a statistical hypothesis test is then performed and its performance is evaluated.
Stefano Coraluppi (NATO Undersea Research Centre, Italy); Craig Carthel (NATO Undersea Research Centre, Italy)
This paper provides an overview of an on-going research effort at the NATO Undersea Research Centre. In particular, we focus on the automatic tracking component of the active sonar signal and information processing chain that takes hydrophone data from multiple receivers and generates a unified surveillance picture. Novel features of our tracking approach include the determination and exploitation of statistically consistent contact measurement covariances, the use of a computationally efficient multihypothesis data association algorithm, and the instantiation of both centralized and distributed data fusion schemes to optimise tracking performance.
Directivity Analysis of Time-Reversal Arrays in a Simple Ocean Waveguide
Joao Gomes (ISR - Instituto Superior Tecnico, Portugal)
This paper analyses the directivity of time-reversal arrays of arbitrary shape in a range-independent homogeneous ocean waveguide. Through feedback, these devices have the ability to focus waves in unknown media, which makes them potentially useful in many applications. The analysis is based on the image method, which expands the array into a series of reflected virtual replicas that interfere to create a strong focal spot. This setup allows the acoustic field to be approximately expressed as the product of two terms, one of which depends on the free-space directivity of the array, and the other one on the environmental properties. The former dictates the gross distribution of acoustic energy in the water column, while the latter defines the fine-scale variations and the effective size of the focus.

### Wed.4.3: Equalization II - 5 papers

Room: Sala Onice
Chair: Kostas Berberidis (University of Patras, Greece)
Semi-Blind Space-Time Equalization for Single-Carrier MIMO Systems with Block Transmission
Luciano Sarperi (University of Liverpool, United Kingdom); Xu Zhu (University of Liverpool, United Kingdom); Asoke Nandi (University of Liverpool, United Kingdom)
This paper proposes a novel semi-blind space-time equalization method for wireless Multiple-Input Multiple-Output (MIMO) spatial multiplexing systems using Single-Carrier Cyclic-Prefix (SC-CP) block transmissions. Independent Component Analysis (ICA) is employed to track the time-varying MIMO channel. It is shown that with a training overhead of only 0.05%, the proposed method provides close performance to the case with perfect channel state information (CSI), even at relatively high Doppler frequency. The semi-blind SC-CP system also outperforms its OFDM counterpart with perfect CSI at high Signal to Noise Ratios (SNRs).
Digital estimation of analog imperfections using blind equalization
Davud Asemani (Ecole Superieure d'Electricite (Supelec), France); Jacques Oksman (Supelec, France); Daniel Poulton (Supelec, France)
Analog electronic circuits are always subject to imperfections, which cause deviations from the nominal values of electronic elements. In the case of Linear Time-Invariant (LTI) circuits, the coefficients of the transfer function deviate from their typical values, leading to differences between the typical (i.e. design) and actual transfer functions. In this paper, the analog imperfections are estimated digitally using only the output samples, without any access to the input signal or to the analog system (a blind method). The Super Exponential Algorithm (SEA) is used as the blind equalization technique, since it provides rapid convergence. The only assumption is that the input is a non-Gaussian independent and identically distributed (i.i.d.) signal. Using this algorithm, the effects of analog imperfections in analog circuits can be digitally estimated and possibly compensated, independently of the types and sources of the imperfections. This makes online compensation of the imperfections (realization errors, drifts, etc.) possible. The analog imperfections were estimated with a precision of 0.2% and 1.3% for the exemplary RC and RLC circuits, respectively.
A frequency domain conjugate gradient algorithm and its application to channel equalization
Aris Lalos (University of Patras, Dept. of Computer Eng. and Informatics, Greece); Kostas Berberidis (University of Patras, Greece)
In this paper, a new block adaptive filtering algorithm, based on the Conjugate Gradient (CG) method of optimization, is proposed. A Toeplitz approximation of the autocorrelation matrix is used for the estimation of the gradient vector, and the correlation quantities are updated on a block-by-block basis. Due to this formulation, the algorithm can be implemented in the frequency domain (FD) using the fast Fourier transform (FFT). Efficient recursive relations for the frequency domain quantities updated on a block-by-block basis have been derived, and an appropriate decoupling of the direction vector has been applied. The applicability of the new algorithm to the problem of adaptive equalization is studied. The proposed algorithm exhibits superior convergence properties compared to existing CG techniques, offering significant savings in computational complexity.
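The core conjugate gradient iteration behind such equalizers can be sketched as below; this plain time-domain CG omits the paper's Toeplitz approximation and FFT-based frequency-domain implementation:

```python
import numpy as np

def conjugate_gradient(R, p, n_iter=None, tol=1e-10):
    """Plain CG solving R w = p for a symmetric positive-definite R,
    e.g. an autocorrelation matrix R and cross-correlation vector p in
    a Wiener/equalization problem."""
    w = np.zeros_like(p)
    r = p - R @ w          # residual
    d = r.copy()           # search direction
    rs = r @ r
    for _ in range(n_iter or len(p)):
        Rd = R @ d
        alpha = rs / (d @ Rd)
        w += alpha * d
        r -= alpha * Rd
        rs_new = r @ r
        if rs_new < tol:
            break
        d = r + (rs_new / rs) * d   # conjugate direction update
        rs = rs_new
    return w

# Toy Toeplitz autocorrelation matrix and cross-correlation vector.
R = np.array([[2.0, 0.5, 0.1], [0.5, 2.0, 0.5], [0.1, 0.5, 2.0]])
p = np.array([1.0, 0.0, -1.0])
w = conjugate_gradient(R, p)
print(np.allclose(R @ w, p))
```

For an exact N x N SPD system, CG converges in at most N iterations; adaptive variants like the paper's instead run a few iterations per block while the correlation estimates are updated.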
Robust Group Delay Equalization of Discrete-Time Filters Using Neural Networks
Maurício Quélhas (Universidade Federal do Rio de Janeiro, Brazil); Antonio Petraglia (Federal University of Rio de Janeiro, Brazil)
A novel methodology for the initial allocation of the poles and zeros of a group delay equalizer is introduced. Feed-forward neural networks are used instead of empirical formulae. The results obtained with the networks can be applied as the initial solution of an optimization procedure that searches for the optimum group delay equalizer. Different inputs are considered a priori for the neural networks, but after pre-processing the acquired data some of them are discarded. By evaluating the cross-correlation between inputs and outputs, the simplified networks are designed through an optimization procedure with mean-square-error back-propagation in batch mode. The Armijo search method is applied in this procedure to improve the convergence rate. Simulation results demonstrating the efficiency of the proposed method are presented. Quasi-equiripple group delay responses are obtained with the neural network initial estimate, avoiding local minima and improving the convergence rate of the optimization step of the equalizer designs.
PDA-BCJR Algorithm for Factorial Hidden Markov Models with Application to MIMO Equalisation
Robert Piechocki (University of Bristol, United Kingdom); Christophe Andrieu (University of Bristol, United Kingdom); Magnus Sandell (Toshiba Research Europe Limited, United Kingdom); Joe McGeehan (University of Bristol, United Kingdom)
In this paper we develop an efficient algorithm for inference in Factorial Hidden Markov Models (FHMM), which is particularly suitable for turbo equalisation in Multiple Input - Multiple Output (MIMO) systems. The proposed PDA-BCJR algorithm can be viewed as a generalisation of the PDA algorithm, which in its basic form handles single latent variables only. Our generalisation replaces each of the single latent variables with a HMM.

### Wed.3.3: Synchronization and Parameter Estimation - 5 papers

Room: Sala Verde
Chair: Paolo Banelli (University of Perugia, Italy)
Closed-form expressions of the true Cramer-Rao bound for parameter estimation of BPSK, MSK or QPSK waveforms
Jean-Pierre Delmas (INT, France)
This paper addresses the stochastic Cramer-Rao bound (CRB) pertaining to the joint estimation of the carrier frequency offset, the carrier phase, and the noise and data powers of binary phase-shift keying (BPSK), minimum shift keying (MSK) and quaternary phase-shift keying (QPSK) modulated signals corrupted by additive white circular Gaussian noise. Because the associated models are governed by simple Gaussian mixture distributions, an explicit expression of the Fisher information matrix is given and an explicit expression for the stochastic CRB of these four parameters is deduced. Refined expressions for the low- and high-SNR regimes are presented as well. Finally, our analytical expressions are numerically compared with the approximate expressions previously given in the literature.
ML Symbol Synchronization For General MIMO-OFDM Systems In Unknown Frequency-Selective Fading Channels
Amir Saemi (University of Limoges, France); Vahid Meghdadi (University of Limoges, France); Jean Pierre Cances (University of Limoges, France); Mohammad Reza Zahabi (University of Limoges, France); Jean-Michel Dumas (University of Limoges, France); Guillaume Ferré (University of Limoges, France)
This paper proposes a new maximum-likelihood synchronization algorithm for a general MIMO-OFDM system over frequency-selective fading channels. Knowing the exact position of the training sequence packet allows the start of the FFT window to be aligned within the OFDM symbol; symbol timing is therefore estimated jointly with the channel in order to find the exact beginning of the training sequence. In addition, the loss in system performance due to synchronization error is derived for MIMO systems and used as a performance criterion. We provide simulation results to illustrate that the performance of the proposed algorithm is quite satisfactory.
A comparison of soft and hard decision-directed feedforward phase estimators
Mathieu Dervin (IRIT - ENSEEIHT, France); Marie-Laure Boucheret (ENSEEIHT, France); Jean-Yves Tourneret (IRIT/ENSEEIHT/TéSA, France)
This paper is devoted to the comparison of hard and soft decision directed feedforward phase estimators based on the maximum likelihood principle. The particular structure of these estimators is taken into account in the derivation of a new lower bound on the estimation variance for a constant phase error. The proposed schemes are then compared in terms of equivalent noise bandwidth. Finally, simulation results allow us to compare the algorithms in terms of root mean square estimation error, respectively for a constant phase error and for a time varying phase error.
Blind frame synchronisation for Block code
Sebastien Houcke (ENST-Bretagne, France); Guillaume Sicot (ENST-Bretagne, France)
Channel coding is nowadays indispensable. In order to decode the coded sequence, the receiver has to find the beginning of the codewords. This problem is usually solved by periodically adding a frame synchronization sequence to the transmitted sequence. Of course, the longer the sequence the better the synchronization, but the lower the spectral efficiency. The stakes of developing a blind technique that synchronizes before decoding (at a high bit error rate), without a synchronization sequence, are therefore clear. In this article we propose a blind method that allows us to synchronize a block code; we show that it is especially well suited to LDPC codes and achieves very convincing performance for these particular codes.
Phase Estimation Mean Square Error and Threshold Effects for Windowed and Amplitude Modulated Sinusoidal Signals
Stefan Schuster (University of Linz, Austria); Stefan Scheiblhofer (University Linz, Austria); Reinhard Feger (University Linz, Austria); Andreas Stelzer (University of Linz, Austria)
The problem of estimating the phase of (possibly amplitude modulated) sinusoidal signals arises in a variety of signal processing applications [1]. A closed form expression for the maximum likelihood (ML) estimator exists which achieves the best possible performance given by the Cramer-Rao lower bound (CRLB) asymptotically. However, due to the nonlinear nature of the problem, below a certain level of signal-to-noise ratio (SNR), the so called threshold effect occurs, and the performance of the estimator decreases quickly. In this paper, we investigate this effect, together with the influence of windowing and amplitude modulation on the threshold using the unmodified estimator.

Room: Auditorium

## 5:10 PM - 6:30 PM

### Wed.2.4: Image Understanding - 4 papers

Chair: Enis Çetin (Bilkent University, Turkey)
Automatic Fire Detection in Video Sequences
Turgay Celik (Eastern Mediterranean University, Turkey); Hasan Demirel (Eastern Mediterranean University, Turkey); Huseyin Ozkaramanli (Eastern Mediterranean University, Turkey)
In this paper, we propose a real-time fire detector which combines foreground information with statistical color information to detect fires. The foreground information, obtained using an adaptive background model, is verified against statistical color information extracted from hand-labeled fire pixels, in order to determine whether the detected foreground object is a candidate fire region. The output of both stages is then analyzed over consecutive frames, a verification step that exploits the fact that fire never remains visually stable. The detector processes about 30 fps at an image size of 176x144, which makes it suitable for real-time applications.
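A generic sketch of the adaptive-background stage mentioned in the abstract (the threshold and update rate below are hypothetical, not the authors' values):

```python
import numpy as np

def update_background(bg, frame, alpha=0.05, thresh=25.0):
    """Simple adaptive background model via a running average; foreground
    is flagged where the new frame deviates from the background.  This is
    only a generic sketch of an adaptive-background stage, not the
    authors' exact detector."""
    fg_mask = np.abs(frame - bg) > thresh      # hypothetical threshold
    bg = (1 - alpha) * bg + alpha * frame      # slow background update
    return bg, fg_mask

# Toy 4x4 grayscale "frames": a single bright pixel appears.
bg = np.full((4, 4), 100.0)
frame = bg.copy()
frame[1, 1] = 200.0
bg, mask = update_background(bg, frame)
print(mask.sum() == 1 and bool(mask[1, 1]))
```

In the paper's pipeline, a mask like this would then be filtered by the statistical color model and by the frame-to-frame flicker analysis before a region is declared fire.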
Image fusion based on level set segmentation
Filip Sroubek (Instituto de Optica, CSIC, Spain); Gabriel Cristobal (Instituto de Optica (CSIC), Spain); Jan Flusser (Institute of Information Theory and Automation, Czech Republic)
Most image fusion techniques utilize a key notion called the decision map. This map determines which information to take at what place and therefore governs the fusion process. We illustrate that the calculation of decision maps is identical to a segmentation problem. By modifying a state-of-the-art segmentation procedure based on level sets, we obtain more accurate and smooth decision maps. Verification of the proposed method is carried out on wavelet-based multifocus fusion and concluded with an experiment on microscopic multifocal images.
Comparison of the estimation of the degree of polarization from four or two intensity images degraded by speckle noise
Muriel Roche (Institut Fresnel - EGIM, France); Philippe Réfregier (Fresnel Institute, France)
Active polarimetric imagery is a powerful tool for accessing the information present in a scene. Indeed, the polarimetric images obtained can reveal polarizing properties of objects that are not available using conventional imaging systems. However, when coherent light is used to illuminate the scene, the images are degraded by speckle noise. The polarization properties of a scene are characterized by the degree of polarization. In standard polarimetric imaging systems, four intensity images are needed to estimate this degree. If the measurements are assumed uncorrelated, this number can be decreased to two images using the Orthogonal State Contrast Image (OSCI). However, this approach appears too restrictive in some cases. We thus propose in this paper a new statistical parametric method to estimate the degree of polarization from only two intensity images, assuming correlated measurements. The estimators obtained from four images, from the OSCI and from the proposed method are compared using simulated polarimetric data degraded by speckle noise.
Enhancing facial expression classification by information fusion
Ioan Buciu (Aristotle University of Thessaloniki, Greece); Zakia Hammal (Institut National Polytechnique de Grenoble, France); Alice Caplier (Institut National Polytechnique de Grenoble, France); Nikos Nikolaidis (Aristotle University of Thessaloniki, Greece); Ioannis Pitas (ARISTOTLE UNIVERSITY OF THESSALONIKI, Greece)
The paper presents a system that makes use of the information fusion paradigm to integrate two different sorts of information in order to improve facial expression classification accuracy over single-feature-based classification. The Discriminant Non-negative Matrix Factorization (DNMF) approach is used to extract a first set of features, and an automatic extraction algorithm based on geometrical features is used to retrieve the second set. These features are then concatenated into a single feature vector at the feature level. Experiments showed that, when these mixed features are used for classification, the classification accuracy is improved compared with the case where only one type of feature is used.

### Wed.5.4: Underdetermined Sparse Audio Source Separation (Special session) - 4 papers

Chair: Shoji Makino (NTT Communication Science Laboratories, Japan)
Chair: Shoko Araki (NTT communication Science Laboratories, Japan)
Sparse sources are separated sources
Scott Rickard (University College Dublin, Ireland)
Sparse representations are being used to solve problems previously thought unsolvable. For example, we can separate more sources than sensors using an appropriate transformation of the mixtures into a domain where the sources are sparse. But what do we mean by sparse? What attributes should a sparsity measure have? And how can we use this sparsity to separate sources? We investigate these questions and, as a result, conclude that sparse sources are separated sources, as long as you use the correct measure.
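One concrete sparsity measure with the attributes the abstract alludes to is the Gini index; a minimal pure-Python sketch using the standard definition (not necessarily the exact measure the paper settles on):

```python
def gini_sparsity(x):
    """Gini index of a sequence: 0 for a uniform (least sparse) signal,
    approaching 1 for a single spike (most sparse)."""
    c = sorted(abs(v) for v in x)          # sort magnitudes ascending
    n, s = len(c), float(sum(c))
    if s == 0.0:
        return 0.0
    return 1.0 - 2.0 * sum((ck / s) * ((n - k + 0.5) / n)
                           for k, ck in enumerate(c, start=1))

uniform = gini_sparsity([1, 1, 1, 1])      # -> 0.0
spike = gini_sparsity([0, 0, 0, 1])        # -> 0.75
```

Unlike the l0 count, the Gini index is scale-invariant and rewards energy concentration, which is the kind of property a "good" sparsity measure is argued to need.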
Towards Underdetermined Source Reconstruction from a Clap-and-Play Binaural Live Recording
Pau Bofill (Universitat Politecnica de Catalunya, Spain)
The goal of our current research is to be able to separate a few audio sources from the signals of two microphones, using a separate recording of each player clapping their hands. The separation is performed in the frequency domain, where speech and music signals are mostly sparse. The clapping is used to estimate each transfer function, and the sources are reconstructed using Second Order Cone Programming (SOCP). Our experiments show moderately good results for synthetic mixtures (11.5dB average SNR) and poor results for the real case (2.2dB). This paper points out some of the issues that make this task a difficult one, and shows some experimental analysis of why this is so.
Bayesian blind separation of audio mixtures with structured priors
Cédric Févotte (University of Cambridge, United Kingdom)
In this paper we describe a Bayesian approach for separation of linear instantaneous mixtures of audio sources. Our method exploits the sparsity of the source expansion coefficients on a time-frequency basis, chosen here to be a MDCT. Conditionally upon an indicator variable which is 0 or 1, one source coefficient is either set to zero or given a Student t prior. Structured priors can be considered for the indicator variables, such as horizontal structures in the time-frequency plane, in order to model temporal persistency. A Gibbs sampler (a standard Markov chain Monte Carlo technique) is used to sample from the posterior distribution of the indicator variables, the source coefficients (corresponding to nonzero indicator variables), the hyperparameters of the Student t priors, the mixing matrix and the variance of the noise. We give results for separation of a musical stereo mixture of 3 sources.
Normalized Observation Vector Clustering Approach for Sparse Source Separation
Shoko Araki (NTT communication Science Laboratories, Japan); Hiroshi Sawada (NTT communication Science Laboratories, Japan); Ryo Mukai (NTT communication Science Laboratories, Japan); Shoji Makino (NTT Communication Science Laboratories, Japan)
This paper presents a new method for the blind separation of sparse sources whose number N can exceed the number of sensors M. Recently, sparseness-based blind separation has been actively studied. However, most methods utilize a linear sensor array (or only two sensors) and therefore have certain limitations; e.g., they cannot be applied to symmetrically positioned sources. To allow the use of more than two sensors arranged in a non-linear/non-uniform way, we propose a new method that includes the normalization and clustering of the observation vectors. We report promising results for the separation of five 3-dimensionally distributed speech sources using a non-linear/non-uniform arrangement of four sensors in a room (RT_{60} = 120 ms).
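A simplified sketch of the normalization step, assuming sensor 1 as phase reference; the constants `c` and `dmax` and the test delays are illustrative, and the subsequent clustering (e.g. k-means over the normalized vectors) is omitted:

```python
import numpy as np

def normalize_observations(X, freqs, c=340.0, dmax=0.05):
    """Normalize M-channel STFT observation vectors X (M x T) taken at
    frequencies freqs (T,): the phase is referenced to sensor 1 and divided
    by a frequency-proportional factor, so pure-delay vectors from the same
    source collapse to one point regardless of frequency."""
    phase = np.angle(X * np.conj(X[:1]))            # phase w.r.t. sensor 1
    Z = np.abs(X) * np.exp(1j * phase / (4.0 * freqs * dmax / c))
    return Z / np.linalg.norm(Z, axis=0, keepdims=True)

tau = np.array([0.0, 1e-4])                          # per-sensor delays (s)
freqs = np.array([1000.0, 2000.0])                   # two frequency bins
X = np.exp(-2j * np.pi * freqs[None, :] * tau[:, None])
Z = normalize_observations(X, freqs)                 # both columns coincide
```

Because the two normalized columns coincide, time-frequency points dominated by the same source cluster together, which is what makes N > M separation by clustering possible.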

### Wed.1.4: MIMO Channel Modelling, Emulation and Sounding II (Special session) - 4 papers

Room: Auditorium
Chair: Peter Grant (Edinburgh School of Engineering and Electronics, United Kingdom)
Modelling and Manipulation of Polarimetric Antenna Beam Patterns via Spherical Harmonics
Giovanni Del Galdo (Ilmenau University of Technology, Germany); Jörg Lotze (Ilmenau University of Technology, Germany); Markus Landmann (Ilmenau University of Technology, Germany); Martin Haardt (Ilmenau University of Technology, Germany)
Measured antenna responses, namely their beam patterns with respect to the vertical and horizontal polarizations, play a major role in realistic wireless channel modelling as well as in parameter estimation techniques. The representations commonly used suffer from drawbacks introduced by the spherical coordinate system, which has singularities at the two poles. In general, all methods which describe the beam pattern with a matrix fail to correctly reproduce its inherent spherical symmetry. In this contribution we propose the use of the Spherical Fourier Transformation (SFT), which allows the description of the beam pattern via spherical harmonics. This mathematical tool, well known in other fields of science, is rather new to wireless communications. In this context, the main applications of the SFT include the efficient description of a beam pattern, noise filtering, precise interpolation in the spherical Fourier domain, and the possibility to obtain an equivalent description of the beam pattern for an arbitrary coordinate system. The latter allows us to improve an existing 2-D FFT based technique: the Effective Aperture Distribution Function (EADF).
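The projection of a beam pattern onto spherical harmonics can be sketched by brute-force quadrature; here a hypothetical dipole pattern g = cos θ and real harmonics up to degree 1 stand in for the measured polarimetric patterns and the full SFT of the paper:

```python
import numpy as np

# Real spherical harmonics of degree 0 and 1 (order 0 only, for brevity)
def Y00(theta, phi):
    return 0.5 * np.sqrt(1.0 / np.pi) * np.ones_like(theta)

def Y10(theta, phi):
    return 0.5 * np.sqrt(3.0 / np.pi) * np.cos(theta)

theta, phi = np.meshgrid(np.linspace(0, np.pi, 400),
                         np.linspace(0, 2 * np.pi, 400), indexing="ij")
dA = (np.pi / 399) * (2 * np.pi / 399) * np.sin(theta)  # surface element

g = np.cos(theta)                       # dipole beam pattern (illustrative)
c00 = np.sum(g * Y00(theta, phi) * dA)  # ~0 by symmetry
c10 = np.sum(g * Y10(theta, phi) * dA)  # analytically 2*sqrt(pi/3)

# reconstruction from the coefficients reproduces the pattern
g_rec = c00 * Y00(theta, phi) + c10 * Y10(theta, phi)
```

A smooth pattern is captured by a handful of coefficients, which is the basis for the efficient description, filtering, and interpolation applications listed above.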
Distributed UWB MIMO Sounding for Evaluation of Cooperative Localization Principles in Sensor Networks
Rudolf Zetik (Technical University Ilmenau, Germany); Jürgen Sachs (TU Ilmenau, Germany); Reiner Thomä (TU-Illmenau, Germany)
We describe the architecture, design, and a novel application of a real-time MIMO UWB channel sounder. The sounder is applied to evaluate localization principles in distributed sensor networks based on UWB radio technology. We assume an application scenario without any supporting infrastructure, as may occur in emergency situations such as fire disasters, earthquakes or terror attacks. At first we discuss the deployment scenario and the signal processing principles applied for cooperative sensor node localization and imaging of the propagation environment. Then, we describe the architecture of the UWB MIMO channel sounder. Finally, a measurement example is demonstrated.
System-level performance evaluation of MMSE MIMO turbo equalization techniques using measurement data
Mariella Särestöniemi (University of Oulu, Finland); Tadashi Matsumoto (CWC - Oulu, Finland); Christian Schneider (Technische Universität Ilmenau, Germany); Reiner Thomä (TU-Illmenau, Germany)
In this paper, system-level performance of MMSE turbo MIMO equalization techniques is evaluated in realistic scenarios. Soft cancellation and minimum mean squared error filtering (SC/MMSE) turbo equalization and its complexity-reduced version, turbo equalized diversity, are considered. Furthermore, another version of equalized diversity, turbo equalized diversity with common SC/MMSE, which exploits the transmit diversity and coding gain through cross-wise iterations over the decoding branches, is evaluated. The multi-dimensional channel sounding measurement data used for the simulations consists of snapshots measured in different channel conditions in terms of spatial and temporal properties. The system-level assessment is in terms of outage probabilities of performance figures such as bit and frame error rates, obtained by evaluating their cumulative probability densities using the field measurement data. It is found that the receivers considered in this paper can all provide reasonable system-level performance. However, the turbo equalized diversity receiver is slightly more sensitive to the channel conditions than the original SC/MMSE equalizer. It is also found that the performance gain obtained from the cross-wise iteration over the decoding branches in the turbo equalized diversity with common SC/MMSE technique is significant.
Widely Linear MMSE Transceiver for Real-Valued Sequences over MIMO Channel
Davide Mattera (Università degli Studi di Napoli Federico II, Italy); Luigi Paura (Università di Napoli Federico II, Italy); Fabio Sterle (University of Naples Federico II, Italy)
Joint design of the precoder and the decoder (say, transceiver) for multiple-input/multiple-output (MIMO) channels is considered and, in particular, the existing procedure for the design of the linear transceiver according to the minimum-mean-square-error (MMSE) criterion is extended to the more general case where the transceiver resorts to widely linear (WL) processing rather than linear processing. WL filters linearly and independently process the real and imaginary parts of the input signals, and they are usually employed to trade off a limited increase in computational complexity against performance gains when the input signals are circularly variant. For this reason, we propose to resort to WL processing in the synthesis of the MIMO transceiver when real-valued data streams have to be transmitted. The performance analysis shows significant performance advantages of the proposed WL-MMSE MIMO transceiver with respect to the linear one.

### Wed.6.4: TOA and DOA Estimation - 4 papers

Room: Room 4
Chair: Visa Koivunen (Helsinki University of Technology, Finland)
Time-delay Estimation of Signals in Nonstationary Random Noise via Stationarization and Wigner Distribution-based Approach
Hiroshi Ijima (Kyoto Institute of Technology, Japan); Akira Ohsumi (Kyoto Institute of Technology, Japan); Satoshi Yamaguchi (Kyoto Institute of Technology, Japan)
An effective method is proposed in this paper for estimating the unknown time-delay of a signal received corrupted by nonstationary random noise. The keys to the method are the stationarization of the nonstationary observation data and the introduction of a Wigner distribution-based maximum likelihood function. The method is tested by simulations to show its efficacy.
Time-of-arrival estimation under impulsive noise for wireless positioning systems
Hacioglu Isa (Yeditepe University, Turkey); Frederic Kerem Harmanci (Bogazici University, Turkey); Emin Anarim (Bogazici University, Turkey); Hakan Delic (Bogazici University, Turkey)
Forward-Backward modified Fractional Lower Order Moment-MUSIC (FB-FLOM-MUSIC), a high resolution spectral estimation algorithm, is proposed for time of arrival (TOA) estimation under a non-Gaussian noise model that represents the impulsive outliers which occur in indoor wireless channels. FB-FLOM-MUSIC is designed for the alpha-stable model of impulsive noise, and the peaks of its pseudo-spectra are used to estimate the impulse response of the channel, which gives the TOA of each path's impulse in the channel. The first arriving path of the multi-path channel is assumed to be the line-of-sight (LOS) component, and its delay is accepted as the TOA estimate. Simulation results indicate that FLOM-MUSIC clearly outperforms S0S-MUSIC in impulsive noise and that FB-FLOM-MUSIC provides more precision at the expense of a slight loss of accuracy in moderately dispersed and highly impulsive noise.
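The core FLOM-MUSIC idea can be sketched in a DOA setting: replace the sample covariance by a fractional lower-order moment matrix, then apply the usual MUSIC subspace scan. Everything below is an illustrative assumption (a half-wavelength ULA, a Gaussian-plus-outliers stand-in for alpha-stable noise, no forward-backward averaging):

```python
import numpy as np

rng = np.random.default_rng(0)
M, N, p = 6, 2000, 1.2                  # sensors, snapshots, FLOM order p < alpha
true_doa = 20.0                          # degrees

def steer(deg):                          # half-wavelength ULA steering vector
    return np.exp(-1j * np.pi * np.arange(M) * np.sin(np.deg2rad(deg)))

s = rng.standard_normal(N) + 1j * rng.standard_normal(N)
noise = 0.1 * (rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N)))
noise[rng.random((M, N)) < 0.01] *= 50.0          # occasional impulsive outliers
X = np.outer(steer(true_doa), s) + noise

# FLOM matrix C[i,j] = E[x_i |x_j|^(p-2) conj(x_j)]: robust covariance stand-in
C = X @ (np.abs(X) ** (p - 2) * np.conj(X)).T / N
U, _, _ = np.linalg.svd(C)
En = U[:, 1:]                                     # noise subspace (one source)

angles = np.arange(-90, 91)
spectrum = [1.0 / np.linalg.norm(En.conj().T @ steer(a)) ** 2 for a in angles]
doa_hat = angles[int(np.argmax(spectrum))]
```

Choosing p below the noise's characteristic exponent keeps the moment matrix finite under heavy tails, which is why FLOM-MUSIC survives impulsive noise that breaks covariance-based MUSIC.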
Time-delay estimation using Farrow-based fractional-delay FIR filters: filter approximation vs. estimation errors
This paper provides error analysis regarding filter approximation errors versus estimation errors when utilizing Farrow-based fractional-delay filters for time-delay estimation. Further, a new technique is introduced which works on batches of samples and utilizes the Newton-Raphson technique for finding the minimum of the corresponding cost function.
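The fractional-delay filters in question can be sketched with Lagrange interpolation, whose tap values are polynomials in the delay d; the Farrow structure factors exactly those polynomials into fixed subfilters so the delay can be tuned at run time. A minimal sketch (cubic order, illustrative test tone):

```python
import numpy as np

def lagrange_fd_taps(d, order=3):
    """FIR taps h_m(d) of an order-N Lagrange fractional-delay filter.
    Each tap is a polynomial in d -- the quantity a Farrow structure
    evaluates with fixed subfilters."""
    m = np.arange(order + 1)
    taps = np.ones(order + 1)
    for k in range(order + 1):
        mask = m != k
        taps[mask] *= (d - k) / (m[mask] - k)
    return taps

fs = 100.0
n = np.arange(200)
x = np.sin(2 * np.pi * 3.0 * n / fs)        # 3 Hz test tone
D = 1.4                                      # total delay in samples
h = lagrange_fd_taps(D)                      # [-0.064, 0.672, 0.448, -0.056]
y = np.convolve(x, h)[: len(x)]
ref = np.sin(2 * np.pi * 3.0 * (n - D) / fs) # analytically delayed tone
err = np.max(np.abs(y[10:] - ref[10:]))      # ignore the filter start-up
```

The residual `err` is the filter approximation error the paper analyzes; the estimation error then comes on top of it when such filters drive a cost-function search (e.g. Newton-Raphson) for the delay.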
DOA estimation based on cross-correlation by two-step particle filtering
Mitsunori Mizumachi (Kyushu Institute of Technology, Japan); Norikazu Ikoma (Kyushu Institute of Technology, Japan); Katsuyuki Niyada (Kyushu Institute of Technology, Japan)
This paper proposes a two-step particle filtering method for achieving robust DOA estimation under noisy environments. The proposed method aims at fusing the advantages of the traditional cross-correlation (CC) and generalized (whitened) cross-correlation (WCC) methods: the CC method takes power information into account, while the WCC method yields a sharp peak in the cross-correlation function. We regard rectified CC and WCC functions as likelihoods. The two-step filtering is carried out on a two-dimensional, spectro-spatial state space. Experimental results under noisy and slightly reverberant environments show that the proposed method is superior to the CC and WCC methods in both accuracy and stability.
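The two correlation functions that would feed such a filter can be sketched with FFT-based correlation, where PHAT weighting plays the role of the whitening in WCC; the test signal and delay are illustrative:

```python
import numpy as np

def gcc_delay(x, y, phat=True):
    """Delay (in samples) of y relative to x from the peak of the
    (optionally whitened) cross-correlation function."""
    n = 2 * len(x)
    S = np.fft.rfft(y, n) * np.conj(np.fft.rfft(x, n))
    if phat:
        S /= np.maximum(np.abs(S), 1e-12)   # whitening -> sharp peak (WCC)
    cc = np.fft.irfft(S, n)
    lags = np.concatenate((np.arange(len(x)), np.arange(-len(x), 0)))
    return lags[int(np.argmax(cc))]

rng = np.random.default_rng(1)
s = rng.standard_normal(1024)
x1 = s
x2 = np.roll(s, 5) + 0.1 * rng.standard_normal(1024)  # delayed + noisy copy
d_wcc = gcc_delay(x1, x2, phat=True)     # whitened (sharp-peak) estimate
d_cc = gcc_delay(x1, x2, phat=False)     # plain CC (keeps power information)
```

Treating the rectified CC and WCC curves as likelihoods, as the abstract describes, then amounts to weighting DOA particles by the correlation values at the corresponding lags.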

### Wed.4.4: Speech coding - 4 papers

Room: Sala Onice
Chair: Bhaskar Rao (University of California, San Diego, USA)
An Adaptive Equalizer for Analysis-By-Synthesis Speech Coders
Mark Jasiuk (Motorola, USA); Tenkasi Ramabadran (Motorola, USA)
An equalizer to enhance the quality of reconstructed speech from an analysis-by-synthesis speech coder, e.g., a CELP coder, is described. The equalizer makes use of the set of short-term predictor parameters normally transmitted from the speech encoder to the decoder. In addition, the equalizer computes a matching set of parameters from the reconstructed speech. The function of the equalizer is to undo the computed set of characteristics from the reconstructed speech and impose the set of desired characteristics represented by the transmitted parameters. Design steps for the equalizer and its implementation in both the time and frequency domains are described. Experimental results of applying the equalizer to the output of a standard coder, viz., the EVRC (Enhanced Variable Rate Coder) operating at half-rate (4000 bps), are presented. Objective evaluation using an ITU-recommended voice quality tool shows that the equalizer can significantly enhance the quality of the reconstructed speech.
SCELP: Low Delay Audio Coding with Noise Shaping based on Spherical Vector Quantization
Hauke Krueger (RWTH Aachen University, Germany); Peter Vary (RWTH Aachen University, Germany)
In this contribution a new wideband audio coding concept is presented that provides good audio quality at bit rates below 3 bits per sample with an algorithmic delay of less than 10 ms. The new concept is based on the principle of Linear Predictive Coding (LPC) in an analysis-by-synthesis framework, as known from speech coding. A spherical codebook is used for quantization at bit rates higher than those of low-bit-rate speech coding, for improved performance on audio signals. For superior audio quality, noise shaping is employed to mask the coding noise. In order to reduce the computational complexity of the encoder, the analysis-by-synthesis framework has been adapted to the spherical codebook to enable a very efficient excitation vector search procedure. The codec principle can be adapted to a large variety of application scenarios. In terms of audio quality, the new codec outperforms ITU-T G.722 at the same bit rate of 48 kbit/s and a sample rate of 16 kHz.
Bandwidth Efficient Mixed Pseudo Analogue - Digital Speech Transmission
Carsten Hoelper (RWTH Aachen University, Germany); Peter Vary (RWTH Aachen University, Germany)
Today's speech coding and transmission systems are either analogue or digital, with a strong shift from analogue to digital systems during the last decades. In this paper, both digital and analogue schemes are combined for the benefit of saving transmission bandwidth and complexity, and of improving the achievable speech quality at any given signal-to-noise ratio (SNR) on the channel. The combination is achieved by transmitting analogue samples of the unquantized residual signal of a linear predictive digital filter. The new system, Mixed Analogue-Digital (MAD) transmission, is applied to narrowband speech as well as to wideband speech. MAD transmission over a channel modeled by additive white Gaussian noise (AWGN) is compared to the GSM Adaptive Multi-Rate speech codec mode 12.2 kbit/s (Enhanced Full-Rate Codec), which uses a comparable transmission bandwidth if channel coding is included.
Embedding Side Information into a Speech Codec Residual
Nicolas Chetry (Queen Mary, University of London, United Kingdom); Mike Davies (Queen Mary, University of London, United Kingdom)
We introduce a technique for embedding side information into a speech codec residual. While conserving backward compatibility with existing decoders, we describe how information can be hidden in the speech long-term residual signal when it is encoded using a uniform or quasi-uniform quantiser. The method consists of embedding multiple parity bits at the quantiser level in a configuration that minimises the distortion over the whole sub-frame. The system has been evaluated quantitatively in terms of MOS and subjectively using "double blind" listening tests with the GSM-FR speech codec. It has been found that data can be embedded without severe perceptual degradation of the signal quality. Beyond this particular application to speech coding, it is shown how simple parity-check techniques can be developed to transparently transmit any binary data alongside the main encoded signal by using such a joint quantisation and embedding scheme.

### Wed.3.4: Audio Watermarking - 4 papers

Room: Sala Verde
Chair: Jianguo Huang (College of Marine Engineering, Northwestern Polytechnical University, Xi'an, China)
Digital audio watermarking for QoS assessment of MP3 music signals
Francesco Benedetto (University of Roma Tre, Italy); Gaetano Giunta (University of Roma Tre, Italy); Alessandro Neri (University of Roma Tre, Italy); C. Belardinelli (University of Roma Tre, Italy)
Digital watermarking of multimedia contents has become a very active research area over the last several years. In this work, we propose an audio watermarking signal processing technique to provide a quality assessment of the received audio signal after a coding/transmission process. Specifically, a fragile watermark is hidden in an MP3-like host audio transport stream (MPEG-1 Layer III) using a spread-spectrum approach. At the receiving side, the watermark is extracted and compared to its original counterpart. Since the alterations endured by the watermark are likely to be suffered by the MP3 file as well, as they follow the same communication link (including coder and transport connection), the watermark's degradation can be used to estimate the overall alterations endured by the entire MP3 data. The Quality of Service assessment is based on the evaluation of the mean-square error between the estimated and the actual watermark. The proposed technique has been designed for application to mobile multimedia communication systems. The results obtained through our simulation trials confirm the validity of such an approach.
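The QoS principle can be sketched in toy form: a small pseudo-noise watermark rides on the host signal, and the mean-square error of the recovered watermark grows with the channel degradation. Informed extraction (host known at the tester) is assumed here purely for brevity; the paper's scheme works blindly on the MP3 stream itself:

```python
import numpy as np

rng = np.random.default_rng(2)
N, alpha = 4096, 0.05
host = rng.standard_normal(N)              # stand-in for an audio frame
w = rng.choice([-1.0, 1.0], N)             # fragile spread-spectrum watermark
marked = host + alpha * w                  # embedding

def watermark_mse(received, host, w, alpha):
    """Degradation of the recovered watermark mirrors what the channel
    did to the whole stream (informed extraction for illustration)."""
    w_hat = (received - host) / alpha
    return float(np.mean((w_hat - w) ** 2))

mse_clean = watermark_mse(marked, host, w, alpha)
mse_degraded = watermark_mse(marked + 0.02 * rng.standard_normal(N),
                             host, w, alpha)
```

A rising watermark MSE at the receiver thus serves as a proxy for the degradation of the co-transmitted audio, without needing the original audio for the quality estimate.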
A Hybrid Pre-Whitening Technique for Detection of Additive Spread Spectrum Watermarks in Audio Signals
Krishna Kumar S. (Centre for Development of Advanced computing, India); Thippur Venkat Sreenivas (Indian Institute of Science, India)
Pre-whitening techniques are employed in blind correlation detection of additive spread spectrum watermarks in audio signals to reduce the host signal interference. A direct deterministic whitening (DDW) scheme is derived in this paper from the frequency domain analysis of the time domain correlation process. Our experimental studies reveal that the Savitzky-Golay whitening (SGW), which is otherwise inferior to the DDW technique, performs better when the audio signal is predominantly lowpass. The novelty of this paper lies in exploiting the complementary nature of the two whitening techniques to obtain a hybrid whitening (HbW) scheme. In the hybrid scheme the DDW and SGW techniques are selectively applied, based on short-time spectral characteristics of the audio signal. The hybrid scheme extends the reliability of watermark detection to a wider range of audio signals.
Speech processing in the watermarked domain: application in adaptive acoustic echo cancellation
Imen Marrakchi (university, Tunisia); Turki Monia (National School of Engineers, Tunisia); Sonia Larbi (University, Tunisia); Jaidane Meriem (U2S, Tunisia); Gael Mahé (University, France)
Audio watermarking, or embedding information in a host signal, was originally used for digital copyright protection. Like audio coding, watermarking is progressively being brought into audio processing applications. In this paper, we investigate some benefits of watermarking in signal processing. We focus here on a generic application: adaptive Acoustic Echo Cancellation (AEC). The proposed Watermarked AEC (WAEC) is based on a coupling of two adaptive filters. The first one extracts a rough estimate of the echo response to the known stationary watermark embedded in the speech signal. This estimate constitutes the reference signal for the second adaptive filter. Driven by the known watermark, the second adaptive filter then estimates the actual echo path. The goal here is to drive the estimation of the echo path by the watermark itself, in order to take advantage of its optimal properties (whiteness and stationarity). As expected, the proposed WAEC exhibits better transient and steady-state performance than the classical one. These results follow from the fact that the second adaptive filter deals with much more stationary signals than the first one.
A fingerprinting system for musical content
Lahouari Ghouti (Queen's University Belfast, United Kingdom); Ahmed Bouridane (Queen's University, United Kingdom)
Digital multimedia content (especially audio) is becoming a major part of the average computer user experience. Large digital audio collections of music, audio and sound effects are also used by the entertainment, music, movie and animation industries. Therefore, the need for identification and management of audio content grows with the increasingly widespread availability of such media virtually anytime and anywhere over the Internet. In this paper, we propose a novel framework for musical content fingerprinting using balanced multiwavelets (BMW). The framework for generating robust perceptual fingerprint (or hash) values is described. The generated fingerprints are used for identifying, searching, and retrieving audio content from large digital music databases. Furthermore, we illustrate, through extensive computer simulation, the robustness of the proposed framework to efficiently represent musical content and withstand several signal processing attacks and manipulations.

Room: Auditorium

## 8:40 AM - 11:00 AM

### Thu.2.1: MIMO Transmission Techniques (Special session) - 7 papers

Chair: Wolfgang Utschick (Munich University of Technology, Germany)
Design of robust linear dispersion codes based on imperfect CSI for ML receivers
Svante Bergman (KTH, Sweden); Bjorn Ottersten (Royal Institute of Technology, Sweden)
This paper concerns the design of codes for multi-input multi-output communication systems. The transmission scheme utilizes imperfect channel state information (CSI) in the design, assuming maximum-likelihood detection is employed at the receiver. It is argued that channel-diagonalizing codes are not robust to imperfections in the CSI. A robust non-diagonalizing code with good minimum-distance separation between received codewords is proposed. The code is very suitable for systems with high data rates due to its low design complexity. Numerical results show that the proposed code outperforms a state-of-the-art diagonalizing precoder.
MIMO-ISI Channel equalization -- Which Price We Have to Pay for Causality
Holger Boche (Heinrich-Hertz-Institut für Nachrichtentechnik Berlin GmbH, Germany); Volker Pohl (Technical University Berlin, Germany)
In the investigation of equalizers and precoders for multiple-input multiple-output systems with intersymbol interference, completely new phenomena appear if causality of these filters is required. For both transmit and receive filters, the stability norm is an important performance measure, connected to several performance criteria in communications. The paper shows that the optimal causal precoder with minimal stability norm is, in general, linear but time variant. It is time invariant only if the channel is flat fading. Moreover, it is shown that there exist causal precoders or equalizers for which the stability norm grows exponentially with the minimum number of transmit and receive antennas of the MIMO system, whereas the stability norm of the optimal non-causal inverse is always independent of the dimensions of the MIMO system.
Multi-user Topologies in Multi-Antenna Wireless Networks
Christian Peel (Brigham Young University, USA); Lee Swindlehurst (Brigham Young University, USA); Wolfgang Utschick (Munich University of Technology, Germany)
Recent results on the throughput achievable with wireless networks have not fully considered multiple antennas and multi-user links. We introduce these topics by deriving the transport capacity of a multiple-access channel with CSI available only at the receiver. We also give the transport capacity of the multiple-access and broadcast channels with full CSI. We use these topologies at the physical layer of an ad-hoc network to obtain achievable distance-weighted rate regions for a multi-antenna wireless network. These regions are obtained by maximizing the distance-weighted rate over all combinations of uplink and downlink topologies, respectively. A Nash-equilibrium-seeking algorithm is used to optimize the transmit covariance matrices for the centralized topology search. Distributed algorithms for topology creation are also presented, which utilize only local channel state information and are compared with multi-user versions of slotted ALOHA. Numerical examples show the benefit of uplink topologies over point-to-point and downlink topologies, especially at high transmit power, high numbers of antennas, and a large number of nodes.
An Efficient Feedback Scheme with Adaptive Bit Allocation for Dispersive MISO Systems
Leonid Krasny (Ericsson Inc., USA); Dennis Hui (Ericsson Inc., USA)
In this paper, we focus on a cellular system with M transmit antennas at the base station (BTS) and one receive antenna at the mobile (i.e. an M-input/single-output (MISO) channel), where the BTS commands each mobile to transmit its channel state information back to the BTS. Our main result is a specific feedback scheme with adaptive bit allocation, where a binary tree-structured vector quantizer is used to separately quantize different channel taps at different levels of quantization. We show that the proposed feedback scheme exploits the different statistics of the channel taps and achieves performance very close (within 1 dB) to that obtained with perfect channel knowledge at the BTS.
Randomized distributed Multi-antenna Systems in Multi-path channels
Anna Scaglione (Cornell University, USA); Birsen Sirkeci (Cornell University, USA); Stefan Geirhofer (Cornell University, USA); Lang Tong (Cornell University, USA)
A great deal of research on MIMO systems now focuses on distributed designs that bring the advantages of co-located antenna systems to nodes with a single RF front end by leveraging other nodes' resources. Yet most schemes considered assume that the nodes encode their signals in a fashion that requires at least knowledge of the number of nodes involved and, in many cases, the specific encoding rule to use. Hence, while the hardware resources are distributed, the proposed protocols are not. Recently, we proposed schemes that are totally decentralized and, using random matrix theory, studied the diversity attainable through these schemes in flat fading channels. The goal of this paper is to show that our general randomized designs are suitable for frequency-selective channels and can easily be adapted to block space-time precoding schemes that are known to harvest diversity not only from the multiple antennas but also from the multi-path.
Linear transmitter design for the MISO compound channel with interference
Ami Wiesel (Technion-Israel Institute of Technology, Israel); Yonina Eldar (Technion---Israel Institute of Technology, Israel); Shlomo Shamai (The Technion, Israel)
We consider the problem of linear transmitter design in multiple input single output (MISO) compound channels with interference. The motivation for this channel model is communication in MISO broadcast channels with partial channel state information (CSI). Since the compound capacity is unknown for this model, we consider optimal linear transmit methods for maximizing the data rate. We provide efficient numerical solutions with and without perfect CSI. We then discuss the optimality of beamforming and the existence of a saddle point in the compound channel.
On the Duality of MIMO Transmission Techniques for Multiuser Communications
Wolfgang Utschick (Munich University of Technology, Germany); Michael Joham (Technische Universität München, Germany)
Since the downlink has a difficult algebraic structure, it is more convenient to switch to the dual uplink problem which has better algebraic properties. We consider the uplink/downlink duality with respect to the mean square error (MSE), where our system model is as general as possible, i.e., we allow not only for correlations of the symbols and noise, but also model the precoders, the channels, and the equalizers as compact linear operators. We show that a duality with respect to the MSE per user is preferable to the state-of-the-art stream-wise MSE duality, since the uplink/downlink transformation of the user-wise MSE duality has a considerably lower complexity than the one of the stream-wise MSE duality. Interestingly, the uplink/downlink transformation for the total MSE duality is trivial, i.e., a simple weighting with a scalar common for all filters has to be computed. We apply the uplink/downlink duality to derive the operator form of the well-known transmit Wiener filter (TxWF).

### Thu.5.1: Cross-layer Optimization for Wireless Communication Systems (Special session) - 7 papers

Chair: Holger Boche (Fraunhofer Institute for Telecommunications HHI, Germany)
Sub-carrier SNR Estimation at the Transmitter for Reduced Feedback OFDMA
Patrick Svedman (Royal Institute of Technology, Sweden); David Hammarwall (Royal Institute of Technology, Sweden); Bjorn Ottersten (Royal Institute of Technology, Sweden)
In multiuser OFDMA FDD systems with resource allocation based on the instantaneous channel quality of the users, the feedback overhead can be very large. In this paper, a method to significantly reduce this feedback is proposed. The idea is to let the users feed back the channel quality (the SNR in this paper) of only a sub-set of their strongest sub-carriers. The SNRs on the other sub-carriers are instead estimated from the fed back values. We derive the MMSE estimator of the SNR of a sub-carrier, which uses two fed back SNRs as input. As a comparison, we also study the performance of the LMMSE estimator as well as spline interpolation. Numerical results show that the LMMSE estimator tends to underestimate the SNR compared to the other two estimators, whereas the interpolation tends to overestimate the SNR. System simulations including adaptive modulation and packet losses indicate that the MMSE estimator is the best choice in practice.
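The feedback reduction can be sketched as follows: only the SNRs of the strongest subcarriers are fed back, and the remaining ones are filled in by interpolation. Plain linear interpolation is used here purely as a stand-in for the paper's MMSE/LMMSE/spline estimators, and all sizes are illustrative:

```python
import numpy as np

rng = np.random.default_rng(3)
K = 64                                    # subcarriers
# smooth, correlated SNR profile in dB across the band
snr_db = 10 + 5 * np.convolve(rng.standard_normal(K + 7),
                              np.ones(8) / 8, mode="valid")

kept = np.sort(np.argsort(snr_db)[-8:])   # feed back the 8 strongest only
est = np.interp(np.arange(K), kept, snr_db[kept])  # estimate the rest
feedback_saving = 1 - len(kept) / K       # fraction of values not fed back
```

Because the retained subcarriers are the strongest ones, simple interpolation between them tends to overestimate the remaining SNRs, which is the bias the abstract reports for the interpolator and which motivates the MMSE estimator.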
Distributed Algorithms for Maximum Throughput in Wireless Networks
Yufang Xi (Yale University, USA); Edmund Yeh (Yale University, USA)
The Maximum Differential Backlog (MDB) control policy of Tassiulas and Ephremides has been shown to adaptively maximize the stable throughput of multi-hop wireless networks with random traffic arrivals and queueing. The practical implementation of the MDB policy in wireless networks with mutually interfering links, however, requires the development of distributed optimization algorithms. Within the context of CDMA-based multi-hop wireless networks, we develop a set of node-based scaled gradient projection power control algorithms which solve the MDB optimization problem in a distributed manner with low communication overhead. As these algorithms require time to converge to a neighborhood of the optimum, the MDB policy must be implemented with delayed queue state information. To this end, we show that the MDB policy with delayed queue state information remains throughput optimal.
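The differential-backlog weights at the heart of the MDB policy can be sketched as follows; the three-node network, queue values, and function names are illustrative assumptions, not taken from the paper.

```python
def mdb_weights(queues, links):
    """Maximum Differential Backlog link weights: each link (i, j) is
    weighted by its largest queue differential over commodities,
    floored at zero; the policy then serves links so as to maximize
    the weighted sum of link rates."""
    weights = {}
    for i, j in links:
        diff = max(queues[i][c] - queues[j][c] for c in queues[i])
        weights[(i, j)] = max(diff, 0)
    return weights

# three-node toy network with a single commodity 'd'
queues = {'a': {'d': 5}, 'b': {'d': 2}, 'c': {'d': 0}}
w = mdb_weights(queues, [('a', 'b'), ('b', 'c'), ('c', 'a')])
```

The distributed power control algorithms of the paper then optimize the transmit powers to maximize the resulting weighted sum of rates.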
Autonomous QoS Control for Wireless Mesh and Ad-hoc Networks - the Generalized Lagrangean Approach
Marcin Wiczanowski (Technical University of Berlin, Germany); Slawomir Stanczak (Fraunhofer German-Sino Lab for Mobile Comm., Germany); Holger Boche (Fraunhofer Institute for Telecommunications HHI, Germany)
We consider the combined problem of performance optimization and interference control in wireless mesh and ad-hoc networks. Relying on a specific construction of the generalized Lagrangean function, we propose a simple primal-dual unconstrained iteration that converges to a (local) optimum under arbitrary performance objectives. We present a decentralized implementation of such a routine in linear networks.
A framework for resource allocation in OFDM broadcast systems
Gerhard Wunder (Heinrich-Hertz-Institut, Germany); Thomas Michel (German-Sino Mobile Communications Institute (MCI), Germany); Chan Zhou (Mobile Communication Lab for Mobile Communications MCI, HHI, Germany)
In this paper we consider resource allocation for OFDM broadcast channels (BC), where we introduce several scheduling policies in an ideal information-theoretic context and analyze their performance in terms of throughput, stability and delay, dependent on system parameters such as user numbers and channel parameters. We provide algorithms to solve the stated scheduling problems using Lagrangian and duality theory. These solutions can be used as a general benchmark for specific approaches, and they also provide some intuition for good suboptimal solutions. Additionally, all these strategies are compared to a practical setup tested in a complete simulation chain (physical and medium access layer) according to the 3GPP HSDPA specification.
On the interplay between scheduling, user distribution, CSI, and performance measures in cellular downlink
Eduard Jorswieck (Heinrich-Hertz-Institut, Germany); Mats Bengtsson (Royal Institute of Technology, Sweden); Bjorn Ottersten (Royal Institute of Technology, Sweden)
The cross-layer design of future communication systems jointly optimizes multiple network layers with the goal of boosting the system-wide performance. This trend brings together the physical and the medium access layers. For the joint optimization of these two lowest layers, it is necessary to understand and relate their terms and concepts. In this paper, we study the interplay between four of them, namely channel state information from the link level, scheduling and user distribution from the system level, and different performance measures from both levels. The envisaged scenario is cellular downlink transmission. The average sum rate describes the long-term performance from a system perspective. The optimal scheduling policy, as well as the impact of the user distribution, can be clearly characterized as a function of the channel state information (CSI). In contrast, the short-term system performance, which is described by the outage sum rate, shows a varying behavior in terms of the optimal scheduling policy and as a function of the user distribution. The analysis is performed by employing Majorization theory for comparing different user distributions. Three different CSI scenarios, namely the uninformed base, the perfectly informed base, and the base with covariance knowledge, are studied. Finally, the extension to two less well-known performance measures, the maximum throughput and the delay-limited sum rate, is addressed.
Cross-layer Solutions to Performance Problems in VoIP over WLANs
Federico Maguolo (University of Padova, Italy); Francesco De Pellegrini (Universita di Padova, Italy); Andrea Zanella (University of Padova, Italy); Michele Zorzi (University of Padova, Italy)
The design of WLANs was meant to extend Ethernet LANs in the most transparent way, but no particular mechanism was deployed to support real-time applications natively. At present VoIP calls are becoming customary, and IEEE802.11 WLANs must face the provision of guaranteed quality of service. In practice, QoS should be provided somehow a posteriori on top of the existing standard. In this paper, we address some concerns about the efficiency of WLANs for VoIP provision already noted in the literature and analyze possible solutions to increase the voice capacity of DCF IEEE802.11 WLANs. We consider two candidate solutions, the VA [1] and the M-M [2] cross-layer schemes. The efficiency of these mechanisms is evaluated in order to assess the performance gain compared to existing solutions. We provide extensive simulation results, proving that the advantage is significant, while requiring minor changes compared to the current IEEE802.11 standard.
Coordination and resilience in wireless adhoc and sensor networks
Leandros Tassiulas (University of Thessaly, Greece)

### Thu.1.1: MUSCLE: Recognizing Humans and Human Behaviour in Video (Invited special session) - 7 papers

Room: Auditorium
Chair: Ovidio Salvetti (ISTI-CNR, Italy)
Active Video-Surveillance Based on Stereo and Infrared Imaging
Gabriele Pieri (CNR, Inst. of Information Science and Technologies, Italy); Ovidio Salvetti (ISTI-CNR, Italy)
Video-surveillance is a highly topical and critical issue at the present time. Within this topic, we address the problem of first identifying moving people in a scene through motion detection techniques, and subsequently categorising them in order to identify humans and track their movements. The use of stereo cameras, coupled with infrared vision, makes it possible to apply this technique to images acquired under different and variable conditions, and allows a priori filtering based on the characteristics of such images to give evidence to objects emitting a higher radiance (i.e., higher temperature).
László Havasi (Peter Pazmany Catholic University, Hungary)
The elimination of strong shadows in outdoor scenes containing human activity is addressed in this paper. The main contribution of the introduced method is the integration of geometrical information into the shadow detection process. This novel approach takes into account the collinearity of shadow and light direction and is complemented by a simple colour-based pre-filtering. The final classification step is carried out via a Bayesian iteration scheme which is general enough to handle further characteristics of the problem: weak shadow and reflection.
Human Face Detection in Video Using Edge Projections
Mehmet Turkan (Bilkent University, Turkey); Ibrahim Onaran (Bilkent University, Turkey); Enis Çetin (Bilkent University, Turkey)
In this paper, a human face detection algorithm in images and video is presented. After determining possible face candidate regions using colour information, each region is filtered by a high-pass filter of a wavelet transform. In this way, edges of the region are highlighted, and a caricature-like representation of candidate regions is obtained. Horizontal, vertical, filter-like and circular projections of the region are used as feature signals in support vector machine (SVM) based classifiers. It turns out that our feature extraction method provides good detection rates with SVM based classifiers.
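A minimal sketch of the projection features described above, assuming a simple first-difference high-pass filter in place of the paper's wavelet filtering; the toy region and all names are invented for illustration.

```python
import numpy as np

def edge_projections(region):
    """Horizontal and vertical projections of a high-pass-filtered
    (edge) image region, usable as feature signals for an SVM.
    A first-difference filter along the rows stands in for the
    wavelet high-pass filtering of the paper."""
    edges = np.abs(np.diff(region.astype(float), axis=0))  # vertical gradient
    horiz = edges.sum(axis=1)   # horizontal projection (one value per row)
    vert = edges.sum(axis=0)    # vertical projection (one value per column)
    return horiz, vert

region = np.zeros((8, 8))
region[3, :] = 1.0              # a single horizontal edge
h, v = edge_projections(region)
```

A horizontal edge shows up as sharp peaks in the horizontal projection and as a flat vertical projection, which is what makes such projections discriminative features.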
Human Model and Motion Based 3D Action Recognition in Multiple View Scenarios
Cristian Canton (Universitat Politecnica de Catalunya, Spain); Josep Casas (UPC - Technical University of Catalonia, Spain); Montse Pardas (Technical University of Catalonia, Spain)
This paper presents a novel view-independent approach to the recognition of human gestures by several people in low-resolution sequences from multiple calibrated cameras. In contrast to other multi-ocular gesture recognition systems, which classify a fusion of features coming from different views, our system first performs data fusion (a 3D representation of the scene) and then feature extraction and classification. Motion descriptors introduced by Bobick et al. for 2D data are extended to 3D, and a set of features based on 3D invariant statistical moments is computed. A simple ellipsoid body model is fitted to the incoming 3D data to capture in which body part the gesture occurs, thus increasing the recognition ratio of the overall system and generating a more informative classification output. Finally, a Bayesian classifier is employed to perform recognition over a small set of actions. Results are provided showing the effectiveness of the proposed algorithm in a SmartRoom scenario.
Multimodal Fusion by Adaptive Compensation for Feature Uncertainty with Application to Audiovisual Speech Recognition
Athanassios Katsamanis (National Technical University of Athens, Greece); George Papandreou (National Technical University Athens, Greece); Vassilis Pitsikalis (National Technical University of Athens, Greece); Petros Maragos (National Technical University of Athens, Greece)
In pattern recognition one usually relies on measuring a set of informative features to perform tasks such as regression or classification. While the accuracy of feature measurements heavily depends on changing environmental conditions, studying the consequences of this fact has received relatively little attention to date. In this work we explicitly take into account uncertainty in feature measurements and we show in a rigorous probabilistic framework how the models used for classification should be adjusted to compensate for this effect. Our approach proves to be particularly fruitful in multimodal fusion scenarios, such as audio-visual speech recognition, where multiple streams of complementary features are integrated. For such applications, provided that an estimate of the measurement noise uncertainty for each feature stream is available, we show that the proposed framework leads to highly adaptive multimodal fusion rules which are widely applicable and easy to implement. We further show that previous multimodal fusion methods relying on stream weights fall under our scheme under certain assumptions; this provides novel insights into their applicability for various tasks and suggests new practical ways for estimating the stream weights adaptively. Preliminary experimental results in audio-visual speech recognition demonstrate the potential of our approach.
Visual speech detection using mouth region intensities
Spyridon Siatras (Aristotle University of Thessaloniki, Greece); Nikos Nikolaidis (Aristotle University of Thessaloniki, Greece); Ioannis Pitas (ARISTOTLE UNIVERSITY OF THESSALONIKI, Greece)
In recent research efforts, the integration of visual cues into speech analysis systems has been proposed with favorable results. This paper introduces a novel approach to lip activity and visual speech detection. We argue that the large deviation and increased values of the number of low-intensity pixels that the mouth region of a speaking person exhibits can be used as visual cues for detecting speech. We describe a statistical algorithm, based on detection theory, for the efficient characterization of speaking and silent intervals in video sequences. The proposed system has been tested on a number of video sequences with encouraging experimental results. Potential applications of the proposed system include speech intent detection, speaker determination and semantic video annotation.
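The low-intensity cue can be sketched as follows; the dark-pixel threshold and the synthetic frames are illustrative assumptions, not the paper's detection-theoretic algorithm.

```python
import numpy as np

def speaking_score(mouth_frames, dark_thresh=60):
    """Mean and standard deviation of the per-frame count of
    low-intensity pixels in the mouth region; a speaking mouth
    (opening and closing) yields larger, more variable counts.
    The threshold value is an illustrative assumption."""
    counts = np.array([(f < dark_thresh).sum() for f in mouth_frames])
    return counts.mean(), counts.std()

# synthetic frames: a silent (bright, static) mouth vs. a moving one
closed = [np.full((10, 20), 120) for _ in range(5)]
speaking = [np.full((10, 20), 120) for _ in range(5)]
for k, f in enumerate(speaking):
    f[:k + 1, :] = 10           # growing dark open-mouth area
m_sil, s_sil = speaking_score(closed)
m_spk, s_spk = speaking_score(speaking)
```

Thresholding these two statistics over an interval is the intuition behind classifying it as speaking or silent.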
Cooperative Background Modelling using Multiple Cameras Towards Human Detection in Smart-Rooms
Jose-Luis Landabaso (Technical University of Catalunya, Spain); Montse Pardas (Technical University of Catalonia, Spain)
Shape-from-Silhouette (SfS) is the common approach taken to reconstruct the Visual Hull, which is later used in 3D trackers and body fitting techniques. The Visual Hull is defined as the intersection of the visual cones formed by the back-projection of several 2D binary silhouettes into 3D space. Silhouettes are usually extracted using a foreground classification process, which is performed independently in each camera view. In this paper we present a novel approach in which 2D foreground classification is achieved with 3D agreement in a Bayesian framework. In our approach, instead of classifying images and reconstructing later, we simultaneously reconstruct and classify in the 3D space.
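For context, here is a minimal voxel-based Shape-from-Silhouette sketch of the baseline the paper improves on; the toy orthographic "cameras" stand in for calibrated projection matrices, and all names are ours.

```python
import numpy as np

def visual_hull(voxels, cameras, silhouettes):
    """Shape-from-Silhouette: keep a voxel only if it projects inside
    the foreground silhouette of every camera. `cameras` are
    projection functions mapping a 3D point to pixel coordinates
    (hypothetical stand-ins for calibrated projection matrices)."""
    keep = []
    for v in voxels:
        inside_all = True
        for cam, sil in zip(cameras, silhouettes):
            u, w = cam(v)
            if not (0 <= u < sil.shape[0] and 0 <= w < sil.shape[1]
                    and sil[u, w]):
                inside_all = False
                break
        if inside_all:
            keep.append(v)
    return keep

# two orthographic toy cameras, one looking along z and one along x
sil_xy = np.zeros((4, 4), bool); sil_xy[1:3, 1:3] = True
sil_yz = np.zeros((4, 4), bool); sil_yz[1:3, 1:3] = True
cams = [lambda p: (p[0], p[1]), lambda p: (p[1], p[2])]
voxels = [(x, y, z) for x in range(4) for y in range(4) for z in range(4)]
hull = visual_hull(voxels, cams, [sil_xy, sil_yz])
```

The paper's point is that the per-view silhouettes fed to such a reconstruction are better classified jointly, in 3D, rather than independently per camera.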

### Thu.6.1: Signal Processing in Radar Imaging (Special session) - 7 papers

Room: Room 4
Chair: Victor Chen (US Naval Research Laboratory, USA)
Chair: Marco Martorella (University of Pisa, Italy)
A slope-based technique for motion estimation and optimum time selection for ISAR imaging of ship targets
Debora Pastina (University of Rome "La Sapienza", Italy); Chiara Spina (Selex Airborne and Sensor Systems - Galileo Avionica Spa, Italy); Angelo Aprile (Selex Airborne and Sensor Systems - Galileo Avionica Spa, Italy)
The focus of this paper is on optimum time selection and angular motion estimation for ship ISAR imaging. The aim is to select proper imaging intervals and to estimate ship angular motion in order to obtain high-quality top-view or side-view ship images suitable for processing by classification/identification procedures. To this purpose, a slope-based ISAR algorithm is proposed, able to estimate the time instants better suited for top- or side-view image formation and the vertical/horizontal components of the rotational motion for image scaling. The performance of the proposed ISAR technique is investigated against simulated data under different ship model, ship motion, acquisition geometry and background conditions. Results obtained by applying the proposed technique to live ISAR data prove the effectiveness of the proposed approach.
Luzhou Xu (University of Florida, USA); Jian Li (University of Florida, USA); Petre Stoica (Uppsala University, Sweden)
We investigate several adaptive techniques for a multiple-input multiple-output (MIMO) radar system. By transmitting independent waveforms via different antennas, the echoes due to targets at different locations are linearly independent of each other, which allows the direct application of many adaptive techniques. We discuss several adaptive radar imaging algorithms, which can provide excellent estimation accuracy for both target locations and target amplitudes, and high robustness to array calibration errors. To reject false peaks due to strong jammers, we also propose a generalized likelihood ratio test (GLRT). As shown by the numerical examples, the number of targets can be estimated accurately by using the GLRT, and an accurate description of the target scenario can be obtained by combining the adaptive radar imaging algorithms and the GLRT technique.
Optimised Image Autofocusing for Polarimetric ISAR
Marco Martorella (University of Pisa, Italy)
The use of full polarisation enables multi-channel SAR processing for enhancing both imaging and classification capabilities. In the field of Inverse Synthetic Aperture Radar (ISAR), very little has been investigated, especially from the point of view of multi-channel ISAR image formation. In this paper, the authors define an optimised image auto-focusing technique that exploits full polarisation information. Theory and simulation results are provided in the paper.
Ljubisa Stankovic (University of Montenegro, Serbia and Montenegro); Thayananthan Thayaparan (Radar Applications and Space Technology, Defence Research and Development Canada, Ottawa, Canada, Canada); Milos Dakovic (University of Montenegro, Serbia and Montenegro); Vesna Popovic (University of Montenegro, Serbia and Montenegro)
A commonly used technique for SAR and ISAR signal analysis is the two-dimensional Fourier transform. Moving targets in the SAR case, or maneuvering targets in the ISAR case, induce a Doppler shift and Doppler spread in the returned signal, producing blurred or smeared images. Standard techniques for solving these problems are motion compensation and time-frequency analysis, both of which are computationally intensive. Here, we present a numerically simple S-method based approach that belongs to the time-frequency techniques. Besides the basic S-method, we present a signal-adaptive form and a two-dimensional form of this method. They improve the readability of the radar images, which will be demonstrated on simulated SAR and ISAR setups.
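The basic (non-adaptive) S-method can be sketched directly from its standard definition, SM(n, k) = Σ_i F(n, k+i) F*(n, k−i) over a window of width 2L+1, where F is the STFT; the matrix sizes and the check against the spectrogram (the L = 0 case) are our own illustration.

```python
import numpy as np

def s_method(stft, L=2):
    """S-method from a complex STFT matrix (time x frequency):
    SM(n, k) = sum_{i=-L..L} F(n, k+i) * conj(F(n, k-i)).
    The i = 0 term alone is the spectrogram; the cross terms
    concentrate the distribution toward the Wigner distribution
    without its cross-term artifacts."""
    n_t, n_f = stft.shape
    sm = np.zeros((n_t, n_f))
    for k in range(n_f):
        for i in range(-L, L + 1):
            if 0 <= k + i < n_f and 0 <= k - i < n_f:
                sm[:, k] += np.real(stft[:, k + i] * np.conj(stft[:, k - i]))
    return sm

rng = np.random.default_rng(1)
F = rng.standard_normal((8, 16)) + 1j * rng.standard_normal((8, 16))
spec = s_method(F, L=0)   # L = 0 reduces to the spectrogram |F|^2
```

Increasing L trades spectrogram-like robustness for Wigner-like concentration, which is what makes the method attractive for blurred SAR/ISAR images.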
Atomic Decomposition based Detector for Denoising in ISAR Imaging
We have investigated two different strategies to improve the quality of ISAR images corrupted by Gaussian noise. The images are generated using a time-frequency technique known as Atomic Decomposition (AD). The first strategy is a classical denoising technique based on an AD detector developed for signal detection in noise. The second technique separates the atoms extracted through AD, by their parameters, into two classes: atoms coming from noise and atoms coming from signal components. Compared to the first one, the second technique requires greater knowledge of the signal components.
A Novel Focusing Technique for ISAR in case of Uniform Rotation Rate
A method for the correction of Migration Through Resolution Cells (MTRC) in ISAR (Inverse Synthetic Aperture Radar) is addressed here. The new technique requires neither knowledge of the target motion parameters, nor their estimation, nor an optimization to maximize (minimize) an image focusing indicator. It assumes that the target rotation rate is uniform and that the direction of the effective rotation vector does not change during the Coherent Processing Interval (CPI). The algorithm corrects the MTRC in two phases: Slant Range Rotation Compensation (SRRC) and Cross Range Rotation Compensation (CRRC), where CRRC is based on an extension of the Phase Difference (PD) method. The effectiveness of the proposed technique is verified with simulated (MIG-25 aircraft) and real (sailboat) radar data and compared with the standard Range-Doppler Algorithm (RDA).
Signal Processing for Target Motion Estimation and Image Formation in Radar Imaging of Moving Targets
Trygve Sparr (FFI, Norway)
Radar imaging of moving targets is often called ISAR (Inverse Synthetic Aperture Radar). Imaging of moving targets generally consists of two separate tasks: estimation and correction of target motion, and the explicit image formation. Both tasks must be implemented with great care, as it is the coherent processing of the received radar signal phase that makes imaging possible. Of the two tasks, the motion compensation is often the most difficult, as many radar targets move in a complicated and fairly unpredictable manner. When the motion is complicated, the imaging step can be a challenge as well. The reason is that the target's 3D structure begins to matter, and projection plane effects may cause blurred images even for well-designed ISAR processors.

### Thu.4.1: Speech Recognition and Understanding I - 7 papers

Room: Sala Onice
Chair: Walter Kellermann (University Erlangen-Nuremberg, Germany)
Hands-free speech recognition using a reverberation model in the feature domain
Armin Sehr (University of Erlangen-Nuremberg, Germany); Marcus Zeller (University of Erlangen-Nuremberg, Multimedia Communications and Signal Processing, Germany); Walter Kellermann (University Erlangen-Nuremberg, Germany)
A novel approach for robust hands-free speech recognition in highly reverberant environments is proposed. Unlike conventional HMM-based concepts, it implicitly accounts for the statistical dependence of successive feature vectors due to the reverberation. This property is attained by a combined acoustic model consisting of a conventional HMM, modeling the clean speech, and a reverberation model. Since the HMM is independent of the acoustic environment, it needs to be trained only once using the usual Baum-Welch re-estimation procedure. The training of the reverberation model is based on a set of room impulse responses for the corresponding acoustic environment and involves only a negligible computational effort. Thus, the recognizer can be adapted to new environments with moderate effort. In a simulation of an isolated digit recognition task in a highly reverberant room, the proposed method achieves a 60% reduction of the word error rate compared to a conventional HMM trained on reverberant speech, at the cost of an increased decoding complexity.
Recognising verbal content of emotionally coloured speech
Theo Athanaselis (Institute for language and speech processing, Greece); Stelios Bakamidis (ILSP, Greece); Ioannis Dologlou (ILSP, Greece)
Recognising the verbal content of emotional speech is a difficult problem, and recognition rates reported in the literature are in fact low. Although knowledge in the area has been developing rapidly, it is still limited in fundamental ways. The first issue is that little of the spectrum of emotionally coloured expressions has been studied. The second is that most research on speech and emotion has focused on recognising the emotion being expressed, and not on the classic Automatic Speech Recognition (ASR) problem of recovering the verbal content of the speech. Read speech, and non-read speech in a careful style, can be recognized with accuracy higher than 95% using state-of-the-art speech recognition technology. Including information about prosody improves the recognition rate for emotions simulated by actors, but its relevance to the freer patterns of spontaneous speech is unproven. This paper shows that the recognition rate for emotionally coloured speech can be improved by using a language model with an increased representation of emotional utterances.
A step further to objective modeling of conversational speech quality
Marie Guéguin (Université de Rennes1, France); Régine Le Bouquin-Jeannès (University of Rennes 1, France); Gérard Faucon (University of Rennes 1, France); Valérie Gautier-Turbin (France Télécom R&D, France); Vincent Barriac (France Télécom R&D, France)
A new approach to modeling conversational speech quality is proposed in this paper. It has been applied to some conditions of echo and delay tested during a subjective test designed to study the relationship between conversational speech quality and talking, listening and interaction speech qualities. A multiple linear regression analysis is performed on the subjective conversational mean opinion scores (MOS) given by subjects, with the talking and listening MOS as predictors. The comparison between estimated and subjective conversational scores shows the validity of the proposed approach for the conditions assessed in this subjective test. The subjective talking and listening quality scores are then replaced with objective talking and listening quality scores provided by objective models. This new conversational objective model, fed with signals recorded during the subjective test, presents a correlation of 0.938 with subjective conversational quality scores under these conditions of impairment.
Support Vector Machines for Continuous Speech Recognition
Although Support Vector Machines (SVMs) have proved to be very powerful classifiers, they still have some problems that make their application to speech recognition difficult, and most attempts to apply them are combined HMM-SVM solutions. In this paper we present a pure SVM-based continuous speech recognizer, using the SVM to make decisions at the frame level and a Token Passing algorithm to obtain the chain of recognized words. We consider a connected digit recognition task with both the digits themselves and the number of digits unknown. The experimental results show that, although not yet practical due to computational cost, such a system can achieve better recognition rates than traditional HMM-based systems (96.96% vs. 96.47%). To overcome the computational problems, techniques such as Mega-GSVCs can be used in the future.
Combining Categorical and Primitives-Based Emotion Recognition
Michael Grimm (Universitaet Karlsruhe, Germany); Emily Mower (University of Southern California, USA); Kristian Kroschel (Universitaet Karlsruhe, Germany); Shrikanth Narayanan (University of Southern California, USA)
This paper brings together two current trends in emotion recognition; feature-based categorical classification and primitives-based dynamic emotion estimation. In this study, listeners rated a database of acted emotions using the three-dimensional emotion primitive space of valence, activation, and dominance. The emotion primitives were estimated by a fuzzy logic classifier using acoustic features. The evaluation results were also used to calculate class representations of the emotion categories happy, angry, sad, and neutral in the three-dimensional emotion space. Speaker-dependent variations of the emotion clusters in the 3D emotion space were observed for happy sentences in particular. The estimated emotion primitives were classified into the four classes using a kNN classifier. The recognition rate was 83.5% and thus significantly better than a direct classification from acoustic features. This study also provides a comparison of estimation errors of emotion primitives estimation and classification rates of emotion classification.
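A minimal kNN vote in the three-dimensional primitive space might look as follows; the training points, labels, and cluster layout are invented for illustration and are not the paper's database or classifier settings.

```python
import numpy as np

def knn_classify(train_x, train_y, x, k=3):
    """k-nearest-neighbour majority vote in the 3D (valence,
    activation, dominance) emotion-primitive space."""
    d = np.linalg.norm(train_x - x, axis=1)       # Euclidean distances
    nearest = np.argsort(d)[:k]                   # k closest training points
    labels, counts = np.unique(train_y[nearest], return_counts=True)
    return labels[np.argmax(counts)]              # majority label

# toy clusters: happy = positive valence/activation, sad = negative
train_x = np.array([[0.8, 0.6, 0.5], [0.7, 0.7, 0.4], [0.9, 0.5, 0.6],
                    [-0.7, -0.5, -0.4], [-0.8, -0.6, -0.5], [-0.6, -0.7, -0.3]])
train_y = np.array(['happy', 'happy', 'happy', 'sad', 'sad', 'sad'])
pred = knn_classify(train_x, train_y, np.array([0.75, 0.55, 0.5]))
```

Classifying in the low-dimensional primitive space, rather than directly from acoustic features, is what the paper reports as the more accurate route.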
Parallel Training Algorithms for Continuous Speech Recognition, Implemented in a Message Passing Framework
Vladimir Popescu (CLIPS Laboratory, Institut National Polytechnique de Grenoble, Romania); Corneliu Burileanu (University "Politehnica" of Bucharest, Romania); Monica Rafaila (University "Politehnica" of Bucharest, Romania); Ramona Calimanescu (University "Politehnica" of Bucharest, Romania)
A way of improving the performance of continuous speech recognition systems with respect to training time is presented. The gain in performance is accomplished using multiprocessor architectures that provide a certain processing redundancy. Several ways to achieve the announced performance gain, without affecting precision, are pointed out. More specifically, parallel programming features are added to training algorithms for continuous speech recognition systems based on hidden Markov models (HMMs). Several parallelizing techniques are analyzed and the most effective ones are taken into consideration. Performance tests, with respect to the size of the training database and to the convergence factor of the training algorithms, give hints about the pertinence of using parallel processing for HMM training. Finally, further developments in this respect are suggested.
Unsupervised Speaker Change Detection for Broadcast News Segmentation
Kasper Jørgensen (Technical University of Denmark, Denmark); Lasse Mølgaard (Technical University of Denmark, Denmark); Lars Kai Hansen (Technical University of Denmark, Denmark)
This paper presents a speaker change detection system for news broadcast segmentation based on a vector quantization (VQ) approach. The system does not make any assumption about the number of speakers or speaker identity. The system uses mel frequency cepstral coefficients, and change detection is done using the VQ distortion measure, evaluated against two other statistics, namely the symmetric Kullback-Leibler (KL2) distance and the so-called 'divergence shape distance'. First-level alarms are further tested using the VQ distortion. We find that the false alarm rate can be reduced without significant losses in the detection of correct changes. We furthermore evaluate the generalizability of the approach by testing the complete system on an independent set of broadcasts, including a channel not present in the training set.
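One of the comparison statistics, the symmetric KL (KL2) distance between Gaussians fitted to two adjacent MFCC windows, can be sketched as follows; we assume diagonal covariances for simplicity, and the data are synthetic.

```python
import numpy as np

def kl2(mu1, var1, mu2, var2):
    """Symmetric Kullback-Leibler (KL2) distance between two
    diagonal-covariance Gaussians; summed over feature dimensions.
    The log terms of the two directed KL divergences cancel."""
    d2 = (mu1 - mu2) ** 2
    return 0.5 * np.sum((var1 + d2) / var2 + (var2 + d2) / var1 - 2.0)

rng = np.random.default_rng(2)
a = rng.standard_normal((200, 13))            # MFCC window, one speaker
b = rng.standard_normal((200, 13)) + 3.0      # a very different speaker
same = kl2(a.mean(0), a.var(0), a.mean(0), a.var(0))
diff = kl2(a.mean(0), a.var(0), b.mean(0), b.var(0))
```

A peak in this statistic between sliding windows marks a candidate speaker change, which the paper's system then verifies with the VQ distortion.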

### Thu.3.1: Blind Source Separation - 7 papers

Room: Sala Verde
Chair: Kevin Knuth (University at Albany (SUNY), USA)
Separation of instantaneous mixtures of cyclostationary sources and application to digital communication signals
Pierre Jallon (Université de Marne-la-Vallée, France); Antoine Chevreuil (Université de Marne-la-Vallée, France)
In this contribution, we provide a simple condition on the statistics of the source signals ensuring that the Comon algorithm [2], originally designed for stationary data, achieves the separation of an instantaneous mixture of cyclostationary sources. The above condition is analyzed for digital communications signals and is (semi-analytically) proved to be fulfilled.
Dependent Component Analysis as a tool for blind Spectral Unmixing of remote sensed images
Cesar Caiafa (Laboratorio de Sistemas Complejos, Facultad de Ingeniería, Universidad de Buenos Aires, Argentina); Emanuele Salerno (Istituto di Scienza e Tecnologie dell'Informazione - CNR, Italy); Araceli Noemi Proto (Comisión de Investigaciones Científicas de la Prov. de Buenos Aires, Argentina); Lorenza Fiumi (Laboratorio Aereo Ricerche Ambientali, Istituto sull'Inquinamento Atmosferico - CNR, Italy)
In this work, we present a blind technique for the estimation of the per-pixel material abundances (endmembers) in hyperspectral remote-sensed images. Classical spectral unmixing techniques require knowledge of the existing materials and their spectra, which is a problem when no prior information is available. Techniques based on independent component analysis have not proved very efficient, owing to the strong dependence among the material abundances always found in real data. We approach the problem of blind endmember separation by applying the MaxNG algorithm, which is capable of separating even strongly dependent signals. We also present a minimum-mean-squared-error method to estimate the unknown scale factors by exploiting the source constraint. The results shown here have been obtained from both synthetic and real data. The synthetic images have been generated by a noisy linear mixture model with real, spatially variable endmember spectra. The real images have been captured by the MIVIS airborne imaging spectrometer. Our results show that MaxNG is able to separate the endmembers successfully if a linear mixing model holds true, under low-noise and reduced spectral-variability conditions.
A Computationally Affordable Implementation of An Asymptotically Optimal BSS Algorithm for AR Sources
Petr Tichavsky (Academy of Sciences of the Czech Republic, Czech Republic); Eran Doron (Tel-Aviv University, Israel); Arie Yeredor (Tel-Aviv University, Israel); Jan Nielsen (Institute of Information Theory and Automation, Czech Republic)
The second-order blind identification (SOBI) algorithm for separation of stationary sources was proved to be useful in many biomedical applications. This paper revisits the so called weights-adjusted variant of SOBI, known as WASOBI, which is asymptotically optimal (in separating Gaussian parametric processes), yet prohibitively computationally demanding for more than 2-3 sources. A computationally feasible implementation of the algorithm is proposed, which has a complexity not much higher than SOBI. Excluding the estimation of the correlation matrices, the post-processing complexity of SOBI is O(d^4M), where d is the number of the signal components and M is the number of covariance matrices involved. The additional complexity of our proposed implementation of WASOBI is O(d^6+d^3M^3) operations. However, for WASOBI, the number M of the matrices can be significantly lower than that of SOBI without compromising performance. WASOBI is shown to significantly outperform SOBI in simulation, and can be applied, e.g., in the processing of low density EEG signals.
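For context, the lagged covariance matrices on which both SOBI and WASOBI operate can be estimated as follows; the shapes, lags, and synthetic data are illustrative, and the joint diagonalization step itself is omitted.

```python
import numpy as np

def lagged_covariances(X, lags):
    """Sample correlation matrices R(tau) = E[x(t) x(t + tau)^T]
    over the given lags; X is (components x samples), zero-mean
    assumed. SOBI/WASOBI then (approximately) jointly diagonalize
    these M matrices to recover the mixing; WASOBI additionally
    weights them to approach the optimal estimate."""
    T = X.shape[1]
    return [X[:, :T - tau] @ X[:, tau:].T / (T - tau) for tau in lags]

rng = np.random.default_rng(3)
X = rng.standard_normal((2, 1000))     # two zero-mean components
Rs = lagged_covariances(X, [0, 1, 2])  # M = 3 covariance matrices
```

The paper's complexity figures are counted relative to this set: SOBI's post-processing is O(d^4 M) in the number d of components and M of matrices, while the proposed WASOBI implementation adds O(d^6 + d^3 M^3).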
A new approach for Sparse Decomposition and Sparse Source Separation
Arash Ali-Amini (Sharif University of Technology, Iran); Massoud Babaie-Zadeh (Sharif University of Technology, Iran); Christian Jutten (INP de Grenoble, France)
We introduce a new approach for sparse decomposition, based on a geometrical interpretation of sparsity. By sparse decomposition we mean finding sufficiently sparse solutions of underdetermined linear systems of equations. This is discussed in the context of Blind Source Separation (BSS); our problem is then underdetermined BSS, where there are fewer mixtures than sources. The proposed algorithm is based on minimizing a family of quadratic forms, each measuring the distance of the solution set of the system to one of the coordinate subspaces (i.e. coordinate axes, planes, etc.). The performance of the method is then compared to the minimal 1-norm solution, obtained using linear programming (LP). It is observed that the proposed algorithm, in its simplest form, performs nearly as well as LP, provided that the average number of active sources at each time instant is less than unity, and the computational efficiency of this simple form is much higher than that of LP. For less sparse sources, performance gains over LP may be obtained at the cost of increased complexity, which slows the algorithm at higher dimensions. This suggests that LP is still the algorithm of choice for high-dimensional, moderately sparse problems. The advantage of our algorithm is to provide a trade-off between complexity and performance.
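The minimal 1-norm baseline that the abstract compares against can be computed with an off-the-shelf LP solver via the standard split `s = u - v` with `u, v >= 0`. A minimal sketch (the toy mixing matrix and variable names are ours, not the paper's):

```python
import numpy as np
from scipy.optimize import linprog

def l1_min_solution(A, x):
    """Minimal 1-norm solution of the underdetermined system A s = x,
    via the LP reformulation s = u - v with u, v >= 0."""
    m, n = A.shape
    c = np.ones(2 * n)                  # minimize sum(u) + sum(v) = ||s||_1
    A_eq = np.hstack([A, -A])           # A u - A v = x
    res = linprog(c, A_eq=A_eq, b_eq=x,
                  bounds=[(0, None)] * (2 * n), method="highs")
    u, v = res.x[:n], res.x[n:]
    return u - v

# 2 mixtures of 4 sources, one active source at this time instant
rng = np.random.default_rng(0)
A = rng.standard_normal((2, 4))
s_true = np.array([0.0, 1.3, 0.0, 0.0])    # sparse source vector
x = A @ s_true
s_hat = l1_min_solution(A, x)
```

The returned `s_hat` is a feasible solution of `A s = x` with 1-norm no larger than that of any other solution, which is exactly the LP reference the proposed geometric algorithm is measured against.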
Blind Separation of Time-Varying Signal Mixtures Using Sparseness and Zadeh's Transform
Ran Kaftory (Technion - Israel Institute of Technology, Israel); Yehoshua Zeevi (Technion - Israel Institute of Technology, Electrical Engineering Department, Israel)
We consider the general problem of blindly separating time-varying mixtures. Physical phenomena, such as varying attenuation and the Doppler effect, can be represented as special cases of a time-varying mixing model. This model can be considered as a linear mixing of time-varying attenuated-and-delayed versions of fixed channel distortions. In this special case, we use Zadeh's transform to project the signals into the time-frequency domain. In this domain, a sparse source distribution highlights geometric properties of the mixing coefficients. These coefficients can be used, in turn, to invert the mixing system and thereby recover the time-varying filtered versions of the original sources.
A closed form solution for the blind separation of two sources from two
Adel Belouchrani (Ecole Nationale Polytechnique, Algiers, Algeria); El-Bey Bourennane (Université de Bourgogne, LE2I, Dijon, France); Karim Abed-Meraim (Dept TSI, Télécom Paris, France)
In this paper, we present a specific algorithm for the blind identification of a two-input two-output system. A closed-form solution for the blind identification of the system is derived by exploiting the temporal coherence properties of the input sources. By exploiting the inherent indeterminacies of blind processing, a simplified version is derived, making the algorithm computationally cheaper and more suitable for hardware implementation. The weights of the zero-forcing blind separator are then deduced. The performance of the proposed solutions with respect to signal-to-noise ratio (SNR) and sample size is reported in the simulation section.
Underdetermined Source Separation for Colored Sources
Stefan Winter (NTT Communication Science Laboratories, Japan); Walter Kellermann (University Erlangen-Nuremberg, Germany); Hiroshi Sawada (NTT communication Science Laboratories, Japan); Shoji Makino (NTT Communication Science Laboratories, Japan)
So far, nearly all approaches to underdetermined blind source separation (BSS) assume independent, identically distributed (i.i.d.) sources. They completely ignore the redundancy inherent in the temporal structure of colored sources such as speech signals. Instead, we propose multivariate models based on the multivariate Student's t and multivariate Gaussian distributions and investigate their potential for underdetermined BSS. We provide a simple yet effective filter for recovering the sources involving their autocorrelations, as a basis for further advances in BSS. The experimental results suggest that underdetermined source separation can be reduced to the separation of the sources' autocorrelations.

Room: Auditorium

## 11:20 AM - 1:00 PM

### Thu.2.2: Computationally Efficient Algorithms - 5 papers

Chair: Kutluyil Dogancay (University of South Australia, Australia)
An Efficient Method for Multi-Tap Long Term Predictor (LTP) Transcoding: Application To ITU-T G.723.1
Claude Lamblin (France Telecom, France); Mohamed Ghenania (France Telecom, France)
Network interconnections cause interoperability problems between different speech coding formats. Today, tandem (decoding/re-encoding) is commonly used in communication chains. To overcome the drawbacks of tandeming (complexity, quality degradation, delay), intelligent solutions have been proposed to efficiently transcode CELP coder parameters. This paper focuses on the transcoding of CELP Long Term Predictor (LTP) parameters. First, a survey of LTP transcoding methods is given. Then a novel method is described for multi-tap LTP transcoding: the multi-tap gain vector codebook search is restricted to ordered subsets selected from the LTP gain parameters of the first coder. This technique is applied to intelligent transcoding towards ITU-T G.723.1. It achieves the same quality as tandeming while strongly reducing complexity.
An Improved Context Adaptive Binary Arithmetic Coder for the H.264/AVC standard
Simone Milani (University of Padova, Italy); Gian Antonio Mian (University of Padova, Italy)
In recent years, the growth of video transmission over wireless channels has created the need for increasingly efficient coding algorithms, capable of coding the video information with a reduced number of bits. Among them, the H.264 coder provides the best performance in terms of video quality and reduced bit rates, thanks to the many enhanced coding solutions included in it. One of the most effective is its adaptive arithmetic coder, which estimates the probability of each syntax element via an elaborate structure of contexts. However, the coder can be significantly improved by exploiting the statistical dependency in the transformed signal. In fact, the DCT coefficients of a single transform block or of a macroblock are correlated with each other. This statistical dependency permits a better estimation of the bit probabilities than the context counters defined by the standard. For this purpose, the probability mass function of the different bit planes in a block of coefficients can be estimated through a graphical model associated with a Directed Acyclic Graph (DAG). Experimental results report that the adoption of a DAG model leads to a 10% reduction of the bit-stream size for a given quality or, alternatively, a quality increment between 0.5 and 1 dB at the same bit rate.
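The per-context probability counters that the paper improves upon can be illustrated with a minimal sketch: each context keeps bit counts and yields an adaptive estimate of P(bit = 1). The add-half (Krichevsky-Trofimov) rule and the context name below are our simplifications, not the standard's exact state machine nor the paper's DAG model:

```python
from collections import defaultdict

class ContextModel:
    """Per-context adaptive estimate of P(bit = 1), using an add-half
    (Krichevsky-Trofimov) rule as a simple stand-in for the standard's
    context counters."""
    def __init__(self):
        self.counts = defaultdict(lambda: [0, 0])   # context -> [n0, n1]

    def prob_one(self, ctx):
        n0, n1 = self.counts[ctx]
        return (n1 + 0.5) / (n0 + n1 + 1.0)

    def update(self, ctx, bit):
        self.counts[ctx][bit] += 1

m = ContextModel()
for bit in (1, 1, 0, 1):              # bits observed under one context
    m.update("coeff_sig_flag", bit)
p = m.prob_one("coeff_sig_flag")      # (3 + 0.5) / (4 + 1) = 0.7
```

A fresh context starts at probability 0.5; the paper's DAG model replaces this per-context independence with dependencies across bit planes of correlated DCT coefficients.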
The Block LMS Algorithm and its FFT-Based Fast Implementation: A New Efficient Realization Using Block Floating Point Arithmetic
Mrityunjoy Chakraborty (Indian Institute of Technology, Kharagpur, India); Shaik Rafiahamed (Indian Institute of Technology, Kharagpur, India)
An efficient scheme is proposed for implementing the block LMS algorithm in a block floating point framework that permits processing of data over a wide dynamic range, at a processor complexity and cost as low as those of a fixed point processor. The proposed scheme adopts appropriate formats for representing the filter coefficients and the data. Using these, together with a new upper bound on the step size, update relations for the filter weight mantissas and exponent are developed, taking care that no overflow occurs and that quantities which are already very small are not multiplied directly. It is further shown how the mantissas of the filter coefficients, and also the filter output, can be evaluated faster by suitably modifying the approach of the fast block LMS algorithm.
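For reference, the underlying block LMS recursion that the paper realizes in block floating point can be sketched in plain floating point: the weights are updated once per block, from the gradient accumulated over that block. This is a generic textbook sketch (signal lengths, step size and channel are invented), not the paper's mantissa/exponent scheme:

```python
import numpy as np

def block_lms(x, d, L, B, mu):
    """Plain floating-point block LMS: L filter weights, one update per
    block of B samples using the accumulated error gradient."""
    w = np.zeros(L)
    y = np.zeros(len(x))
    for k in range(L - 1, len(x) - B + 1, B):
        grad = np.zeros(L)
        for n in range(k, k + B):
            u = x[n - L + 1:n + 1][::-1]   # regressor, newest sample first
            y[n] = w @ u
            grad += (d[n] - y[n]) * u      # accumulate gradient over block
        w += (mu / B) * grad               # single update per block
    return w, y

# identify an unknown 4-tap FIR channel from noise-free data
rng = np.random.default_rng(2)
x = rng.standard_normal(20000)
h = np.array([0.5, -0.2, 0.1, 0.05])
d = np.convolve(x, h)[:len(x)]
w, y = block_lms(x, d, L=4, B=8, mu=0.2)
```

The FFT-based fast block LMS computes the same block convolution and gradient in the frequency domain; the paper's contribution is carrying this recursion out with block floating point mantissas and a shared exponent.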
Using GPU for fast block-matching
Sebastien Mazare (Eurécom, France); Renaud Pacalet (Enst, France); Jean-Luc Dugelay (Eurécom, France)
On the one hand, basic PCs include by default increasingly powerful GPUs (Graphics Processing Units) that sometimes even outperform the CPU (Central Processing Unit) for specific tasks. On the other hand, video processing is increasingly useful and required for numerous emerging services and applications, but is often too computationally expensive to reach real time. Within the context of GPGPU (General-Purpose GPU), we propose in this paper a possible implementation of a block-matching algorithm. Significant results are obtained in terms of acceleration and quality. Based on our experiments on block matching within the context of video compression, we finally underline some existing limitations of using the GPU beyond graphics for image and video processing, even if the concept remains attractive.
Fast Renormalization for H.264/MPEG4-AVC Arithmetic Coding
Detlev Marpe (Fraunhofer HHI, Germany); Heiner Kirchhoffer (Fraunhofer HHI, Germany); Gunnar Marten (Fraunhofer HHI, Germany)
We propose a fast, standard-compliant realization of the computationally expensive renormalization part of the binary arithmetic coder in H.264/MPEG4-AVC. Our technique allows the time-consuming, bitwise-operating input and output, as well as the bitwise carry-over handling of a conventional implementation, to be replaced with corresponding operations in units of multiple bits. Experimental results demonstrate that the proposed method enables a considerable speed-up of both arithmetic encoding and decoding, in the range of 24 to 53%.

### Thu.5.2: Wavelet Transform for Image Processing and Artificial Vision - 5 papers

Chair: Patrice Abry (Ecole Normale Superieure, Lyon, France)
Automation of pavement surface crack detection with a matched filtering to define the mother wavelet function used
Peggy Subirats (Laboratoire Central des Ponts et Chaussées, France); Jean Dumoulin (Laboratoire Central des Ponts et Chaussées, France); Vincent Legeay (Laboratoire Central des Ponts et Chaussées, France); Dominique Barba (Institut de Recherche en Communications et Cybernétique de Nantes, France)
This paper presents a new approach to automating crack detection on pavement surface images. The method is based on the continuous wavelet transform. In the first step, a 2D continuous wavelet transform is performed at several scales and complex coefficient maps are built. The angle and modulus information are used to keep the significant coefficients. The mother wavelet function is defined using a matched filtering, so the method is self-adapted to the road texture and no user intervention is needed. Then the maximal values of the wavelet coefficients are searched and their propagation through scales is analyzed. Finally, a post-processing step yields a binary image which indicates the presence or absence of cracks on the pavement surface image.
Rotation-Invariant Local Feature Matching with Complex Wavelets
Nick Kingsbury (University of Cambridge, United Kingdom)
This paper describes a technique for using dual-tree complex wavelets to obtain rich feature descriptors of keypoints in images. The main aim has been to develop a method for retaining the full phase and amplitude information from the complex wavelet coefficients at each scale, while presenting the feature descriptors in a form that allows for arbitrary rotations between the candidate and reference image patches.
Wavelet-Constrained Regularization for Disparity Map Estimation
Wided Miled (Université de Marne la vallée, France); Jean-Christophe Pesquet (Univ. Marne la Vallee, France); Michel Parent (INRIA, France)
This paper describes a novel method for estimating dense disparity maps, based on wavelet representations. Within the proposed set theoretic framework, the stereo matching problem is formulated as a constrained optimization problem in which a quadratic objective function is minimized under multiple convex constraints. These constraints arise from the prior knowledge and the observations. In order to obtain a smooth disparity field, while preserving edges, we consider appropriate wavelet based regularization constraints. The resulting optimization problem is solved with a block iterative method which offers great flexibility in the incorporation of several constraints. Experimental results on both synthetic and real data sets show the excellent performance and robustness w.r.t. noise of our method.
Multichannel image deconvolution in the wavelet transform domain
Amel Benazza-Benyahia (SUPCOM Tunis, Tunisia); Jean-Christophe Pesquet (Univ. Marne la Vallee, France)
In this paper, we are interested in restoring blurred multicomponent images corrupted by additive Gaussian noise. The novelty of the proposed approach is two-fold. Firstly, we show how to combine M-band Wavelet Transforms (WT) with Fourier analysis to restore multicomponent images. Secondly, we point out that the multichannel deconvolution procedure benefits from exploiting multivariate regression rules. Simulation experiments carried out on multispectral satellite images indicate the good performance of our method.
Bootstrap for log Wavelet Leaders Cumulant based Multifractal Analysis
Herwig Wendt (ENS Lyon, France); Stéphane Roux (ENS-Lyon, France); Patrice Abry (Ecole Normale Superieure, Lyon, France)
Multifractal analysis, which mostly consists of estimating scaling exponents related to the power-law behaviors of the moments of wavelet coefficients, is becoming a popular tool for empirical data analysis. However, little is known about the statistical performance of such procedures; notably, despite their major practical importance, no confidence intervals are available. Here, we choose to replace wavelet coefficients with wavelet leaders and to use a log-cumulant based multifractal analysis. We investigate the potential use of the bootstrap to derive confidence intervals for wavelet-leader log-cumulant multifractal estimation procedures. From numerical simulations involving well-known and well-controlled synthetic multifractal processes, we obtain two results of major importance for practical multifractal analysis: we demonstrate that the use of leaders instead of wavelet coefficients brings significant improvements in log-cumulant based multifractal estimation, and we show that accurate bootstrap-designed confidence intervals can be obtained for a single finite-length time series.
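The basic bootstrap mechanism behind such confidence intervals can be sketched with a naive percentile bootstrap on an i.i.d. sample (the estimator, sample and resampling scheme below are our toy choices; the paper's setting involves dependent wavelet-leader sequences and a bootstrap design adapted to that dependence):

```python
import numpy as np

def bootstrap_ci(sample, estimator, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for a statistic computed
    from a single finite-length sample, by resampling with replacement."""
    rng = np.random.default_rng(seed)
    n = len(sample)
    stats = np.array([estimator(sample[rng.integers(0, n, n)])
                      for _ in range(n_boot)])
    return np.quantile(stats, [alpha / 2, 1 - alpha / 2])

rng = np.random.default_rng(1)
data = rng.normal(loc=2.0, scale=1.0, size=500)
lo, hi = bootstrap_ci(data, np.mean)    # 95% CI for the mean
```

The point of the procedure is that the interval is obtained from the single observed series alone, which is exactly what makes it attractive for multifractal estimators whose sampling distributions are analytically intractable.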

### Thu.1.2: Source Localization - 5 papers

Room: Auditorium
Chair: Mustafa Altinkaya (Izmir Institute of Technology, Turkey)
Experimental evaluation of a localization algorithm for multiple acoustic sources in reverberating environments
Fabio Antonacci (Politecnico di Milano, Italy); Diego Saiu (Politecnico di Milano, Italy); Paolo Russo (Politecnico di Milano, Italy); Augusto Sarti (DEI - Politecnico di Milano, Italy); Marco Tagliasacchi (Politecnico di Milano, Italy); Stefano Tubaro (Politecnico di Milano, Italy)
The problem of blind separation of multiple acoustic sources has recently been addressed by the TRINICON framework. By exploiting higher-order statistics, it makes it possible to successfully separate acoustic sources when propagation takes place in a reverberating environment. In this paper we apply TRINICON to the problem of source localization, emphasizing the fact that small localization errors can be achieved even when source separation is not perfectly obtained. Extensive simulations have been carried out to highlight the trade-offs between complexity and localization error at different levels of reverberation.
Information Content-Based Sensor Selection for Collaborative Target Tracking
Tolga Onel (Turkish Navy, Turkey); Cem Ersoy (Bogazici University, Turkey); Hakan Delic (Bogazici University, Turkey)
For target tracking applications, small wireless sensors provide accurate information, since they can be deployed and operated near the phenomenon. These sensing devices have the opportunity to collaborate amongst themselves to improve the target localization and tracking accuracy. A distributed data fusion architecture provides a collaborative tracking framework. Due to the energy constraints of these small sensing and wirelessly communicating devices, a common trend is to put some of them into a dormant state. We adopt a mutual-information-based metric to select the most informative subset of the sensors, achieving a reduction in energy consumption while preserving the desired accuracy of the target position estimate in the distributed data fusion architecture.
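Greedy mutual-information sensor selection can be illustrated with a deliberately simplified scalar-Gaussian sketch: each sensor observes a scalar target state with its own noise variance, the information gain of sensor i is 0.5 ln(1 + p / r_i), and the posterior variance shrinks after each selection. The numbers and the scalar model are invented for illustration; the paper's metric operates on the full tracking posterior:

```python
import numpy as np

def select_sensors(prior_var, noise_vars, k):
    """Greedily pick k sensors maximizing the mutual information between
    a scalar Gaussian state (variance p) and the next measurement, then
    update p <- 1 / (1/p + 1/r_i) after each pick."""
    p = prior_var
    remaining = set(range(len(noise_vars)))
    chosen = []
    for _ in range(k):
        best = max(remaining,
                   key=lambda i: 0.5 * np.log(1 + p / noise_vars[i]))
        remaining.remove(best)
        chosen.append(best)
        p = 1.0 / (1.0 / p + 1.0 / noise_vars[best])
    return chosen, p

# prior variance 4.0; four candidate sensors with different noise levels
chosen, post_var = select_sensors(4.0, [1.0, 0.25, 9.0, 0.5], 2)
```

Sensors not selected can stay dormant, which is the energy saving the abstract refers to.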
Sensor Node Localization via Spatial Domain Quasi-Maximum Likelihood Estimation
Seshan Srirangarajan (University of Minnesota, USA); Ahmed Tewfik (Univ. of Minnesota, USA)
A sensor node localization algorithm for indoor quasi-static sensor environments, using spatial-domain quasi-maximum likelihood (QML) estimation, is presented. A time-of-arrival (TOA) based algorithm is used to obtain "pseudo" range estimates from the base stations to the sensor nodes, and spatial-domain QML estimation then determines the actual sensor location. The algorithm is preceded by a calibration phase during which statistical characterizations of the line-of-sight (LOS) and non-line-of-sight (NLOS) returns are derived. Using a synthesized bandwidth of 2 GHz, a 4-bit analog-to-digital converter (ADC) and a 5-10 dB signal-to-noise ratio (SNR), localization with high accuracy is achieved.
Bias Reduction for The Scan-Based Least-Squares Emitter Localization Algorithm
Kutluyil Dogancay (University of South Australia, Australia); Samuel Drake (Defence Science and Technology Organisation, Australia)
This paper presents bias reduction techniques for the scan-based least-squares emitter localization algorithm. Scan-based emitter localization exploits the constant scan rate of the radar antenna main beam to allow determination of the emitter location by three or more receivers. It does away with high-precision timing requirements for time-of-arrival measurements of intercepted radar beams, and does not require high-sensitivity receivers to pick up the sidelobes of the radar beam pattern. The paper develops a weighted least-squares estimator and an iterative maximum likelihood estimator to overcome the least-squares estimation bias. The improved bias performance is illustrated with simulation examples.
Bearing and range estimation of buried cylindrical shells in presence of sensor phase errors
Zineb Saidi (IRENav Ecole Navale, France); Salah Bourennane (Institut Fresnel, France)
Localization of buried cylindrical shells in the presence of sensor phase errors is presented. The method builds on a technique [5] that we developed for buried-object localization, combining an acoustical model, a focusing operator and the MUSIC method. The main assumption in [5] is exact knowledge of the sensor positions, which is not always satisfied: in practice the sensors move from their original positions during the experiment (a deformed sensor array), which introduces a phase error at each sensor. Correcting these phase errors is necessary to solve the object localization problem. The method estimates the bearings and ranges of all buried objects, as well as the phase error of each sensor in the observing array. The problem is reduced to the minimization of a cost function, whose minimum is sought with the DIRECT (DIviding RECTangles) algorithm. Finally, the performance of the proposed method is validated on experimental data recorded during an underwater acoustics experiment.

### Poster: CDMA Systems and Signals - 9 papers

Room: Poster Area
Chair: Bjorn Ottersten (Royal Institute of Technology, Sweden)
A family of spatiotemporal chaotic sequences outperforming Gold ones in asynchronous DS-CDMA systems
Soumaya Meherzi (SYSCOM Laboratory & LSS, Tunisia); Sylvie Marcos (Laboratoire des Signaux et Systèmes, France); Safya Belghith (SYSCOM Laboratory/ ENIT, Tunisia)
In this work, a new family of spatiotemporal chaotic codes is proposed as an alternative to the Gold codes conventionally used in asynchronous DS/CDMA systems. In addition to their synchronization advantage over the temporal-only chaotic codes, these new codes are shown to have improved performance compared to the Gold codes. The performance criteria used in this study are Multiple Access Interference (MAI), Signal to Noise Ratio (SNR) and Bit Error Probability.
Performance of the Super Stable Orbits based Spreading Sequences in a DS-CDMA System with a MMSE Receiver
Zouhair Ben Jemaa (Signaux and Systems Laboratory, Supélec, France); Sylvie Marcos (Laboratoire des Signaux et Systems, Supélec, CNRS UMR8506, France); Safya Belghith (SYSCOM Laboratory/ ENIT, Tunisia)
It has been shown in [12] that quantized selected Super Stable Orbits (SSO) of the logistic map achieve better performance, in terms of the Average Interference Parameter (AIP) criterion, than Gold sequences when a conventional or decorrelator receiver is used in a DS-CDMA system. In this paper we analyse the performance of the MMSE receiver when these orbits are used. The motivation is that the performance of the MMSE receiver in terms of error rate depends on the AIP. We find that the performance of the MMSE receiver can be improved by considering non-classical spreading sequences.
Bayesian multiuser detection based on a network of NLMS
Bessem Sayadi (LSS-SUPELEC-CNRS, France); Sylvie Marcos (Laboratoire des Signaux et Systèmes, France)
The Network of Kalman Filters (NKF) structure was recently proposed to perform optimal Bayesian symbol-by-symbol estimation in the multiuser detection context. By approximating the prediction error covariance matrix on each branch by a constant diagonal one, we show in this paper that the NKF structure reduces to a particular network of normalized LMS (NLMS) filters with lower computational complexity. The choice of the step-size value is also discussed. To overcome its heuristic choice, we propose a new adaptive step size based on the second-order moment of the estimated symbols; the form of the step size still carries information on the a priori state estimate. The performance of the resulting receiver structure is evaluated by means of computer simulations for very high asynchronous system load in multipath fading channels, and compared to MAP, NKF, MMSE and Rake receivers.
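A single branch of such a network is a standard normalized-LMS filter, whose step size is divided by the instantaneous regressor energy. A minimal generic sketch (the fixed step size and the toy identification setup are ours; the paper's adaptive step-size rule is not reproduced):

```python
import numpy as np

def nlms(x, d, L, mu=0.5, eps=1e-8):
    """Standard normalized-LMS recursion for an L-tap filter."""
    w = np.zeros(L)
    for n in range(L - 1, len(x)):
        u = x[n - L + 1:n + 1][::-1]        # regressor, newest sample first
        e = d[n] - w @ u                    # a priori error
        w += mu * e * u / (eps + u @ u)     # energy-normalized update
    return w

# identify a 2-tap channel from noise-free data
rng = np.random.default_rng(3)
x = rng.standard_normal(5000)
h = np.array([1.0, -0.5])
d = np.convolve(x, h)[:len(x)]
w = nlms(x, d, L=2)
```

The normalization by `u @ u` is what makes the convergence behaviour insensitive to the input power, which in turn is what allows a network of such filters to approximate the NKF branches cheaply.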
Blind Multiuser Detection by Kurtosis Maximization for Asynchronous Multi-rate DS/CDMA Systems
Chun-Hsien Peng (National Tsing Hua University, Taiwan); Chong-Yung Chi (National Tsing Hua University, Taiwan); Chia-Wen Chang (National Tsing Hua University, Taiwan)
In this paper, Chi and Chen's computationally efficient fast kurtosis maximization algorithm (FKMA) for blind source separation is applied to blind multiuser detection (BMD) for asynchronous multi-rate (variable processing gain or multi-code) DS/CDMA systems. The proposed blind multiuser detection algorithm, referred to as the BMD-FKMA, enjoys the fast convergence rate and computational efficiency of the FKMA. Furthermore, the BMD-FKMA in conjunction with the blind maximum ratio combining algorithm proposed by Chi et al. is considered for multi-rate DS/CDMA systems equipped with multiple receive antennas. Finally, some simulation results are provided to support the efficacy of the proposed BMD-FKMA and the performance as well as complexity improvements over some existing algorithms.
A new LDPC-STB coded MC-CDMA system with SOVA-based decoding and soft-interference cancellation
Luis Paredes (Polytechnical University of Madrid, Spain)
This paper analyzes several interference cancellation schemes applied to LDPC-STB coded MC-CDMA systems with SOVA-based decoding. In these systems, a linear MMSE detector is conventionally used to reduce the interference generated by multipath, multiuser and multiple-antenna propagation. To obtain further performance improvements, a more efficient SOVA-based iterative MMSE scheme is considered. This receiver performs soft-interference cancellation for every user based on a combination of the MMSE criterion and the turbo processing principle. It is shown that these space-time block coded detectors concatenated with LDPC decoding can potentially provide significant capacity enhancements over the conventional matched-filter receiver.
A Turbo Receiver For Wireless MC-CDMA Communications Using Pilot-Aided Kalman Filter-Based Channel Tracking And MAP-EM Demodulator With LDPC Codes
Tung Man Law (The University of Hong Kong, Hong Kong); Shing-Chow Chan (The University of Hong Kong, Hong Kong)
This paper presents a turbo receiver for wireless multi-carrier code division multiple access (MC-CDMA) communications using pilot-aided Kalman filter-based channel tracking and maximum a posteriori expectation-maximization (MAP-EM) demodulator with a soft low-density parity-check (LDPC) decoder. The pilot-aided Kalman filter considerably simplifies the tracking of the time-varying channel, which helps to improve the performance of the MAP-EM demodulator and LDPC decoder in fast fading channels. Simulation results show that the proposed system gives a better bit-error-rate (BER) performance than the conventional turbo receiver, at the expense of slightly lower data rate due to use of pilot symbols. It therefore provides a useful alternative to conventional approaches with a different tradeoff between BER performance, implementation complexity, and transmission bandwidth.
A MC-CDMA Iterative Solution for Broadband over Powerline Communications
Vincent Le Nir (K. U. Leuven, Belgium); Marc Moonen (Katholieke Universiteit Leuven, Belgium)
Power Line Communication (PLC) is foreseen as a potential solution for increasing the throughput of future wireline communication systems. Indeed, the existing infrastructure allows the development of Broadband over Power Lines (BPL) to provide a competing high-speed Internet 'to-the-home' alternative. The frequency selectivity and the impulse noise of the PLC channel call for advanced signal processing techniques. Multi-Carrier Code Division Multiple Access (MC-CDMA) is a promising transmission procedure to mitigate these unfavorable properties, along with a linear iterative receiver to remove Multiple Access Interference (MAI) and Inter Symbol Interference (ISI). This paper focuses on the full description of a MC-CDMA transceiver on a block by block basis over realistic PLC channel models and adequate simulation parameters including spreading, interleaving, Orthogonal Frequency Division Multiplex (OFDM) modulation, linear Minimum Mean Square Error (MMSE) equalizer, Soft Output Viterbi Algorithm (SOVA) and iterative decoding.
Global Calibration of CDMA-Based Arrays
Azibananye Mengot (Imperial College London, United Kingdom); Athanassios Manikas (Imperial College London, United Kingdom)
In this paper, a calibration approach capable of handling simultaneously location, gain and phase uncertainties, as well as mutual coupling and multipath, is proposed for asynchronous CDMA-based antenna arrays. The manifold vector is modelled based on a first order Taylor series expansion, to encompass the errors. The calibration technique involves a hybrid combination of pilot and self calibration techniques, and requires the code sequence of a reference user. This method employs the concept of the STAR (Spatio-Temporal ARray) manifold vector and a subspace type preprocessor to provide estimates of the path delays and directions, as well as estimating the array manifold, location, gain and phase errors taking mutual coupling effects into consideration.
A Linear Chip-Level Model for Multi-Antenna Space-Time Coded Wideband CDMA Reconfigurable Receivers
Giovanni Garbo (Università di Palermo, Italy); Stefano Mangione (Università di Palermo, Italy); Vincenzo Maniscalco (Università di Palermo, Italy)
Transmitter-side reconfigurability is currently mostly implemented via adaptive modulation and coding. In the multiple-antenna transmission scenario foreseeable for fourth-generation wireless mobile communications, a new level of reconfigurability might employ transmission scheme adaptation, i.e. dynamically switching between spatial-multiplex and space-time coded transmission. This paper presents a generalized linear model for the received signal in a multi-antenna CDMA signalling scheme on frequency-selective fading MIMO channels. The proposed model is unprecedented in that it supports general complex space-time codes as well as spatial-multiplex multi-antenna transmission, and it may be implemented with minor modifications to the front-end of a matched filter receiver.

### Poster: Speech and Speaker Recognition and Analysis - 15 papers

Room: Poster Area
Chair: Sergios Theodoridis (University of Athens, Greece)
Speaker recognition experiments on a bilingual database
Marcos Faundez-Zanuy (Escola Universitaria Politecnica de Mataro, Spain); Antonio Satue-Villar (EUP Mataro, Spain)
This paper presents speaker recognition experiments using a set of 49 bilingual speakers in two different languages, Spanish and Catalan. Phonetically, there are significant differences between the two languages. These differences have led us to establish several conclusions on the relevance of language in speaker recognition, using two methods: vector quantization and covariance matrices.
Efficient Implementation of GMM Based Speaker Verification Using Sorted Gaussian Mixture Model
Hamid Sadegh Mohammadi (Iranian Research Institute for Electrical Engineering, Iran); Rahim Saeidi (Iran University of Science and Technology, Iran)
In this paper a new structured Gaussian mixture model, called the sorted GMM, is proposed as an efficient way to implement GMM-based speaker verification systems such as the Gaussian mixture model universal background model (GMM-UBM) scheme. The proposed method uses a sorted GMM which facilitates a partial search, and has lower computational complexity and smaller memory requirements compared to the well-known tree-structured GMM of the same model order. Experimental results show that a speaker verification system based on the proposed method outperforms a similar system using a tree-structured GMM. It also provides performance comparable to the GMM-UBM method, despite a 3.5 times lower computational cost for a GMM of order 64.
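The partial-search idea can be sketched as follows: components are sorted off-line by a scalar key, and at test time only a small window of components around the key of the test vector is scored. The scalar key (sum of the mean vector), the squared-distance scoring and all the numbers below are our simplifications for illustration, not the paper's exact design:

```python
import numpy as np

def sorted_search(means, x, width):
    """Score only a window of `width` components around x's scalar key,
    instead of all mixture components."""
    keys = means.sum(axis=1)
    order = np.argsort(keys)               # off-line sort by scalar key
    sorted_means = means[order]
    pos = np.searchsorted(keys[order], x.sum())   # locate x's key
    lo = max(0, pos - width // 2)
    hi = min(len(means), lo + width)
    dists = [np.sum((x - sorted_means[i]) ** 2) for i in range(lo, hi)]
    return int(order[lo + int(np.argmin(dists))])  # index of best component

means = np.array([[0., 0.], [1., 1.], [2., 2.], [3., 3.]])
best = sorted_search(means, np.array([2.1, 1.9]), width=2)
```

Only `width` of the mixture components are evaluated per test vector, which is the source of the computational saving the abstract reports.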
Speech to facial animation conversion for deaf customers
Gyorgy Takacs (Péter Pázmány Catholic University, Hungary); Attila Tihanyi (Péter Pázmány Catholic University, Hungary); Tamás Bardi (Péter Pázmány Catholic University, Hungary); Gergely Feldhoffer (Péter Pázmány Catholic University, Hungary); Bálint Srancsik (Péter Pázmány Catholic University, Hungary)
A speech to facial animation direct conversion system was developed as a communication aid for deaf people. Utilizing the preliminary test results, a specific database was constructed from audio and visual recordings of professional lip-speakers. The standardized MPEG-4 system is used to animate the speaking face model. The trained neural net calculates the principal component weights of the feature points from the speech frames, and the control coordinates are computed from the PC weights. The whole system can be implemented on standard mobile phones. In the final test based on our facial animation model, deaf persons were able to correctly recognize about 50% of words from limited sets.
Comparing Confidence-Guided and Adaptive Dynamic Pruning Techniques for Speech Recognition
Tibor Fabian (Volt Delta International, Germany); Günther Ruske (Technische Universität München, Germany)
Improvements in pruning algorithms for automatic speech recognition lead directly to a more efficient recognition process. Efficiency is a particularly important issue for embedded speech recognizers with limited memory capacity and CPU power. In this paper we compare two pruning algorithms: confidence-guided pruning and the adaptive control pruning technique. Both methods set the pruning threshold for the Viterbi beam search dynamically for each time frame, depending on search-space properties. We show that both dynamic pruning techniques are applicable to reducing the time consumption of the recognizer, whereas our novel confidence-guided pruning approach clearly outperforms the adaptive control technique.
ICANDO: Intellectual Computer AssistaNt for Disabled Operators
Alexey Karpov (St. Petersburg Institute for Informatics and Automation of the Russian Academy of Sciences, Russia); Andrey Ronzhin (Speech Informatics Group of SPIIRAS, Russia)
The paper describes a prospective multimodal system ICANDO (Intellectual Computer AssistaNt for Disabled Operators), developed at SPIIRAS and intended to assist persons without hands, or with disabilities of their hands or arms, in human-computer interaction. The system combines modules for automatic speech recognition and head tracking in one multimodal system. The architecture of the system, the methods for recognition and tracking, multimodal information fusion and synchronization, the experimental conditions and the obtained results are described in the paper. The developed system was applied to hands-free work with a Graphical User Interface in such tasks as Internet communication and work with documents.
Analysis of Disfluent Repetitions in Spontaneous Speech Recognition
Vivek Kumar Rangarajan Sridhar (University of Southern California, USA); Shrikanth Narayanan (University of Southern California, USA)
In this paper, we investigate the effect of disfluent repetitions in spontaneous speech recognition. We characterize the repetition errors in an automatic speech recognition framework using repetition word error rate (RWER). The problem is addressed by both building classifiers based on acoustic-prosodic features and a multiword model for modeling repetitions. We also analyze the repetition word error rate for different acoustic and language models in the Fisher conversational speech corpus. The classifier approach is not promising on recognizer output and generates a high degree of false alarms. The multiword approach to modeling the most frequent function word repetitions results in an absolute RWER reduction of 1.26% and a significant absolute WER reduction of 2.0% on already well trained acoustic and language models. This corresponds to a relative RWER improvement of 75.9%.
Weighted Nonlinear Prediction Based on Volterra Series for Speech Analysis
Karl Schnell (Goethe-University Frankfurt, Germany); Arild Lacroix (University Frankfurt, Germany)
The analysis of speech is usually based on linear models. In this contribution, speech features are treated using nonlinear statistics of the speech signal. To this end, a nonlinear prediction based on Volterra series is applied segment-wise to the speech signal. The optimal nonlinear predictor can be determined by a vector expansion. Since the statistics of a segment are estimated, a window function is integrated into the estimation procedure. Speech features representing the prediction gain between the linear and the nonlinear prediction are investigated. The analyses of speech signals show that the nonlinear features correlate with the glottal pulses. The integration of an appropriate window function into the prediction algorithm plays an important role in the results.
Aggelos Pikrakis (University of Athens, Greece); Theodoros Giannakopoulos (University of Athens, Greece); Sergios Theodoridis (University of Athens, Greece)
This paper presents a speech/music discrimination scheme for radio recordings using a hybrid architecture based on a combination of a Variable Duration Hidden Markov Model (VDHMM) and a Bayesian Network (BN). The proposed scheme models speech and music as states in a VDHMM. A modified Viterbi algorithm for the computation of the observations' probabilities at each state is proposed. This is achieved by embedding a BN, that outputs to the HMM the required probability values. The proposed system has been tested on audio recordings from a variety of radio stations and has exhibited an overall performance close to 95%.
Classification of Speech Under Stress Using Features Selected by Genetic Algorithms
Salvatore Casale (University of Catania, Italy); Alessandra Russo (Università degli Studi di Catania, Italy); Salvatore Serrano (University of Catania, Italy)
Determination of an emotional state through speech increases the amount of information associated with a speaker. It is therefore important to be able to detect and identify a speaker's emotional state or state of stress. The paper proposes an approach based on genetic algorithms to determine a set of features that will allow robust classification of emotional states. Starting from a vector of 462 features, a subset of features is obtained providing a good discrimination between neutral, angry, loud and Lombard states for the SUSAS simulated domain and between neutral and stressed states for the SUSAS actual domain.
Mutual information eigenlips for audio-visual speech recognition
Ivana Arsic (Swiss Federal Institute of Technology Lausanne (EPFL), Switzerland); Jean-Philippe Thiran (Swiss Federal Institute of Technology (EPFL), Switzerland)
This paper proposes an information-theoretic approach for finding the most informative subset of eigenfeatures to be used for audio-visual speech recognition tasks. The state-of-the-art visual feature extraction methods in the area of speechreading rely on either pixel- or geometry-based methods or their combination. However, there is no common rule defining how these features have to be selected with respect to the chosen set of audio cues and how well they represent the classes of the uttered speech. Our main objective is to exploit the complementarity of the audio and visual sources and select meaningful visual descriptors by means of mutual information. We focus on the principal component projections of the mouth region images and apply the proposed method such that only those cues having the highest mutual information with the word classes are retained. The algorithm is tested by performing various speech recognition experiments on a chosen audio-visual dataset. The obtained recognition rates are compared to those acquired using conventional principal component analysis, and promising results are shown.
Emotion Audio-Visual Text-to-Speech
Mohamed Abou Zliekha (Damascus University, Information Technology Faculty, Syria); Samer Al Moubayed (Damascus University, Information Technology Faculty, Saudi Arabia); Oumayma Al Dakkak (HIAST (Higher Institute of Applied Science and Technology), Syria); Nada Ghneim (HIAST (Higher Institute of Applied Science and Technology), Saudi Arabia)
The goal of this paper is to present an emotional audio-visual text-to-speech system for the Arabic language. The system is based on two entities: an emotional audio text-to-speech system, which generates speech depending on the input text and the desired emotion type, and an emotional visual model, which generates the talking head by forming the corresponding visemes. The phoneme-to-viseme mapping and the emotion shaping use a 3-parametric face model based on the Abstract Muscle Model. We have thirteen viseme models and five emotions as parameters to the face model. The TTS produces the phonemes corresponding to the input text and the speech with the suitable prosody to convey the prescribed emotion. In parallel, the system generates the visemes and sends the controls to the facial model to obtain the animation of the talking head in real time.
Evaluation of Implicit Broad Phonemic Segmentation of Speech Signals using Pitchmarks
Iosif Mporas (University of Patras, Greece); Panagiotis Zervas (University of Patras, Greece)
In this paper, we evaluate an implicit approach for the automatic detection of broad phonemic class boundaries in continuous speech signals. The reported method consists of the prior segmentation of the speech signal into pitch-synchronous segments, using pitchmark locations, for the computation of adjacent broad phonemic class boundaries. The approach's validity was tested on a phonetically rich speech corpus of the Greek language as well as on the DARPA-TIMIT American-English corpus. Our framework's results were very promising: the method achieved boundary accuracies of 76% and 74.9%, respectively, within 25 ms, without over-segmenting the speech signal.
Novel Auditory Motivated Subband Temporal Envelope Based Fundamental Frequency Estimation Algorithm
Seelamantula Sekhar (Ecole Polytechnique Federale de Lausanne, Switzerland); Sridhar Pilli (Dept. of Electrical Comm. Engg, Indian Institute of Science, India); Lakshmikanth C (Indian Institute of Science, India); Thippur Venkat Sreenivas (Indian Institute of Science, India)
We address the problem of fundamental frequency estimation of human speech. We present a novel solution motivated by the importance of amplitude modulation in sound processing and speech perception. The new algorithm is based on a cumulative spectrum computed from the subband temporal envelopes. We provide a theoretical analysis to derive the new envelope-based pitch estimator. We report extensive experimental performance for synthetic as well as natural vowels, for both real-world noisy and noise-free data. Experimental results show that the new technique performs accurate pitch estimation and is robust to noise. Comparative experimental results show that the technique is superior to the autocorrelation technique for pitch estimation.
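For reference, the autocorrelation baseline the abstract compares against can be sketched in a few lines (a simplified illustration, not the paper's subband-envelope method; the search range and synthetic test tone are assumptions): pick the lag that maximizes the signal's autocorrelation within the plausible pitch-period range.

```python
import math

def autocorr_pitch(x, fs, fmin=60.0, fmax=400.0):
    """Estimate F0 as fs over the lag that maximizes the autocorrelation,
    searching lags in the plausible period range [fs/fmax, fs/fmin]."""
    lo, hi = int(fs / fmax), int(fs / fmin)
    def r(lag):
        return sum(x[n] * x[n + lag] for n in range(len(x) - lag))
    best_lag = max(range(lo, hi + 1), key=r)
    return fs / best_lag

# A synthetic 200 Hz tone sampled at 8 kHz stands in for a vowel.
fs, f0 = 8000, 200.0
x = [math.sin(2 * math.pi * f0 * n / fs) for n in range(800)]
```

The envelope-based estimator of the paper replaces `x` with subband temporal envelopes before the periodicity analysis, which is what buys the reported noise robustness.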
First-Order Markov Property of the Auditory Spiking Neuron Model Response
Alexei Ivanov (Moscow Institute of Physics and Technology, Russia); Alexander Petrovsky (Bialystok Technical University, Poland)
This paper explores properties of the spiking neuron model of the auditory nerve fiber. As follows from the described reasoning, the model response, in the form of a spike sequence, is in fact a first-order Markov chain of certain non-overlapping subsequences which, taken separately, encode the incoming signal on the corresponding time intervals. This observation comes as a direct consequence of the finite precision of the spike registration process at the higher levels of neural signal processing. The result has important implications for the modelling of the auditory apparatus and the signal-processing algorithmic interpretation of hearing physiology.
Unsupervised Speaker Indexing Using One-Class Support Vector Machines
Belkacem Fergani (University of Sciences and Technology Houari Boumedienne, Algeria); Manuel Davy (Lille, France); Amrane Houacine (USTHB, Algeria)
This paper addresses the unsupervised speaker change detection problem. We derive a new approach based on the Kernel Change Detection algorithm introduced recently by Desobry et al. This new algorithm does not require explicit modeling of the data and is able to deal with large-dimensional acoustic feature vectors. Several experiments using RT'03S NIST data show the efficiency of the algorithm, illustrate the parameter tuning, and compare it to the well-known GLR-BIC algorithm.

### Poster: Wavelet Applications in Speech and Image - 7 papers

Room: Poster Area
Chair: Pradip Sircar (Indian Institute of Technology Kanpur, India)
Analysis of Multicomponent Speech-like Signals by Continuous Wavelet Transform-based Technique
Pradip Sircar (Indian Institute of Technology Kanpur, India); Keshava Prasad (Indian Institute of Technology Kanpur, India); Bandra Harshavardhan (Indian Institute of Technology Kanpur, India)
A novel technique based on the continuous wavelet transform (CWT) is presented for the analysis of multicomponent amplitude- and frequency-modulated signals. In the process, the multicomponent signal is decomposed into its constituents on the time-frequency plane and the analysis is carried out on individual components. It is demonstrated that the separation of components for analysis brings the proposed method several advantages, viz., procedural simplicity and noise immunity. The developed technique is employed for the analysis-synthesis of speech phonemes.
DCT-based image compression using wavelet-based algorithm with efficient deblocking filter
Wen-Chien Yan (Department of Information Management, Chung Chou Institution of Technology, Taiwan); Yen-Yu Chen (ChengChou Institute of Technology, Taiwan)
This work adopts the DCT and modifies the SPIHT algorithm to encode DCT coefficients. The algorithm represents the DCT coefficients so as to concentrate signal energy, and proposes combination and dictator schemes to eliminate the correlation within the same-level subband for encoding the DCT-based images. The proposed algorithm also provides a deblocking function at low bit rates in order to improve the perceptual quality. This work's contribution is that the coding complexity of the proposed algorithm for DCT coefficients is close to that of JPEG while the performance is higher than that of JPEG2000. Experimental results indicate that the proposed technique improves the quality of the reconstructed image, in terms of both PSNR and perceptual results, close to JPEG2000 at the same bit rate.
Wavelet method of speech segmentation
Bartosz Ziolko (University of York, United Kingdom); Suresh Manandhar (University of York, United Kingdom); Richard Wilson (University of York, United Kingdom); Mariusz Ziolko (AGH University of Science and Technology, Poland)
In this paper a new method of speech segmentation is suggested. It is based on power fluctuations of the wavelet spectrum of a speech signal. In most approaches to speech recognition, the speech signals are segmented using constant-time segmentation. Constant segmentation needs to use windows to decrease the boundary distortions. A more natural approach is to segment the speech signals on the basis of time-frequency analysis. Boundaries are assigned in places where the energy of some frequency band changes rapidly. The wavelet decomposition signals are analysed to localise these places. Most methods of non-constant segmentation need training for particular data or are realized as part of modelling. In this paper we apply the discrete wavelet transform (DWT) to analyse speech signals, the resulting power spectrum and its derivatives. This information allows us to locate the boundaries of phonemes. It is the first stage of the speech recognition process. Additionally, we present an evaluation by comparing our method with hand segmentation. The segmentation method proves effective for finding most phoneme boundaries. The results are more useful for speech recognition than constant segmentation.
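The core idea — mark a boundary where the wavelet band-energy profile jumps — can be sketched with toy substitutes (a Haar DWT instead of the authors' wavelet, fixed-length analysis frames, and an arbitrary jump threshold; all are illustrative assumptions, not the paper's configuration):

```python
def haar_level(x):
    """One Haar DWT step: half-length approximation and detail sequences."""
    a = [(x[2 * i] + x[2 * i + 1]) / 2 for i in range(len(x) // 2)]
    d = [(x[2 * i] - x[2 * i + 1]) / 2 for i in range(len(x) // 2)]
    return a, d

def band_energies(x, levels=3):
    """Detail-band energy at each decomposition level."""
    out = []
    for _ in range(levels):
        x, d = haar_level(x)
        out.append(sum(v * v for v in d))
    return out

def boundaries(sig, frame=64, thresh=0.5):
    """Flag sample positions where the band-energy profile of consecutive
    frames differs by more than `thresh` (candidate phoneme boundaries)."""
    frames = [sig[i:i + frame] for i in range(0, len(sig) - frame + 1, frame)]
    profs = [band_energies(f) for f in frames]
    return [i * frame for i in range(1, len(profs))
            if sum(abs(a - b) for a, b in zip(profs[i - 1], profs[i])) > thresh]
```

On a signal that switches from a high-frequency alternation to a constant, the detector flags exactly the switching point.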
The Mexican Hat Wavelet Family. Application to the study of non-Gaussianity in cosmic microwave background maps
Francisco Argüeso (Universidad de Oviedo, Spain)
The detection of the non-Gaussian signal due to extragalactic point sources, and its separation from the possible intrinsic non-Gaussianity, is an issue of great importance in cosmic microwave background (CMB) analysis. The Mexican Hat Wavelet Family (MHWF), which has proved very useful for the detection of extragalactic point sources, is applied here to the study of non-Gaussianity due to point sources in CMB maps. We carry out simulations of CMB maps with the characteristics of the forthcoming Planck mission at 70 and 100 GHz and filter them with the MHWF. By comparing the skewness and kurtosis of simulated maps with and without point sources, we are able to clearly detect the non-Gaussian signal due to point sources for flux cuts as low as 0.4 Jy (70 GHz) and 0.3 Jy (100 GHz). The MHWF performs better in this respect than the Mexican Hat Wavelet and much better than the Daubechies 4 wavelet.
Reconstruction of hidden images using wavelet transform and an entropy-maximization algorithm
Naoto Nakamura (Kyushu University, Japan); Shigeru Takano (Kyushu University, Japan); Yoshihiro Okada (Kyushu University, Japan); Koichi Niijima (Kyushu University, Japan)
This paper proposes a blind image separation method using the wavelet transform and an entropy-maximization algorithm. Our blind separation algorithm is an improved version of the entropy-maximization algorithms presented by Bell-Sejnowski and Amari. These algorithms work well for signals having a supergaussian distribution, such as speech and audio. The proposed method applies the improved algorithm to the wavelet coefficients of a natural image, whose distribution is close to supergaussian. Our method successfully reconstructs twelve images hidden in another twelve images that are similar to each other.
Wavelets on the Sphere. Application to the detection problem
Jose Luis Sanz (University of Santander, Spain); Diego Herranz (IFCA, Santander, Spain); Marcos López-Caniego (IFCA, Santander, Spain); Francisco Argüeso (Universidad de Oviedo, Spain)
A new method is presented for the construction of a natural continuous wavelet transform on the sphere. It incorporates the analysis and synthesis with the same wavelet and the definition of translations and dilations on the sphere through the spherical harmonic coefficients. We construct a couple of wavelets as an extension of the flat Mexican Hat Wavelet to the sphere and we apply them to the detection of sources on the sphere. We remark that no projections are used with this methodology.
Singularity detection of electroglottogram signal by multiscale product method
Aïcha Bouzid (National School of Engineers of Tunis, Tunisia)
This paper deals with singularity detection in the electroglottogram (EGG) signal using the multiscale product method. The wavelet transform of the EGG signal is computed with a windowed first derivative of a Gaussian function. This wavelet transform acts as the derivative of the signal smoothed by the Gaussian function. The wavelet coefficients of the EGG, calculated at different scales, show modulus maxima at discontinuities. The detected singularities correspond to the glottal opening and closure instants, called GOIs and GCIs. The multiscale product is the multiplication of the wavelet coefficients of the signal at three successive scales. This multiscale analysis enhances edge detection and gives a better estimation of the maxima. The geometric mean of the wavelet coefficients at the three scales is obtained by applying a cube-root amplitude function to the product. The method gives a good representation of GCIs and better detection of GOIs, since the product is a nonlinear combination of different scales that reduces noise and spurious peaks. The presented method is effective and robust in all cases, even for particular signals showing ill-determined GOIs and multiple closure peaks.
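The multiscale-product idea can be sketched with crude stand-ins: a moving average replaces the Gaussian smoother and a first difference replaces the wavelet derivative (both assumptions, not the paper's derivative-of-Gaussian transform). The product across three scales still reinforces true discontinuities while incoherent noise extrema tend to cancel:

```python
def smooth_diff(x, scale):
    """First difference of a moving-average-smoothed signal — a stand-in
    for the derivative-of-Gaussian wavelet coefficients at one scale."""
    n = len(x)
    sm = []
    for i in range(n):
        lo, hi = max(0, i - scale), min(n, i + scale + 1)
        sm.append(sum(x[lo:hi]) / (hi - lo))
    return [0.0] + [sm[i] - sm[i - 1] for i in range(1, n)]

def multiscale_product(x, scales=(1, 2, 4)):
    """Pointwise product of the smoothed derivative at successive scales."""
    rows = [smooth_diff(x, s) for s in scales]
    return [a * b * c for a, b, c in zip(*rows)]
```

On a step signal, the product peaks only at the discontinuity, which is the behaviour exploited for GCI/GOI localization.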

### Poster: Biomedical Signal Processing - 7 papers

Room: Poster Area
Chair: Asoke Nandi (The University of Liverpool, United Kingdom)
Evaluation of a blind method for the estimation of Hurst's exponent in time series
Federico Esposti (Politecnico di Milano University, Italy); Maria Gabriella Signorini (Politecnico di Milano, Italy)
Nowadays many methods for the estimation of Hurst's coefficient (H) in time series are available. Most of them, even if very effective, need some a priori information to be applied (in particular about the stationarity of the series). We analyzed eight up-to-date methods (working both in the time and in the frequency domain) on four kinds of synthetic time series (fBm, fGn, 1/f, FARIMA) in the range 0.1 ≤ H ≤ 0.9. We built graphs for each method evaluating the quality of the estimation, in terms of accuracy (bias) and precision (STD) of the deviation from the expected estimation value. Building on this, we formulated a procedure for reliable estimation of H, using these existing methods, without any assumption on the stationarity of the time series. This procedure suggests first estimating the coefficient alpha of the unknown time series, i.e. the spectral slope in a bi-logarithmic scale-estimator chart near the zero-frequency axis. Once alpha, an indirect indication of the stationarity of the series, is estimated, the procedure recommends the best method for the estimation of H, depending on the stationarity value.
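As a concrete example of one estimator of the kind surveyed above (the aggregated-variance method for a stationary fGn-like series; the block sizes and demo data are arbitrary choices, not the study's protocol): the variance of block means scales as m^(2H−2), so H falls out of a log-log slope fit.

```python
import math, random

def hurst_aggvar(x, block_sizes=(1, 2, 4, 8, 16)):
    """Aggregated-variance Hurst estimate: fit the slope of
    log(var of block means) against log(block size), H = 1 + slope/2."""
    pts = []
    for m in block_sizes:
        means = [sum(x[i:i + m]) / m for i in range(0, len(x) - m + 1, m)]
        mu = sum(means) / len(means)
        var = sum((v - mu) ** 2 for v in means) / len(means)
        pts.append((math.log(m), math.log(var)))
    n = len(pts)
    sx = sum(p[0] for p in pts); sy = sum(p[1] for p in pts)
    sxx = sum(p[0] ** 2 for p in pts); sxy = sum(p[0] * p[1] for p in pts)
    slope = (n * sxy - sx * sy) / (n * sxx - sx ** 2)
    return 1.0 + slope / 2.0

# Demo: white Gaussian noise, whose true Hurst exponent is 0.5.
random.seed(0)
h = hurst_aggvar([random.gauss(0.0, 1.0) for _ in range(4096)])
```

For white noise the block-mean variance decays as 1/m (slope −1), so the estimate should land near H = 0.5.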
Speckle Reduction in Echocardiographic Images
Nadia Souag (University of Algiers, Algeria)
Speckle noise is the major difficulty that arises in echocardiographic image processing. Adaptive smoothing techniques for speckle reduction in two-dimensional echocardiographic images are presented in this paper. In the first stage, we applied the Lee, Kuan, and Frost filters, based on the minimum mean square error (MMSE) design approach. Additionally, the anisotropic diffusion method has been used to denoise echocardiographic images. It is a method derived from convolution with a Gaussian that reduces the noise in the image without blurring the boundaries between different regions. Criteria for quantifying the performance of the studied filters have been defined and calculated: the ratio of mean grey level, the ratio of speckle index, and the transition parameter. Quantitative measurements demonstrate the effectiveness of the anisotropic diffusion filter for both speckle reduction and edge preservation. Keywords: speckle reduction, Lee, Kuan, and Frost filters, anisotropic diffusion, performance criteria, echocardiographic images.
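The MMSE rule behind the Lee filter can be sketched in a few lines (window size and noise variance are arbitrary here, and this is a simplified additive-noise form, not the paper's exact filter): each pixel is pulled toward its local mean by a gain that vanishes in homogeneous regions and approaches one at strong edges, which is what preserves boundaries.

```python
def lee_filter(img, win=1, noise_var=0.05):
    """MMSE Lee-filter sketch: out = mean + k * (pixel - mean), where the
    gain k = max(0, var - noise_var) / var from the local window stats."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            vals = [img[i][j]
                    for i in range(max(0, r - win), min(h, r + win + 1))
                    for j in range(max(0, c - win), min(w, c + win + 1))]
            mu = sum(vals) / len(vals)
            var = sum((v - mu) ** 2 for v in vals) / len(vals)
            k = max(0.0, var - noise_var) / var if var > 0 else 0.0
            out[r][c] = mu + k * (img[r][c] - mu)
    return out

flat = [[1.0] * 5 for _ in range(5)]          # homogeneous region: k -> 0
edge = [[0.0, 0.0, 10.0, 10.0] for _ in range(4)]  # strong edge: k -> 1
```

In the flat region the output collapses to the local mean (maximal smoothing); at the edge the pixels stay close to their original values (edge preservation).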
Segmentation of Retinal Blood Vessels using a Novel Clustering Algorithm
Sameh Salem (The University of Liverpool, United Kingdom); Nancy Salem (The University of Liverpool, United Kingdom); Asoke Nandi (University of Liverpool, United Kingdom)
In this paper, segmentation of blood vessels from colour retinal images using a novel clustering algorithm and scale-space features is proposed. The proposed clustering algorithm, which we call the Nearest Neighbour Clustering Algorithm (NNCA), uses the same concept as the K-nearest neighbour (KNN) classifier, with the advantage that the algorithm requires no training set and is completely unsupervised. Results from the proposed clustering algorithm are comparable with those of the KNN classifier, which does require a training set.
A quick low cost method for syncope prediction
Mathieu Feuilloy (ESEO, France); Daniel Schang (ESEO, France); Pascal Nicolas (University of Angers, France)
The aim of this study is to present a method that predicts unexplained syncope or presyncope occurrences induced by a head-upright tilt-test (HUTT). The HUTT is based on the reproduction of symptoms in combination with hypotension and bradycardia induced by a tilt at 70° for 45 minutes. The main drawback is the duration of this test: adding the 10-minute supine position, the test can reach 55 minutes. Therefore, this paper proposes a new method for syncope prediction using only the supine position. We describe the signals used to extract the features employed for the prediction, and we develop the preprocessing techniques for these signals in order to improve the interpretability of these features. We conclude by presenting the results obtained with an artificial neural network.
A new shape-dependent skeletonization method. Application to porous media
Gabriel Aufort (Université d'Orléans, France); Rachid Jennane (University of Orleans, France); Rachid Harba (University of Orleans, France); Claude-Laurent Benhamou (Equipe INSERM U658 Centre Hospitalier Régional d'Orléans, France)
This communication presents a new method to compute a precise shape-dependent skeleton and its application to porous media. The local geometry of the object's structure is taken into account in order to switch between curve and surface thinning criteria. The resulting skeleton combines 2D surfaces and 1D curves to represent, respectively, the plate-shaped and rod-shaped parts of the object. First, the methods to compute the shape-dependent skeleton are described: rod and plate classification, surface and curve thinning. Then, applications of the technique are presented in the fields of biomedical imaging (trabecular bone) and geology (sandstone). A clinical study was conducted on two sets of bone samples. It shows the ability of the skeleton to characterize the trabecular bone microarchitecture.
New Markov Random Field Model based on Nakagami Distribution for modeling ultrasound RF envelope
Bouhlel Nizar (Université Tunis El Manar, Ecole Nationale d'Ingénieurs de Tunis, Tunisia); Sylvie Sevestre-Ghalila (Université René Descartes, MAP5, France)
The aim of this paper is to propose a new Markov Random Field (MRF) model for the backscattered ultrasonic echo in order to retrieve information about backscatter characteristics, such as the density, the scatterer amplitude, the scatterer spacing and the direction of interaction. The model combines the Nakagami distribution, which describes the envelope of the backscattered echo, with spatial interaction using an MRF. We first construct the Nakagami-MRF model and illustrate the role of its parameters by some synthetic simulations. Then, to assess the ability of this MRF model to retrieve information on the spatial backscatter distribution, we compare the parameter values estimated on simulated radio-frequency (RF) envelope images for different tissue scatterer characteristics (density, amplitude, spacing, spatial orientation). It follows that the first parameter is related to the density and the amplitude, and the interaction parameters are related to the scatterer spacing and the orientation.
Neural Network Based Arrhythmia Classification Using Heart Rate Variability Signal
Babak Mohammadzadeh-Asl (University of Tehran, Iran); Seyed Kamaledin Setarehdan (University of Tehran, Iran); Pedram Ataee (University of Tehran, Iran)
Heart Rate Variability (HRV) analysis is a non-invasive tool for assessing the autonomic nervous system; specifically, it measures the interaction between sympathetic and parasympathetic activity in autonomic functioning. In recent years, the HRV signal has attracted attention for automated arrhythmia detection and classification. In this paper, we use a neural network classifier for automatic classification of cardiac arrhythmias into five classes. The HRV signal is used as the basic signal, and linear and nonlinear parameters extracted from it are used to train the neural network classifier. The proposed approach is tested using the MIT-BIH arrhythmia database, and satisfactory results are obtained with an accuracy of 99.38%.
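For illustration, two standard time-domain HRV features of the kind such a classifier might consume (the abstract does not list its exact feature set; SDNN and RMSSD are common choices and are assumptions here):

```python
import math

def hrv_features(rr_ms):
    """SDNN (overall variability) and RMSSD (beat-to-beat variability)
    from a list of RR intervals in milliseconds."""
    n = len(rr_ms)
    mu = sum(rr_ms) / n
    sdnn = math.sqrt(sum((r - mu) ** 2 for r in rr_ms) / n)
    diffs = [rr_ms[i + 1] - rr_ms[i] for i in range(n - 1)]
    rmssd = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    return {"sdnn": sdnn, "rmssd": rmssd}
```

Such scalar features, alongside nonlinear ones, would form the input vector to the neural network classifier.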

### Thu.6.2: Radar Signal Processing - 5 papers

Room: Room 4
Chair: Stefano Buzzi (University of Cassino, Italy)
Impact of ballistic target model uncertainty on IMM-UKF and IMM-EKF tracking accuracies
Sandro Immediata (SELEX-SI, Italy); Alfonso Farina (SELEX-SI, Italy); Luca Timmoneri (SELEX-SI, Italy)
Some aspects of tracking of ballistic targets (BT) are analysed in this paper. In particular, the uncertainty in the kinematic model of the ballistic target is introduced and the impact on the tracking accuracies and robustness is detailed. Two architectures have been selected for comparison purposes: (i) an Interacting Multiple Model (IMM) constituted by a number of Extended Kalman Filters (EKFs) matched to the BT dynamics, each filter having the capability of on-line estimation of the BT characteristics (for instance the ballistic coefficient) and (ii) an IMM constituted by a number of Unscented Kalman Filters (UKFs). The performance evaluations of the designed IMM tracking algorithms are obtained via Monte Carlo simulation.
Knowledge-Based Recursive Least Squares Techniques for Heterogeneous Clutter Suppression
Antonio De Maio (Università degli Studi di Napoli "Federico II", Italy); Alfonso Farina (SELEX-SI, Italy); Goffredo Foglia (Elettronica S.P.A., Italy)
In this paper we deal with the design of Knowledge-Based adaptive algorithms for the cancellation of heterogeneous clutter. To this end we revisit the application of the Recursive Least Squares (RLS) technique for the rejection of unwanted clutter and devise a modified RLS filtering procedure accounting for the spatial variation of the clutter power. Then we introduce the concept of Knowledge-Based RLS and explain how a priori knowledge about the radar operating environment can be adopted to improve system performance. Finally we assess the benefits resulting from the use of Knowledge-Based processing on both simulated and measured clutter data collected by the McMaster IPIX radar in November 1993.
Radar Detection and Classification of Jamming Signals
Maria Greco (University of Pisa, Italy); Fulvio Gini (University of Pisa, Italy); Alfonso Farina (SELEX-SI, Italy); Valentina Ravenni (GEM Elettronica S.r.l, Italy)
This paper considers the problem of detecting and classifying a radar target against jamming signals emitted by electronic countermeasure (ECM) systems. The detection-classification algorithm proposed here exploits the presence in the jamming spectrum of spurious terms due to the phase quantization performed by the digital radio frequency memory (DRFM) device.
Joint Sequential Detection and Trajectory Estimation with Radar Applications
Stefano Buzzi (University of Cassino, Italy); Emanuele Grossi (Università degli Studi di Cassino, Italy); Marco Lops (University of Cassino, Italy)
The problem of signal detection and trajectory estimation of a dynamic system when a variable number of measurements can be taken is considered here. A sequential probability ratio test (SPRT) for a parameter space of infinite cardinality is proposed for the detection problem, while trajectory estimation relies upon a maximum a posteriori (MAP) estimate. The computational costs of the proposed algorithm, whose statistics are computed through a dynamic programming (DP) algorithm, are considered, and applications to radar surveillance problems are examined.
Francesco Bandiera (Università di Lecce, Italy); Mohammed Jahangir (QinetiQ, United Kingdom); Giuseppe Ricci (University of Lecce, Italy); Roberta Verrienti (Università di Lecce, Italy)
This paper addresses adaptive detection of possibly range-spread targets without resorting to secondary data, namely data free of signal components, but sharing the spectral properties of those under test. More precisely, we attack detection of coherent target echoes embedded in Gaussian noise with unknown covariance matrix. The covariance matrix of the noise is assumed to be block diagonal with identical diagonal blocks and estimated from a set of subvectors obtained by filtering out possible signal components from data under test. Although the proposed approach could be described in a more general framework, here we focus on detection of possibly range-spread targets based upon range compressed Synthetic Aperture Radar (SAR) data. Remarkably, the proposed detector guarantees the Constant False Alarm Rate (CFAR) property with respect to the noise power. The performance assessment has been conducted by Monte Carlo simulation resorting to both synthetic and SAR data recordings.

### Thu.4.2: Speech recognition and understanding II - 5 papers

Room: Sala Onice
Chair: Søren Holdt Jensen (Aalborg University, Denmark)
Warped-Twice Minimum Variance Distortionless Response Spectral Estimation
Matthias Wölfel (Universität Karlsruhe (TH), Germany)
This paper describes a novel extension to warped minimum variance distortionless response (MVDR) spectral estimation which makes it possible to steer the resolution of the spectral envelope estimate toward lower or higher frequencies while keeping the overall resolution of the estimate and the frequency axis fixed. This effect is achieved by introducing a second bilinear transformation into the warped MVDR spectral estimation, now in the frequency domain, as opposed to the first bilinear transformation, which is applied in the time domain, together with a compensation step to adjust for the pre-emphasis of both bilinear transformations. In the feature extraction process of an automatic speech recognition system, this novel extension makes it possible to emphasize classification-relevant characteristics of speech features while dropping classification-irrelevant ones, according to the characteristics of the signal to analyze; e.g., vowels and fricatives have different characteristics and therefore should be treated differently. We compared the novel extension to warped MVDR on evaluation data of the Rich Transcription 2005 Spring Meeting Recognition Evaluation and obtained a word error rate reduction from 28.2% to 27.5%.
Hough Transform Based Masking in Feature Extraction for Noisy Speech Recognition
Eric Choi (National ICT Australia, Australia)
Despite various advances in recent years, robustness in the presence of various types and levels of environmental noise remains a critical issue for automatic speech recognition systems. This paper describes a novel and noise robust front-end that incorporates the use of Hough transform for simultaneous frequency and temporal masking, together with cumulative distribution mapping of cepstral coefficients, for noisy speech recognition. Recognition experiments on the Aurora II connected digits database have revealed that the proposed front-end achieves an average digit recognition accuracy of 83.31% for all the three Aurora test sets. Compared with the recognition results obtained by using the ETSI standard Mel-cepstral front-end, this accuracy represents a relative error rate reduction of around 57%.
Crosslingual Adaptation of Semi-Continuous HMMs using Maximum Likelihood and Maximum a Posteriori Convex Regression
Frank Diehl (Universitat Politecnica de Catalunya, Spain); Asunción Moreno (Universitat Politecnica de Catalunya, Spain); Enric Monte (Universitat Politecnica de Catalunya, Spain)
In this work we present a novel adaptation design for semi-continuous HMMs (SCHMMs). The method, which is developed in the scope of a crosslingual model adaptation task, consists in adjusting the states' mixture weights associated with the prototype densities of the codebook. The mixture weights of the target language are modelled as convex combinations of prototype weights. They are defined by an acoustic regression scheme applied to the source models, followed by a refinement using probabilistic latent semantic analysis (PLSA). In order to find suitable combination weights for the convex combinations we present a maximum likelihood (ML) as well as a maximum a posteriori (MAP) estimate; accordingly, we name the methods maximum likelihood convex regression (MLCR) and maximum a posteriori convex regression (MAPCR). Finally, a crosslingual model adaptation task transferring multilingual Spanish-English-German HMMs to Slovenian demonstrates the performance of the method.
Speaker Recognition Using Channel Factors Feature Compensation
Daniele Colibro (Loquendo, Italy); Claudio Vair (Loquendo, Italy); Fabio Castaldo (Politecnico di Torino, Italy); Emanuele Dalmasso (Politecnico di Torino, Italy); Pietro Laface (Politecnico di Torino, Italy)
The variability of the channel and environment is one of the most important factors affecting the performance of text-independent speaker verification systems. The best techniques for channel compensation are model based. Most of them have been proposed for Gaussian Mixture Models, while in the feature domain typically blind channel compensation is performed. The aim of this work is to explore techniques that allow more accurate channel compensation in the domain of the features. Compensating the features rather than the models has the advantage that the transformed parameters can be used with models of different nature and complexity, and also for different tasks. In this paper we evaluate the effects of the compensation of the channel variability obtained by means of the channel factors approach. In particular, we compare channel variability modeling in the usual Gaussian Mixture model domain, and our proposed feature domain compensation technique. We show that the two approaches lead to similar results on the NIST 2005 Speaker Recognition Evaluation data. Moreover, the quality of the transformed features is also assessed in the Support Vector Machines framework for speaker recognition on the same data, and in preliminary experiments on Language Identification.
N-Best Parallel Maximum Likelihood Beamformers for Robust Speech Recognition
Luca Brayda (Institut Eurecom, France); Christian Wellekens (Institut Eurecom, France); Maurizio Omologo (ITC-irst (Centro per la Ricerca Scientifica e Tecnologica), Italy)
This work aims at improving speech recognition in noisy environments using a microphone array. The proposed approach is based on a preliminary generation of N-best hypotheses. An adaptive maximum likelihood beamformer (the Limabeam algorithm), applied in parallel to each hypothesis, leads to an updated set of transcriptions, among which the one most likely with respect to clean speech models is selected. Results show that this method improves recognition accuracy over both Delay and Sum Beamforming and Unsupervised Limabeam, especially at low SNRs. Results also show that it can recover recognition errors made in the first recognition step.
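For context, the Delay and Sum baseline the authors compare against can be sketched as below; the function `delay_and_sum` and its integer-sample steering delays are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def delay_and_sum(channels, delays):
    """Baseline Delay and Sum beamformer: advance each microphone channel
    by its (nonnegative, integer) steering delay in samples, then average
    the aligned channels so the target signal adds coherently."""
    n = min(len(c) - d for c, d in zip(channels, delays))
    aligned = [np.asarray(c[d:d + n], dtype=float)
               for c, d in zip(channels, delays)]
    return sum(aligned) / len(aligned)

# Two channels, the second a one-sample-delayed copy of the first:
out = delay_and_sum([[1.0, 2.0, 3.0, 4.0], [0.0, 1.0, 2.0, 3.0]], [0, 1])
```

Limabeam replaces these fixed delays with filter parameters optimized for likelihood under the recognizer's acoustic models, which is what the N-best parallel scheme refines per hypothesis.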

### Thu.3.2: Video Coding - 5 papers

Room: Sala Verde
Chair: Wan-Chi Siu (The Hong Kong Polytechnic University, Hong Kong)
Shot Detection Method For Low Bit-Rate H.264 Video Coding
Jorge Sastre Martínez (Polytechnic University of Valencia, Spain); Pau Usach Molina (Universitat Politècnica de València, Spain); Alejandro Moya Molina (Polytechnic University of Valencia, Spain); Valery Naranjo Ornedo (Polytechnic University of Valencia, Spain); Joaquín M. López Muñoz (Telefónica I+D, Spain)
This paper presents a low-complexity shot detection method for real-time, low bitrate video coding. Aimed at compression efficiency rather than frame indexing or other purposes, it is based on the macroblock intra/inter decision and the use of two thresholds. The first threshold is fixed and the second one is adaptive, providing robust scene change detection under almost all conditions (camera motion, zoom, high-motion scenes or a low number of frames per second). This method introduces the minimum number of necessary (but expensive in terms of bitrate) refresh points into the video stream. The algorithm has been implemented and tested in an H.264 coder used to encode QCIF format sequences at a low, constant bitrate for real-time, low-delay communications.
A New Algorithm for Reducing the Requantization Loss in Video Transcoding
Jens Bialkowski (University of Erlangen-Nuremberg, Germany); Marcus Barkowsky (University of Erlangen-Nuremberg, Germany); André Kaup (University of Erlangen-Nuremberg, Germany)
Video transcoders are devices that convert one video bitstream into another type of bitstream, either with or without standard format conversion. One step to be applied in video transcoders is the requantization of the transform coefficients, if an adaptation to a lower data rate is necessary. During this step, the quality is in most cases degraded compared to a single quantization. This is a consequence of non-overlapping quantization characteristics of the input and the output quantizer. In this work we propose a new choice of the reconstruction level for the requantization step depending on the effective quantization curve of both quantization parameters involved. The reconstruction level is calculated such that it is centered in each effective quantization interval after requantization. Compared to the standard midpoint requantization this leads to quality gains of 3 dB PSNR for most pairs of input and output quantization parameters. The algorithm is useful for intra- and inter-frame coding.
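The centering idea can be sketched for plain uniform quantizers as follows; the function `centered_reconstruction`, the mid-tread rounding model, and the absence of a dead zone are simplifying assumptions for illustration, not the paper's exact quantizer model:

```python
import numpy as np

def centered_reconstruction(l2, q1, q2):
    """Reconstruction value for output level l2 that is centered in the
    effective quantization interval: the set of original values which,
    after uniform quantization with step q1 and requantization with
    step q2, all end up at level l2. Assumes mid-tread quantizers with
    rounding and no dead zone."""
    # Input levels l1 whose reconstruction l1*q1 requantizes to l2:
    #   round(l1*q1 / q2) == l2  <=>  (l2-0.5)*q2 <= l1*q1 < (l2+0.5)*q2
    lo = int(np.ceil((l2 - 0.5) * q2 / q1))
    hi = int(np.floor((l2 + 0.5) * q2 / q1 - 1e-9))
    if hi < lo:                      # no input level maps to l2
        return l2 * q2               # fall back to the standard midpoint
    # Originals mapping to level l1 lie in [(l1-0.5)*q1, (l1+0.5)*q1),
    # so the union over l1 in [lo, hi] is one interval; return its center.
    return 0.5 * (lo + hi) * q1
```

For example, with q1 = 3 and q2 = 5 the original values requantized to level 1 lay in [1.5, 7.5), whose center 4.5 differs from the standard midpoint reconstruction 5.0; under this toy model the centered value is the better reconstruction on average.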
Model-based bit allocation between wavelet subbands and motion information in MCWT video coders
Marie Andrée Agostini (I3S - CNRS - UNSA, France); Marc Antonini (I3S-CNRS, France); Michel Barlaud (I3S - CNRS - UNSA, France)
In motion-compensated wavelet based video coders (MCWT), a precise motion estimation is necessary to minimize the wavelet coefficients energy. However, a motion vectors field of high precision is expensive in binary resources compared to wavelet subbands and it is thus necessary to optimize the rate-distortion trade-off between motion information and wavelet coefficients. To this end, we have proposed in previous works to quantize the motion vectors using a scalable and open-loop lossy coder and we have established a theoretical distortion model of the motion coding error which evaluates the impact of this lossy motion coding on the decoded sequence. We present in this paper an approach to realize an optimal model-based bit-rate allocation between motion and wavelet subbands. This method is based on the total distortion model of coding error on several decomposition levels, including both motion information and subbands quantization noise. First experimental validations are satisfactory.
Lossless Video Coding Using Multi-Frame MC and 3D Bi-Prediction Optimized for Each Frame
Hiroki Maeda (Science University of Tokyo, Japan); Akira Minezawa (Science University of Tokyo, Japan); Ichiro Matsuda (Science University of Tokyo, Japan); Susumu Itoh (Science University of Tokyo, Japan)
This paper proposes an efficient lossless video coding scheme based on forward-only 3D bi-prediction. In this scheme, the video signal at each pel is predicted using not only the current frame but also two motion-compensated reference frames. Since both reference frames are taken from the past, the coding process of successive frames can be performed in temporal order without extra coding delay. The resulting prediction errors are encoded using context-adaptive arithmetic coding. Several coding parameters, such as prediction coefficients and motion vectors, are iteratively optimized for each frame so that the overall coding rate required for the frame is minimized. Experimental results indicate that coding rates of the proposed scheme are 14-21% lower than those of the H.264/AVC-based lossless coding scheme.
Feedback Channel in Pixel Domain Wyner-Ziv Video Coding: Myths and Realities
Catarina Brites (IST - IT, Portugal); Joao Ascenso (ISEL - IT, Portugal); Fernando Pereira (IST-TUL, Portugal)
Wyner-Ziv (WZ) video coding, a particular case of distributed video coding (DVC), is a new video coding paradigm based on two major Information Theory results: the Slepian-Wolf and Wyner-Ziv theorems. Recently, practical WZ video coding solutions have been proposed with promising results. Many of the solutions available in the literature make use of a feedback channel (FC) to perform rate control at the decoder. In this context, this paper analyses the impact of this feedback channel, notably through metrics such as how frequently the feedback channel is used as well as its associated rate. A study is also presented on the evolution of decoded frame quality as more parity bits are requested via the feedback channel. These measures are important since they allow characterizing the usage of the feedback channel, and have never been presented in the literature.

## 2:10 PM - 3:10 PM

### Plenary: Signal Processing between research and exploitation

Room: Auditorium
Chair: Leonardo Chiariglione (Digital Media Project, Italy)

## 3:10 PM - 4:50 PM

### Thu.2.3: Image Classification - 5 papers

Chair: Marco Diani (University of Pisa, Italy)
Clustering and color preview of polarization-encoded images
Samia Ainouz (Louis Pasteur University, France); Jihad Zallat (Ecole supérieure de physique de Strasbourg, France); Antonello de Martino (LPICM - Ecole Polytechnique, France)
In the framework of Stokes parameter imaging, polarization-encoded images have four channels, which makes the physical interpretation of such multidimensional structures hard to grasp at once. Furthermore, the information content is intricately combined across the parameter channels, which calls for a proper tool allowing the analysis and understanding of polarization-encoded images. In this paper we address the problem of analyzing polarization-encoded images, explore the potential of this information for classification issues, and propose an ad hoc colour display as an aid to the interpretation of physical property content. We propose a novel mapping between the Stokes space and a parametric HSV colour space.
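A generic Stokes-to-HSV mapping of the kind the paper refines can be sketched as follows; the function `stokes_to_hsv` and the specific channel assignments are illustrative assumptions, not the paper's parametric mapping:

```python
import numpy as np

def stokes_to_hsv(S0, S1, S2, S3):
    """Illustrative mapping from Stokes parameters to HSV:
    angle of linear polarization -> hue, degree of polarization ->
    saturation, normalized intensity -> value."""
    S0, S1, S2, S3 = map(np.asarray, (S0, S1, S2, S3))
    hue = ((0.5 * np.arctan2(S2, S1)) % np.pi) / np.pi        # in [0, 1)
    dop = np.sqrt(S1**2 + S2**2 + S3**2) / np.maximum(S0, 1e-12)
    sat = np.clip(dop, 0.0, 1.0)                              # unpolarized -> grey
    val = np.clip(S0 / max(np.max(S0), 1e-12), 0.0, 1.0)      # intensity -> brightness
    return hue, sat, val
```

Under this choice, fully polarized light appears as a saturated colour whose hue encodes the polarization angle, while unpolarized light collapses to greyscale intensity.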
Image Classification Using Labelled and Unlabelled Data
Irena Koprinska (University of Sydney, Australia); Da Deng (University of Otago, New Zealand); Felix Feger (University of Bamberg, Germany)
In this paper we present a case study of applying co-training to image classification. The results show that co-training with Naïve Bayes classifiers trained on 8-10 examples obtained only 1.5% lower classification accuracy than Naïve Bayes trained on 160 examples (task 1) or 827 examples (task 2). Co-training was found to be sensitive to the choice of base classifier. We also propose a simple co-training modification based on the different inductive biases of classification algorithms and show that it is a promising approach.
Feature Classification Based on Local Information
Ling Shao (Philips Research Laboratories, The Netherlands); Ihor Kirenko (Philips Research Laboratories, The Netherlands); Ping Li (Philips, The Netherlands)
Feature extraction is the first and most critical step in various vision applications. The detected features must be classified into different feature types before they can be efficiently and effectively applied to further vision tasks. In this paper, we propose a feature classification algorithm that classifies the detected regions into four types, namely blobs, edges and lines, textures, and texture boundaries, by using the correlations with the neighbouring regions. The effectiveness of the feature classification is evaluated on image retrieval.
Two Dimensional (2D) Subspace Classifiers for Image Recognition
Hakan Cevikalp (Eskisehir Osmangazi University, Turkey); Hasan Serhan Yavuz (Eskisehir Osmangazi University, Turkey); Atalay Barkana (Anadolu University, Turkey)
The Class-Featuring Information Compression (CLAFIC) is a pattern classification method which uses a linear subspace for each class. In order to apply the CLAFIC method to image recognition problems, 2D image matrices must be transformed into 1D vectors. In this paper, we propose new subspace classifiers to apply the conventional CLAFIC method directly to the image matrices. The proposed methods yield easier evaluation of correlation and covariance matrices, which in turn speeds up the training and testing phases. Moreover, experimental results on the AR and the ORL face databases also show that recognition performances of the proposed methods are typically better than recognition performances of other subspace classifiers given in the paper.
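For reference, the conventional (1D) CLAFIC baseline that the paper extends to 2D image matrices can be sketched as follows; the function names and toy data are assumptions made for illustration:

```python
import numpy as np

def clafic_fit(class_vectors, r):
    """Fit a CLAFIC subspace per class: the r dominant eigenvectors of
    each class autocorrelation matrix."""
    bases = []
    for X in class_vectors:                  # X: (n_samples, dim) array
        R = X.T @ X / len(X)                 # class autocorrelation matrix
        _, V = np.linalg.eigh(R)             # eigenvalues in ascending order
        bases.append(V[:, -r:])              # keep the r dominant eigenvectors
    return bases

def clafic_predict(x, bases):
    """Assign x to the class whose subspace captures the most energy of x."""
    return int(np.argmax([np.linalg.norm(U.T @ x) for U in bases]))
```

Applying this to images requires flattening each image matrix into a vector; the paper's 2D variants instead work on the image matrices directly, which shrinks the correlation matrices to be eigendecomposed.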
Mathematical morphology applied to the segmentation and classification of galaxies in multispectral images
Erchan Aptoula (Louis Pasteur University, France); Sébastien Lefèvre (Louis Pasteur University, France); Christophe Collet (Louis Pasteur University, France)
The automated segmentation and classification of galaxies still constitute open problems for astronomical imaging, mainly due to their fuzzy and versatile nature, as well as to the multitude of the available channels. In this paper, a mathematical morphology based approach is explored. First, a semi-automated method for multispectral galaxy segmentation, based on the marker controlled watershed transformation is proposed. Moreover, a novel and viewpoint independent morphological feature, based on the top-hat operator, is introduced for the distinction of spiral from elliptical galaxies. Illustrative application examples of the presented approach on actual images are also presented.

### Thu.5.3: Multiple Access and Multiuser Detection - 5 papers

Chair: Marco Lops (University of Cassino, Italy)
Multi-Tier Cooperative Broadcasting with Hierarchical Modulations
Tairan Wang (University of Minnesota, USA); Alfonso Cano (Universidad Rey Juan Carlos, Spain); Georgios B. Giannakis (University of Minnesota, USA); F. Javier Ramos (Rey Juan Carlos University, Spain)
We consider broadcasting to multiple destinations with receivers of uneven quality. Based on their quality of reception, we group destinations in tiers and transmit using hierarchical modulations. These modulations are known to offer a practical means of achieving variable error protection of the broadcast information for receivers of variable quality. After the initial broadcasting step, tiers successively re-broadcast part of the information they received from higher-quality tiers to tiers with lower reception capabilities. This multi-tier cooperative broadcasting strategy can accommodate variable rate and error performance for different tiers but requires complex demodulation steps. To cope with this complexity in demodulation, we derive simplified per-tier detection schemes with performance close to maximum likelihood and the ability to collect the diversity provided as symbols propagate through diversified channels across successive broadcast steps. Error performance is analyzed and compared to (non-)cooperative broadcasting strategies. Simulations corroborate our theoretical findings.
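A minimal sketch of hierarchical modulation, assuming a 16-QAM constellation built as a QPSK base layer plus a scaled QPSK enhancement layer (the function name and scaling parameter `alpha` are illustrative assumptions, not the paper's exact constellation):

```python
import numpy as np

def hierarchical_16qam(base_bits, enh_bits, alpha=2.0):
    """Hierarchical 16-QAM as a QPSK base layer plus a scaled QPSK
    enhancement layer. Larger alpha gives the high-priority base bits
    stronger protection at the expense of the enhancement bits.
    base_bits, enh_bits: (n, 2) arrays with entries in {0, 1}."""
    def qpsk(bits):
        bits = np.asarray(bits)
        return (1 - 2 * bits[:, 0]) + 1j * (1 - 2 * bits[:, 1])
    return alpha * qpsk(base_bits) + qpsk(enh_bits)
```

Low-quality receivers only need to decide the quadrant, which recovers the base bits; high-quality receivers additionally resolve the position within the quadrant to recover the enhancement bits.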
Adaptive Joint Detection for a Random Permutation-based Multiple-Access System
Martial Coulon (INP-Enseeiht / IRIT-TéSA, France); Daniel Roviras (ENSEEIHT, France)
This paper addresses the problem of joint detection for a spread-spectrum multiple-access system based on random permutations. The transmission channels are assumed to be frequency-selective and time-varying; moreover, these channels are unknown to the receiver. Consequently, the detection is achieved using an adaptive algorithm based on the minimum mean-square error (MMSE) criterion. Under the same hypotheses, an adaptive detector is also proposed for a DS-CDMA system and compared with the previous detector. This comparison shows that, even if detection is difficult in such a context for both detectors, the permutation-based method gives better performance.
Joint Spectrum Management and Constrained Partial Crosstalk Cancellation in a Multi-User xDSL Environment
Jan Vangorp (Katholieke Universiteit Leuven, Belgium); Paschalis Tsiaflakis (Katholieke Universiteit Leuven, Belgium); Marc Moonen (Katholieke Universiteit Leuven, Belgium); Jan Verlinden (Alcatel-Bell, Belgium)
In modern DSL systems, crosstalk is a major source of performance degradation. Crosstalk cancellation techniques have been proposed to mitigate the effect of crosstalk. However, the complexity of these crosstalk cancellation techniques grows with the square of the number of lines. Therefore one has to be selective in cancelling crosstalk to reduce complexity. Secondly, crosstalk cancellation requires signal-level coordination between transmitters or receivers, which is not always available. Because of accessibility constraints, crosstalk between some lines cannot be cancelled and so has to be mitigated through spectrum management. This paper presents a solution for the joint spectrum management and constrained partial crosstalk cancellation problem. The complexity of the partial crosstalk cancellation part of the problem is reduced based on a line selection and user independence observation. However, to fully benefit from these observations, power loading has to be applied for spectrum management. We therefore consider ON/OFF power loading, which has only a minor performance degradation compared to normal power loading. The algorithm will be compared to currently available algorithms for independent spectrum management and partial crosstalk cancellation.
Blind MMSE-based receivers for Rate and Data Detection in Variable-Rate CDMA Systems
Stefano Buzzi (University of Cassino, Italy); Stefania Sardellitti (University of Cassino, Italy)
In this paper the problem of rate detection for a variable data rate code division multiple access (CDMA) system is addressed. A multiuser scenario is considered wherein each user may transmit at one out of a set of possible data rates in each data frame. In particular, a data rate and information symbol detection strategy based on a bank of linear minimum mean square error (MMSE) filters is proposed: as many filters as available data rates are considered, and the data rate estimate is the one matched to the filter with the minimum mean output energy (MOE). Analytical expressions for the MOE are derived in order to give theoretical grounds to our detection strategy; moreover, its effectiveness is also validated through simulation results that show satisfactory performance levels.
A Content Aware Scheduling Scheme for Video Streaming to Multiple Users Over Wireless Networks
Peshala Pahalawatta (Northwestern University, USA); Randall Berry (Northwestern University, USA); Thrasyvoulos Pappas (Northwestern University, USA); Aggelos K. Katsaggelos (Northwestern University, USA)
There is a rapidly growing interest in high speed data transmission over digital cellular networks. This interest is fueled mainly by the need to provide multimedia content to mobile users. In this paper, we present a packet scheduling scheme that can be used for real-time streaming of pre-encoded video over downlink packet access wireless networks. We consider a gradient-based scheduling scheme in which user data rates will be dynamically adjusted based on their channel quality as well as the gradients of a utility function. The utility functions are designed by taking into account the distortion of the received video. They allow for content-aware packet scheduling both within and across multiple users. Simulation results show that the gradient-based scheduling framework, when combined with the distortion-aware utility functions, significantly outperforms conventional content-independent packet scheduling schemes.

### Thu.1.3: Genomic Signal Processing I (Invited special session) - 5 papers

Room: Auditorium
Chair: Alfred Hero (University of Michigan, USA)
Dependence Model and Network for Biomarker Identification and Cancer Classification
Peng Qiu (University of Maryland, College Park, USA); Z. Jane Wang (University of British Columbia, Canada); K.J. Ray Liu (Department of Electrical and Computer Engineering, University of Maryland, USA)
Of particular interest in this paper is the development of statistical and modeling approaches for protein biomarker discovery, providing new insights into the early detection and diagnosis of cancer based on mass spectrometry (MS) data. We propose to employ an ensemble dependence model (EDM)-based framework for cancer classification, protein dependence network reconstruction, and biomarker identification. The dependency revealed by the EDM reflects the functional relationships between MS peaks and thus provides some insight into the underlying cancer development mechanism. The EDM-based classification scheme is applied to real cancer MS datasets and provides superior performance for cancer classification compared with the popular Support Vector Machine algorithm. From the eigenvalue pattern of the dependence model, dependence networks are constructed to identify cancer biomarkers. Furthermore, for the purpose of comparison, a classification-performance-based biomarker identification criterion is examined. The dependence-network-based biomarkers show much greater consistency under cross validation. Therefore, the proposed dependence-network-based scheme is promising for use as a cancer diagnostic classifier and predictor.
A Differential Biclustering Algorithm for Comparative Analysis of Gene Expression
Alain Tchagang (University of Minnesota, USA); Ahmed Tewfik (University of Minnesota, USA); Amy Skubitz (University of Minnesota, USA); Keith Skubitz (University of Minnesota, USA)
Convergences and divergences among related organisms (S.cerevisiae and C.albicans for example) or the same organism (healthy and diseased tissues for example) can often be traced to the differential expression of specific groups of genes. Yet, algorithms to characterize such differences and similarities using gene expression data are not well developed. Given two related organisms A and B, we introduce and develop a differential biclustering algorithm that aims at finding convergent biclusters, divergent biclusters, partially conserved biclusters, and split conserved biclusters. A convergent bicluster is a group of genes with similar functions that are conserved in A and B. A divergent bicluster is a group of genes with a similar function in A (or B) but which play a different role in B (or A). Partially conserved biclusters and split conserved biclusters capture more complicated relationships between the behavior and functions of the genes in A and B. Uncovering such patterns can yield new insights into how related organisms have evolved or the role played by some groups of genes during the development of some diseases. Our differential biclustering algorithm consists of two steps. The first step uses a parallel biclustering algorithm to uncover all valid biclusters with coherent evolutions in each dataset. The second step performs a differential analysis on the set of biclusters identified in step one, yielding sets of convergent, divergent, partially conserved and split conserved biclusters.
Choosing the design parameters for protein lysate arrays
Andrea Hategan (Tampere University of Technology, Finland); Ioan Tabus (Tampere University of Technology, Finland); Jaakko Astola (Tampere University of Technology, Finland)
Protein lysate array is a new technology for measuring the relative expressions of proteins, where the array image provides information about the concentrations (expressions) of a given protein for tens of patients or tissues. The array consists of replicated or serially diluted versions of the biological samples at several spots. When producing the lysate array the experimenter has to set several parameters, such as: the concentration of the sample solution to be printed at a certain spot, the concentration of the antibody solution, the number of dilutions, the number of replicates for each biological sample, and the dilution factor. Having the resulting image of intensities at all spots one can assume a nonlinear model and estimate the values of the relative protein expression levels for all biological samples. In this paper we study how the obtained model can be used to improve the design of the experiment, such that if a second lysate array will be produced, better design parameters will be selected. We propose a methodology for choosing the design parameters, and illustrate it with results for several lysate array data sets.
Spectral Analysis of DNA Sequences by Entropy Minimization
Lorenzo Galleani (Politecnico di Torino, Italy); Roberto Garello (Politecnico di Torino, Italy)
Spectral analysis can be applied to study base-base correlation in DNA sequences. A key role is played by the mapping between nucleotides and real/complex numbers. In this paper, we present a new approach where the mapping is not kept fixed: it is allowed to vary aiming to minimize the spectrum entropy, thus detecting the main hidden periodicities. The new technique is first introduced and discussed through a number of case studies, then extended to encompass time-frequency analysis.
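The entropy-minimizing mapping search can be sketched as follows, assuming an exhaustive search over assignments of four fixed real values to the nucleotides (the value set and function names are assumptions for illustration, not the paper's):

```python
import numpy as np
from itertools import permutations

def spectral_entropy(x):
    """Shannon entropy (bits) of the normalized power spectrum of x."""
    p = np.abs(np.fft.rfft(x - np.mean(x))) ** 2
    p = p / p.sum()                       # assumes x is not constant
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

def best_mapping(seq, values=(-2.0, -1.0, 1.0, 2.0)):
    """Try every nucleotide -> value assignment and keep the one whose
    numeric sequence has minimum spectral entropy (sharpest spectrum)."""
    best_h, best_m = None, None
    for perm in permutations(values):
        mapping = dict(zip("ACGT", perm))
        h = spectral_entropy(np.array([mapping[c] for c in seq]))
        if best_h is None or h < best_h:
            best_h, best_m = h, mapping
    return best_h, best_m
```

A low-entropy spectrum concentrates its energy in a few lines, so the winning mapping is the one that exposes the strongest hidden periodicity in the sequence.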
A Sequential Monte Carlo Method for Motif Discovery
Kuo-Ching Liang (Columbia University, USA); Xiaodong Wang (Columbia University, USA); Dimitris Anastassiou (Columbia University, USA)
We propose a sequential Monte Carlo (SMC)-based motif discovery algorithm that can efficiently detect motifs in datasets containing a large number of sequences. The statistical distribution of the motifs and the positions of the motifs within the sequences are estimated by the SMC algorithm. The proposed SMC motif discovery technique can locate motifs under a number of scenarios, including the single-block model, two-block model with unknown gap length, motifs of unknown lengths, motifs with unknown abundance, and sequences with multiple unique motifs. The accuracy of the SMC motif discovery algorithm is shown to be superior to that of the existing methods based on MCMC or EM algorithms. Furthermore, it is shown that the proposed method can be used to improve the results of existing motif discovery algorithms by using their results as the priors for the SMC algorithm.

### Poster: VoIP and Multimedia - 9 papers

Room: Poster Area
Chair: Beatrice Pesquet-Popescu (Ecole Nationale Superieure des Telecommunications, France)
A Voicing Decision Driven Forward Error Correction Scheme For Voice Over IP Applications
Dawn Black (Queen Mary University of London, United Kingdom); Mark Sandler (Queen Mary University of London, United Kingdom)
This paper examines the performance of a new packet loss compensation (PLC) scheme under bursty loss conditions. We confirm that burst losses cause higher distortion than equal rates of single packet loss, and introduce a new Forward Error Correction (FEC) and PLC combination that reduces artifacts typical of PLC schemes. FEC informs the receiver of the pitch and voicing of lost speech, combined as a single metric. A new PLC scheme combines ideas of repetition and interpolation with new approaches made possible by FEC to create loss-concealing speech. The performance of the new scheme is assessed through comparison with the existing standard, and the increase in bit rate incurred by FEC is quantified.
Adaptive playout scheduling for multi-stream voice over IP networks
Chun-Feng Wu (National Chiao-Tung University, Taiwan); Ie-Ten Lin (National Chiao-Tung University, Taiwan); Wen-Whei Chang (National Chiao Tung University, Taiwan)
Packet delay and loss are two essential problems to real-time voice transmission over IP networks. In the proposed system, multiple redundant descriptions of the speech are transmitted to take advantage of largely uncorrelated delay and loss characteristics on independent network paths. Adaptive playout scheduling of multiple voice streams is then formulated as a constrained optimization problem leading to a better balance between end-to-end delay and packet loss. For proper reconstruction of continuous speech, we also develop a packet-based time-scale modification algorithm based on sinusoidal representation of the speech production mechanism. Experimental results indicate that the proposed adaptive multi-stream playout scheduling technique improves the delay-loss tradeoff as well as speech reconstruction quality.
Streaming over the Internet with a scalable parametric audio coder
Juan Carlos Cuevas Martínez (University of Jaén, Spain); Pedro Vera Candeas (University of Jaén, Spain); Nicolas Ruiz Reyes (University of Jaén, Spain)
Audio compression has progressively gained importance on the Internet thanks to the massive amount of multimedia services it hosts. These new services require coders adapted to that environment. Therefore, new-generation coders use more complex models focused on the features that make audio streaming over the Internet possible, mainly low bit rate, scalability and robustness. In our case, a good trade-off between bit rate reduction and audio quality is achieved by using parametric audio coding; furthermore, this coder has a scalable version optimized for streaming requirements. The coder avoids differential information between coded audio segments and uses a layered scheme for changing the bit rate straightforwardly. The results reveal our coder as a good candidate for massively distributed audio applications, like music on demand, radio broadcasting or real-time audio streaming. This article presents the main features of the coder and their implications for streaming.
Video Transmission over UMTS Networks using UDP/IP
Sébastien Brangoulo (GET/ENST - Paris, France); Nicolas Tizon (GET-ENST / Paris, France); Beatrice Pesquet-Popescu (Ecole Nationale Superieure des Telecommunications, France)
With the advent of third-generation wireless cellular systems (3G), video streaming over wireless networks has become ubiquitous. However, the characteristics of wireless systems pose a major challenge for reliable transport of real-time multimedia applications, since data transmitted over wireless channels is highly sensitive to noise, interference and multipath environments, which can cause both packet losses and bit errors. The latest 3GPP/3GPP2 standards require 3G terminals to support MPEG4-AVC/H.264. This ISO/ITU video standard has built-in error resilience tools and supports either a classical packetization scheme or an RTP-based packetization scheme. Classical transport schemes over wireless networks use the RTP/UDP/IP stack. In this article, experiments are carried out to analyse the performance of UDP/IP transport over 3G networks without using RTP.
A High Performance, Low Latency, Low Power Audio Processing System For Wideband Speech Over Wireless Links
In this paper we present an audio processing system optimized for bi-directional wideband speech processing over wireless links. The system's architecture features a DSP core, a WOLA filterbank coprocessor and an I/O processor. We show how speech encoding, decoding and enhancement algorithms can be deployed simultaneously on this platform. The resulting system may include a combination of features that are usually not found when standard codecs are deployed on standard DSPs in wireless headsets. The algorithms can be combined through efficient re-use of the filterbank coprocessor for oversampled and critically sampled subband processing. Such a system may offer low latency, ultra-low power consumption, resilience to background noise and music, and the possibility of adding forward error correction for graceful degradation in difficult RF environments.
Error Sensitivity of JPEG2000 codestream for efficient data protection on unreliable network
Thomas Holl (CRAN-UMR 7039, Nancy-Université, CNRS, France); Vincent Lecuire (CRAN CNRS UMR 7039 - University Henri Poincare of Nancy, France); Jean-Marie Moureaux (Université Henri Poincaré, Nancy 1, France)
Lossy compressed data, and multimedia content by extension, can afford some further losses or degradation in lossy networks without losing their meaning. Still images coded with JPEG2000 exhibit a codestream hierarchically built from codeblock contributions organized in packets across quality layers and resolutions. In this paper we show how the image suffers from missing contributions according to the information available in the packet header, especially the layer inclusion tag. Our goal is to determine the level of importance of JPEG2000 packets from knowledge of the headers alone, for efficient data protection on unreliable networks.
A study of synchronous convergence in mu-law PNLMS for voice over IP
Laura Mintandjian (Nortel, France); Patrick Naylor (Imperial College London, United Kingdom)
A recent algorithm, the mu-law PNLMS, introduces a step-size proportionate to the mu-law of the estimated tap coefficients to cancel sparse echo in telephony over packet-switched networks. It is derived by optimizing the following criterion: fastest convergence is obtained when all coefficients reach the vicinity of their target value at the same time. We present a study of synchronous convergence as introduced in this algorithm. Simulations for a 2-tap adaptive filter illustrate the optimality of mu-law PNLMS. We then compare the performance of this algorithm on multiple echo paths. This comparison shows some restrictions in the applicability of the optimality criterion and highlights possible improvements in robustness for this algorithm.
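As an illustrative sketch only (not the authors' implementation), the proportionate-step idea the abstract describes can be written as an NLMS update whose per-tap gains follow the mu-law of the current coefficient magnitudes. The function name, the gain floor `delta` and the companding constant `mu=255` are assumptions, not taken from the paper.

```python
import numpy as np

def mu_law_pnlms_step(w, x, d, step=0.5, mu=255.0, eps=1e-6, delta=1e-3):
    """One adaptation step of a proportionate NLMS filter whose per-tap
    step-sizes follow the mu-law of the current tap magnitudes, so large
    (active) taps of a sparse echo path adapt faster.
    w: taps, x: input regressor (same length), d: desired sample."""
    e = d - w @ x                        # a priori error
    # mu-law companding of tap magnitudes
    g = np.log1p(mu * np.abs(w)) / np.log1p(mu)
    g = g + delta                        # small floor so inactive taps still adapt
    g = g / g.sum() * len(w)             # normalize the gain distribution
    w_new = w + step * g * x * e / (x @ (g * x) + eps)
    return w_new, e
```

On a noiseless sparse system-identification run, this converges like NLMS at first (uniform gains when `w = 0`) and then accelerates on the dominant taps.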
User-Centric Evolution of Multimedia Content Description
Anastasios D. Doulamis (National Technical University of Athens, Greece)
Humans usually interpret the same content in different ways, mainly due to their subjective perception. For this reason, generic multimedia content description schemes, which remain constant regardless of the user's preferences, are not appropriate for reliable and efficient content description. In this paper, a user-centric multimedia content description scheme is proposed. Initially, simple multimedia content algorithms are applied; then, based on the user's profile, particular descriptor algorithms are analyzed in more detail in a progressive framework. This results in a multi-layer oriented scheme. To estimate which descriptors are important for a particular user, a user profile estimator is applied by exploiting the user's interaction. In particular, visual descriptors are organized into different categories, the energy of which determines the respective degree of importance.
An Adaptable Emotionally Rich Pervasive Computing System
Nikolaos Doulamis (National Technical University of Athens, Greece)
Different people express their feelings in different ways under different circumstances. For this reason, this paper proposes an adaptable architecture able to automatically adjust its performance to a particular individual. This means that the system takes into account the specific user's characteristics and properties and adapts its performance to the specific user's needs and preferences. The architecture also takes into account the context of the environment, which significantly affects the way people express their emotions (family, friends, working environment). As a result, the same expressions may lead to different emotional states according to the specific environment in which these feelings are expressed. The adaptation is performed using concepts derived from functional analysis. The presented adaptable architecture requires little memory and processing capability, and thus can be embedded in smart pervasive devices with low processing requirements. Experimental results on real-life databases illustrate the efficiency of the proposed scheme in recognizing the emotions of different people, or even of the same person under different circumstances.

### Poster: Nonlinear Signal Processing - 4 papers

Room: Poster Area
Chair: Acar Savaci (Izmir Institute of Technology, Turkey)
A greedy algorithm for optimizing the kernel alignment and the performance of kernel machines
Jean-Baptiste Pothin (Université de Technologie de Troyes (UTT), France); Cedric Richard (UT Troyes, France)
Kernel-target alignment has recently been proposed as a criterion for measuring the degree of agreement between a reproducing kernel and a learning task. It makes it possible to find a powerful kernel for a given classification problem without designing any classifier. In this paper, we present an alternating optimization strategy, based on a greedy algorithm for maximizing the alignment over linear combinations of kernels, and a gradient descent to adjust the free parameters of each kernel. Experimental results show an improvement in the classification performance of support vector machines, and a drastic reduction in the training time.
Identifying non-linear fractional chirps using unsupervised hilbert approach
Arnaud Jarrot (ENSIETA - E3I2 laboratory, France); Patrick Oonincx (Netherlands Defense Academy, The Netherlands); Cornel Ioana (ENSIETA - E3I2 laboratory, France); Andre Quinquis (ENSIETA, France)
Nonlinear time-frequency structures, naturally present in a large number of applications, are difficult to apprehend by means of Cohen's class methods. In order to improve readability, it is possible to generate other classes of time-frequency representations using time and/or frequency warping operators. Nevertheless, this requires knowledge of a nonlinear warping function which characterizes the time-frequency content. For this purpose, an unsupervised approach to estimate the warping function is proposed here for the case where time-frequency structures can be represented by chirps of fractional order. To this end, a Hilbert transform-based technique is applied in order to make phase-jump detection more robust. Since these phase jumps define the fractional order in a unique way, the chirp order can be estimated by a bisection method. Results obtained from synthetic data illustrate the attractive outlines of the proposed method.
Correlated Discrete Disturbers Cancelation in Data Communication Systems
David Schwingshackl (Infineon Technologies, Austria); Dietmar Straeussnigg (Infineon Technologies, Austria)
In this paper we propose a method for suppressing discrete disturbers in data communication systems where the modulation scheme is implemented using the FFT (Fast Fourier Transform) algorithm. Similar to radio frequency interference (RFI) cancelation in the frequency domain, the compensation is performed after the FFT in the receiver. As opposed to the RFI methods it is not necessary to reserve some of the subchannels for the compensation purpose. However, the new method requires at least one reference tone and all discrete disturbers impairing the data transmission performance must be related to it. For example, this is the case for harmonics where the fundamental acts as reference tone. A detailed derivation of the compensation method is presented and illustrated by means of an example.
Supervised Compression of Multivariate Time Series Data
Victor Eruhimov (Intel, Russia); Vladimir Martyanov (Intel, Russia); Eugene Tuv (Intel, USA); Peter Raulefs (Intel, USA)
A problem of supervised learning from multivariate time series (MTS) data, where the target variable is potentially a highly complex function of MTS features, is considered. This paper focuses on finding a compressed representation of MTS while preserving its predictive potential. Each time sequence is decomposed into Chebyshev polynomials, and the decomposition coefficients are used as predictors in a statistical learning model. A feature selection method capable of handling true multivariate effects is then applied to identify relevant Chebyshev features. MTS compression is achieved by keeping only those predictors that are pertinent to the response.
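The decomposition step described above can be sketched in a few lines (an illustration under stated assumptions, not the paper's pipeline: the degree, the mapping of sample indices to [-1, 1], and the function names are choices made here for clarity).

```python
import numpy as np

def chebyshev_compress(series, degree=8):
    """Compress a 1-D time sequence to degree+1 Chebyshev coefficients.
    The coefficients can then serve as predictors in a learning model."""
    t = np.linspace(-1.0, 1.0, len(series))   # map sample index to [-1, 1]
    return np.polynomial.chebyshev.chebfit(t, series, degree)

def chebyshev_reconstruct(coeffs, n):
    """Evaluate the truncated Chebyshev expansion back on n sample points."""
    t = np.linspace(-1.0, 1.0, n)
    return np.polynomial.chebyshev.chebval(t, coeffs)
```

A smooth 100-sample sequence is thus reduced to 9 numbers, with the reconstruction error governed by how fast the Chebyshev coefficients decay.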

### Poster: Spectral Analysis - 13 papers

Room: Poster Area
Chair: Kvs Hari (Indian Institute of Science, India)
Automatic Identification of Bird Calls using Spectral Ensemble Average Voiceprints
Hemant Tyagi (IIT Madras, India); Rajesh Hegde (UCSD, USA); Hema Murthy (Indian Institute of Technology, Madras, India); Anil Prabhakar (IIT Madras, India)
Automatic identification of bird calls without manual intervention has been a challenging task for meaningful research on the taxonomy and monitoring of bird migrations in ornithology. In this paper we apply several techniques used in speech recognition to the automatic identification of bird calls. A new technique which computes the ensemble average of the FFT spectrum is proposed for identification of bird calls. This ensemble average is computed on the FFT spectrum of each bird and is called the Spectral Ensemble Average Voiceprint (SEAV) of that particular bird. The SEAVs of various birds are computed and are found to differ from each other. A database of bird calls is created from the available recordings of fifteen bird species. The SEAV is then used for the identification of bird calls from this database. The results of identification using SEAV are then compared against the results derived from common classifiers used in speech recognition such as dynamic time warping (DTW) and Gaussian mixture modeling (GMM). One-level and two-level classifier combinations are also tried by combining the SEAV classifier with the DTW classifier. The SEAV is computationally less expensive than the DTW or GMM based classifiers while performing better than the DTW technique. Several new possibilities in automatic bird call identification are also listed.
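An SEAV-style template, as the abstract describes it, is simply an average of framewise FFT magnitude spectra. The sketch below illustrates that idea; the frame length, hop size, Hann window and Euclidean matching are assumptions of this illustration, not details taken from the paper.

```python
import numpy as np

def spectral_ensemble_average(signal, frame_len=512, hop=256):
    """Average the FFT magnitude spectra of overlapping frames to form a
    single fixed-length voiceprint of a recording (an SEAV-style template)."""
    frames = [signal[i:i + frame_len]
              for i in range(0, len(signal) - frame_len + 1, hop)]
    window = np.hanning(frame_len)
    spectra = [np.abs(np.fft.rfft(f * window)) for f in frames]
    return np.mean(spectra, axis=0)

def identify(call, templates, frame_len=512):
    """Return the name of the template closest (Euclidean) to the call's SEAV."""
    seav = spectral_ensemble_average(call, frame_len)
    return min(templates, key=lambda name: np.linalg.norm(seav - templates[name]))
```

Because matching reduces to one FFT pass plus a nearest-template search, this is cheaper than frame-by-frame DTW alignment.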
Spectral Analysis of a Signal Driven Sampling Scheme
Mian Saeed (Laboratory TIMA, France); Laurent Fesquet (TIMA, France)
This work is part of a drastic revolution in the classical signal processing chain required in mobile systems. The system must be low power as it is powered by a battery. Thus a signal-driven sampling scheme based on level crossing is adopted, delivering sampled points non-uniformly spaced in time. In order to analyse the non-uniformly sampled signal obtained at the output of this sampling scheme, a new spectral analysis technique is devised. The idea is to combine the features of both uniform and non-uniform signal processing chains in order to obtain good spectrum quality with low computational complexity. The comparison of the proposed technique with the General Discrete Fourier Transform and Lomb's algorithm shows significant improvements in terms of spectrum quality and computational complexity.
Performance of root-MUSIC algorithm using Real-World Arrays
Fabio Belloni (Helsinki University of Technology, Finland); Andreas Richter (Helsinki University of Technology, Finland); Visa Koivunen (Helsinki University of Technology, Finland)
In this paper we study the performance of the root-MUSIC algorithm and its extensions to non-ULA configurations using real-world antenna arrays. These arrays are non-ideal and built from directional elements, where each sensor has its own directional beampattern. The manifold separation technique and the novel Element-Space (ES) root-MUSIC algorithm for Direction-of-Arrival (DoA) estimation, which allow extending root-MUSIC to non-ULA configurations, are considered. Here, we first describe how to select the number of modes in order to minimize the array modelling error. Then we test the novel ES root-MUSIC algorithm on real-world antenna arrays. Through simulation results we verify that the algorithm can perform on real-world arrays with different configurations, showing statistical performance close to the CRB regardless of the array imperfections.
Frequency Estimation Based On Adjacent DFT bins
Michael Betser (France Telecom R&D, France); Patrice Collen (France Telecom R&D, France); Gael Richard (ENST Paris, France)
This paper presents a method to derive efficient frequency estimators from the Discrete Fourier Transform (DFT) of the signal. These estimators are very similar to the phase-based Discrete Fourier Spectrum (DFS) interpolators but have the advantage of allowing any type of analysis window (and especially non-rectangular windows). As a consequence, this leads to better estimates in the case of a complex tone (cisoid) perturbed by other cisoids. Overall, our best estimator yields results similar to those of phase vocoder and reassignment estimators but at a lower complexity, since it is based on a single Fast Fourier Transform (FFT) computation.
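For context on the kind of adjacent-bin refinement discussed here, the sketch below shows the classic magnitude-based parabolic interpolator on three DFT bins. This is an illustration of the general bin-interpolation idea only; it is not the paper's phase-based estimator, and the Hann window choice is an assumption of this sketch.

```python
import numpy as np

def parabolic_bin_interpolation(x, window=None):
    """Refine a sinusoid's frequency beyond DFT bin resolution by fitting a
    parabola to the log-magnitudes of the peak bin and its two neighbours.
    Returns the estimate in cycles/sample."""
    n = len(x)
    if window is None:
        window = np.hanning(n)
    mag = np.abs(np.fft.rfft(x * window))
    k = int(np.argmax(mag[1:-1])) + 1          # peak bin (avoid spectrum edges)
    a, b, c = np.log(mag[k - 1]), np.log(mag[k]), np.log(mag[k + 1])
    delta = 0.5 * (a - c) / (a - 2 * b + c)    # fractional bin offset
    return (k + delta) / n
```

With a Hann window the residual bias of this fit is a small fraction of a bin, which is why window-aware or phase-based estimators such as those compared in the paper can do better still.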
On the Equivalence of Phase-Based Methods for the Estimation of Instantaneous Frequency
Sylvain Marchand (LaBRI, University of Bordeaux 1, France); Mathieu Lagrange (University of Bordeaux 1, France)
Estimating the frequency of sinusoidal components is a key problem in many applications, such as in sinusoidal sound modeling, where the estimation has to be done with a low complexity, on short-term spectra. Many estimators have therefore been proposed in the literature. Among these, we focus in this paper on a class known as the "phase-based" estimators. Despite their different theoretical backgrounds, we prove that four of these estimators are equivalent, at least in theory. We also demonstrate that these estimators perform roughly similarly in practice; however, small differences remain which are mainly due to numerical properties of the mathematical operators used in their implementations.
Matched Time-frequency Representations and Warping Operator for Modal Filtering
Grégoire Le Touzé (Laboratory of Images and Signals, France); Jerome Mars (Laboratoire des Images et des Signaux, France); Jean-Louis Lacoume (Laboratory of Images and Signals, France)
In a waveguide, pressure signals break up into modes. This paper presents two signal processing tools to perform mode filtering adapted to guided waves. The first is an invertible time-frequency representation on which filtering is possible. The second is an axis warping operator. We test both methods on real data coming from a survey in the North Sea and compare their performance on a synthetic dataset. We finally show that these two tools complement each other.
Improvements in HMM Based Spectral Frequency Line Estimation
Tuncay Gunes (Florida Atlantic University, USA); Nurgun Erdol (Florida Atlantic University, USA)
This paper considers the application of Hidden Markov Models to the problem of tracking frequency lines in spectrograms of strongly non-stationary signals, such as those encountered in aero-acoustics and sonar, where tracking difficulties arise from low SNR and the large variances associated with spectral estimates. We introduce a novel method to determine the observation (measurement) likelihoods by interpolation between local maxima. We also show that use of low-variance AutoRegressive MultiTaper (ARMT) spectral estimates results in improved tracking. The frequency line is tracked using the Forward-Backward and Viterbi algorithms.
Analysis of multi-component non-stationary signals using Fourier-Bessel transform and Wigner distribution
Ram Bilas Pachori (Indian Institute of Technology Kanpur, India); Pradip Sircar (Indian Institute of Technology Kanpur, India)
We present a new method for time-frequency representation (TFR), which combines the Fourier-Bessel (FB) transform and the Wigner-Ville distribution (WVD). The FB transform decomposes a multi-component signal into a number of mono-component signals, and then the WVD technique is applied on each component of the composite signal to analyze its time-frequency distribution (TFD). The simulation results show that the proposed technique based on the FB decomposition is a powerful tool for analyzing multi-component non-stationary signals and for obtaining the TFR of the signal without cross terms.
Formant Estimation of Speech Signals using Subspace-based Spectral Analysis
Sotiris Karabetsos (Institute for Language and Speech Processing (ILSP), Greece); Pirros Tsiakoulis (Institute for Language and Speech Processing, Greece); Evita Fotinea (Institute for Language and Speech Processing, Greece); Ioannis Dologlou (ILSP, Greece)
The objective of this paper is to propose a signal processing scheme that employs subspace-based spectral analysis for the purpose of formant estimation of speech signals. Specifically, the scheme is based on decimative spectral estimation that uses eigenanalysis and SVD (Singular Value Decomposition). The underlying model assumes a decomposition of the processed signal into complex damped sinusoids. In the case of formant tracking, the algorithm is applied on a small number of the autocorrelation coefficients of a speech frame. The proposed scheme is evaluated on both artificial and real speech utterances from the TIMIT database. For the first case, comparative results to standard methods are provided which indicate that the proposed methodology successfully estimates formant trajectories.
Root Spectral Estimation for Location based on TOA
Luis Blanco (UPC, Spain); Jordi Serra (UPC, Spain); Montse Najar (UPC, Spain)
As is well known, Non-Line-Of-Sight (NLOS) and multipath propagation bias the Time Of Arrival (TOA) estimate, reducing the accuracy of positioning algorithms. High-resolution first-arriving-path detectors based on the minimum variance (MV) and normalized minimum variance (NMV) estimates of the power delay profile, obtained from propagation channel estimates, yield highly accurate TOA estimation even in strong multipath scenarios. The aim of this paper is to present root versions of the MV and NMV algorithms and to analyse the improvements in positioning accuracy obtained by the new approaches.
Phase Noise Mitigation in the Autocorrelation Estimates With Data Windowing: The Case of Two Close Sinusoids
Mustafa Altinkaya (Izmir Institute of Technology, Turkey); Emin Anarim (Bogazici University, Turkey); Bulent Sankur (Bogazici University, Turkey)
We address the phase noise and the superresolution problem in Toeplitz matrix-based spectral estimates. The Toeplitz autocorrelation (AC) matrix approach in spectral estimation brings in an order of magnitude computational advantage while the price paid is the phase noise that becomes effective at high signal-to-noise ratios (SNR). This noise can be mitigated with windowing the data though some concomitant loss in resolution occurs. The trade-offs between additive noise SNR, resolvability of sinusoids closer than the resolution limit, and behavior of the estimated AC lags and tone frequencies are investigated.
Estimation of the spectral exponent of 1/f process corrupted by white noise
Suleyman Baykut (Istanbul Technical University, Turkey); Melike Erol (Istanbul Technical University, Turkey); Tayfun Akgül (Istanbul Technical University, Turkey)
1/f noise is used to model a large number of processes, such as network traffic data, GPS (Global Positioning System) noise, and financial and biological data. However, observations on real data have shown that the assumption of a purely 1/f model may be inadequate, as the measured data may contain trend, periodicity or noise. These are considerable factors affecting the estimation of the spectral exponent. In this work, we examine real GPS noise and network traffic data and apply a wavelet-based method to remove the effect of white noise in these data sets.
A spectral identity card
Corinne Mailhes (ENSEEIHT - IRIT - Tésa, France); Nadine Martin (LIS, INP Grenoble, France); Kheira Sahli (LIS, INP Grenoble, France); Gerard Lejeune (LIS, INP Grenoble, France)
This paper studies a new spectral analysis strategy for detecting, characterizing and classifying spectral structures of an unknown stationary process. The spectral structures we consider are defined as sinusoidal waves, narrow band signals or noise peaks. A sum of an unknown number of these structures is embedded in an unknown colored noise. The proposed methodology provides a way to calculate a spectral identity card, which features each of these spectral structures, similarly to a real I.D. The processing is based on a local Bayesian hypothesis testing, which is defined in frequency and which takes account of the noise spectrum estimator. Thanks to a matching with the corresponding spectral window, each I.D. card permits the classification of the associated spectral structure into one of the following four classes: Pure Frequency, Narrow Band, Alarm and Noise. Each I.D. card is actually the result of the fusion of intermediate cards, obtained from complementary spectral analysis methods.

### Poster: Synchronization and Channel Equalization - 12 papers

Room: Poster Area
Chair: Paolo Banelli (University of Perugia, Italy)
Nonlinear Equalization Structure For High-Speed ADSL In Ideal And Non Ideal Conditions
François Nougarou (Universite du Quebec a Trois-Rivieres, Canada); Daniel Massicotte (Universite du Quebec a Trois-Rivieres, Canada); Messaoud Ahmed-Ouameur (Universite du Quebec a Trois-Rivieres, Canada)
The ADSL G.DMT technology, based on discrete multi-tone (DMT) modulation, employs at the receiver an equalization structure consisting of a time-domain equalizer (TEQ) and a frequency-domain equalizer (FEQ). These two complementary systems provide good immunity against inter-symbol interference (ISI), improving the bit error rate while optimizing the throughput. In the literature, several such equalization structures have been proposed, but they are usually tested under ideal conditions, with perfect knowledge of the channel characteristics and with linear channels. In this paper, we present a TEQ and an FEQ designed to perform equalization under non-ideal transmission conditions while meeting the technology requirements. A nonlinear TEQ based on a neural network structure is proposed to attenuate the interference due to the non-ideal conditions.
Blind feedforward symbol timing estimation with PSK signals
Tilde Fusco (University of Naples, Italy); Mario Tanda (Università di Napoli Federico II, Italy)
This paper deals with the problem of blind symbol timing estimation with M-ary phase-shift keying signals. A least-squares (LS) estimator exploiting the structure of the received signal when the convolution of the transmitter's signaling pulse and the receiver filter satisfies the Nyquist criterion, is proposed. Since the derived LS algorithm requires a maximization with respect to a continuous variable, a closed-form approximate LS (ALS) algorithm, suitable for digital implementation, is proposed. Computer simulation results show that with small excess bandwidth factors the derived ALS algorithm outperforms previously proposed algorithms at moderate and high signal-to-noise ratios.
Digital Modulation Classification in Flat-Fading Channels
Anchalee Puengnim (INP-ENSEEIHT, France); Nathalie Thomas (INP-ENSEEIHT, France); Jean-Yves Tourneret (IRIT/ENSEEIHT/TéSA, France)
This paper addresses the problem of classifying digital modulations in a Rayleigh fading environment. The first step of the proposed classifier consists of estimating the parameters unknown by the receiver, i.e., the fading amplitude, phase offset, and residual carrier frequency. These unknown parameters appearing in the class conditional densities are then replaced by their estimates, resulting in a so-called plug-in classifier. The performance of this classifier is compared to another classification strategy recently proposed to solve the modulation classification problem in a fading environment.
Semi-blind Bussgang Equalization Algorithm
Stefania Colonnese (Università "La Sapienza" di Roma, Italy); Gianpiero Panci (Università "La Sapienza" di Roma, Italy); Gaetano Scarano (Università "La Sapienza" di Roma, Italy)
This paper addresses the problem of semi-blind equalization following a Bussgang approach. An equalization scheme that integrates the training information into the iterative Bussgang algorithm is analyzed. The semi-blind equalization scheme allows flexible introduction of redundancy in the transmission scheme. The accuracy is assessed in the reference case of the Global System for Mobile communication (GSM), i.e. GMSK modulated signals received through typical mobile radio channels. Numerical simulations show that the semi-blind Bussgang equalization algorithm achieves performance comparable with the Maximum Likelihood Sequence Estimator (MLSE) implemented by the Viterbi algorithm. Its flexibility makes it possible to consider different amounts of training information as well as higher-order constellations.
A Novel Channel Estimator for a Superposition-based Cooperative System
Yu Gong (Queen's University of Belfast, United Kingdom); Zhiguo Ding (Queen's University Belfast, United Kingdom); Tharmalingam Ratnarajah (Queens University of Belfast, United Kingdom); Colin Cowan (The Queen's University of Belfast, United Kingdom)
This paper investigates channel estimation and equalization for the relay node of a cooperative diversity system based on superposition modulation. Exploiting the superposition structure of the transmission data, we propose a novel channel estimator for the relay node. Without any pilot sequence, the proposed estimator can converge to the ideal case as if the transmission data were known to the receiver. We also re-derive the soft-in-soft-out MMSE equalizer to match the superimposed data structure of the cooperative scheme. Finally, computer simulation results are given to verify the proposed algorithm.
Effect of Noise on Blind Adaptive Multichannel Identification Algorithms: Robustness Issue
Md. Hasan (Bangladesh University of Engineering and Technology, Bangladesh); Patrick Naylor (Imperial College London, United Kingdom)
An analysis of the effect of noise on the convergence characteristics of least-mean-squares (LMS) type adaptive algorithms for blind channel identification is presented. It is shown that adaptive blind algorithms misconverge in the presence of noise. A novel technique for ameliorating this misconvergence, using a frequency-domain energy constraint in the adaptation rule, is proposed. Experimental results demonstrate that the robustness of blind adaptive algorithms can be significantly improved using such constraints.
Channel estimation in the presence of multipath Doppler by means of pseudo-noise sequences
Olivier Rabaste (ENST Bretagne, France); Thierry Chonavel (ENST Bretagne, France)
This paper addresses the problem of multipath channel estimation when channel paths are subject to Doppler shifts. The proposed method builds a rough approximation of the signal ambiguity function by means of a filter bank. Each output of the filter bank is deconvolved by means of an MCMC approach, which provides estimates of the path delays. The estimation of amplitudes and Doppler frequencies is then carried out at each detected path delay. The method is able to cope with simultaneous paths subject to distinct Doppler offsets and with interference occurring among paths. Cramer-Rao Lower Bounds are derived and presented with simulation results.
Implementation of a Blind Adaptive Decision Feedback Equalizer
Goulven Eynard (ENST Bretagne, France); Christophe Laot (GET/ENST Bretagne, France)
This paper proposes an efficient implementation of the blind Self-Adaptive Decision Feedback Equalizer (SA-DFE). This innovative low-complexity equalizer has the particularity of adjusting its structure to the difficulty of the channel: it switches to a linear recursive equalizer when the channel is severe and to a decision-directed DFE when the channel is sufficiently easy. This paper gives details on the two structures of the SA-DFE and justifies why these structures are relevant, but it focuses more particularly on the transition between the two modes of the SA-DFE. Indeed, simulations show that if the transition from one mode to the other is not well implemented, the equalizer can oscillate between the two structures, causing a major loss of convergence rate. Several ways to obtain smooth transitions are proposed, and simulation results are presented and analyzed. This unsupervised self-adaptive equalizer has been widely tested for underwater communications and shows good ability in practice to deal with severe frequency-selective and quickly time-varying channels.
A Nonlinear Channel Equalization Using an Algebraic Approach and The Affine Projection Algorithm
Arfa Hichem (Ecole Nationale d'Ingenieurs de Tunis, Tunisia); EL Asmi Sadok (SUPCOM, Tunisia); Safya Belghith (SYSCOM Laboratory/ ENIT, Tunisia)
In this paper, the problem of nonlinear equalization is addressed. We use an algebraic approach which allows us to define the conditions for the existence of a left inverse of a nonlinear system, and therefore the equalization conditions. These existence conditions require the computation of the rank of certain Jacobian matrices. The approach is applied to a Volterra filter, which represents a nonlinear system. We also show that these equalizability conditions depend on the coefficients of the nonlinear system and on the input values; we can therefore verify, for any channel with assumed known coefficients, whether the system is ideally equalizable or not. The algorithm used to test the equalization performance is the Affine Projection Algorithm (APA). The choice of APA is justified by the use of a colored excitation signal as input, owing to the nonlinear channel characteristics.
Adaptive Equalizer Based on a Power-of-Two-Quantized-LMF algorithm
Musa Otaru (KFUPM, Saudi Arabia); Azzedine Zerguine (KFUPM, Saudi Arabia); Lahouari Cheded (KFUPM, Saudi Arabia); Asrar Sheikh (King Fahd Univ. of Petroleum and Minerals, Saudi Arabia)
High speed and reliable data transmission over a variety of communication channels, including wireless and mobile radio channels, has been rendered possible through the use of adaptive equalization. In practice, adaptive equalizers rely heavily on the use of the least-mean square (LMS) algorithm which performs sub-optimally in the real world that is largely dominated by non-Gaussian interference signals. This paper proposes a new adaptive equalizer which relies on the judicious combination of the least-mean fourth (LMF) algorithm, which ensures a better performance in a non-Gaussian environment, and the power-of-two quantizer (PTQ) which reduces the high computational load brought about by the LMF and hence renders the proposed low-complexity equalizer capable of tracking fast-changing channels. This paper also presents a performance analysis of the proposed adaptive equalizer, based on a new linear approximation of the PTQ. Finally, the extensive simulation carried out here using the quantized LMF corroborates very well the theoretical predictions provided by the analysis of the linearized proposed algorithm.
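The combination the abstract describes, an LMF (cubed-error) update with a power-of-two quantizer, can be sketched as follows. This is an illustration under assumptions made here (round-down quantization of the cubed error, the `min_exp` floor, the step size), not the authors' equalizer or their linear approximation of the PTQ.

```python
import numpy as np

def pow2_quantize(v, min_exp=-12):
    """Round a value down to the nearest power of two, keeping its sign,
    so the multiply in the filter update reduces to a bit shift in hardware.
    Values below 2**min_exp are flushed to zero."""
    if abs(v) <= 2.0 ** min_exp:
        return 0.0
    return float(np.sign(v)) * 2.0 ** float(np.floor(np.log2(abs(v))))

def quantized_lmf_step(w, x, d, mu=0.01):
    """One least-mean-fourth (LMF) update: the error is cubed (rather than
    the LMS error), then passed through the power-of-two quantizer."""
    e = d - w @ x
    return w + mu * pow2_quantize(e ** 3) * x, e
```

Because the cubed error shrinks rapidly near convergence, the quantizer's zero-flush floor effectively sets the residual misadjustment of this sketch.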
Mitigation of Narrowband Interference Using Adaptive Equalizers
Arun Batra (University of California at San Diego, USA); James Zeidler (University of California, San Diego, USA); A. A. (Louis) Beex (Virginia Tech, Blacksburg, USA)
It has previously been shown that an LMS decision-feedback filter can mitigate the effect of narrowband interference. An adaptive implementation of the filter was seen to converge relatively quickly for mild interference. It is shown here, however, that in the case of severe narrowband interference, the decision-feedback equalizer (DFE) requires a convergence time that makes it unsuitable for some types of communication systems. The introduction of a linear predictor as a pre-filter to this equalizer greatly reduces the total convergence time. There is, however, a trade-off between convergence time and steady-state performance, which is evaluated in this paper.
Hypothesis-Feedback Equalization For Multicode Direct-Sequence Spread Spectrum Underwater Communications
Jianguo Huang (College of Marine Engineering, Northwestern Polytechnical University, Xi'an 710072, P.R. China); Jing Han (Northwestern Polytechnical University, P.R. China); Xiaohong Shen (Northwestern Polytechnical University, P.R. China)
In this paper, multicode direct-sequence spread spectrum is considered for achieving high-speed data transmission over the underwater acoustic channel, where extended multipath and rapid time-variability are encountered and the conventional RAKE receiver usually fails. To track and compensate the channel distortion, a decentralized hypothesis-feedback equalization (HFE) algorithm that updates its coefficients at the chip rate is a promising method and has been used in multi-user underwater communication. For a multicode system, however, its performance is degraded by inter-channel interference (ICI). For this reason, a parallel interference cancellation hypothesis-feedback equalization (PIC-HFE) algorithm is proposed, which combines the capabilities of tracking the time-varying channel and suppressing the ICI. Simulation results show that the proposed algorithm significantly improves the performance of the multicode system.

### Thu.6.3: Radar and Remote Sensing - 5 papers

Room: Room 4
Chair: Fabrizio Berizzi (University of Pisa, Italy)
Diagonal loaded array interpolation methods for multibaseline cross-track SAR interferometry
Matteo Pardini (University of Pisa, Italy); Fabrizio Lombardini (Univ. of Pisa, Italy); Fulvio Gini (University of Pisa, Italy)
This work deals with the problem of interferometric radar phase (IP) estimation in the presence of layover. The focus here is on multichannel interferometric synthetic aperture radar (InSAR) systems with a low number of phase centres and nonuniform array geometry. Interpolated array (IA) approaches allow the application of parametric spectral estimation techniques designed for uniform linear arrays (ULAs). The need for obtaining a well-conditioned IA transformation matrix results in the estimation of a virtual ULA output with a number of elements lower than or equal to that of the actual nonuniform linear array (NLA). Here we extend the IA approach to allow a greater number of virtual elements by means of a diagonal loading technique. The performance of the proposed technique is compared with that obtained by means of another interpolation algorithm that is optimal in a mean square error (MSE) sense, and with the Cramér-Rao lower bound (CRLB) calculated for the actual NLA.
New SAR Processor Based on a Subspace Detector
Rémi Durand (SONDRA - Supelec, France); Guillaume Ginolhac (GEA- Univeristé Paris X, France); Philippe Forster (GEA, Université Paris X, France); Laetitia Thirion (SONDRA, France)
This paper deals with a new SAR processor based on a subspace detector used for Man-Made Target (MMT) detection. The new algorithm aims at using new models, different from the isotropic point model commonly used in SAR processors. The implementation of the Subspace Detector SAR (SDSAR) algorithm is described throughout the paper, and a simple example shows the benefit of using models matched to the target.
Estimation of Ocean Wave Heights from Temporal Sequences of X-Band Marine Radar Images
Jose Nieto-Borge (Universidad de Alcala de Henares, Spain); Pilar Jarabo (University of Alcalá, Spain); David Mata-Moya (University of Alcalá, Spain); Francisco Lopez (University of Alcalá, Spain)
Marine radars scan the water surface at grazing incidence with HH polarization. Unlike other remote sensing systems, marine radars cover smaller areas, but these sensors are able to obtain short-term temporal information about wave fields using consecutive antenna rotations. This work deals with a method to estimate the significant wave height from sea clutter image time series. The method is based on similar techniques developed for Synthetic Aperture Radar (SAR) systems. The basic idea is that the significant wave height depends linearly on the square root of the signal-to-noise ratio, where the signal is the radar estimate of the wave spectral energy and the noise is the energy due to sea surface roughness.
An Evolutionary Approach for 3D Superresolution Imagery
Felix Totir (ENSIETA, France); Emanuel Radoi (Ecole Nationale Supérieure des Ingénieurs d'Etudes et Techniques d'Armement, France); Andre Quinquis (ENSIETA, France); Stefan Demeter (Military Technical Academy, Romania)
The paper presents an evolutionary approach for 3D superresolution imagery combining the CLEAN method and an optimization procedure based on genetic algorithms. The CLEAN technique is basically an iterative method aimed at recovering information about the reflectivity map of a radar target. The presented approach increases the reconstruction robustness and skips the polar formatting step usually required by other imaging methods. The main idea is to consider the reconstruction process as an optimization problem related to the residual energy of the acquired data after each scattering center (SC) extraction and cancellation. We propose an effective solution to this problem, which takes advantage of some powerful convergence properties of genetic algorithms, a special class of evolutionary techniques.
A Modified Conjugate Gradient algorithm for low sample support in STAP radar
Hocine Belkacemi (LSS-CNRS/SUPELEC, France); Sylvie Marcos (Laboratoire des Signaux et Systèmes, France); Marc Lesturgie (Office National d'Études et de Recherches Aérospatiales (ONERA), France)
The Conjugate Gradient (CG) algorithm has recently been shown to be equivalent to the MultiStage Wiener Filter (MSWF), an effective tool for interference suppression in space-time adaptive processing (STAP) radar. In this paper, we give further insight into the interconnection between the MSWF and the CG. We propose a modified version of the CG for low sample support in which forward/backward (f/b) averaging is used to estimate the covariance matrix. The new algorithm benefits from the CG for rank compression and from the (f/b) subaperture smoothing for sample support compression. The effectiveness of the algorithm is demonstrated through simulations.

### Thu.4.3: Speech Enhancement I - 5 papers

Room: Sala Onice
Chair: Saeed Vaseghi (Brunel University, United Kingdom)
An Extended Normalized Multichannel FLMS Algorithm for Blind Channel Identification
Rehan Ahmad (Imperial College London, United Kingdom); Andy Khong (Imperial College London, United Kingdom); Md. Hasan (Bangladesh University of Engineering and Technology, Bangladesh); Patrick Naylor (Imperial College London, United Kingdom)
Blind channel estimation algorithms for acoustic channels have generated much interest in recent years due to the innovations in consumer products including, but not limited to, tele- and video-conferencing. The direct path constrained NMCFLMS algorithm was proposed to enhance noise robustness of the conventional NMCFLMS algorithm. In this paper, we propose to extend the direct path constrained NMCFLMS algorithm with the aim of achieving a higher rate of convergence. This objective is achieved by introducing a penalty component to the multichannel blind adaptive cost function and we further derive the proposed extended-NMCFLMS algorithm from first principles. Simulation results show an improvement, both in convergence rate and noise robustness, compared to existing NMCFLMS algorithms.
Generalized sidelobe canceller based acoustic feedback cancellation
Geert Rombouts (KULeuven, Belgium); Ann Spriet (KULeuven/ESAT-SCD, Belgium); Marc Moonen (Katholieke Universiteit Leuven, Belgium)
We propose a combination of the well-known generalized sidelobe canceller (GSC), or Griffiths-Jim beamformer, and the so-called PEM-AFROW algorithm for closed-loop room impulse response estimation, resulting in a system for multi-microphone proactive acoustic feedback cancellation. For public address applications in low-reverberant environments, the computational complexity is reduced dramatically compared to state-of-the-art proactive acoustic feedback cancellers, while performance is only marginally degraded.
Minima Controlled Noise Estimation for KLT-based Speech Enhancement
Adam Borowicz (Bialystok Technical University, Poland); Alexander Petrovsky (Bialystok Technical University, Poland)
This paper addresses the problem of noise estimation for Karhunen-Loeve transform (KLT) based speech enhancement. The eigenvalues and eigenvectors of the noise covariance matrix are tracked using a recursive averaging algorithm. This process is controlled by the noise power minima obtained from the noisy signal, even during speech activity periods. The proposed approach is especially recommended for the class of signal subspace methods where a whitening transformation is required. Experiments show that the noise tracking algorithm offers performance similar to that of a method based on an idealized voice activity detector (VAD).
Optimum Post-Filter Estimation for Noise Reduction in Multichannel Speech Processing
Stamatis Leukimmiatis (National Technical University of Athens, Greece); Petros Maragos (National Technical University of Athens, Greece)
This paper proposes a post-filtering estimation scheme for multichannel noise reduction. The proposed method is an extension and improvement of the existing Zelinski and McCowan post-filters, which use the auto- and cross-spectral densities of the multichannel input signals to estimate the transfer function of the Wiener post-filter. A drawback of the previous two post-filters is that the noise power spectrum at the beamformer's output is over-estimated, and therefore the derived filters are sub-optimal in the Wiener sense. The proposed method overcomes this problem and can be used for the construction of an optimal post-filter that is also appropriate for a variety of different noise fields. In experiments with real multichannel noise recordings, the proposed technique has been shown to obtain a significant gain over the other studied methods in terms of signal-to-noise ratio, log area ratio distance and a speech degradation measure. In particular, the proposed post-filter achieves a relative SNR enhancement of 17.3% and a relative decrease in signal degradation of 21.7% compared to the best of the other studied methods.
A general optimization procedure for spectral speech enhancement methods
Jan Erkelens (Delft University of Technology, The Netherlands); Jesper Jensen (Delft University of Technology, The Netherlands); Richard Heusdens (Delft University of Technology, The Netherlands)
Commonly used spectral amplitude estimators, such as those proposed by Ephraim and Malah, are only optimal when the statistical model is correct and the speech and noise spectral variances are known. In practice, the spectral variances have to be estimated. A simple analysis of the "decision-directed" approach for speech spectral variance estimation reveals the presence of an important bias at low SNRs. To correct for modeling errors and estimation inaccuracies, we propose a general optimization procedure with two gain functions applied in parallel. The unmodified algorithm is run in the background, but for the final reconstruction a different gain function is used, optimized for a wide range of signal-to-noise ratios. When this technique is implemented for the algorithms of Ephraim and Malah, a large improvement is obtained (on the order of 2 dB segmental SNR improvement and a 0.3 point increase in PESQ). Moreover, less smoothing is needed in the decision-directed spectral variance estimator.
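The decision-directed rule discussed above is the standard Ephraim-Malah a priori SNR estimator. A minimal sketch, assuming a Wiener gain for the feedback term and a known, constant noise variance (the function and variable names are ours):

```python
import numpy as np

def decision_directed_snr(Y2, noise_var, alpha=0.98):
    """Per-frame a priori SNR via the decision-directed rule.
    Y2: |noisy STFT|^2, shape (frames, bins); noise_var: noise power.
    xi[l] = alpha * |A[l-1]|^2 / noise_var + (1-alpha) * max(gamma[l]-1, 0)
    """
    xi = np.zeros_like(Y2)
    A2_prev = np.zeros(Y2.shape[1])          # previous enhanced amplitude^2
    for l in range(Y2.shape[0]):
        gamma = Y2[l] / noise_var            # a posteriori SNR
        xi[l] = alpha * A2_prev / noise_var + (1 - alpha) * np.maximum(gamma - 1, 0)
        G = xi[l] / (1 + xi[l])              # Wiener gain (illustrative choice)
        A2_prev = (G ** 2) * Y2[l]
    return xi

# demo: constant high-SNR frames; xi ramps up and settles near gamma - 1
xi_demo = decision_directed_snr(np.full((50, 4), 100.0), noise_var=1.0)
```

The large smoothing constant `alpha` is what produces the low-SNR bias the paper analyzes: in noise-only frames the `max(gamma - 1, 0)` term vanishes and `xi` is pulled toward the previous (near-zero) estimate.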

### Thu.3.3: Independent Component Analysis - 5 papers

Room: Sala Verde
Chair: Pierre Comon (CNRS, France)
Extraction of Cognitive Activity Related Waveforms from Functional Near Infrared Signals
Ceyhun Akgül (Bogazici University and ENST-Paris, Turkey); Bulent Sankur (Bogazici University, Turkey); Ata Akin (Bogazici University, Turkey)
We address the problem of prototypical waveform extraction from functional near infrared spectroscopy (fNIRS) signals in cognitive experiments. Extracted waveforms represent the brain hemodynamic response (BHR) to visual stimuli provided in an oddball-type experimental protocol. We use and evaluate two statistical signal processing tools, namely independent component analysis (ICA) and waveform clustering, in a comparative manner. Based on the conformance to a parametric BHR model, we determine that the ICA waveform extraction method is superior. We measure and comment on the intra-subject and inter-subject waveform and parameter variability.
How fast is FastICA?
Vicente Zarzoso (Université de Nice - Sophia Antipolis, Laboratoire I3S, France); Pierre Comon (CNRS, France); Mariem Kallel (ENIT, Tunisia)
The present contribution deals with the statistical tool of Independent Component Analysis (ICA). The focus is on the deflation approach, whereby the independent components are extracted one after another. The kurtosis-based FastICA is arguably one of the most widespread methods of this kind. However, its features, particularly its speed, have not been thoroughly evaluated or compared, so that its popularity seems somewhat unfounded. To substantiate this claim, a simple, quite natural modification is put forward and assessed in this paper. It merely consists of performing exact line search optimization of the contrast function. Speed is objectively measured in terms of the computational complexity required to reach a given source extraction performance. Illustrative numerical results demonstrate the faster convergence and higher robustness to initialization of the proposed approach, which is thus referred to as RobustICA.
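For reference, the baseline the paper modifies — the one-unit, kurtosis-based FastICA fixed-point iteration — can be sketched as follows (the mixing matrix and sources are illustrative; deflation onto further components and RobustICA's exact line search are not implemented here):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 20000
S = np.vstack([np.sign(rng.standard_normal(n)),   # binary source
               rng.uniform(-1.0, 1.0, n)])        # uniform source
A = np.array([[0.8, 0.6], [-0.5, 1.0]])           # mixing matrix (illustrative)
X = A @ S

# Whiten the mixtures: Z = E diag(d^-1/2) E' (X - mean)
X = X - X.mean(axis=1, keepdims=True)
d, E = np.linalg.eigh(np.cov(X))
Z = (E * d ** -0.5) @ E.T @ X

# One-unit kurtosis-based FastICA fixed point:
# w <- E[z (w'z)^3] - 3w, then renormalize.
w = rng.standard_normal(2)
w /= np.linalg.norm(w)
for _ in range(100):
    y = w @ Z
    w = (Z * y ** 3).mean(axis=1) - 3.0 * w
    w /= np.linalg.norm(w)

y = w @ Z                                         # extracted component
corr = [abs(np.corrcoef(y, s)[0, 1]) for s in S]  # match against true sources
```

The extracted component should align (up to sign) with exactly one of the two sources; RobustICA replaces the fixed-point step with an exact line search along the gradient of the same kurtosis contrast.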
Blind signal separation by combining two ICA algorithms: HOS-based EFICA and time structure-based WASOBI
Petr Tichavsky (Academy of Sciences of the Czech Republic, Czech Republic); Zbynek Koldovsky (Technical University of Liberec, Czech Republic); Eran Doron (Tel-Aviv University, Israel); Arie Yeredor (Tel-Aviv University, Israel); German Gomez-Herrero (Tampere University of Technology, Finland)
The aim of this paper is to combine two recently derived powerful ICA algorithms to achieve high performance of blind source separation. The first algorithm, abbreviated EFICA, is a sophisticated variant of the popular FastICA algorithm and is based on minimizing a nonlinear HOS criterion; that is, the algorithm ignores the time structure of the separated signals. The second algorithm, WASOBI, is a weight-adjusted variant of the popular SOBI algorithm, which utilizes the time structure of the sources for their separation and does not exploit their non-Gaussianity. For both algorithms it is possible to estimate the separation ability and thus optimally choose the most appropriate separating algorithm. The proposed combination of these algorithms is tested on separating autoregressive signals driven by i.i.d. random sequences drawn from a generalized Gaussian distribution with parameter alpha, and on separating linear instantaneous mixtures of speech signals.
Risk-averting cost function for independent component analysis in signals with multiplicative noise
FMICA is a method to extract a mixture of independent sources when they are contaminated with multiplicative noise; it notably improves on standard ICA methods in the presence of this kind of noise, although its results worsen as the noise level increases. In this paper, we study whether this worsening is due to the existence of local minima or to convergence problems of the statistical functions used, by modifying the cost function that appears in FMICA. The new cost function has the property that, asymptotically, it does not present local minima, so it provides insight into the global convergence of the original cost function and improves the behaviour of FMICA at high noise levels, increasing the applicability of the method.
Extraction of Local Features From Image Sequences Using Spatio-Temporal Independent Component Analysis
Victor Chen (US Naval Research Laboratory, USA)
Independent component analysis (ICA) has been used to extract salient features from natural video images. The purpose of this paper is to study how ICA can be applied to extracting features from a sequence of images of an object taken at different times, from different view angles, or with different spatial structures. Local features are considered as basic building blocks of images. To extract localized features in image sequences, spatio-temporal ICA, which maximizes the degree of independence over space and time, is a suitable method for analyzing such image sequences and extracting local features.

Room: Auditorium

## 5:10 PM - 6:30 PM

### Thu.2.4: Machine Learning - 4 papers

Chair: Simon Wilson (Trinity College Dublin, Ireland)
Grassmann clustering
Peter Gruber (University of Regensburg, Germany); Fabian Theis (University of Regensburg, Germany)
An important tool in high-dimensional, explorative data mining is given by clustering methods. They aim at identifying samples or regions of similar characteristics, and often code them by a single codebook vector or centroid. One of the most commonly used partitional clustering techniques is the k-means algorithm, which in its batch form partitions the data set into k disjoint clusters by simply iterating between cluster assignments and cluster updates. The latter step implies calculating a new centroid within each cluster. We generalize the concept of k-means by applying it not to the standard Euclidean space but to the manifold of vector subspaces of a fixed dimension, also known as the Grassmann manifold. Important examples include projective space, i.e. the manifold of lines, and the space of all hyperplanes. Detecting clusters in multiple samples drawn from a Grassmannian is a problem arising in various applications. In this manuscript, we provide corresponding metrics for a Grassmann k-means algorithm, and solve the centroid calculation problem explicitly in closed form. An application to nonnegative matrix factorization illustrates the feasibility of the proposed algorithm.
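A minimal sketch of the idea for the simplest Grassmannian, the manifold of lines (1-D subspaces): the squared chordal distance between unit vectors reduces to one minus a squared cosine, and the closed-form centroid of a cluster is the dominant eigenvector of its scatter matrix. The seeding strategy and the toy data are our own illustrative choices, not the paper's:

```python
import numpy as np

def grassmann_kmeans_lines(X, k, iters=20):
    """k-means on the manifold of lines through the origin.
    X: unit vectors, one per row, shape (n, d); a vector and its
    negation represent the same line, which the metric respects.
    """
    # farthest-point seeding (deterministic, our choice)
    C = np.empty((k, X.shape[1]))
    C[0] = X[0]
    for j in range(1, k):
        sim = np.max((X @ C[:j].T) ** 2, axis=1)
        C[j] = X[np.argmin(sim)]
    for _ in range(iters):
        sim = (X @ C.T) ** 2                 # squared cosines to centroids
        labels = sim.argmax(axis=1)          # min chordal dist = max |cos|
        for j in range(k):
            pts = X[labels == j]
            if len(pts):
                _, V = np.linalg.eigh(pts.T @ pts)
                C[j] = V[:, -1]              # dominant direction = centroid
    return labels, C

# demo: points scattered around two orthogonal lines, with random signs
rng = np.random.default_rng(2)
dirs = np.array([[1.0, 0.0], [0.0, 1.0]])
pts = []
for i in range(100):
    base = dirs[0] if i < 50 else dirs[1]
    v = base * (1.0 if rng.random() < 0.5 else -1.0) + 0.05 * rng.standard_normal(2)
    pts.append(v / np.linalg.norm(v))
X = np.array(pts)
labels, C = grassmann_kmeans_lines(X, 2)
```

Note how the sign ambiguity of a line (v and -v) is invisible to both the squared-cosine assignment and the eigenvector centroid, which is exactly why a Euclidean centroid would fail here.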
Quality assessment of a supervised multilabel classification rule with performance constraints
Edith Grall (Université de Technologie de Troyes, France); Pierre Beauseroy (Université de Technologie de Troyes, France); Bounsiar Abdenour (Université de Technologie de Troyes, France)
A multilabel classification rule with performance constraints for supervised problems is presented. It takes into account three concerns: the loss function, which defines the criterion to minimize; the decision options, which are defined by the admissible assignment classes or subsets of classes; and the performance constraints. The classification rule is determined using an estimation of the conditional probability density functions and by solving an optimization problem. A criterion for assessing the quality of the rule, taking into account the loss function and the satisfaction of the constraints, is proposed. An example is provided to illustrate the classification rule and the relevance of the criterion.
Improving Content-based Image Retrieval by Modelling the Search Process: a Bayesian Approach
Simon Wilson (Trinity College Dublin, Ireland)
In this paper we look at a simple image retrieval with relevance feedback scenario where we model simple properties of the search process. A content-based image retrieval method based on Bayesian inference is proposed that infers these search properties, as well as providing relevant images, from relevance feedback data. The approach is evaluated by performing searches for categories of image that invoke different emotional reactions.
Incorporating prior information into Support Vector Machines in the form of ellipsoidal knowledge sets
Jean-Baptiste Pothin (Université de Technologie de Troyes (UTT), France); Cedric Richard (UT Troyes, France)
This paper investigates a learning model in which the training set contains prior information in the form of ellipsoidal knowledge sets. We handle this problem in a minimax setting, which consists of maximizing the worst-case (minimum) margin between the knowledge sets of the two classes and the decision surface. The problem is solved using an alternating optimization scheme and an active learning strategy, i.e., the training set is created progressively according to the prior information. Our approach is evaluated on toy examples and on a standard benchmark database. It is successfully compared to state-of-the-art techniques.

### Thu.5.4: HW and SW architectures for multimedia streaming Systems (Special session) - 4 papers

Chair: Luca Fanucci (University of Pisa, Italy)
Chair: Fabrizio Rovati (STMicroelectronics, Italy)
Current and future trends in embedded VLIW microprocessors applied to multimedia and signal processing
Giuseppe Desoli (STMicroelectronics, Switzerland); Thierry Strudel (STMicroelectronics, France); Jean-Philippe Cousin (STMicroelectronics, France); Kaushik Saha (STMicroelectronics, India)
Although the mix of performance, power consumption, flexibility and cost offered by Very Long Instruction Word (VLIW) processors is a very good match for embedded systems in general, and for multimedia streaming systems in particular, these processors can be adversely exposed to increasing memory latencies, code size bloat and, to some extent, limited performance scalability with increasing issue width. This paper presents two extensions of such VLIW microprocessors that have a large potential impact when applied to the highly competitive market of multimedia consumer applications and, more recently, streaming: Symmetric Multi-Processor (SMP) cache coherency and multithreading. We present a quick summary of these developments, carried out by STMicroelectronics within the framework of the ST200 family of embedded microprocessors, and preliminary results obtained from their use in selected video and audio applications.
Hardware co-processor for real-time and high quality H.264/AVC video coding
Maurizio Martina (Politecnico di Torino, Italy); Guido Masera (Politecnico di Torino, Italy); Luca Fanucci (University of Pisa, Italy); Sergio Saponara (University of Pisa, Italy)
Real-time and high-quality video coding is gaining wide interest in the research community, mainly for entertainment and leisure applications. Furthermore, H.264/AVC, the most recent standard for high-performance video coding, can be successfully exploited in such a critical scenario. The need for high quality imposes sustaining up to tens of Mbits/s. To that purpose, optimized architectures for the most critical H.264/AVC tasks, Motion Estimation (ME) and Context Adaptive Binary Arithmetic Coding (CABAC), are proposed in this paper. Post-synthesis results on a 0.18 µm standard-cell technology show that the proposed architectures can process 720x480 video sequences at 30 Hz in real time and sustain more than 20 Mbits/s in the simplest configuration. Keywords: video coding, H.264/AVC, hardware architectures, motion estimation, entropy coder
Performance Optimization for Multimedia Transmission in Wireless Home Networks
Diego Melpignano (STMicroelectronics, Italy); Gabriella Convertino (STMicroelectronics, Italy); Andrea Vitali (STM, Italy); Juan Carlos De Martin (Politecnico di Torino, Italy); Paolo Bucciol (Politecnico di Torino, Italy); Antonio Servetti (Dipartimento di Automatica ed Informatica - Politecnico di Torino, Italy)
This paper describes a network adaptive real-time demonstrator for converged applications (audio, video, voice, and data) on an IEEE 802.11g Wireless Home Network. Video transmission quality is optimised by dynamically adapting the source video bit-rate to a real-time estimate of the available bandwidth on the wireless network and by introducing data redundancy to recover packet losses (Forward Error Correction). Video adaptation is done by DCT-domain video transcoding algorithms performed in real-time on a digital signal processor. Voice over Internet Protocol (VoIP) services are offered by managing the coexistence of 802.11g terminals and Bluetooth headsets. Audio time-scale modification and adaptive playout algorithms enable robust and high-quality interactive voice communications, minimizing the effect of the packet losses and jitter typical of wireless scenarios. All devices can share and remotely control content via Universal Plug and Play (UPnP).
Design of Application Specific Instruction-set Processor for Image and Video Filtering
Sergio Saponara (University of Pisa, Italy); Luca Fanucci (University of Pisa, Italy); Stefano Marsi (University of Trieste, Italy); Giovanni Ramponi (University of Trieste, Italy); Martin Witte (University of Aachen, Germany); David Kammler (University of Aachen, Germany)
Two architectures for cost-effective and real-time implementation of non-linear image and video filters are presented in this paper. The first architecture is a traditional VHDL-based ASIC (Application Specific Integrated Circuit) design, while the second is an ADL (Architecture Description Language) based ASIP (Application Specific Instruction-set Processor). A system to improve the visual quality of images, based on a Retinex-like algorithm, is used as a case study. First, starting from a high-level functional description, the design space is explored to achieve a linearized structural C model of the algorithm with finite arithmetic precision. For the algorithm design space exploration, visual and complexity criteria are adopted, while a statistical analysis of typical input images drives the algorithm optimization process. The algorithm is implemented both as an ASIC and as an ASIP in order to explore the trade-off between the flexibility of a software solution and the power and complexity optimization of a dedicated hardware design. The aim is to achieve the desired algorithmic functionality and timing specification at reasonable complexity and power costs. Taking advantage of the processor programmability, the flexibility of the system is increased, allowing e.g. dynamic parameter adjustment and color treatment. Gate-level implementation results in a 0.18 µm standard-cell CMOS technology are presented for both the ASIC and the ASIP approach.

### Thu.1.4: Genomic signal processing II (Invited special session) - 3 papers

Room: Auditorium
Chair: Alfred Hero (University of Michigan, USA)
Probabilistic Data Integration and Visualization for Understanding Transcriptional Regulation
Arvind Rao (University of Michigan, Ann Arbor, USA); Alfred Hero (University of Michigan, USA); David States (University of Michigan, Ann Arbor, USA); James Douglas Engel (University of Michigan, Ann Arbor, USA)
In this paper we propose a manifold embedding methodology to integrate heterogeneous sources of genomic data for the purpose of interpretation of transcriptional regulatory phenomena and subsequent visualization. Using the Gata3 gene as an example, we ask if it is possible to determine which genes (or their products) might be potentially involved in its tissue-specific regulation - based on evidence obtained from various available data sources. Our approach is based on co-embedding of genes onto a manifold wherein the proximity of neighbors is influenced by the probability of their interaction as reported from diverse data sources - i.e. the stronger the evidence for that gene-gene interaction, the closer they are.
Large-scale analysis of the human genome: from DNA sequence analysis to the modeling of replication in higher eukaryotes
Alain Arneodo (CNRS-Ecole Normale Supérieure de Lyon, France); Yves D'Aubenton-Carafa (CGM-CNRS, France); Benjamin Audit (ENS-Lyon, France); Edward Brodie of Brodie (ENS-Lyon, France); Samuel Nicolay (ENS-Lyon, France); Philippe St-Jean (ENS-Lyon, France); Claude Thermes (CGM-CNRS, France); Marie Touchon (CGM-CNRS, France); Cedric Vaillant (Genopole, Evry, France)
We explore large-scale nucleotide compositional fluctuations along the human genome through the optics of the wavelet transform microscope. Analysis of the TA and GC skews reveals the existence of strand asymmetries associated with transcription and/or replication. The investigation of 14854 intron-containing genes shows that both skews display a characteristic step-like profile exhibiting sharp transitions between transcribed (finite bias) and non-transcribed (zero bias) regions. As we observe for 7 out of the 9 origins of replication experimentally identified so far, the (AT+GC) skew exhibits rather sharp upward jumps, with a linearly decreasing profile between two successive jumps. We describe a multi-scale methodology that allows us to predict 1012 replication origins in the 22 human autosomal chromosomes. We present a model of replication with well-positioned replication origins and random termination sites that accounts for the observed characteristic serrated skew profiles. We emphasize these putative replication initiation zones as regions where the chromatin fiber is likely to be more open, so that the DNA is easily accessible. In the crowded environment of the cell nucleus, these intrinsic decondensed structural defects actually predispose the fiber to spontaneously form rosette-like structures, which provide an attractive description of genome organization into the replication foci observed in interphase mammalian nuclei.
An Equivalent Markov Model for Gillespie's Stochastic Simulation Algorithm for biochemical systems
Ronit Bustin (Tel-Aviv University, Israel); Hagit Messer (Tel-Aviv University, Israel)
Mathematical/statistical modeling of biological systems has been a desired goal for many years. It aims at accurately predicting the operation of such systems under various scenarios using computer simulations. In this paper we revisit Gillespie's Stochastic Simulation Algorithm for biochemical systems and suggest an equivalent Markov model for it. We show that under certain conditions it is a first-order homogeneous Markov process, and we analyze these conditions. Our suggested model can be used to simulate the probability density function of a biochemical process which, in turn, can be used for applying statistical signal processing and information theory tools to it.
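Gillespie's SSA itself is standard: draw an exponential waiting time from the total propensity, then pick the next reaction in proportion to its propensity. A minimal sketch for a birth-death process (the reaction rates and test system are illustrative, not from the paper):

```python
import numpy as np

def gillespie_birth_death(k_birth, k_death, x0, t_end, seed=0):
    """Gillespie SSA for the two-reaction system
    0 -> X (propensity k_birth) and X -> 0 (propensity k_death * X)."""
    rng = np.random.default_rng(seed)
    t, x = 0.0, x0
    times, states = [t], [x]
    while t < t_end:
        a1, a2 = k_birth, k_death * x        # reaction propensities
        a0 = a1 + a2                         # total propensity
        if a0 == 0.0:
            break
        t += rng.exponential(1.0 / a0)       # time to next reaction
        x += 1 if rng.random() < a1 / a0 else -1
        times.append(t)
        states.append(x)
    return np.array(times), np.array(states)

# demo: stationary distribution is Poisson with mean k_birth / k_death
t, x_path = gillespie_birth_death(10.0, 1.0, 0, 50.0)
```

Each simulated trajectory is one realization of the continuous-time Markov jump process; the paper's contribution is the equivalent Markov-model view of exactly this sampling procedure.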

### Thu.6.4: Biomedical imaging I - 4 papers

Room: Room 4
Chair: Gloria Menegaz (University of Siena, Italy)
Modeling ultrasound images with the generalized K model
Torbjørn Eltoft (University of Tromsø, Norway)
In this paper we interpret the statistics of ultrasonic backscatter in the framework of a normal variance-mean mixture model. This is done by considering the complex envelope of the echo signal as a doubly stochastic circular Gaussian variable, in which both the variance and the mean are linearly scaled by a stochastic factor Z. By assuming Z to be Gamma distributed, we re-derive the generalized K distribution and present a new iterative algorithm for estimating its parameters. We also derive a maximum a posteriori (MAP) filter based on the generalized K model. The appropriateness of the generalized K model in representing the local amplitude statistics of medical ultrasound images, and the filtering performance of the new MAP filter, are tested in some preliminary experiments.
Motor Imagery Based Brain Computer Interface with Subject Adapted Time-Frequency Tiling
Nuri Ince (University of Minnesota, USA); Ahmed Tewfik (University of Minnesota, USA); Sami Arica (Cukurova University, Turkey)
We introduce a new technique for the classification of motor imagery electroencephalogram (EEG) recordings in a Brain Computer Interface (BCI) task. The technique is based on an adaptive time-frequency analysis of EEG signals computed using Local Discriminant Bases (LDB) derived from Local Cosine Packets (LCP). Unlike prior work on adaptive time-frequency analysis of EEG signals, this paper uses arbitrary non-dyadic time segments and adaptively selects the size of the frequency bands used for feature extractions. In an offline step, the EEG data obtained from the C3/C4 electrode locations of the standard 10/20 system is adaptively segmented in time, over a non-dyadic grid. This is followed by a frequency domain clustering procedure in each adapted time segment to maximize the discrimination power of the resulting time-frequency features. Then, the most discriminant features from the resulting arbitrarily segmented time-frequency plane are sorted. A Principal Component Analysis (PCA) step is applied to reduce the dimensionality of the feature space. The online step simply computes the reduced dimensionality features determined by the offline step and feeds them to the linear discriminant. The algorithm was applied to all nine subjects of the BCI Competition 2002. The classification performance of the proposed algorithm varied between 70% and 92.6% across subjects using just two electrodes. The average classification accuracy was 80.6%. For comparison, we also implemented an Adaptive Autoregressive model based classification procedure that achieved an average classification accuracy of 76.3% on the same subjects, and higher error rates than the proposed approach on each individual subject.
A robust thresholding method with applications to brain MR image segmentation
Ahmet Ekin (Philips Research Europe, The Netherlands); Radu Jasinschi (Philips Research, The Netherlands)
This paper brings forward two main novel aspects: 1) a generic thresholding method that is robust to degradations in image contrast and, hence, in quality; and 2) a new knowledge-based segmentation framework for brain MR images that first utilizes a clustering algorithm, and then the proposed thresholding method. The new thresholding method accurately computes a threshold value even for images with very low visual quality having very close class means. It also consistently outperforms known thresholding methods. The segmentation algorithm, on the other hand, delivers almost constant segmentation performance over a wide range of scan parameter values. It first utilizes a clustering algorithm to identify the cerebrospinal fluid region and then focuses on white matter/gray matter separation by using the novel thresholding method. We show the robustness of the proposed algorithms with a simulated dataset obtained with various parameter values and a real dataset of brain MR dual-echo sequences of patients with possible iron accumulation.
A new implementation of the ICP algorithm for 3D surface registration using a comprehensive look up matrix
Ahmad Almhdie (University of Orléans, France); Christophe Léger (Université d'Orléans, France); Mohamed Deriche (King Fahd University of Petroleum & Minerals, Saudi Arabia); Roger Lédée (University of Orléans, France)
The iterative closest point (ICP) algorithm is one of the most efficient algorithms for robust rigid registration of 3D data. It consists of finding the closest points between two sets of data, which are then used to estimate the parameters of the global rigid transformation that registers the two data sets. All the steps of the algorithm are highly dependent upon the accuracy with which corresponding pairs of points are found. In this paper, a new enhanced implementation of the ICP algorithm uses a look-up matrix to find the best corresponding pairs. It reduces the minimum mean square error between the two data sets after registration, compared to existing implementations. The algorithm was implemented and tested to evaluate its convergence properties and robustness to noise. Performance improvements are obtained. The new algorithm has successfully been applied to register 3D medical data.
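The correspondence-then-transform loop described above can be sketched as follows. This is a generic ICP using a full pairwise-distance matrix and an SVD rigid fit, not the paper's enhanced look-up-matrix implementation; the synthetic grid data at the end is purely illustrative.

```python
import numpy as np

def closest_pairs(src, dst):
    # matrix of all pairwise squared distances; nearest dst point per src point
    d2 = ((src[:, None, :] - dst[None, :, :]) ** 2).sum(-1)
    return d2.argmin(axis=1)

def rigid_fit(src, dst):
    # least-squares rigid transform (Kabsch/SVD) mapping src onto dst
    cs, cd = src.mean(0), dst.mean(0)
    U, _, Vt = np.linalg.svd((src - cs).T @ (dst - cd))
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:       # guard against a reflection
        Vt[-1] *= -1
        R = Vt.T @ U.T
    return R, cd - R @ cs

def icp(src, dst, iters=10):
    cur = src.copy()
    for _ in range(iters):
        idx = closest_pairs(cur, dst)
        R, t = rigid_fit(cur, dst[idx])
        cur = cur @ R.T + t
    return cur

# illustrative check: recover a small known rigid motion of a point grid
theta = np.deg2rad(5.0)
R_true = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                   [np.sin(theta),  np.cos(theta), 0.0],
                   [0.0,            0.0,           1.0]])
g = np.linspace(-1.0, 1.0, 3)
src = np.stack(np.meshgrid(g, g, g), -1).reshape(-1, 3)
dst = src @ R_true.T + np.array([0.05, -0.05, 0.10])
registered = icp(src, dst)
```

With well-separated points and a small motion, the nearest-neighbour pairing is already correct at the first iteration, so the SVD fit recovers the transform essentially exactly.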

### Thu.4.4: Speech Enhancement II - 4 papers

Room: Sala Onice
Chair: Saeed Vaseghi (Brunel University, United Kingdom)
Continuous tracheoesophageal speech repair
Arantza del Pozo (University of Cambridge, United Kingdom); Steve Young (University of Cambridge, United Kingdom)
This paper describes an investigation into the repair of continuous tracheoesophageal (TE) speech. Our repair system resynthesises TE speech using a synthetic glottal waveform, reduces its jitter and shimmer and applies a novel spectral smoothing and tilt correction algorithm, derived from a comparative study of normal and TE spectral envelopes. The perceptual enhancement achieved by each correction and the performance of the whole system are evaluated on a corpus of thirteen TE speakers using listening tests. Results show that our repair algorithms reduce the perceived breathiness and harshness and outperform previous enhancement attempts overall.
A Numerical Approach for Estimating Optimal Gain Functions in Single-Channel DFT based Speech Enhancement
Jesper Jensen (Delft University of Technology, The Netherlands); Richard Heusdens (Delft University of Technology, The Netherlands)
We treat the problem of finding minimum mean-square error (MMSE) spectral amplitude estimators for discrete Fourier transform (DFT) based single-channel noise suppression algorithms. Existing schemes derive gain functions analytically based on distributional assumptions with respect to the speech (and noise) DFT coefficients and on mathematically tractable distortion measures. In this paper we propose a methodology to estimate the MMSE gain functions directly from speech signals, without assuming that speech DFT coefficients follow a certain parametrized probability density function. Furthermore, the proposed scheme allows for estimation of MMSE gain functions for pdf/distortion measure combinations for which no analytical solutions are known. Simulation experiments where noisy speech is enhanced using the estimated gain functions show promising results. Specifically, the estimated gain functions perform better than standard schemes, as measured by a range of objective speech quality criteria.
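A minimal sketch of the data-driven idea, not the paper's scheme: tabulate a gain as a conditional mean of clean over noisy amplitude bins. Synthetic Gaussian DFT coefficients stand in for speech here, and all names are illustrative.

```python
import numpy as np

def estimate_mmse_gain(clean_amp, noisy_amp, n_bins=40):
    """Tabulate g(r) ~= E[A | R in bin around r] / r by binning noisy amplitudes R."""
    edges = np.linspace(0.0, noisy_amp.max() + 1e-12, n_bins + 1)
    idx = np.clip(np.digitize(noisy_amp, edges) - 1, 0, n_bins - 1)
    gain = np.ones(n_bins)
    for b in range(n_bins):
        mask = idx == b
        if mask.any():
            centre = 0.5 * (edges[b] + edges[b + 1])
            gain[b] = clean_amp[mask].mean() / max(centre, 1e-12)
    return edges, gain

# illustrative data: unit-power complex Gaussian "speech" in noise of variance 0.1
rng = np.random.default_rng(2)
n = 200_000
s = (rng.standard_normal(n) + 1j * rng.standard_normal(n)) / np.sqrt(2.0)
w = np.sqrt(0.1) * (rng.standard_normal(n) + 1j * rng.standard_normal(n)) / np.sqrt(2.0)
edges, gain = estimate_mmse_gain(np.abs(s), np.abs(s + w))
```

At small noisy amplitudes the tabulated gain exceeds 1 (the conditional mean dominates r), while at amplitudes well above the noise floor it settles near the Wiener value 1/1.1 for this Gaussian case.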
Kalman Filter With Linear Predictor And Harmonic Noise Models For Noisy Speech Enhancement
Qin Yan (Brunel University, United Kingdom); Saeed Vaseghi (Brunel University, United Kingdom); Esfandiar Zavarehei (Brunel University, United Kingdom); Ben Milner (University of East Anglia, United Kingdom)
This paper presents a method for noisy speech enhancement based on integration of a formant-tracking linear prediction (FTLP) model of spectral envelope and a harmonic noise model (HNM) of the excitation of speech. The time-varying trajectories of the parameters of the LP and HNM models are tracked with Viterbi classifiers and smoothed with Kalman filters. A frequency domain pitch estimation is proposed, that searches for the peak SNRs at the harmonics. The LP-HNM model is used to deconstruct noisy speech, de-noise its LP and HNM models and then reconstitute cleaned speech. Experimental evaluations show the performance gains resulting from the formant tracking, harmonic extraction and noise reduction stages.
Two-stage blind identification of SIMO systems with common zeros
Xiang Lin (Imperial College London, United Kingdom); Nikolay Gaubitch (Imperial College London, United Kingdom); Patrick Naylor (Imperial College London, United Kingdom)
Blind identification of SIMO systems is dependent on identifiability conditions which include the requirement that there are no common zeros between the multiple channels. We demonstrate that common zeros are likely to exist for long channels, such as acoustic impulse responses, and we illustrate the performance degradation due to common zeros in blind system identification. Subsequently, we propose a scheme to separate common zeros from characteristic zeros and a method to identify the common zeros. Simulation results confirm the efficiency of our method.
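For short channels, the presence of common zeros can be checked directly by matching polynomial roots. This is a naive numerical baseline for illustration, not the paper's separation scheme; the channel coefficients below are made up.

```python
import numpy as np

def common_zeros(h1, h2, tol=1e-6):
    """Return the roots shared (within tol) by two FIR channel polynomials."""
    r1, r2 = np.roots(h1), np.roots(h2)
    return np.array([z for z in r1 if np.min(np.abs(r2 - z)) < tol])

# two illustrative channels sharing the factor (1 - 0.5 z^-1), i.e. a zero at 0.5
shared = np.array([1.0, -0.5])
h1 = np.convolve(shared, [1.0, 0.3])    # extra zero at -0.3
h2 = np.convolve(shared, [1.0, -0.8])   # extra zero at 0.8
cz = common_zeros(h1, h2)
```

Root matching degrades for long, noisy impulse responses (root finding becomes ill-conditioned), which is part of why common zeros in acoustic channels are problematic in the first place.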

### Thu.3.4: Source coding - 4 papers

Room: Sala Verde
Chair: Vincenzo Lottici (University of Pisa, Italy)
A Practical Algorithm for Distributed Source Coding Based on Continuous-Valued Syndromes
Lorenzo Cappellari (University of Padova, Italy); Gian Antonio Mian (University of Padova, Italy)
Recent advances in video coding are based on the principles of distributed source coding to both relax the inherent complexity of classical encoding algorithms and offer robustness against transmission errors. Several practical frameworks for distributed source coding take advantage of channel coding principles to encode a suitably pre-quantized version of the source. In this paper, we present an alternative two-step approach for the problem of coding a source which is correlated with another one that is available only at the decoder. First, based on the correlation of the side information, a continuous-valued syndrome is computed. Then, according to the given rate constraint, the syndrome is encoded and transmitted as in classical source coding. The great flexibility of the coding system is confirmed by the simulation results: for source coding of i.i.d. Gaussian sources, the performance is always within 4 dB of the Wyner-Ziv bound, for a wide range of side information correlations and target bit-rates.
Lifting-Based Paraunitary Filterbanks for Lossy/Lossless Image Coding
Taizo Suzuki (Keio University, Japan); Yuichi Tanaka (Keio University, Japan); Masaaki Ikehara (Keio University, Japan)
This paper introduces an image transform method using M-channel paraunitary filterbanks (PUFBs) based on the Householder matrix. First, redundant parameters of the PUFB are eliminated by using the fact that it can be factorized into Givens rotation matrices. Next, we propose a method for eliminating redundant parameters based on the Householder matrix, using the relationship between Givens rotation and Householder matrices. In addition, the PUFBs are factorized into a lifting structure for lossless image coding, and we call them lifting-based PUFBs (LBPUFBs). LBPUFBs based on the Householder matrix have fewer rounding operators than the Givens rotation version, which makes the proposed structure efficient for lossless image coding. Finally, we show some examples to validate our method in lossy/lossless image coding.
Image Coding with Rational Spatial Scalability
Gregoire Pau (GET - Telecom Paris, France); Beatrice Pesquet-Popescu (Ecole Nationale Superieure des Telecommunications, France)
For digital TV and mobile operators, there is an increasing demand for a wider scalability range when delivering content to heterogeneous client devices with different screen sizes. Subband image coding with the classical wavelet transform provides only dyadic scalability factors, but we have shown that M-band transforms can offer rational scalability factors with an appropriate modification of the synthesis filters. In this paper, we evaluate the image coding efficiency offered by some M-band transforms combined with the rational scalability feature. We show that they offer increased compression performance compared to the popular dyadic 9/7 wavelet transform.
Joint Data Hiding-Source Coding of Still Images
Cagatay Dikici (INSA de Lyon, France); Khalid Idrissi (INSA de Lyon, France); Atilla Baskurt (INSA de Lyon, France)
There exists a strong duality between Source Coding with side information (known as Distributed Source Coding) and Channel Coding with Side Information (informed data-hiding). Inspired by the system in [12], which is a combination of Data Hiding and Distributed Source Coding scheme, we have extended this system to 2D natural images. Hence a cascade of informed data hiding for a still image is followed by a distributed source coding with a fidelity criterion, given that a noisy version of the input image is available only at the decoder. We used baseline JPEG compression with different quality values for creation of the Side Information, a memoryless quantization for informed data hiding, and LDPC based coset construction for source coding. The preliminary experimental results are given using the theoretical findings of the duality problem.

## 8:40 AM - 11:00 AM

### Fri.2.1: Biomedical imaging II - 7 papers

Chair: Jean-Philippe Thiran (Swiss Federal Institute of Technology (EPFL), Switzerland)
Locally Regularized Snakes through Smoothing B-Spline Filtering
Jérôme Velut (CREATIS, CNRS UMR 5515, Inserm U 630, France); Hugues Benoit-Cattin (CREATIS, CNRS UMR 5515, Inserm U 630, France); Christophe Odet (CREATIS, CNRS UMR 5515, Inserm U 630, France)
In this paper we propose a locally regularized snake based on smoothing-spline filtering. The regularization level is controlled through a single parameter that can vary along the contour, offering a powerful framework for introducing prior knowledge. Associated with this local regularization, a force equilibrium scheme drives the snake's deformation. Examples on synthetic and MRI images compare global and local regularization.
Normalization of Transcranial Magnetic Stimulation Points by Means of Atlas Registration
Dominique Zosso (Ecole Polytechnique Fédérale de Lausanne (EPFL), Switzerland); Quentin Noirhomme (Communications and Remote Sensing Laboratory TELE, Université catholique de Louvain, Belgium); Marco Davare (University of Louvain, Belgium); Benoit Macq (Universite catholique de Louvain, Belgium); Etienne Olivier (Neurophysiology Unit NEFY, Université catholique de Louvain, Belgium); Jean-Philippe Thiran (Swiss Federal Institute of Technology (EPFL), Switzerland); Mathieu De Craene (Université catholique de Louvain, Belgium)
Transcranial magnetic stimulation (TMS) is a well-known technique to study brain function. The location of TMS points can be visualized on the subject's Magnetic Resonance Image (MRI). However, inter-subject comparison is possible only after a normalization, i.e. a transformation of the stimulation points to a reference atlas image. Here, we propose a generic and automatic image processing pipeline for normalizing a collection of subjects' MRIs and TMS points. The normalization uses a loop of rigid transforms followed by B-spline registration. The reference atlas used is the common Montreal Neurological Institute (MNI) brain atlas. We show preliminary results from 10 subjects. The normalized points were compared to the SPM normalization and validated by TMS experts.
Segmentation of Echocardiographic Images Based On Statistical Modelling Of The Radio-Frequency Signal
Olivier Bernard (Creatis, France); Jan D'hooge (Cardiac Imageing Research, Belgium); Denis Friboulet (CREATIS, France)
This work presents an algorithm for segmentation of ultrasound images based on the statistics of the radio-frequency (RF) signal. We first show that the Generalized Gaussian distribution can reliably model both fully (blood pool) and partially (tissue area) developed speckle in echocardiographic RF images. We then show that this probability density function may be used in a maximum likelihood framework for tissue segmentation. Results are presented on both simulations and ultrasound cardiac images of clinical interest.
Feature-based Brain Mid-Sagittal Plane Detection by RANSAC
Ahmet Ekin (Philips Research Europe, The Netherlands)
The mid-sagittal plane passes through the border between the two hemispheres of the brain, which are roughly symmetric. Image-based detection of the mid-sagittal plane has applications to a number of computer and human tasks, such as image registration and diagnosis. The problem requires methods that are robust to inherent asymmetries between the two hemispheres, to pathological abnormalities that further degrade the hemispheric symmetry, and to degradations in image quality. Furthermore, it is desirable to have a computationally feasible method because mid-sagittal plane detection is often a pre-processing step that is followed by more compute-intensive algorithms. In this paper, we introduce a novel feature-based mid-sagittal plane detection algorithm for MR brain images. The proposed method is robust even in the presence of very large abnormalities, can cope with outliers in the detected features, and is very fast. Its robustness to abnormalities stems from its hierarchical operation: the 3-D MR data are processed first as 1-D image lines, then as 2-D slices, and finally as a 3-D volume. This makes it possible to detect the mid-sagittal plane as long as two image lines are not affected by pathological abnormality, which is a significant improvement over the literature. Furthermore, the use of the outlier-robust RANSAC algorithm for fitting a mid-sagittal line to the detected feature points in each slice provides robustness to inaccuracies in the detected feature points.
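The RANSAC line-fitting step can be sketched generically as follows. This is a textbook 2-D RANSAC on toy data standing in for per-slice feature points, not the paper's full pipeline, and all names and thresholds are illustrative.

```python
import numpy as np

def ransac_line(points, n_iter=200, tol=1.0, rng=None):
    """RANSAC fit of a 2-D line n.p = c (unit normal n) to points with outliers."""
    rng = rng or np.random.default_rng(0)
    best_model, best_inliers = None, None
    for _ in range(n_iter):
        p, q = points[rng.choice(len(points), size=2, replace=False)]
        d = q - p
        if np.linalg.norm(d) < 1e-12:
            continue
        n = np.array([-d[1], d[0]]) / np.linalg.norm(d)   # normal to the sampled pair
        c = n @ p
        inliers = np.abs(points @ n - c) < tol            # distance-to-line test
        if best_inliers is None or inliers.sum() > best_inliers.sum():
            best_model, best_inliers = (n, c), inliers
    return best_model, best_inliers

# toy data: 20 noisy points near a vertical line x = 5, plus 5 far-away outliers
rng = np.random.default_rng(3)
line_pts = np.stack([np.full(20, 5.0) + 0.1 * rng.standard_normal(20),
                     np.arange(20.0)], axis=1)
outliers = rng.uniform(10.0, 30.0, size=(5, 2))
model, inliers = ransac_line(np.vstack([line_pts, outliers]), rng=rng)
```

The consensus step is what gives the robustness the abstract refers to: a minimal two-point sample defines each candidate line, so outliers can only win if they outvote the structured points.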
Segmentation of remodelling regions in bone micro-CT images: influence of denoising
Zsolt Peter (CREATIS - ESRF, France); Sabine Rolland (ESRF, France); Francoise Peyrin (CREATIS - ESRF, France)
Synchrotron radiation 3D micro-CT, providing quantitative images of bone samples, opens new perspectives for assessing bone metabolism. The purpose of this paper is to evaluate the possibility of segmenting remodeling regions from such images acquired at a spatial resolution of 10 µm. Segmenting remodeling regions within the bone phase is not an easy task. Although remodeling regions can be visually distinguished, their contrast with respect to bone may be very weak and close to the standard deviation of the noise. We propose a segmentation scheme based on a customized region growing associated with a denoising process. We consider denoising methods based on wavelets and anisotropic diffusion. Our results show that a corrupted image can be well restored, essentially without compromising image resolution. Thus the segmentation of the correct remodeling zones in the bone is more realistic on the denoised images.
Compensation of MRI T1-weighted spin-echo images for radio-frequency inhomogeneities
Guylaine Collewet (Cemagref, France); Jérôme Idier (IRCCyN, France)
We propose a correction method for magnetic resonance (MR) images to eliminate the effects of the inhomogeneity of the radio-frequency (RF) pulses and of the sensitivity of the RF reception, particularly in the case of T1-weighted images. In this case the effects of the pulse inhomogeneities vary with the tissues, which prevents the use of simpler correction techniques based on a global multiplicative bias field model. Here, the MR signal is modeled as a sum of contributions from all the tissues present in the object. For the sake of generality, each pixel is assumed to contain an unknown proportion of each tissue, so that the usually adopted segmentation-based approach is not valid in the present context. The number of tissues composing the object as well as the MR characteristics of each tissue are required. Several images with different acquisition parameter values are also needed. A penalized least-squares criterion is proposed to estimate the RF emitted field, the RF reception sensitivity and the proportion of each tissue. The criterion contains smoothness regularization for both RF fields. We solve the optimization problem using a conjugate gradient algorithm within a Gauss-Seidel iterative scheme. Results based on real MR images of fish demonstrate the effectiveness of the method.
3D Detection And Measurement Of Facial Swellings
Anna Paviotti (University of Padova, Italy); Nicola Brusco (University of Padova, Italy); Guido Cortelazzo (University of Padova, Italy)
This work presents a procedure for the 3D detection and measurement of facial swellings, an issue of paramount importance in oral and maxillofacial surgery. The problems considered are the automatic acquisition of face masks of adequate precision, the registration of the 3D masks of the same patient taken at different times during the edema evolution, and the detection and estimation of the edema volumes.

### Fri.5.1: Time-Frequency Analysis - 7 papers

Chair: Patrick Flandrin (CNRS-ENS, France)
Multitaper time-frequency reassignment
Jun Xiao (ENS Lyon, France); Patrick Flandrin (CNRS-ENS, France)
A method is proposed for estimating chirp signals embedded in nonstationary noise, with the twofold objective of a sharp localization for the chirp components and a reduced level of statistical fluctuations for the noise. The technique consists in combining time-frequency reassignment with multitapering. The principle of the method is outlined, its implementation based on Hermite functions is justified and discussed, and some examples are provided for supporting the efficiency of the approach, both qualitatively and quantitatively.
Covariance-Based Frequency Reassignment for Detection of Non-Stationary Sinusoids
Miroslav Zivanovic (Universidad Publica de Navarra, Spain); Javier Frances (Universidad Publica de Navarra, Spain)
A new approach to frequency reassignment for sinusoidal energy localization in the time-frequency plane is presented. While the classical reassignment method operates statically on each point in the plane, a more global approach focuses on the dynamics of the frequency reassignment operator along the frequency grid. It is shown that the first mixed moment of the frequency reassignment operator and linear frequency describes well the Discrete-Time Fourier Transform (DTFT) spectral peaks in terms of sinusoids and noise. The new method avoids the spurious artifacts that the classical reassignment method produces in noisy regions by randomly clustering noise. The proposed method is shown to have very good behavior at low SNR and is computationally efficient.
Motion Estimation by the Pseudo-Wigner Ville Distribution and the Hough Transform
Noemí Carranza (Instituto de Óptica (CSIC), Spain); Filip Sroubek (Instituto de Optica, CSIC, Spain); Gabriel Cristobal (Instituto de Optica (CSIC), Spain)
A fundamental problem in processing sequences of images is the computation of the optical flow, an approximation to the real image motion. This paper presents a new algorithm for optical flow estimation based on a spatiotemporal-frequency approach, specifically on the computation of the Wigner-Ville distribution and the Hough Transform. Experimental results are compared with an implementation of the variational technique for local and global motion estimation, and it is shown that the results are accurate and robust to noise degradations.
Kernel-based estimators for the Kirkwood-Rihaczek time-frequency spectrum
Heidi Hindberg (University of Tromsø, Norway); Yngve Birkelund (Tromsø University College, Norway); Tor Oigard (University of Tromsø, Norway); Alfred Hanssen (University of Tromsø, Norway)
In this paper, we examine kernel-based estimators for the Kirkwood-Rihaczek time-frequency spectrum of harmonizable, nonstationary processes. Based on an inner product consideration, we propose and implement an estimator for the Kirkwood-Rihaczek spectrum. The estimator is constructed from a combination of the complex demodulate with a short-time Fourier transform. Our proposed estimator is less computationally intensive than a theoretically equivalent, known estimator. We compare and test the results of the proposed estimator with an existing estimator from a Matlab time-frequency toolbox on simulated and real-world data. We demonstrate that the proposed estimator is less sensitive to unwanted cross-terms, and less affected by edge effects than the Matlab time-frequency toolbox estimator.
Detection of anomalous events in biomedical signals by Wigner analysis and instant-wise Rényi entropy
Salvador Gabarda (Instituto de Optica (CSIC), Spain); Gabriel Cristobal (Instituto de Optica (CSIC), Spain); Juan Martinez-Alajarin (Universidad Politecnica de Cartagena, Spain); Ramon Ruiz (Universidad Politecnica de Cartagena, Spain)
The Rényi entropy is receiving important attention as a data analysis tool in many practical applications, due to its relevant properties when dealing with time-frequency representations (TFRs). It is characterized by providing a generalized information content (entropy) of a given signal. In this paper we present results of applying the Rényi entropy to a 1-D pseudo-Wigner distribution (PWD) of biomedical signals. A processed, filtered signal is obtained after the application of a Rényi entropy measure to the instant-wise PWD of the given biomedical signal. The Rényi entropy allows one to identify individually, from an entropic criterion, which instants carry a higher amount of information along the temporal data. This provides specific localized information about normal and pathological events in biomedical signals; hence early diagnosis of diseases is facilitated. The method is illustrated with examples of application to phonocardiograms.
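The instant-wise entropy computation can be sketched by treating each normalized time slice of a TFR as a probability distribution. This is a generic order-alpha Rényi entropy, not the paper's exact PWD pipeline, and the two-column example at the end is synthetic.

```python
import numpy as np

def instantwise_renyi(tfr, alpha=3.0):
    """Order-alpha Rényi entropy (bits) of each time slice of a (freq x time) TFR."""
    p = np.abs(tfr).astype(float)
    p = p / p.sum(axis=0, keepdims=True)            # each instant becomes a distribution
    return np.log2((p ** alpha).sum(axis=0)) / (1.0 - alpha)

# sanity columns: an impulse-like instant (entropy 0) vs. a flat one (entropy log2 K)
tfr = np.zeros((8, 2))
tfr[3, 0] = 1.0     # all energy in one frequency bin
tfr[:, 1] = 1.0     # energy spread evenly over 8 bins
H = instantwise_renyi(tfr)
```

Low entropy flags concentrated, information-bearing instants, which is the criterion the abstract uses to localize anomalous events.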
Nonstationary Signals Analysis by Teager-Huang Transform (THT)
Jean Christophe Cexus (IRENav Ecole Navale, France); Abdel-Ouahab Boudraa (IRENav Ecole Navale, France)
In this paper a new method called the Teager-Huang Transform (THT) for the analysis of nonstationary signals is introduced. The THT estimates the instantaneous frequency (IF) and the instantaneous amplitude (IA) of a signal. The method is based on the Empirical Mode Decomposition (EMD) algorithm of Huang et al. and the Teager energy operator (TEO). Both EMD and TEO deal with non-stationary signals. The signal is first band-pass filtered using the EMD into zero-mean AM-FM components called Intrinsic Mode Functions (IMFs) with well-defined IF. Then the TEO tracks the modulation energy of each IMF and estimates the corresponding IF and IA. The final presentation of the IF and IA results is an energy time-frequency representation (TFR) of the signal. Based on the EMD, the THT is flexible enough to analyze any data, linear or nonlinear and stationary or nonstationary. Furthermore, the THT is free of interferences. To show the effectiveness of the THT, TFRs of five synthetic signals are presented and the results compared to those of the Hilbert-Huang transform (HHT), the spectrogram, the smoothed pseudo Wigner-Ville distribution (SPWVD), the scalogram and the Choi-Williams distribution (CWD).
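The TEO at the heart of the method is a three-sample operator. A minimal sketch, together with the exact pure-tone identity that DESA-style schemes exploit to read off IA and IF (the helper name is illustrative):

```python
import numpy as np

def teager(x):
    """Discrete Teager energy operator: Psi[x](n) = x(n)^2 - x(n-1) * x(n+1)."""
    x = np.asarray(x, dtype=float)
    return x[1:-1] ** 2 - x[:-2] * x[2:]

# for a pure tone A*cos(Omega*n + phi), Psi is exactly A^2 * sin(Omega)^2 at
# every sample, so amplitude/frequency estimates follow from Psi of the signal
# and of its first difference
n = np.arange(256)
psi = teager(2.0 * np.cos(0.3 * n + 0.7))
```

For an AM-FM component (an IMF), Psi tracks the instantaneous modulation energy A(n)^2 sin^2(Omega(n)) rather than a constant, which is what the THT uses per IMF.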
A Wigner-Ville, time-frequency approach to TNT detection in nuclear quadrupole resonance experiments
Sven Aerts (Free University of Brussels, Belgium); Dirk Aerts (Free University Brussels, Belgium); Franklin Schroeck (Denver University, USA); Jürgen Sachs (TU Ilmenau, Germany)
Minimizing the risks involved in humanitarian demining requires a sensitivity setting close to unity, resulting in a very high false alarm rate. Nuclear quadrupole resonance detection, based on the spin echoes from nuclear spin relaxation, is a promising example of a highly specific detector that directly addresses the properties of the explosive rather than the mine casing. However, the data acquisition time necessary to obtain a sufficiently high sensitivity is long, due to the extremely poor SNR of the spin echoes. Besides improvements in the hardware, it is important to pursue better signal analytic techniques. We present a time-frequency approach based on the Wigner-Ville quasi-distribution for the analysis of nuclear quadrupole resonance. We calculate ROC curves for real data obtained under laboratory conditions and show that the technique presents a substantial improvement over the popular demodulation technique.

### Fri.1.1: NEWCOM (Special session) - 7 papers (one double-length)

Room: Auditorium
Chair: Erdal Panayirci (Bilkent University, Turkey)
Signal Processing Issues in Wireless Sensor Networks
H. Vincent Poor (Princeton University, USA)
This paper will review a number of recent results in the general area of signal processing for wireless sensor networks (WSNs). Among these will be results relating to the following: distributed and collaborative algorithms for inferential problems arising in applications of WSNs, sensor scheduling and energy tradeoffs in detection networks, and related issues. NOTE: This paper has been invited for the Newcom Special Session on Advanced Signal Processing Techniques for Wireless Communications.
Code-aided Frequency Ambiguity Resolution and Channel Estimation for MIMO OFDM systems
Frederik Simoens (Ghent University, Belgium); Marc Moeneclaey (Ghent University, Belgium)
This contribution deals with channel estimation and frequency ambiguity resolution in a MIMO OFDM context. Existing blind frequency-recovery algorithms for OFDM are able to provide a reliable estimate of the frequency offset up to an integer multiple of the subcarrier spacing. To resolve the remaining ambiguity, one can employ either pilot symbols or the unknown coded data symbols. Clearly, the latter method results in a higher bandwidth efficiency. Similar considerations hold for the estimation of a frequency-selective MIMO channel. In this contribution, we propose a code-aided technique to jointly estimate the channel and resolve the frequency ambiguity. The estimator is based on the expectation-maximization (EM) algorithm and exploits information from the unknown coded data symbols and only a small number of pilot symbols. A significant performance gain is observed compared to existing, solely pilot-based estimation techniques.
Space-Time Block Coding for Noncoherently Detected CPFSK
Fabrizio Pancaldi (University of Modena - Dept. of Information Eng., Italy); Giorgio M. Vitetta (University of Modena and Reggio Emilia, Italy)
In this paper the problem of unitary-rate space-time block coding for multiple-input multiple-output communication systems employing continuous phase frequency shift keying is investigated. First, the problem of optimal codeword-by-codeword noncoherent detection is analysed; then, design criteria for optimal space-time block codes are proposed and some novel coding schemes are devised. Simulation results show that the proposed schemes can efficiently exploit spatial diversity and that their use entails a limited energy loss with respect to other solutions available in the technical literature for coherent systems, with the substantial advantage, however, of a simple detection algorithm.
On the Error Exponent of the Wideband Relay Channel
Qiang Li (Texas A&M University, USA); Costas Georghiades (Texas A&M University, USA)
We investigated the error exponent of the wideband relay channel. By computing the random coding error exponent of three different relay strategies, i.e., amplify-and-forward (AF), decode-and-forward (DF) and block Markov coding (BMC), we found that relayed transmission can enhance wireless link reliability significantly in the wideband regime compared to direct transmission. We also studied optimal power allocation and relay placement by maximizing the reliability function. Analytical and numerical results show that, for DF and BMC relays, placing the relay node midway between the source and the destination provides the best link reliability. For the AF relay scheme, however, the optimal relay placement depends on the path-loss exponent; for large path-loss exponents, half-way placement is also optimal.
New Results in Iterative Frequency-Domain Decision-Feedback Equalization
Frédérique Sainte-Agathe (Supélec, Thales Communication, France); Hikmet Sari (Ecole Supérieure d'Electricité (SUPELEC), France)
Single-carrier transmission with frequency-domain equalization (SCT/FDE) is today recognized as an attractive alternative to orthogonal frequency-division multiplexing (OFDM) for wireless applications with large channel dispersions. In this paper, we investigate iterative frequency-domain decision-feedback equalization (FD/DFE), which significantly improves performance compared to minimum mean-square error (MMSE) and zero-forcing (ZF) linear equalizers. We introduce a new FD/DFE and compare it to previously proposed equalizers.
Network Planning for Multi-radio Cognitive Wireless Networks
Xiaodong Wang (Columbia University, USA)
We propose a general network planning framework for multi-radio multi-channel cognitive wireless networks. Under this framework, data routing, resource allocation, and scheduling are jointly designed to maximize a network utility function. We treat such a cross-layer design problem with fixed radio distributions across the nodes and formulate it as a large-scale convex optimization problem. A primal-dual method together with the column-generation technique is proposed to efficiently solve this problem. Simulation studies are carried out to assess the performance of the proposed cross-layer network planning framework. It is seen that the proposed approach can significantly enhance the overall network performance.

### Fri.6.1: Efficient SP Algorithms and Architectures - 7 papers

Room: Room 4
Chair: Thomas Kaiser (University of Duisburg-Essen, Germany)
A Computationally Efficient High-Quality Cordic Based DCT
Benjamin Heyne (University of Dortmund, Germany); Chi-Chia Sun (National Taiwan University of Science and Technology, Taiwan); Juergen Goetze (University of Dortmund, Germany); Shanq-Jang Ruan (National Taiwan University of Science and Technology, Taiwan)
In this paper a computationally efficient and high-quality preserving DCT architecture is presented. It is obtained by optimizing the Loeffler DCT based on the Cordic algorithm. The computational complexity is reduced from 11 multiply and 29 add operations (Loeffler DCT) to 38 add and 16 shift operations (which is similar to the complexity of the binDCT). The experimental results show that the proposed DCT algorithm not only reduces the computational complexity significantly, but also retains the good transformation quality of the Loeffler DCT. Therefore, the proposed Cordic based Loeffler DCT can be used in low-power and high-quality CODECs, especially in battery-based systems.
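As background, the CORDIC idea the authors build on, approximating a plane rotation with only shift-and-add micro-rotations, can be sketched as follows. This is a toy floating-point illustration, not the paper's fixed-point DCT architecture; the function name and iteration count are our own choices.

```python
import math

def cordic_rotate(x, y, theta, iterations=16):
    """Rotate (x, y) by angle theta with the CORDIC recurrence: each step
    applies a micro-rotation by +/- atan(2**-i), realizable in hardware
    with shifts and adds; the constant gain is compensated at the end."""
    angle, gain = 0.0, 1.0
    for i in range(iterations):
        gain *= 1.0 / math.sqrt(1.0 + 2.0 ** (-2 * i))
        sigma = 1.0 if theta > angle else -1.0   # steer toward the target angle
        x, y = x - sigma * y * 2.0 ** -i, y + sigma * x * 2.0 ** -i
        angle += sigma * math.atan(2.0 ** -i)
    return gain * x, gain * y

# Rotating (1, 0) by pi/6 should reproduce (cos, sin) up to CORDIC accuracy.
c, s = cordic_rotate(1.0, 0.0, math.pi / 6)
```

In a hardware DCT butterfly the multiplications by 2**-i become wired shifts, which is how the multiply operations of the Loeffler DCT are traded for the add/shift counts quoted above.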
An adaptive embedded architecture for real-time image velocimetry algorithms
Alain Aubert (LTSI, Université Jean Monnet, France); Nathalie Bochard (LTSI, Université Jean Monnet, France); Virginie Fresse (Université Jean Monnet, Saint Etienne, France)
Particle Image Velocimetry (PIV) is a method of imaging and analysing fields of flows. PIV techniques compute and display all the motion vectors of the field in a resulting image. Speeds of more than a thousand vectors per second can be required, each speed being environment-dependent. The essence of this work is to propose an adaptive FPGA-based system for real-time PIV algorithms. The proposed structure is generic, so that this unique structure can be re-used for any PIV application that uses the cross-correlation technique. The major structure remains unchanged; adaptations only concern the number of processing operations. The required speed (corresponding to the number of vectors per second) is obtained thanks to a parallel processing strategy. The image processing designer duplicates the processing modules to distribute the operations. The result is an FPGA-based architecture which is easily adapted to algorithm specifications without any hardware requirement. The design flow is fast and reliable.
Using RTOS in the AAA methodology automatic executive generation
Ghislain Roquier (IETR/INSA de Rennes, France); Mickael Raulet (IETR, France); Jean-François Nezan (IETR, France); Olivier Deforges (IETR / INSA Rennes, France)
Future generations of mobile phones, including advanced video and digital communication layers, represent a great challenge in terms of real-time embedded systems. Programmable multicomponent architectures may provide suitable target solutions combining flexibility and computation power. The aim of our work is to develop a fast and automatic prototyping methodology dedicated to signal processing application implementation on parallel heterogeneous architectures, two major features required by future systems. This paper aims to present the Algorithm Architecture Adequation methodology from both the description of application and multicomponent architecture to the dedicated real-time distributed executives. Then a comparison is done between executives which use or not a resident Real-Time Operating System. This work is finally illustrated by the execution of a multimedia application based on a scalable LAR video codec.
Efficient Implementation of Undecimated Directional Filter Banks
Truong Nguyen (University of Texas at Arlington, USA); Soontorn Oraintara (University of Texas at Arlington, USA)
In this paper, an efficient method to implement undecimated directional filter banks (UDFBs) is proposed. The method is based on an observation that many non-separable two-dimensional two-channel FBs can be efficiently implemented by a separable structure if their polyphase components are separable. Therefore, with appropriate delay and advance blocks, undecimated non-separable FBs can be computed with a comparable computational complexity to the separable case. Structures for 2-, 4- and 8-channel UDFBs are presented to illustrate the idea.
A Bit-serial Architecture for H.264/AVC Interframe Decoding
Pawel Garstecki (Poznań University of Technology, Poland); Tomasz Zernicki (Poznań University of Technology, Poland); Adam Luczak (Poznań University of Technology, Poland)
H.264/AVC is the most recent video compression standard. In this paper, an original and efficient architecture of the inter prediction block in an H.264/AVC decoder is presented. It is shown that bit-serial arithmetic can be successfully used for the interpolation filter implementation, and the resulting architecture is fully pipelined. The inter prediction module was implemented in Verilog HDL, synthesized, and then tested on Xilinx Virtex IV family devices. The simulation results indicate that the proposed bit-serial architecture of the interpolation filter is very efficient and that a clock frequency close to the image sampling frequency is enough to perform image reconstruction.
Reconfigurable Architecture of AVC/H.264 Integer Transform
Adam Luczak (Poznań University of Technology, Poland); Marta Stępniewska (Poznań University of Technology, Poland)
The paper presents an original reconfigurable architecture for inverse integer transformation in an H.264/AVC decoder. The proposed design can perform the 4x4 and 8x8 integer inverse transforms and the inverse Hadamard transform, including the inverse quantization process as well. The design exploits a pipelined architecture and supports FPGA devices. Simulation results indicate that the proposed structure is characterized by low implementation cost and high efficiency. Final synthesis and tests have been made for Xilinx Virtex family devices.
A Fixed-Point Smoothing Algorithm in Discrete-Time Systems with Correlated Signal and Noise
R. M. Fernández-Alcalá (University of Jaén, Spain); J. Navarro-Moreno (University of Jaén, Spain); J. C. Ruiz-Molina (University of Jaén, Spain); Antonia Oya (University of Jaén, Spain)
The linear least mean-square fixed-point smoothing problem in discrete-time systems is formulated in the general case where the signal is any nonstationary stochastic process of second order which is observed in the presence of an additive white noise correlated with the signal. Under the only assumption that the correlation functions involved are factorizable kernels, an efficient recursive computational algorithm for the fixed-point smoother is designed. Also, a filtering algorithm is devised.

### Fri.4.1: Adaptive Filtering - 7 papers

Room: Sala Onice
Chair: Colin Cowan (Queens University, Belfast, United Kingdom)
Improved LMS-type adaptive filtering algorithms using a new maximum eigenvalue bound estimation scheme
Yi Zhou (The University of Hong Kong, Hong Kong); Shing-Chow Chan (The University of Hong Kong, Hong Kong); K. L. Ho (The University of Hong Kong, Hong Kong)
This paper proposes a new recursive scheme for estimating the maximum eigenvalue bound of autocorrelation matrices and its application to stepsize selection in least mean squares (LMS)-type adaptive filters. The scheme is developed from the Gershgorin circle theorem and the recursive nature of correlation matrix estimation. The bound on the maximum eigenvalue of a correlation matrix can be recursively estimated with low arithmetic complexity. Applying this new recursive estimate to the stepsize selection of LMS-type algorithms ameliorates the problem of over-estimating the maximum eigenvalue bound, and hence under-estimating the stepsize, in the conventional trace estimator. This significantly improves the transient convergence and tracking speed of LMS-type algorithms. To lower the extra steady-state error caused by the use of bigger stepsizes, an effective switching mechanism is designed and incorporated into the proposed algorithms so that a smaller stepsize can be invoked near the steady state. The superior performance of the proposed algorithms is verified by computer simulations.
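For intuition, the Gershgorin bound used here is cheap to evaluate: for a Hermitian matrix, the largest eigenvalue is at most the largest absolute row sum. A minimal NumPy sketch follows; the exponentially weighted correlation recursion, forgetting factor and stepsize rule are our own illustrative assumptions, not the authors' exact scheme.

```python
import numpy as np

def gershgorin_bound(R):
    """Gershgorin circle theorem: for Hermitian R every eigenvalue lies in a
    disc centered at R[i, i] with radius sum_{j != i} |R[i, j]|, hence
    lambda_max <= max_i sum_j |R[i, j]| (largest absolute row sum)."""
    return np.max(np.sum(np.abs(R), axis=1))

# Recursive (exponentially weighted) estimate of the autocorrelation matrix,
# updated one regressor vector at a time with forgetting factor lam.
rng = np.random.default_rng(0)
L, lam = 4, 0.99
R = np.zeros((L, L))
for _ in range(500):
    x = rng.standard_normal(L)                   # input regressor vector
    R = lam * R + (1.0 - lam) * np.outer(x, x)

lambda_bound = gershgorin_bound(R)
mu = 1.0 / lambda_bound                          # stepsize kept below 2 / lambda_max
```

Because the row sums can themselves be updated recursively alongside R, the bound costs far less than an eigendecomposition at every sample.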
Self-optimizing adaptive notch filters -- comparison of three optimization strategies
Maciej Niedzwiecki (Gdansk University of Technology, Poland); Piotr Kaczmarek (Gdansk University of Technology, Poland)
The paper provides a comparison of three different approaches to on-line tuning of adaptive notch filters (ANF) -- the algorithms used for extraction/elimination of complex sinusoidal signals (cisoids) buried in noise. Tuning is needed to adjust the adaptation gains, which control the tracking performance of ANF algorithms, to the unknown and/or time-varying rate of signal nonstationarity. The first of the compared approaches incorporates sequential optimization of adaptation gains. The second solution is based on the concept of parallel estimation. It is shown that the best results are achieved when both approaches are combined in a judicious way. Such joint sequential/parallel optimization preserves the advantages of both treatments: adaptiveness (sequential approach) and robustness to abrupt changes (parallel approach).
Generalized adaptive notch filters with frequency debiasing
Maciej Niedzwiecki (Gdansk University of Technology, Poland); Adam Sobocinski (Gdansk University of Technology, Poland)
Generalized adaptive notch filters are used for identification/tracking of quasi-periodically varying dynamic systems and can be considered an extension, to the system case, of classical adaptive notch filters. Belonging to the class of causal adaptive filters, the generalized adaptive notch filtering algorithms yield biased frequency estimates. We show that this bias can be removed, or at least substantially reduced. The only price paid for the resulting improvement of the filter's tracking performance is in terms of a decision delay, which must be incorporated in the adaptive loop. Since decision delay is acceptable in many practical applications, the proposed bias/delay trade-off is an attractive alternative to the classical bias/variance compromise.
Robust adaptive filters using Student-t distribution
Guang Deng (La Trobe University, Australia)
An important application of adaptive filters is system identification. The robustness of adaptive filters to impulsive noise has been studied. In this paper, we propose an alternative approach to developing robust adaptive filters. Our approach is based on formulating the problem as a maximum penalized likelihood (MPL) problem. We use the Student-t distribution to model the noise and a quadratic penalty function to play a regularization role. The minorization-maximization principle is used to solve the optimization problem. Based on the solution, we propose two LMS-type algorithms called MPL-LMS and robust MPL-LMS. The robustness of the latter algorithm is demonstrated both theoretically and experimentally.
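To make the robustness mechanism concrete, a toy LMS-type update with a Student-t-motivated error weight can be sketched as below. This is our own illustrative variant, not the authors' MPL-LMS derivation; the parameters nu and sigma2 and the test system are assumptions.

```python
import numpy as np

def studentt_weighted_lms(x, d, L=8, mu=0.01, nu=3.0, sigma2=1.0):
    """LMS update where the error is scaled by the Student-t weight
    (nu + 1) / (nu + e^2 / sigma2): large (impulsive) errors are
    down-weighted instead of dominating the update as in plain LMS."""
    w = np.zeros(L)
    for n in range(L, len(x)):
        u = x[n - L + 1:n + 1][::-1]             # regressor, most recent first
        e = d[n] - w @ u
        weight = (nu + 1.0) / (nu + e**2 / sigma2)
        w += mu * weight * e * u
    return w

# System identification with sparse, large outliers in the observation noise.
rng = np.random.default_rng(0)
w_true = rng.standard_normal(8)
x = rng.standard_normal(5000)
d = np.convolve(x, w_true)[:len(x)] + 0.01 * rng.standard_normal(len(x))
d[::200] += 50.0                                 # impulsive noise samples
w_hat = studentt_weighted_lms(x, d)
```

For small errors the weight is roughly constant (near-quadratic, Gaussian-like cost); for an outlier of size 50 the weight collapses to about 0.002, so a single impulse barely perturbs the coefficient estimate.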
A new variable forgetting factor QR-based recursive least M-estimate algorithm for robust adaptive filtering in impulsive noise environment
Yi Zhou (The University of Hong Kong, Hong Kong); Shing-Chow Chan (The University of Hong Kong, Hong Kong); K. L. Ho (The University of Hong Kong, Hong Kong)
This paper proposes a new variable forgetting factor QR-based recursive least M-estimate (VFF-QRRLM) adaptive filtering algorithm for impulsive noise environment. The new algorithm is a QR-based implementation of the RLM algorithm, which offers better numerical stability and a similar robustness to impulsive noise. A new VFF control scheme based on the approximated derivatives of the filter coefficients is also proposed to improve its tracking performance. Simulation results show that the proposed algorithm not only offers improved robustness in impulsive noise environment but also possesses fast transient converging and tracking behaviours.
The Conjugate Gradient Partitioned Block Frequency-Domain Adaptive Filter for Multichannel Acoustic Echo Cancellation
The multichannel acoustic echo cancellation problem requires working with extremely large impulse responses. Multirate adaptive schemes such as the partitioned block frequency-domain adaptive filter (PBFDAF) are good alternatives and are widely used in commercial echo cancellation systems nowadays. However, being a Least Mean Square (LMS)-derived algorithm, its convergence speed may not be fast enough under some circumstances. In this paper we propose a new scheme which combines frequency-domain adaptive filtering with conjugate gradient techniques in order to speed up convergence. The new algorithm (PBFDAF-CG) is developed and its behaviour is compared against previous PBFDAF schemes.
Adaptive Subband Notch Filter for RFI Cancellation in Low Interference to Signal Ratio
Asmi Rim (SUPCOM- Tunis, Tunisia); Sofian Cherif (SupCom, Tunisia); Hichem Besbes (Ecole Superieure de Communications de Tunis, Sup'Com TUNISIA, Tunisia); Jaidane Meriem (U2S, Tunisia)
In this paper, an adaptive-subband-based Radio Frequency Interference (RFI) cancellation method for VDSL systems is developed. The use of a subband adaptive IIR notch filter with a specifically developed excision algorithm offers a solution in low Interference to Signal Ratio (ISR) environments. The division of a fullband problem into several subbands improves the tracking ability of the Fullband Adaptive Notch Filter (FANF) in low ISR, since the noise variance is reduced in each subband, and offers faster convergence, due to decimation. In this paper, we compare the performance of the subband method with the classical fullband method using the normalized-stochastic-algorithm-based IIR Adaptive Notch Filter (ANF).

### Fri.3.1: Multi-user MIMO Communications (Special session) - 7 papers

Room: Sala Verde
Chair: Christoph Mecklenbraeuker (FTW, Austria)
Linear receiver interfaces for multiuser MIMO communications
Alessandro Nordio (Politecnico di Torino, Italy); Giorgio Taricco (Politecnico di Torino, Italy)
We consider the uplink of a DS-CDMA wireless communication system with multiple users equipped with several transmit antennas. We assume that a multiple-antenna subsystem is added to an existing multiuser detector and we compare the performance of this receiver and that of an optimum receiver accounting for both spatial and multiple-access interference simultaneously. In the former case we say that the receiver is separate whereas in the latter we say that the receiver is joint. Several classes of separate and joint linear receivers are considered and their performance is evaluated asymptotically and by simulation.
Multiuser detection using random-set theory
Ezio Biglieri (Universitat Pompeu Fabra, Barcelona, Spain); Marco Lops (University of Cassino, Italy)
In mobile multiple-access communications, not only the location of active users, but also their number varies with time. In typical analyses, multiuser detection theory is developed under the assumption that the number of active users is constant and known at the receiver, and coincides with the maximum number of users entitled to access the system. This assumption is often overly pessimistic, since many users might be inactive at any given time, and detection under the assumption of a number of users larger than the real one may impair performance. This paper undertakes a different, more general approach to the problem of identifying active users and estimating their parameters and data in a dynamic environment where users are continuously entering and leaving the system. Using a mathematical tool known as Random Set Theory, we derive Bayesian-filter equations which describe the evolution with time of the a posteriori probability density of the unknown user parameters, and use this density to derive optimum detectors.
Subcarrier Allocation in a Multiuser MIMO Channel Using Linear Programming
Ari Hottinen (Nokia Research Center, Finland); Tiina Heikkinen (MTT Economic Research, Finland)
In this paper, we apply specialized linear programming algorithms in assigning users to orthogonal frequency channels or subcarriers in a channel-aware MIMO-OFDMA system. Efficient optimization techniques enable total utility (e.g. downlink capacity) maximization in polynomial time in the presence of strict fairness constraints.
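As a simplified illustration of the underlying idea (not the paper's exact formulation; the utility matrix and the one-user-per-subcarrier constraint are our own toy assumptions), matching users to subcarriers to maximize total utility is an assignment problem whose LP relaxation has an integral optimum, so it is solvable exactly in polynomial time:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

# Hypothetical utility matrix: utility[u, s] is the rate user u would obtain
# on subcarrier s (e.g. log2(1 + SNR_{u,s}) from channel-aware feedback).
rng = np.random.default_rng(1)
utility = rng.uniform(0.5, 5.0, size=(4, 4))

# One user per subcarrier, maximizing total utility: an assignment problem.
# The LP relaxation is integral (the assignment polytope has 0/1 vertices),
# so the optimum is found in polynomial time, here via the Hungarian method.
users, subcarriers = linear_sum_assignment(utility, maximize=True)
total_utility = utility[users, subcarriers].sum()
```

Fairness constraints of the kind mentioned in the abstract would enter as additional linear constraints on the same LP, at the cost of a general-purpose LP solver instead of the specialized Hungarian routine.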
Efficient Vector Perturbation in Multi-Antenna Multi-User Systems Based on Approximate Integer Relations
Dominik Seethaler (Institute of Communications and Radio-Frequency Engineering, Vienna University of Technology, Austria); Gerald Matz (Vienna University of Technology, Austria)
Approximate vector perturbation techniques assisted by LLL lattice reduction (LR) can exploit all the diversity that is available in multi-user multi-antenna broadcast systems. However, the required computational complexity of LLL-LR can be quite large. In this paper, we propose a much simpler and more efficient LR algorithm than LLL. This LR technique is based on Brun's algorithm for finding approximate integer relations (IRs). The link between LR and IRs is established by considering poorly conditioned channels with a single small singular value. Simulation results show that our scheme can achieve large (but not full) diversity at a fraction of the complexity required for LLL-assisted vector perturbation.
Iterative Transceiver Optimization for Linear Multiuser MIMO Channels with Per-User MMSE Requirements
Martin Schubert (Fraunhofer German-Sino Lab for Mobile Communications MCI, Germany); Shuying Shi (Fraunhofer German-Sino Lab for Mobile Communications MCI, Germany); Holger Boche (Fraunhofer Institute for Telecommunications HHI, Germany)
We address the problem of jointly optimizing linear transmit and receive filters for a multi-user MIMO system, under the assumption that all users have individual Minimum-Mean-Square-Error (MMSE) requirements. Each user can perform spatial multiplexing with several data streams (layers). All users and layers are coupled by interference, so the choice of filters is intricately interwoven with the power allocation strategy. The design goal is to minimize the total power subject to MMSE constraints. This results in a non-convex problem formulation, for which we propose an iterative algorithm. The iteration consists of alternating optimization of powers, transmit filters and receive filters. We prove that the total required power obtained by the algorithm is monotonically decreasing and converges to a limit.
Transmit Correlation-Aided Opportunistic Beamforming and Scheduling
Marios Kountouris (France Telecom R&D, France); David Gesbert (Eurecom Institute, France); Lars Pittman (Norwegian Univ. of Science & Technology, Norway)
The problem of multiuser scheduling and beamforming with partial channel state information at the transmitter (CSIT) is considered here in the particular context of opportunistic beamforming. We consider a random multiuser beamforming scheme and show how long-term statistical information on users' channels can be efficiently combined with short-term SINR feedback to increase system performance substantially over a conventional opportunistic beamforming scheme. We propose a channel estimation method to be used by the transmitter which exploits the second-order channel statistics, the fading statistics and the information contained in the instantaneous SINR feedback of random opportunistic beamforming. This coarse (low-feedback-based) channel estimate is shown to be particularly valuable for the purpose of user selection, as well as for the precoding matrix design.
A reduced complexity MIMO Broadcast scheme: a way between opportunistic and dirty paper implementation
Nizar Zorba (Telecommunications Technological Center of Catalonia (CTTC), Spain); Ana I. Perez-Neira (Universitat Politecnica de Catalunya, Spain); Miguel Angel Lagunas (Telecommunications Technological Center of Catalonia, Spain)
Departing from the opportunistic schemes in Multiuser Broadcast MIMO schedulers, practical transmission techniques to get closer to the channel capacity are proposed for the outdoor urban scenario. By considering the Spatial Power Density function of the arriving signal, the paper develops different setups based on the quantity of available Channel State Information at the transmitter side (CSIT). In a first approach, a Signal to Noise Interference Ratio (SNIR) feedback scenario is considered; to further decrease the gap between the channel capacity and the proposed scheme, the paper then suggests that only the selected users be asked to provide their full CSIT to the transmitter side. The goal is to always keep a small load on the feedback link while at the same time providing almost all of the benefits of full-CSIT scenarios. The proposed schemes are compared via simulations with other possible transmission strategies in terms of system sum rate.

Room: Auditorium

## 11:20 AM - 1:00 PM

### Fri.2.2: Hardware-Related Issues in Signal Processing - 5 papers

Chair: Fabrizio Rovati (STMicroelectronics, Italy)
The Rapid Prototyping experiences of image processing algorithms on FPGA
Vincent Brost (University of Burgundy, France)
Recent FPGA chips, with their large-capacity memory and reconfigurability potential, have opened new frontiers for rapid prototyping of embedded systems. With the advent of high-density FPGAs it is now feasible to implement a high-performance VLIW processor core in an FPGA. We describe research results on enabling the DSP TMS320 C6201 model for real-time image processing applications by exploiting FPGA technology. The goals are, firstly, to keep the flexibility of a DSP in order to shorten the development cycle, and secondly, to make maximum use of the powerful resources available on the FPGA in order to increase real-time performance. We present a modular DSP C6201 VHDL model with a variable instruction set. Some common image processing algorithms were created and validated on an FPGA VirtexII-2000 multimedia board using the proposed application development cycle. Our results demonstrate that an algorithm can easily be specified in an optimal manner and then automatically converted to the VHDL language and implemented on an FPGA device with system-level software. This makes our approach suitable for developing co-design environments.
Fast Integer Arithmetic Wavelet Transform Properties and Application in FPGA/DSP System
Wojciech Polchlopek (AGH University of Science and Technology, Poland); Wojciech Maj (AGH University of Science and Technology, Poland); Wojciech Padee (Warsaw University of Technology, Poland)
The new fully integer wavelet compression scheme enables very fast processing and can thus be very useful in real-time systems. The most important property of this concept seems to be the possibility of simple and fast implementation in an FPGA chip. The new Fast Integer Arithmetic Wavelet Transform (FIAWT) can be a very useful tool to compute the DWT (and non-decimated DWT) for time-restricted systems (real-time data processing), e.g. Data Acquisition (DAQ) systems with a waveform recorder. In this paper the authors show some important aspects of the FIAWT. An application example (FPGA/DSP) in the Imaging Cosmic And Rare Underground Signals (ICARUS) DAQ system, for compression with signal recognition, is included as part of this paper.
Automatic DSP cache memory management and fast prototyping for multiprocessor image applications
Fabrice Urban (Thomson R&D, France); Mickael Raulet (IETR, France); Jean-François Nezan (IETR, France); Olivier Deforges (IETR / INSA Rennes, France)
Image and video processing applications represent a great challenge in terms of real-time embedded systems. Programmable multicomponent architectures can provide suitable target solutions combining flexibility and computation power. On-chip memory size limitations force the use of external memory, whose relatively small bandwidth induces a performance loss that can be limited by the use of an efficient data access scheme. A cache mechanism offers a good trade-off between efficiency and programming complexity. However, in a multicomponent context, data consistency has to be ensured. The aim of our work is to develop a fast and automatic prototyping methodology dedicated to deterministic signal processing applications and their multicomponent implementations. This methodology directly generates distributed executives for various platforms from a high-level application description. This paper presents the improvement provided by cache memory in the design process. The cache is automatically managed in multicomponent implementations, thus reducing development time. The improvement is illustrated by image processing applications.
Atilla Uygur (Istanbul Technical University, Turkey); Hakan Kuntman (Istanbul Technical University, Turkey)
In this paper, a quadrature oscillator topology is designed using CDTA (Current Differencing Transconductance Amplifier) based allpass building blocks. A low-voltage CMOS realization of the CDTA element, used to implement the allpass filters, is also given. The resulting circuit is capable of working with supply rails of ±0.75 V, and the total power consumption is just 0.75 mW. SPICE simulation results are supplied to verify the theoretical study.
VLSI Design of a High-Speed CMOS Image Sensor with In-Situ 2D Programmable Processing
Jerome Dubois (LE2I - Universite de Bourgogne, France); Dominique Ginhac (Universite de Bourgogne, France); Michel Paindavoine (Université de Bourgogne, France)
A high-speed VLSI image sensor including some pre-processing algorithms is described in this paper. The sensor implements some low-level image processing in a massively parallel strategy in each pixel of the sensor. Spatial gradients and various convolutions such as the Sobel filter or Laplacian are described and implemented on the circuit. Each pixel includes a photodiode, an amplifier, two storage capacitors and an analog arithmetic unit. The system provides address-event coded outputs on three asynchronous buses: the first output is dedicated to the result of image processing and the other two to frame capture at very high speed. Frame rates up to 10 000 frames/s with only image acquisition, and 1000 to 5000 frames/s with image processing, have been demonstrated by simulations.

### Fri.5.2: Pattern Recognition I - 5 papers

Chair: Bulent Sankur (Bogazici University, Turkey)
Anisotropy characterization of non-stationary grey level textures
Gerald Lemineur (University of Orleans, France); Rachid Harba (University of Orleans, France); Rachid Jennane (University of Orleans, France)
This communication deals with anisotropy characterization of grey level textures in a non-stationary context. Three classes of characterization methods are tested. The first class of methods directly uses the non-stationary data; we chose here the non-stationary fractional Brownian motion (fBm) to model such images. The second class is based on a stationary approach; thus, the image has to be stationarized first. The third class of methods is founded on stationary techniques too, but in a binary framework; for this reason, the original image must be stationarized and binarized first. All the proposed methods are tested on various grey level non-stationary textures. Results show that all the techniques are in good agreement with the anisotropy visible in the data. The fBm-based method is the only one that does not require any choice or adjustment of parameters. For a real application where the anisotropy really matters, this technique should be tested.
Study of the performance of different alignment methods in pattern recognition
Roberto Gil (University of Alcalá, Spain); Manuel Rosa (University of Alcalá, Spain); Raúl Vicen (University of Alcalá, Spain); Francisco Lopez (University of Alcalá, Spain); Manuel Utrilla (University of Alcalá, Spain)
In this paper the alignment of noisy signals using different methods is studied. The methods studied are the Maximum Position method, the Cross-correlation method and the Zero Phase method. In order to evaluate the performance of the alignment methods, a database of high range resolution radar profiles containing patterns belonging to six different targets has been used, and the classification error using the k-Nearest Neighbor method is calculated. Results show that the Maximum Position method performs best in terms of error rate.
Audiovisual Speech Enhancement Experiments for Mouth Segmentation Evaluation
Pierre Gacon (Institut National Polytechnique de Grenoble, France); Pierre-Yves Coulon (Institut National Polytechnique de Grenoble, France); Gérard Bailly (Institut National Polytechnique de Grenoble, France)
Mouth segmentation is an important issue in many multimedia applications such as speech reading, face synthesis, recognition and audiovisual communication. Our goal is robust and efficient detection of the lip contours in order to restore the speech movement as faithfully as possible. We present a methodology focused on the detection of the inner and outer mouth contours, a difficult task due to non-linear appearance variations. Our method is based on a statistical shape model with local Gaussian appearance descriptors whose theoretical responses are predicted by a non-linear neural network. From our automatic segmentation of the mouth, we can generate a clone of a speaker's mouth whose lip movements are as close as possible to the original ones. In this paper, the results obtained by this methodology are evaluated qualitatively by testing the relevance of this clone. We carried out an experiment which quantified the effective enhancement in comprehension brought by our analysis-resynthesis scheme in a telephone enquiry task.
A framework for histogram-induced 3D descriptors
Ceyhun Akgül (Bogazici University and ENST-Paris, Turkey); Bulent Sankur (Bogazici University, Turkey); Yücel Yemez (Koc University, Turkey); Francis Schmitt (ENST/TSI, France)
In this paper, we propose a novel histogram descriptor framework for 3D shapes based on modeling the probability density functions (pdf) of geometrical quantities. The shape functions are designed to reflect the 3D surface properties of objects, and their pdf is modeled as a mixture of Gaussians. We make use of the special geometry of triangular meshes in 3D and provide efficient means to approximate the moments of shape functions per triangle. Our framework produces a number of 3D shape descriptors that prove to be quite discriminative in a retrieval application. We test and compare our descriptors to other histogram-based methods on two 3D model databases. It is shown that our methodology improves the performance of existing descriptors and becomes a fertile ground to advance and test new ones.
Print-to-Edge Registration Application for Banknotes
Volker Lohweg (Univ. Applied Sciences Lippe and Hoexter, Germany); Thomas Tuerke (KBA-Bielefeld, Germany); Harald Willeke (KBA-Bielefeld, Germany)
Spatial transforms and fuzzy pattern classification with unimodal potential functions are established in signal processing. They have proved to be excellent tools in feature extraction and classification. In this paper we present a concept for an image processing and classification scheme for a print-to-edge registration application. The application concerns the position detection of objects in banknotes. The scheme is straightforward and therefore well suited for industrial applications. Furthermore, it can be applied to other applications. An implementation on a field programmable gate array (FPGA) is proposed.

### Fri.1.2: Digital Signal Processing for UWB Applications I (Invited special session) - 5 papers

Room: Auditorium
Chair: Umberto Mengali (University of Pisa, Italy)
Reduced-complexity Multiple Symbol Differential Detection for UWB Communications
Vincenzo Lottici (University of Pisa, Italy); Zhi Tian (Michigan Technological University, USA)
In ultra-wideband (UWB) communications, the typical signal propagation through dense multipath fading offers potentially very large multipath diversity, but at the same time complicates receiver design as far as channel estimation and multipath energy capture are concerned. To strike a desired balance, we propose a multi-symbol differential detection framework that bypasses training or costly channel estimation by using the autocorrelation principle. Furthermore, a proper application of the Viterbi algorithm attains an efficient trade-off between performance and affordable complexity. Simulation results demonstrate that the proposed detection scheme is remarkably robust with respect to the effects of both noise and multiple access interference.
UWB Receiver Design for low Resolution Quantization
Stefan Franz (University of Southern California, USA); Urbashi Mitra (University of Southern California, USA)
Digital implementation of ultra-wideband receivers requires analog-to-digital conversion (ADC) at an extremely high speed, thereby limiting the available bit resolution. Herein, a new family of receiver structures optimized and tailored to quantized observations is presented. The generalized-likelihood ratio test (GLRT) based on the quantized samples is derived and shown to provide performance improvements in comparison to the infinite resolution GLRT rule employed on the quantized received signal. Furthermore, simulation results reveal that four bits of resolution are sufficient to closely approach the performance of a full resolution receiver.
Narrowband Interference Suppression in Transmitted Reference UWB Receivers Using Sub-Band Notch Filters
Marco Pausini (Delft University of Technology, The Netherlands); Gerard Janssen (Delft University of Technology, The Netherlands)
The Transmitted-Reference (TR) signaling scheme in conjunction with the Autocorrelation Receiver (AcR) has gained popularity in the last few years as a low-complexity system architecture for Ultra Wide Band (UWB) communications. Since the signal template is directly obtained from the received signal, not only the noise but also the interference caused by a narrowband (NB) system operating in the same bandwidth corrupts both the data and the reference pulses. In this paper we study the effects of a single-tone interferer on the performance of a TR system, measured in terms of probability of error. We also propose a simple but effective way to counteract the NB interference, consisting of a bank of notch filters suppressing the sub-band containing the NB signal.
Narrowband interference mitigation for a transmitted reference ultra-wideband receiver
Quang Hieu Dang (Delft University of Technology, The Netherlands); Alle Jan van der Veen (Delft University, The Netherlands)
Narrowband interference (NBI) is of specific concern in transmitted reference ultra-wideband (TR-UWB) communication systems. We consider NBI in high data rate applications where significant interframe interference is present due to a very short frame duration. Oversampling of the correlator output with respect to the frame rate is used to gather more information for the receiver. We formulate an approximate data model that includes the NBI terms, and subsequently a receiver algorithm is derived.
Finger Selection for UWB Rake Receivers
Sinan Gezici (Mitsubishi Electric Research Labs, USA); Mung Chiang (Princeton University, USA); Hisashi Kobayashi (Princeton University, USA); H. Vincent Poor (Princeton University, USA)
The problem of choosing the multipath components to be employed at a selective Rake receiver, the finger selection problem, is considered for an impulse radio ultra-wideband system. First, the finger selection problem for MRC-Rake receivers is considered and the suboptimality of the conventional scheme is shown by formulating the optimal solution according to the SINR maximization criterion. Due to the complexity of the solution, a convex formulation is obtained by means of integer relaxation techniques. Then, the finger selection for MMSE-Rake receivers is studied and optimal and suboptimal schemes are presented. Finally, a genetic algorithm based solution is proposed for the finger selection problem, which works for various multipath combining schemes. Simulation studies are presented to compare the performance of different algorithms. Index Terms: Ultra-wideband (UWB), impulse radio (IR), Rake receiver, convex optimization, integer programming, genetic algorithm (GA).
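For intuition, the conventional finger-selection rule the paper shows to be suboptimal simply picks the paths with the largest individual gains, ignoring interference between fingers. A minimal sketch (function and variable names are ours, not the paper's):

```python
def select_fingers(path_gains, num_fingers):
    """Conventional scheme: keep the num_fingers paths with largest |gain|."""
    ranked = sorted(range(len(path_gains)),
                    key=lambda i: abs(path_gains[i]), reverse=True)
    return sorted(ranked[:num_fingers])  # finger indices in tap order

gains = [0.1, -0.9, 0.4, 0.05, 0.7]
print(select_fingers(gains, 3))  # strongest paths: indices 1, 4, 2 -> [1, 2, 4]
```

The SINR-optimal selection in the paper instead accounts for the correlation of interference across fingers, which is what makes the problem combinatorial and motivates the convex relaxation.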

### Poster: Digital Filters Design and Analysis - 18 papers

Room: Poster Area
Chair: Bogdan Dumitrescu (Tampere University of Technology, Finland)
WLS Quasi-Equiripple Design of Variable Fractional Delay FIR Filters
Jie-Cherng Liu (Tatung University, Taiwan)
The Weighted Least Squares (WLS) method is a well-known technique for designing FIR filters, and it has already been extended to the design of variable fractional delay FIR filters, where it is reported to be more efficient than the method based on Lagrange interpolation. In this paper, the WLS method is combined with an iterative procedure for obtaining a suitable frequency-response weighting function, so that a quasi-equiripple variable fractional delay filter can be achieved. Simulation results show that no more than eight iterations are needed to obtain satisfactory results. Moreover, the proposed design is superior to the plain WLS design (with a constant, fixed weighting function) in peak absolute error by 6.7 dB.
Gram pair parameterization of multivariate sum-of-squares trigonometric polynomials
Bogdan Dumitrescu (Tampere University of Technology, Finland)
In this paper we propose a parameterization of sum-of-squares trigonometric polynomials with real coefficients that uses two positive semidefinite matrices, each half the size of the single matrix in the known Gram (trace) parameterization. We also formulate a Bounded Real Lemma for polynomials with support in the positive orthant. We show that the new parameterization is clearly faster and can thus replace the old one in several design problems.
Finite Wordlength Implementation for Recursive Digital Filters with Error Feedback
Masayoshi Nakamoto (Hiroshima University, Japan); Takao Hinamoto (Department of Artificial Complex Systems Engineering, Hiroshima University, Japan); Masaki Oku (Hiroshima University, Japan); Tsuyoshi Yoshiya (Hiroshima University, Japan)
In this work, we treat a design problem for recursive digital filters described by a rational transfer function in discrete space with scaling and error feedback. First, we formulate the filter design problem using the least-squares criterion and express it as a quadratic form with respect to the filter coefficients. Next, we present a relaxation method based on the Lagrange multiplier method in order to search for a good solution. Finally, we show the effectiveness of the proposed method using some numerical examples.
Roundoff Noise Analysis of Signals Represented Using Signed Power-of-Two Terms
Ya Jun Yu (Nanyang Technological University, Singapore); Yong Ching Lim (Nanyang Technological University, Singapore)
Multiplierless filtering is very attractive for digital signal processing, because the coefficient multiplier is the most complex and the slowest component. An alternative way to achieve multiplierless filtering, other than designing digital filters with signed power-of-two (SPT) coefficient values, is to round each input data sample to a sum of a limited number of SPT terms. However, a roundoff noise representing the roundoff error is introduced when signal data are rounded to SPT numbers. In the SPT space, the quantization step size is nonuniform and thus the roundoff noise characteristic differs from that produced when the quantization step size is uniform. This paper presents an analysis of the roundoff noise of signals represented using a limited number of SPT terms. Roundoff errors for Gaussian distributed inputs are estimated by using our analysis. Examples show that the estimated errors are very close to the actual ones.
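The rounding step the analysis concerns can be sketched greedily: approximate a sample by at most a fixed number of signed power-of-two terms. The term budget and exponent range below are illustrative assumptions, not values from the paper:

```python
def round_to_spt(x, num_terms, min_exp=-8, max_exp=0):
    """Greedily approximate x by a sum of at most num_terms signed
    powers of two with exponents in [min_exp, max_exp]."""
    approx, residual = 0.0, x
    for _ in range(num_terms):
        # candidate terms are +/- 2^e; pick the one closest to the residual
        best = min((s * 2.0 ** e
                    for e in range(min_exp, max_exp + 1) for s in (1, -1)),
                   key=lambda t: abs(residual - t))
        if abs(residual - best) >= abs(residual):
            break  # no remaining term improves the approximation
        approx += best
        residual -= best
    return approx

print(round_to_spt(0.65, 2))  # 0.5 + 0.125 = 0.625
```

The roundoff error (here 0.025) depends on where `x` falls in the nonuniform SPT grid, which is exactly why the noise analysis differs from the uniform-quantizer case.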
Lower Limit Cycle Bounds for Narrow Transition Band Digital Filters
Jose Oses-del Campo (Universidad Politecnica de Madrid, Spain); Manuel Blanco-Velasco (University of Alcala, Spain); Fernando Cruz-Roldán (Universidad de Alcalá, Spain)
A new procedure for obtaining tighter bounds on zero-input limit cycles is presented. The new bounds are applicable to digital filters of arbitrary order described in state-space formulation and implemented with fixed-point arithmetic. In most filters, we obtain smaller bounds through this new algorithm, which is easy to implement and executes in very short computer time. The bounds obtained for narrow-band filters are far lower than those given by classical procedures, yielding enormous computation savings when completing an exhaustive search. Simulation results are presented in tables that show the validity of the proposed theory.
Strategies of adaptive nonlinear noise cancelling Volterra-Wiener filter structure selection
Pawel Biernacki (University of Technology Wroclaw, Poland)
In this article, a parameter selection strategy for a nonlinear orthogonal noise cancelling filter is proposed to maximize the ratio of noise cancelling quality to computational complexity. The presented approach leads to an orthogonal realization of a nonlinear noise cancelling filter of the Volterra-Wiener class whose structure changes according to the higher-order statistics of the filtered signals.
Lattice factorization and design of Perfect Reconstruction Filter Banks with any Length yielding linear phase
Zhiming Xu (Nanyang Technological University, Singapore); Anamitra Makur (Nanyang Technological University, Singapore)
This paper introduces the lattice factorizations and designs of a large class of critically sampled linear phase perfect reconstruction filter banks. We deal with FIR filter banks with real-valued coefficients in which all filters have the same arbitrary length and symmetry center. Refined existence conditions on the length are given first. Lattice structures are developed for both even and odd channel filter banks. Compared to most existing design methods, the proposed approach can offer a better trade-off between performance and filter length. Finally, several design and application examples are presented to validate our approach.
A Streamlined Approach to Adaptive Filters
John Håkon Husøy (University of Stavanger, Norway)
In this paper we present a streamlined framework for adaptive filters within which all major adaptive filter algorithms can be seen as special cases. The framework involves three ingredients: 1) a preconditioned Wiener-Hopf equation, 2) its simplest possible iterative solution through the Richardson iteration, and 3) an estimation strategy for the autocorrelation matrix, the cross correlation vector and a preconditioning matrix. This results in a unified adaptive filter theory characterized by simplicity, elegance and economy of ideas, suitable for an educational setting in which the similarities and differences between the many different adaptive filter algorithms are stressed.
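As an illustration of this viewpoint (in our notation, not the paper's): plugging the instantaneous estimates R ~ x xᵀ and p ~ d x into the Richardson iteration w ← w + μ(p − Rw) for the Wiener-Hopf equation Rw = p recovers the familiar LMS update:

```python
import random

def lms_step(w, x, d, mu):
    """One Richardson/LMS step with instantaneous estimates of R and p."""
    y = sum(wi * xi for wi, xi in zip(w, x))   # filter output
    e = d - y                                  # a priori error
    return [wi + mu * e * xi for wi, xi in zip(w, x)], e

# toy use: identify a 2-tap system h = [0.5, -0.3] from noiseless data
random.seed(0)
h = [0.5, -0.3]
w = [0.0, 0.0]
buf = [0.0, 0.0]
for _ in range(2000):
    buf = [random.gauss(0, 1)] + buf[:1]       # tapped delay line
    d = sum(hi * xi for hi, xi in zip(h, buf)) # desired signal
    w, _ = lms_step(w, buf, d, 0.05)
print([round(wi, 3) for wi in w])  # converges toward [0.5, -0.3]
```

Other members of the family then correspond to different estimates of the autocorrelation matrix and different preconditioners, which is the unification the paper stresses.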
About Ordering Sequences in Minimum-Phase Sequences
Corneliu Rusu (Tampere University of Technology, Finland); Jaakko Astola (Tampere University of Technology, Finland)
In a previous EUSIPCO 2004 paper, we showed that certain types of sequences can be ordered such that the new sequence is minimum-phase. We also pointed out that this property does not hold for all real sequences. In this paper we prove that the set of points of the $N$-dimensional real or complex space having this characteristic is a non-empty open set. Moreover, we illustrate some new features of the sequences that can be ordered as minimum-phase sequences.
A Simplified FTF-Type Algorithm for Adaptive Filtering
Ahmed Benallal (Dammam College of Technology, Saudi Arabia)
Fast RLS algorithms are known to exhibit numerical instability, which originates in the forward prediction parameters. In this paper, a simplified FTF-type algorithm for adaptive filtering is presented. The basic idea behind the proposed algorithm is to avoid using the backward variables. The resulting algorithm is less complex than the existing numerically stable fast FTF and shows the same performance.
Optimal Design of FIR Filter with SP2 Coefficients based on Semi-Infinite Linear Programming Method
Rika Ito (Toho University, Japan); Ryuichi Hirabayashi (Mejiro University, Japan)
In this paper, we propose a new design method for FIR filters with Signed Power of Two (SP2) coefficients. In the proposed method, the design problem is formulated as a discrete semi-infinite linear programming problem (DSILP), and the DSILP is solved using a branch and bound technique. We guarantee the optimality of the solution obtained; hence, it is possible to obtain the optimal discrete coefficients. Computational experiments confirm that optimal linear phase FIR filters with SP2 coefficients can be designed with sufficient precision.
On Families of 2^N-dimensional Hypercomplex Algebras Suitable for Digital Signal Processing
Daniel Alfsmann (University of Bochum, Germany)
A survey of hypercomplex algebras suitable for DSP is presented. Generally applicable properties are obtained, including a paraunitarity condition for hypercomplex lossless systems. Algebras of dimension n = 2^N, N in Z, are classified by generation methods, constituting families. Two algebra families, which are commutative and associative for arbitrary N, are examined in more detail: the 2^N-dimensional hyperbolic numbers and the tessarines. Since these non-division algebras possess zero divisors, the orthogonal decomposition of hypercomplex numbers is investigated in general.
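For intuition, the simplest member of the hyperbolic family is the 2-D split-complex number a + bj with j² = +1; its zero divisors are easy to exhibit directly (an illustrative sketch, not from the paper):

```python
class Hyperbolic:
    """2-D hyperbolic (split-complex) number a + b*j with j*j = +1."""
    def __init__(self, a, b):
        self.a, self.b = a, b
    def __mul__(self, other):
        # (a + bj)(c + dj) = (ac + bd) + (ad + bc)j, using j^2 = +1
        return Hyperbolic(self.a * other.a + self.b * other.b,
                          self.a * other.b + self.b * other.a)
    def __repr__(self):
        return f"{self.a} + {self.b}j"

# zero divisors: (1 + j)(1 - j) = 1 - j^2 = 0 although neither factor is 0
p = Hyperbolic(1, 1) * Hyperbolic(1, -1)
print(p)  # 0 + 0j
```

It is precisely such zero divisors that rule out division and motivate the orthogonal decomposition the abstract mentions.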
Optimization of FIR Filters Synthesized Using the Generalized One-Stage Frequency-Response Masking Approach
Murat Kapar (Istanbul Technical University, Turkey); Cercis Özgür Solmaz (Istanbul Technical University, Turkey); Ahmet Kayran (Istanbul Technical University, Turkey)
Frequency-response masking (FRM) approach is an efficient technique for significantly reducing the number of multipliers and adders in implementing sharp linear-phase finite-impulse-response (FIR) digital filters. It has been shown that further savings in arithmetic operations can be achieved by using the generalized FRM approach where the masking filters have a new structure. In both the original and the generalized synthesis techniques, the subfilters in the overall implementation are designed separately. The arithmetic complexity in the original one-stage FRM approach has been considerably reduced by using a two-step technique for simultaneously optimizing all the subfilters. Such an efficient algorithm was also proposed for synthesizing multistage FRM filters. In this paper, the two-step optimization algorithm proposed for the multistage FRM approach is adapted to the generalized one-stage FRM filters. An example taken from the literature illustrates the efficiency of the proposed technique.
Optimal Linear Filtering with Piecewise-Constant Memory
Anatoli Torokhti (University of South Australia, Australia); Phil Howlett (University of South Australia, Australia)
The paper concerns the optimal linear filtering of stochastic signals associated with the notion of piecewise constant memory. The filter should satisfy a specialized criterion formulated in terms of a so called lower stepped matrix $A$. To satisfy the special structure of the filter, we propose a new technique based on a block-partition of the lower stepped part of matrix $A$ into lower triangular and rectangular blocks, $L_{ij}$ and $R_{ij}$ with $i=1,\ldots,l, j=1,\ldots, s_i$ where $l$ and $s_i$ are given. We show that the original error minimization problem in terms of the matrix $A$ is reduced to $l$ individual error minimization problems in terms of blocks $L_{ij}$ and $R_{ij}$. The solution to each problem is provided and a representation of the associated error is given.
Design of M-channel Perfect Reconstruction Filterbanks with IIR-FIR Hybrid Building Blocks
Shunsuke Iwamura (Keio University, Japan); Taizo Suzuki (Keio University, Japan); Yuichi Tanaka (Keio University, Japan); Masaaki Ikehara (Keio University, Japan)
This paper discusses a new structure for M-channel IIR perfect reconstruction filterbanks. A new design implementation for the new building block, defined as the product of an IIR building block and an FIR building block, is presented. We derive the condition under which the new building block is obtained without an increase in filter order despite the product. Additionally, simulation results show improved stopband attenuation.
On Aliasing Effects in the Contourlet Filter Bank
Truong Nguyen (University of Texas at Arlington, USA); Soontorn Oraintara (University of Texas at Arlington, USA)
The pyramidal directional filter bank (PDFB) for the contourlet transform is analyzed in this paper. First, the PDFB is viewed as an overcomplete filter bank, and the directional filters are expressed in terms of polyphase components of the pyramidal filter bank and the conventional DFB. The aliasing effects of the conventional DFB and the Laplacian pyramid on the equivalent directional filters are then considered, and conditions to reduce these effects are presented. Experiments show that the designed PDFBs satisfying these requirements have equivalent filters with much better frequency responses. The performance of the new PDFB is verified by non-linear approximation of images. It is found that an improvement of 0.2 to 0.5 dB in PSNR over the existing PDFB can be achieved.
Designing linear-phase digital differentiators: A novel approach
Ewa Hermanowicz (Gdansk University of Technology, Poland)
In this paper, two methods for efficiently and accurately designing a digital linear-phase differentiator of an arbitrary degree of differentiation are proposed. The first utilizes a symbolic expression for the coefficients of a generic fractional delay (FD) filter and is based on a fundamental relationship between the coefficients of a digital differentiator and those of the generic FD filter. The second profits from one of the core attributes of the structure invented by Farrow for FD filters, namely the alternating symmetry and anti-symmetry of its sub-filters, which are linear-phase differentiators. Some practical remarks concerning the use of the true Farrow structure, as opposed to a vitiated one often met in the literature, are also included.
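For comparison, a textbook construction of a first-order linear-phase FIR differentiator (a windowed ideal design, not either of the paper's two methods) truncates the ideal response h[n] = (-1)^n / n and applies a Hamming window:

```python
import math, cmath

def fir_differentiator(half_len):
    """Windowed-ideal first-order differentiator, Type III linear phase:
    h[n] = (-1)^n / n for n != 0, h[0] = 0, Hamming-windowed."""
    N = 2 * half_len + 1
    h = []
    for i in range(N):
        n = i - half_len                       # center the ideal response
        ideal = 0.0 if n == 0 else ((-1) ** n) / n
        win = 0.54 - 0.46 * math.cos(2 * math.pi * i / (N - 1))  # Hamming
        h.append(ideal * win)
    return h

h = fir_differentiator(15)                     # 31 taps
w = 0.2 * math.pi
H = sum(hi * cmath.exp(-1j * w * i) for i, hi in enumerate(h))
print(abs(H))  # close to w = 0.2*pi in the passband, as |H| ~ |omega|
```

The symmetric/anti-symmetric structure exploited by the Farrow-based method in the paper generalizes this idea to arbitrary differentiation degree.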
Design of IIR Notch Filters With Maximally Flat or Equiripple Magnitude Characteristics
Alfonso Fernandez-Vazquez (INAOE, Mexico); Gordana Jovanovic Dolecek (INAOE, Mexico)
This paper presents the design of IIR (Infinite Impulse Response) notch filters with a desired magnitude characteristic, which can be either maximally flat or equiripple. A Butterworth polynomial, used for designing the allpass filter, results in a maximally flat magnitude; similarly, an equiripple characteristic is obtained by using Chebyshev I, Chebyshev II, or Elliptic polynomials. The design parameters are the notch frequency, the rejection bandwidth, and the passband ripple.
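To see the allpass-based construction at work, here is a classical second-order lattice notch (the standard Regalia-Mitra parameterization, a textbook design rather than the paper's polynomial-based one), where H(z) = (1 + A(z))/2 and A(z) is allpass:

```python
import cmath, math

def allpass_notch(w0, bw):
    """Second-order allpass-based notch H(z) = (1 + A(z)) / 2.
    k1 sets the notch frequency, k2 the rejection bandwidth."""
    k1 = -math.cos(w0)
    t = math.tan(bw / 2)
    k2 = (1 - t) / (1 + t)
    b = [k2, k1 * (1 + k2), 1.0]        # allpass numerator coefficients
    a = [1.0, k1 * (1 + k2), k2]        # allpass denominator coefficients
    def H(w):
        z = cmath.exp(1j * w)
        A = sum(bi * z ** -i for i, bi in enumerate(b)) / \
            sum(ai * z ** -i for i, ai in enumerate(a))
        return (1 + A) / 2
    return H

H = allpass_notch(w0=0.3 * math.pi, bw=0.05 * math.pi)
print(abs(H(0.3 * math.pi)))  # ~0 at the notch frequency
print(abs(H(0.0)))            # ~1 in the passband
```

Since |A(e^{jw})| = 1 everywhere, the notch depth comes entirely from the phase of A(z) reaching -pi at w0; the paper's contribution is shaping the surrounding magnitude (maximally flat or equiripple) through the choice of polynomial.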

### Poster: Channel Coding and Decoding - 5 papers

Room: Poster Area
Chair: Martin Haardt (Ilmenau University of Technology, Germany)
Convergence analysis of the MAP turbo detector: New representation and convergence to the AWGN case
Noura Sellami (Institut Supérieur de l'Electronique et des Communications de Sfax, Tunisia); Aline Roumy (IRISA-INRIA, Campus de Beaulieu, Rennes, France); Inbar Fijalkow (Universite de Cergy-Pontoise, France)
In this paper, we consider a coded transmission over a frequency selective channel. In \cite{sellamiSP05,sellamispawc05}, we studied analytically the impact of a priori information on the MAP equalizer performance and gave the expression of the extrinsic Log Likelihood Ratios (LLRs) at its output. Based on these results, we propose in this paper to study the convergence of a turbo equalizer using a Maximum a posteriori (MAP) equalizer and a MAP decoder. We give an analysis of the decoder performance when it is provided with a priori information. Then, we propose a new representation space for the convergence analysis of the turbo equalizer. This representation is interesting since, for the equalizer, it is independent of the signal to noise ratio (SNR). We also show that the performance of the turbo equalizer converges at high SNR to the Additive White Gaussian Noise (AWGN) channel performance under a certain condition, which we derive, on the channel and the code.
Bit-Interleaved Coded Modulation with Iterative Decoding using Constellation Shaping
Boon Kien Khoo (School of Electrical, Electronic, and Computer Engineering, University of Newcastle upon Tyne, United Kingdom); Stephane Le Goff (University of Newcastle upon Tyne, United Kingdom); Bayan Sharif (University of Newcastle Upon Tyne, United Kingdom); Charalampos Tsimenidis (University of Newcastle Upon Tyne, United Kingdom)
We investigate the association between constellation shaping and bit-interleaved coded modulation with iterative decoding (BICM-ID). To this end, we consider a technique which consists of inserting shaping block codes between the mapping and channel coding functions in order to achieve constellation shaping. Using the example of a 2-bit/s/Hz 16-QAM BICM-ID, it is demonstrated through computer simulations that this technique can improve the performance of BICM-ID schemes by a few tenths of a decibel.
Predistorter Design Employing Parallel Piecewise Linear Structure and Inverse Coordinate Mapping for Broadband Communications
Mei Yen Cheong (Helsinki University of Technology, Finland); Stefan Werner (Helsinki University of Technology, Finland); Timo Laakso (Helsinki University of Technology, Finland); Juan Cousseau (Universidad Nacional del Sur, Argentina); Jose Figueroa (Universidad del Sur, Argentina)
This paper proposes a low-complexity predistorter (PD) for compensation of both the AM/AM and the AM/PM conversions, with memory effects, of a Wiener-model-type nonlinear power amplifier (PA). The quasi-static nonlinearities are modeled using a class of piecewise linear (PWL) functions. The advantage of using PWL functions is that they facilitate the development of an efficient PD identification algorithm. The proposed algorithm involves a novel inverse coordinate mapping (ICM) method that maps the nonlinear characteristics of the PA to those of the PD, and parameter estimation that does not involve matrix inversion. The indirect learning architecture is used to provide on-line adaptive compensation of the PA's memory. Simulation results show that the proposed quasi-static PD significantly improves the adjacent channel power ratio of the output signal as compared to a PD that does not consider the AM/PM distortion. The proposed PD is also shown to outperform the orthogonal polynomial PD.
Analysis of Vector Quantizers Using Transformed Codebook with Application to Feedback-Based Multiple Antenna Systems
Jun Zheng (University of California at San Diego, USA); Bhaskar Rao (University of California, San Diego, USA)
Transformed codebooks are often obtained by a transformation of a codebook, potentially optimum for a particular set of statistical conditions, to best match the statistical environment at hand. The procedure, though suboptimal, has recently been suggested for feedback MISO systems because of its simplicity and effectiveness. We first consider in this paper the analysis of a general vector quantizer with a transformed codebook. Bounds on the average distortion of this class of quantizers are provided to characterize the effects of the sub-optimality introduced by the transformed codebook on system performance. We then focus our attention on applying the proposed general framework to the capacity analysis of a feedback-based MISO system over correlated fading channels using channel quantizers with both optimal and transformed codebooks. In particular, upper and lower bounds on the channel capacity loss of MISO systems with transformed codebooks are provided and compared to those of the optimal quantizers. Numerical and simulation results are presented which confirm the tightness of the theoretical distortion bounds.
Design of MMSE filterbank precoder and equalizer for MIMO frequency selective channels
Vijaya Krishna Ananthapadmanabha (Indian Institute of Science, India); Kvs Hari (Indian Institute of Science, India)
In this paper, we consider the problem of designing minimum mean squared error (MMSE) filterbank precoder and equalizer for multiple input multiple output (MIMO) frequency selective channels. We derive the conditions to be satisfied by the optimal precoder-equalizer pair, and provide an iterative algorithm for solving them. The optimal design is very general, in that it is not constrained by channel dimensions, channel order, channel rank, or the input constellation. We also discuss some pertinent differences between the filterbank approach and the space-time approach to the design of optimal precoder and equalizer. Simulation results demonstrate that the proposed design performs better than the space-time systems while supporting a higher data rate.

### Poster: Radar Detection and Estimation - 9 papers

Room: Poster Area
Chair: Marco Martorella (University of Pisa, Italy)
Statistical method based on simultaneous diagonalisation for POLSAR image analysis
Salim Chitroub (Electronics and Computer Science Faculty, USTHB, Algeria)
In [1], we proposed a PCA-ICA neural network model for POLSAR image analysis. We propose here a new method that is fully based on an algebraic statistical formulation and is well justified from a mathematical point of view. The advantage of the new method is its ease of implementation: it only requires subroutines for inverse matrix computation and for the eigenvalue/eigenvector decomposition of symmetric square matrices. The PCA-ICA neural network model, in contrast, is very sensitive both to the probabilistic model of the data [2], [3] (super-Gaussian or sub-Gaussian) and to the power of the noise that corrupts the input data [1]; in addition, it requires more computation time in its learning process. The goal of this paper is thus to highlight the strengths of each method and thereby to open new issues, in future research, toward methods that combine the advantages of both while avoiding their disadvantages.
Optimum Permutation and Rank Detectors under K-Distributed Clutter in Radar Applications
In this paper, we present a comparative performance analysis of nonparametric (permutation and rank) detectors against parametric ones. Optimum permutation and rank tests are proposed for radar detection under K-distributed clutter, considering an ideal case of independent and identically distributed (IID) clutter samples and a more realistic case of a spherically invariant random process (SIRP) clutter model. The detector performance analysis was carried out for nonfluctuating and Swerling II target models by Monte Carlo simulations, and the results are shown as curves of detection probability versus signal-to-clutter ratio.
A performance comparison of two time diversity systems using CMLD-CFAR detection for partially correlated chi-square targets and multiple target situations
Toufik Laroussi (Université de Constantine, Algeria); Mourad Barkat (American University of Sharjah, UAE)
In radar systems, detection performance is always related to target models and background environments. In time diversity systems, the probability of detection is shown to be sensitive to the degree of correlation among the target echoes. In this paper, we derive exact expressions for the probabilities of false alarm and detection of a pulse-to-pulse partially correlated target with 2K degrees of freedom for the Censored Mean Level Detector Constant False Alarm Rate (CMLD-CFAR). The analysis is carried out for the "non-conventional time diversity system" (NCTDS). The obtained results are compared with those of the "conventional time diversity system" (CTDS) in both single and multiple target situations.
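The CMLD idea itself is simple to sketch: censor (discard) the largest reference cells before forming the adaptive threshold, so strong interfering targets in the reference window do not mask the cell under test. The scale factor and Monte Carlo estimate below are illustrative, not the paper's exact expressions:

```python
import random

def cmld_threshold_exceeded(cut, ref_cells, num_censored, scale):
    """CMLD-CFAR test: drop the num_censored largest reference cells,
    compare the cell under test (cut) with scale * (sum of the rest)."""
    kept = sorted(ref_cells)[:len(ref_cells) - num_censored]
    return cut > scale * sum(kept)

# Monte Carlo Pfa under homogeneous exponential (square-law) clutter
random.seed(1)
N, k, scale, trials = 16, 2, 0.5, 100_000
fa = sum(cmld_threshold_exceeded(random.expovariate(1.0),
                                 [random.expovariate(1.0) for _ in range(N)],
                                 k, scale)
         for _ in range(trials))
print(fa / trials)  # the resulting false-alarm rate for this scale factor
```

In a real design the scale factor is chosen to fix Pfa exactly, which is what the closed-form expressions in the paper provide for partially correlated targets.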
Performance evaluation of k out of n detector
Yaser Norouzi (Sharif University of Technology, Iran); Maria Greco (University of Pisa, Italy); Mohammad Mahdi Nayebi (Sharif University of Technology, Iran)
In this paper, the problem of k out of n detection within a sequence of M random bits is addressed. The main application of the result is in a radar system, when we want to detect a target with unknown time of arrival (TOA) using binary integration. Another application is in ESM systems, when the system wants to detect the existence of a swept jammer or a gated noise jammer. Simple equations are derived for calculating the Pfa and Pd of a detector which detects the existence of a sequence of n ones among M random bits. These equations are then used to find the optimal detector structure in some special cases. In addition, an approximate equation is derived for the general case of the k out of n detector, and it is shown that the approximation is accurate.
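The sliding k-out-of-n test can be sketched directly; here a Monte Carlo estimate stands in for the paper's closed-form Pfa equations, and all parameter values are illustrative:

```python
import random

def window_detect(bits, k, n):
    """k-out-of-n binary integration: fire if any length-n window
    of the bit sequence contains at least k ones."""
    ones = sum(bits[:n])
    if ones >= k:
        return True
    for i in range(n, len(bits)):
        ones += bits[i] - bits[i - n]   # slide the window by one bit
        if ones >= k:
            return True
    return False

# Monte Carlo false-alarm probability for M random bits with P(1) = p
random.seed(2)
p, M, k, n, trials = 0.1, 64, 3, 8, 50_000
pfa = sum(window_detect([random.random() < p for _ in range(M)], k, n)
          for _ in range(trials)) / trials
print(pfa)
```

The overlap between successive windows is what makes the exact Pfa non-trivial (the window events are strongly correlated), motivating the recursive equations and the approximation studied in the paper.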
Detection Performance for the GMF Applied to the STAP Data
Sébastien Maria (University of Rennes 1, IRISA, France); Jean-Jacques Fuchs (IRISA/Université de Rennes, France)
A major problem with most standard methods working on STAP data is their lack of robustness in the presence of a heterogeneous clutter background. Indeed, most of them rely on the assumption that the clutter remains homogeneous over quite a large range. We apply to the STAP data a high-resolution method called the Global Matched Filter (GMF). Since it models and identifies both the interferences (clutter and jammer(s)) and the target(s) from the data using only the snapshot of interest, it solves the above-mentioned difficulty. We describe here how to apply the GMF to STAP data and we compare its performance to other STAP methods by establishing the target detection probability for a constant false alarm rate.
Stepped frequency waveform design with application to SMR radar
Silvia Bruscoli (University of Pisa, Italy); Fabrizio Berizzi (University of Pisa, Italy); Marco Martorella (University of Pisa, Italy); Marcello Bernabò (Galileo Avionica, Italy)
The stepped frequency waveform is one of the most common synthetic bandwidth signals used in radar systems to increase range resolution. Range profiles are reconstructed by taking one sample per sweep and performing the Fourier Transform of the collected vector. Care must be taken in choosing the signal parameters to guarantee that the range profiling technique works properly. In this paper, a novel method for signal parameter setting is proposed and applied to Surface Movement Radar (SMR).
A Frequency Selection Method for HF-OTH Skywave Radar Systems
Amerigo Capria (University of Pisa, Italy); Fabrizio Berizzi (University of Pisa, Italy); Rocco Soleti (University of Pisa, Italy); Enzo Dalle Mese (University of Pisa, Italy)
The functionality of an HF-OTH skywave radar depends strongly on several factors. Firstly, the propagation channel is highly time-varying; secondly, external radio noise is particularly severe; and finally, spectrum availability is limited by the national frequency plan. In this paper we propose a frequency selection method that maximizes the signal-to-noise ratio. The proposed method attempts to overcome the aforementioned penalties by jointly inspecting the ionosphere and the external noise behaviour. The former is evaluated by means of simplified regional ionospheric models, while the latter is estimated by considering the recommendations given in ITU reports. A numerical simulation is used throughout the paper to explain the method.
A Neural Network Approach to Improve Radar Detector Robustness
Pilar Jarabo (University of Alcalá, Spain); David Mata-Moya (University of Alcalá, Spain); Manuel Rosa (University of Alcalá, Spain); Jose Nieto-Borge (Universidad de Alcalá de Henares, Spain); Francisco Lopez (University of Alcalá, Spain)
A NN-based detector is proposed for approximating the ALR detector in composite hypothesis-testing problems. The case of detecting Gaussian targets with Gaussian ACF and unknown one-lag correlation coefficient ($\rho_s$) in AWGN is considered. After proving the dependence of the simple hypothesis-testing LR detector on the assumed value of $\rho_s$, and the extreme complexity of the integral involved in the ALR detector, NNs are proposed as tools to approximate the ALR detector. NNs are not only capable of approximating this detector and its more robust performance with respect to $\rho_s$, but the implemented approximation is also expected to have a lower computational cost than other numerical approximations, a very important characteristic in real-time applications. MLPs of different sizes have been trained using a quasi-Newton algorithm to minimize the cross-entropy error. Results prove that MLPs with one hidden layer of 23 neurons can implement very robust detectors for $TSNR$ values lower than 10 dB.
Neural Network for Polarimetric Radar Target Classification
Rocco Soleti (University of Pisa, Italy); Leonardo Cantini (University of Pisa, Italy); Fabrizio Berizzi (University of Pisa, Italy); Amerigo Capria (University of Pisa, Italy); David Calugi (Galileo Avionica S.p.A., Italy)
In this paper, the Artificial Neural Network (ANN) paradigm is applied to radar target classification. Radar returns are simulated via an e.m. code and time-domain polarimetric target features are extracted by means of Prony's algorithm. Two different types of feedforward neural networks have been adopted in order to classify the target echo, namely the Multi Layer Perceptron (MLP) and the Self Organizing Map (SOM). The above-mentioned networks have been tested on two types of synthetic targets: a small tonnage ship with a low level of detail and a medium tonnage ship with higher detail. Each network has been trained on a wide range of signal-to-noise ratios and with different numbers of data records in order to assess the training-invariant properties of each network. Finally, in the validation phase a fixed number of records has been considered to evaluate the networks' performance, which is given in terms of classification error.

### Poster: Microphone Array Processing - 5 papers

Room: Poster Area
Chair: Elio Di Claudio (University of Rome La Sapienza, Italy)
An adaptive microphone array for optimum beamforming and noise reduction
Gerhard Doblinger (Vienna University of Technology, Austria)
We present a new adaptive microphone array efficiently implemented as a multi-channel FFT-filterbank. The array design is based on a minimum variance distortionless response (MVDR) optimization criterion. MVDR beamformer weights are updated for each signal frame using an estimated spatio-spectral correlation matrix of the environmental noise field. We avoid matrix inversion by means of an iterative algorithm for weight vector computation. The beamformer performance is superior to designs based on an assumed homogeneous diffuse noise field. The new design also outperforms LMS-adaptive beamformers at the expense of a higher computational load. Additional noise reduction is achieved with the well-known beamformer/postfilter combination of the optimum multi-channel filter. An Ephraim-Malah spectral amplitude modification with minimum statistics noise estimation is employed as a postfilter. Experimental results are presented using sound recordings in a reverberant noisy room.
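As a minimal sketch of the MVDR criterion underlying this design (not the paper's iterative, inversion-free filterbank implementation): for one frequency bin, the weights minimize output noise power subject to a distortionless constraint in the look direction. The array geometry and diffuse noise model below are assumed for illustration.

```python
import numpy as np

M, f, c, spacing = 4, 1000.0, 343.0, 0.05  # mics, bin frequency (Hz), sound speed, spacing (assumed)
theta = np.deg2rad(30.0)                   # look direction

# Steering vector for this frequency bin of the FFT filterbank
delays = np.arange(M) * spacing * np.sin(theta) / c
d = np.exp(-2j * np.pi * f * delays)

# Spatio-spectral noise correlation matrix: diffuse-field coherence + diagonal loading
mics = np.arange(M) * spacing
dist = np.abs(mics[:, None] - mics[None, :])
R = np.sinc(2 * f * dist / c) + 1e-3 * np.eye(M)

# MVDR weights: w = R^{-1} d / (d^H R^{-1} d); a direct solve is used here for clarity
Rinv_d = np.linalg.solve(R, d)
w = Rinv_d / (d.conj() @ Rinv_d)

print(abs(w.conj() @ d))  # distortionless constraint: unity gain toward theta
```

The same weight formula is applied per bin and per frame once R is re-estimated, which is where the adaptive behaviour of such a beamformer comes from.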
DOA estimation method for an arbitrary triangular microphone arrangement
Amin Karbasi (EPFL, Switzerland); Akihiko Sugiyama (NEC Corporation, Japan)
This paper proposes a new DOA (direction of arrival) estimation method for an arbitrary triangular microphone arrangement. Using the phase rotation factors for the cross-correlations between the adjacent-microphone signals, a general form of the integrated cross spectrum is derived. DOA estimation is reduced to a non-linear optimization problem of the general integrated cross spectrum. It is shown that a conventional DOA estimation for the equilateral triangular microphone arrangement is a special case of the proposed method. Sensitivity to the relative time-delay is derived in closed form. Simulation results demonstrate that the deviation of the estimation error in the case of 20 dB SNR is less than 1 degree, which is comparable to high resolution DOA estimation methods.
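The delay-to-angle relation that the paper generalizes to a triangular arrangement can be illustrated with a simplified two-microphone cross-spectrum sketch (GCC-PHAT); the sample rate, spacing and the integer-sample delay below are assumptions for the demonstration, not the paper's method.

```python
import numpy as np

fs, c, d = 16000.0, 343.0, 0.17    # sample rate, sound speed, mic spacing (assumed)
N = 1024
rng = np.random.default_rng(0)

x1 = rng.standard_normal(N)        # broadband source as seen at mic 1
x2 = np.roll(x1, 4)                # mic 2 receives it 4 samples later (circular delay)

# Cross-spectrum between the adjacent-microphone signals; PHAT weighting keeps only phase
X1, X2 = np.fft.rfft(x1), np.fft.rfft(x2)
G = np.conj(X1) * X2
cc = np.fft.irfft(G / np.abs(G), n=N)

lag = int(np.argmax(cc))           # relative time-delay in samples
theta = np.degrees(np.arcsin(lag / fs * c / d))
print(lag, round(theta, 1))        # -> 4 30.3
```

The paper's contribution is to combine three such pairwise cross-spectra with phase rotation factors so that an arbitrary (not only equilateral) triangle can be handled in one optimization.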
A new adaptation mode controller for adaptive microphone arrays based on nested and symmetric leaky blocking matrices
Thanh Phong Hua (Université de Rennes 1, France); Akihiko Sugiyama (NEC Corporation, Japan); Régine Le Bouquin-Jeannès (University of Rennes 1, France); Gérard Faucon (University of Rennes 1, France)
An adaptation mode controller (AMC) based on an estimation of signal-to-interference ratio using multiple blocking matrices for adaptive microphone arrays is proposed. A new nested blocking matrix enhances detection of the interference power. A normalized cross-correlation between symmetric leaky blocking-matrix outputs improves directivity. The detection of hissing sounds in the target speech is enhanced by modifying the high frequency components of the fixed beamformer output. Evaluations are carried out using a four-microphone array in a real environment with reverberations for different signal-to-interference ratios, interference directions of arrival, and target distances from the array. They show that the proposed AMC contributes to an enhanced output quality as well as an increased speech recognition rate by as much as 31% compared to the conventional AMC.
Determination of the Number of Wideband Acoustical Sources in a Reverberant Environment
Angela Quinlan (Trinity College Dublin, Ireland); Frank Boland (Trinity College Dublin, Ireland); Jean-Pierre Barbot (Ecole Normale Superieure de Cachan, France); Pascal Larzabal (SATIE ENS-Cachan, France)
This paper addresses the problem of determining the number of wideband sources in a reverberant environment. In [1] an Exponential Fitting Test (EFT) is proposed based on the exponential profile of the noise only eigenvalues. We consider the performance of this test for the problem in question, and compare it with the results achieved by the well known Akaike Information Criterion (AIC) and Minimum Description Length (MDL). Once reverberation is present in the received signals the EFT is seen to perform much better than the AIC and MDL.
Accuracy of Gauss-Laguerre Polar Monopulse Receiver
Elio Di Claudio (University of Rome La Sapienza, Italy); Giovanni Jacovitti (INFOCOM Dpt. University of Rome, Italy); Alberto Laurenti (Faculty of Engineering, University Campus Bio-medico, Italy)
In this paper, the CRB accuracy of a monopulse receiver parametrized by two angles (off-boresight and revolution), obtained by combining polar-separable and angularly periodic Gauss-Laguerre directivity patterns, is calculated and compared to the CRB of monopulse receivers based on Cartesian separable beams.

### Fri.6.2: Motion Estimation and Compensation - 5 papers

Room: Room 4
Chair: Beatrice Pesquet-Popescu (Ecole Nationale Superieure des Telecommunications, France)
Model-Based Robust Variational Method for Motion De-blurring
Takahiro Saito (Kanagawa University, Japan); Taishi Sano (Kanagawa University, Japan); Takashi Komatsu (Kanagawa University, Japan)
Once image motion is accurately estimated, we can utilize those motion estimates for image sharpening and remove motion blurs. First, this paper presents a variational motion de-blurring method using a spatially variant model of motion blurs. The standard variational method is not suitable for motion de-blurring, because it is sensitive to model errors, and errors are inevitable in motion estimation. To improve the robustness against model errors, we employ a nonlinear robust estimation function for measuring the energy to be minimized. Secondly, we experimentally compare the variational method with our previously presented PDE-based method that does not need any accurate blur model.
A Spatio-Temporal Competing Scheme for the Rate-Distortion Optimized Selection and Coding of Motion Vectors
Guillaume Laroche (France Telecom R&D, France); Joël Jung (France Telecom R&D, France); Beatrice Pesquet-Popescu (Ecole Nationale Superieure des Telecommunications, France)
The recent H.264/MPEG4-AVC video coding standard has achieved a significant bitrate reduction compared to its predecessors. High performance texture coding tools and 1/4-pel motion accuracy have however contributed to an increased proportion of bits dedicated to the motion information. The future ITU-T challenge, to provide a codec with 50% bitrate reduction compared to the current H.264, may result in even more accurate motion models. It is consequently of the highest interest to reduce the motion information cost. This paper proposes a competitive framework, with spatial and temporal predictors optimally selected by a rate-distortion criterion taking into account the cost of the motion vector and the predictor information. These new methods benefit from temporal redundancies in the motion field, where the standard spatial median usually fails. Compared to an H.264/MPEG4-AVC standard codec, a systematic bitrate saving reaching up to 20% for complex sequences is reported.
Block Matching-Based Motion Compensation with Arbitrary Accuracy Using Adaptive Interpolation Filters
Ichiro Matsuda (Science University of Tokyo, Japan); Kazuharu Yanagihara (Science University of Tokyo, Japan); Shinichi Nagashima (Science University of Tokyo, Japan); Susumu Itoh (Science University of Tokyo, Japan)
This paper proposes a motion-compensated prediction method which can compensate precise motions using adaptive interpolation filters. In this method, multiple non-separable 2D interpolation filters are optimized for each frame, and an appropriate combination of one of the filters and a motion vector with integer-pel accuracy is determined block-by-block. To reduce the amount of side information, the coefficients of each filter are downloaded to the decoder only when the filter turns out to be effective in a rate-distortion sense; otherwise the old filter applied to the previous frame is reused. Simulation results indicate that the proposed method provides a coding gain of up to 3.0 dB in PSNR compared to the conventional motion-compensated prediction method using motion vectors with 1/2-pel accuracy.
Reduced Computation using Adaptive Search Window Size for H.264 Multi-frame Motion Estimation
Ji Liangming (Hong Kong Polytechnic University, Hong Kong); Wan-Chi Siu (The Hong Kong Polytechnic University, Hong Kong)
In the new H.264/AVC video coding, motion estimation takes advantage of multiple reference frames to select the best matching position. The full process of searching up to 5 past reference frames leads to high complexity and places a heavy burden on the encoder. After a careful analysis, it is found that the motion vector obtained in the 1st reference frame can be used as a guide to adaptively set an appropriate search window size for the remaining reference frames. Meanwhile, if several small blocks have the same motion vector, the combined large block can use this motion vector as an initial point for further refinement within a small search window. Simulation results from JM9.6 show that the proposed algorithm can reduce the complexity to 10% of the original full search while keeping the PSNR drop below 0.02 dB with a bitrate increase of less than 1%.
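The idea of scaling the search window for later reference frames from the motion vector found in the first one can be sketched as follows; the clamping bounds and the linear scaling rule are illustrative assumptions, not the paper's exact decision logic.

```python
def adaptive_search_range(mv_first, ref_index, full_range=16, min_range=2):
    """Pick a search range for reference frame `ref_index` (1 = nearest past frame)
    from the motion vector (dx, dy) found in the first reference frame."""
    if ref_index == 1:
        return full_range                      # nearest frame: full search
    dx, dy = mv_first
    # Roughly linear motion model: displacement grows with temporal distance,
    # so a window of comparable size around the scaled predictor suffices.
    guess = max(abs(dx), abs(dy)) * ref_index
    return max(min_range, min(full_range, guess + min_range))

# Static block: a tiny window suffices for far references
print(adaptive_search_range((0, 0), 3))   # -> 2
# Fast-moving block: the window stays capped at the full range
print(adaptive_search_range((9, -4), 5))  # -> 16
```

The complexity saving comes from the first case: most blocks in typical sequences are nearly static, so their far-reference searches collapse to a few candidate positions.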
Estimation of motion blur point spread function from differently exposed image frames
Marius Tico (Nokia Research Center, Finland); Markku Vehvilainen (Nokia Research Center, Finland)
In this paper we investigate the problem of recovering the motion blur point spread function (PSF) by fusing the information available in two differently exposed image frames of the same scene. The proposed method exploits the difference between the degradations which affect the two images due to their different exposure times. One of the images is mainly affected by noise due to low exposure, whereas the other one is mainly affected by motion blur caused by camera motion during the exposure time. Assuming certain models for the observed images and the blur PSF, we propose a maximum a posteriori (MAP) estimator of the motion blur. The experimental results show that the proposed method has the ability to estimate the motion blur PSF caused by rather complex motion trajectories, allowing a significant increase in the signal-to-noise ratio of the restored image.
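A crude 1-D frequency-domain sketch of the two-frame idea (not the paper's MAP estimator): treating the short-exposure frame as a noiseless proxy for the sharp scene, the blur PSF can be deconvolved from the ratio of spectra with Tikhonov regularization. The signal length, the 3-tap box blur, and the regularization constant are assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 256
x = rng.standard_normal(N)            # short-exposure frame: proxy for the sharp scene

# Long-exposure frame: circular convolution with a 3-tap box motion blur
h_true = np.zeros(N)
h_true[:3] = 1.0 / 3.0
y = np.real(np.fft.ifft(np.fft.fft(x) * np.fft.fft(h_true)))

# Least-squares PSF estimate in the Fourier domain with Tikhonov regularization
X, Y = np.fft.fft(x), np.fft.fft(y)
H = Y * np.conj(X) / (np.abs(X) ** 2 + 1e-8)
h_est = np.real(np.fft.ifft(H))

print(np.round(h_est[:4], 3))  # first taps of the recovered PSF
```

The paper's MAP formulation additionally models the noise in the short exposure and imposes a prior on the PSF, which is what makes estimation of complex motion trajectories feasible in practice.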

### Fri.4.2: Principal Components Analysis - 5 papers

Room: Sala Onice
Chair: Frederic Kerem Harmanci (Bogazici University, Turkey)
Bayesian estimation of the number of principal components
Abd-krim Seghouane (National ICT Australia, Australia)
Recently, the technique of principal component analysis (PCA) has been expressed as the maximum likelihood solution for a generative latent variable model. A central issue in PCA is choosing the number of principal components to retain. This can be considered as a problem of model selection. In this paper, the probabilistic reformulation of PCA is used as a basis for a Bayesian approach to PCA to derive a model selection criterion for determining the true dimensionality of the data. The proposed criterion is similar to the Bayesian Information Criterion, BIC, with a particular goodness-of-fit term, and it is consistent. A simulation example that illustrates its performance for the determination of the number of principal components to be retained is presented.
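The model-selection idea can be sketched under the probabilistic PCA formulation (Tipping-Bishop maximum likelihood solution plus a plain BIC penalty); the data dimensions and the exact penalty term are illustrative assumptions, not the paper's derived criterion.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, k_true = 500, 10, 3

# Latent-variable data: 3 strong directions plus isotropic noise
W = rng.standard_normal((d, k_true))
X = rng.standard_normal((n, k_true)) @ W.T + 0.1 * rng.standard_normal((n, d))

lam = np.linalg.eigvalsh(np.cov(X.T))[::-1]          # sample eigenvalues, descending

def bic(k):
    # Probabilistic-PCA ML fit: noise variance = mean of the discarded eigenvalues
    sigma2 = lam[k:].mean()
    loglik = -0.5 * n * (d * np.log(2 * np.pi)
                         + np.sum(np.log(lam[:k])) + (d - k) * np.log(sigma2) + d)
    n_params = d * k - k * (k - 1) // 2 + 1          # free parameters of W plus sigma2
    return -2 * loglik + n_params * np.log(n)

best_k = min(range(1, d), key=bic)
print(best_k)  # -> 3
```

The penalty term keeps the criterion from always preferring larger k, which is what makes the selected dimensionality consistent as the sample size grows.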
Unequal Error Protection of PCA-Coded Face Images for Transmission over Mobile Channels
Aykut Hocanin (Eastern Mediterranean University, Turkey); Hasan Demirel (Eastern Mediterranean University, Turkey); Sabina Hosic (Eastern Mediterranean University, Cyprus)
In this paper, a system for reliable communication of coded grey-level face images over noisy channels is proposed. Principal Component Analysis (PCA) is used for face image coding and an unequal error protection (UEP) joint source-channel coding scheme is proposed for mobile communication applications. Coded images are protected with convolutional codes for transmission over mobile channels. Recognition rates obtained with the UEP system approach the 95% recognition performance for images from the ORL face database.
Separability of convolutive mixtures based on Wiener filtering and a mutual information criterion
Moussa Akil (Laboratoire des Images et des Signaux, France); Christine Servière (Laboratoire des Images et des Signaux, France)
In this paper, we focus on convolutive mixtures expressed in the time domain. We present a method based on the minimization of mutual information using Wiener filtering. Separation is known to be obtained by testing the independence between delayed outputs. This criterion can be much simplified, and we prove that testing the independence between the contributions of all sources on the same sensor at the same time index also leads to separability. We recover the contributions by using Wiener filtering (or the Minimal Distortion Principle), which is included in the separation procedure. Independence is tested here with mutual information, minimized only for non-delayed outputs of the Wiener filters. The test is easier and shows good results in simulations.
Blind Identification of MIMO-OSTBC Channels Combining Second and Higher Order Statistics
Javier Vía (University of Cantabria, Spain); Ignacio Santamaria (University of Cantabria, Spain); Jesus Perez (University of Cantabria, Spain)
It has been recently shown that some multiple-input multiple-output (MIMO) channels under orthogonal space-time block coding (OSTBC) transmissions cannot be unambiguously identified by exploiting only the second order statistics (SOS) of the received signal. This ambiguity, which is due to properties of the OSTBC, translates into the fact that the largest eigenvalue of the associated eigenvalue problem has multiplicity larger than one. Fortunately, for most OSTBCs that produce ambiguity, the multiplicity is two. This means that the channel estimate lies in a rank-2 subspace, which can be easily determined by applying a first principal component analysis (PCA) step. To eliminate the remaining ambiguity, we propose to apply a constant modulus algorithm (CMA). This combined PCA+CMA approach provides an effective solution for the blind identification of those OSTBCs that cannot be identified using only SOS. Some simulation results are presented to show the performance of the proposed method.
Robustness of Phoneme Classification in Different Representation Spaces
Lena Khoo (King's College London, United Kingdom); Zoran Cvetkovic (King's College London, United Kingdom); Peter Sollich (King's College London, United Kingdom)
The robustness to additive noise of phoneme recognition using support vector machines is investigated for three kinds of speech representation. The representations considered are PLP, PLP with RASTA processing, and a high-dimensional principal component approximation of acoustic waveforms of speech. While the classification in the PLP and PLP/RASTA domains attains superb accuracy on clean data, the classification in the high-dimensional space proves to be much more robust to additive noise.

### Fri.3.2: Multi-user MIMO Communications II (Special session) - 5 papers

Room: Sala Verde
Chair: Christoph Mecklenbräuker (FTW, Austria)
Transmit and Receive Antenna Subset Selection for MIMO SC-FDE in Frequency Selective Channels
Andreas Wilzeck (University Duisburg-Essen, Germany); Patrick Pan (University Duisburg-Essen, Germany); Thomas Kaiser (University of Duisburg-Essen, Germany)
Antenna (subset) selection is a feasible scheme to reduce the hardware complexity of Multiple-Input Multiple-Output (MIMO) systems. Studies of antenna selection schemes are typically based on channel capacity optimizations employing frequency-flat channel models, which is inconsistent with MIMO systems employing spatial multiplexing. Such systems aim to offer high data-rate transmission, so the channel is frequency selective in nature. In this contribution we study antenna subset selection at the transmitter and receiver side for the MIMO Single Carrier (SC) scheme with Frequency Domain Equalization (FDE) in frequency selective channels. As an alternative selection metric, the signal quality at the output of the MIMO equalizer is used.
Non-Linear Precoding for MIMO Multi-User Downlink Transmissions with different QoS requirements
Luca Sanguinetti (University of Pisa, Italy); Michele Morelli (University of Pisa, Italy)
An efficient non-linear pre-filtering technique based on Tomlinson-Harashima pre-coding (THP) has recently been proposed by Liu and Krzymien for multiple antenna multiuser systems. The algorithm is based on the Zero-Forcing (ZF) criterion and assumes a number of transmit antennas equal to the number of active users. In contrast to other methods, it ensures a fair treatment of the active users, providing them with the same signal-to-noise ratio. In multimedia applications, however, several types of information with different quality-of-service (QoS) requirements must be supported. Motivated by the above problem, in the present work we design a ZF THP-based pre-filtering algorithm for multiple antenna multi-user networks in which the base station allocates the transmit power according to the QoS requirement of each active user. In doing so, we consider a system in which the number of active users may be less than the number of transmit antennas. As we will see, in such a case there exists an infinite number of solutions satisfying the ZF criterion. We address the problem of finding the best one, using as optimality criterion the maximization of the signal-to-noise ratios at all mobile terminals.
Levenberg-Marquardt Computation of the Block Factor Model for Blind Multi-user Access in Wireless Communications
Dimitri Nion (ETIS Laboratory, UMR 8051 (CNRS, ENSEA, UCP), France); Lieven de Lathauwer (E.E. Dept. (ESAT) - SCD-SISTA, Belgium)
In this paper, we present a technique for the blind separation of DS-CDMA signals received on an antenna array, for a multi-path propagation scenario with Inter-Symbol-Interference. Our method relies on a new third-order tensor decomposition, which is a generalization of the parallel factor model. We start with the observation that the temporal, spatial and spectral diversities give a third-order tensor structure to the received data. This tensor is then decomposed in a sum of contributions, where each contribution fully characterizes one user. We also present an algorithm of the Levenberg-Marquardt type for the calculation of this decomposition. This method is faster than the alternating least squares algorithm previously used.
A Dynamic Antenna Scheduling Strategy for Multi-User MIMO Communications
Mirette Sadek (University of California, Los Angeles, USA); Alireza Tarighat (University of California. Los Angeles (UCLA), USA); Ali Sayed (University of California, Los Angeles, USA)
The paper develops a dynamic antenna scheduling strategy for downlink MIMO communications, where the transmitted signal for each user is beamformed towards a selected subset of receive antennas at that user. It is shown in this paper, both analytically and through simulations, that increasing the number of antennas at one user degrades the SINR performance of the other users in the network. This fact is then exploited to improve the system's performance in terms of the target SINR outage probability for each user. Using the SINR outage criterion allows us to lower the number of targeted antennas at a particular user if that user is already meeting its target SINR. By doing so, other users in the network originally below their target SINR can achieve their target SINR as well. In other words, the antenna scheduling scheme aims at maximizing the number of users meeting their target SINR values by dynamically changing the active antenna subsets for every user.
Linear Detectors for multi-user MIMO systems with correlated spatial diversity
Laura Cottatellucci (University of South Australia, Australia); Ralf Mueller (Norwegian University of Science and Technology, Norway); Merouane Debbah (Institut Eurecom, France)
A multiuser CDMA system with both the transmitting and the receiving sites equipped with multiple antenna elements is considered. The multiuser MIMO channel is correlated at the transmitting and the receiving sites. Multistage detectors achieving near linear-MMSE performance, with a complexity order per bit linear in the number of users, are proposed. The large system performance is analyzed in a general framework including any multiuser detector that admits a multistage representation. The performance of this large class of detectors is independent of the channel correlation at the transmitter. It depends on the direction of the channel gain vector of the user of interest if the channel gains are correlated.

## 2:10 PM - 3:10 PM

### Plenary: Signal Processing Across the Layers in Wireless Networks

Room: Auditorium
Chair: H. Vincent Poor (Princeton University, USA)

## 3:10 PM - 4:50 PM

### Fri.2.3: Features Extraction - 5 papers

Chair: Patrizio Campisi (University of ROMA TRE, Italy)
Towards Multiple-Orientation Based Tensor Invariants for Object Tracking
Nicolaj Stache (RWTH Aachen University, Germany); Thomas Stehle (RWTH Aachen University, Germany); Matthias Mühlich (RWTH Aachen University, Germany); Til Aach (RWTH Aachen University, Germany)
We derive a new scale- and rotation-invariant feature for characterizing local neighbourhoods in images, which is applicable in tasks such as tracking. Our approach is motivated by the estimation of optical flow. Its least-squares estimate requires the inversion of a symmetric and positive semi-definite 2x2 tensor, which is computed from spatial image derivatives. Only if one eigenvalue of the tensor vanishes does this tensor describe the local neighbourhood in terms of a single orientation. Estimating optical flow, however, requires that this tensor be regular, i.e., that neither of its eigenvalues vanishes; this indicates that the local region contains more than one orientation. Double-orientation neighbourhoods (like X junctions or corners) are especially suited for tracking or optical flow estimation, but the two underlying orientations cannot be extracted from the standard structure tensor. Therefore, we extend this tensor such that it can characterize double-orientation neighbourhoods. From this extended tensor, we derive a rotation- and scale-invariant feature which describes the orientation structure of the local regions, and analyze its performance.
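The eigenvalue test on the 2x2 structure tensor described in this abstract can be illustrated on synthetic patches (a minimal sketch; the paper's extended tensor for recovering the two orientations is not reproduced here).

```python
import numpy as np

def tensor_eigenvalues(img):
    """Accumulate the 2x2 tensor of outer products of image gradients over a patch."""
    gy, gx = np.gradient(img)
    J = np.array([[np.sum(gx * gx), np.sum(gx * gy)],
                  [np.sum(gx * gy), np.sum(gy * gy)]])
    return np.linalg.eigvalsh(J)     # eigenvalues in ascending order

s = np.sin(np.linspace(0, 4 * np.pi, 32))
stripes = np.tile(s, (32, 1))        # single orientation: intensity varies along x only
corner = np.outer(s, s)              # two orientations present

lo1, hi1 = tensor_eigenvalues(stripes)
lo2, hi2 = tensor_eigenvalues(corner)
print(lo1 < 1e-9, lo2 > 1.0)  # -> True True
```

For the single-orientation patch the smaller eigenvalue vanishes (the tensor is singular), while for the double-orientation patch both eigenvalues are large, which is exactly the regularity condition optical flow estimation needs.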
Bayesian High Priority Region Growing for Change Detection
Ilias Grinias (University of Crete, Greece); Georgios Tziritas (University of Crete, Greece)
In this paper we propose a new method for image segmentation. The new algorithm is applied to the video segmentation task, where the localization of moving objects is based on change detection. The change detection problem in the pixel domain is formulated by two zero-mean Laplacian distributions. The new method follows the concept of the well-known Seeded Region Growing technique, while it is adapted to the statistical description of change-detection-based segmentation, using Bayesian dissimilarity criteria in a way that leads to a linear computational cost of growing.
Rotation Invariant Object Recognition Using Edge-Profile Clusters
Ryan Anderson (University of Cambridge, United Kingdom); Nick Kingsbury (University of Cambridge, United Kingdom); Julien Fauqueur (University of Cambridge, United Kingdom)
This paper introduces a new method to recognize objects at any rotation using clusters that represent edge profiles. These clusters are calculated from the Interlevel Product (ILP) of complex wavelets whose phases represent the level of edginess vs ridginess of a feature, a quantity that is invariant to basic affine transformations. These clusters represent areas where ILP coefficients are large and of similar phase; these are two properties which indicate that a stable, coarse-level feature with a consistent edge profile exists at the indicated locations. We calculate these clusters for a small target image, and then seek these clusters within a larger search image, regardless of their rotation angle. We compare our method against SIFT for the task of rotation-invariant matching in the presence of heavy Gaussian noise, where our method is shown to be more noise-robust. This improvement is a direct result of our new edge-profile clusters' broad spatial support and stable relationship to coarse-level image content.
Contour detection by surround inhibition in the Circular Harmonic Functions domain
Giuseppe Papari (University of Groningen, The Netherlands); Patrizio Campisi (University of ROMA TRE, Italy); Nicolai Petkov (University of Groningen, The Netherlands); Alessandro Neri (Università degli Studi "Roma TRE", Italy)
Standard edge detectors react to all non-negligible luminance changes in an image, irrespective of whether they originate from object contours or from texture (e.g. grass, foliage, waves, etc.). Texture edges are often stronger than object contours, so standard edge detectors fail to isolate object contours from texture. We propose a multiresolution contour detector, operating in the Circular Harmonic Function domain and motivated by biological principles. At each scale, texture is suppressed by using a bilateral surround inhibition process, applied after non-maxima suppression. The binary contour map is obtained by a contour-oriented thresholding algorithm, proved to be more effective than classical hysteretic thresholding. Robustness to noise is achieved by a Bayesian gradient estimation.
A mutual information approach to contour based object tracking
Evangelos Loutas (University of Thessaloniki, Greece); Nikos Nikolaidis (Aristotle University of Thessaloniki, Greece); Ioannis Pitas (Aristotle University of Thessaloniki, Greece)
An object tracking scheme based on mutual information is proposed in this paper. First, coarse tracking is performed using mutual information maximization. Subsequently, the tracking output is refined by the use of a deformable contour scheme based on image gradient and mutual information, which allows the tracking system to capture the variation of the tracked object's contour. The scheme was tested on hand sequences created for testing gesture recognition algorithms under difficult illumination conditions and was found to perform better than a scheme based on the Kullback-Leibler distance and a scheme based on gradient information.
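The coarse tracking criterion, mutual information between the reference template and a candidate region, can be sketched from a joint histogram; the bin count and the synthetic patches below are illustrative assumptions.

```python
import numpy as np

def mutual_information(a, b, bins=16):
    """MI between two equally sized image patches, from their joint histogram."""
    joint, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)
    nz = pxy > 0                     # avoid log(0) on empty histogram cells
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))

rng = np.random.default_rng(0)
template = rng.random((32, 32))
aligned = template + 0.05 * rng.random((32, 32))   # candidate window on the object
background = rng.random((32, 32))                  # candidate window off the object

mi_on = mutual_information(template, aligned)
mi_off = mutual_information(template, background)
print(mi_on > mi_off)  # -> True
```

Maximizing this quantity over candidate positions gives the coarse object location, which the deformable contour stage then refines.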

### Fri.5.3: Human Face Synthesis and Detection - 5 papers

Chair: Bulent Sankur (Bogazici University, Turkey)
Face verification with a MLP and BCH output codes
Marcos Faundez-Zanuy (Escola Universitaria Politecnica de Mataro, Spain)
This paper studies several classifiers based on Multi-layer Perceptrons (MLP) for face verification. We use the Discrete Cosine Transform (DCT) instead of the eigenfaces method for feature extraction. Experimental results using a Nearest Neighbour classifier show a minimum Detection Cost Function (DCF) of 1.76% when using DCT, and 7.14% when using eigenfaces. We also study several MLP architectures, and we get better accuracies when using Bose-Chaudhuri-Hocquenghem (BCH) codes. In this case, we reduce the minimum DCF to 0.97% when using DCT feature extraction.
Illumination-Invariant Face Identification Using Edge-Based Feature Vectors in Pseudo-2D Hidden Markov Models
Yasufumi Suzuki (The University of Tokyo, Japan); Tadashi Shibata (The University of Tokyo, Japan)
A pseudo-2D Hidden Markov Model-based face identification system employing an edge-based feature representation has been developed. In HMM-based face recognition algorithms, the 2D discrete cosine transform (DCT) is often used for generating feature vectors. However, DCT-based feature representations are not robust against illumination changes. In order to enhance the robustness against illumination conditions, an edge-based feature representation has been employed. This edge-based feature representation has already been applied to robust face detection in our previous work and is compatible with processing in the dedicated VLSI hardware system which we have developed for real-time performance. The robustness against illumination change of the pseudo-2D HMM-based face identification system has been demonstrated using both the AT&T face database and the Yale face database B.
On Combining Evidence For Reliability Estimation In Face Verification
Krzysztof Kryszczuk (Swiss Federal Institute of Technology Lausanne (EPFL), Switzerland); Andrzej Drygajlo (EPFL Lausanne, Switzerland)
Face verification is a difficult classification problem due to the fact that the appearance of a face can be altered by many extraneous factors, including head pose, illumination conditions, etc. A face verification system is likely to produce erroneous, unreliable decisions if there is a mismatch between the image acquisition conditions during the system training and testing phases. We propose to detect and discard unreliable decisions based on evidence originating from the classifier score and signal domains. We present a method of combining the reliability evidence, nested in a probabilistic framework that allows a high level of flexibility in adding new evidence. Finally, we demonstrate on a standard evaluation database (Banca) how the proposed methodology helps in discarding unreliable decisions in a face verification system.
Multi-View Face Detection and Pose Estimation Employing Edge-Based Feature Vectors
Daisuke Moriya (The University of Tokyo, Japan); Yasufumi Suzuki (The University of Tokyo, Japan); Tadashi Shibata (The University of Tokyo, Japan); Masakazu Yagi (Osaka University, Japan); Kenji Takada (Osaka University, Japan)
A multi-view face detection and pose estimation system has been developed employing edge-based feature representations. Using posed face images at four angles (0, 30, 60, and 90 degrees) as templates, a pose estimation performance of about 80% has been achieved for test images over the entire 0-90 degree range. In order to further enhance the performance, the concept of focus-of-attention (FOA) has been introduced into the vector generation: edge-based feature vectors are generated from a restricted area mostly containing the essential information of facial images. As a result, the detection rate has been enhanced to more than 90% for profile images, which is difficult to achieve when the original edge-based vectors are used.
A comparison of data representation types, features types and fusion techniques for 3D face biometry
Helin Dutagaci (Bogazici University, Turkey); Bulent Sankur (Bogazici University, Turkey); Yücel Yemez (Koc University, Turkey)
This paper focuses on the problems of person identification and authentication using registered 3D face data. The face surface geometry is represented alternately as a point cloud, a depth image, or voxel data. Various local or global feature sets are extracted, such as DFT/DCT coefficients and ICA and NMF projections, which results in a rich repertoire of representations/features. The identification and authentication performance of the individual schemes is compared. Fusion schemes are invoked to improve performance, especially when there are only a few samples per subject.

### Fri.1.3: Digital Signal Processing for UWB Applications II (Invited special session) - 4 papers

Room: Auditorium
Chair: Umberto Mengali (University of Pisa, Italy)
Reduced Memory Modeling and Equalization of Second order FIR Volterra Channels in Non-coherent UWB Systems
Jac Romme (IMST GmbH, Germany); Klaus Witrisal (Graz University of Technology, Austria)
This paper investigates a combination of two approaches to obtain high-data-rate UWB communication over multipath radio channels using low-complexity, non-coherent receivers. The first approach aims to equalize the occurring non-linear ISI using trellis-based equalization, while the second aims to reduce or even avoid ISI by dividing the spectral resources into (a few) sub-bands. Combining both concepts allows for a complexity trade-off between the equalizer and the RF front-end. Firstly, a reduced-memory data model is introduced for the non-linear sub-band channels, optimal in the sense of the MMSE criterion. This model is used to study the relationship between equalizer complexity and performance. The second part of the paper investigates the performance of the complete system, before and after forward error control. The system uses QPSK-TR signaling, but the key concepts are applicable to other non-coherent UWB systems as well.
Multi-Target Estimation of Heart and Respiration Rates Using Ultra Wideband Sensors
Natalia Rivera (Virginia Tech - Wake Forest University, USA)
Vital-signs monitoring devices continue to utilize invasive sensing methodologies, ranging from skin-surface contact techniques such as the use of electrodes for measuring cardiac signals (ECG test), to more intrusive techniques such as the use of a facial mask for measuring gas exchange during respiration. In this paper, we present a wireless radar technique based on ultra-wideband (UWB) technology for non-invasive monitoring of heart and respiration rates. Our technique is based on the detection of chest-cavity motion through the measurement of UWB signal displacements due to this motion. We show that the technique provides accurate results even in the presence of multiple subjects. Specifically, we investigate two techniques for estimating breathing and heart rates in the presence of multiple subjects: (1) the use of clustering algorithms to isolate the combined position and breathing/heart rate of multiple subjects, and (2) the use of MUSIC to accurately estimate only the rates. Results are based on measurements from experiments with multiple subjects in a laboratory setting.
Analysis of Threshold-Based TOA Estimators in UWB Channels
Davide Dardari (University of Bologna, Italy); Chia-Chin Chong (NTT DoCoMo USA Labs, USA); Moe Win (Massachusetts Institute of Technology, USA)
In this paper we analyze and compare the performance of matched filter (MF) and energy detector (ED) time-of-arrival estimators based on thresholding in ultra-wide bandwidth (UWB) dense multipath channels. Closed-form expressions for the estimator bias and mean square error (MSE) are derived as a function of the signal-to-noise ratio using a unified methodology. A comparison with results based on Monte Carlo simulation confirms the validity of our analytical approach. In addition, results based on experimental measurements in an indoor residential environment are presented. Our analysis enables us to determine the threshold value that minimizes the MSE, a critical parameter for optimal estimator design. It is shown that the estimation accuracy mainly depends on large estimation errors, due to peak ambiguities caused by multipath at the output of the MF or ED, and on the fading statistics of the first path. The performance loss of ED estimators with respect to MF-based ones is also evaluated.
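To make the energy-detector branch concrete, here is a minimal sketch of threshold-based TOA estimation with an ED: the received signal is sliced into blocks, and the TOA estimate is the start of the first block whose energy crosses the threshold. Everything here (block size, threshold, pulse shape) is an invented toy setting, not the paper's model:

```python
import numpy as np

def ed_toa_estimate(r, fs, block, threshold):
    """Energy-detector TOA: return the start time (in seconds) of the
    first length-`block` segment whose energy exceeds `threshold`,
    or None if no segment does."""
    n_blocks = len(r) // block
    energies = np.array([np.sum(r[i*block:(i+1)*block]**2)
                         for i in range(n_blocks)])
    hits = np.flatnonzero(energies > threshold)
    return None if hits.size == 0 else hits[0] * block / fs

# Toy received signal: noise, then a strong pulse starting at sample 200
rng = np.random.default_rng(0)
fs = 1e3
r = 0.05 * rng.standard_normal(1000)
r[200:220] += 1.0
toa = ed_toa_estimate(r, fs, block=20, threshold=1.0)  # 0.2 s
```

Setting the threshold too low produces early false alarms on noise peaks, while setting it too high causes missed or late detections; the paper's closed-form MSE expressions are what allow the minimizing value to be chosen analytically.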
How to Efficiently Detect Different Data-Rate Communications in Multiuser Short-Range Impulse Radio UWB Systems
Simone Morosi (University of Firenze, Italy); Tiziano Bianchi (University of Florence, Italy)
Low and high data-rate applications can be foreseen for future ultra-wideband systems based on impulse radio, and proper detection schemes have to be designed for the most general scenarios. In this paper, an innovative frequency-domain detection strategy is tested in two different indoor short-range communication scenarios where several mobile terminals transmit low or high data-rate flows to a base station. Both Zero Forcing (ZF) and Minimum Mean Square Error (MMSE) criteria have been investigated and compared with the classical RAKE. The results show that the proposed approach is well suited to the considered scenarios.

### Poster: Signal Detection and Estimation - 10 papers

Room: Poster Area
Chair: Fulvio Gini (University of Pisa, Italy)
Adaptive Threshold Nonlinear Correlation Algorithm for Robust Filtering in Impulsive Noise Environments
Shin'ichi Koike (Consultant, Japan)
In this paper, we first present mathematical models for two types of impulse noise in adaptive filtering systems: one in the additive observation noise and another at the filter input. To combat such impulse noise, a new algorithm named the Adaptive Threshold Nonlinear Correlation Algorithm (ATNCA) is proposed. Through analysis and experiment, we demonstrate the effectiveness of the ATNCA in making adaptive filters highly robust in the presence of both types of impulse noise while achieving convergence as fast as the LMS algorithm. Fairly good agreement between simulated and theoretical convergence behavior, in both the transient phase and steady state, proves the validity of the analysis.
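The general mechanism shared by many robust adaptive filters is to pass the error through a bounded nonlinearity before the tap update, so that a single impulse cannot throw off the filter; ATNCA's defining feature is that the threshold itself adapts. A generic fixed-threshold sketch (the plant, step size, and clipping level below are invented for illustration and are not the paper's algorithm):

```python
import numpy as np

def clipped_lms(x, d, taps, mu, clip):
    """LMS identification of `taps` filter coefficients where the error
    driving each update is hard-limited to [-clip, clip]."""
    w = np.zeros(taps)
    for n in range(taps - 1, len(x)):
        u = x[n - taps + 1:n + 1][::-1]        # current regressor
        e = d[n] - w @ u
        w += mu * np.clip(e, -clip, clip) * u  # bounded update per sample
    return w

# Identify a 3-tap plant under 1% impulsive observation noise
rng = np.random.default_rng(4)
h = np.array([0.8, -0.4, 0.2])
x = rng.standard_normal(4000)
d = np.convolve(x, h)[:len(x)]
d += (rng.random(len(x)) < 0.01) * 50.0 * rng.standard_normal(len(x))
w = clipped_lms(x, d, taps=3, mu=0.01, clip=1.0)   # w stays close to h
```

Without the clipping, each impulse of amplitude ~50 would perturb the taps by roughly fifty times as much; with it, the impulse contributes at most a unit-sized error to the update.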
Joint segmentation of multivariate Poissonian time series. Application to Burst and Transient Source Experiments
Nicolas Dobigeon (IRIT/ENSEEIHT/TéSA, France); Jean-Yves Tourneret (IRIT/ENSEEIHT/TéSA, France); Jeffrey D. Scargle (Space Science Division, NASA, USA)
This paper addresses the problem of detecting significant intensity variations in multiple Poissonian time series. This detection is achieved by using a constant Poisson rate model and a hierarchical Bayesian approach. An appropriate Gibbs sampling strategy allows joint estimation of the unknown parameters and hyperparameters. An extended model that includes constraints on the segment lengths is also proposed. Simulations performed on synthetic and real data illustrate the performance of the proposed algorithm.
On Interval Estimation for the Number of Signals
Pinyuen Chen (Syracuse University, USA)
We propose a multi-step procedure for constructing a confidence interval for the number of signals present. The proposed procedure uses the ratios of a sample eigenvalue and the sum of different sample eigenvalues sequentially to determine the upper and lower limits for the confidence interval. A preference zone in the parameter space of the population eigenvalues is defined to separate the signals and the noise. We derive the probability of a correct estimation, P(CE), and the least favorable configuration (LFC) asymptotically under the preference zone. Some important procedure properties are shown. Under the asymptotic LFC, the P(CE) attains its minimum over the preference zone in the parameter space of all eigenvalues. Therefore, a minimum sample size can be determined in order to implement our procedure with a guaranteed probability requirement.
An Improved Iterative Method to Compensate for 1-D Signal Interpolation Distortion of Each Kind
Ali Akhaee (EE Dept of Sharif University of Technology, Tehran, Iran)
We propose a new method to compensate for the distortion of any interpolation function. This is a hybrid method based on the iterative method proposed by one of the authors, in which modular harmonics are utilized instead of a simple lowpass filter. The hybrid method is further improved by the Chebyshev acceleration algorithm. The proposed technique drastically improves the convergence rate with less computational complexity, while remaining robust to additive noise. This method can be used on any 1-D signal that must be interpolated during processing.
On Detection of a Real Sinusoid with Short Data Record
Hing Cheung So (City University of Hong Kong, Hong Kong); Chin Tao Chan (City University of Hong Kong, Hong Kong); Kit Wing Chan (City University of Hong Kong, Hong Kong)
In this paper, we investigate two binary detection problems for a single real tone in additive white Gaussian noise using short data records. In the first hypothesis-testing scenario, we decide if a sinusoid is present in the received signal where both cases of known and unknown sinusoidal frequency will be examined. In the second problem, differentiation between two distinct noisy tones is considered. Simulation results show that the nonlinear least squares and maximum likelihood methods give identical detection performance and they outperform the periodogram approach.
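As a reference point, the periodogram detector for an unknown-frequency tone can be sketched in a few lines; the record length, tone frequency, noise level, and threshold below are illustrative choices, not values from the paper:

```python
import numpy as np

def periodogram_detect(x, threshold):
    """Declare a tone present when the peak of the periodogram,
    normalized by the sample power, exceeds the threshold."""
    N = len(x)
    P = np.abs(np.fft.rfft(x))**2 / N
    stat = P.max() / np.mean(x**2)
    return stat > threshold, stat

rng = np.random.default_rng(1)
n = np.arange(16)                          # short data record
noise = 0.05 * rng.standard_normal(16)
tone = np.cos(2 * np.pi * 0.25 * n)        # real sinusoid
flag_h1, stat_h1 = periodogram_detect(tone + noise, threshold=6.0)
flag_h0, stat_h0 = periodogram_detect(noise, threshold=6.0)
```

On short records the periodogram's coarse frequency grid is exactly what costs it performance relative to the NLS and ML detectors studied in the paper.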
Parameter estimation of multicomponent quadratic FM signals using computationally efficient Radon-CPF transform
Pu Wang (University of Electronic Science and Technology of China, P.R. China); Jianyu Yang (School of Electronic Engineering, University of Electronic Science and Technology of China, P.R. China)
The identifiability problem of the cubic phase function (CPF) for multicomponent quadratic FM (QFM) signals is examined both in theory and in simulations. A computationally efficient technique based on the one-dimensional (1-D) Radon-CPF transform (RCT) is proposed for this problem. The RCT first reassigns the time-frequency rate distribution (TFRD) with the information of the second-order coefficient, and then implements the 1-D Radon transform over the angle variable only. Although this technique is intended for multicomponent QFM signals, it is also efficient for monocomponent signals. The simulation results verify the proposed method both in illustrative examples and in performance evaluations.
Common Pole Estimation with an Orthogonal Vector Method
Rémy Boyer (CNRS, Université Paris-Sud (UPS), Supelec, France); Guillaume Bouleux (LSS, France); Karim Abed-Meraim (Dept TSI, Télécom Paris, France)
In some applications, such as biomedical analysis, we encounter the problem of estimating the common poles (angular frequency and damping factor) in a multi-channel set-up modeled as a sum of exponentially damped sinusoids. In this contribution, we propose a new subspace algorithm belonging to the family of Orthogonal Vector Methods that solves the considered estimation problem. In particular, we present a root-MUSIC algorithm that deals with damped components at an algorithmic cost comparable to that of root-MUSIC for constant-modulus components. Finally, we show by means of an example that the proposed method is efficient, especially at low SNRs.
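For reference, the constant-modulus baseline (classical root-MUSIC for undamped complex exponentials) can be sketched as below; the window length, frequencies, and noise level are invented for the toy example, and it is exactly the root-selection step that must change to accommodate damped poles as the paper does:

```python
import numpy as np

def root_music_freq(x, m, p):
    """Classical root-MUSIC: estimate the p frequencies (cycles/sample)
    of undamped complex exponentials from the noise subspace of an
    m x m sample covariance."""
    N = len(x)
    X = np.array([x[i:i + m] for i in range(N - m + 1)])
    R = X.T @ X.conj() / X.shape[0]        # sliding-window covariance
    _, V = np.linalg.eigh(R)               # eigenvalues in ascending order
    G = V[:, :m - p]                       # noise subspace
    C = G @ G.conj().T
    # a(z)^H C a(z) as a polynomial: the coefficient of z^k is the k-th
    # diagonal sum of C; np.roots expects the highest power first
    coeffs = np.array([np.trace(C, k) for k in range(m - 1, -m, -1)])
    roots = np.roots(coeffs)
    roots = roots[np.abs(roots) < 1]       # keep one of each reciprocal pair
    close = roots[np.argsort(1 - np.abs(roots))[:p]]  # nearest unit circle
    return np.sort(np.angle(close) / (2 * np.pi))

rng = np.random.default_rng(6)
n = np.arange(200)
x = (np.exp(2j * np.pi * 0.12 * n) + 0.7 * np.exp(2j * np.pi * 0.31 * n)
     + 0.05 * (rng.standard_normal(200) + 1j * rng.standard_normal(200)))
f_hat = root_music_freq(x, m=12, p=2)      # close to [0.12, 0.31]
```

Keeping only the roots nearest the unit circle is what presumes constant-modulus components; damped sinusoids place their signal roots strictly inside the circle.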
A Low-cost and Robust Multimodal Wireless Network with Adaptive Estimator and GLRT detector
Jianjun Chen (IT University of Copenhagen, Denmark); John Aasted Sorensen (IT University of Copenhagen, Denmark); Zoltan Safar (IT University of Copenhagen, Denmark); Kåre Kristoffersen (IT University of Copenhagen, Denmark)
The problem of using a multiple-node indoor wireless network as a distributed sensor network for detecting physical intrusion is addressed. The challenges for achieving high system performance are analyzed. A high-precision adaptive estimator and a high-precision signal level change estimator are derived. Based on the low computational complexity of the estimators, a low-cost and robust system architecture is proposed. Experiments show that the proposed system performs significantly better than the published prototype multimodal wireless network system.
A robust estimator for polynomial phase signals in non-Gaussian noise using parallel unscented Kalman filters
Mounir Djeddi (Laboratoire des Signaux et Systèmes, France); Messaoud Benidir (Laboratoire des Signaux et Systèmes, France)
In this paper, we address the problem of estimating polynomial phase signals (PPS) in epsilon-contaminated impulsive noise using a Kalman filtering technique. We consider an original estimation method based on the exact non-linear state-space representation of the signal, using the unscented Kalman filter (UKF) instead of the classical approach, which consists in linearizing the system of equations and then applying the extended Kalman filter (EKF). The observation noise probability density function is assumed to be a two-component Gaussian mixture weighted by the probabilities of appearance of the impulsive and Gaussian noises in the observations. We propose to use two unscented Kalman filters operating in parallel (PUKF) as an alternative to the classical methods, which generally handle the impulsive noise by using either clipping or freezing procedures. Simulation results show that the PUKF is less sensitive to impulsive noise and gives better estimation of the signal parameters compared to a recently proposed algorithm.
Comparison of Different Approaches For Robust Identification Of a Lightly Damped Flexible Beam
Allahyar Montazeri (Iran University of Science and Technology, Iran); Hadi Esmaeilsabzali (Iran University of Science and Technology, Iran); Javad Poshtan (Iran University of Science and Technology, Iran); Mohammad-Reza Jahed-Motlagh (Iran University of Science and Technology, Iran)
The aim of this paper is robust identification of a lightly damped flexible beam model with parametric and non-parametric uncertainties. Our approach is based on worst-case estimation theory, where uncertainties are assumed to be unknown but bounded. We examine different outbounding algorithms (parallelotopic and ellipsoidal) for estimation of the feasible parameter set delivered by the set membership identification algorithm. To properly handle the high-magnitude non-parametric uncertainties, the proposed methods are compared, and it is shown that combining the set membership approach with model error modeling techniques yields superior results.

### Poster: Speech and Audio Coding - 5 papers

Room: Poster Area
Chair: Marc Antonini (I3S-CNRS, France)
Conditional split lattice vector quantization for spectral encoding of audio signals
Adriana Vasilache (Nokia Research Center, Finland)
In this paper we propose a novel quantization method with application to audio coding. Because lattice-truncation-based quantizers are finite, not all input points have nearest neighbors within the defined truncations. The proposed conditional split lattice vector quantizer (CSLVQ) allows an input point falling outside the truncation to be split into lower dimensions, thus preserving low distortion at only a local cost in bitrate. Furthermore, the proposed quantization tool is versatile with respect to the dimension of the input data, the same quantization functions being used for different dimensions. The new quantizer has been tested for spectral encoding of real audio samples by encoding each frequency subband of the audio signal using a vector quantizer consisting of a lattice truncated following a generalized Gaussian contour of equiprobability. The results of objective listening tests are similar to those of the AAC for high bitrates and clearly better for lower bitrates.
An Optimal Path Coding System for DAWG Lexicon-HMM
Alain Lifchitz (LIP6 - CNRS, Paris, France); Frederic Maire (Faculty of Information Technology, Brisbane, Australia); Dominique Revuz (Institut Gaspard Monge, Marne-la-Vallée, France)
Lexical constraints on the input of speech and on-line handwriting recognition systems improve the performance of such systems. A significant gain in speed can be achieved by integrating into a digraph structure the different Hidden Markov Models (HMMs) corresponding to the words of the relevant lexicon. This integration avoids redundant computations by sharing intermediate results between HMMs corresponding to different words of the lexicon. In this paper, we introduce a token passing method to compute simultaneously the a posteriori probabilities of all the words of the lexicon. The coding scheme that we introduce for the tokens is optimal in the information theory sense: the tokens use the minimum possible number of bits. Overall, we optimize simultaneously the execution speed and the memory requirements of the recognition system.
Low-Complexity Wideband LSF Quantization by Predictive KLT Coding and Generalized Gaussian Modeling
Marie Oger (France Télécom R&D, France); Stéphane Ragot (France Télécom R&D, France); Marc Antonini (I3S-CNRS, France)
In this paper we present a new model-based coding method to represent the linear-predictive coding (LPC) parameters of wideband speech signals (sampled at 16 kHz). The LPC coefficients are transformed into line spectrum frequencies (LSF) and quantized by switched AR(1)/MA(1) predictive Karhunen-Loeve transform (KLT) coding. Compared to prior art, the main novelty lies in the use of improved quantization to represent the (transformed) prediction error of the LSF parameters. Generalized Gaussian modeling is applied for this purpose. We review existing methods to fit the free model parameter of a generalized Gaussian model to real data and show that the distribution of the prediction error for LSF parameters is indeed very close to Laplacian. Experimental results show that the proposed LSF quantization method has performance close to classical vector quantization (AMR-WB LPC quantization) at 36 and 46 bits per frame, with much lower complexity for both design and operation.
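One standard moment-matching fit of the generalized Gaussian shape parameter (sketched here on invented sample data; the paper reviews several fitting methods and may use a different one) inverts the monotone ratio E|x|/sqrt(E[x^2]), which equals 1/sqrt(2) ≈ 0.707 for a Laplacian (shape a = 1) and sqrt(2/pi) ≈ 0.798 for a Gaussian (a = 2):

```python
import math
import numpy as np

def gg_shape_from_moments(x):
    """Estimate the generalized-Gaussian shape parameter a by matching
    the sample ratio m1/sqrt(m2) to its theoretical value
    Gamma(2/a) / sqrt(Gamma(1/a) * Gamma(3/a))."""
    r = np.mean(np.abs(x)) / np.sqrt(np.mean(x**2))
    def ratio(a):
        return math.gamma(2/a) / math.sqrt(math.gamma(1/a) * math.gamma(3/a))
    lo, hi = 0.1, 10.0                     # bisection on the monotone ratio
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if ratio(mid) < r:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

rng = np.random.default_rng(5)
a_laplace = gg_shape_from_moments(rng.laplace(size=20000))   # near 1
```

A shape estimate near 1 is the kind of evidence behind the paper's observation that the LSF prediction error is close to Laplacian.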
Stereo and multichannel audio coding with room response compensation for improved coding transparency
Maciej Bartkowiak (Poznan University of Technology, Poland)
The paper presents a very simple enhancement of joint coding of stereo and surround channels within a perceptual audio codec. It proposes two improvements to standard parametric stereo and spatial audio compression in order to avoid the smearing of transients in the process of channel downmixing. The improvements consist in compensating interchannel delays prior to mixdown, as well as additionally encoding the room response for realistic reconstruction of the stereo ambience. Two compression scenarios are proposed: for high quality (transparent) coding and for low/very low bit rates.
Objective and Subjective Evaluation of MPEG Layer III Perceived Quality
Giancarlo Vercellesi (University of Milan, Italy); Andrea Vitali (STM, Italy); Martino Zerbini (University of Milan, Italy)
Artefacts due to MPEG layer III (MP3) coding are analyzed: non-uniform quantization noise, hybrid filter bank aliasing, and MDCT pre- and post-echoes. Both objective and subjective evaluations of several MP3-decoded audio files are presented. Finally, it is shown how to predict the subjective perceived quality based on objective measures.

### Poster: MIMO Systems - 8 papers

Room: Poster Area
Chair: Franz Hlawatsch (Vienna University of Technology, Austria)
A Universal Interpretation of Receive Diversity Gain in MIMO Systems over Flat Channels
Shuichi Ohno (Hiroshima University, Japan); Kok Ann Donny Teo (Hiroshima university, Japan)
We present a universal bit-error rate (BER) performance ordering for different numbers of receive antennas in Multiple-Input Multiple-Output (MIMO) wireless systems with linear equalization, which holds for all SNRs. We show that when the number of transmit antennas is fixed, the BER of each symbol degrades with a decrease in the number of receive antennas even if the received SNR is kept constant. This is due to the convexity of the BER functions. Then, for any i.i.d. channels, we show that the BER averaged over random channels also degrades with a decrease in the number of receive antennas. These results highlight the advantage and the limits of MIMO with linear equalization.
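The convexity argument can be checked numerically: for BPSK the BER at linear SNR γ is Q(sqrt(2γ)) = erfc(sqrt(γ))/2, which is convex in γ, so by Jensen's inequality averaging over any random SNR distribution with the same mean can only increase the BER. A sketch with an illustrative Rayleigh-fading SNR model (an assumption for the demonstration, not the paper's exact channel setting):

```python
import math
import numpy as np

def bpsk_ber(snr):
    """BPSK bit-error rate at linear SNR: Q(sqrt(2*snr)) = erfc(sqrt(snr))/2."""
    return 0.5 * math.erfc(math.sqrt(snr))

rng = np.random.default_rng(7)
mean_snr = 10.0
gammas = rng.exponential(mean_snr, 20000)       # random per-symbol SNR
avg_ber = float(np.mean([bpsk_ber(g) for g in gammas]))
fixed_ber = bpsk_ber(mean_snr)                  # same mean SNR, no spread
# Jensen's inequality for the convex BER curve: avg_ber >= fixed_ber
```

Spreading the SNR around its mean raises the average BER by orders of magnitude here, which is the same convexity mechanism behind the paper's receive-antenna ordering.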
Iterative multiuser detection performance evaluation on a satellite multibeam coverage
Jean-Pierre Millerioux (TeSA-ENST, France); Marie-laure Boucheret (ENST, France); Caroline Bazile (CNES, France); Alain Ducasse (Alcatel Space, France)
This paper deals with the use of (non-linear) multiuser detection techniques to mitigate co-channel interference on the reverse link of multibeam satellite systems. The considered system is inspired by the DVB-RCS standard (with the use of convolutional coding). The algorithms consist of iterative parallel interference cancellation schemes with semi-blind channel estimation. We propose a simple approach for statistical evaluation on a multibeam coverage, and present results obtained with a digital Ka-band focal array feed reflector antenna.
Maximum likelihood sequence estimation based on periodic time-varying trellis for LPTVMA systems
Bogdan Cristea (National Polytechnic Institute of Toulouse, France); Daniel Roviras (National Polytechnic Institute of Toulouse, France); Benoit Escrig (National Polytechnic Institute of Toulouse, France)
In this paper we show how Maximum Likelihood Sequence Estimation (MLSE) can be implemented in a Linear Periodic Time-Varying Multiple Access (LPTVMA) system. Assuming quasi-synchronous users, LPTVMA systems can be considered free of Multi-User Interference (MUI); hence each user can be equalized separately. In multipath channels the equalization step uses a Zero Padding (ZP) technique, and the equivalent channel to equalize is a zero-pad channel. For such zero-pad channels the MLSE can be efficiently implemented using a parallel-trellis Viterbi algorithm. The complexity of the MLSE can be further reduced because the padding zeros have known positions in the received signal: the proposed MLSE uses a periodic time-varying trellis, so the number of trellis states can be reduced. The performance of LPTVMA systems using the proposed MLSE is evaluated by simulations.
Eigenspace Adaptive Filtering for Efficient Pre-Equalization of Acoustic MIMO Systems
Sascha Spors (Deutsche Telekom Laboratories, Germany); Herbert Buchner (University of Erlangen-Nuremberg, Germany); Rudolf Rabenstein (University of Erlangen-Nuremberg, Germany)
Pre-equalization of MIMO systems is required for a wide variety of applications, e.g. in spatial sound reproduction. However, traditional adaptive algorithms fail for channel counts of around ten or more. It is shown that the problem becomes tractable by decoupling the MIMO adaptation problem, e.g. by a generalized singular value decomposition. This method is called eigenspace adaptive filtering. The required singular vectors depend on the unknown system response to be equalized. A reasonable approximation by data-independent transformations is derived for the example of listening room compensation, yielding the approach of wave-domain adaptive filtering.
Blind recovery of MIMO QAM signals: a criterion with its convergence analysis
Aissa Ikhlef (Supelec Campus de Rennes, SCEE Team, France); Daniel Le Guennec (IETR/Supélec-Campus de Rennes, France)
In this paper, the problem of blind recovery of QAM signals in Multiple-Input Multiple-Output (MIMO) communication systems is investigated. We propose a new criterion based on the real (or equivalently the imaginary) part of the equalizer outputs, with a cross-correlation constraint. A performance analysis reveals the absence of any undesirable local stationary points, which ensures perfect recovery of all transmitted signals and global convergence of the algorithm. From the proposed criterion, an adaptive algorithm is derived. It is shown that the proposed algorithm has lower computational complexity than the constant modulus algorithm (CMA) and the multimodulus algorithm (MMA). The effectiveness of the proposed algorithm is illustrated by numerical results.
Bidirectional MIMO equalizer design
Davide Mattera (Università degli Studi di Napoli Federico II, Italy); Francesco Palmieri (Seconda Università di Napoli, Italy); Gianluca D'Angelo (Università di Napoli Federico II, Italy)
The paper considers an important variation of the MIMO channel Decision Feedback (DF) equalizer, the bidirectional MIMO equalizer, which combines the classical forward DF equalizer (DFE) with the backward DFE, based on imposing anticausal properties on the feedback filter. With reference to the minimum mean square error (MMSE) criterion, we extend bidirectional equalization from the single-input single-output (SISO) scenario to the more general multiple-input multiple-output (MIMO) scenario, where different definitions of anticausal systems can be given. An original variation of the bidirectional DF equalizer is also proposed in order to reduce its performance loss in the presence of error propagation.
SMC Algorithms for Approximate-MAP Equalization of MIMO Channels with Polynomial Complexity
Sequential Monte Carlo (SMC) schemes have recently been proposed to perform optimal equalization of multiple-input multiple-output (MIMO) wireless channels. The main features of SMC techniques that make them appealing for the equalization problem are (a) their potential to provide asymptotically optimal performance in terms of bit error rate and (b) their suitability for implementation using parallel hardware. Nevertheless, existing SMC equalizers still exhibit a very high computational complexity relative to the dimensions of the MIMO channel, which makes them impractical. In this paper we introduce two new SMC equalizers whose computational load is only of polynomial order with respect to the channel dimensions and which avoid computationally heavy tasks such as matrix inversions. The performance of the proposed techniques is illustrated by means of computer simulations.
Iterative Equalization For Severe Time Dispersive MIMO Channels
Sajid Ahmed (The Institute of ECIT Queen's University Belfast, United Kingdom); Tharmalingam Ratnarajah (Queens University of Belfast, United Kingdom); Colin Cowan (Queen's University Belfast, United Kingdom)
In this work, a minimum mean squared error (MMSE) iterative equalization method for severely time-dispersive MIMO channels is proposed. To mitigate the severe time dispersiveness of the channel, single carrier with cyclic prefix (SCCP) transmission is employed and the equalization is performed in the frequency domain. The use of the cyclic prefix (CP) and frequency-domain equalization simplifies the challenging problem of equalization in MIMO channels, which suffer from both inter-symbol interference (ISI) and co-channel interference (CCI). The proposed iterative algorithm works in two stages: the first stage estimates the transmitted frequency-domain symbols using a low-complexity MMSE equalizer; the second stage finds the a posteriori probabilities of the estimated symbols to obtain their means and variances for use in the MMSE equalizer in the following iteration. Simulation results show the superior performance of the iterative algorithm compared with the conventional MMSE equalizer.
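In the single-antenna case, the frequency-domain MMSE step reduces to one complex tap per bin; the sketch below (a SISO simplification with an invented channel, noiseless for clarity, and not the paper's MIMO algorithm) shows how the cyclic prefix turns equalization into per-bin scalar operations:

```python
import numpy as np

def sc_fde_mmse(y, h, N, sigma2):
    """One-tap-per-bin MMSE equalizer for a single-carrier block with
    cyclic prefix: the CP makes the channel act circularly, so it
    diagonalizes under the DFT."""
    H = np.fft.fft(h, N)
    Y = np.fft.fft(y)
    W = np.conj(H) / (np.abs(H)**2 + sigma2)   # per-bin MMSE tap
    return np.fft.ifft(W * Y)

# BPSK block through a 3-tap channel with a length-2 cyclic prefix
rng = np.random.default_rng(2)
N, h = 32, np.array([1.0, 0.5, 0.25])
s = rng.choice([-1.0, 1.0], N)
tx = np.concatenate([s[-2:], s])               # prepend cyclic prefix
rx = np.convolve(tx, h)[2:2 + N]               # receiver discards the CP
s_hat = np.sign(sc_fde_mmse(rx, h, N, sigma2=1e-3).real)   # recovers s
```

The soft-feedback stage described in the abstract would then refine the symbol means and variances used in the per-bin taps on the next iteration.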

### Poster: Motion Analysis - 6 papers

Room: Poster Area
Chair: Theo Vlachos (University of Surrey, United Kingdom)
Automatic Human Motion Analysis and Action Recognition in Athletics Videos
Costas Panagiotakis (University of Crete, Greece); Ilias Grinias (University of Crete, Greece); Georgios Tziritas (University of Crete, Greece)
In this paper, we present an unsupervised, automatic human motion analysis and action recognition scheme tested on athletics videos. First, four major human points are recognized and tracked using human silhouettes computed by a robust camera estimation and object localization method. The human major axis and the gait period during the running stage can be estimated by statistical analysis of the motion of the tracked points, yielding a temporal segmentation into running and jump stages. The method is tested on athletics videos of pole vault, high jump, triple jump and long jump, recognizing them using features that are robust and independent of the camera motion and the athlete's performance. The experimental results indicate the good performance of the proposed scheme, even in sequences with complicated content and motion.
Motion Estimation Using Hypercomplex Correlation in the Wavelet Domain
Vasileios Argyriou (University of Surrey, United Kingdom); Theo Vlachos (University of Surrey, United Kingdom)
We present a novel frequency-domain motion estimation technique, which employs hypercomplex correlation in the wavelet domain. Our method involves wavelet decomposition followed by cross-correlation in the frequency domain using the discrete quaternion Fourier transform. Experiments using both artificially induced motion and actual scene motion demonstrate that the proposed method outperforms the state-of-the-art in frequency-domain motion estimation, in the shape of phase correlation, in terms of sub-pixel accuracy for a range of test material and motion scenarios.
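The phase-correlation baseline can be sketched in a few lines on synthetic frames (integer-pixel, single-scale; the paper's method instead applies quaternion correlation to wavelet subbands and resolves sub-pixel shifts):

```python
import numpy as np

def phase_correlation_shift(a, b):
    """Estimate the circular integer shift (dy, dx) taking frame `b`
    to frame `a` via the whitened cross-power spectrum."""
    A, B = np.fft.fft2(a), np.fft.fft2(b)
    R = A * np.conj(B)
    R /= np.abs(R) + 1e-12                 # keep phase only
    corr = np.fft.ifft2(R).real
    dy, dx = np.unravel_index(np.argmax(corr), corr.shape)
    # fold shifts larger than half the frame into negative values
    if dy > a.shape[0] // 2:
        dy -= a.shape[0]
    if dx > a.shape[1] // 2:
        dx -= a.shape[1]
    return dy, dx

rng = np.random.default_rng(3)
frame = rng.standard_normal((64, 64))
moved = np.roll(frame, shift=(5, -3), axis=(0, 1))
dy, dx = phase_correlation_shift(moved, frame)   # (5, -3)
```

Normalizing away the magnitude spectrum is what concentrates the correlation surface into a sharp peak at the displacement, which is why phase correlation is the customary frequency-domain benchmark.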
Camera motion classification based on transferable belief model
Mickael Guironnet (Laboratoire des Images et des Signaux, France); Denis Pellerin (Laboratoire des Images et des Signaux, France); Michele Rombaut (Laboratoire des Images et des Signaux, France)
This article presents a new method of camera motion classification based on the Transferable Belief Model (TBM). It consists in locating in a video the motions of translation and zoom, and the absence of camera motion (i.e., a static camera). The classification process is based on a rule-based system that is divided into three stages. From a parametric motion model, the first stage combines data to obtain frame-level belief masses on camera motions. To ensure the temporal coherence of motions, a filtering of belief masses according to the TBM is achieved. The second stage carries out a separation between static and dynamic frames. In the third stage, a temporal integration allows the motion to be studied over a set of frames and preserves only those motions with significant magnitude and duration. Then, a more detailed description of each motion is given. Experimental results show the effectiveness of the method.
Tracking System Using Camshift and Feature Points
Ali Ganoun (University of Orleans, France); Ould-Driss Nouar (Orleans University, France); Raphael Canals (Orleans University, France)
In this paper, we present a new object tracking approach for grey-level image sequences. The method is based on analysis of the image's grey-level distribution histogram and of the object's feature points. Among the essential contributions of this work are a better model of the tracked object and the handling of changes in the target's appearance during the sequence. Our approach is an extension of the CamShift (Continuously Adaptive MeanShift) algorithm. The goal is to widen this algorithm's field of application in order to adapt it to grey-level image sequences presenting strong changes in shape, luminosity and grey-level values.
Rate-distortion performance of dual-layer motion compensation employing different mesh densities
Andy Yu (University of Warwick, United Kingdom); Heechan Park (University of Warwick, United Kingdom); Graham Martin (University of Warwick, United Kingdom)
We describe a dual-layer algorithm for mesh-based motion estimation. The proposed algorithm, employing meshes at different scales, aims to improve the rate-distortion performance at fixed bit rates. Without the need for time-consuming evaluation, the proposed algorithm identifies the motion-active regions in a picture. By enhancing the mesh structure in the aforementioned regions, a significant improvement in motion estimation is evident. Furthermore, the scheme to construct two different significance maps at both encoder and decoder provides a reduction in transmission overhead. Simulations on test sequences possessing high motion activity show that the proposed algorithm results in better PSNR-rate performance than single-layer motion estimation using 16x16 triangular meshes. Improvements of up to 2.2 dB are obtained.
Robust Scene Cut Detection by Supervised Learning
Guillermo Camara Chavez (University of Cergy-Pontoise, France)
The first step in video-content analysis, content-based video browsing and retrieval is the partitioning of a video sequence into shots. A shot is the fundamental unit of a video: it captures a continuous action from a single camera and represents a spatio-temporally coherent sequence of frames. Shots are thus considered the primitives for higher-level content analysis, indexing and classification. Although many video shot boundary detection algorithms have been proposed in the literature, most approaches require several parameters and thresholds to be set in order to achieve good results. In this paper, we present a robust learned detector of sharp cuts with no threshold to set, no pre-processing step to compensate for motion, and no post-processing filtering to eliminate falsely detected transitions. Our experiments, strictly following the TRECVID 2002 competition protocol, provide very good results while handling a large number of features, thanks to our kernel-based SVM classifier.
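As an illustration, one classical per-frame feature such a learned cut detector can consume is a normalized histogram difference. This is a hypothetical minimal sketch; the paper's actual feature set is richer and the classifier is a kernel-based SVM:

```python
import numpy as np

def hist_diff_features(frames, bins=16):
    # Half the L1 distance between successive normalized grey-level
    # histograms, in [0, 1]: near 0 within a shot, near 1 across a
    # sharp cut. A classifier learned on many such features replaces
    # hand-tuned thresholding of a single one.
    feats = []
    for a, b in zip(frames, frames[1:]):
        ha = np.histogram(a, bins=bins, range=(0, 256))[0].astype(float)
        hb = np.histogram(b, bins=bins, range=(0, 256))[0].astype(float)
        feats.append(0.5 * np.abs(ha / ha.sum() - hb / hb.sum()).sum())
    return np.array(feats)
```

Two identical frames score 0, while a transition between an all-dark and an all-bright frame scores 1.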

### Fri.6.3: Pattern recognition II - 5 papers

Room: Room 4
Chair: Alberto Carini (University of Urbino, Italy)
Two-Dimensional Sampling and Representation of Folded Surfaces Embedded in Higher Dimensional Manifolds
Emil Saucan (Technion - Israel Institute of Technology, Israel); Eli Appleboim (Technion - Israel Institute of Technology, Electrical Engineering Department, Israel); Yehoshua Zeevi (Technion - Israel Institute of Technology, Electrical Engineering Department, Israel)
The general problem of sampling and flattening folded surfaces for the purpose of their two-dimensional representation and analysis as images is addressed. We present a method and algorithm based on an extension of the classical results of Gehring and Väisälä regarding the existence of quasi-conformal and quasi-isometric mappings between Riemannian manifolds. Proper surface sampling, based on maximal curvature, is first discussed. We then develop the algorithm for mapping this surface triangulation into the corresponding flat triangulated representation. The proposed algorithm is basically local and, therefore, suitable for extensively folded surfaces such as those encountered in medical imaging. The theory and algorithm guarantee minimal metric, angular and area distortion. Yet the algorithm is relatively simple, robust and computationally efficient, since it does not require the computation of derivatives. In this paper we present the sampling and flattening only, without complementing them with proper interpolation. We demonstrate the algorithm using medical and synthetic data.
A Markov Random Field Based Skin Detection Approach
Kamal Chenaoua (KFUPM, Saudi Arabia); Ahmed Bouridane (Queen's University, United Kingdom)
A new color space is used in this paper, together with a Markov Random Field (MRF) model, for the detection of skin pixels in color images. The proposed color space is derived through principal component analysis, thus reducing the number of color components. The MRF model takes into account the spatial relations within the image, which are included in the labeling process through statistical dependence among neighboring pixels. Since only two classes are considered, the Ising model is used to perform the skin/non-skin classification.
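A minimal sketch of how a two-class Ising MRF can be optimized, here with a single Iterated Conditional Modes sweep. The unary (data) costs below are a placeholder; the paper's cost model and optimizer may differ:

```python
import numpy as np

def icm_sweep(unary, labels, beta=1.0):
    # One Iterated Conditional Modes sweep for a two-class (Ising)
    # MRF: set each pixel to the label minimizing its data cost plus
    # a smoothness penalty beta for each disagreeing 4-neighbour.
    # unary: (H, W, 2) per-class data costs; labels: (H, W) in {0, 1}.
    h, w = labels.shape
    out = labels.copy()
    for i in range(h):
        for j in range(w):
            costs = unary[i, j].astype(float).copy()
            for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ni, nj = i + di, j + dj
                if 0 <= ni < h and 0 <= nj < w:
                    for c in (0, 1):
                        if out[ni, nj] != c:
                            costs[c] += beta
            out[i, j] = int(np.argmin(costs))
    return out
```

With flat data costs, an isolated label surrounded by the opposite class is flipped by the smoothness term, which is the spatial regularization effect the abstract describes.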
High-Quality Image Interpolation Based on Multiplicative Skeleton-Texture Separation
Takahiro Saito (Kanagawa University, Japan); Yuki Ishii (Kanagawa University, Japan); Yousuke Nakagawa (Kanagawa University, Japan); Takashi Komatsu (Kanagawa University, Japan)
This paper presents a high-quality interpolation approach that can adjust edge sharpness and texture intensity to reconstruct an image according to the user's taste in picture quality. Our interpolation approach first resolves an input image I into a skeleton image U, a texture generator V and a residual image D such that I = U·V + D, and then interpolates each of the three components independently with an interpolation method suited to each. The skeleton image is a bounded-variation function representing a cartoon approximation of I, and is interpolated with a super-resolution deblurring-oversampling method that interpolates sharp edges without producing ringing artifacts. The texture generator is an oscillatory function representing regular, distinct textures, and is interpolated with a standard linear interpolation algorithm. The residual image is a vague function representing irregular weak textures and noise, and is interpolated with a random re-sampling interpolation algorithm.
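The multiplicative decomposition I = U·V + D can be illustrated with a toy stand-in: a crude local mean plays the role of the bounded-variation skeleton (the paper's actual model is more sophisticated), purely to show the identity the three interpolators operate on:

```python
import numpy as np

def toy_multiplicative_split(img, k=5, eps=1e-6):
    # Toy I = U*V + D split: U is a local mean standing in for the
    # bounded-variation skeleton, V = I / U the multiplicative
    # texture generator, and D the residual closing the identity.
    pad = np.pad(img.astype(float), k // 2, mode='edge')
    U = np.empty(img.shape)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            U[i, j] = pad[i:i + k, j:j + k].mean()
    V = img / (U + eps)
    D = img - U * V
    return U, V, D
```

Each of U, V and D would then be resized with its own interpolation method and recombined as U·V + D at the target resolution.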
Dense Deformation Field Estimation for Atlas Registration using the Active Contour Framework
Valérie Duay (Ecole Polytechnique Fédérale de Lausanne (EPFL), Switzerland); Meritxell Bach (Ecole Polytechnique Fédérale de Lausanne (EPFL), Switzerland); Xavier Bresson (Ecole Polytechnique Fédérale de Lausanne (EPFL), Switzerland); Jean-Philippe Thiran (Swiss Federal Institute of Technology (EPFL), Switzerland)
In this paper, we propose a new paradigm for carrying out the registration task with a dense deformation field derived from the optical flow model and the active contour method. The proposed framework merges different tasks such as segmentation, regularization, incorporation of prior knowledge and registration into a single framework. The active contour model is at the core of our framework, even if it is used differently from the standard approaches. Indeed, active contours are a well-known technique for image segmentation, which consists in finding the curve that minimizes an energy functional designed to be minimal when the curve has reached the object contours. In that way, we obtain accurate and smooth segmentation results. So far, the active contour model has been used to segment objects in images from boundary-based, region-based or shape-based information. Our registration technique profits from all these families of active contours to determine a dense deformation field defined on the whole image. A well-suited application of our model is atlas registration in medical imaging, which consists in automatically delineating anatomical structures. We present results on 2D synthetic images to show the performance of our non-rigid deformation field based on a natural registration term. We also present registration results on real 3D medical data with a large space-occupying tumor substantially deforming the surrounding structures, which constitutes a highly challenging problem.
Facial Expression Synthesis Through Facial Expressions Statistical Analysis
Stelios Krinidis (Technological Institute of Kavala, Greece); Ioannis Pitas (Aristotle University of Thessaloniki, Greece)
This paper presents a method for generalizing human facial expressions or personalizing (cloning) them from one person to completely different persons, by means of a statistical analysis of human facial expressions coming from various persons. The data used for the statistical analysis are obtained by tracking a generic facial wireframe model in video sequences depicting the formation of the different human facial expressions, starting from a neutral state. Wireframe node tracking is performed by a pyramidal variant of the well-known Kanade-Lucas-Tomasi (KLT) tracker. The loss of tracked features is handled through a model deformation procedure increasing the robustness of the tracking algorithm. The dynamic facial expression output model is MPEG-4 compliant. The method has been tested on a variety of sequences with very good results, including a database of video sequences representing human faces changing from the neutral state to the one that represents a fully formed human facial expression.

### Fri.4.3: Speech Processing - 5 papers

Room: Sala Onice
Chair: Jesper Jensen (Delft University of Technology, The Netherlands)
Speech style conversion based on the statistics of vowel spectrograms and nonlinear frequency mapping
Toru Takahashi (Wakayama University, Japan); Hideki Banno (Meijo University, Japan); Toshio Irino (Wakayama University, Japan); Hideki Kawahara (Wakayama University, Japan)
A simple, efficient, and high-quality speech style conversion algorithm based on STRAIGHT is proposed. The proposed method uses only vowel information to design the desired conversion functions and parameters, making it possible to reduce the amount of training data required for conversion. The processing of the proposed method is: 1) to produce abstract spectra, represented on a perceptual frequency axis and derived as the average spectrum for each vowel and each style; 2) to decompose the original spectrum into the abstract spectrum and the residual fine structure; 3) to replace the abstract spectrum of the original style with that of the target style; 4) to map the fine structure with nonlinear frequency warping to adapt it to the target style's fine structure; and 5) to add them together to produce the target speech. An efficient algorithm for this conversion was developed using an orthogonal transformation referred to as the warped DCT. A preliminary listening test indicated that the proposed method yields more natural and higher-quality speech style conversion than previous methods.
Robust Decomposition of Inverse Filter of Channel and Prediction Error Filter of Speech Signal for Dereverberation
Takuya Yoshioka (NTT Communication Science Laboratories, Japan); Takafumi Hikichi (NTT Communication Science Laboratories, Japan); Masato Miyoshi (NTT Communication Science Laboratories, Japan); Hiroshi Okuno (Kyoto University, Japan)
This paper addresses the estimation of the inverse filter of a room's signal transmission channel driven by a speech signal. Speech signals are often modeled as piecewise stationary autoregressive (AR) processes. The most fundamental issue is how to estimate the channel's inverse filter separately from the inverse filter of the speech-generating AR system, i.e. the prediction error filter (PEF). We first point out that, by jointly estimating the channel's inverse filter and the PEF, the channel's inverse is identifiable thanks to the time-varying nature of the PEF. We then develop an algorithm that achieves this joint estimation. A notable property of the proposed method is its robustness against deviations from the linear convolutive model of the observed signal caused by, for example, observation noise. Experimental results with simulated and real recorded reverberant signals show the effectiveness of the proposed method.
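The prediction error filter of an AR speech model can be estimated per frame by least squares. This is a minimal sketch of the PEF alone, not of the joint channel/PEF estimation the paper develops:

```python
import numpy as np

def prediction_error_filter(x, order):
    # Fit x[n] ~ a1*x[n-1] + ... + ap*x[n-p] by least squares and
    # return the PEF [1, -a1, ..., -ap]; filtering x with these
    # coefficients yields the (ideally white) prediction error.
    X = np.array([x[n - order:n][::-1] for n in range(order, len(x))])
    a, *_ = np.linalg.lstsq(X, x[order:], rcond=None)
    return np.concatenate(([1.0], -a))
```

On a synthetic AR(2) signal the fitted coefficients recover the generating model up to estimation noise, which is the stationarity-per-frame assumption the abstract relies on.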
A Path-Based Layered Architecture using HMM for Automatic Speech Recognition
Marta Casar (Universitat Politecnica de Catalunya, Spain); José Fonollosa (Universitat Politècnica de Catalunya, Spain); Albino Nogueiras (Universitat Politecnica de Catalunya, Spain)
Generally, speech recognition systems are based on one layer of acoustic HMM states, where the recognition process consists of selecting the sequence of states providing the best match with the speech utterance. In this paper we propose a new approach based on two layers. The first layer implements standard acoustic modeling. The second layer models the path followed by the speech signal along the activated states of the acoustic models, defining a set of state-probability-based HMMs. This method presents two main advantages over conventional recognizers: a consistent pruning of the possible paths preceding and following each state in the recognition process, and the possibility of modeling high-level information in the second layer somewhat independently of the acoustic training. A test database from a real voice recognition application has been used to study the performance of the system in a changing environment.
Fast sequential floating forward selection applied to emotional speech features estimated on DES and SUSAS data collections
Dimitrios Ververidis (Aristotle University of Thessaloniki, Greece); Constantine Kotropoulos (Aristotle University of Thessaloniki, Greece)
In this paper, we classify speech into several emotional states based on the statistical properties of prosody features estimated on utterances extracted from the Danish Emotional Speech (DES) and a subset of the Speech Under Simulated and Actual Stress (SUSAS) data collections. The proposed novelties are: 1) speeding up sequential floating feature selection by up to 60%; 2) applying fusion of decisions taken on short speech segments in order to derive a unique decision for longer utterances; and 3) demonstrating that gender and accent information reduce the classification error. Indeed, a classification error lower by 1% to 11% is achieved when the combination of decisions is made on long phrases, and an error reduction of 2%-11% is obtained when the gender and accent information is exploited. The total classification error reported on DES is 42.8%; the same figure on SUSAS is 46.3%. The reported human errors are 32.3% on DES and 42% on SUSAS. For comparison, a random classification would yield an error of 80% on DES and 87.5% on SUSAS.
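The quoted chance-level figures follow directly from the class counts (the 80% and 87.5% values imply 5 emotional classes in DES and 8 in SUSAS, an inference from the numbers rather than something stated in the abstract):

```python
def random_baseline_error(n_classes):
    # A uniform random guesser is correct with probability 1/K,
    # so its expected classification error is 1 - 1/K.
    return 1.0 - 1.0 / n_classes

print(random_baseline_error(5))   # DES: 5 classes -> 0.8
print(random_baseline_error(8))   # SUSAS: 8 classes -> 0.875
```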
Prosody Modeling for an Embedded TTS System Implementation
Dragos Burileanu ("Politehnica" University of Bucharest, Romania); Cristian Negrescu ("Politehnica" University of Bucharest, Romania)
Prosody quality strongly influences the intelligibility and the perceived naturalness of synthetic speech. Despite the significant progress in prosody modeling in recent years, the incomplete linguistic knowledge that can be derived from text and various language-specific issues still limit the quality of today's commercial text-to-speech (TTS) systems. Moreover, obtaining correct pronunciation and intonation in embedded speech applications, which have severe resource constraints, is an even more challenging task. This paper describes an enhanced version of an embedded TTS system for the Romanian language, and proposes and discusses an efficient rule-based intonation model for prosody generation. Informal listening tests show that highly intelligible and fairly natural synthetic speech can be produced with a small memory footprint and low computational resources.

### Fri.3.3: Image Coding - 5 papers

Room: Sala Verde
Chair: Omer Gerek (Anadolu University, Turkey)
Toshiyuki Uto (Ehime University, Japan); Kenji Ohue (Ehime University, Japan)
This paper describes a multiple description coding (MDC) scheme for images that introduces phase scrambling with an adjustable dispersion extent of the sample values. Whereas phase scrambling gains robustness to transmission errors, it loses time-localized information. Consequently, when this phase scrambling process is combined with a wavelet-based coder for MDC, its objective coding performance depends largely on the spread range of the phase scrambling. In this paper, we propose a novel MDC scheme using phase scrambling with a variable spread capability of the pixel values. To control the spread range, our design of the phase scrambling is based on the group delay response instead of the phase response. Furthermore, we modify a quadtree-based wavelet coder according to the spread range for the proposed MDC with our phase scrambler. Simulation results show the effectiveness of the proposed technique.
Near-lossless distributed coding of hyperspectral images using a statistical model for probability estimation
Marco Grangetto (Politecnico di Torino, Italy); Enrico Magli (Politecnico di Torino, Italy); Gabriella Olmo (Politecnico di Torino, Italy)
In this paper we propose an algorithm for near-lossless compression of hyperspectral images based on distributed source coding (DSC). The encoding is based on syndrome coding of the bit-planes of the quantized prediction error of each band, using the same information in the previous band as side information. The practical scheme employs an array of low-density parity-check codes. Unlike other existing DSC techniques, the determination of the encoding rate for each data block is based entirely on a statistical model, avoiding the need for inter-source communication as well as for a feedback channel. Moreover, the statistical model makes it possible to estimate the statistics of the currently decoded bit-plane also using information about the previously decoded ones in the same band; this boosts the performance of the DSC scheme towards the capacity of the conditional entropy of the multilevel (as opposed to binary) source. Experimental results have been worked out using AVIRIS data; a significant performance improvement is obtained with respect to existing DSC and classical techniques, although a gap remains with respect to the theoretical coding bounds.
The RWHT+P for an improved lossless multiresolution coding
Olivier Deforges (IETR / INSA Rennes, France); Marie Babel (IETR / INSA Rennes, France); Jean Motsch (IETR / Saint-Cyr, France)
This paper presents a fully multiresolution lossless coding method with advanced semantic scalability. In particular, a reversible form of the usual Walsh-Hadamard Transform (RWHT) is first introduced as an alternative to standard lossless transforms. A pyramidal representation and decomposition schemes involving this basic transform are then proposed. Significant improvements are obtained using two additional concepts: "locally adaptive resolution" through a quadtree representation, and a prediction step. The experimental results show that the proposed RWHT+P achieves excellent performance compared to the state of the art.
Scan-Based Compression of 3D Mesh Sequences With Geometry Compensation
Yasmine Boulfani (I3S-CNRS-UNSA, France); Marc Antonini (I3S-CNRS, France)
We introduce in this paper a new compression process to encode the geometry of 3D mesh sequences (with fixed connectivity). The proposed method uses a scan-based wavelet transform with an original approach that consists of compensating the geometry displacements of the mesh sequence. The proposed coder is based on a temporal lifting scheme and on bit allocation using statistical models of the probability densities of the temporal wavelet coefficients. Scalar quantization and entropy coding are then used to encode the quantized subbands and the geometry displacement vectors. The resulting compression scheme shows that good coding performance is obtained when the compensation of the geometry is done over the whole sequence, which requires a large memory. Furthermore, we show that when low-memory scan-based compression is performed, the performance of the encoder approaches that obtained when the whole sequence is known. Simulation results also show that the proposed algorithm provides better compression performance than some state-of-the-art coders.
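A temporal lifting step of the kind such a coder builds on can be sketched with the Haar wavelet. Geometry compensation is omitted; this only illustrates the reversible predict/update structure applied along the time axis:

```python
import numpy as np

def haar_lift(x):
    # One level of the Haar lifting scheme: split into even/odd
    # samples, predict the odd ones from the even ones (detail d),
    # then update the even ones (approximation s).
    even, odd = x[0::2].astype(float), x[1::2].astype(float)
    d = odd - even            # predict step: detail coefficients
    s = even + d / 2          # update step: approximation coefficients
    return s, d

def haar_unlift(s, d):
    # Inverse lifting: undo the update, then the predict step.
    even = s - d / 2
    odd = d + even
    x = np.empty(even.size + odd.size)
    x[0::2], x[1::2] = even, odd
    return x
```

Because each lifting step is inverted exactly by running it backwards, the transform is reversible, which is what makes lifting attractive for scan-based, memory-constrained coding.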
An analytical Gamma mixture based rate-distortion model for lattice vector quantization
Ludovic Guillemot (Université Henri Poincaré Nancy 1, France)
Here we propose a new blockwise rate-distortion (R-D) model dedicated to lattice vector quantization (LVQ) of sparse and structured signals. We first show that the clustering properties lead to a distribution of the norm of the source vectors close to a mixture of Gamma densities. From this, we derive the joint distribution of the vectors as a mixture of multidimensional generalized Gaussian densities. This data model not only allows us to compute a more accurate analytical R-D model for LVQ, but also provides a new theoretical framework for the design of codebooks better suited to the data, such as dead-zone LVQ.