<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
	>

<channel>
	<title>Signal Processing Ideas</title>
	<atom:link href="http://signalprocessingideas.wordpress.com/feed/" rel="self" type="application/rss+xml" />
	<link>http://signalprocessingideas.wordpress.com</link>
	<description>Just another WordPress.com weblog</description>
	<lastBuildDate>Tue, 11 Jan 2011 23:14:33 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.com/</generator>
<cloud domain='signalprocessingideas.wordpress.com' port='80' path='/?rsscloud=notify' registerProcedure='' protocol='http-post' />
<image>
		<url>http://s2.wp.com/i/buttonw-com.png</url>
		<title>Signal Processing Ideas</title>
		<link>http://signalprocessingideas.wordpress.com</link>
	</image>
	<atom:link rel="search" type="application/opensearchdescription+xml" href="http://signalprocessingideas.wordpress.com/osd.xml" title="Signal Processing Ideas" />
	<atom:link rel='hub' href='http://signalprocessingideas.wordpress.com/?pushpress=hub'/>
		<item>
		<title>The Spatial Fourier Transform</title>
		<link>http://signalprocessingideas.wordpress.com/2009/12/05/the-spatial-fourier-transform/</link>
		<comments>http://signalprocessingideas.wordpress.com/2009/12/05/the-spatial-fourier-transform/#comments</comments>
		<pubDate>Sat, 05 Dec 2009 12:33:43 +0000</pubDate>
		<dc:creator>sigprocideas</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://signalprocessingideas.wordpress.com/?p=149</guid>
		<description><![CDATA[Vision: Imagine a surveillance system that listens for keywords (for example, airports) and shows on a camera recording where in a distant crowd the words are coming from&#8230; The idea is to add extra dimensions to the short-time Fourier transform, leading to a time-frequency-space representation. Why? To assign spectral bins not only to time instants but [...]]]></description>
			<content:encoded><![CDATA[<p><strong>Vision:<br />
Imagine a surveillance system that listens for keywords (for example, &#8220;airports&#8221;) and shows on a camera recording where in a distant crowd the words are coming from&#8230;</strong></p>
<p><strong>The idea</strong>:<br />
is to add extra dimensions to the short-time Fourier transform (STFT), leading to a <span style="color:#0000ff;">time-frequency-space</span> representation. Why? To assign spectral bins not only to time instants but also to points in space. Such a multidimensional representation decomposes the signal by frequency (DFT), time (STFT) and &#8220;space&#8221;, which leads to the definition of the <span style="color:#0000ff;">Short-Time Spatial Fourier Transform (ST-SpFT)</span>.</p>
<p><strong>How?</strong><br />
Note that the Spatial Fourier Transform (SFT) can only be applied to multi-channel audio whose channels are very precisely synchronised. One way to obtain this multidimensional decomposition is by WEIGHTING THE FFT OF INDEPENDENT MICROPHONE CHANNELS, rotating their phase vectors according to the scanned direction-of-arrival (DoA) candidates.</p>
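<p>A minimal Python/NumPy sketch of the phase-rotation weighting described above, for a single frame. It assumes a far-field source and a linear microphone array. The function name <code>spatial_fft_frame</code>, the parameters <code>mic_pos</code> and <code>doa_deg</code>, and the speed of sound <code>c</code> are illustrative assumptions, not part of the original idea:</p>
<pre>
```python
import numpy as np

def spatial_fft_frame(frames, fs, mic_pos, doa_deg, c=343.0):
    """Steer the FFT of one synchronised multi-channel frame towards DoA candidates.

    frames  : array (n_mics, n_samples), one frame per microphone
    fs      : sampling rate in Hz
    mic_pos : microphone positions along the array axis in metres (assumed linear array)
    doa_deg : candidate directions of arrival in degrees (0 = broadside)
    Returns a complex array (n_doa, n_bins): one spectrum per DoA candidate.
    """
    frames = np.asarray(frames, dtype=float)
    n_mics, n = frames.shape
    spectra = np.fft.rfft(frames * np.hanning(n), axis=1)        # (n_mics, n_bins)
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)                        # (n_bins,)
    # Far-field delay of every microphone for every candidate direction.
    delays = np.outer(np.sin(np.deg2rad(doa_deg)), mic_pos) / c   # (n_doa, n_mics)
    # Phase rotation that undoes the delay for each candidate and frequency.
    steer = np.exp(2j * np.pi * delays[:, :, None] * freqs[None, None, :])
    # Weighted sum over microphones: channels add coherently for the true DoA.
    return (steer * spectra[None, :, :]).sum(axis=1) / n_mics
```
</pre>
<p>Applying the function to successive frames would give one possible time-frequency-DoA array. For a single source, the magnitude at a given frequency bin should peak near the true direction, provided the microphone spacing is small enough to avoid spatial aliasing at that frequency.</p>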
<p><strong>Applications:</strong><br />
- (NOT BLIND!) source separation (we know the spectrum of the sound and its origin in space)<br />
- Acoustic source localisation</p>
<p><strong>The challenges:</strong><br />
- one is to implement the decomposition in a way that provides clear <em>time-frequency-space (simply: time-frequency-DoA, TFD) atoms</em><br />
- another is a fast implementation of this TFD decomposition</p>
<p><strong>Possible derivatives of the idea:</strong><br />
- Spatial Wavelet Transform (replace FFT with DWT)<br />
- Spatial Harmonic Chirp Transform (possibly a spatial Short-Time Harmonic Chirp Transform)</p>
<p>Nice images and code coming soon.</p>
<p>RELATED PROJECTS:<br />
1) <strong>MarPanning</strong> from Marsyas visualises spectral bins in the &#8220;pan&#8221; space derived from stereo music recordings. <strong>MarPanning</strong> seems to use the angle information of cross-spectrum bins, meaning that every frequency bin is assigned to ONLY ONE SPATIAL (left-right) position.<br />
Nice video <a href="http://www.youtube.com/watch?v=YWjHUFXKWSg&amp;feature=player_embedded">here</a>.<br />
<span style="text-align:center; display: block;"><a href="http://signalprocessingideas.wordpress.com/2009/12/05/the-spatial-fourier-transform/"><img src="http://img.youtube.com/vi/YWjHUFXKWSg/2.jpg" alt="" /></a></span></p>
]]></content:encoded>
			<wfw:commentRss>http://signalprocessingideas.wordpress.com/2009/12/05/the-spatial-fourier-transform/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/f16f90ba8fd51b0cd618797744194e59?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">sigprocideas</media:title>
		</media:content>
	</item>
		<item>
		<title>Environmental Noise &#8211; Based Automatic Volume Control</title>
		<link>http://signalprocessingideas.wordpress.com/2009/08/29/environmental-noise-based-automatic-volume-control/</link>
		<comments>http://signalprocessingideas.wordpress.com/2009/08/29/environmental-noise-based-automatic-volume-control/#comments</comments>
		<pubDate>Sat, 29 Aug 2009 15:56:11 +0000</pubDate>
		<dc:creator>sigprocideas</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://signalprocessingideas.wordpress.com/?p=144</guid>
		<description><![CDATA[This intelligent automatic level control is another feature proposal for mobile and software-based audio players, as well as for communication devices and public places like airports, trams, trains, etc. Problem examples: with current music players, people need to set the volume much higher when travelling on a train, lowering the volume when the train stops [...]]]></description>
			<content:encoded><![CDATA[<p><strong>This intelligent automatic level control</strong> is another feature proposal for mobile and software-based audio players, as well as for communication devices and public places like airports, trams, trains, etc.</p>
<p><strong>Problem examples:</strong> with current music players, people need to set the volume much higher when travelling on a train, lower it when the train stops at a station, raise it again when the train starts to move, and so on. People need to raise the ringer volume of their phones when going out onto the street and lower it when entering a library. Imagine a phone with its ringer volume set to 90% that rings at a very high level right in the middle of a meeting.</p>
<p><strong>The idea:</strong> to track the level of environmental noise and <span style="color:#800000;">automatically</span>:<br />
- raise or lower the sound volume in the headphones of a mobile music player<br />
- raise or lower the speaker volume of a mobile phone during a call when the user enters a noisy area (road crossing, etc.)<br />
- raise or lower the ringer volume<br />
- adjust loudspeaker volumes in trains, buses and airports individually for every loudspeaker, based on the noise level around it<br />
(How many times have you missed an announcement about a route change because of the noise in the train?)</p>
<p><strong>Application fields:<br />
</strong>- all mobile music players (CD, MP3, etc.)<br />
- all mobile phones<br />
- all loudspeaker installations in trams, trains, airports, etc.</p>
<p><strong>Possible features:<br />
</strong>- simple automatic level control based on the environmental noise<br />
- parametric-equalizer-like control based on the features of the environmental noise (not all frequencies necessarily need to be boosted every time)<br />
- learn-by-listening (LBL) feature: you take the MP3 player with you, enter the train, activate the LBL feature, start playing music, and show the player how you would set the music level at different noise levels and train speeds. From this the system learns how much it needs to adjust the volume in different situations.</p>
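<p>The simplest of these features, level control driven by the noise level, could be sketched in Python/NumPy as below. The sketch assumes that frames from a microphone measuring the environment are available. The class name <code>NoiseTrackingVolume</code> and all parameter values (reference level, slope, boost limit, smoothing factor) are hypothetical and would need tuning on a real device:</p>
<pre>
```python
import numpy as np

class NoiseTrackingVolume:
    """Map a smoothed environmental noise level to a playback gain boost in dB."""

    def __init__(self, ref_db=-50.0, slope=0.5, max_boost_db=20.0, alpha=0.9):
        self.ref_db = ref_db              # noise level at which no boost is applied
        self.slope = slope                # dB of boost per dB of noise above ref_db
        self.max_boost_db = max_boost_db  # upper limit, to protect the listener
        self.alpha = alpha                # smoothing factor (closer to 1 = slower)
        self.smoothed_db = None

    def update(self, mic_frame):
        """Feed one frame of microphone samples, return the current boost in dB."""
        frame = np.asarray(mic_frame, dtype=float)
        level_db = 10.0 * np.log10(np.mean(frame ** 2) + 1e-12)
        if self.smoothed_db is None:
            self.smoothed_db = level_db
        else:
            # Exponential smoothing avoids pumping on short noise bursts.
            self.smoothed_db = self.alpha * self.smoothed_db + (1.0 - self.alpha) * level_db
        boost = self.slope * (self.smoothed_db - self.ref_db)
        return float(np.clip(boost, 0.0, self.max_boost_db))
```
</pre>
<p>The learn-by-listening feature could then amount to fitting <code>ref_db</code> and <code>slope</code> to the volume settings the user chooses at different measured noise levels.</p>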
<p><strong>Technology analogy:<br />
</strong>- the automatic contrast control of mobile phone displays, based on the ambient light conditions
<strong><br />
Keywords:</strong><br />
- Automatic (level/volume/gain) control,<br />
- Adaptive noise masking,<br />
- parametric equalizer,<br />
- environmental noise spectrum</p>
]]></content:encoded>
			<wfw:commentRss>http://signalprocessingideas.wordpress.com/2009/08/29/environmental-noise-based-automatic-volume-control/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/f16f90ba8fd51b0cd618797744194e59?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">sigprocideas</media:title>
		</media:content>
	</item>
		<item>
		<title>Audible Pause</title>
		<link>http://signalprocessingideas.wordpress.com/2009/08/10/audible-pause/</link>
		<comments>http://signalprocessingideas.wordpress.com/2009/08/10/audible-pause/#comments</comments>
		<pubDate>Mon, 10 Aug 2009 10:50:45 +0000</pubDate>
		<dc:creator>sigprocideas</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://signalprocessingideas.wordpress.com/?p=138</guid>
		<description><![CDATA[Audible Pause is a feature proposal for mobile and software-based audio players. It is a feature that would keep playing the last &#8220;sound&#8221; after the Pause button is pressed, just as the last video frame stays on screen when Pause is pressed in a video player. However, since playing a single audio sample in a loop [...]]]></description>
			<content:encoded><![CDATA[<p><strong>Audible Pause</strong> is a feature proposal for mobile and software-based audio players. It would keep playing the last &#8220;sound&#8221; after the Pause button is pressed, just as the last video frame stays on screen when Pause is pressed in a video player. However, since playing a single audio sample in a loop does not produce anything audible, the <strong>Audible Pause</strong> feature needs a little signal processing behind the scenes.</p>
<p><img class="aligncenter size-full wp-image-141" title="pause" src="http://signalprocessingideas.files.wordpress.com/2009/08/pause.jpeg?w=450" alt="pause"   /></p>
<p>The trick would be to:<br />
1. look for frames of the signal with a similar spectral envelope,<br />
2. smooth the envelope a bit in time [1],<br />
3. apply phase-synchronous overlap-add [2] in a loop so that the automatically chosen audio frames (point 1) keep playing continuously.</p>
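<p>A rough Python/NumPy sketch of step 3 only, using a standard phase-vocoder style &#8220;freeze&#8221; in place of the full method. It skips the frame search and envelope smoothing of steps 1 and 2 and simply sustains the last frame before Pause. The function name <code>audible_pause</code> and its parameters are assumptions made for illustration:</p>
<pre>
```python
import numpy as np

def audible_pause(tail, n_frame, hop, n_grains):
    """Sustain the last sound by phase-coherent overlap-add.

    tail     : the last n_frame + hop samples played before Pause was pressed
    n_frame  : analysis/synthesis frame length
    hop      : hop size between overlapping grains (n_frame // 4 works with Hann)
    n_grains : number of grains to synthesise
    """
    tail = np.asarray(tail, dtype=float)
    win = np.hanning(n_frame + 1)[:-1]               # periodic Hann window
    prev = np.fft.rfft(tail[:n_frame] * win)
    last = np.fft.rfft(tail[hop:hop + n_frame] * win)
    # Per-bin phase rotation observed over one hop (unit-magnitude rotor).
    cross = last * np.conj(prev)
    rotor = cross / (np.abs(cross) + 1e-12)
    out = np.zeros(hop * (n_grains - 1) + n_frame)
    spec = last.copy()
    for i in range(n_grains):
        # Keep the magnitude frozen, let the phase advance as it did in the signal.
        grain = np.fft.irfft(spec, n_frame) * win
        out[i * hop:i * hop + n_frame] += grain
        spec = spec * rotor
    return out
```
</pre>
<p>Because each bin keeps rotating at the rate measured between the last two frames, successive grains line up in phase and a steady tone continues at its original frequency instead of clicking at the loop points.</p>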
<p><strong>Applications:</strong><br />
- Useful feature for note transcription by listening<br />
- Useful for learning sounds of speech in foreign languages (ü, ö, ä, etc)</p>
<p><strong>References:</strong><br />
[1] anything about RASTA filtering<br />
[2] anything on phase vocoders</p>
]]></content:encoded>
			<wfw:commentRss>http://signalprocessingideas.wordpress.com/2009/08/10/audible-pause/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/f16f90ba8fd51b0cd618797744194e59?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">sigprocideas</media:title>
		</media:content>

		<media:content url="http://signalprocessingideas.files.wordpress.com/2009/08/pause.jpeg" medium="image">
			<media:title type="html">pause</media:title>
		</media:content>
	</item>
		<item>
		<title>The Chirprate-Pitch Plane</title>
		<link>http://signalprocessingideas.wordpress.com/2008/12/07/the-chirprate-pitch-plane/</link>
		<comments>http://signalprocessingideas.wordpress.com/2008/12/07/the-chirprate-pitch-plane/#comments</comments>
		<pubDate>Sun, 07 Dec 2008 11:23:28 +0000</pubDate>
		<dc:creator>sigprocideas</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://signalprocessingideas.wordpress.com/?p=35</guid>
		<description><![CDATA[Problem: Crossing pitch trajectories make multipitch tracking a difficult task. We are looking for extra features that the tracking algorithm could use when assigning pitch trajectories to speakers (acoustic sources). Since we address single-channel recordings, the PoPi plane (i.e. linking position to pitch) is not the way to go. Method Description: Using the pitch rate [...]]]></description>
			<content:encoded><![CDATA[<div>
<p><strong>Problem:</strong><br />
Crossing pitch trajectories make multipitch tracking a difficult task. We are looking for extra features that the tracking algorithm could use when assigning pitch trajectories to speakers (acoustic sources). Since we address single-channel recordings, the PoPi plane (i.e. linking position to pitch) is not the way to go.</p>
<p><strong>Method Description:</strong><br />
Use the pitch rate as an additional cue for multipitch tracking (regardless of HOW that feature is obtained). Why? If the pitch trajectories of two speakers cross in a given frame, then the pitch of each speaker is coming &#8220;from somewhere&#8221;, and this &#8220;somewhere&#8221; is hopefully not the same for both, since the trajectories cross ONLY in the problematic frame. The information about where the pitch is &#8220;coming from&#8221; and where it is &#8220;going to&#8221; is nothing other than the <em>chirp rate</em>. The question, of course, is how to decompose the signal effectively into a representation that links pitch to its rate of change.</p>
<p>One possible solution is to take the frame under analysis, pre-warp it with different chirp-rate candidates (just as in the fast implementation of the <a href="http://signalprocessingideas.wordpress.com/2008/12/07/short-time-chirp-transform/">Chirp Transform</a>), and extract the pitch candidates for each pre-warping factor. This gives a <strong>Chirprate vs. Pitch Plane</strong>, which shows not only the current pitch value of the speaker but also the &#8220;direction&#8221; the pitch is coming from, i.e. whether the pitch was higher or lower in the previous frame and how large the difference between the two frames is.</p>
<p>Below is one possible Chirprate-Pitch decomposition, depicting one acoustic source in the scene. The x axis is the pitch rate (chirpFactor) and the y axis is the correlation lag (inverse of pitch):</p>
<p><img class="aligncenter size-full wp-image-41" title="warpedpitchestimation1" src="http://signalprocessingideas.files.wordpress.com/2008/12/warpedpitchestimation1.png?w=450" alt="warpedpitchestimation1"   /><br />
As you can see, there is only one correlation-lag candidate and one chirpFactor candidate for our speaker, which is the result of using the <a href="http://signalprocessingideas.wordpress.com/2008/12/07/cepacf/">ACF-CEP</a> based pitch estimation mentioned in the previous post.</p>
<p><strong>References</strong>:<br />
[1] <a href="http://iie.fing.edu.uy/publicaciones/2010/CLR10/">&#8220;FAN CHIRP TRANSFORM FOR MUSIC REPRESENTATION&#8221;</a>, P. Cancela, E. Lopez, M. Rocamora,  Proc. of DAFx 2010, Graz, Austria, September 6-10, 2010. (Chirprate-Pitch Plane discussed in section 5.2)</p>
</div>
]]></content:encoded>
			<wfw:commentRss>http://signalprocessingideas.wordpress.com/2008/12/07/the-chirprate-pitch-plane/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/f16f90ba8fd51b0cd618797744194e59?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">sigprocideas</media:title>
		</media:content>

		<media:content url="http://signalprocessingideas.files.wordpress.com/2008/12/warpedpitchestimation1.png" medium="image">
			<media:title type="html">warpedpitchestimation1</media:title>
		</media:content>
	</item>
		<item>
		<title>STChT-based Noise Suppression</title>
		<link>http://signalprocessingideas.wordpress.com/2008/12/07/stcht-based-noise-suppression/</link>
		<comments>http://signalprocessingideas.wordpress.com/2008/12/07/stcht-based-noise-suppression/#comments</comments>
		<pubDate>Sun, 07 Dec 2008 11:23:06 +0000</pubDate>
		<dc:creator>sigprocideas</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://signalprocessingideas.wordpress.com/?p=33</guid>
		<description><![CDATA[Description: Pitch Estimation module (ACF+CEP) + Spectral estimation module (STChT) + Noise spectrum estimation (adaptive Quantile) + Noise removal (anything from Wiener filter to the most complicated TF-based methods) Building blocks: Pitch estimation. An enhanced version of the reindexing method published in [1] was used as a basis for this module. Short-Time Fan Chirp (STFCh) Transform. [...]]]></description>
			<content:encoded><![CDATA[<div>
<p><strong>Description:</strong><br />
Pitch Estimation module (ACF+CEP) + Spectral estimation module (STChT) + Noise spectrum estimation (adaptive Quantile) + Noise removal (anything from Wiener filter to the most complicated TF-based methods)</p>
<p><img class="aligncenter size-full wp-image-126" title="onevoice_blockscheme" src="http://signalprocessingideas.files.wordpress.com/2008/12/onevoice_blockscheme.png?w=450&#038;h=182" alt="onevoice_blockscheme" width="450" height="182" /></p>
<p><strong>Building blocks:</strong></p>
<ul>
<li><em>Pitch estimation.</em> An enhanced version of the reindexing method published in [1] was used as a basis for this module.</li>
</ul>
<ul>
<li><em>Short-Time Fan Chirp (STFCh) Transform.</em> We use the fast version of the Fan-Chirp transform [2].</li>
</ul>
<ul>
<li><em>Noise estimation:</em> this module is based on the well-known quantile filtering idea [3], according to which the noise background can be estimated by taking an empirically chosen quantile of the sorted time-frequency atoms.</li>
</ul>
<ul>
<li><em>Noise suppression:</em> The “speech enhancement” happens in this module. It takes the estimated noise spectrum and removes it from the representation provided by the STChT module.</li>
</ul>
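<p>The noise estimation and suppression blocks above can be illustrated with a small Python/NumPy sketch. It works on a plain magnitude spectrogram rather than on the STChT representation, and it uses simple spectral subtraction as the suppression rule. The function names, the quantile <code>q</code> and the spectral <code>floor</code> are assumptions chosen for illustration:</p>
<pre>
```python
import numpy as np

def quantile_noise_estimate(mag, q=0.5):
    """Estimate the noise magnitude per frequency bin.

    mag : array (n_frames, n_bins) of spectral magnitudes
    q   : quantile of the time-sorted values taken as the noise level
    """
    # np.quantile sorts each bin over time internally and picks the q-quantile.
    return np.quantile(mag, q, axis=0)

def spectral_subtract(mag, noise, floor=0.05):
    """Subtract the noise estimate, keeping a small spectral floor."""
    return np.maximum(mag - noise[None, :], floor * mag)
```
</pre>
<p>The quantile approach relies on speech leaving each frequency bin unoccupied for a good part of the time, so that the lower quantiles of a bin are dominated by noise. No explicit speech/pause detection is needed.</p>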
<p><strong><br />
References:</strong><br />
[1] Képesi, M and Weruaga, L.: “Harmonic Tracking based Short-Time Chirp Analysis of Speech Signals”, Robust2004 COST278 &amp; ISCA ITRW Workshop on Robustness Issues in Conversational Interaction, 30th and 31st August 2004, University of East Anglia, Norwich, UK<br />
[2] L. Weruaga, M. Kepesi, &#8220;The fan-chirp transform for non-stationary harmonic sounds&#8221;, Signal Proc., vol. 87, pp. 1504-1522, 2007.<br />
[3] V. Stahl, A. Fischer, R. Bippus, &#8220;Quantile based noise estimation for spectral subtraction and Wiener filtering&#8221;, Proc. IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP), 2000, vol. 3, pp. 1875&#8211;1878.</div>
]]></content:encoded>
			<wfw:commentRss>http://signalprocessingideas.wordpress.com/2008/12/07/stcht-based-noise-suppression/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/f16f90ba8fd51b0cd618797744194e59?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">sigprocideas</media:title>
		</media:content>

		<media:content url="http://signalprocessingideas.files.wordpress.com/2008/12/onevoice_blockscheme.png" medium="image">
			<media:title type="html">onevoice_blockscheme</media:title>
		</media:content>
	</item>
		<item>
		<title>Cep(ACF)</title>
		<link>http://signalprocessingideas.wordpress.com/2008/12/07/cepacf/</link>
		<comments>http://signalprocessingideas.wordpress.com/2008/12/07/cepacf/#comments</comments>
		<pubDate>Sun, 07 Dec 2008 11:22:05 +0000</pubDate>
		<dc:creator>sigprocideas</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://signalprocessingideas.wordpress.com/?p=31</guid>
		<description><![CDATA[Problem: STChT requires reliable pitch estimation in order to provide a sharp TF representation. This is challenging, as pitch estimation in noisy and multi-speaker environments is never easy. Method Description: A combined pitch estimation method that joins autocorrelation (ACF) with the cepstrum (Cep), plus some additional tricks. We know that the autocorrelation extracts [...]]]></description>
			<content:encoded><![CDATA[<div>
<p><strong>Problem:<br />
</strong>STChT requires reliable pitch estimation in order to provide a sharp TF representation. This is challenging, as pitch estimation in noisy and multi-speaker environments is never easy.</p>
<p><strong>Method Description:<br />
</strong>A combined pitch estimation method that joins autocorrelation (ACF) with the cepstrum (Cep), plus some additional tricks. We know that the autocorrelation extracts the periodicity of the speech signal even in a noisy background, but it gives multiple pitch candidates because of double-pitch, half-pitch and similar errors. This is taken care of by the cepstrum applied on top of the ACF, which merges all autocorrelation peak candidates into one cepstrum-based pitch candidate. The trick lies in between: the cepstrum is reliable only if the spectrum it uses is nice enough, i.e. dominant, rich in harmonics, and as flat as possible. But how could a noisy speech spectrum be as nice as that?</p>
<p><img class="aligncenter size-full wp-image-135" title="cep_acf" src="http://signalprocessingideas.files.wordpress.com/2008/12/cep_acf.png?w=450" alt="cep_acf"   /></p>
<p>Well, a half-wave rectified autocorrelation leads almost to such a spectrum, with boosted periodicities and an enhanced spectral representation of hidden harmonicities.</p>
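<p>The chain described above can be sketched roughly as follows, reading the rectification as half-wave rectification (negative autocorrelation values set to zero). This is only a minimal illustration, not the exact method: the function name <code>acf_cepstrum_pitch</code>, the raised-cosine lag window, the log-spectrum floor, and the synthetic test frame are all assumptions made for the example.</p>

```python
import numpy as np


def acf_cepstrum_pitch(frame, fs, fmin=80.0, fmax=500.0):
    """Estimate pitch (Hz) from the cepstrum of a half-wave rectified ACF."""
    n = len(frame)
    nfft = 2 * n  # zero-padding keeps the circular ACF free of wrap-around
    x = frame - np.mean(frame)

    # Autocorrelation via the power spectrum; index k holds lag k and
    # index nfft - k holds lag -k, so the sequence is symmetric about lag 0.
    acf = np.fft.irfft(np.abs(np.fft.rfft(x, nfft)) ** 2, nfft)
    acf = acf / (acf[0] + 1e-12)

    # Half-wave rectification: keep only positive correlation values.
    rectified = np.maximum(acf, 0.0)

    # Raised-cosine lag window (1 at lag 0, 0 at lag +/-n). This is an
    # assumption of the sketch, added to limit spectral leakage.
    lag_window = 0.5 * (1.0 + np.cos(2.0 * np.pi * np.arange(nfft) / nfft))
    spectrum = np.abs(np.fft.rfft(rectified * lag_window))

    # Log spectrum with a floor (assumed value) so deep nulls do not dominate.
    log_spectrum = np.log(np.maximum(spectrum, 1e-3 * spectrum.max()))

    # Cepstrum: index q corresponds to a period of q samples.
    cepstrum = np.fft.irfft(log_spectrum, nfft)
    q_min = int(fs / fmax)
    q_max = int(fs / fmin)
    q_best = q_min + int(np.argmax(cepstrum[q_min:q_max + 1]))
    return fs / q_best


# Synthetic voiced frame (assumed test signal): 10 harmonics of 200 Hz
# plus light white noise.
fs = 8000.0
t = np.arange(1024) / fs
rng = np.random.default_rng(0)
frame = sum(np.cos(2.0 * np.pi * 200.0 * k * t) for k in range(1, 11))
frame = frame + 0.05 * rng.standard_normal(t.size)
pitch_hz = acf_cepstrum_pitch(frame, fs)
print(pitch_hz)
```

<p>The rectified autocorrelation is a train of positive pulses at multiples of the pitch period, so its spectrum is a regular comb of harmonics, which is the kind of input the cepstrum handles well.</p>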
<p><strong>References:<br />
</strong>No publications yet.</div>
]]></content:encoded>
			<wfw:commentRss>http://signalprocessingideas.wordpress.com/2008/12/07/cepacf/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/f16f90ba8fd51b0cd618797744194e59?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">sigprocideas</media:title>
		</media:content>

		<media:content url="http://signalprocessingideas.files.wordpress.com/2008/12/cep_acf.png" medium="image">
			<media:title type="html">cep_acf</media:title>
		</media:content>
	</item>
		<item>
		<title>Multiband PoPi Decomposition</title>
		<link>http://signalprocessingideas.wordpress.com/2008/12/07/multiband-popi-decomposition/</link>
		<comments>http://signalprocessingideas.wordpress.com/2008/12/07/multiband-popi-decomposition/#comments</comments>
		<pubDate>Sun, 07 Dec 2008 10:41:30 +0000</pubDate>
		<dc:creator>sigprocideas</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://signalprocessingideas.wordpress.com/?p=25</guid>
		<description><![CDATA[Problem: When the PoPi decomposition is applied to concurrent-speaker scenarios (the cocktail party effect) in order to track multiple speakers who move while speaking, the original formulation of the PoPi decomposition always shows only the more dominant speaker (i.e. the more dominant microperiodicities in the given signal frame), while the other speaker (let&#8217;s call him the background [...]]]></description>
			<content:encoded><![CDATA[<div>
<p><strong>Problem:</strong><br />
When the PoPi decomposition is applied to concurrent-speaker scenarios (the cocktail party effect) in order to track multiple speakers who move while speaking, the original formulation of the PoPi decomposition always shows only the more dominant speaker (i.e. the more dominant microperiodicities in the given signal frame), while the other speaker (let&#8217;s call him the background speaker) is suppressed and hardly visible in the representation.</p>
<p><strong>Solution:</strong><br />
A similar problem is addressed in <a href="http://www.pampalk.at/mir-phds/abstract/Klapuri2004.html">Klapuri&#8217;s PhD work</a>, which targets automatic transcription of simultaneous musical tones. Klapuri&#8217;s (and the most logical) way to enhance multiple pitch candidates is to give each of them a chance to be dominant in one of several frequency bands.</p>
<p><img class="aligncenter size-full wp-image-51" title="bandwisepopi" src="http://signalprocessingideas.files.wordpress.com/2008/12/bandwisepopi.png?w=450" alt="bandwisepopi"   /></p>
<p>We do the same: the &#8220;multiband&#8221; version of the PoPi plane is based on subband processing. It already gives good results with as few as 17 bands, as shown on several double-talk and triple-talk scenarios recorded in 3 different rooms with different reverberation times. However, for non-speech-like scenarios and for more speakers, 17 bands might be too few.</p>
<p><img class="aligncenter size-full wp-image-52" title="multispeaker_popi" src="http://signalprocessingideas.files.wordpress.com/2008/12/multispeaker_popi.png?w=450" alt="multispeaker_popi"   /></p>
<p>The image above shows the PoPi decomposition of 2 concurrent speakers. Their positions and the corresponding pitch values are easy to read off. The recording shows the voices of Tania and Lukas.</p>
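<p>The band-wise idea can be illustrated with a much simpler stand-in than the full multiband PoPi plane: a single-channel autocorrelation computed per frequency band, with every band normalised before the bands are summed. The sketch below is an assumption-laden toy (the function <code>bandwise_periodicity</code>, the per-band normalisation, and the synthetic two-talker frame are all hypothetical), meant only to show why a weak source that owns a few bands becomes visible once each band gets its own scale.</p>

```python
import numpy as np


def bandwise_periodicity(frame, fs, band_edges_hz, max_lag):
    """Per-band autocorrelation, each band normalised to 1 at lag 0."""
    n = len(frame)
    nfft = 2 * n
    power = np.abs(np.fft.rfft(frame * np.hanning(n), nfft)) ** 2
    freqs = np.fft.rfftfreq(nfft, d=1.0 / fs)
    rows = []
    for low, high in zip(band_edges_hz[:-1], band_edges_hz[1:]):
        in_band = np.logical_and(freqs >= low, high > freqs)
        # Inverse FFT of the band-limited power spectrum gives the band ACF.
        acf = np.fft.irfft(power * in_band, nfft)[: max_lag + 1]
        rows.append(acf / (acf[0] + 1e-12))
    return np.array(rows)


# Assumed toy scene: a dominant talker at 160 Hz (period 50 samples) in the
# low bands and a 20 dB weaker talker at 250 Hz (period 32 samples) whose
# energy sits between 2 and 2.75 kHz, plus faint white noise.
fs = 8000.0
t = np.arange(1024) / fs
rng = np.random.default_rng(2)
dominant = sum(np.cos(2.0 * np.pi * 160.0 * k * t) for k in range(1, 7))
background = 0.1 * sum(np.cos(2.0 * np.pi * 250.0 * k * t) for k in range(8, 12))
frame = dominant + background + 0.01 * rng.standard_normal(t.size)

# One band covering everything versus 17 equal-width bands summed together.
fullband = bandwise_periodicity(frame, fs, np.array([0.0, fs / 2 + 1.0]), 100)[0]
multiband = bandwise_periodicity(frame, fs, np.linspace(0.0, fs / 2, 18), 100).sum(axis=0)

# Strength of the background period (lag 32) relative to the dominant one (lag 50).
print(fullband[32] / fullband[50], multiband[32] / multiband[50])
```

<p>In the full-band autocorrelation the background talker barely registers at its own period, while in the band-wise sum both periods stand out. Noise-only bands also get scaled up by the normalisation, which is one reason a real system would need a more careful weighting than this sketch uses.</p>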
<p><strong>References:<br />
</strong>[1] T. Habib, L. Ottowitz, and M. Képesi, “Experimental Evaluation of Multi-band Position-Pitch Estimation (M-PoPi) Algorithm for Multi-Speaker Localization,” INTERSPEECH 2008, Sept. 22-26, Brisbane, Australia.<br />
[2] T. Habib, M. Képesi and L. Ottowitz, “Experimental Evaluation of the Joint Position-Pitch Estimation (PoPi) algorithm in Noisy Environments,” 5th IEEE Workshop on Sensor Array and Multi-Channel Signal Processing (SAM 2008), Jul. 21-23, Darmstadt, Germany.<br />
[3] M. Képesi, L. Ottowitz and T. Habib, “Joint Position-Pitch Estimation for Multiple Speaker Scenarios,” IEEE Joint Workshop on Hands-free Speech Communication and Microphone Arrays (HSCMA 2008), May 6-8, Trento, Italy.</div>
]]></content:encoded>
			<wfw:commentRss>http://signalprocessingideas.wordpress.com/2008/12/07/multiband-popi-decomposition/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/f16f90ba8fd51b0cd618797744194e59?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">sigprocideas</media:title>
		</media:content>

		<media:content url="http://signalprocessingideas.files.wordpress.com/2008/12/bandwisepopi.png" medium="image">
			<media:title type="html">bandwisepopi</media:title>
		</media:content>

		<media:content url="http://signalprocessingideas.files.wordpress.com/2008/12/multispeaker_popi.png" medium="image">
			<media:title type="html">multispeaker_popi</media:title>
		</media:content>
	</item>
		<item>
		<title>Housing for Microphone Arrays for Size Minimization</title>
		<link>http://signalprocessingideas.wordpress.com/2008/12/07/housing-for-microphone-arrays-for-size-minimization/</link>
		<comments>http://signalprocessingideas.wordpress.com/2008/12/07/housing-for-microphone-arrays-for-size-minimization/#comments</comments>
		<pubDate>Sun, 07 Dec 2008 10:33:12 +0000</pubDate>
		<dc:creator>sigprocideas</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://signalprocessingideas.wordpress.com/?p=23</guid>
		<description><![CDATA[Problem: Every microphone array researcher dreams of a mic array the size of a matchbox. Unfortunately this is not the case, and many mic arrays measure hundreds of cm. This size restriction comes from the frequency of the signal we are trying to catch [...]]]></description>
			<content:encoded><![CDATA[<div>
<p><strong>Problem:</strong><br />
Every microphone array researcher dreams of a mic array the size of a matchbox. Unfortunately this is not the case, and many mic arrays measure hundreds of cm. This size restriction comes from the frequency of the signal we are trying to catch with the array: the lower the frequency, the bigger the array must be.</p>
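<p>The frequency dependence follows from the wavelength relation lambda = c / f. As a quick worked example, the snippet below prints wavelengths for a few speech-band frequencies. The helper name <code>wavelength_m</code> is made up for this illustration, 343 m/s is the usual approximate speed of sound in air at room temperature, and treating the required array size as comparable to a wavelength is a rule-of-thumb assumption, not a statement from the post.</p>

```python
def wavelength_m(frequency_hz, speed_of_sound_m_s=343.0):
    """Acoustic wavelength in metres: lambda = c / f."""
    return speed_of_sound_m_s / frequency_hz


# Wavelengths in air for a few frequencies in the speech band.
for frequency_hz in (100.0, 300.0, 1000.0, 3400.0):
    print(f"{frequency_hz:7.1f} Hz -> {wavelength_m(frequency_hz):.3f} m")
```

<p>At 100 Hz the wavelength in air is 3.43 m, so an array whose size is tied to the wavelength quickly reaches the scale of metres at the low end of the speech band.</p>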
<p><strong>Possible Solution:</strong><br />
If the array size depends on the wavelength, and we can NOT change the frequency of the signal under acquisition, what can we change in order to use smaller arrays? The guess is right: the speed of sound.</div>
]]></content:encoded>
			<wfw:commentRss>http://signalprocessingideas.wordpress.com/2008/12/07/housing-for-microphone-arrays-for-size-minimization/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/f16f90ba8fd51b0cd618797744194e59?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">sigprocideas</media:title>
		</media:content>
	</item>
		<item>
		<title>The PoPi plane</title>
		<link>http://signalprocessingideas.wordpress.com/2008/12/07/the-popi-plane/</link>
		<comments>http://signalprocessingideas.wordpress.com/2008/12/07/the-popi-plane/#comments</comments>
		<pubDate>Sun, 07 Dec 2008 10:32:45 +0000</pubDate>
		<dc:creator>sigprocideas</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://signalprocessingideas.wordpress.com/?p=21</guid>
		<description><![CDATA[The idea: Although unvoiced speech sounds that dominate at high frequencies (&#8220;s&#8221;, &#8220;z&#8221;, &#8220;f&#8221;, etc.) give very clear DoA estimates, the real information is still hidden in the not-so-clear and confusing microperiodicities of the autocorrelation. These microperiodicities carry information not only about the DoA of the source they are related to, but also [...]]]></description>
			<content:encoded><![CDATA[<div>
<p><strong>The idea:</strong><br />
Although unvoiced speech sounds that dominate at high frequencies (&#8220;s&#8221;, &#8220;z&#8221;, &#8220;f&#8221;, etc.) give very clear DoA estimates, the real information is still hidden in the not-so-clear and confusing microperiodicities of the autocorrelation. These microperiodicities carry information not only about the DoA of the source they are related to, but also about its pitch. This means that even with two moving, not always active acoustic sources, each source&#8217;s identity can be tied to its position by linking its pitch to its DoA.</p>
<p><strong>Method Description:</strong><br />
Decomposing a frame of a 2-channel acoustic signal into a Position-Pitch plane shows clearly where the source is (at which DoA angle) and what the pitch of that speaker is (from the correlation lag). The image below shows a PoPi plane extracted from a 16-channel circular mic array signal for a voiced speech frame.</p>
<p><img class="aligncenter size-full wp-image-28" title="picture-13" src="http://signalprocessingideas.files.wordpress.com/2008/12/picture-13.png?w=450" alt="picture-13"   /></p>
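<p>One plausible way to build such a plane from two channels is to sample their cross-correlation at lags of the form delay + k * period and average the samples: the delay axis then stands for position and the period axis for pitch. The sketch below follows that reading. It is not taken from the publications listed here, and the function names, the number of averaged terms, the far-field two-microphone geometry in <code>delay_to_angle_deg</code>, and the synthetic source are all assumptions.</p>

```python
import numpy as np


def cross_correlation(x1, x2, max_lag):
    """Biased cross-correlation r[tau] = (1/n) * sum_t x1[t] * x2[t + tau]."""
    n = len(x1)
    lags = np.arange(-max_lag, max_lag + 1)
    r = np.empty(lags.size)
    for i, tau in enumerate(lags):
        if tau >= 0:
            r[i] = np.dot(x1[:n - tau], x2[tau:])
        else:
            r[i] = np.dot(x1[-tau:], x2[:n + tau])
    return r / n


def popi_plane(x1, x2, delays, periods, n_terms=3):
    """Average the cross-correlation at lags delay + k * period, k = -n_terms..n_terms."""
    max_lag = int(np.max(np.abs(delays)) + n_terms * np.max(periods))
    r = cross_correlation(x1, x2, max_lag)
    k = np.arange(-n_terms, n_terms + 1)
    plane = np.empty((delays.size, periods.size))
    for i, delay in enumerate(delays):
        for j, period in enumerate(periods):
            # Index offset max_lag maps lag 0 to the centre of r.
            plane[i, j] = r[delay + k * period + max_lag].mean()
    return plane


def delay_to_angle_deg(delay_samples, fs, mic_distance_m, speed_of_sound=343.0):
    """Far-field DoA for a single two-microphone pair (assumed geometry)."""
    s = np.clip(speed_of_sound * delay_samples / (fs * mic_distance_m), -1.0, 1.0)
    return np.degrees(np.arcsin(s))


# Assumed test scene: one harmonic source at 160 Hz (period 50 samples at
# 8 kHz) that reaches channel 2 three samples later than channel 1.
fs = 8000.0
true_delay = 3
t = np.arange(1024 + true_delay) / fs
rng = np.random.default_rng(1)
source = sum(np.cos(2.0 * np.pi * 160.0 * k * t) for k in range(1, 9))
x1 = source[true_delay:] + 0.05 * rng.standard_normal(1024)
x2 = source[:1024] + 0.05 * rng.standard_normal(1024)

delays = np.arange(-8, 9)      # candidate inter-channel delays in samples
periods = np.arange(20, 81)    # candidate pitch periods, 100 to 400 Hz
plane = popi_plane(x1, x2, delays, periods)
i, j = np.unravel_index(np.argmax(plane), plane.shape)
print(delays[i], periods[j], fs / periods[j], delay_to_angle_deg(delays[i], fs, 0.2))
```

<p>The period range is kept below one octave on each side of the true pitch on purpose: averaging at multiples of the period scores integer multiples of the true period almost as highly, so a wider range would need extra logic to resolve that ambiguity.</p>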
<div class="paragraph Free_Form"><strong>References:<br />
</strong>[1] M. Képesi, F. Pernkopf, M. Wohlmayr, “Joint Position-Pitch Tracking for 2-Channel Audio,” CBMI 2007, Jun 25-27, Bordeaux, France</div>
<div class="paragraph Free_Form">[2] M. Wohlmayr, M. Képesi, “Joint Position-Pitch Extraction from Multichannel Audio,” Proc. Interspeech 2007, August 27-31, Antwerpen, Belgium</div>
<div class="paragraph Free_Form">[3] M. Képesi, M. Wohlmayr, T. Habib, “Pitch-Driven Position Estimation of Speakers in Multispeaker Environments,” The 3rd Congress of the Alps Adria Acoustics Association, September 27-28, 2007, Graz, Austria<br />
More publications under the <a href="http://signalprocessingideas.wordpress.com/2008/12/07/multiband-popi-decomposition/">Multichannel PoPi</a> topic.</div>
<div class="paragraph Heading_1"><span><strong>Patent applications:</strong></span></div>
<div class="paragraph Free_Form">[P1] Képesi, M. &#8211; Wohlmayr, M. &#8211; Kubin, G.: “Joint Position-Pitch Estimation of Acoustic Sources for Their Tracking and Separation,” European patent, submitted: May, 2007.</div>
<p><strong>Links:</strong><br />
[1] <a href="http://members.spsc.tugraz.at/people/marian/mic_at_spsc/PoPi%20Plane.html">Related page at SPSC</a><br />
[2] <a href="http://www.spsc.tugraz.at/people/alumni/marian-kepesi/">PoPi Demo Videos and Audio Files</a><br />
[3] Elmar using the method for controlling the orientation of <a href="http://ca.youtube.com/watch?v=PUZeDeDL-Bo">his robot</a>.</div>
]]></content:encoded>
			<wfw:commentRss>http://signalprocessingideas.wordpress.com/2008/12/07/the-popi-plane/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/f16f90ba8fd51b0cd618797744194e59?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">sigprocideas</media:title>
		</media:content>

		<media:content url="http://signalprocessingideas.files.wordpress.com/2008/12/picture-13.png" medium="image">
			<media:title type="html">picture-13</media:title>
		</media:content>
	</item>
		<item>
		<title>Spectral Reindexing for Pitch Estimation</title>
		<link>http://signalprocessingideas.wordpress.com/2008/12/07/spectral-reindexing-for-pitch-estimation/</link>
		<comments>http://signalprocessingideas.wordpress.com/2008/12/07/spectral-reindexing-for-pitch-estimation/#comments</comments>
		<pubDate>Sun, 07 Dec 2008 10:32:26 +0000</pubDate>
		<dc:creator>sigprocideas</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://signalprocessingideas.wordpress.com/?p=19</guid>
		<description><![CDATA[Problem: We need a powerful but straightforward pitch estimation method to drive the chirp transform. One idea is reordering the information represented by all the frequency bins of an FFT (or ChT) in order to get a precise pitch estimate. Method Description: The main idea is to scan through all possible pitch candidates (80-500 Hz) and assign [...]]]></description>
			<content:encoded><![CDATA[<div>
<p><strong>Problem:</strong><br />
We need a powerful but straightforward pitch estimation method to drive the chirp transform. One idea is <em>reordering the information</em> represented by all the frequency bins of an FFT (or ChT) in order to get a precise pitch estimate.</p>
<p><strong>Method Description:</strong><br />
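The main idea is to scan through all possible pitch candidates (80-500 Hz) and assign to each candidate Fo the mean of all the energy values found at Fo, 2Fo, &#8230;, kFo. Writing E(f) for the spectral energy at frequency f, in equation form it looks like this:</p>
<p>$latex \bar{E}(F_o) = \frac{1}{k}\sum_{i=1}^{k} E(i F_o)$</p>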
<p>and an example of a pitch trajectory derived from a low-resolution spectrogram is shown below:</p>
<p>What we see is that a pitch resolution far finer than the frequency resolution of the FFT (usually 15-20 Hz per frequency bin) is achievable.</p>
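<p>A minimal sketch of this scan is given below, assuming a Hann-windowed FFT power spectrum, linear interpolation between bins to read the energy at each harmonic, and a fixed number of harmonics per candidate. The function name <code>harmonic_mean_pitch</code>, the interpolation choice, the 0.1 Hz candidate grid, and the synthetic frame are assumptions of the example, not details from the post.</p>

```python
import numpy as np


def harmonic_mean_pitch(frame, fs, f0_grid, n_harmonics=5):
    """Score each F0 candidate by the mean spectral energy at its harmonics."""
    windowed = frame * np.hanning(len(frame))
    power = np.abs(np.fft.rfft(windowed)) ** 2
    bin_freqs = np.fft.rfftfreq(len(frame), d=1.0 / fs)

    # Frequencies F0, 2*F0, ..., n_harmonics*F0 for every candidate.
    k = np.arange(1, n_harmonics + 1)
    harmonic_freqs = np.outer(f0_grid, k)

    # Read the energy between bins by linear interpolation; harmonics that
    # fall above the Nyquist frequency contribute zero.
    energies = np.interp(harmonic_freqs.ravel(), bin_freqs, power, right=0.0)
    salience = energies.reshape(harmonic_freqs.shape).mean(axis=1)
    return f0_grid[np.argmax(salience)], salience


# Assumed test frame: 5 equal harmonics of 203.7 Hz, 512 samples at 8 kHz,
# i.e. an FFT bin spacing of about 15.6 Hz.
fs = 8000.0
n = 512
t = np.arange(n) / fs
true_f0 = 203.7
frame = sum(np.cos(2.0 * np.pi * true_f0 * k * t) for k in range(1, 6))

f0_grid = np.arange(80.0, 500.0, 0.1)  # candidate pitches in Hz
f0_est, salience = harmonic_mean_pitch(frame, fs, f0_grid)
print(f0_est)
```

<p>Because the candidate grid is much finer than the bin spacing and several harmonics constrain each candidate at once, the estimate is not tied to the bin centres. With strongly decaying harmonic amplitudes a candidate at half the true pitch can score nearly as high, so a practical version would need some guard against that.</p>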
<p><strong>References:</strong><br />
[1] M. Képesi, L. Weruaga, E. Schofield, &#8220;Detailed Multidimensional Analysis of our Acoustical Environment,&#8221; <em>Forum Acusticum. </em>Budapest (Hu), September 2005, pp. 2649-2654.</p>
<div>[2] M. Képesi and L. Weruaga, “High-resolution noise-robust spectral-based pitch estimation,” <em>Interspeech 2005, </em>pp. 313-316, Lisboa (P), Sep. 2005</div>
<div>[3] P. Cancela, &#8220;<em>Tracking melody in polyphonic audio</em>. MIREX 2008,&#8221; in Proc. Music Inf. Retrieval Evaluation eXchange, 2008</div>
<div><a href="http://iie.fing.edu.uy/publicaciones/2010/CLR10/">[4] &#8220;FAN CHIRP TRANSFORM FOR MUSIC REPRESENTATION&#8221;</a>, P. Cancela, E. Lopez, M. Rocamora,  DAFx 2010.</div>
<div>(&#8220;F0-gram&#8221;, i.e. GlogS, discussed in chapter 4)</div>
<p><strong>Related Methods:<br />
</strong>.Harmonic Product Spectrum<br />
.Harmonic Sum Spectrum</p>
</div>
]]></content:encoded>
			<wfw:commentRss>http://signalprocessingideas.wordpress.com/2008/12/07/spectral-reindexing-for-pitch-estimation/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/f16f90ba8fd51b0cd618797744194e59?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">sigprocideas</media:title>
		</media:content>
	</item>
	</channel>
</rss>
