The idea:
Although the unvoiced sounds of speech that dominate at high frequencies (“s”, “z”, “f”, etc.) give a very clear DoA estimate, the real information is still hidden in the less clear, confusing micro-periodicities of the autocorrelation. These micro-periodicities carry information not only about the DoA of the source they belong to, but also about its pitch. This means that even with two moving acoustic sources that are not always active, each source's identity can be tied to its position by linking its pitch to its DoA.
Method Description:
Decomposing a frame of a 2-channel acoustic signal into a Position-Pitch (PoPi) plane shows clearly where the source is (its DoA angle) and what the pitch of that speaker is (obtained from the correlation lag). The image below shows a PoPi plane extracted from a 16-channel circular microphone array signal for a voiced speech frame.
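To make the decomposition more concrete, here is a minimal sketch of one way a Position-Pitch plane could be computed for a single 2-channel frame. It is not taken from the publication below; it only assumes that a candidate DoA maps to an inter-channel delay (far-field model, two microphones) and that a candidate pitch maps to a period, so that the cross-correlation can be sampled at the delay plus and minus a few periods. The function name `popi_plane`, its parameters, the microphone spacing, and the pulse-train test signal are all hypothetical.

```python
import numpy as np


def popi_plane(ch0, ch1, fs, mic_distance, angles_deg, pitches_hz,
               n_periods=2, c=343.0):
    """Hypothetical sketch of a Position-Pitch plane for one 2-channel frame.

    Each (angle, pitch) cell sums the cross-correlation sampled at
    tdoa(angle) + k * period(pitch) for k = -n_periods .. n_periods.
    Positive angles mean the sound reaches ch0 before ch1.
    """
    n = len(ch0)
    # Full cross-correlation; output index n - 1 corresponds to lag 0,
    # and a delay of d samples in ch1 produces a peak at lag +d.
    xcorr = np.correlate(ch1, ch0, mode="full")
    lags = np.arange(-(n - 1), n)
    ks = np.arange(-n_periods, n_periods + 1)

    plane = np.zeros((len(angles_deg), len(pitches_hz)))
    for i, angle in enumerate(angles_deg):
        # Assumed far-field delay between the two microphones, in samples.
        tdoa = mic_distance * np.sin(np.deg2rad(angle)) / c * fs
        for j, f0 in enumerate(pitches_hz):
            period = fs / f0  # pitch period in samples
            # Linear interpolation handles non-integer lags.
            plane[i, j] = np.interp(tdoa + ks * period, lags, xcorr).sum()
    return plane


# Toy frame: a pulse train as a crude stand-in for voiced speech.
fs = 16000
c = 343.0
mic_distance = 10 * c / fs          # assumed spacing: max delay of 10 samples
ch0 = np.zeros(1024)
ch0[::80] = 1.0                     # period of 80 samples -> 200 Hz pitch
ch1 = np.zeros_like(ch0)
ch1[5:] = ch0[:-5]                  # ch1 lags ch0 by 5 samples -> 30 degrees

angles = np.arange(-90, 91, 5)      # candidate DoA angles in degrees
pitches = np.arange(150, 301, 10)   # candidate pitches in Hz
plane = popi_plane(ch0, ch1, fs, mic_distance, angles, pitches)

i, j = np.unravel_index(np.argmax(plane), plane.shape)
est_angle, est_pitch = angles[i], pitches[j]
print(est_angle, est_pitch)
```

For this synthetic frame the maximum of the plane falls at the simulated values, 30 degrees and 200 Hz. The pitch grid deliberately excludes 100 Hz: a sub-harmonic of the true pitch also lines up with the correlation peaks, so a real tracker would need some way to resolve that ambiguity.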

[1] M. Képesi, F. Pernkopf, M. Wohlmayr, “Joint Position-Pitch Tracking for 2-Channel Audio,” CBMI 2007, June 25-27, Bordeaux, France.
More publications under the Multichannel PoPi topic.
Links:
[1] Related page at SPSC
[2] PoPi Demo Videos and Audio Files
[3] Elmar using the method for controlling the orientation of his robot.