Simple Text to Speech using Xamarin for Android #2

As a follow-up to my introduction to basic text to speech for Android, let’s look in more detail at how to detect the state of our speaker class. For example, how do we find out whether the speaker has finished reading out an utterance?

Let me introduce you to the concept of an UtteranceProgressListener. Technically, this is an abstract class provided as part of Android’s TextToSpeech framework. We can derive from it and register an instance with our speaker class in order to be informed about the state of any utterance.

Let’s start with the simplest (yet useless) implementation:

public class SpeakerProgressListener : UtteranceProgressListener
{
	public override void OnStart(string utteranceId)
	{
	}

	public override void OnDone(string utteranceId)
	{
	}

	public override void OnError(string utteranceId)
	{
	}
}

As you can see, this class overrides three simple methods that are called when an utterance starts being spoken, when it has finished, and when an error occurred, respectively. We can now react to any one (or all) of these events. Let’s start with simply being notified whenever the text-to-speech engine has finished speaking an utterance.

To do so, we implement the OnDone method to invoke a (custom) event:

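using System;
using Android.Speech.Tts;

public class SpeakerProgressListener : UtteranceProgressListener
{
	// Raised with the utterance identifier once the engine has finished speaking it.
	public event EventHandler<string> Finished;
	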
	public override void OnStart(string utteranceId)
	{
	}

	public override void OnDone(string utteranceId)
	{
		Finished?.Invoke(this, utteranceId);
	}

	public override void OnError(string utteranceId)
	{
	}
}

This snippet should be pretty self-explanatory. Note that the finished utterance’s identifier is simply passed on to the event (remember that this is the identifier we specified when calling the Speak method). Alternatively, we could inspect the identifier right inside the listener and react to it immediately, e.g. by invoking different events.
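
The alternative of reacting to the identifier inside the listener could look roughly like the following sketch. The identifiers "greeting" and "farewell" and the two event names are hypothetical; they only have an effect if the very same strings are passed as utterance identifiers to the Speak method.

```csharp
using System;
using Android.Speech.Tts;

public class RoutingProgressListener : UtteranceProgressListener
{
	// Hypothetical identifiers; they must match the IDs passed to Speak.
	public const string GreetingId = "greeting";
	public const string FarewellId = "farewell";

	// Hypothetical events, one per kind of utterance.
	public event EventHandler GreetingFinished;
	public event EventHandler FarewellFinished;

	public override void OnStart(string utteranceId)
	{
	}

	public override void OnDone(string utteranceId)
	{
		// Decide which event to raise based on the finished utterance's identifier.
		switch (utteranceId)
		{
			case GreetingId:
				GreetingFinished?.Invoke(this, EventArgs.Empty);
				break;
			case FarewellId:
				FarewellFinished?.Invoke(this, EventArgs.Empty);
				break;
			// Any other identifier is ignored in this sketch.
		}
	}

	public override void OnError(string utteranceId)
	{
	}
}
```

Whether this is preferable to a single Finished event is largely a matter of taste: routing inside the listener keeps consumers simple, but it also ties the listener to a fixed set of identifiers.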

The open question is how to wire this up with the existing code. We have to take care of two things:

First, we need to tell our TextToSpeech instance to use our progress listener for delivering detailed information. Fortunately, there is a single method for that:

_speaker.SetOnUtteranceProgressListener(listener);

Second, we need to consume the Finished event. All of this should be done as soon as the text-to-speech engine’s initialization has finished successfully, which means we add our code to the success branch of the OnInit method:

public void OnInit(OperationResult status)
{
	if (status == OperationResult.Success)
	{
		var listener = new SpeakerProgressListener();
		listener.Finished += (sender, utteranceId) =>
		{
			// TODO
		};
		_speaker.SetOnUtteranceProgressListener(listener);
	}
	else
	{
		// TODO
	}
}
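
To see the pieces working together, here is a minimal sketch of a speaker class that uses the SpeakerProgressListener from above. The overall class shape, the UtteranceFinished event and the Speak wrapper are assumptions made for illustration and may differ from the class built in the previous post; the four-argument Speak overload used here requires API level 21 or later. Since Android may invoke the listener’s callbacks on a thread other than the UI thread, the sketch posts the notification to the main looper before raising its own event.

```csharp
using System;
using Android.Content;
using Android.OS;
using Android.Speech.Tts;

// Hypothetical stand-in for the speaker class from the previous post.
public class Speaker : Java.Lang.Object, TextToSpeech.IOnInitListener
{
	private readonly TextToSpeech _speaker;
	private readonly Handler _mainThread = new Handler(Looper.MainLooper);

	// Hypothetical event, raised on the main thread once an utterance is done.
	public event EventHandler<string> UtteranceFinished;

	public Speaker(Context context)
	{
		// The engine reports its initialization result via OnInit.
		_speaker = new TextToSpeech(context, this);
	}

	public void OnInit(OperationResult status)
	{
		if (status == OperationResult.Success)
		{
			var listener = new SpeakerProgressListener();
			listener.Finished += (sender, utteranceId) =>
			{
				// Hand over to the main thread before notifying consumers.
				_mainThread.Post(() => UtteranceFinished?.Invoke(this, utteranceId));
			};
			_speaker.SetOnUtteranceProgressListener(listener);
		}
		else
		{
			// Error handling is left out of this sketch.
		}
	}

	public void Speak(string text, string utteranceId)
	{
		// The utterance ID passed here is the one reported back to OnDone.
		_speaker.Speak(text, QueueMode.Flush, null, utteranceId);
	}
}
```

A consumer could then subscribe to UtteranceFinished and call, say, Speak("Hello", "greeting"); the handler would receive "greeting" once the engine has finished speaking.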

As mentioned already, in production-ready projects you’ll want to add proper error handling to the else branch of the OnInit method.