Word accuracy
Word accuracy (WA) is a measure for assessing the accuracy of a speech recognition system. The word sequence recognized by the system is compared with the word sequence that was actually spoken, and the accuracy is determined from the deviations between the two.
Together with the recognition speed, which is usually specified as a real-time factor (RTF), and the word recognition rate, word accuracy is an essential measure for assessing the quality of a speech recognition system.
The word accuracy is computed as

$$\mathrm{WA} = \frac{N - S - D - I}{N}$$

where N is the number of words in the reference (the spoken sentence), I is the number of inserted words, D is the number of deleted words, and S is the number of substituted words. If the recognizer inserts many words, the accuracy can become negative.
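The definition can be sketched in a few lines of Python. This is only an illustration, assuming the usual form WA = (N - S - D - I) / N; the function name `word_accuracy` and its parameter names are hypothetical and not taken from any particular toolkit.

```python
def word_accuracy(n: int, s: int, d: int, i: int) -> float:
    """Return WA = (N - S - D - I) / N for a reference of n words.

    n: words in the reference, s: substitutions,
    d: deletions, i: insertions (all assumed to be counted beforehand).
    """
    if n <= 0:
        raise ValueError("the reference must contain at least one word")
    return (n - s - d - i) / n


# Many insertions push the accuracy below zero:
# a 4-word reference with 6 inserted words gives (4 - 6) / 4.
print(word_accuracy(n=4, s=0, d=0, i=6))  # -0.5
```

Because insertions are not bounded by the length of the reference, the numerator can fall below zero, which is why the value is not a percentage in the strict sense.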
The alignment of the two word sequences is usually computed with dynamic programming.
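One common way to compute such an alignment is a word-level minimum edit distance (Levenshtein) table followed by a backtrace. The sketch below assumes unit costs for substitutions, deletions and insertions; the function name `count_errors` is hypothetical, and real scoring tools may use different costs or tie-breaking rules.

```python
def count_errors(reference, hypothesis):
    """Align two word lists by minimum edit distance; return (S, D, I)."""
    n, m = len(reference), len(hypothesis)
    # cost[r][h] = minimal number of edits turning reference[:r] into hypothesis[:h]
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for r in range(1, n + 1):
        cost[r][0] = r  # only deletions
    for h in range(1, m + 1):
        cost[0][h] = h  # only insertions
    for r in range(1, n + 1):
        for h in range(1, m + 1):
            diff = 0 if reference[r - 1] == hypothesis[h - 1] else 1
            cost[r][h] = min(
                cost[r - 1][h - 1] + diff,  # match or substitution
                cost[r - 1][h] + 1,         # deletion
                cost[r][h - 1] + 1,         # insertion
            )
    # Backtrace from the bottom-right corner to count each error type.
    s = d = i = 0
    r, h = n, m
    while r > 0 or h > 0:
        if r > 0 and h > 0:
            diff = 0 if reference[r - 1] == hypothesis[h - 1] else 1
            if cost[r][h] == cost[r - 1][h - 1] + diff:
                s += diff
                r, h = r - 1, h - 1
                continue
        if r > 0 and cost[r][h] == cost[r - 1][h] + 1:
            d += 1
            r -= 1
        else:
            i += 1
            h -= 1
    return s, d, i


# "b" is substituted by "x" and "d" is deleted.
print(count_errors("a b c d".split(), "a x c".split()))  # (1, 1, 0)
```

When several alignments share the same minimal cost (for example, two swapped neighbouring words can be scored as one deletion plus one insertion or as two substitutions), the split between S, D and I depends on the tie-breaking order in the backtrace. With unit costs their sum, and therefore the WA, is the same in either case.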
A small example shows the calculation of the WA:
Spoken sentence | Once | argued | themselves | North wind | | and | Sun.
---|---|---|---|---|---|---|---
Recognized words | First | argued | | North wind | themselves | and | Sun.
Type of error | S | | D | | I | |
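In the example there is one substitution, one deletion and one insertion, so S = D = I = 1, and the spoken sentence has N = 6 words (each table cell counts as one word). This results in a WA of (6 - 1 - 1 - 1) / 6 = 50%.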