This website lets you interact with the metric described in the paper "Non-intrusive deep learning-based computational speech metrics with high-accuracy across a wide range of acoustic scenes".
There are 2 ways to interact with our metrics:
Using the Drag and Drop interface below. This works well if you only have a few files to evaluate.
By directly using the API endpoints. This is better if you have more files to evaluate.
If you have questions, comments or suggestions about the API, please reach out via email.
Try it!
Drag the audio files into the box, then press the submit button and wait for the results to be displayed. Depending on the current load and the number of files submitted, this may take anywhere from a few seconds to several minutes. Please note the restrictions listed below.
API Restrictions
The API currently has the following restrictions:
You can only submit single channel audio files (mono audio).
Each audio snippet must be at least 4 seconds and at most 20 seconds long.
Snippets shorter than 4 seconds cannot be submitted. Snippets longer than 4 seconds are processed in 4-second chunks and the resulting predictions are averaged. If the duration of a file is not a multiple of 4 seconds, the remainder is discarded. For example, if you submit an audio file that is 6 seconds long, the returned score covers the first 4 seconds only. If you submit an audio file that is 8 seconds long, the returned score is the average of the predictions for seconds 0 to 4 and 4 to 8.
The number of snippets you can submit to the API within a given time window is limited. The current limit is 1 hour of audio per rolling 24-hour window.
You can submit at most 15 files per single request.
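The chunking and batching rules above can be checked locally before you submit anything. The following is a minimal Python sketch, not the service's actual implementation: the function names (scored_chunks, average_score, batches) and constants are hypothetical and exist only for illustration. It shows which seconds of a file would contribute to the score, how per-chunk predictions combine, and how to split a long file list into requests of at most 15 files.

```python
CHUNK_SECONDS = 4.0          # chunk length stated in the restrictions
MIN_SECONDS = 4.0            # shortest accepted snippet
MAX_SECONDS = 20.0           # longest accepted snippet
MAX_FILES_PER_REQUEST = 15   # per-request file limit


def scored_chunks(duration_s):
    """Return the (start, end) ranges in seconds that contribute to the score.

    Hypothetical helper: mirrors the documented rule that only whole
    4-second chunks are scored and the remainder is discarded.
    """
    if not MIN_SECONDS <= duration_s <= MAX_SECONDS:
        raise ValueError(
            f"duration must be between {MIN_SECONDS} and {MAX_SECONDS} seconds"
        )
    n_chunks = int(duration_s // CHUNK_SECONDS)  # drop the trailing remainder
    return [(i * CHUNK_SECONDS, (i + 1) * CHUNK_SECONDS) for i in range(n_chunks)]


def average_score(chunk_scores):
    """Average per-chunk predictions into a single file-level score."""
    return sum(chunk_scores) / len(chunk_scores)


def batches(paths, size=MAX_FILES_PER_REQUEST):
    """Split a list of file paths into groups that each fit one request."""
    return [paths[i:i + size] for i in range(0, len(paths), size)]


if __name__ == "__main__":
    print(scored_chunks(6))   # a 6-second file: one chunk
    print(scored_chunks(8))   # an 8-second file: two chunks
    print(len(batches([f"file_{i}.wav" for i in range(40)])))
```

Running a check like this first can help avoid spending the daily audio quota on files that would be rejected or only partially scored. Note that a 7-second file yields the same single chunk as a 4-second one, so trimming or padding files to a multiple of 4 seconds makes sure all of the audio is evaluated.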