Automatic speech processing

EE-554

Whisper ASR Exercise


Description

In this exercise, you will work with OpenAI's open-source Whisper speech recognition model to explore the capabilities and limitations of modern Automatic Speech Recognition (ASR) technology.

The exercise is provided as a notebook that you can run on Google Colab.

You will test the model across various scenarios, evaluate its performance in each case, and analyze its strengths and weaknesses. By the end of the exercise, you will have gained insight into the current state of ASR, identified areas where Whisper struggles, and considered potential improvements for future models.
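
As an illustration of what evaluating performance in each scenario might look like, here is a minimal sketch of word error rate (WER), a metric commonly used for ASR. The choice of WER, the function name `word_error_rate`, and the simple lowercase and whitespace normalization are assumptions made for this example. The notebook may use a different metric or a different text normalization.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Illustrative helper (not taken from the exercise notebook). `reference`
    is the ground-truth transcript; `hypothesis` is the text produced by the
    ASR model, for example a Whisper transcription.
    """
    # Assumed normalization: lowercase and split on whitespace only.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substituted word out of six reference words.
    print(word_error_rate("the cat sat on the mat", "the cat sat on a mat"))
```

Computing this score separately for each test scenario gives one number per condition, which makes it easier to compare where the model does well and where it struggles. Note that WER can exceed 1.0 when the hypothesis contains many inserted words.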