autocaptioning is just a perfect example of overfitting to the training data tbh
it thinks "Al Cove" is something someone is likely to say because "Al" and "Cove" on their own are each more common than "alcove", even though "Al Cove" is just a nonsense transcription
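
rough toy sketch of what i mean, not how any real captioning system actually works: the counts are made up and `toy_counts` / `score` are hypothetical names. if you score each candidate transcription only by how common its individual words are, the split version wins:

```python
# made-up counts, purely for illustration (not from any real corpus)
toy_counts = {"al": 900, "cove": 400, "alcove": 30}

def score(words):
    # a candidate is only as "likely" as its rarest word; unseen words count as 0
    return min(toy_counts.get(w, 0) for w in words)

# two ways to transcribe the same sound
candidates = [["alcove"], ["al", "cove"]]
best = max(candidates, key=score)
print(" ".join(best))  # prints "al cove"
```

nothing in that scoring checks whether the words make sense next to each other, which is the whole problem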