Weekly - Build a Real-time Speech Recognition System

I tried following along with the code from the June 15th weekly webinar with Vik, Real-Time Speech Recognition in Python, and am running into problems in the punctuation section (it seems to be related to how the model is implemented with torch). I’m using an M1 Mac and get this warning when I start the microphone. It shows up after each 20-second interval during recording, which lines up with the subprocess routine, and only when punctuation is actively running:

WARNING: reverting to cpu as cuda is not available
Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertModel: ['cls.predictions.transform.LayerNorm.bias', 'cls.predictions.decoder.weight', 'cls.predictions.transform.dense.weight', 'cls.seq_relationship.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.predictions.bias', 'cls.seq_relationship.bias']

  • This IS expected if you are initializing BertModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
  • This IS NOT expected if you are initializing BertModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
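
As far as I can tell, the "Some weights ... were not used" message is informational: loading a plain BertModel from bert-base-uncased drops the pretraining heads listed in the brackets. Below is a minimal sketch of how it could be silenced, assuming the punctuation code loads the model through the transformers library. The helper name quiet_transformers_warnings is made up for illustration, and this will probably not hide the "reverting to cpu" line if that one is printed by the punctuation package itself.

```python
import importlib.util
import os


def quiet_transformers_warnings() -> bool:
    """Hypothetical helper: lower transformers' log level to errors only.

    Returns True if transformers was found and its verbosity was changed.
    """
    # transformers reads this variable when it is imported; child processes
    # started afterwards inherit it, which matters if punctuation runs in a
    # subprocess (an assumption based on the setup described above).
    os.environ.setdefault("TRANSFORMERS_VERBOSITY", "error")

    if importlib.util.find_spec("transformers") is None:
        return False  # library not installed in this environment

    from transformers import logging as hf_logging

    hf_logging.set_verbosity_error()  # hide warning-level messages
    return True


if __name__ == "__main__":
    print("transformers verbosity lowered:", quiet_transformers_warnings())
```

Calling this once before the model is loaded should be enough; it only changes what is logged, not how the model behaves.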

I did have to install torch and transformers differently than the webinar showed, mostly because I’m running in a virtual Conda environment. That may also have something to do with it.

Does anyone know what is causing the warning and/or how to address it?
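
For anyone trying to reproduce this, here is a rough sketch of how to check which devices torch can see in a given environment. M1 Macs have no CUDA support, so I would expect the CUDA check to be False there; recent torch builds (1.12 and later) expose Apple's GPU through the "mps" backend instead. The function names pick_device and detect_device are my own and not from the webinar code, and whether the punctuation model can actually run on "mps" is something I have not verified.

```python
import importlib.util


def pick_device(cuda_ok: bool, mps_ok: bool) -> str:
    """Choose a device name, preferring CUDA, then Apple MPS, then CPU."""
    if cuda_ok:
        return "cuda"
    if mps_ok:
        return "mps"
    return "cpu"


def detect_device() -> str:
    """Report the best device the installed torch build can use."""
    if importlib.util.find_spec("torch") is None:
        return "cpu"  # torch is not installed in this environment

    import torch

    cuda_ok = torch.cuda.is_available()
    # torch.backends.mps only exists in newer builds, so look it up defensively.
    mps_backend = getattr(torch.backends, "mps", None)
    mps_ok = bool(mps_backend is not None and mps_backend.is_available())
    return pick_device(cuda_ok, mps_ok)


if __name__ == "__main__":
    print("torch would use:", detect_device())
```

If this reports "cpu" or "mps", the "reverting to cpu as cuda is not available" line is just the library noting that there is no NVIDIA GPU, which is normal on this hardware.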

I found this, but couldn’t get it to work (basically, torchaudio won’t install using these or any other commands):
conda create -n torch-nightly python=3.9.12
conda activate torch-nightly
pip install --pre torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/nightly/cpu