If you're looking to transcribe audio to text, there are a variety of options at your service. Certain word processors have built-in voice recorders that transcribe as you speak, while specially designed apps and online editors like Transcribe allow you to upload audio files and get an automatic transcription within minutes.

I'm looking for a way to convert an audio recording to a text transcript. I've been digging through various options, but very few of those recommended on sites I find from a web search are packaged in the repos or AUR. They would seem to require a lot of work to package / build, and it's not really clear if they'd even meet my needs. Some that are packaged in the AUR have multiple AUR dependencies, some of which fail to build (e.g., sphinxbase).

The one recommendation that keeps coming up is Julius, which is in the repos - but I can't find any useful documentation (there is no man page and the -help output is not helpful). There is a BOOK for Julius which seems targeted more toward those who want to do research on the machine learning models it uses.

I just need a tool to take a wave file and produce text. I don't mind doing some scripting to achieve this, but I need to know how to call the tool to achieve the end result. I don't intend to develop and train machine learning models (as the speaker in the audio may change from one use to the next). I don't need high quality - just a rough outline of the discussion in the recording will be sufficient.

Does anyone have recommendations for tools that may achieve this goal? And if Julius is a good option, how is it used?

I found one example in the Julius github page that was giving an example of using it specifically for my goals: to convert a wav file to text. Their example showed this command line:

    ffmpeg -i /tmp/new.wav -ar 16000 -ac 1 /tmp/test.wav
    pocketsphinx_continuous -infile /tmp/test.wav >/tmp/out.txt

This resulted in a text file with text of the dialog in the test input. It was pretty choppy, but it was recognizable, and the test input was the audio channel from a downloaded movie with A LOT of background noise / music / effects. This is not the intended use, but it was a sample file I had handy to test with. So if pocketsphinx produces some recognizable dialog from a noisy movie, it should do very well with recordings I plan to take of meetings.

I may upload the above PKGBUILDs to the AUR soon, but I need to double check the dependencies in each first. Any other feedback on the PKGBUILDs would of course be welcome.

EDIT: scratch the above - sphinx runs well, but I just tested it on a voice recording made under ideal conditions, and it gave English word-salad as a result, not at all representative of the input audio. Perhaps I gave it too much credit with the movie example. So I'm still on the hunt for something that will actually work.

I've been finding lots of Julius step-by-step guides where the authors solve one error after another only to get it to run but not produce any output (its "search" fails to match the audio to any English text) - I can replicate this. I'm starting to think it will be a much better use of my time to just manually transcribe my recordings myself than to bother with these over-hyped but not-actually-functional (or documented) speech recognition tools.

EDIT: I've just tested Vosk in a virtualenv and it performs very well on the real samples that I've tested. I'll next see if I can package it, as it's not available in the repo/AUR. It works from the virtualenv, but I'm clueless on how to package it - I've been hitting one dead end after another for the past few hours. I'll either just use it from a virtualenv or just not bother.
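For anyone who wants to script the ffmpeg and pocketsphinx_continuous commands quoted in the post, here is a minimal sketch that chains them from Python. The function names and the default temporary path are made up for illustration. The `-y` flag (overwrite the output without prompting) is my addition and not part of the quoted command. It assumes both tools are installed and on the PATH.

```python
import subprocess


def ffmpeg_command(src, dst, rate=16000, channels=1):
    """Build the resampling call from the post (16 kHz, mono by default)."""
    # "-y" is an addition for scripting: overwrite dst without asking.
    return ["ffmpeg", "-y", "-i", str(src),
            "-ar", str(rate), "-ac", str(channels), str(dst)]


def pocketsphinx_command(wav):
    """Build the recognizer call from the post; it prints text to stdout."""
    return ["pocketsphinx_continuous", "-infile", str(wav)]


def transcribe(src, out_txt, tmp_wav="/tmp/test.wav"):
    """Resample src, run the recognizer, and write its stdout to out_txt."""
    # check=True raises CalledProcessError if either tool exits non-zero.
    subprocess.run(ffmpeg_command(src, tmp_wav), check=True)
    with open(out_txt, "w") as out:
        subprocess.run(pocketsphinx_command(tmp_wav), stdout=out, check=True)
```

Calling `transcribe("/tmp/new.wav", "/tmp/out.txt")` should reproduce the two quoted commands. Keeping the command builders separate from the runner makes it easy to print the commands and check them before anything is executed.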
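For the Vosk-in-a-virtualenv route, a rough sketch of a WAV-to-text script could look like the one below. It assumes the `vosk` package has been installed with pip inside the virtualenv and that a model directory has been downloaded separately. `transcribe_wav`, `text_from_results` and the `model_path` argument are illustrative names, not anything from the post. The input is assumed to be mono 16-bit PCM, which is what the `-ac 1` ffmpeg conversion shown in the post would normally produce.

```python
import json
import wave


def text_from_results(results):
    """Join the "text" fields of Vosk's JSON result strings, skipping empty ones."""
    parts = (json.loads(r).get("text", "") for r in results)
    return " ".join(p for p in parts if p)


def transcribe_wav(wav_path, model_path):
    """Return a rough transcript of a mono 16-bit PCM WAV file."""
    with wave.open(wav_path, "rb") as wf:
        if wf.getnchannels() != 1 or wf.getsampwidth() != 2:
            raise ValueError("expected mono 16-bit PCM WAV input")

        # Imported late so the checks above work without Vosk installed.
        from vosk import Model, KaldiRecognizer

        rec = KaldiRecognizer(Model(model_path), wf.getframerate())
        results = []
        while True:
            data = wf.readframes(4000)
            if not data:
                break
            # AcceptWaveform returns True when an utterance is complete.
            if rec.AcceptWaveform(data):
                results.append(rec.Result())
        results.append(rec.FinalResult())
    return text_from_results(results)
```

Something like `print(transcribe_wav("/tmp/test.wav", "/path/to/model"))` run inside the virtualenv would then print the transcript. The model path here is a placeholder.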