A Docker image with a Speech-to-Text webapp for Kazakh
$0+
Илнар Сәлимҗанов
Usage
- download the kazakh-stt-cpu.7z file that we are providing
- extract it (you might need to install 7-zip)
- open a terminal
- in your terminal, run: docker load < kazakh-stt-cpu.tar
- run: docker run -dp 8000:8000 taruen/kazakh-stt-cpu:latest
- open a browser and go to the following page: http://localhost:8000/servlets/standalone.rkt
The attached screenshot shows what that page should look like.
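The steps above can be sketched as a single script. This is a minimal sketch, not an official installer: it assumes the `7z` and `docker` commands are installed and that the archive sits in the current directory. The `run` and `setup` helpers and the `DRY_RUN` switch are hypothetical names introduced here for illustration. `docker load -i FILE` is used as an equivalent of `docker load < FILE`, and `-d -p` is the long form of `-dp`.

```shell
#!/bin/sh
# Sketch of the usage steps as one script. By default this is a dry run that
# only prints the commands; set DRY_RUN=0 to actually execute them.
# Assumes `7z` and `docker` are installed and on PATH.
set -eu

ARCHIVE="kazakh-stt-cpu.7z"            # file downloaded from this page
TARBALL="kazakh-stt-cpu.tar"           # produced by extracting the archive
IMAGE="taruen/kazakh-stt-cpu:latest"
PORT=8000

# run CMD ARGS...: print the command, and execute it unless in dry-run mode
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

# setup: extract the archive, load the image, start the container
setup() {
    run 7z x "$ARCHIVE"
    run docker load -i "$TARBALL"
    run docker run -d -p "$PORT:$PORT" "$IMAGE"
    echo "Now open http://localhost:$PORT/servlets/standalone.rkt"
}

setup
```

Saved as, say, `setup.sh` (a hypothetical file name), running `sh setup.sh` prints the three commands without executing them, and `DRY_RUN=0 sh setup.sh` runs them for real.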
Limitations
This is a CPU-only image, and transcription without a graphics processing unit (GPU) is slow. For example, on a Lenovo ThinkPad T440p laptop (Intel(R) Core(TM) i7-4700MQ CPU @ 2.40GHz, 12GB of RAM), it takes about 6 minutes to recognize 30 seconds of speech, i.e. roughly 12 times the length of the audio.
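To get a feel for what that means for longer recordings, here is a rough back-of-the-envelope estimate. It rests on a single data point (about 6 minutes for 30 seconds of speech on the laptop above) and assumes, without any measurement to back it, that processing time grows linearly with audio length. The `estimate_seconds` helper and the `RTF` variable are hypothetical names used only for this illustration.

```shell
#!/bin/sh
# Rough estimate only: assumes transcription time scales linearly with audio
# length, based on one measurement (30 s of speech in about 6 min = 360 s).
RTF=12   # 360 s of processing / 30 s of audio

# estimate_seconds AUDIO_SECONDS: print the estimated processing time in seconds
estimate_seconds() {
    echo $(( $1 * RTF ))
}

estimate_seconds 30    # prints 360  (the 6 minutes quoted above)
estimate_seconds 300   # prints 3600 (about an hour for a 5-minute recording)
```

Actual times will differ depending on the CPU and the recording.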
GPU-enabled images will be provided later, so stay tuned.
Acknowledgements
Our product uses the ISSAI Kazakh Speech Corpus (https://doi.org/10.48342/gkg9-gn84), which is available under a Creative Commons Attribution 4.0 International License.
Source code / building blocks
A Docker image with a Speech-to-Text app for Kazakh that converts .wav files with Kazakh speech into text.
input files: .wav
output: plain text in browser
supported languages: Kazakh
GPU support: no, this container is CPU only