$0+

A Docker image with a Speech-to-Text webapp for Kazakh

Usage

  • download the kazakh-stt-cpu.7z file that we are providing
  • extract it (you might need to install 7-Zip)
  • open a terminal
  • in your terminal, run: docker load < kazakh-stt-cpu.tar
  • run: docker run -dp 8000:8000 taruen/kazakh-stt-cpu:latest
  • open a browser and go to the following page: http://localhost:8000/servlets/standalone.rkt

The attached screenshot shows what that page should look like.
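
The steps above can also be scripted. The following is a minimal sketch, not an official installer: it assumes that Docker is installed, that the 7-Zip command-line tool is available under the name `7z` (on some systems it is called `7za` or `7zz`), that `curl` is installed, and that the page answers with a success status once the app is up. The function names `start_app` and `wait_for_app` are made up for illustration. Because `docker run -d` returns before the web app is necessarily ready, the second function polls the page until it responds.

```shell
# File names, image tag, and URL as given in the usage steps.
ARCHIVE="kazakh-stt-cpu.7z"
TARBALL="kazakh-stt-cpu.tar"
IMAGE="taruen/kazakh-stt-cpu:latest"
APP_URL="http://localhost:8000/servlets/standalone.rkt"

# Extract the archive, load the image, and start the container in the background.
# Assumption: the 7-Zip CLI is installed as `7z`.
start_app() {
    7z x "$ARCHIVE" &&
    docker load < "$TARBALL" &&
    docker run -dp 8000:8000 "$IMAGE"
}

# Poll the web page until it answers; $1 is the number of attempts (default 30).
wait_for_app() {
    attempts="${1:-30}"
    i=0
    while [ "$i" -lt "$attempts" ]; do
        if curl -fsS -o /dev/null "$APP_URL"; then
            echo "ready: $APP_URL"
            return 0
        fi
        i=$((i + 1))
        sleep 2
    done
    echo "app did not respond at $APP_URL" >&2
    return 1
}
```

After loading these definitions into a shell started in the download folder, `start_app && wait_for_app` runs all the steps and prints the page address once the app responds.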

Limitations

This is a CPU-only image: it does not use a graphics processing unit (GPU), so transcription is slow. For example, on a Lenovo ThinkPad T440p laptop (Intel(R) Core(TM) i7-4700MQ CPU @ 2.40GHz, 12 GB of RAM), it takes about 6 minutes to recognize 30 seconds of speech.
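
To get a feel for what this means for longer recordings: 6 minutes (360 seconds) for 30 seconds of speech is roughly 12 seconds of processing per second of audio on that laptop. The sketch below turns this into a rough estimate. It assumes that processing time grows linearly with audio length, which has not been measured here, and `estimate_seconds` is a hypothetical helper, not part of the image.

```shell
# Rough estimate only: 360 s of processing per 30 s of speech => factor 12.
# Assumption: time scales linearly with audio length on comparable hardware.
SLOWDOWN=12

estimate_seconds() {
    # $1: audio length in whole seconds
    echo $(( $1 * SLOWDOWN ))
}

estimate_seconds 120   # a 2-minute recording: prints 1440 (about 24 minutes)
```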

GPU-enabled images will be provided later, so stay tuned.

Acknowledgements

Our product uses the ISSAI Kazakh Speech Corpus (https://doi.org/10.48342/gkg9-gn84), which is available under a Creative Commons Attribution 4.0 International License.

Source code / building blocks

A Docker image with a Speech-to-Text app for Kazakh that converts .wav files containing Kazakh speech into text.

input files: .wav
output: plain text in browser
supported languages: Kazakh
GPU support: no, this container is CPU-only
30-day money back guarantee