18.7.2023

🚀 What's new?

  • Improved live transcription latency in some cases, reducing average latency by up to 50%. Max latency for final frames remains 800ms.
  • Added parameters to guide the diarization model for better accuracy:
    • num_speakers - forces the diarization to find a specific number of speakers in the audio
    • min_speakers, max_speakers - places upper and/or lower bounds for the number of speakers in the audio
  • Added transcription_hint parameter to live transcription, for example to provide a custom vocabulary
  • Live transcription now supports the choice between sending audio in binary format or Base64
  • Added support for MP3 lavf
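
As a rough illustration of the new options above, the sketch below collects the speaker-count hints into a request payload and prepares an audio chunk in either binary or Base64 form. The helper names (diarization_options, encode_audio_chunk) and the client-side checks are assumptions made for this example and are not part of the API; only the parameter names num_speakers, min_speakers and max_speakers come from the notes above.

```python
import base64
from typing import Optional, Union


def diarization_options(
    num_speakers: Optional[int] = None,
    min_speakers: Optional[int] = None,
    max_speakers: Optional[int] = None,
) -> dict:
    """Collect the speaker-count hints that were set (hypothetical helper)."""
    hints = {
        "num_speakers": num_speakers,
        "min_speakers": min_speakers,
        "max_speakers": max_speakers,
    }
    # Keep only the hints the caller actually provided.
    options = {name: value for name, value in hints.items() if value is not None}

    # Assumed sanity checks, done locally before sending anything.
    for name, value in options.items():
        if value < 1:
            raise ValueError(f"{name} must be at least 1, got {value}")
    if (
        min_speakers is not None
        and max_speakers is not None
        and min_speakers > max_speakers
    ):
        raise ValueError("min_speakers cannot exceed max_speakers")
    return options


def encode_audio_chunk(chunk: bytes, as_base64: bool) -> Union[bytes, str]:
    """Return the chunk as raw bytes, or as Base64 text (hypothetical helper)."""
    if as_base64:
        return base64.b64encode(chunk).decode("ascii")
    return chunk
```

For instance, diarization_options(min_speakers=2, max_speakers=4) yields a dictionary with just those two bounds, which could then be merged into the request parameters.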

🛠️ Bug fixes

  • Fixed issue with word timestamps sometimes showing a duration of 0 seconds
  • Speaker labels are now assigned in order of appearance (e.g. the 1st speaker will be
    speaker 0, the 2nd speaker 1, etc.)
  • Invalid parameters in the request now return the correct error
  • Fixed rare blank screen after upgrading to pro on app.gladia.io