18.7.2023
9 months ago by Jean-Louis Queguiner
What's new?
- Live transcription latency improved in some cases, with average latency reduced by up to 50%. Maximum latency for final frames remains 800 ms.
- Added parameters to guide the diarization model for better accuracy:
  - num_speakers: forces the diarization to find a specific number of speakers in the audio
  - min_speakers, max_speakers: place lower and/or upper bounds on the number of speakers in the audio
- Added transcription_hint parameter to live transcription; it can be used, for example, to provide a custom vocabulary
- Live transcription now accepts audio sent either in binary format or as Base64
- Added support for MP3 lavf
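
As an illustration of how the new diarization parameters relate to each other, here is a minimal sketch of a client-side helper that assembles them before a request is made. The function name `diarization_options` is hypothetical, and the rule that num_speakers should not be combined with min_speakers/max_speakers is an assumption of this sketch, not documented API behavior.

```python
from typing import Dict, Optional


def diarization_options(
    num_speakers: Optional[int] = None,
    min_speakers: Optional[int] = None,
    max_speakers: Optional[int] = None,
) -> Dict[str, int]:
    """Collect diarization hints into a dict, rejecting contradictory input.

    Hypothetical helper: only the three parameter names come from the
    release notes; the validation rules are assumptions of this sketch.
    """
    given = {
        "num_speakers": num_speakers,
        "min_speakers": min_speakers,
        "max_speakers": max_speakers,
    }
    for name, value in given.items():
        if value is not None and value < 1:
            raise ValueError(f"{name} must be at least 1")

    # Assumption: an exact count and a range are treated as mutually exclusive.
    if num_speakers is not None and (min_speakers is not None or max_speakers is not None):
        raise ValueError("use either num_speakers or min_speakers/max_speakers")

    if min_speakers is not None and max_speakers is not None and min_speakers > max_speakers:
        raise ValueError("min_speakers cannot exceed max_speakers")

    # Only send the parameters that were actually set.
    return {name: value for name, value in given.items() if value is not None}
```

For example, `diarization_options(min_speakers=2, max_speakers=4)` yields a dict with just the two bounds, which can then be merged into whatever request payload the client builds.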
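
To make the trade-off between the two live audio formats concrete, the sketch below prepares the same audio chunk both ways using only the Python standard library. The JSON field name `frames` and both function names are hypothetical placeholders; the actual message layout expected by the live endpoint is not described in these notes.

```python
import base64
import json


def frame_as_base64(chunk: bytes) -> str:
    """Wrap an audio chunk in a JSON text message.

    The "frames" key is a hypothetical field name used for illustration.
    """
    encoded = base64.b64encode(chunk).decode("ascii")
    return json.dumps({"frames": encoded})


def frame_as_binary(chunk: bytes) -> bytes:
    """Return the chunk unchanged, ready to be sent as a binary message."""
    return chunk


# Base64 maps every 3 input bytes to 4 output characters,
# so the encoded payload is about a third larger than the raw audio.
chunk = bytes(3000)
raw_size = len(frame_as_binary(chunk))
b64_size = len(base64.b64encode(chunk))
```

Binary keeps messages smaller, while Base64 is convenient when the transport or client library only handles text messages.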
Bug fixes
- Fixed issue with word timestamps sometimes showing a duration of 0 seconds
- Speaker labels are now assigned by order of appearance (e.g. the 1st speaker will be speaker 0, the 2nd will be speaker 1, etc.)
- Invalid parameters in the request now return the correct error
- Fixed rare blank screen after upgrading to pro on app.gladia.io