ISSAI
Popular repositories
- Kazakh_TTS Public
An expanded version of the previously released Kazakh text-to-speech (KazakhTTS) synthesis corpus. In KazakhTTS2, the overall size has increased from 93 hours to 271 hours, the number of speakers h…
- SpeakingFaces Public
A large-scale publicly available visual-thermal-audio dataset designed to encourage research in the general areas of user authentication, facial recognition, speech recognition, and human-computer …
- ISSAI_SAIDA_Kazakh_ASR Public
The first industrial-scale open-source Kazakh speech corpus. The KSC2 corpus subsumes the two previously introduced corpora, KSC and KazakhTTS2, and supplements them with additional data from other sources. KSC2 …
- thermal-facial-landmarks-detection Public
SF-TL54: Thermal Facial Landmark Dataset with Visual Pairs.
Repositories
- Enhancing-Ambient-Assisted-Living-with-Multi-Modal-Vision-and-Language-Models Public
This project aims to detect abnormal behaviour or emergency cases using a vision-language model (VLM), a large language model (LLM), a human detection model, and text-to-speech (TTS) and speech-to-text (STT) models. The framework can detect subtle signs of an emergency and actively interact with the user to make an accurate decision.
- OpenThermalPose Public
An Open-Source Annotated Thermal Human Pose Dataset and Initial YOLOv8-Pose Baselines
- thermal-facial-landmarks-detection Public
SF-TL54: Thermal Facial Landmark Dataset with Visual Pairs.