| 193 |
Huang, Jia-Hong; Zhu, Hongyi; Shen, Yixian; Rudinac, Stevan; Kanoulas, Evangelos |
Image2Text2Image: A Novel Framework for Label-Free Evaluation of Image-to-Text Generation with Text-to-Image Diffusion Models |
| 214 |
Wei, Wei; Zhang, Bingkun; Wang, Yibing |
TFCST: An Efficient Emotion Recognition Model Based on Deep Speech Analysis and Hierarchical Progressive Structure |
| 215 |
Wei, Wei; Zhang, Bingkun; Wang, Yibing |
TS-MEFM: A New Multimodal Speech Emotion Recognition Network Based on Speech and Text Fusion |
| 230 |
Matsuhira, Chihaya; Kastner, Marc A.; Komamizu, Takahiro; Hirayama, Takatsugu; Ide, Ichiro |
Quantifying Image-Adjective Associations by Leveraging Large-Scale Pretrained Models |
| 288 |
Li, Su; Wang, Liang; Wang, Jianye; Zhang, Ziheng; Zhang, Junjun; Zhang, Lei |
Enhanced Anomaly Detection in 3D Motion through Language-Inspired Occlusion-Aware Modeling |
| 364 |
C. Quan, Khanh-An; Guinaudeau, Camille; Satoh, Shin’ichi |
Evaluating VQA Models' Consistency in the Scientific Domain |