Poster Sessions

To Presenters

  • Please set up your poster after 1:00 PM and before your poster session starts.

Day 1: 8 January 14:00 – 15:30

Poster ID Paper ID Paper Title Authors
PS1-1 120 Quantized-ViT Efficient Training via Fisher Matrix Regularization Shang, Yuzhang; Liu, Gaowen; Kompella, Ramana; Yan, Yan
PS1-2 121 Saliency based data augmentation for few-shot video action recognition Kong, Yongqiang; Wang, Yunhong; Li, Annan
PS1-3 128 Hybrid Scalable Video Coding with Neural Compression and Enhancement for Streaming Media Ye, Yuyao; Yang, Jiayu; Zhao, Yang; Gao, Mengping; Cao, Hongbin; Wang, Ronggang
PS1-4 130 Pubic Symphysis-Fetal Head Segmentation Network Using BiFormer Attention Mechanism and Multipath Dilated Convolution Cai, Pengzhou; Jiang, Lu; Li, Yanxin; Liu, Xiaojuan; Lan, Libin
PS1-5 131 DART: Depth-Enhanced Accurate and Real-Time Background Matting Li, Guofeng; Li, Hanxi; Li, Bo; Wu, Lin; Cheng, Yan
PS1-6 141 MLP-AMDC: A MLP Architecture for Adaptive-Mask-based Dual-Camera snapshot hyperspectral imaging Cai, Zeyu; Chen, Xunhao; Zhang, Can; Chen, Yuchong; Yang, Jiming; Shi, Wubin; Jin, Chengqian; Da, Feipeng
PS1-7 144 Kiite World: Socializing Map-Based Music Exploration Through Playlist Sharing and Synchronized Listening Tsukuda, Kosetsu; Takahashi, Takumi; Ishida, Keisuke; Hamasaki, Masahiro; Goto, Masataka
PS1-8 146 Enhancing Environmental Monitoring through Multispectral Imaging: The WasteMS Dataset for Semantic Segmentation of Lakeside Waste Zhu, Qinfeng; Weng, Ningxin; Fan, Lei; Cai, Yuanzhi
PS1-9 158 Frequency-aware Convolution for Sound Event Detection Song, Tao; Zhang, Wenwen
PS1-10 163 MSD-YOLO: An efficient algorithm for small target detection Liu, Dongyu; Zhu, Yuan; Liu, Rui; Xing, Zhecong; Geng, Weiyang; Wang, Yanqiang
PS1-11 166 Robust Active Speaker Detection in Challenging Environments Using GNN-Fused Multi-Modal Cues and Body Language Li, Yongqian; Luo, Yong; Zhou, Xin
PS1-12 172 Intra-Class Compact Facial Expression Recognition Based on Amplitude Phase Separation Tian, Xiang; Zhang, Yuan; Mu, Chang; Zhang, Ziyang
PS1-13 176 PA2Net: Pyramid Attention Aggregation Network for Saliency detection Yu, Jizhe; Liu, Yu; Wu, Xiaoshuai; Xu, Kaiping; Li, Jiangquan
PS1-14 188 LIESA: Low-light Image Enhancement with Semantic Awareness Zhang, Jingyao; Hao, Shijie; Sun, Fuming; Rao, Yuan
PS1-15 195 Deep Dual Internal Learning for Hyperspectral Image Super-Resolution Sun, Yongqing; Liu, Hong; Chang, Qiong; Han, Xianhua
PS1-16 198 Zero-shot sketch-based image retrieval with hybrid information fusion and sample relationship modeling Wu, Weijie; Li, Jun; Wu, Zhijian; Xu, Jianhua
PS1-17 206 The Right to an Explanation under the GDPR and the AI Act Juliussen, Bjørn Aslak
PS1-18 221 Improving singing voice transcription generalization with AI generated accompaniments Perez, Miguel; Kirchhoff, Holger; Grosche, Peter; Serra, Xavier
PS1-19 228 LITA: LMM-guided Image-Text Alignment for Art Assessment Sunada, Tatsumi; Shiohara, Kaede; Xiao, Ling; Yamasaki, Toshihiko
PS1-20 229 Towards Inclusive Education: Multimodal Classification of Textbook Images for Accessibility Yadav, Saumya; Lincker, Élise; Huron, Caroline; Martin, Stéphanie; Guinaudeau, Camille; Satoh, Shin’ichi; Shukla, Jainendra
PS1-21 296 GWUNet: A UNet with Gated Attention and Improved Wavelet Transform for Thyroid Nodules Segmentation Zheng, Shuijing; Yu, Suxi; Wang, Yi; Wen, Jing
PS1-22 111 SCLSTE: Semi-Supervised Contrastive Learning-Guided Scene Text Editing Yin, Min; Xie, Liang; Liang, HaoRan; Zhao, Xing; Chen, Ben; Liang, RongHua

Day 2: 9 January 13:30 – 15:00

Poster ID Paper ID Paper Title Authors
PS2-1 192 Comparative Analysis of Relevance Feedback Techniques for Image Retrieval Vadicamo, Lucia; Scotti, Francesca; Dearle, Alan; Connor, Richard
PS2-2 241 Understanding the Roles of Visual Modality in Multimodal Dialogue: An Empirical Study Cao, Qian; Song, Ruihua; Chen, Xu
PS2-3 242 DistillSleep: Leverage Self-Distillation to Improve Performance After Representation Learning for Sleep Staging Yu, Le; Zhang, Xianchao; Qian, Shuxia; Sun, Hong
PS2-4 246 Temporal Closeness for Enhanced Cross-Modal Retrieval of Sensor and Image Data Yamamoto, Shuhei; Kando, Noriko
PS2-5 247 An Analytical Method for Rendering Plenoptic Cameras 2.0 on 3D Multi-Layer Displays Losfeld, Armand; Seznec, Nicolas; Van Bogaert, Laurie; Lafruit, Gauthier; Teratani, Mehrdad
PS2-6 251 QRALadder: QoE and Resource Consumption-Aware Encoding Ladder Optimization for Live Video Streaming Zhu, Yingqian; Gao, Guanyu
PS2-7 256 Boosting Human Pose Estimation via Heatmap Refinement Jiang, Ling; Liu, Zhuocheng; Li, Kaige; Wu, Wei
PS2-8 265 FoodMLLM-JP: Leveraging Multimodal Large Language Models for Japanese Recipe Generation Imajuku, Yuki; Yamakata, Yoko; Aizawa, Kiyoharu
PS2-9 283 LLMs-based Augmentation for Domain Adaptation in Long-tailed Food Datasets Wang, Qing; Ngo, Chong Wah; Lim, Ee-Peng; Sun, Qianru
PS2-10 292 Music2MIDI: Pop Music to MIDI Piano Cover Generation Yip, Tin Yui; Chau, Chuck-jee
PS2-11 293 Balancing Efficiency and Accuracy: An Analysis of Sampling for Video Copy Detection Chen, Xiangyu; Satoh, Shinichi
PS2-12 295 One-Shot Generative Domain Adaptation by Constructing Self-Amplifying Datasets Xiang, Yanru; Li, Yi
PS2-13 306 Visual Anomaly Detection on Topological Connectivity under Improved YOLOv8 Li, Yu; Xie, Zhenping
PS2-14 315 HierArtEx: Hierarchical Representations and Art Experts Supporting the Retrieval of Museums in the Metaverse Falcon, Alex; Abdari, Ali; Serra, Giuseppe
PS2-15 317 DocMamba: Robust Document Image Dewarping via Selective State Space Sequence Modeling Han, Miaolin; Li, Huibin
PS2-16 326 Real-Time Action Detection in Volleyball Matches Using DETR Architecture Shih, Mu-Jan; Hsu, Yi-Yu
PS2-17 332 Select and Order: Enhancing Few-Shot Image Classification through In-Context Learning Huang, Hujiang; Xie, Yu; Gao, Jun; Fan, Chuanliu; Cao, Ziqiang
PS2-18 336 SMG-Diff: Adversarial Attack Method Based on Semantic Mask-Guided Diffusion Zhang, Yongliang; Liu, Jing
PS2-19 344 Dual-Task Feedback Learning for Tongue Detection via Super-Resolution Integration Sun, Ying; Wei, Meiyi; Chen, Gang
PS2-20 354 Towards Visual Storytelling by Understanding Narrative Context through Scene-Graphs Phueaksri, Itthisak; Kastner, Marc A.; Kawanishi, Yasutomo; Komamizu, Takahiro; Ide, Ichiro
PS2-21 456 AMFT-YOLO: A Adaptive Multi-Scale YOLO Algorithm with Multi-Level Feature Fusion for Object Detection in UAV Scenes Wang, Tiebiao; Li, Xiaoyang; Cui, Zhenchao
PS2-22 276 Lightweight Dual Grouped Large-Kernel Convolutions for Salient Object Detection Network Liu, Jiajie; Zhang, Zhibin
PS2-23 312 Modeling High-order Relationships between Human and Video for Emotion Recognition Ai, Hanxu; Tao, Xiaomei; Li, Xingbing; Gan, Yanling
DP 117 EIA: Edge-aware Imperceptible Adversarial Attacks on 3D Point Clouds Wang, Zhensu; Peng, Weilong; Wang, Le; Wu, Zhizhe; Zhu, Peican; Tang, Keke
DP 127 MKSNet: Advanced Small Object Detection in Remote Sensing Imagery with Multi-Kernel and Dual Attention Mechanisms Zhang, Jiahao; Gao, Guangyu; Zhao, Xiao
DP 140 Infrared Small Target Detection with Feature Refinement and Context Enhancement Li, Xiuhong; Zhu, Xinyue; Li, Boyuan; Li, Songlin; Wang, Luyao; Jia, Zhenhong
DP 173 Modality-Specific Hashing: Transform Cross-Modal Retrieval into Single-Modal Retrieval Ding, Guohui; Li, Zhonghua; Ren, Yongqiang
DP 178 Multimodal Prompt Learning for Audio Visual Scene-aware Dialog Xu, Feifei; Jia, Fumiaoyue; Zhou, Wang
DP 182 MSA-Former: Multi-Scale Adaptive Transformer for Image Snow Removal Wang, Bin; Chen, Zekun; Zhang, Lei; Liang, Shili; Guo, Sijia; Kang, Xinyu; Li, Huajing
DP 184 SES-Net: Multi-dimensional Spot-Edge-Surface Network for Nuclei Segmentation Lu, Congjian; Zhou, Shuwang; Shan, Ke; Zhang, Hongkuan; Liu, Zhaoyang
DP 189 PianoPal: A Robotic Multimedia System for Interactive Piano Instruction Based on Q-learning and Real-time Feedback Wang, Yufei; Yao, Junfeng; Wang, Zefeng
DP 199 CLIP Multi-modal Hashing for Multimedia Retrieval Zhu, Jian; Sheng, Mingkai; Huang, Zhangmin; Chang, Jingfei; Long, Jian; Jiang, Jinling; Liu, Lei; Luo, Cheng
DP 223 Integrating S1&S2 Framework for Enhanced Semantic Match in Person Re-identification Yang, Xiukang; Ge, Jingguo; Li, Hui; Li, Liangxiong; Wu, Bingzhen
DP 237 Hyper-NeuS: Hypernetworks for Neural SDF Implicit Surface Reconstruction by Volume Rendering Li, Jingkun; Qi, Na; Zhu, Qing
DP 253 Structural Information-guided Fine-grained Texture Image Inpainting Fang, Zhiyi; Qian, Yi; Dai, Xiyue
DP 272 GFA-UDIS: Global-to-Flow Alignment for Unsupervised Deep Image Stitching Han, Sijia; Zhang, Zhibin
DP 275 Joint Decision Network with Modality-Specific and Dual Interactive Features for Fake News Detection Wu, Fei; Zhou, Ruixuan; Ji, Yimu; Jing, Xiao-Yuan
DP 277 MS-SAM: Multi-Scale SAM based on Dynamic Weighted Agent Attention Yang, Enhui; Zhang, Zhibin
DP 281 Multi-Modal Information Multi-Angle Mining For Multimedia Recommendation Zhu, Yijie; Li, Mingyong
DP 305 MambaTalk: Speech-driven 3D Facial Animation with Mamba Zhu, Deli; Xu, Zhao; Yang, Yunong

Day 3: 10 January 13:30 – 15:00

Poster ID Paper ID Paper Title Authors
PS3-1 356 Rotation Methods for 360-degree Videos in Virtual Reality - A Comparative Study Hürst, Wolfgang; Zeches, Leo
PS3-2 360 Camouflaged Object Detection Based on Localization Guidance and Multi-Scale Refinement Wang, JinYang; Wu, Wei
PS3-3 362 Poseidon: A NAS-Based Ensemble Defense Method against Multiple Perturbations Su, Yulan; Zhang, Sisi; Lin, Zechao; Wang, Xingbin; Zhao, Lutan; Meng, Dan; Hou, Rui
PS3-4 363 MM-CARP: Multimodal Model with Cross-modal retrieval-Augmented and visual Region Perception Guo, Junhao; Fu, Chenhan; Wang, Guoming; Lu, Rongxing; Chen, Dong; Tang, Siliang
PS3-5 365 Revisit Data Association in Semantic SLAM Systems for Autonomous Parking Shao, Xuan; Huang, Leming; Liu, Xinghua
PS3-6 368 Lightweight Motion-Aware Video Super-Resolution for Compressed Videos Kwon, Ilhwan; Li, Jun; Shah, Rajiv Ratn; Prasad, Mukesh
PS3-7 373 Vision-Language Pretraining for Variable-shot Image Classification Papadopoulos, Sotirios; Ioannidis, Konstantinos; Vrochidis, Stefanos; Kompatsiaris, Ioannis; Patras, Ioannis
PS3-8 377 A Multi-Aspect Multi-Granularity Pronunciation Assessment Method Based on Branchformer Encoder and Hierarchical Aggregation Du, Wenxu; Wumaier, Aishan; Shi, Yahui; Yi, Nian; Liu, Dehua
PS3-9 386 SCANet: Semantic Coherence Attention Network for Clothing Change Person Re-identification Yang, Dajiang; Wu, Wei; Lee, Yuxing
PS3-10 417 Toward A Full Pipeline Approach to Autonomous Drone Landing Site Identification: From Terrain Survey to Embedded Classifier Springer, Joshua David; Guðmundsson, Gylfi Þór; Kyas, Marcel
PS3-11 429 Innovative Lifelog Visualization and Exploration in Virtual Reality - A Comparative Study Hürst, Wolfgang; Visser, Yannick
PS3-12 435 Synchronization and Calibration of Video Sequences acquired using Multiple Plenoptic 2.0 Cameras Bonatto, Daniele; Fernandes Pinto Fachada, Sarah; Sancho, Jaime; Juarez, Eduardo; Lafruit, Gauthier; Teratani, Mehrdad
PS3-13 444 A Dual-Branch Model for Color Constancy Chen, Zhaoxin; Ma, Bo
PS3-14 445 Data-free Functional Projection of Large Language Models onto Social Media Tagging Domain Mu, Wenchuan; Lim, Kwan Hui
PS3-15 455 MDT-Net: a mask decoder tuning strategy for CLIP-based zero-shot 3D Classification Yan, Hao; Bai, Jing
PS3-16 458 Optimally Planning Drone Trajectory to Capture a 3D Gaussian Splatting Object Wu, Cheng-Yuan; Sun, Yuan-Chun; Lee, Cheng-Tse; Hsu, Cheng-Hsin
PS3-17 230 Quantifying Image-Adjective Associations by Leveraging Large-Scale Pretrained Models Matsuhira, Chihaya; Kastner, Marc A.; Komamizu, Takahiro; Hirayama, Takatsugu; Ide, Ichiro
PS3-18 137 Can masking background and object reduce static bias for zero-shot action recognition? Fukuzawa, Takumi; Hara, Kensho; Kataoka, Hirokatsu; Tamaki, Toru
PS3-19 355 CalorieVoL: Integrating Volumetric Context into Multimodal Large Language Models for Image-based Calorie Estimation Tanabe, Hikaru; Yanai, Keiji
PS3-20 416 Multimodal Engagement Prediction in Human-Robot Interaction using Transformer Neural Networks Lim, Jia Yap; See, John; Dondrup, Christian
PS3-21 431 What Should Autonomous Robots Verbalize and What Should They Not? Yoshihara, Daichi; Yuguchi, Akishige; Kawano, Seiya; Iio, Takamasa; Yoshino, Koichiro
PS3-22 438 BiCA-YOLO: Bidirectional Feature Enhancement and Cross Coordinate Attention for Small Object Detection Lv, Jinyan; Xiao, Guoqiang
DP 307 Frequency-Based Unsupervised Low-Light Image Enhancement Framework Wang, Haodian
DP 309 Target-Oriented Dynamic Denoising Curriculum Learning for Multimodal Stance Detection Suo, Zihao; Pan, Shanliang
DP 316 Noise-robust Separating Multi-source Aliased Vibration Signal Based on Transformer Demucs Jiang, Wanchang; Jiang, Yuxin
DP 321 gFlow: Distributed Real-Time Reverse Remote Rendering System Model Xu, Yixiao; Li, Yubo; Xu, Wanzhao; Gu, Yicheng; Wang, Yun; Ma, Jiangyuan; Qi, Zhengwei
DP 331 BLCC: A Benchmark for Multi-LiDAR and Multi-Camera Calibration Minghui, Hou; Gang, Wang; Zhiyang, Wang; Tongzhou, Zhang; Baorui, Ma
DP 342 MC-YOLO: Multi-scale Transmission Line Defect Target Recognition Network Wang, Jingdong; Ding, Xu; Meng, Fanqi
DP 350 A Novel Human Abnormal Posture Detection Method Based on Spatial-Topological Feature Fusion of Skeleton Ma, Yuefeng; Cheng, Zhiqi; Liu, Deheng; Tang, Shiying
DP 359 SSCDUF: Spatial-Spectral Correlation Transformer Based on Deep Unfolding Framework for Hyperspectral Image Reconstruction Zhao, Hui; Qi, Na; Zhu, Qing; Lin, Xiumin
DP 383 Cross-View Geo-Localization via Learning Correspondence Semantic Similarity Knowledge Chen, Guanli; Huang, Guoheng; Yuan, Xiaochen; Chen, Xuhang; Zhong, Guo; Pun, Chi-Man
DP 385 Style Separation and Content Recovery for Generalizable Sketch Re-identification and A New Benchmark Lu, Lingyi; Xu, Xin; Wang, Xiao
DP 387 Chain of Thought Guided Few-shot Fine-tuning of LLMs for Multimodal Aspect-based Sentiment Classification Wu, Hao; Yang, Danping; Liu, Peng; Li, Xianxian
DP 393 Progressive Neural Architecture Generation with Weaker Predictors Zhang, Zhengzhuo; Zhuang, Liansheng
DP 420 Self-Supervised Reference-based Image Super-Resolution with Conditional Diffusion Model Shi, Shuai; Qi, Na; Li, Yezi; Zhu, Qing
DP 447 TPS-YOLO: The Efficient Tiny Person Detection Network Based on Improved YOLOv8 and Model Pruning Yao, Li; Huang, Qianni; Wan, Yan
DP 460 MICAN: Multi-modal Inconsistency-based Cooperation Attention Network for fake news detection Yi, Zepu; Lu, Songfeng; Tang, Xueming; Zhu, Jianxin; Wu, Junjun
DP 214 TACST: Time-Aware Transformer for Robust Speech Emotion Recognition Wei, Wei; Zhang, Bingkun; Wang, Yibing
DP 215 TS-MEFM: A New Multimodal Speech Emotion Recognition Network Based on Speech and Text Fusion Wei, Wei; Zhang, Bingkun; Wang, Yibing