Poster Presentations

Paper ID   Authors   Title
111 Yin, Min ; Xie, Liang ; Liang, HaoRan ; Zhao, Xing ; Chen, Ben ; Liang, RongHua SCLSTE: Semi-Supervised Contrastive Learning-Guided Scene Text Editing
120 Shang, Yuzhang ; Liu, Gaowen ; Kompella, Ramana ; Yan, Yan Quantized-ViT Efficient Training via Fisher Matrix Regularization
121 Kong, Yongqiang; Wang, Yunhong; Li, Annan Saliency based data augmentation for few-shot video action recognition
128 Ye, Yuyao ; Yang, Jiayu ; Zhao, Yang ; Gao, Mengping ; Cao, Hongbin ; Wang, Ronggang Hybrid Scalable Video Coding with Neural Compression and Enhancement for Streaming Media
130 Cai, Pengzhou ; Jiang, Lu ; Li, Yanxin ; Liu, Xiaojuan ; Lan, Libin Pubic Symphysis-Fetal Head Segmentation Network Using BiFormer Attention Mechanism and Multipath Dilated Convolution
131 Li, Guofeng ; Li, Hanxi ; Li, Bo ; Wu, Lin ; Cheng, Yan DART: Depth-Enhanced Accurate and Real-Time Background Matting
141 Cai, Zeyu ; Chen, Xunhao ; Zhang, Can ; Chen, Yuchong ; Yang, Jiming ; Shi, Wubin ; Jin, Chengqian ; Da, Feipeng MLP-AMDC: An MLP Architecture for Adaptive-Mask-based Dual-Camera snapshot hyperspectral imaging
144 Tsukuda, Kosetsu; Takahashi, Takumi; Ishida, Keisuke; Hamasaki, Masahiro; Goto, Masataka Kiite World: Socializing Map-Based Music Exploration Through Playlist Sharing and Synchronized Listening
146 Zhu, Qinfeng ; Weng, Ningxin ; Fan, Lei ; Cai, Yuanzhi Enhancing Environmental Monitoring through Multispectral Imaging: The WasteMS Dataset for Semantic Segmentation of Lakeside Waste
158 Song, Tao ; Zhang, Wenwen Frequency-aware Convolution for Sound Event Detection
163 Liu, Dongyu; Zhu, Yuan; Liu, Rui; Xing, Zhecong; Geng, Weiyang; Wang, Yanqiang MSD-YOLO: An efficient algorithm for small target detection
166 Li, Yongqian ; Luo, Yong ; Zhou, Xin Robust Active Speaker Detection in Challenging Environments Using GNN-Fused Multi-Modal Cues and Body Language
172 Tian, Xiang; Zhang, Yuan; Mu, Chang; Zhang, Ziyang Intra-Class Compact Facial Expression Recognition Based on Amplitude Phase Separation
173 Ding, Guohui; Li, Zhonghua; Ren, Yongqiang Modality-Specific Hashing: Transform Cross-Modal Retrieval into Single-Modal Retrieval
176 Yu, Jizhe; Liu, Yu; Wu, Xiaoshuai; Xu, Kaiping; Li, Jiangquan PA2Net: Pyramid Attention Aggregation Network for Saliency detection
178 Xu, Feifei; Jia, Fumiaoyue; Zhou, Wang Multimodal Prompt Learning for Audio Visual Scene-aware Dialog
181 Chen, Liang-Chia; Chu, Wei-Ta HCV: Lightweight Hybrid CNN-Vision Transformer for Visual Object Tracking
182 Wang, Bin ; Chen, Zekun ; Zhang, Lei ; Liang, Shili ; Guo, Sijia ; Kang, Xinyu ; Li, Huajing MSA-Former: Multi-Scale Adaptive Transformer for Image Snow Removal
184 Lu, Congjian ; Zhou, Shuwang ; Shan, Ke ; Zhang, Hongkuan ; Liu, Zhaoyang SES-Net: Multi-dimensional Spot-Edge-Surface Network for Nuclei Segmentation
188 Zhang, Jingyao ; Hao, Shijie ; Sun, Fuming ; Rao, Yuan LIESA: Low-light Image Enhancement with Semantic Awareness
189 Wang, Yufei ; Yao, Junfeng ; Wang, Zefeng PianoPal: A Robotic Multimedia System for Interactive Piano Instruction Based on Q-learning and Real-time Feedback
192 Vadicamo, Lucia ; Scotti, Francesca ; Dearle, Alan ; Connor, Richard Comparative Analysis of Relevance Feedback Techniques for Image Retrieval
195 Sun, Yongqing ; Liu, Hong ; Chang, Qiong ; Han, Xianhua Deep Dual Internal Learning for Hyperspectral Image Super-Resolution
198 Wu, Weijie ; Li, Jun ; Wu, Zhijian ; Xu, Jianhua Zero-shot sketch-based image retrieval with hybrid information fusion and sample relationship modeling
206 Juliussen, Bjørn Aslak The Right to an Explanation under the GDPR and the AI Act
221 Perez, Miguel ; Kirchhoff, Holger ; Grosche, Peter ; Serra, Xavier Improving singing voice transcription generalization with AI generated accompaniments
228 Sunada, Tatsumi; Shiohara, Kaede; Xiao, Ling; Yamasaki, Toshihiko LITA: LMM-guided Image-Text Alignment for Art Assessment
229 Yadav, Saumya ; Lincker, Élise ; Huron, Caroline ; Martin, Stéphanie ; Guinaudeau, Camille ; Satoh, Shin’ichi ; Shukla, Jainendra Towards Inclusive Education: Multimodal Classification of Textbook Images for Accessibility
237 Li, Jingkun; Qi, Na; Zhu, Qing Hyper-NeuS: Hypernetworks for Neural SDF Implicit Surface Reconstruction by Volume Rendering
241 Cao, Qian; Song, Ruihua; Chen, Xu Understanding the Roles of Visual Modality in Multimodal Dialogue: An Empirical Study
242 Yu, Le ; Zhang, Xianchao ; Qian, Shuxia ; Sun, Hong DistillSleep: Leverage Self-Distillation to Improve Performance After Representation Learning for Sleep Staging
246 Yamamoto, Shuhei ; Kando, Noriko Temporal Closeness for Enhanced Cross-Modal Retrieval of Sensor and Image Data
247 Losfeld, Armand; Seznec, Nicolas; Van Bogaert, Laurie; Lafruit, Gauthier; Teratani, Mehrdad An Analytical Method for Rendering Plenoptic Cameras 2.0 on 3D Multi-Layer Displays
251 Zhu, Yingqian; Gao, Guanyu QRALadder: QoE and Resource Consumption-Aware Encoding Ladder Optimization for Live Video Streaming
253 Fang, Zhiyi ; Qian, Yi ; Dai, Xiyue Structural Information-guided Fine-grained Texture Image Inpainting
256 Jiang, Ling; Liu, Zhuocheng; Li, Kaige; Wu, Wei Boosting Human Pose Estimation via Heatmap Refinement
265 Imajuku, Yuki; Yamakata, Yoko; Aizawa, Kiyoharu FoodMLLM-JP: Leveraging Multimodal Large Language Models for Japanese Recipe Generation
272 Han, Sijia; Zhang, Zhibin GFA-UDIS: Global-to-Flow Alignment for Unsupervised Deep Image Stitching
275 Wu, Fei ; Zhou, Ruixuan ; Ji, Yimu ; Jing, Xiao-Yuan Joint Decision Network with Modality-Specific and Dual Interactive Features for Fake News Detection
276 Liu, Jiajie; Zhang, Zhibin Lightweight Dual Grouped Large-Kernel Convolutions for Salient Object Detection Network
277 Yang, Enhui; Zhang, Zhibin MS-SAM: Multi-Scale SAM based on Dynamic Weighted Agent Attention
281 Zhu, Yijie; Li, MingYong Multi-Modal Information Multi-Angle Mining For Multimedia Recommendation
283 Wang, Qing; Ngo, Chong Wah; Lim, Ee-Peng; Sun, Qianru LLMs-based Augmentation for Domain Adaptation in Long-tailed Food Datasets
292 Yip, Tin Yui; Chau, Chuck-jee Music2MIDI: Pop Music to MIDI Piano Cover Generation
293 Chen, Xiangyu; Satoh, Shinichi Balancing Efficiency and Accuracy: An Analysis of Sampling for Video Copy Detection
295 Xiang, Yanru; Li, Yi One-Shot Generative Domain Adaptation by Constructing Self-Amplifying Datasets
296 Zheng, Shuijing; Yu, Suxi; Wang, Yi; Wen, Jing GWUNet: A UNet with Gated Attention and Improved Wavelet Transform for Thyroid Nodules Segmentation
297 Chen, Junjian ; Yang, Xuan Uncertainty-guided Joint Semi-supervised Segmentation and Registration of Cardiac Images
306 Li, Yu ; Xie, Zhenping Visual Anomaly Detection on Topological Connectivity under Improved YOLOv8
307 Wang, Haodian Frequency-Based Unsupervised Low-Light Image Enhancement Framework
309 Suo, Zihao; Pan, Shanliang Target-Oriented Dynamic Denoising Curriculum Learning for Multimodal Stance Detection
312 Ai, Hanxu ; Tao, Xiaomei ; Li, Xingbing ; Gan, Yanling Modeling High-order Relationships between Human and Video for Emotion Recognition
315 Falcon, Alex ; Abdari, Ali ; Serra, Giuseppe HierArtEx: Hierarchical Representations and Art Experts Supporting the Retrieval of Museums in the Metaverse
316 Jiang, Wanchang; Jiang, Yuxin Noise-robust Separating Multi-source Aliased Vibration Signal Based on Transformer Demucs
317 Han, Miaolin; Li, Huibin DocMamba: Robust Document Image Dewarping via Selective State Space Sequence Modeling
321 Xu, Yixiao ; Li, Yubo ; Xu, Wanzhao ; Gu, Yicheng ; Wang, Yun ; Ma, Jiangyuan ; Qi, Zhengwei gFlow: Distributed Real-Time Reverse Remote Rendering System Model
326 Shih, Mu-Jan ; Hsu, Yi-Yu Real-Time Action Detection in Volleyball Matches Using DETR Architecture
332 Huang, Hujiang; Xie, Yu; Gao, Jun; Fan, Chuanliu; Cao, Ziqiang Select and Order: Enhancing Few-Shot Image Classification through In-Context Learning
336 Zhang, Yongliang; Liu, Jing SMG-Diff: Adversarial Attack Method Based on Semantic Mask-Guided Diffusion
337 Terada, Takamasa; Toyoura, Masahiro Wavelet Integrated Convolutional Neural Network for ECG Signal Denoising
342 Wang, Jingdong; Ding, Xu; Meng, Fanqi MC-YOLO: Multi-scale Transmission Line Defect Target Recognition Network
344 Sun, Ying; Wei, Meiyi; Chen, Gang Dual-Task Feedback Learning for Tongue Detection via Super-Resolution Integration
350 Ma, Yuefeng ; Cheng, Zhiqi ; Liu, Deheng ; Tang, Shiying A Novel Human Abnormal Posture Detection Method Based on Spatial-Topological Feature Fusion of Skeleton
354 Phueaksri, Itthisak ; Kastner, Marc A. ; Kawanishi, Yasutomo ; Komamizu, Takahiro ; Ide, Ichiro Towards Visual Storytelling by Understanding Narrative Context through Scene-Graphs
356 Hürst, Wolfgang; Zeches, Leo Rotation Methods for 360-degree Videos in Virtual Reality - A Comparative Study
360 Wang, JinYang; Wu, Wei Camouflaged Object Detection Based on Localization Guidance and Multi-Scale Refinement
362 Su, Yulan; Zhang, Sisi; Lin, Zechao; Wang, Xingbin; Zhao, Lutan; Meng, Dan; Hou, Rui Poseidon: A NAS-Based Ensemble Defense Method against Multiple Perturbations
363 Guo, Junhao ; Fu, Chenhan ; Wang, Guoming ; Lu, Rongxing ; Chen, Dong ; Tang, Siliang MM-CARP: Multimodal Model with Cross-modal retrieval-Augmented and visual Region Perception
365 Shao, Xuan ; Huang, Leming ; Liu, Xinghua Revisit Data Association in Semantic SLAM Systems for Autonomous Parking
368 Kwon, Ilhwan ; Li, Jun ; Shah, Rajiv Ratn ; Prasad, Mukesh Lightweight Motion-Aware Video Super-Resolution for Compressed Videos
373 Papadopoulos, Sotirios ; Ioannidis, Konstantinos ; Vrochidis, Stefanos ; Kompatsiaris, Ioannis ; Patras, Ioannis Vision-Language Pretraining for Variable-shot Image Classification
377 Du, Wenxu; Wumaier, Aishan; Shi, Yahui; Yi, Nian; Liu, Dehua A Multi-Aspect Multi-Granularity Pronunciation Assessment Method Based on Branchformer Encoder and Hierarchical Aggregation
383 Chen, Guanli ; Huang, Guoheng ; Yuan, Xiaochen ; Chen, Xuhang ; Zhong, Guo ; Pun, Chi-Man Cross-View Geo-Localization via Learning Correspondence Semantic Similarity Knowledge
386 Yang, Dajiang; Wu, Wei; Lee, Yuxing SCANet: Semantic Coherence Attention Network for Clothing Change Person Re-identification
387 Wu, Hao; Yang, Danping; Liu, Peng; Li, Xianxian Chain of Thought Guided Few-shot Fine-tuning of LLMs for Multimodal Aspect-based Sentiment Classification
392 Cheng, Shyi-Chyi ; Chen, Yen-Lin ; Li, Shih-Yu MPPQNet: A Moment-Preserving Product Quantization Neural Network for Progressive 3D Point Cloud Transmission
414 Hezel, Nico; Barthel, Kai Uwe; Schilling, Bruno; Schall, Konstantin; Jung, Klaus Dynamic Exploration Graph: A Novel Approach for Efficient Nearest Neighbor Search in Evolving Multimedia Datasets
417 Springer, Joshua David; Guðmundsson, Gylfi Þór; Kyas, Marcel Toward A Full Pipeline Approach to Autonomous Drone Landing Site Identification: From Terrain Survey to Embedded Classifier
420 Shi, Shuai; Qi, Na; Li, Yezi; Zhu, Qing Self-Supervised Reference-based Image Super-Resolution with Conditional Diffusion Model
429 Hürst, Wolfgang; Visser, Yannick Innovative Lifelog Visualization and Exploration in Virtual Reality - A Comparative Study
435 Bonatto, Daniele ; Fernandes Pinto Fachada, Sarah ; Sancho, Jaime ; Juarez, Eduardo ; Lafruit, Gauthier ; Teratani, Mehrdad Synchronization and Calibration of Video Sequences acquired using Multiple Plenoptic 2.0 Cameras
436 Lim, Xin ; Wong, Lai-Kuan ; Loh, Yuen Peng ; Gu, Ke ; Lin, Weisi Mix-YOLONet: Deep Image Dehazing for Improving Object Detection
438 Lv, Jinyan; Xiao, Guoqiang BiCA-YOLO: Bidirectional Feature Enhancement and Cross Coordinate Attention for Small Object Detection
444 Chen, Zhaoxin; Ma, Bo A Dual-Branch Model for Color Constancy
445 Mu, Wenchuan; Lim, Kwan Hui Data-free Functional Projection of Large Language Models onto Social Media Tagging Domain
447 Yao, Li; Huang, Qianni; Wan, Yan TPS-YOLO: The Efficient Tiny Person Detection Network Based on Improved YOLOv8 and Model Pruning
455 Yan, Hao; Bai, Jing MDT-Net: a mask decoder tuning strategy for CLIP-based zero-shot 3D Classification
456 Wang, Tiebiao; Li, Xiaoyang; Cui, Zhenchao AMFT-YOLO: An Adaptive Multi-Scale YOLO Algorithm with Multi-Level Feature Fusion for Object Detection in UAV Scenes
458 Wu, Cheng-Yuan; Sun, Yuan-Chun; Lee, Cheng-Tse; Hsu, Cheng-Hsin Optimally Planning Drone Trajectory to Capture a 3D Gaussian Splatting Object
460 Yi, Zepu; Lu, Songfeng; Tang, Xueming; Zhu, Jianxin; Wu, Junjun MICAN: Multi-modal Inconsistency-based Cooperation Attention Network for fake news detection
230 Matsuhira, Chihaya ; Kastner, Marc A. ; Komamizu, Takahiro ; Hirayama, Takatsugu ; Ide, Ichiro Quantifying Image-Adjective Associations by Leveraging Large-Scale Pretrained Models
137 Fukuzawa, Takumi ; Hara, Kensho ; Kataoka, Hirokatsu ; Tamaki, Toru Can masking background and object reduce static bias for zero-shot action recognition?
355 Tanabe, Hikaru; Yanai, Keiji CalorieVoL: Integrating Volumetric Context into Multimodal Large Language Models for Image-based Calorie Estimation
416 Lim, Jia Yap ; See, John ; Dondrup, Christian Multimodal Engagement Prediction in Human-Robot Interaction using Transformer Neural Networks
431 Yoshihara, Daichi ; Yuguchi, Akishige ; Kawano, Seiya ; Iio, Takamasa ; Yoshino, Koichiro What Should Autonomous Robots Verbalize and What Should They Not?