Demonstrations: Day 2 & 3 (9 and 10 January 13:30 – 15:00)

| Demo ID | Paper ID | Title | Authors |
|---|---|---|---|
| D01 | 468 | SelectSum: Topic-Based Selective Summarization of Speech-Based Videos | Wattasseril, Jobin Idiculla; Döllner, Jürgen |
| D02 | 469 | Real-time Visualizer for Turntablist Performance | Hamanaka, Masatoshi |
| D03 | 494 | Multi-Dimensional Exploration of Media Collection Metadata | Khan, Omar Shahbaz; Duane, Aaron; Hasnan, Hariz; Blavec, Noé Le; Ouvrard, Pierre; Verdon, Johan; d’Orazio, Laurent; Thierry, Constance; Jónsson, Björn Þór |
| D04 | 470 | DriveCoach: Smart Driving Assistance with Multimodal Risk Prediction and Risk Adaptive Behavior Recommendation | Gan, Wenbin; Dao, Minh-Son; Zettsu, Koji |
| D05 | 472 | System Demo of Modeling Smart University Campus Virtual Environments | Fernandez Roblero, Jaime Boanerjes; Ali, Muhammad Intizar |
| D06 | 473 | AMDA: Advancing Multimedia Data Annotation for human-centric situations | Mohamed Serouis, Ibrahim; Sèdes, Florence |
| D07 | 475 | FencBuddy: Action-aware Depth Perception Training for Fencing Attacks | Peng, Hung-Yao; Zhong, Zi-Heng; Tsai, Cheng-Chih; Chiang, Ching-Yeh; Pan, Tse-Yu |
| D08 | 477 | WaveFontStyler: Font Style Transfer Based on Sound | Izumi, Kota; Yanai, Keiji |
| D09 | 479 | Training a Segmentation-based Visual Anonymization Service for Street Scenes | Korb, Martin; Bailer, Werner |
| D10 | 481 | CleverFox: Integrating Visual Mnemonics with AI for Enhanced Language Learning | Chiang, Yung-Chu; Tang, Zi-Xian; Luo, Yi-Ching; Chang, Jason S. |
| D11 | 482 | Fingering Prediction for Classical Guitar: Dataset Creation and Model Development | Iino, Nami; Iino, Akinaru |
| D12 | 483 | An Implementation of Networked JamSketch | Kitahara, Tetsuro; Tsutsumi, Takuya; Nagoshi, Takaaki; Suzuki, Taizan |
| D13 | 485 | Using Language Models to Generate and Forget the Narrative Memories of an Assistive Robot | Garcia Contreras, Angel Fernando; Chang, Wen-Yu; Kawano, Seiya; Chen, Yun-Nung; Yoshino, Koichiro |
| D14 | 486 | Better Image Segmentation with Classification: Guiding Zero-Shot Models Using Class Activation Maps | Borgli, Hanna; Stensland, Håkon Kvale; Halvorsen, Pål |
| D15 | 488 | Transformer-Based Audio Generation Conditioned by 2D Latent Maps: A Demonstration | Limberg, Christian; Zhang, Zhe; Kastner, Marc A. |
| D16 | 489 | KuzushijiFontDiff: Diffusion Model for Japanese Kuzushiji Font Generation | Yuan, Honghui; Yanai, Keiji |
| D17 | 490 | SceneTextStyler: Editing Text with Style Transformation | Yuan, Honghui; Yanai, Keiji |
| D18 | 492 | Multimodal Interoperability with the CLAMS Platform | Lynch, Kelley; Rim, Kyeongmin; King, Owen; Pustejovsky, James |
| D19 | 493 | Enhancing User Control in AI-Based Video Summarization for Social Media | Kontostathis, Ioannis; Apostolidis, Evlampios; Apostolidis, Konstantinos; Mezaris, Vasileios |
| D20 | 496 | Movie Retrieval Systems Using Genre-guided Multimodal Learning Techniques | Huang, Wei-Lun; Hidayati, Shintami Chusnul; Pan, Tse-Yu |
| D21 | 497 | A User Identification and Reading Style Detection System Based on Eye Movement Patterns During Reading | Kongmeesub, Onanong; Gurrin, Cathal; Nie, Dongyun |
| D22 | 484 | Federated Learning with Multimodal-Sensing and Knowledge Distillation: An application on real-world benchmark dataset | Le, Duy-Dong; Huynh, Duy-Thanh; Bao, Pham The |
| D23 | 499 | Efficient Deployment of Multimodal AI Models: Leveraging Pruning, Quantization and Multi-Objective Optimization for Edge Computing | Vu, Dang; Dang, Tien; Nguyen, Quoc-Trung; Pham, Tan |
| D24 | 466 | Badminton Footwork Practice via an Immersive Virtual Reality System | Jheng, Duen-Chian; Harchan, Bill Louis; Kostka de Sztemberg, Berenika Nawoja; Hsu, Jen-Hao; Hu, Min-Chun |
| D25 | 480 | RoboDJ: Live Commentary Robots System Driven by Physical- and Cyber-world Observations | Kawanishi, Yasutomo; Nakamura, Yutaka; Shintani, Taiken; Ishi, Carlos T.; Kawano, Seiya; Yoshino, Koichiro; Minato, Takashi; Minoh, Michihiko |
| D26 | 487 | Leveraging Latent Diffusion in 3D Gaussian Splatting for Novel View Synthesis | Li, Bohan; Li, Xingyi; Liang, Yangwen; Wang, Shuangquan; Song, Kee-Bong |