Sifat M. Abdullah's final PhD defense
Hi everyone,

You are all invited to attend Sifat M. Abdullah's final PhD defense.

Title: Investigating Multimodal Foundation Models through the Lens of Safety and Security
Date: Tuesday, November 18, 2025
Time: 2:00 PM EST
Room: Gilbert Place 4307
Zoom link: https://virginiatech.zoom.us/j/88365297664
Committee members: Bimal Viswanath (Chair), Danfeng (Daphne) Yao, Taejoong (Tijay) Chung, Peng Gao, and Murtuza Jadliwala (The University of Texas at San Antonio)

Abstract: Generative AI plays a crucial role in processing and interpreting information, making its reliability more important than ever. Multimodal Foundation Models (MFMs), which drive the latest innovations in generative AI, have a significant impact on our daily lives. MFMs enable powerful cross-modal capabilities, ranging from text-to-image (T2I) models such as DALL-E and Stable Diffusion to multimodal LLMs (MLLMs) such as LLaMA and ChatGPT. However, these models are often exploited to generate highly realistic deepfake images and are also vulnerable to adversarial attacks that degrade their performance. We investigate MFMs through the lens of security, focusing on two principal threats:

(1) Misuse of MFMs, and methods for its mitigation. T2I models can generate highly convincing deepfake media that can be misused to spread misinformation. This raises concerns about the authenticity of digital content and the potential for large-scale manipulation. Addressing this requires robust detection methods that can accurately identify such synthetic media and mitigate the risks it poses.

(2) Attacks on the integrity of MFMs. Adversarially perturbed images can significantly degrade the performance of MLLMs, causing them to miscaption images and produce toxic responses. Mitigating such adversarial threats is essential to ensure safety and security.
The dissertation addresses these two threat directions by: (1) assessing the real-world applicability of state-of-the-art (SOTA) deepfake defenses and developing robust, generalizable detection methods; and (2) defending MFMs against perturbation-based adversarial attacks by leveraging advances in off-the-shelf generative AI (GenAI) image-translation models and their test-time reasoning capabilities. We will conclude by outlining future research directions that the security community can pursue to build more secure and robust Multimodal Foundation Models.

Thanks,
Bimal