Director: Hanseok Ko, Ph.D.

The Multimodal Artificial Intelligence Laboratory conducts advanced research at the intersection of perception, reasoning, and autonomous action across multiple sensory modalities. Drawing on advanced machine learning and human-computer interaction, we develop intelligent systems that integrate and interpret data from diverse sources to operate effectively in complex, dynamic environments.

Our work includes:

  • Real-time multimodal sensor fusion and computer vision for robust perception.
  • Advanced acoustic and audio signal processing for accurate modeling, detection, and tracking.
  • Development and fine-tuning of private large language models (LLMs) and agentic AI systems for autonomous planning, adaptive decision-making, and dialogue management.
  • Retrieval-augmented generation (RAG) and long-term memory architectures that extend these systems with external knowledge and persistent context.
  • Robotics and autonomous vehicles (ground, aerial, and underwater) for navigation in unstructured terrain and adverse environments.

By combining state-of-the-art techniques in vision, signal processing, language modeling, and cross-modal fusion, we build AI systems that are context-aware, continuously adaptive, and capable of intelligent interaction with their environment.

We build AI that doesn’t just analyze the world but engages with it intelligently.