Architecting Efficient Multimodal Data Systems

Amine Mhedhbi - Polytechnique Montréal

Nov. 14, 2025, 2:30 p.m. - Nov. 14, 2025, 3:30 p.m.

ENGMD 279

Hosted by: Oana Balmau


Data management is undergoing a major shift from traditional monomodal data systems to multimodal ones. In monomodal settings, data of a single form, such as tables, graphs, or documents, is stored and queried declaratively. By contrast, new multimodal workloads are emerging with the rise of large language models (LLMs), modern data lakes, and the push toward automating enterprise processes. These workloads challenge conventional systems not only because of data heterogeneity but also because of the increasing scale of data and the need for higher query throughput.

In this talk, I will outline core principles of data system design applied to multimodal query processing. I will begin with graph data, which plays a central role in linking siloed datasets through relationship tables that can be queried declaratively. A central challenge here is managing the explosion of large intermediate results. Moving to multimodal query processing, I will present design considerations for systems that integrate foundation models. I will also highlight our work on semantic operators within relational data systems and discuss the optimizations required to make them practical and cost-effective.

Amine Mhedhbi is an assistant professor at Polytechnique Montréal, where he leads the Data and AI Systems Group. His interests are in building and analyzing analytical data management systems; his work includes tackling performance considerations and debuggability, interface design, and data applications. Amine received his Ph.D. in 2023 from the University of Waterloo. His research has been awarded a VLDB best paper award, a Microsoft Ph.D. fellowship award, and the University of Waterloo's Computer Science distinguished dissertation award.