MAYS IBRAHIM (ABU DHABI)
Researchers at the Mohamed Bin Zayed University of Artificial Intelligence (MBZUAI) are developing deployable, real-world AI applications - from smart glasses powered by reasoning models to tools that streamline creative workflows. Live demos were showcased at the Machines Can Think Summit 2026 in Abu Dhabi, which ran from January 26 to 27.
Among the highlights is OMER Care - a wearable AI system built around smart glasses and a multimodal large language model capable of real-time scene understanding, natural conversation, and contextual reasoning.
"This is not just a vision model or a speech model; it's an agentic pipeline," Hisham Cholakkal, Assistant Professor of Computer Vision at the MBZUAI, told Aletihad.
"It combines visual understanding, audio interaction and reasoning into a single system that decides which model to call, depending on the task."
The demo features a pair of AI-powered smart glasses equipped with a camera, microphone and, in one version, an eye tracker that detects the user's object of interest.
Visual input from the glasses is processed by computer vision models for scene understanding and optical character recognition, while speech interaction allows users to ask questions naturally, without reading screens or typing commands.
One version of this solution is assistive AI glasses designed for healthcare and daily living.
It integrates eye-tracking technology and connects to a healthcare-specialised AI model, MedicsR1, according to Cholakkal.
This allows elderly users or stroke survivors to identify objects, read medicine labels, and ask health-related questions in real time, he explained.
"An elderly user, for example, can look at a medicine bottle and ask whether it is appropriate for a specific condition, what dosage to take, or whether it is safe for certain symptoms. The system then responds through audio guidance."
The second version focuses on education, integrating the smart glasses with MBZUAI's reasoning model, K2Think, which is designed for advanced mathematical and logical reasoning.
In a live demonstration, the system was shown interpreting a complex geometry problem through the camera, identifying multiple shapes within a structure and breaking the task into steps.
When a user asked for the volume of the object, the system recognised it as a complex reasoning task and automatically routed the problem to K2Think.
"It understands there's a cylinder here, a cone here, another cone there, and it knows it needs to calculate each volume separately and combine them," Cholakkal explained. "It doesn't just give an answer. It explains each step, so it becomes a teaching tool."
The educational version is designed to guide students through problem-solving rather than simply producing results, turning the AI into an interactive tutor that teaches reasoning processes through voice and visual explanation.
A New Way to Find Soundtracks for Videos
Another MBZUAI-backed demo at the summit is Audiomatic, developed by PhD student Muhammad Taimoor Haseeb.
"What people do today is search manually through massive audio library using text; you type a keyword, scroll through endless results, preview tracks then try to sync everything manually," Haseeb told Aletihad.
Audiomatic, he noted, automatically generates or retrieves synchronised soundtracks and sound effects for video content.
The AI-powered platform functions both as a creator web app and a developer API, allowing integration into video editors, text-to-video platforms, gaming studios and sound libraries.
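The article does not document the developer API itself, but a typical integration might look like the sketch below, in which the endpoint, parameters and response format are all hypothetical:

```python
import requests

# Hypothetical endpoint and parameters -- Audiomatic's actual API is not
# documented in the article, so these names are illustrative only.
API_URL = "https://api.example.com/v1/soundtrack"

with open("clip.mp4", "rb") as video:
    resp = requests.post(
        API_URL,
        files={"video": video},
        data={"mode": "retrieve", "sync": "true"},  # retrieve vs. generate
        timeout=120,
    )
resp.raise_for_status()

with open("soundtrack.wav", "wb") as out:
    out.write(resp.content)  # synchronised track returned by the service
```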
According to Cholakkal, who is also a musician, the platform relies on ethically sourced, pre-recorded loops created by human musicians, who receive royalties when their work is used.