
Problem Statement

Large language models (LLMs) excel at reasoning across text and other modalities, yet they struggle with continuous temporal signals such as time-series data. This limitation is critical, as many real-world domains—from healthcare to finance—depend on reasoning about evolving trajectories.

Recent work by our group, in collaboration with Stanford and Google researchers, has developed and validated a new multimodal LLM architecture, OpenTSLM, which extends LLaMA and Gemma with time series as an additional modality. Using soft prompting and Flamingo-style cross-attention, these text–time-series LLMs reason effectively over longitudinal health data, achieving state-of-the-art performance on tasks such as ECG-based question answering, activity recognition, and sleep stage detection.
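
To make the fusion idea concrete, the sketch below shows a minimal, hypothetical version of Flamingo-style cross-attention between text hidden states and time-series tokens. The encoder, patch size, dimensions, and module names are illustrative assumptions for exposition, not the OpenTSLM implementation.

```python
# Minimal sketch (PyTorch): text hidden states attend to time-series embeddings
# via gated cross-attention, in the spirit of Flamingo-style fusion.
# All names and dimensions are illustrative, not the actual OpenTSLM code.
import torch
import torch.nn as nn


class TimeSeriesEncoder(nn.Module):
    """Projects non-overlapping patches of a univariate series into embedding tokens."""

    def __init__(self, patch_len: int = 16, d_model: int = 512):
        super().__init__()
        self.patch_len = patch_len
        self.proj = nn.Linear(patch_len, d_model)

    def forward(self, series: torch.Tensor) -> torch.Tensor:
        # series: (batch, length) -> (batch, n_patches, d_model)
        b, length = series.shape
        n_patches = length // self.patch_len
        patches = series[:, : n_patches * self.patch_len].reshape(b, n_patches, self.patch_len)
        return self.proj(patches)


class GatedCrossAttention(nn.Module):
    """Text tokens query time-series tokens; a tanh gate (initialized closed) scales the update."""

    def __init__(self, d_model: int = 512, n_heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm = nn.LayerNorm(d_model)
        self.gate = nn.Parameter(torch.zeros(1))

    def forward(self, text_h: torch.Tensor, ts_tokens: torch.Tensor) -> torch.Tensor:
        attended, _ = self.attn(query=self.norm(text_h), key=ts_tokens, value=ts_tokens)
        return text_h + torch.tanh(self.gate) * attended


if __name__ == "__main__":
    encoder, fuser = TimeSeriesEncoder(), GatedCrossAttention()
    series = torch.randn(2, 256)           # two example signals of length 256
    text_hidden = torch.randn(2, 32, 512)  # hidden states for 32 text tokens
    fused = fuser(text_hidden, encoder(series))
    print(fused.shape)                     # torch.Size([2, 32, 512])
```

The zero-initialized gate keeps the pretrained language model's behavior intact at the start of training and lets the time-series signal be blended in gradually.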

Project Approach

Building on this foundation, we aim to extend these architectures beyond healthcare. We hypothesize that OpenTSLM models can be adapted to reason jointly over time-series trajectories and new text input across a wide range of domains: explaining why certain events may shift temporal patterns, which alternative scenarios are plausible, and how new signals relate to historical dynamics. Once such reasoning capabilities are established, the same models could also predict future trajectories, combining causal and contextual insights with temporal forecasting.
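
As an illustration of the kind of reasoning task we have in mind, a training or evaluation example might pair a trajectory with contextual text, a question, and a free-text rationale as the supervision target. The field names and values below are hypothetical placeholders, not a fixed schema.

```python
# Hypothetical cross-domain reasoning sample: trajectory + textual context + question,
# with a free-text rationale as the target output. Purely illustrative.
from dataclasses import dataclass
from typing import List


@dataclass
class TSReasoningSample:
    series: List[float]      # raw trajectory values (e.g., daily demand, asset price)
    context: str             # textual event or news relevant to the trajectory
    question: str            # what the model should reason about
    target_rationale: str    # free-text explanation used as supervision


sample = TSReasoningSample(
    series=[102.4, 101.9, 103.2, 97.1, 95.8, 96.0],
    context="A supply disruption was announced between the third and fourth observation.",
    question="Why does the trajectory shift after the third observation, and what "
             "alternative scenarios are plausible?",
    target_rationale="The level drop coincides with the announced disruption; absent "
                     "that event, the prior upward drift would plausibly have continued.",
)
```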

Applications

Beyond healthcare, we see broad potential applications in domains such as finance and robotics, where textual context shapes temporal dynamics.

Goal

Our objective is to develop and validate novel Time Series Language Model (TSLM) architectures beyond healthcare, demonstrating their ability to reason over and, where appropriate, predict temporal data streams. If successful, this will enable a new class of multimodal reasoning agents that explain not only what is happening, but also why systems evolve as they do—bridging text and time-series signals in critical domains such as finance, healthcare, and robotics.

Contact