Research

Advancing the Frontier of Tabular AI

Our research pushes beyond prediction, exploring causality, forecasting, multi-table reasoning, and true interpretability to build powerful AI systems you can trust.

Capabilities

Areas of Research

Relational and multi-table reasoning

Real-world data is rarely confined to a single table; it lives in complex, interconnected databases. Traditional AI, however, often analyzes tables in isolation, missing the bigger picture. Our research focuses on building models with native multi-table reasoning capabilities. We are developing architectures that can understand relational schemas, join information from disparate sources, and perform complex queries to unlock holistic insights that are impossible to find in a single view.
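
To ground what native multi-table reasoning replaces, here is a minimal sketch of today's typical workaround: manually flattening a relational schema with joins and aggregations before a single-table model ever sees the data. Table and column names are illustrative, not taken from any real pipeline.

```python
# A minimal sketch of the status quo this research targets: relational data must be
# hand-flattened with joins and aggregations before a single-table model can use it.
import pandas as pd

customers = pd.DataFrame({"customer_id": [1, 2], "segment": ["smb", "enterprise"]})
orders = pd.DataFrame({
    "order_id": [10, 11, 12],
    "customer_id": [1, 1, 2],
    "amount": [120.0, 80.0, 950.0],
})

# Hand-crafted aggregation: per-customer order statistics.
order_stats = orders.groupby("customer_id")["amount"].agg(["count", "mean"]).reset_index()

# Join back onto the customer table to produce one flat feature table.
flat = customers.merge(order_stats, on="customer_id", how="left")
print(flat)

# Native multi-table reasoning aims to remove this manual flattening step,
# letting the model consume the relational schema directly.
```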

Causality

True intelligence requires moving beyond correlation to understand causation. Answering "what will happen?" is useful, but answering "why does it happen?" is transformative. Our research is focused on building models that can infer causal relationships directly from observational data. By leveraging a training paradigm based on millions of synthetic datasets generated from underlying structural causal models, we are teaching our AI to identify not just patterns, but the drivers behind them. This is the key to enabling robust decision-making and moving from simple prediction to effective intervention.
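
As a concrete illustration of the generative process mentioned above, the toy sketch below samples one dataset from a simple structural causal model. The specific graph, coefficients, and noise choices are illustrative, not our actual synthetic-data prior.

```python
# Toy structural causal model (SCM): Z -> X, Z -> Y, X -> Y, with Z confounding X and Y.
import numpy as np

rng = np.random.default_rng(0)
n = 1_000

z = rng.normal(size=n)                              # unobserved confounder
x = 0.8 * z + rng.normal(scale=0.5, size=n)         # X caused by Z
y = 1.5 * x + 1.0 * z + rng.normal(scale=0.5, size=n)  # Y caused by X and Z

# The observational slope of Y on X mixes the true causal effect (1.5) with the
# confounded path through Z; training on many such SCMs teaches a model to
# separate the two.
print("observed slope:", np.polyfit(x, y, 1)[0])
```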

Time-Series Forecasting

Forecasting future trends is a cornerstone of strategic decision-making in finance, energy, and logistics. Yet traditional models often require extensive, task-specific training. Our research is developing foundation models that excel at zero-shot time-series forecasting. By learning universal patterns from a vast array of synthetic time series, our models can make highly accurate predictions on unseen data in seconds, delivering state-of-the-art performance for critical applications like algorithmic trading and risk management.
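
One common way to cast forecasting as a tabular problem is sketched below: build lag and calendar features, then let a pretrained tabular model predict future values. The GradientBoostingRegressor here is only a stand-in for such a foundation model, and the series is synthetic.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor

# Synthetic daily series with weekly seasonality.
idx = pd.date_range("2024-01-01", periods=200, freq="D")
noise = np.random.default_rng(1).normal(0, 0.1, len(idx))
y = pd.Series(10 + np.sin(2 * np.pi * idx.dayofweek / 7) + noise, index=idx)

df = pd.DataFrame({"y": y})
for lag in (1, 7):                        # lagged targets as features
    df[f"lag_{lag}"] = df["y"].shift(lag)
df["dayofweek"] = df.index.dayofweek      # calendar feature
df = df.dropna()

X, target = df.drop(columns="y"), df["y"]
model = GradientBoostingRegressor().fit(X[:-28], target[:-28])  # hold out last 4 weeks
preds = model.predict(X[-28:])
print("MAE:", np.mean(np.abs(preds - target[-28:].values)))
```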

Scalability

The transformative power of foundation models must extend to enterprise-scale data. Our research is dedicated to breaking the computational barriers of in-context learning, enabling our models to handle massive datasets without sacrificing speed. Through novel architectural optimizations, advanced caching strategies, and efficient attention mechanisms, we are drastically reducing memory requirements and inference latency. Our roadmap includes scaling TabPFN to effectively process up to one million samples, bringing the benefits of zero-shot, high-accuracy modeling to the largest and most demanding tabular data challenges.
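
As a small, generic illustration of one memory lever, the sketch below bounds peak inference memory by predicting a large test set in fixed-size chunks. The model interface is assumed for illustration, not a specific library API.

```python
import numpy as np

def predict_in_chunks(model, X_test, chunk_size=10_000):
    """Run inference chunk by chunk so peak memory stays bounded, then stitch results."""
    outputs = []
    for start in range(0, len(X_test), chunk_size):
        outputs.append(model.predict(X_test[start:start + chunk_size]))
    return np.concatenate(outputs)
```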

Interpretability

In high-stakes domains like finance and medicine, a prediction is only as valuable as the trust you can place in it. We reject the "black box" paradigm. Our research is focused on making the reasoning of our foundation models transparent and understandable. We build interpretability into our systems from the start, providing tools that allow users to dissect any prediction and understand which features drove the outcome. By demystifying complex model behavior, we provide the clarity and accountability required to confidently deploy AI in mission-critical applications.
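
For a flavour of model-agnostic feature attribution, the sketch below uses scikit-learn's permutation importance on a public dataset. This is a generic technique shown for illustration, not a description of our own interpretability tooling.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True, as_frame=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = RandomForestClassifier(random_state=0).fit(X_train, y_train)

# Shuffle each feature in turn and measure how much held-out accuracy drops.
result = permutation_importance(clf, X_test, y_test, n_repeats=10, random_state=0)
top = result.importances_mean.argsort()[::-1][:5]
for i in top:
    print(f"{X.columns[i]}: {result.importances_mean[i]:.3f}")
```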

Fairness

As AI models are increasingly used for critical decisions in areas like lending and clinical trials, ensuring they are free from bias is not just an ethical necessity but a technical imperative. Our research into fairness leverages our unique approach of pre-training on synthetic data. By controlling the generative process, we can create training environments that actively counteract historical biases, teaching our models to make equitable predictions from the ground up. We are developing novel methods to not only detect and measure bias in tabular data but also to build inherent fairness directly into the architecture of our foundation models, ensuring they are both powerful and principled.
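
As one concrete example of measuring bias, the sketch below computes the demographic parity difference: the gap in positive-prediction rates between groups defined by a sensitive attribute. The data and group labels are illustrative.

```python
import numpy as np

def demographic_parity_difference(y_pred, sensitive):
    """Absolute gap in positive-outcome rates across the two groups in `sensitive`."""
    y_pred, sensitive = np.asarray(y_pred), np.asarray(sensitive)
    groups = np.unique(sensitive)
    rates = [y_pred[sensitive == g].mean() for g in groups]
    return abs(rates[0] - rates[1])

# Example: predicted loan approvals (1 = approve) for two groups "a" and "b".
y_pred = [1, 0, 1, 1, 0, 1, 0, 0]
sensitive = ["a", "a", "a", "a", "b", "b", "b", "b"]
print(demographic_parity_difference(y_pred, sensitive))  # 0.75 - 0.25 = 0.5
```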

Do you want to push the boundaries of foundation models in tabular data?