AfterQuery Research

Our research is guided by the thesis that model performance is bounded by the quality of its training data. Great models start with great data.

Research Publications

Machine Learning
arXiv:2501.18062 (January 2025)

FinanceQA: A Benchmark for Evaluating Financial Analysis Capabilities of Large Language Models

A comprehensive testing suite evaluating LLMs' performance on complex numerical financial analysis tasks that mirror real-world investment work.

Security Research
arXiv:2505.19395 (May 2025)

VADER: A Human-Evaluated Benchmark for Vulnerability Assessment, Detection, Explanation, and Remediation

A benchmark designed to assess LLM performance across four key vulnerability-handling dimensions using 174 real-world software vulnerabilities.

Our Research Areas

We focus on several key areas to advance the field of artificial intelligence.

AI Safety & Security

Researching methods to ensure AI systems behave in accordance with human values and intentions while minimizing potential risks.

Multimodal Learning

Developing systems that can understand and reason across multiple modalities, including images, audio, and video.

Computer Use & Automation

Developing AI systems that can interact with and control computer interfaces, enabling autonomous task execution and workflow automation.

Data Quality & Curation

Developing methods for creating high-quality training datasets that drive superior model performance and reliability.

Model Evaluation

Creating comprehensive benchmarks and evaluation frameworks to assess AI model capabilities across diverse real-world scenarios.

Get started

Connect with our Team

Our research advances foundation model capabilities through human-generated, specialized datasets.