Posts by Collection

publications

Chain FL: Decentralized Federated Machine Learning via Blockchain

Published in BCCA 2020, 2020

Chain FL: Decentralized federated machine learning via blockchain.

Download here

Integrating Image Data Extraction and Table Parsing Methods for Chart Question Answering

Published in ChartQA @ CVPR 2021, 2021

Integrating image data extraction and table parsing methods for chart question answering.

Download here

Chart-to-Text: A Large-Scale Benchmark for Chart Summarization

Published in ACL 2022, 2022

Chart-to-Text: A large-scale benchmark for chart summarization.

Download here

ChartQA: A Benchmark for Question Answering about Charts with Visual and Logical Reasoning

Published in ACL 2022 Findings, 2022

ChartQA: A benchmark for question answering about charts with visual and logical reasoning.

Download here

Chart Question Answering: State of the Art and Future Directions

Published in EuroVis 2022, 2022

Chart Question Answering: State of the art and future directions.

Download here

UniChart: A Universal Vision-language Pretrained Model for Chart Comprehension and Reasoning

Published in EMNLP 2023, 2023

UniChart: A universal vision-language pretrained model for chart comprehension and reasoning.

Download here

Do LLMs Work on Charts? Designing Few-Shot Prompts for Chart Question Answering and Summarization

Published in arXiv Preprint, 2023

Do LLMs Work on Charts? Designing few-shot prompts for chart question answering and summarization.

Download here

LongFin: A Multimodal Document Understanding Model for Long Financial Domain Documents

Published in AIFinSI @ AAAI 2024, 2024

LongFin: A multimodal document understanding model for long financial domain documents.

Download here

ChartInstruct: Instruction Tuning for Chart Comprehension and Reasoning

Published in ACL 2024, 2024

ChartInstruct: Instruction tuning for chart comprehension and reasoning.

Download here

Are Large Vision Language Models up to the Challenge of Chart Comprehension and Reasoning? An Extensive Investigation into the Capabilities and Limitations of LVLMs

Published in EMNLP 2024, 2024

An extensive investigation into the capabilities and limitations of Large Vision Language Models for chart comprehension and reasoning.

Download here

BigDocs: A Permissively-Licensed Dataset for Training Vision-Language Models on Document and Code Tasks

Published in RBFM @ NeurIPS 2024, 2024

BigDocs is a permissively-licensed dataset for training vision-language models on document and code tasks.

Download here

Apriel-1.5-15b-Thinker

Published in arXiv Preprint, 2025

Apriel-1.5-15b-Thinker.

Download here

LLM-based data science agents: A survey of capabilities, challenges, and future directions

Published in arXiv Preprint, 2025

LLM-based data science agents: A survey of capabilities, challenges, and future directions.

Download here

Improving GUI Grounding with Explicit Position-to-Coordinate Mapping

Published in arXiv Preprint, 2025

Improving GUI Grounding with Explicit Position-to-Coordinate Mapping.

Download here

Scope: Selective Cross-modal Orchestration of Visual Perception Experts

Published in arXiv Preprint, 2025

Scope: Selective Cross-modal Orchestration of Visual Perception Experts.

Download here

Judging the Judges: Can Large Vision-Language Models Fairly Evaluate Chart Comprehension and Reasoning?

Published in ACL Industry Track 2025, 2025

Judging the Judges: Can Large Vision-Language Models Fairly Evaluate Chart Comprehension and Reasoning?

Download here

Learning or Cheating? Assessing Data Contamination in Large Vision-Language Models

Published in IEEE MLSP 2025, 2025

Learning or Cheating? Assessing Data Contamination in Large Vision-Language Models.

Download here

ColFlor: Towards BERT-Size Vision-Language Document Retrieval Models

Published in IEEE MLSP 2025, 2025

ColFlor: Towards BERT-Size Vision-Language Document Retrieval Models.

Download here

BigDocs: A Permissively-Licensed Dataset for Training Vision-Language Models on Document and Code Tasks

Published in ICLR 2025, 2025

BigDocs is a permissively-licensed dataset for training vision-language models on document and code tasks.

Download here

ChartGemma: Visual Instruction-tuning for Chart Reasoning in the Wild

Published in COLING 2025 Industry Track, 2025

ChartGemma is a visual instruction-tuned model for chart reasoning.

Recommended citation: Masry, A., Thakkar, M., Bajaj, A., Kartha, A., Hoque, E., & Joty, S. (2025). ChartGemma: Visual Instruction-tuning for Chart Reasoning in the Wild. COLING 2025. https://arxiv.org/abs/2407.04172v1

COLMATE: Contrastive Late Interaction and Masked Text for Multimodal Document Retrieval

Published in EMNLP 2025 Industry Track, 2025

COLMATE: Contrastive Late Interaction and Masked Text for Multimodal Document Retrieval.

Download here

ChartQAPro: A more diverse and challenging benchmark for chart question answering

Published in ACL 2025 Findings, 2025

ChartQAPro: A more diverse and challenging benchmark for chart question answering.

Download here

BigCharts-R1: Enhanced Chart Reasoning with Visual Reinforcement Finetuning

Published in COLM 2025, 2025

BigCharts-R1: Enhanced Chart Reasoning with Visual Reinforcement Finetuning.

Download here

AlignVLM: Bridging Vision and Language Latent Spaces for Multimodal Document Understanding

Published in NeurIPS 2025, 2025

AlignVLM: Bridging vision and language latent spaces for multimodal document understanding.

Download here

DashboardQA: Benchmarking Multimodal Agents for Question Answering on Interactive Dashboards

Published in EACL 2026, 2026

DashboardQA: Benchmarking Multimodal Agents for Question Answering on Interactive Dashboards.

Download here

talks

Multimodal Chart Understanding with VLMs

Published: March 01, 2025

NLP+Vis: Data Visualization Understanding with VLMs

Published: September 01, 2025

Data Visualization Understanding with VLMs

Published: December 05, 2025

teaching

Undergraduate Teaching Assistant

Undergraduate course, Office of Learning and Teaching (KOLT), Koç University, 2019

Role: Undergraduate Teaching Assistant
Period: Spring Semester, 2019
Location: Istanbul, Turkey

Teaching Assistant

Graduate and Undergraduate courses, York University, 2021

Role: Teaching Assistant
Period: January 2021 – April 2022 & September 2024 – Present
Location: Toronto, Ontario