ScaleMaster Dataset - Multi-Scale 3D Scene Understanding

Abstract

Recent deep monocular visual SLAM systems have achieved impressive accuracy and dense reconstruction capabilities, yet their robustness to scale inconsistency in large-scale indoor environments remains largely unexplored. Existing benchmarks are limited to room-scale or structurally simple settings, leaving critical issues of intra-session scale drift and inter-session scale ambiguity insufficiently addressed. To fill this gap, we introduce the ScaleMaster Dataset, the first benchmark explicitly designed to evaluate scale consistency under challenging scenarios such as multi-floor structures, long trajectories, repetitive views, and low-texture regions. We systematically analyze the vulnerability of state-of-the-art deep monocular visual SLAM systems to scale inconsistency, providing both qualitative and quantitative evaluations. Crucially, our analysis extends beyond traditional trajectory metrics to include a direct map-to-map quality assessment, using metrics such as Chamfer distance against high-fidelity 3D ground truth. Our results reveal that while these systems demonstrate strong performance on existing benchmarks, they suffer from severe scale-related failures in realistic, large-scale indoor environments. By releasing the ScaleMaster Dataset and baseline results, we aim to establish a foundation for future research toward scale-consistent and reliable visual SLAM systems.
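
To make the map-to-map assessment mentioned in the abstract more concrete, the following is a minimal sketch of a symmetric Chamfer distance between two point clouds. It uses one common definition (the sum of the mean nearest-neighbour distances in both directions); the exact variant used in the benchmark is not stated on this page. The function name `chamfer_distance` and the synthetic point clouds are assumptions made for illustration and are not part of the dataset tooling.

```python
import numpy as np


def chamfer_distance(a: np.ndarray, b: np.ndarray) -> float:
    """Symmetric Chamfer distance between point sets a (N, 3) and b (M, 3).

    Illustrative definition only: mean nearest-neighbour distance from a to b
    plus mean nearest-neighbour distance from b to a.
    """
    # Brute-force pairwise Euclidean distances, shape (N, M).
    # Fine for small clouds; large maps would need a spatial index instead.
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return float(d.min(axis=1).mean() + d.min(axis=0).mean())


# Synthetic stand-ins for a ground-truth map and an estimated map (hypothetical data).
rng = np.random.default_rng(0)
gt = rng.uniform(-1.0, 1.0, size=(500, 3))
scaled = 1.2 * gt  # the same map with a 20% global scale error

print("identical maps:", chamfer_distance(gt, gt))
print("scale error   :", chamfer_distance(gt, scaled))
```

Under these assumptions, identical maps give a distance of zero, while a pure scale error produces a nonzero distance even though the map shape is unchanged. This is the kind of failure that a map-level metric can expose.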

Dataset Statistics

Sequences: 25
Total length: 3.8 km+
Environments: 10+
Data type: RGB+D+IMU

Dataset Sequences

📚 Library Environments (9 sequences)

Library 01: multi-floor & 3D Map
Library 02: loop & 3D Map
Library 03: loop
Library 04: narrow
Library 05: vertical
Library 06: rotation & 3D Map
Library 07: survey & 3D Map
Library 08: survey
Library 09: glass

🏢 Large Hall Environments (5 sequences)

LargeHall 01: large
LargeHall 02: night & 3D Map
LargeHall 03: night
LargeHall 04: loop
LargeHall 05: low-light & 3D Map

🚗 Parking & Basement (3 sequences)

Parking 01: parking area
Parking 02: parking area
Basement 01: vertical & 3D Map

🪜 Stairs & Station (3 sequences)

Stairs 01: stairs
Stairs 02: repetitive
Station 01: dynamic

🏠 Indoor Rooms (5 sequences)

HotelRoom 01: small
Lab 01: lab
Office 01: office
Lobby 01: lobby
Lounge 01: lounge

Download Dataset

📥 Download: individual sequences (via the download survey)
💻 Git Clone: clone the entire dataset with Git LFS
📖 Documentation: dataset format and usage examples (on GitHub)

Command Line Download

# Clone with Git LFS
git clone https://github.com/JooHyoSeok/ScaleMaster-Dataset

Citation

@inproceedings{ju2026scalemaster,
  title={Have We Mastered Scale in Deep Monocular Visual SLAM? The ScaleMaster Dataset and Benchmark},
  author={Ju, Hyoseok and Suh, Bokeon and Kim, Giseop},
  booktitle={Proceedings of the IEEE International Conference on Robotics and Automation (ICRA)},
  year={2026}
}