Abstract
LEGO-SLAM is the first framework to achieve real-time, open-vocabulary mapping within a SLAM system based on 3D Gaussian Splatting (3DGS). We distill high-dimensional language embeddings into a compact, scene-adaptive 16-dimensional feature space, which drastically reduces memory usage and lets the system run at 15 FPS. We also introduce a language-guided pruning strategy that significantly reduces the number of Gaussians without quality loss, and an efficient loop detection method that reuses the mapping features for robust tracking in novel environments.