Development of an intelligent framework for semantic road segmentation using depth and camera motion constraints: An approach for resource management and engineering decision making in transportation
Abstract
A major challenge in semantic segmentation of road videos is exploiting the 3D structure of the scene and the camera motion when only 2D pixel-level labels are available. In this paper, we introduce a semi-supervised learning framework for semantic segmentation that exploits the geometric constraints imposed by depth and camera motion in video. For each 15-frame sequence, only the first frame is manually labeled pixel-wise; pseudo-labels for the subsequent frames are generated by “geometrically mapping” the mask of the labeled frame onto them. Monocular depth and the relative camera motion between frames are estimated with the Monodepth2 model, which learns both in a self-supervised manner through novel view synthesis and a photometric reprojection error. Using the estimated depth and motion, each pixel of the reference-frame mask is back-projected into 3D space and then projected into the destination frame; comparing the mapped depth with the depth estimated in the destination frame yields a validation mask that refines the pseudo-labels. These pseudo-labels are then used for semi-supervised training of the same LR-ASPP segmentation network as the baseline. On the KITTI-STEP dataset, the semi-supervised model based on geometric mapping (SSL-Warp) achieves an mIoU of 25.73% and a pixel accuracy of 71.42%, a slight improvement over the frame-only baseline. Analysis of the results shows that the depth and camera motion constraints work well for static classes such as road and vegetation, whereas for independently moving objects they need to be combined with explicit motion information.
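The geometric mapping step described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function name `warp_labels`, the relative depth tolerance `rel_tol`, the ignore value 255, and the nearest-pixel rounding are all assumptions made for illustration. Depth maps, the intrinsic matrix `K`, and the 4x4 relative pose are assumed to be supplied as plain arrays (for example, exported from a depth and pose estimator such as Monodepth2).

```python
import numpy as np

def warp_labels(src_labels, src_depth, dst_depth, K, T_src_to_dst,
                rel_tol=0.1, ignore_index=255):
    """Forward-warp a label mask from a source frame to a destination frame.

    Hypothetical helper: returns (pseudo_labels, valid_mask). Pixels without
    a depth-consistent source pixel keep `ignore_index`.
    """
    h, w = src_labels.shape
    # Homogeneous pixel grid, shape (3, H*W), in row-major order.
    v, u = np.mgrid[0:h, 0:w]
    pix = np.stack([u.ravel(), v.ravel(), np.ones(h * w)]).astype(np.float64)

    # Back-project every source pixel to 3D using its estimated depth.
    pts_src = (np.linalg.inv(K) @ pix) * src_depth.ravel()

    # Move the points into the destination camera frame.
    pts_dst = T_src_to_dst[:3, :3] @ pts_src + T_src_to_dst[:3, 3:4]
    z = pts_dst[2]

    # Project into the destination image (nearest pixel).
    in_front = z > 1e-3
    safe_z = np.where(in_front, z, 1.0)
    proj = K @ pts_dst
    u_dst = np.rint(proj[0] / safe_z).astype(np.int64)
    v_dst = np.rint(proj[1] / safe_z).astype(np.int64)
    inside = in_front & (u_dst >= 0) & (u_dst < w) & (v_dst >= 0) & (v_dst < h)

    # Validation: the mapped depth must agree with the depth estimated
    # in the destination frame, within a relative tolerance.
    idx = np.flatnonzero(inside)
    est = dst_depth[v_dst[idx], u_dst[idx]]
    idx = idx[np.abs(z[idx] - est) <= rel_tol * est]

    # Write far points first so the nearest point wins on collisions.
    idx = idx[np.argsort(-z[idx])]
    pseudo = np.full((h, w), ignore_index, dtype=src_labels.dtype)
    valid = np.zeros((h, w), dtype=bool)
    pseudo[v_dst[idx], u_dst[idx]] = src_labels.ravel()[idx]
    valid[v_dst[idx], u_dst[idx]] = True
    return pseudo, valid
```

In a training loop, pixels where the returned mask is false would typically be excluded from the segmentation loss (for instance through an ignore index), so that occluded regions and independently moving objects do not contribute incorrect pseudo-labels.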
License
Copyright (c) 2026 Mohammad Mohammadi

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

