Development of an intelligent framework for semantic road segmentation using depth and camera motion constraints: An approach for resource management and engineering decision making in transportation

Mohammad Mohammadi

Authors

Mohammad Mohammadi * M.Sc. graduate in Computer Engineering (Artificial Intelligence and Robotics), K.N. Toosi University of Technology, m.mohamadi2@email.kntu.ac.ir

Abstract

A major challenge in semantic segmentation of road videos is exploiting the 3D structure of the scene and the camera motion when only 2D pixel-level labels are available. This paper introduces a semi-supervised learning framework for semantic segmentation that exploits the geometric constraints imposed by depth and camera motion in video. For each 15-frame sequence, only the first frame is manually labeled pixel-wise; pseudo-labels for the subsequent frames are generated by geometrically mapping the labeled frame's mask onto them. Monocular depth and the relative camera motion between frames are estimated with the Monodepth2 model, which learns both in a self-supervised manner through novel view synthesis and a photometric reprojection error. Using the estimated depth and motion, each pixel of the reference-frame mask is back-projected into 3D space and then projected into the destination frame; comparing the mapped depth with the depth estimated in the destination frame yields a validity mask that is used to refine the pseudo-labels. The pseudo-labels are then used for semi-supervised training of the same LR-ASPP segmentation network that serves as the baseline. Evaluation on the KITTI-STEP dataset shows that the semi-supervised model based on geometric mapping (SSL-Warp) achieves an mIoU of 25.73% and a pixel accuracy of 71.42%, a slight improvement over the frame-only baseline model. Analysis of the results shows that the depth and camera motion constraints are effective for static classes such as road and vegetation, whereas for independently moving objects they need to be combined with explicit motion information.
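The geometric mapping described in the abstract can be illustrated with a minimal NumPy sketch. It is not the author's implementation: the function name `warp_labels`, the `IGNORE` value, the relative depth tolerance `rel_tol`, and the nearest-point rule for pixels that collide in the destination frame are all assumptions made for illustration. Depth maps, the intrinsic matrix `K`, and the 4x4 relative pose are assumed to be already available as arrays (for example, exported from a depth and pose network) and to share a common scale.

```python
import numpy as np

IGNORE = 255  # hypothetical "no pseudo-label" value, assumed unused by real classes


def warp_labels(ref_mask, ref_depth, tgt_depth, K, T_ref_to_tgt, rel_tol=0.1):
    """Map a labeled reference mask into a destination frame.

    ref_mask     : (H, W) integer class labels of the reference frame
    ref_depth    : (H, W) estimated depth of the reference frame
    tgt_depth    : (H, W) estimated depth of the destination frame
    K            : (3, 3) camera intrinsics with last row [0, 0, 1]
    T_ref_to_tgt : (4, 4) rigid transform from reference to destination camera
    rel_tol      : assumed relative depth tolerance for the validity check
    Returns (pseudo_labels, validity_mask).
    """
    h, w = ref_mask.shape
    v, u = np.mgrid[0:h, 0:w]
    pix = np.stack([u.ravel(), v.ravel(), np.ones(h * w)], axis=0)  # 3 x N

    # Back-project every reference pixel to a 3D point using its depth.
    pts_ref = (np.linalg.inv(K) @ pix) * ref_depth.ravel()

    # Move the points into the destination camera frame.
    pts_tgt = T_ref_to_tgt[:3, :3] @ pts_ref + T_ref_to_tgt[:3, 3:4]
    z = pts_tgt[2]
    in_front = z > 1e-6
    z_safe = np.where(in_front, z, 1.0)  # avoid division by zero

    # Project into the destination image and round to the nearest pixel.
    proj = K @ pts_tgt
    u_t = np.rint(proj[0] / z_safe).astype(np.int64)
    v_t = np.rint(proj[1] / z_safe).astype(np.int64)
    inside = in_front & (u_t >= 0) & (u_t < w) & (v_t >= 0) & (v_t < h)
    idx = np.flatnonzero(inside)

    # Validity check: mapped depth must agree with the destination depth.
    d_est = tgt_depth[v_t[idx], u_t[idx]]
    idx = idx[np.abs(z[idx] - d_est) <= rel_tol * d_est]

    # If several points land on one pixel, keep the nearest one.
    idx = idx[np.argsort(z[idx], kind="stable")]
    lin = v_t[idx] * w + u_t[idx]
    _, first = np.unique(lin, return_index=True)
    idx = idx[first]

    pseudo = np.full((h, w), IGNORE, dtype=ref_mask.dtype)
    valid = np.zeros((h, w), dtype=bool)
    pseudo[v_t[idx], u_t[idx]] = ref_mask.ravel()[idx]
    valid[v_t[idx], u_t[idx]] = True
    return pseudo, valid
```

In a training loop, pixels where the validity mask is false would typically be excluded from the loss (for instance by using `IGNORE` as the ignored label index). Because the mapping assumes a rigid scene, labels on independently moving objects can land in the wrong place, which is consistent with the limitation noted in the abstract.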

