STRUCTURAL HEALTH MONITORING: AN INTERNATIONAL JOURNAL

Advanced deep learning framework for underwater object detection with multibeam forward-looking sonar
Ge L, Singh P and Sadhu A
Underwater object detection (UOD) is an essential activity in maintaining and monitoring underwater infrastructure, playing an important role in its efficient and low-risk asset management. In underwater environments, sonar, recognized for overcoming the limitations of optical imaging in low-light and turbid conditions, has gained increasing popularity for UOD. However, due to the low resolution and limited foreground-background contrast of sonar images, existing sonar-based object detection algorithms still face challenges in precision and transferability. To address these challenges, this article proposes an advanced deep learning framework for UOD that uses data from multibeam forward-looking sonar. The framework adapts the network architecture of YOLOv7, one of the state-of-the-art vision-based object detection algorithms, by incorporating unique optimizations in three key aspects: data preprocessing, feature fusion, and loss functions. These improvements are extensively tested on a dedicated public dataset, showing superior object classification performance compared with selected existing sonar-based methods. Experiments conducted on an underwater remotely operated vehicle further validate significant enhancements in target classification, localization, and transfer learning capability. Since engineering structures have geometric shapes similar to the objects tested in this study, the proposed framework shows potential applicability to underwater structural inspection, monitoring, and autonomous asset management.
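The abstract does not detail the data preprocessing step, but low-contrast sonar frames are commonly conditioned by speckle denoising and adaptive contrast enhancement before detection. The sketch below illustrates that general idea with OpenCV; the function, its parameters, and the use of CLAHE are illustrative assumptions, not the authors' actual pipeline.

```python
import cv2
import numpy as np

def enhance_sonar_frame(frame: np.ndarray) -> np.ndarray:
    """Denoise and contrast-enhance a sonar frame (illustrative only)."""
    if frame.ndim == 3:
        # Collapse a color-mapped sonar image to a single channel.
        frame = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    # Median filtering suppresses the speckle noise typical of sonar.
    frame = cv2.medianBlur(frame, 3)
    # CLAHE boosts local foreground-background contrast without
    # over-amplifying noise the way global equalization can.
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
    return clahe.apply(frame)

# Example: enhanced = enhance_sonar_frame(cv2.imread("sonar_ping.png"))
```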
Deep learning-based obstacle-avoiding autonomous UAVs with fiducial marker-based localization for structural health monitoring
Waqas A, Kang D and Cha YJ
This paper proposes a framework for obstacle-avoiding autonomous unmanned aerial vehicle (UAV) systems, combining a new obstacle avoidance method (OAM) with a localization method for autonomous UAVs performing structural health monitoring (SHM) in GPS-denied areas. Obstacles are highly likely to appear in the planned trajectories of autonomous UAVs used for monitoring. A traditional UAV localization method based on an ultrasonic beacon is limited in monitoring range and vulnerable to both battery depletion and environmental electromagnetic interference. To overcome these critical problems, a deep learning-based OAM that integrates You Only Look Once version 3 (YOLOv3) and a fiducial marker-based UAV localization method are proposed. These new obstacle avoidance and localization methods are integrated with a real-time damage segmentation method to form an autonomous UAV system for SHM. In indoor tests and in outdoor tests in a large parking structure, the proposed methods showed superior performance in obstacle avoidance and UAV localization compared with traditional approaches.
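The abstract does not name the specific fiducial marker system used, so the sketch below shows one common approach: detecting an ArUco marker with OpenCV's legacy cv2.aruco module (opencv-contrib-python 4.6 or earlier) and recovering the camera pose with solvePnP. The camera intrinsics, marker dictionary, and marker size are placeholder assumptions, not values from the paper.

```python
import cv2
import numpy as np

# Placeholder calibration values; real ones come from camera calibration.
CAMERA_MATRIX = np.array([[800.0, 0.0, 320.0],
                          [0.0, 800.0, 240.0],
                          [0.0, 0.0, 1.0]])
DIST_COEFFS = np.zeros(5)
MARKER_LEN = 0.20  # printed marker side length in meters (assumed)

# 3D corner coordinates of a square marker centered at its own origin,
# in ArUco's corner order: top-left, top-right, bottom-right, bottom-left.
OBJ_PTS = np.array([[-1, 1, 0], [1, 1, 0], [1, -1, 0], [-1, -1, 0]],
                   dtype=np.float32) * (MARKER_LEN / 2.0)

DICTIONARY = cv2.aruco.getPredefinedDictionary(cv2.aruco.DICT_4X4_50)

def pose_from_marker(gray: np.ndarray):
    """Return (rvec, tvec) of the first detected marker, else None."""
    corners, ids, _ = cv2.aruco.detectMarkers(gray, DICTIONARY)
    if ids is None:
        return None
    # solvePnP maps the marker's known 3D corners to the detected 2D
    # corners, yielding the marker pose in the camera frame; inverting
    # that transform gives the UAV pose relative to the marker.
    ok, rvec, tvec = cv2.solvePnP(OBJ_PTS, corners[0].reshape(4, 2),
                                  CAMERA_MATRIX, DIST_COEFFS)
    return (rvec, tvec) if ok else None
```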
Deep learning-based concrete defects classification and detection using semantic segmentation
Arafin P, Billah AM and Issa A
Visual damage detection of infrastructure using deep learning (DL)-based computational approaches offers a potential way to reduce subjectivity while increasing the accuracy and accessibility of damage diagnosis in a structural health monitoring (SHM) system. However, despite remarkable advances in DL-based SHM, the most significant challenges to real-time implementation are the limited availability of defect image databases and the selection of DL network depth. To address these challenges, this research created a diverse dataset of concrete crack (4087) and spalling (1100) images and used it for damage condition assessment with convolutional neural network (CNN) algorithms. CNN classifier models identify the type of defect, and semantic segmentation labels the defect patterns within an image. Three CNN-based models, Visual Geometry Group (VGG)19, ResNet50, and InceptionV3, are incorporated as classifiers. For semantic segmentation, two encoder-decoder models, U-Net and the pyramid scene parsing network architecture, are developed on four backbone models: VGG19, ResNet50, InceptionV3, and EfficientNetB3. The classifier models are analyzed with two optimizers, stochastic gradient descent (SGD) and root mean square propagation (RMSprop), and three learning rates: 0.1, 0.001, and 0.0001. The segmentation models are analyzed with SGD and adaptive moment estimation, trained with three learning rates (0.1, 0.01, and 0.0001), and evaluated based on accuracy, intersection over union, precision, recall, and F1-score. InceptionV3 achieves the best classification performance, with an accuracy of 91.98% using the RMSprop optimizer. For crack segmentation, the EfficientNetB3-based U-Net outperformed all other algorithms, and for spalling segmentation, the InceptionV3-based U-Net did, with F1-scores of 95.66% and 89.43%, respectively.
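As a concrete illustration of the best-performing classification configuration reported above (InceptionV3 trained with RMSprop), a minimal Keras sketch might look like the following. The classification head, the frozen backbone, and the 1e-4 learning rate (one of the three rates the study tested) are assumptions for illustration, not the authors' exact setup.

```python
import tensorflow as tf

# Pretrained InceptionV3 backbone without its original classifier head.
base = tf.keras.applications.InceptionV3(
    include_top=False, weights="imagenet", input_shape=(299, 299, 3))
base.trainable = False  # assumed: train only the new head initially

# Two-class head for crack vs. spalling (layer sizes are illustrative).
model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(256, activation="relu"),
    tf.keras.layers.Dense(2, activation="softmax"),
])

# RMSprop is the optimizer the study reports as performing best;
# 1e-4 is one of the learning rates it evaluated.
model.compile(
    optimizer=tf.keras.optimizers.RMSprop(learning_rate=1e-4),
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)
```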
Efficient attention-based deep encoder and decoder for automatic crack segmentation
Kang DH and Cha YJ
Recently, crack segmentation has been investigated extensively using deep convolutional neural networks. However, significant deficiencies remain in the preparation of ground truth data, the consideration of complex scenes, the development of object-specific networks for crack segmentation, and evaluation methods, among other issues. In this paper, a novel semantic transformer representation network (STRNet) is developed for real-time, pixel-level crack segmentation in complex scenes. STRNet is composed of a squeeze-and-excitation attention-based encoder, a multi-head attention-based decoder, coarse upsampling, a focal-Tversky loss function, and a learnable swish activation function, designed to keep the network concise while preserving its fast processing speed. A method for evaluating the level of complexity of image scenes is also proposed. The proposed network is trained with 1203 images with further extensive synthesis-based augmentation and tested on 545 images (1280 × 720, 1024 × 512); it achieves 91.7%, 92.7%, 92.2%, and 92.6% in precision, recall, F1 score, and mIoU (mean intersection over union), respectively. Its performance is compared with those of recently developed advanced networks (Attention U-net, CrackSegNet, Deeplab V3+, FPHBN, and Unet++), with STRNet showing the best performance across the evaluation metrics while also achieving the fastest processing speed, at 49.2 frames per second.
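The focal-Tversky loss mentioned above weights false negatives and false positives asymmetrically and down-weights easy examples, which helps with the severe foreground-background imbalance of thin cracks. A minimal PyTorch sketch follows; the default alpha, beta, and exponent come from the original focal-Tversky formulation, not necessarily the values used in STRNet.

```python
import torch

def focal_tversky_loss(pred: torch.Tensor, target: torch.Tensor,
                       alpha: float = 0.7, beta: float = 0.3,
                       gamma: float = 0.75, eps: float = 1e-7) -> torch.Tensor:
    """Focal-Tversky loss for binary segmentation.

    `pred` holds probabilities in [0, 1] (e.g., after a sigmoid) and
    `target` holds {0, 1} ground-truth masks of the same shape. The
    defaults (alpha=0.7, beta=0.3, exponent 0.75) follow the original
    focal-Tversky paper, an assumption rather than STRNet's settings.
    """
    pred, target = pred.reshape(-1), target.reshape(-1)
    tp = (pred * target).sum()            # soft true positives
    fn = ((1.0 - pred) * target).sum()    # soft false negatives
    fp = (pred * (1.0 - target)).sum()    # soft false positives
    # Tversky index: alpha > beta penalizes missed crack pixels harder.
    tversky = (tp + eps) / (tp + alpha * fn + beta * fp + eps)
    # Exponent < 1 amplifies the gradient on hard, low-overlap examples.
    return (1.0 - tversky) ** gamma
```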