New AI Model Boosts Accuracy in Airport Pavement Crack Detection

Detecting cracks in airport pavements has long posed a formidable challenge for engineers and maintenance teams. Cracks are often narrow, irregularly shaped, and embedded in complex backgrounds filled with tire marks, rubber deposits, shadows, and surface textures that confound traditional inspection methods. Manual visual inspections are slow, costly, and prone to human error, while conventional computer vision techniques frequently fail under real-world conditions marked by variable lighting, low contrast, and high noise.

A team of researchers from the College of Computer Science and Technology at the Civil Aviation University of China has introduced a deep learning architecture designed specifically for these conditions. Their approach, detailed in a paper published in the Journal of Nanjing University of Aeronautics & Astronautics, uses deformable convolutions and multi-scale feature fusion for pixel-level crack segmentation, reaching an F1-Score of 90.95% on a real-world dataset collected from multiple Chinese airports.

The model, dubbed DFNet (Deformable Convolution and Feature Fusion Neural Network), is aimed at automated infrastructure inspection. Unlike prior methods that rely on fixed geometric receptive fields or shallow feature hierarchies, DFNet adapts its sampling locations to the shape of each crack, captures contextual information across multiple spatial scales, and merges low-level detail with high-level semantics to produce sharp, precise segmentation maps.

This advancement arrives at a critical time. As global air traffic rebounds and airport infrastructure ages, the need for efficient, reliable, and scalable pavement monitoring systems has never been greater. Cracks, if left undetected or untreated, can rapidly evolve into structural failures that compromise runway safety and necessitate costly emergency repairs. Regulatory frameworks such as China’s Technical Specifications of Aerodrome Pavement Evaluation and Management (MH/T5024–2009) underscore the urgency of timely crack detection, yet existing automated tools have struggled to meet engineering standards in operational environments.

The DFNet framework addresses this gap through three core innovations: a Deformable Convolution Module (DCM), a Multi-scale Convolution Module (MCM), and a Feature Fusion Module (FFM). Each component targets a specific weakness observed in previous algorithms, from rigid convolutional kernels to insufficient integration of hierarchical features.

The DCM replaces standard convolutional layers with deformable convolutions—a technique first proposed in 2017 but rarely applied to pavement inspection. In conventional convolutions, filters sample input features at fixed grid locations (e.g., the 3×3 neighborhood around each pixel). This rigidity limits the network’s ability to capture elongated, curvilinear structures like cracks, which may span arbitrary orientations and scales. Deformable convolutions overcome this by learning spatial offsets for each sampling point during training. These offsets allow the receptive field to stretch, rotate, or skew in response to local image content, effectively “molding” itself around crack contours. The result is a more expressive feature extractor that adapts to the free-form geometry of real-world pavement distress.
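
The idea of learned sampling offsets can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy 5×5 image, and the hand-picked offsets are all hypothetical, and a real network would predict the offsets from the input with an extra convolution rather than have them set by hand.

```python
import math


def bilinear_sample(image, y, x):
    """Read a 2D grid at a fractional (y, x); positions outside the grid count as 0."""
    h, w = len(image), len(image[0])
    y0, x0 = math.floor(y), math.floor(x)
    fy, fx = y - y0, x - x0
    value = 0.0
    for yy, wy in ((y0, 1.0 - fy), (y0 + 1, fy)):
        for xx, wx in ((x0, 1.0 - fx), (x0 + 1, fx)):
            if 0 <= yy < h and 0 <= xx < w:
                value += wy * wx * image[yy][xx]
    return value


def deformable_conv_at(image, weights, offsets, cy, cx):
    """One output value of a 3x3 deformable convolution centred at (cy, cx).

    offsets[i][j] is a (dy, dx) shift added to the regular grid position of
    kernel tap (i, j). All-zero offsets reproduce a standard convolution.
    """
    total = 0.0
    for i in range(3):
        for j in range(3):
            dy, dx = offsets[i][j]
            total += weights[i][j] * bilinear_sample(
                image, cy + (i - 1) + dy, cx + (j - 1) + dx)
    return total


# Toy 5x5 image with a diagonal "crack" of ones (hypothetical data).
image = [[1.0 if r == c else 0.0 for c in range(5)] for r in range(5)]
# Kernel that only looks along its middle row (a horizontal line detector).
weights = [[0, 0, 0], [1, 1, 1], [0, 0, 0]]

zero_offsets = [[(0, 0)] * 3 for _ in range(3)]
# Hand-picked offsets: the left tap moves up one row, the right tap down one row.
bent_offsets = [[(0, 0)] * 3 for _ in range(3)]
bent_offsets[1][0] = (-1, 0)
bent_offsets[1][2] = (1, 0)

rigid = deformable_conv_at(image, weights, zero_offsets, 2, 2)
adapted = deformable_conv_at(image, weights, bent_offsets, 2, 2)
print(rigid, adapted)
```

With zero offsets the horizontal kernel touches the diagonal crack at one pixel only; with the bent offsets all three taps land on the crack, so the response triples. Bilinear interpolation is what lets the offsets be fractional and therefore learnable by gradient descent.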

Complementing this adaptive sampling is the MCM, which processes features through parallel convolutional branches with varying kernel sizes—1×1, 3×3, 5×5, and 7×7. This design enables the network to simultaneously capture fine-grained details (via small kernels) and broader contextual cues (via larger kernels). Since airport cracks can range from hairline fractures just a few pixels wide to meter-long fissures, a single receptive field size is insufficient. The multi-scale approach ensures that DFNet remains sensitive to both micro-cracks and macro-fractures without sacrificing computational efficiency, thanks to depth-wise separable convolutions and channel-wise feature aggregation.
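
A rough sketch of the parallel-branch idea follows. It is an illustration under stated assumptions, not the MCM itself: plain mean filters stand in for learned kernels, the depth-wise separable convolutions mentioned above are not modelled, and the names `box_filter` and `multi_scale_features` are made up for this example.

```python
def box_filter(image, k):
    """k x k mean filter with zero padding, so the output keeps the input size."""
    h, w = len(image), len(image[0])
    r = k // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for dy in range(-r, r + 1):
                for dx in range(-r, r + 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < h and 0 <= xx < w:
                        acc += image[yy][xx]
            out[y][x] = acc / (k * k)
    return out


def multi_scale_features(image, kernel_sizes=(1, 3, 5, 7)):
    """Run one branch per kernel size and stack the results as channels.

    The result is indexed as features[channel][row][column].
    """
    return [box_filter(image, k) for k in kernel_sizes]


# Toy 9x9 image with a one-pixel-wide vertical "crack" in column 4.
image = [[1.0 if c == 4 else 0.0 for c in range(9)] for r in range(9)]
features = multi_scale_features(image)

# Response of each branch at the centre pixel (4, 4).
centre = [channel[4][4] for channel in features]
print(centre)
```

At the centre pixel the 1×1 branch keeps the hairline crack at full strength, while the larger branches dilute it with surrounding background. Stacking all four lets later layers pick whichever scale suits the crack width at hand.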

Perhaps the most impactful innovation is the FFM. Drawing inspiration from encoder-decoder architectures like U-Net and FCN, DFNet fuses features from multiple stages of the encoding process rather than from the final layer alone. Early layers retain the high-resolution spatial information needed to delineate crack boundaries, while deeper layers encode the semantic understanding that helps distinguish true cracks from look-alike artifacts such as shadows or painted markings. By summing and concatenating these complementary signals before upsampling, DFNet reaches a recall of 89.72% while holding precision at 92.21%. High recall matters most in safety-critical applications, where missing a crack (false negative) is far more dangerous than flagging a benign anomaly (false positive).
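
The two fusion operations named here, summation and concatenation, can be sketched on toy feature maps. The article does not give the FFM's exact wiring, so the order of operations, the nearest-neighbour resizing, and the function names below are assumptions made for illustration only.

```python
def upsample_nearest(feature, factor):
    """Enlarge a 2D map by repeating each value factor x factor times."""
    h, w = len(feature), len(feature[0])
    return [[feature[y // factor][x // factor] for x in range(w * factor)]
            for y in range(h * factor)]


def fuse(shallow, deep, factor):
    """Combine a high-resolution shallow map with a low-resolution deep map.

    Returns three channels: the shallow map, the resized deep map, and
    their element-wise sum (channel-wise concatenation of all three).
    """
    deep_up = upsample_nearest(deep, factor)
    summed = [[s + d for s, d in zip(s_row, d_row)]
              for s_row, d_row in zip(shallow, deep_up)]
    return [shallow, deep_up, summed]


# Shallow 4x4 map: sharp but noisy edge evidence (hypothetical values).
shallow = [[0, 1, 0, 0],
           [0, 1, 0, 1],
           [0, 1, 0, 0],
           [0, 1, 0, 0]]
# Deep 2x2 map: coarse "a crack is somewhere in the left half" evidence.
deep = [[1, 0],
        [1, 0]]

fused = fuse(shallow, deep, factor=2)
print(fused[2])
```

In the summed channel the true crack column scores 2 (both sources agree), while the isolated pixel at row 1, column 3 scores only 1 because the deep map sees no crack in that region. That is the intuition behind combining boundary detail with semantic context.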

The research team validated DFNet against six established baselines: Canny edge detection, Free-form Anisotropy (FFA), CrackForest, Fully Convolutional Networks (FCN), U-Net, and DeepCrack. All models were evaluated on a custom dataset comprising 960 high-resolution images (1800×900 pixels) captured by an autonomous runway inspection robot developed by Chengdu Guimu Robotics Co., Ltd. The robot, equipped with a Teledyne DALSA nano m1920 CMOS camera, operates at speeds of 20–30 km/h, simulating real-world inspection conditions. To enlarge the dataset, the team applied sliding-window cropping (512×512 patches) and geometric augmentations (flips, rotations), ultimately generating 12,960 samples that were split 8:1:1 into training, validation, and test sets.
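
The cropping and splitting steps can be sketched as follows. The article does not state the window stride or how partial windows at the image border were handled, so the stride of 512 and the "final window flush with the border" rule below are assumptions, and the patch count printed here is not meant to reproduce the 12,960 figure. Only the 8:1:1 split arithmetic follows directly from the reported numbers.

```python
def window_origins(length, window, stride):
    """Start offsets of sliding windows along one axis.

    If the last regular window does not reach the border, one extra window
    is added flush with the border (an assumed policy, not from the paper).
    """
    if length < window:
        raise ValueError("image side is smaller than the window")
    starts = list(range(0, length - window + 1, stride))
    if starts[-1] != length - window:
        starts.append(length - window)
    return starts


def split_counts(total, ratios=(8, 1, 1)):
    """Divide `total` samples by integer ratios; any remainder goes to the first set."""
    unit = total // sum(ratios)
    counts = [unit * r for r in ratios]
    counts[0] += total - sum(counts)
    return tuple(counts)


# 1800x900 images and 512x512 patches come from the article; stride 512 is assumed.
xs = window_origins(1800, 512, 512)
ys = window_origins(900, 512, 512)
patches_per_image = len(xs) * len(ys)

train, val, test = split_counts(12960)
print(xs, ys, patches_per_image)
print(train, val, test)
```

Under an 8:1:1 split, the 12,960 samples divide into 10,368 for training and 1,296 each for validation and testing.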

DFNet outperformed all six baselines on every reported metric: Pixel Accuracy (99.59%), Intersection over Union (56.20%), Precision (92.21%), Recall (89.72%), and F1-Score (90.95%). Traditional methods like Canny and FFA performed far worse, with F1-Scores below 22%, primarily because of their sensitivity to noise and poor generalization in low-contrast scenes. The deep learning baselines also showed limitations: FCN produced blurry boundaries, U-Net missed cracks under water stains, and DeepCrack exhibited low recall when cracks overlapped with runway markings.
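
For readers unfamiliar with these metrics, the sketch below computes them from a pair of binary masks using the standard pixel-level definitions. The article does not spell out how each figure was aggregated over the test set, so treat this as the textbook formulation rather than the authors' evaluation script; the toy masks and function names are hypothetical.

```python
def segmentation_metrics(pred, truth):
    """Pixel-level metrics for two binary masks of equal size (1 = crack)."""
    tp = fp = fn = tn = 0
    for p_row, t_row in zip(pred, truth):
        for p, t in zip(p_row, t_row):
            if p and t:
                tp += 1
            elif p:
                fp += 1
            elif t:
                fn += 1
            else:
                tn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return {
        "pixel_accuracy": (tp + tn) / (tp + fp + fn + tn),
        "iou": tp / (tp + fp + fn) if tp + fp + fn else 0.0,
        "precision": precision,
        "recall": recall,
        "f1": f1_score(precision, recall),
    }


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy 4x4 masks: the true crack fills column 1; the prediction misses one
# crack pixel and adds one false alarm.
truth = [[0, 1, 0, 0], [0, 1, 0, 0], [0, 1, 0, 0], [0, 1, 0, 0]]
pred = [[0, 1, 0, 0], [0, 1, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0]]
metrics = segmentation_metrics(pred, truth)
print(metrics)

# Consistency check on the reported figures: precision 92.21, recall 89.72.
reported_f1 = round(f1_score(92.21, 89.72), 2)
print(reported_f1)
```

The reported precision and recall do combine to the reported F1-Score of 90.95%. The toy example also shows why pixel accuracy alone says little in this task: cracks cover few pixels, so even an imperfect mask scores high on accuracy while IoU and F1 fall noticeably.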

Visual inspections of segmentation outputs further confirmed DFNet’s robustness. In challenging scenarios—such as cracks obscured by tire rubber, intersecting with taxiway lines, or embedded in textured concrete—DFNet consistently produced clean, continuous crack maps with minimal fragmentation or false positives. This resilience stems from its ability to jointly optimize geometric adaptability (via DCM), contextual awareness (via MCM), and hierarchical feature integration (via FFM).

To assess the contribution of each module, the authors conducted ablation studies. Removing the DCM caused a 1.04-point drop in F1-Score, indicating that deformable sampling helps capture irregular crack morphologies. Eliminating the MCM reduced F1-Score by 1.19 points, underscoring the value of multi-scale context. Most notably, disabling the FFM led to the largest decline in recall (from 89.72% down to 86.41%), highlighting the role of feature fusion in preserving fine details that shallow decoders often lose.

Despite these achievements, the authors acknowledge limitations. DFNet’s inference speed—while adequate for offline analysis—does not yet meet real-time requirements for high-speed robotic inspection. Additionally, while the model excels on the collected dataset, its generalizability to international airports with different pavement materials, lighting conditions, or climate-induced distress patterns remains to be tested. Future work may explore lightweight architectures, domain adaptation techniques, or integration with 3D sensing modalities like LiDAR.

Nevertheless, the research has practical implications. By advancing pixel-level segmentation in a high-stakes industrial domain, DFNet demonstrates how tailored deep learning architectures can solve real engineering problems more effectively than generic off-the-shelf models. Its modular design also invites extension: researchers could swap in attention mechanisms, graph neural networks, or transformer blocks to improve performance.

From a broader perspective, this work exemplifies the growing synergy between civil infrastructure management and artificial intelligence. As airports worldwide seek to modernize maintenance protocols through digital twins, predictive analytics, and autonomous inspection systems, algorithms like DFNet will form the perceptual backbone of next-generation asset management platforms. The ability to detect millimeter-scale defects with high confidence not only extends pavement lifespan but also enhances aviation safety—a mission of paramount importance in an industry where margins for error are vanishingly small.

In conclusion, the DFNet model marks a significant milestone in the automation of airport pavement inspection. By harmonizing geometric flexibility, multi-scale perception, and hierarchical feature fusion, it sets a new benchmark for crack detection accuracy in complex, real-world environments. As the aviation sector continues its digital transformation, such innovations will be indispensable in building safer, smarter, and more resilient infrastructure for the future.

Authors: Haifeng Li, Pan Jing, Hongyang Han — College of Computer Science and Technology, Civil Aviation University of China, Tianjin 300300, China
Published in: Journal of Nanjing University of Aeronautics & Astronautics, Vol. 53, No. 6, December 2021, pp. 981–988
DOI: 10.16356/j.1005-2615.2021.06.018