AI-Powered System Streamlines Early Detection of Infant Hip Dysplasia, Boosts Diagnostic Consistency

In a significant leap forward for pediatric diagnostics, researchers from Nanjing Medical University, Shenzhen University, and Guangdong Women and Children’s Hospital have unveiled an intelligent auxiliary screening system designed to revolutionize the early detection of developmental dysplasia of the hip (DDH) in infants. The system, detailed in a peer-reviewed study published in the Journal of Shenzhen University Science and Engineering, tackles longstanding challenges in ultrasound-based DDH screening by automating two critical, traditionally subjective steps: identifying the optimal ultrasound image plane and precisely measuring key anatomical angles. This innovation promises not only to enhance diagnostic accuracy and consistency across diverse clinical settings but also to significantly reduce the burden on healthcare professionals, particularly in resource-limited environments where expert experience may be scarce.

Developmental dysplasia of the hip is a relatively common congenital condition affecting approximately 0.15% to 2.00% of newborns. It involves abnormal development of the hip joint, where the femoral head does not sit securely within the acetabulum, potentially leading to partial or complete dislocation. If left undetected and untreated, DDH can result in chronic pain, gait abnormalities, and early-onset arthritis later in life. Fortunately, when diagnosed early—typically within the first six months of life—treatment is highly effective, with success rates exceeding 96%. This underscores the critical importance of robust, accessible screening programs for newborns.

The current gold standard for infant DDH screening is the Graf method, which relies on ultrasound imaging. This method requires clinicians to acquire a high-quality, standardized ultrasound image—a “standard plane”—that clearly displays specific anatomical landmarks: the straight iliac bone, the lower edge of the iliac bone, the labrum, and the cartilage-bone junction. Once this ideal image is obtained, clinicians manually measure two crucial angles, alpha (α) and beta (β), which are then used, alongside the infant’s age, to classify the severity of the dysplasia. While widely adopted, the Graf method is inherently subjective and labor-intensive. Its effectiveness hinges heavily on the operator’s skill and experience. Obtaining the perfect standard plane can be challenging due to infant movement during scanning. Furthermore, even experienced practitioners can exhibit measurement variability of up to 3 degrees for the alpha angle and 6 degrees for the beta angle, primarily due to image noise, artifacts, and differing interpretations of subtle anatomical features. This variability is even more pronounced among less experienced clinicians, particularly in primary care or rural settings, potentially leading to missed diagnoses or unnecessary interventions.
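The classification step described above can be sketched as a small decision function. The thresholds below follow the commonly cited Graf criteria (α ≥ 60° for Type I, 50–59° for Type II with the IIa/IIb split at three months of age, 43–49° for IIc/D with β > 77° marking Type D); this is an illustrative sketch only, not a clinical tool, and real Graf typing also weighs morphology beyond the two angles.

```python
def graf_type(alpha_deg, beta_deg, age_months):
    """Classify a hip by the commonly cited Graf angle thresholds.

    Illustrative only -- actual Graf typing also considers the
    morphology of the bony roof, labrum, and femoral head position.
    """
    if alpha_deg >= 60:
        return "I"                                  # mature hip
    if alpha_deg >= 50:
        return "IIa" if age_months < 3 else "IIb"   # immature vs. delayed
    if alpha_deg >= 43:
        return "IIc" if beta_deg <= 77 else "D"     # critical vs. decentring
    return "III/IV"   # dislocated; III vs. IV needs morphological assessment
```

A clinician (or the automated system) would feed the measured α and β values and the infant's age into such a rule to obtain the severity class.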

Recognizing these limitations, the research team led by Xindi Hu, Xin Yang, Xu Zhou, Limin Wang, Yongdong Liang, Ning Shang, Dong Ni, and Ning Gu set out to develop a comprehensive, AI-driven solution that could automate and standardize the entire screening workflow. Their proposed system is elegantly structured into two core modules: an automatic standard plane recognition module and an automatic fast measurement module. The goal was not merely to replicate human tasks but to create a tool that enhances reliability, reduces dependence on vast amounts of labeled training data, and operates efficiently enough for real-time clinical use.

The first major hurdle addressed by the team was the identification of the standard plane. Traditionally, this task has been largely overlooked in automated DDH screening systems, despite being the foundational step upon which all subsequent measurements depend. Acquiring a reliable dataset for training a conventional binary classifier (standard vs. non-standard plane) is extremely difficult. While standard planes have a clear definition, non-standard planes encompass a virtually infinite variety of poor-quality images, incomplete views, or noisy scans. Labeling such a diverse negative class is prohibitively time-consuming and impractical. To circumvent this, the researchers innovated by employing a few-shot one-class classifier (FOC) network. This approach represents a paradigm shift: instead of requiring both positive and negative examples, the FOC network is trained exclusively on a small number of confirmed standard plane images. By leveraging self-supervised learning, the network learns the intrinsic visual characteristics that define a “standard” image without needing explicit examples of what constitutes “non-standard.”

The self-supervised training strategy is ingenious. The researchers applied a series of geometric transformations—horizontal and vertical translations, and rotations—to each standard plane image in their limited training set. For each transformed image, they generated a corresponding “pseudo-label” indicating which specific transformation had been applied. The neural network, built upon a ResNet34 backbone, was then trained to predict these pseudo-labels. In essence, the network learned to recognize the canonical structure of the standard plane by understanding how its features change under controlled geometric distortions. During inference, when presented with a new, unseen ultrasound frame, the network applies the same transformations and predicts the associated labels. The confidence with which it correctly identifies the transformations serves as a “standardization score.” A high score indicates that the image closely matches the learned characteristics of a standard plane, while a low score suggests deviation. This method dramatically reduces the annotation burden, requiring only about one-third of the training samples typically needed for supervised classification.
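The pseudo-labeling and scoring idea can be sketched minimally as follows. The transformation set here (right-angle rotation plus integer-pixel shifts) is a simplified stand-in for the paper's translations and rotations, and `predict_proba` is a generic placeholder for the trained ResNet34 classifier head; both are illustrative assumptions.

```python
import numpy as np

# Hypothetical transformation set; the transform index is the pseudo-label.
TRANSFORMS = [
    ("identity",    lambda img: img),
    ("shift_right", lambda img: np.roll(img, 8, axis=1)),
    ("shift_down",  lambda img: np.roll(img, 8, axis=0)),
    ("rot90",       lambda img: np.rot90(img)),
]

def make_pseudo_labeled_batch(image):
    """Apply every transform to one standard-plane image; each transformed
    view is paired with the index of the transform that produced it."""
    return [(fn(image), idx) for idx, (_, fn) in enumerate(TRANSFORMS)]

def standardization_score(image, predict_proba):
    """Average confidence the classifier assigns to the *correct*
    transformation label across all transformed views of the image.
    High scores suggest the frame matches the learned standard plane."""
    batch = make_pseudo_labeled_batch(image)
    return float(np.mean([predict_proba(view)[label] for view, label in batch]))
```

At inference time, frames whose standardization score falls below a chosen threshold would be rejected as non-standard, which is how a one-class decision emerges from a multi-class pretext task.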

The second module, responsible for the actual measurement of the alpha and beta angles, presented another set of challenges. Traditional semantic segmentation networks, which assign a class label to every pixel in an image, often struggle with the low contrast, blurry edges, and structural ambiguities inherent in ultrasound images. Moreover, these networks can be computationally expensive, making real-time analysis impractical. To overcome these issues, the team developed a fast instance network (FIN). Unlike semantic segmentation, which treats all pixels belonging to a class as identical, instance segmentation distinguishes between individual instances of objects—even if they belong to the same class. FIN adopts a single-stage architecture, similar to YOLACT, which is known for its speed. This design allows the network to simultaneously detect the location of the key anatomical structures (iliac bone, lower iliac edge, labrum, and cartilage-bone junction) and generate precise segmentation masks for each.
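Once the landmark structures are segmented and lines are fitted to them, computing α and β reduces to measuring the angle between line directions (the line fitting itself is assumed to happen upstream from the masks). A minimal sketch of that geometric step:

```python
import numpy as np

def angle_between_deg(u, v):
    """Unsigned angle in degrees between two 2-D line directions.

    The absolute value of the dot product makes the result independent
    of which way along each line the direction vector points.
    """
    u, v = np.asarray(u, float), np.asarray(v, float)
    cos = abs(np.dot(u, v)) / (np.linalg.norm(u) * np.linalg.norm(v))
    return float(np.degrees(np.arccos(np.clip(cos, -1.0, 1.0))))
```

With the baseline, bony-roof line, and cartilage-roof line each represented by a direction vector, α and β follow directly from two calls to this helper.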

The FIN architecture is sophisticated yet efficient. It utilizes a Feature Pyramid Network (FPN) with a ResNet50 backbone to extract multi-scale features from the input image. These features are then processed through two parallel pathways: one generates a set of “prototype masks,” which are generic templates for the shapes of the target structures, and the other predicts coefficients that determine how to combine these prototypes to accurately fit each specific instance detected in the image. This combination of detection and mask generation ensures that the network not only locates the structures but also delineates their boundaries with high precision. Crucially, the network incorporates a “fast NMS” (Non-Maximum Suppression) algorithm to quickly eliminate redundant bounding box predictions, further boosting its speed. The system doesn’t just calculate the angles; it visually overlays the measured lines and calculated values directly onto the ultrasound image, providing immediate, intuitive feedback to the clinician.
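The prototype-plus-coefficient mask assembly can be sketched in a few lines: each instance mask is a sigmoid of a linear combination of the shared prototype maps, thresholded to binary. The shapes and the 0.5 threshold here are illustrative assumptions, not the paper's exact configuration.

```python
import numpy as np

def assemble_mask(prototypes, coeffs, threshold=0.5):
    """Combine shared prototype masks into one instance mask.

    prototypes: (H, W, k) array of k generic prototype maps.
    coeffs:     (k,) per-instance coefficients predicted by the network.
    Returns a binary (H, W) mask for this instance.
    """
    # Linear combination over the prototype axis -> (H, W) logits.
    logits = np.tensordot(prototypes, coeffs, axes=([2], [0]))
    probs = 1.0 / (1.0 + np.exp(-logits))   # sigmoid
    return probs > threshold
```

Because the prototypes are computed once per image and only the small coefficient vectors differ per instance, this design keeps per-instance mask generation cheap, which is central to the architecture's real-time speed.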

The researchers conducted rigorous experiments to validate the performance of their system against established benchmarks. For the standard plane recognition task, the FOC network was compared against three other one-class classification methods: One-Class Support Vector Machine (OCSVM), Deep Support Vector Data Description (deep SVDD), and GANomaly. Using a dataset comprising 185 cases with 329 videos, the FOC network demonstrated superior performance across multiple metrics. Most notably, its Area Under the Receiver Operating Characteristic Curve (AUROC) reached 76.43%, significantly outperforming OCSVM (65.92%), deep SVDD (63.27%), and GANomaly (62.67%). The FOC network also achieved the highest recall rate (79.17%), which is clinically critical as it indicates the system’s ability to correctly identify true standard planes, minimizing the risk of missing a potential case. While the FOC network showed some tendency to misclassify near-standard planes as standard (a trade-off for its high recall), its overall performance validated the efficacy of the few-shot, self-supervised approach.

For the angle measurement task, the FIN network was benchmarked against three popular semantic segmentation architectures: Fully Convolutional Network (FCN), U-Net, and DeepLabv3. The evaluation used a larger dataset of 634 cases, yielding 1,321 annotated standard plane images. The results were unequivocal: FIN outperformed all competitors in terms of both accuracy and speed. In detection metrics, FIN achieved the highest mean Intersection over Union (mIoU) of 77.19% and mean Average Precision (mAP) of 55.25%. In segmentation quality, it scored the best Dice Similarity Coefficient (DSC) of 86.69% and Jaccard Coefficient (JAC) of 76.68%, while exhibiting the lowest Hausdorff Distance (HD) and Average Surface Distance (ASD), indicating superior boundary accuracy. Most impressively, FIN operated at a remarkable inference speed of 33.88 frames per second (FPS), comfortably exceeding the 30 FPS threshold required for real-time clinical interaction. In contrast, FCN, while also achieving over 30 FPS, lagged significantly in segmentation accuracy.
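The overlap metrics used in this evaluation, IoU (equivalently, the Jaccard coefficient) and Dice, are straightforward to compute from binary masks; a minimal sketch:

```python
import numpy as np

def iou(pred, truth):
    """Intersection over Union (Jaccard coefficient) of two binary masks."""
    inter = np.logical_and(pred, truth).sum()
    union = np.logical_or(pred, truth).sum()
    return float(inter / union) if union else 1.0

def dice(pred, truth):
    """Dice Similarity Coefficient: 2|A∩B| / (|A| + |B|)."""
    inter = np.logical_and(pred, truth).sum()
    total = pred.sum() + truth.sum()
    return float(2 * inter / total) if total else 1.0
```

Dice weights the intersection twice, so for the same pair of masks it always scores at least as high as IoU; reporting both, as the study does, gives a fuller picture of overlap quality.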

The ultimate test of any diagnostic tool is its impact on clinical outcomes. The study demonstrated that the FIN network’s angle measurements were not only faster but also more accurate than those performed manually by different clinicians. The mean absolute difference (MAD) for the alpha angle was 2.48 degrees, and for the beta angle, it was 4.38 degrees. These figures were substantially lower than the average manual measurement errors observed between different human operators (3.98 degrees for alpha and 5.83 degrees for beta). This finding is perhaps the most compelling evidence of the system’s value: it acts as a powerful equalizer, bringing the precision of an expert’s measurement to every scan, regardless of the operator’s level of experience. By providing consistent, objective measurements and visual annotations, the system empowers clinicians, facilitates training, and ultimately contributes to more equitable and higher-quality care for infants.
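The MAD figures above compare automatic measurements against reference values; the metric itself is simply the mean of the absolute per-case differences:

```python
import numpy as np

def mean_absolute_difference(measured, reference):
    """Mean of |measured - reference| over paired angle measurements."""
    measured = np.asarray(measured, dtype=float)
    reference = np.asarray(reference, dtype=float)
    return float(np.mean(np.abs(measured - reference)))
```

The angle values here are hypothetical examples, not the study's data: `mean_absolute_difference([60.0, 55.0], [62.0, 53.0])` yields 2.0 degrees.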

The implications of this research extend beyond the immediate clinical setting. By reducing the reliance on large, meticulously labeled datasets and enabling high-speed, accurate analysis, the system lowers the barrier to entry for implementing advanced DDH screening in diverse healthcare environments. It addresses the global challenge of healthcare disparities by providing a tool that can augment the capabilities of frontline providers, ensuring that infants everywhere have access to reliable, early diagnosis. The modular design of the system, separating plane recognition from angle measurement, also offers flexibility for future integration into existing ultrasound machines or as a standalone software application.

This work represents a mature application of artificial intelligence in medical imaging, moving beyond proof-of-concept to deliver a practical, validated solution to a real-world clinical problem. The researchers’ thoughtful design choices—from the innovative use of self-supervised learning for one-class classification to the adoption of a fast instance segmentation architecture—demonstrate a deep understanding of both the technical challenges and the practical constraints of clinical workflows. Their system doesn’t replace the clinician; rather, it augments their expertise, providing a consistent, reliable foundation upon which clinical judgment can be exercised with greater confidence. As AI continues to permeate healthcare, innovations like this intelligent DDH screening system exemplify the potential for technology to democratize access to high-quality diagnostics, improve patient outcomes, and alleviate the burdens faced by healthcare professionals worldwide.

Hu Xindi, Yang Xin, Zhou Xu, Wang Limin, Liang Yongdong, Shang Ning, Ni Dong, Gu Ning. Intelligent auxiliary screening system for developmental dysplasia of the hip of infants. Journal of Shenzhen University Science and Engineering, 2021, 38(4): 408-418. doi:10.3724/SP.J.1249.2021.04408