AI-Assisted Ultrasound Boosts Breast Lesion Diagnosis Accuracy for Trainee Physicians

Breast cancer has emerged as one of the most life-threatening malignancies for women worldwide over the past decade, with its incidence and mortality rates rising at an alarming pace. Early and accurate diagnosis is widely recognized as the cornerstone of improving the prognosis of breast cancer patients, and ultrasound has become the first-line imaging modality for breast lesion examination in China, particularly due to the small breast volume and high breast density among Chinese women. However, the clinical practice of breast ultrasound in the country is currently plagued by three major challenges: a severe shortage of ultrasound physicians, the high level of clinical experience required for accurate diagnosis, and the high risk of missed diagnoses caused by physician fatigue. Against this backdrop, the integration of artificial intelligence (AI) into medical imaging analysis has emerged as a promising solution to address these pain points, with the potential to significantly enhance the efficiency and quality of clinical work for medical professionals. Among the various AI-based tools developed for this purpose, the S-Detect software has attracted extensive attention for its ability to assist in the differential diagnosis of benign and malignant breast lesions by extracting and analyzing features from breast ultrasound images.

A team of researchers from the Department of Ultrasonography at the Third Affiliated Hospital of Zhengzhou University has conducted an in-depth study to explore the diagnostic value of the S-Detect medical imaging AI software in the evaluation of breast masses by standardized training physicians, as well as to identify the key factors influencing these physicians’ willingness to use medical imaging AI in their clinical practice. The study, which focused on analyzing the diagnostic efficacy of S-Detect and the psychological and cognitive factors affecting AI adoption among trainee doctors, has yielded compelling results that highlight the transformative potential of AI in medical education and clinical practice for breast ultrasound diagnosis. The research involved 40 standardized training physicians and 60 cases of breast masses with confirmed pathological results, and combined a clinical diagnostic experiment with a questionnaire survey to conduct a comprehensive and systematic analysis, providing valuable empirical evidence for the popularization and application of medical imaging AI in the standardized training of ultrasound physicians.

The research design of the study was rigorous and multi-faceted, consisting of two core parts: a clinical experiment to compare the diagnostic accuracy of trainee physicians with and without the assistance of S-Detect, and a questionnaire survey to investigate the factors influencing their willingness to use medical imaging AI. For the clinical experiment, the research team randomly selected 60 two-dimensional breast ultrasound images stored in medical equipment, including 40 benign lesions and 20 malignant lesions. All these images were collected by a senior breast ultrasound physician, and the patients involved were aged between 18 and 62 years, with an average age of 37 years, all of whom had clear pathological diagnoses. To ensure the validity of the study results, patients who had undergone puncture, radiotherapy or chemotherapy before the ultrasound examination in the hospital were excluded from the research sample. The ultrasound examinations in the study were performed using a Samsung RS80A ultrasound diagnostic instrument with a probe frequency of 3-12 MHz and equipped with the S-Detect software, and all diagnostic classifications were based on the 5th edition of the BI-RADS classification criteria for breast ultrasound, with BI-RADS 4A category serving as the critical threshold for distinguishing benign and malignant breast masses.

The 40 standardized training physicians participating in the study covered different grade levels, including 6 physicians from the 2017 grade, 19 from the 2018 grade and 15 from the 2019 grade, with a gender distribution of 7 males and 33 females, aged between 23 and 31 years, with an average age of 27.3 years. Prior to the formal experiment, all trainee physicians received systematic and detailed training on the 5th edition of the breast ultrasound BI-RADS classification criteria and the basic knowledge of ultrasound differential diagnosis of benign and malignant breast masses. A senior ultrasound physician was responsible for demonstrating the operation of the S-Detect software in detail, and after the completion of the teaching course, the trainee physicians were given sufficient time to familiarize themselves with the ultrasound equipment and conduct simulated training on the pre-stored images using the S-Detect software. During the training process, the physicians were allowed to ask questions at any time, and the instructors provided immediate and targeted answers to ensure that all participants mastered the basic operation methods and application principles of the software. All trainee physicians were fully informed of the research objectives and procedures and participated in the study on a voluntary basis, which fully guaranteed the ethical compliance of the research.

The experimental process was divided into two phases: in the first phase, the 40 trainee physicians classified the 60 breast masses into BI-RADS categories solely based on the two-dimensional ultrasound features of the lesions without the assistance of the S-Detect software. In the second phase, the same group of physicians classified the same breast masses using the S-Detect software, performing the classification on the maximum transverse and longitudinal sections of the masses. In the computer-aided diagnosis (CAD) mode of the software, the breast masses were automatically outlined with a linear contour, and the physicians were also allowed to manually adjust the contour to ensure the accuracy of the region of interest. Within a few seconds of the operation, the S-Detect software would generate a diagnostic suggestion indicating whether the lesion was likely benign or malignant, and at the same time provide a series of ultrasound feature descriptions as reference, including the shape, orientation, boundary, internal echo and posterior acoustic shadow of the mass. In the study, lesions classified as BI-RADS 4A or below were considered benign, and a consistent diagnosis with the pathological results was defined as a benign coincidence; lesions classified as BI-RADS 4B or above were considered malignant, and a consistent diagnosis with the pathological results was defined as a malignant coincidence. The diagnostic coincidence rate, which represents the diagnostic accuracy, was used as the core indicator to evaluate the diagnostic efficacy of the two methods.
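The benign/malignant rule and the coincidence rate described above can be sketched in a few lines. This is a minimal illustration, not the study's own procedure: the category ordering, the function names and the four sample readings are assumptions made for the example, and only the 4A cutoff comes from the text.

```python
# BI-RADS categories ordered from least to most suspicious (assumed ordering).
BIRADS_ORDER = ["1", "2", "3", "4A", "4B", "4C", "5"]
THRESHOLD = "4A"  # from the text: 4A or below -> benign, 4B or above -> malignant


def call_from_birads(category: str) -> str:
    """Map a BI-RADS category to a benign/malignant call using the 4A cutoff."""
    if BIRADS_ORDER.index(category) <= BIRADS_ORDER.index(THRESHOLD):
        return "benign"
    return "malignant"


def coincidence_rate(readings, pathology) -> float:
    """Fraction of readings whose call agrees with the pathological result."""
    matches = sum(call_from_birads(c) == p for c, p in zip(readings, pathology))
    return matches / len(readings)


# Hypothetical readings by one physician for four lesions (not study data).
readings = ["3", "4A", "4B", "5"]
pathology = ["benign", "malignant", "malignant", "malignant"]
rate = coincidence_rate(readings, pathology)
print(f"Diagnostic coincidence rate: {rate:.2%}")
```

In this toy input the second lesion is malignant but read as 4A, so it counts as a miss and three of the four readings coincide with pathology.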

In addition to the clinical diagnostic experiment, the research team designed a questionnaire to investigate the factors influencing the standardized training physicians’ willingness to use medical imaging AI. The questionnaire covered four dimensions: the physicians’ understanding of medical imaging AI, their dependence on it, their anxiety about it, and their willingness to use it in clinical practice. Items were scored on a 5-point Likert scale: 5 points for “completely agree”, 4 points for “mostly agree”, 3 points for “basically agree”, 2 points for “mostly disagree”, and 1 point for “completely disagree”. This scoring made the responses quantifiable and suitable for the subsequent statistical analysis.

For the statistical analysis of the research data, the research team used the SPSS 23.0 statistical software to conduct a comprehensive and in-depth analysis. The count data were expressed as percentages, and the Wilcoxon test and rank sum test were used to compare the diagnostic coincidence rates of the trainee physicians before and after the application of the S-Detect software, with a P-value of less than 0.05 considered statistically significant. The KMO value and Cronbach’s α coefficient were calculated to evaluate the validity and reliability of the overall scale and each subscale of the questionnaire, and the Pearson correlation coefficient and multiple linear regression analysis were used to explore the correlation between various influencing factors and the willingness to use medical imaging AI, as well as the degree of influence of each factor.

The results of the clinical diagnostic experiment were statistically significant. The 40 standardized training physicians achieved a diagnostic coincidence rate of 84.17% for the 60 breast masses without the S-Detect software, and the rate rose to 93.75% with its assistance. The comparison between the two sets of readings showed a highly significant difference (P<0.01), indicating that the S-Detect AI software can improve the accuracy of standardized training physicians in distinguishing benign from malignant breast masses. According to the pathological results, the 40 benign lesions comprised 28 fibroadenomas, 2 cases of adenosis with fibroadenoma, 6 cases of adenosis, 1 lipoma, 1 case of suppurative inflammation, and 2 intraductal papillomas; the 20 malignant lesions comprised 18 invasive carcinomas of no special type and 2 ductal carcinomas in situ. The maximum diameter of the masses ranged from 7 mm to 39 mm. These pathological diagnoses served as the gold standard for evaluating diagnostic efficacy.
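As a plausibility check on the two rates, the sketch below assumes the coincidence rate was pooled over all physician-lesion readings (40 physicians × 60 lesions = 2400 readings). That pooling is an assumption, since the article does not state how the rate was aggregated, and the helper name is illustrative.

```python
PHYSICIANS = 40
LESIONS = 60
TOTAL_READINGS = PHYSICIANS * LESIONS  # 2400, assuming every reading is pooled


def implied_correct(rate_percent: float, total: int) -> int:
    """Number of concordant readings implied by a reported percentage."""
    return round(rate_percent / 100 * total)


without_ai = implied_correct(84.17, TOTAL_READINGS)
with_ai = implied_correct(93.75, TOTAL_READINGS)

print(without_ai, with_ai, with_ai - without_ai)
# Converting the counts back reproduces the reported percentages.
print(f"{without_ai / TOTAL_READINGS:.2%}", f"{with_ai / TOTAL_READINGS:.2%}")
```

Under this assumption both reported percentages correspond to whole numbers of readings, which is consistent with pooled counting but does not prove it.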

The questionnaire survey and its statistical analysis also shed light on the factors influencing the standardized training physicians’ willingness to use medical imaging AI. The validity and reliability tests of the questionnaire gave satisfactory results: the KMO value of the overall scale was 0.761, and the KMO values of the subscales for understanding, dependence, anxiety and willingness to use were 0.825, 0.739, 0.790 and 0.643, respectively. All KMO values were greater than 0.6, indicating that the scale had acceptable structural validity and was suitable for factor analysis. The Cronbach’s α coefficient of the overall scale was 0.904, and the Cronbach’s α coefficients of the four subscales were 0.919, 0.851, 0.829 and 0.771, respectively, all greater than 0.7, reflecting good internal consistency. These results supported the use of the scale in the subsequent correlation and regression analyses.
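For readers unfamiliar with Cronbach’s α, the following sketch shows the standard formula applied to a small, made-up set of Likert responses. The item scores are hypothetical and are not the study's data; the study itself used SPSS for this calculation.

```python
from statistics import variance  # sample variance (n - 1 denominator)


def cronbach_alpha(items):
    """Cronbach's alpha for a list of item-score lists (one list per item).

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total scores))
    """
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]  # per-respondent totals
    item_variance_sum = sum(variance(scores) for scores in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Hypothetical 5-point Likert answers: 3 items, 5 respondents.
item_1 = [4, 5, 3, 4, 2]
item_2 = [4, 4, 3, 5, 2]
item_3 = [5, 5, 2, 4, 3]

alpha = cronbach_alpha([item_1, item_2, item_3])
print(f"Cronbach's alpha: {alpha:.3f}")
```

Values above 0.7, the cutoff cited in the article, are conventionally read as acceptable internal consistency.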

The Pearson correlation analysis of the four dimensions and the demographic factor of age revealed several significant correlations. The physicians’ understanding of medical imaging AI was strongly positively correlated with their willingness to use it (r=0.647, P<0.01), their dependence on medical imaging AI was moderately positively correlated with their willingness to use it (r=0.597, P<0.01), and their anxiety about medical imaging AI was weakly positively correlated with their willingness to use it (r=0.317, P<0.05). Age was negatively correlated with understanding (r=-0.415, P<0.05), anxiety (r=-0.244, P>0.05) and willingness to use (r=-0.477, P<0.01), although the correlation with anxiety did not reach statistical significance. These results indicate that younger standardized training physicians tend to understand medical imaging AI better and to be more willing to use it, while the data do not show a significant age-related difference in anxiety about AI in clinical practice.
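The significance levels attached to these coefficients can be sanity-checked from r and the sample size alone. The sketch below assumes all 40 physicians completed the questionnaire (so df = 38) and uses standard two-tailed critical t values for 38 degrees of freedom, hard-coded rather than computed; the dictionary labels are illustrative.

```python
import math

N = 40                 # assumption: all 40 physicians answered the questionnaire
T_CRIT_05 = 2.024      # two-tailed critical t for df = N - 2 = 38, alpha = 0.05
T_CRIT_01 = 2.712      # two-tailed critical t for df = 38, alpha = 0.01


def t_from_r(r: float) -> float:
    """t statistic for testing a Pearson r against zero with N observations."""
    return r * math.sqrt(N - 2) / math.sqrt(1 - r * r)


def significance_label(r: float) -> str:
    """Classify a Pearson r into the significance bands used in the article."""
    t = abs(t_from_r(r))
    if t >= T_CRIT_01:
        return "P<0.01"
    if t >= T_CRIT_05:
        return "P<0.05"
    return "P>0.05"


# Coefficients as reported in the text.
reported = {
    "understanding vs willingness": 0.647,
    "dependence vs willingness": 0.597,
    "anxiety vs willingness": 0.317,
    "age vs anxiety": -0.244,
    "age vs willingness": -0.477,
}
labels = {name: significance_label(r) for name, r in reported.items()}
for name, label in labels.items():
    print(f"{name}: r={reported[name]:+.3f}, {label}")
```

Under the n = 40 assumption, each label matches the significance level the article reports for that pair.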

The multiple linear regression analysis further clarified which factors have a significant impact on the willingness to use medical imaging AI. The research team took age, understanding, dependence and anxiety as independent variables and the willingness to use as the dependent variable, adopting the “backward” method for variable selection. All variables were initially entered into the regression model together, and the variables with no significant influence on the regression equation were then eliminated one at a time until no remaining variable met the elimination criterion. In this process, the factor of “anxiety about medical imaging AI” was eliminated from the model. The final regression model fitted the data well: the R² of the model was 0.604, and the adjusted R² was 0.571, indicating that the remaining independent variables could explain 57.1% of the variation in the willingness to use. The significance test of the model gave a P-value below 0.001 (reported as 0 after rounding), indicating that the overall regression model was highly statistically significant.
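The relationship between the two fit statistics can be checked with the usual adjusted R² formula. The sketch assumes n = 40 respondents and p = 3 predictors in the final model (age, understanding, dependence); both figures are inferred from the article rather than quoted from a regression table.

```python
def adjusted_r2(r2: float, n: int, p: int) -> float:
    """Adjusted R-squared for a model with p predictors fitted on n observations."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)


# R² as reported; n and p are assumptions inferred from the text.
value = adjusted_r2(0.604, n=40, p=3)
print(f"adjusted R^2 = {value:.3f}")
```

With these inputs the formula reproduces the reported adjusted R² of 0.571, which supports the reading that the final model kept three predictors.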

The final regression equation obtained from the analysis was: Willingness to use = 3.055 – 0.058 × Age + 0.285 × Understanding + 0.306 × Dependence. From the size of the regression coefficients of the independent variables in the model, it can be seen that the regression coefficient of dependence on medical imaging AI is the highest (0.306), followed by the understanding of medical imaging AI (0.285), and age has a negative regression coefficient (-0.058). This result means that under the condition that other factors remain unchanged, each one-unit increase in the degree of dependence on medical imaging AI will bring the greatest improvement in the willingness to use it; the improvement in the level of understanding of medical imaging AI will also significantly enhance the willingness to use it; while with the increase of age, the willingness of standardized training physicians to use medical imaging AI will show a slight decrease. This finding is of great practical significance for the formulation of medical education and training strategies for ultrasound physicians, pointing out the key directions for improving the acceptance and application of medical imaging AI among trainee physicians.
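To make the equation concrete, the sketch below evaluates it for a hypothetical physician. The coefficients are those quoted above; the input values, the function name and the assumption that understanding and dependence are entered as mean subscale scores on the 1 to 5 scale are illustrative only, since the article does not state how the subscales were scored.

```python
INTERCEPT = 3.055
COEFFICIENTS = {"age": -0.058, "understanding": 0.285, "dependence": 0.306}


def predicted_willingness(age: float, understanding: float, dependence: float) -> float:
    """Evaluate the reported regression equation for one set of predictor values."""
    return (
        INTERCEPT
        + COEFFICIENTS["age"] * age
        + COEFFICIENTS["understanding"] * understanding
        + COEFFICIENTS["dependence"] * dependence
    )


# Hypothetical physician: 27 years old, understanding 4, dependence 3.
base = predicted_willingness(age=27, understanding=4, dependence=3)
# Same physician one year older: the prediction shifts by the age coefficient.
older = predicted_willingness(age=28, understanding=4, dependence=3)

print(f"base = {base:.3f}, one year older = {older:.3f}, change = {older - base:+.3f}")
```

Holding the other predictors fixed, each additional year of age lowers the predicted score by 0.058, while a one-point rise in dependence or understanding raises it by 0.306 or 0.285, respectively.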

The in-depth discussion of the study results further revealed the clinical application value of the S-Detect software and the practical implications of the factors influencing the willingness to use medical imaging AI. Standardized resident training is an important part of postgraduate medical education, and it is crucial for training high-level clinical physicians and improving the overall quality of medical services. However, trainee physicians are generally lacking in clinical experience, which greatly limits the quality of their ultrasound reports and their diagnostic accuracy in clinical practice. The results of this study clearly show that the application of the S-Detect AI software can effectively make up for the deficiency of insufficient clinical experience of trainee physicians and significantly improve their diagnostic accuracy in the differential diagnosis of benign and malignant breast masses. This not only reflects the important auxiliary role of medical imaging AI in clinical diagnosis but also highlights its great potential in medical education and standardized training of physicians.

It is worth noting that although the S-Detect software has shown excellent auxiliary diagnostic effects, the study also found that there are individual cases of misdiagnosis in the clinical application process. For example, one case of sclerosing adenosis among the benign lesions was classified as BI-RADS 4B or above by the software, because this lesion had four malignant ultrasound features: irregular shape, aspect ratio greater than 1, microcalcifications inside the lesion, and posterior acoustic shadow. This is closely related to the diverse and complex ultrasound manifestations of sclerosing adenosis, which make it difficult to distinguish from malignant lesions by imaging features alone. In addition, one case of suppurative inflammation was also classified as BI-RADS 4B or above, with malignant features such as uneven internal echo, blurred edges, and a high-echo halo around the lesion. However, this type of lesion can be easily distinguished from malignant tumors through a comprehensive analysis combining medical history inquiry, visual inspection and palpation. These misdiagnosis cases clearly indicate that S-Detect, as a computer-aided diagnostic technology, cannot completely replace the clinical judgment of physicians based on medical history collection, physical examination and other clinical information. Therefore, in the clinical application of medical imaging AI tools such as S-Detect, it is essential to combine the AI diagnostic suggestions with the patient’s medical history, clinical symptoms, physical signs and other comprehensive information for a holistic judgment, so as to maximize the advantages of AI technology and avoid the risks of over-reliance on AI leading to misdiagnosis.

The development and application of AI in medical imaging have a long history and a broad prospect. The concept of AI was first formally proposed by McCarthy in 1956, and machine learning and deep learning, as important branches of AI, have been widely applied in the field of medical imaging in recent years. As early as 2012, Jamieson and other researchers began to explore the application of deep learning technology in the classification of breast ultrasound images. In subsequent studies, Ciritsis and his team used a deep convolutional neural network (DCNN) to detect and classify breast lesions on ultrasound images of 582 patients, and even refined the classification to the BI-RADS category, achieving an accuracy rate that exceeded that of human physicians. These previous studies have laid a solid technical foundation for the application of AI in breast ultrasound diagnosis, and the results of this study further verified the practical value of AI technology in the clinical practice and medical education of breast ultrasound in China.

The average age of the 40 standardized training physicians participating in this study was 27.3 years, and the survey results showed that this group of young physicians has a relatively low level of anxiety about the possibility of AI replacing human physicians, and their understanding of AI is strongly positively correlated with their willingness to use AI. This finding is consistent with the characteristics of young medical professionals in the digital age, who are more receptive to new technologies and have a stronger learning ability for emerging medical technologies such as AI. Based on this finding, the research team proposed that the standardized training of physicians should gradually increase the proportion of online and offline AI training courses, provide more opportunities for young physicians to communicate and learn outside the hospital, and strengthen their systematic learning of medical imaging AI knowledge and skills. By improving the level of understanding and mastery of AI technology among trainee physicians, their willingness to use AI in clinical practice can be effectively enhanced, and the integration of AI technology into daily clinical work can be promoted, so as to realize the organic combination of AI technology and clinical medical practice.

In conclusion, this study reports two findings with clinical and educational value. First, the S-Detect medical imaging AI software can significantly improve the diagnostic accuracy of standardized training physicians in the differential diagnosis of benign and malignant breast masses, providing an auxiliary diagnostic tool for trainee physicians in breast ultrasound practice and helping to offset the limited accuracy that comes with a lack of clinical experience. Second, the level of understanding of medical imaging AI among standardized training physicians is a key factor affecting their willingness to use AI; the degree of dependence on AI also has a significant positive impact, and younger physicians are more willing to use the technology. These findings offer guidance for reforming the standardized training system for ultrasound physicians: medical education institutions and clinical institutions should strengthen AI-related training, improve trainee physicians’ knowledge of and operational skill with medical imaging AI, and guide them toward a scientific view of AI application, so that medical imaging AI can better serve clinical diagnosis and medical education and improve the quality and efficiency of breast ultrasound diagnosis.

With the continuous development and innovation of AI technology, the integration of medical imaging AI into clinical medicine and medical education will become an inevitable trend in the development of the medical industry. The results of this study not only provide empirical evidence for the clinical application of the S-Detect software but also lay a theoretical foundation for the popularization and application of medical imaging AI in the standardized training of physicians. In the future, it is necessary to carry out more in-depth and multi-center research to further explore the application value of different AI-based medical imaging tools in the diagnosis of various diseases, and to continuously optimize the training system of medical professionals combining AI technology, so as to promote the high-quality development of the medical industry with the power of science and technology, and provide more accurate and efficient medical services for patients.

Authors: Wang Huizhu, Yuan Wanru, Wang Xinxia, Liu Yun, Wang Yingying, Yue Lifang
Affiliation: Department of Ultrasonography, the Third Affiliated Hospital of Zhengzhou University, Zhengzhou 450052, Henan Province, China
Journal: Journal of Modern Medicine and Health
DOI: 10.3969/j.issn.1009-5519.2021.10.044
