From subjective to objective: A pilot study on testicular radiomics analysis as a measure of gonadal function

1 INTRODUCTION

Male gonadal dysfunction plays a causative role in 50% of infertile couples, either alone or in combination with a female factor.1 Despite the progressive advances in diagnostic medicine, the aetiology of male infertility remains unknown in about 30%–40% of cases.2

Currently, the diagnostic framework of male infertility involves both physical examination and testicular ultrasonography (US).3 Testis US is a topic of interest in andrological research, since several authors have tried to clarify the clinical connection between US parameters and testicular function, including both spermato- and steroidogenesis.4, 5 Recently, the European Academy of Andrology published new reference ranges for US-testicular parameters.4, 6 Among these, testicular volume is historically the most investigated parameter, since almost 90% of it is constituted of seminiferous tubules and germ cells, thus representing an indirect measure of testicular spermatogenic capability.7 More recently, a connection between US testicular volume and testicular steroidogenic function has been established,8, 9 but the question remains whether the US testicular volume measurement is sufficient to classify testicular function. To date, a standard method to calculate US testicular volume is unavailable, and reference ranges validated to predict both spermatogenic and androgen-secreting capabilities are still lacking.4, 5, 8 Thus, alongside testicular volume, the role of more qualitative US parameters, such as testicular echogenicity and echostructure homogeneity/inhomogeneity, has been explored.3, 10-13 Testicular echogenicity increases during testis development, depending on the maturation of the seminiferous tubules and the increase in germ cell number.14 Indeed, pre-pubertal testes appear more hypoechoic than adult ones.14 Accordingly, testicular hypoechogenicity has been associated with reduced spermatogenesis and aberrant interstitial proliferation.15 On the other hand, testicular US inhomogeneity, defined as the absence of a uniform structure, has recently been proposed as a marker of testicular dysfunction related to male infertility.8 However, the definition of testis echostructure homo/inhomogeneity is operator dependent, and a widely accepted quantitative measure to describe it has not yet been adopted.3 These limitations have precluded the extension of research-derived testicular US predictive properties to clinical practice. To overcome these limits, Pozza et al. proposed a new scoring system aimed at predicting testicular function by combining several US features in subjects referred for testicular US.16 Although promising results have been described, inter-operator variability remains a weak point in objectifying testis US-derived parameters, limiting all further clinically relevant applications.

Radiomics is a new engineering approach to radiological imaging that overcomes the visual limits of the radiologist by extracting numerical information from the images.17, 18 An increasing number of studies apply radiomics in several medical fields, but not in human reproduction. Bearing this in mind, a quantitative approach to assess testicular US-derived parameters is required, with the aim of reliably clarifying their potential correlation with testicular function. The main aim of this pilot study is to use a radiomics approach to extract objective testicular US features that correlate with testicular function, including both spermato- and steroidogenesis, in order to overcome operator-dependent subjectivity. Engineering approaches are applied directly to testicular US images to correlate texture features with conventional semen parameters and pituitary-gonadal axis hormones, which express the spermatogenic and steroidogenic capabilities of the testis. This innovative approach to testicular US aims to identify quantitative and objective parameters able to describe testicular US features.

2 MATERIALS AND METHODS

A prospective observational pilot study was carried out from December 20, 2019 to December 20, 2020. All consecutive male patients attending the Andrology Unit of the Department of Medical Specialties of the Azienda Ospedaliero-Universitaria of Modena (Italy) were considered eligible. Patients were enrolled according to the following inclusion criteria: (i) age over 18 years, (ii) attendance for couple infertility, and (iii) psycho-physical ability to sign an informed consent. Patients with diseases, genetic or not, known to alter testicular histology and/or function (e.g., hyper- or hypogonadotropic hypogonadism, previous testicular or pituitary surgery) were excluded, together with patients taking drugs that interfere with the hypothalamic-pituitary-testicular axis. Azoospermic patients (i.e., no spermatozoa detected in the semen sample) were removed from the classification analysis since they could not be classified as a severe form of oligozoospermia. Accordingly, the testicular histology and the consequent US pattern could be extremely different in azoospermic patients compared to other categories. Moreover, the low number of azoospermic patients enrolled in our study group precluded the creation of a dedicated category. Finally, every medical treatment for male infertility was started after the testicular US and the basal seminal/hormonal assessments, to avoid interference with the collected data.

All enrolled patients underwent andrological examination according to clinical practice. In detail, for each patient, personal and familial histories were collected, and physical examination was routinely performed. All patients underwent conventional semen analysis,19 pituitary-gonadal hormone assessment, and testicular US. A dataset was created matching clinical, seminal, and hormonal values for each patient, and all US images were collected. All data were sent anonymously to the Biolab of the Department of Electronics and Telecommunication of Torino (Turin, Italy) for image processing and statistical analysis.

The present study was approved by the Ethical Committee of Modena (Protocol Number 1057/2019) and each enrolled patient provided written informed consent to participate.

2.1 Testicular ultrasound

Testicular US was performed by a single operator using a single machine (Esaote My Lab25 Gold, Malmesbury, Wiltshire, UK) during the first evaluation, before the hormonal and seminal status of the patient was known. The following data were collected from both testes: testis longitudinal and transversal sections, testicular volume, vascularization, testicular nodule detection, epididymis characteristics, and presence of varicocele and hydrocele. Testicular volume was calculated using the ellipsoid formula (length [cm] × width [cm] × depth [cm] × 0.71), with measures obtained from both axial and longitudinal scans.20 Although this formula is not definitively validated, we used it because its superiority in predicting real testicular volume has been described.21 Varicocele was graded according to Sarteschi's scale.22 Moreover, the visual inspection of parenchymal US inhomogeneity was recorded for each testis. Testicular US structure was judged by the operator following a binary classification, that is, 0 for homogeneous testis and 1 for inhomogeneous testis.

2.2 Blood examination

After an overnight fast, morning (8:00 a.m.) blood samples were obtained from all patients to measure serum levels of total testosterone, luteinizing hormone (LH), and follicle-stimulating hormone (FSH). Serum total testosterone was evaluated by Chemiluminescent Microparticle Immunoassay (Architect, Abbott, Dundee, UK). LH and FSH were measured by Chemiluminescent Microparticle Immunoassay (Architect, Abbott, Longford, Ireland).

2.3 Semen analysis

Conventional semen analysis was performed on a semen sample collected through masturbation after 2–7 days of sexual abstinence. The microscopic analysis was performed according to the World Health Organization criteria.19

2.4 Quantitative image analysis

The radiomics workflow applied to the US images consisted of four consecutive steps: (i) image segmentation, (ii) image pre-processing, (iii) texture features extraction, and (iv) statistical analysis. Every image processing algorithm was developed in MATLAB (Mathworks, R2020b) on a 2.21 GHz quad-core machine with 16 GB of RAM.

During image segmentation (first step), the longitudinal scan was extracted from the US image (Figure 1, Panel A) and the testis was manually segmented (Figure 1, Panel B). The second step (image pre-processing) comprised all strategies applied to exclude image artefacts, such as measurement points, crosses, numbers, and zones of the parenchyma with lower echogenicity due to high acoustic impedance difference (Figure 1, Panel C). Once these artefacts were detected, they were excluded from the segmentation (Figure 1, Panel D). Finally, since US images were acquired using different gains, images were corrected using a logarithmic scale to have a uniform gain equal to 58%, in order to improve consistency in the echogenicity quantification. Every image was amplified (in case of image with gain lower than 58 dB) or attenuated (in case of image with gain higher than 58 dB) by a gain-dependent factor [formula not recoverable from the source].
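The exact expression of the correction factor is not available in this text, so the sketch below is only an assumption: it uses the conventional decibel amplitude relation, factor = 10^((58 − G)/20), where G is the acquisition gain in dB. The function names, the 8-bit clipping range, and the example values are hypothetical. The sketch reproduces the behaviour described above (amplification below the reference gain, attenuation above it), not necessarily the authors' formula.

```python
def gain_correction_factor(acquired_gain_db: float, reference_gain_db: float = 58.0) -> float:
    """Linear amplitude factor mapping an image acquired at `acquired_gain_db`
    to the reference gain. ASSUMPTION: 20*log10 amplitude convention."""
    return 10.0 ** ((reference_gain_db - acquired_gain_db) / 20.0)


def normalise_pixels(pixels, acquired_gain_db, reference_gain_db=58.0, max_value=255):
    """Scale grey levels by the correction factor and clip them to [0, max_value]."""
    factor = gain_correction_factor(acquired_gain_db, reference_gain_db)
    return [[min(max_value, max(0, round(p * factor))) for p in row] for row in pixels]


# Hypothetical 1 x 2 image acquired at a gain of 52 dB: values are amplified.
print(normalise_pixels([[10, 200]], acquired_gain_db=52.0))
```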

FIGURE 1. Ultrasound (US) image processing. Panel A: The testis US image obtained during a routinely performed testicular US; Panel B: The contour of the manual segmentation of the testicle in the longitudinal view; Panel C: The exclusion of calibre measurement artefacts and zones with lower echogenicity due to high acoustic impedance differences; Panel D: The final mask of the testicle for the texture analysis

The third step consisted of the texture analysis, which refers to the process of extracting mathematical descriptors from an image. These descriptors are called textural features and quantify specific properties of the image texture. In this study, a total of 44 textural features belonging to five families were extracted: (i) four first-order features that statistically describe the distribution of pixel values, including mean intensity (echogenicity), variance, skewness, and kurtosis; (ii) nine grey-level co-occurrence matrix (GLCM)-based features,23 evaluating spatial relations between pixels with the same grey level; (iii) 13 grey-level run-length matrix (GLRLM)-based features,24 evaluating the presence of consecutive pixels with the same grey level; (iv) 13 grey-level size zone matrix-based features,25 evaluating the presence of zones with the same grey level; (v) five neighbourhood grey tone difference matrix-based features,26 evaluating differences between pixels with the same grey level (Table 1). Only pixel values inside the segmented area after artefact subtraction were considered for the texture analysis.
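To make two of these feature families concrete, the following Python sketch computes the four first-order features and two GLCM features (contrast and homogeneity) restricted to a segmentation mask. It is a minimal illustration, not the study's MATLAB implementation. The function names, the single horizontal pixel offset, the symmetric GLCM, and the homogeneity definition with 1 + |i − j| in the denominator are assumptions, since several variants exist in the literature.

```python
def masked_pixels(image, mask):
    """Collect the grey levels of the pixels that lie inside the mask."""
    return [image[r][c] for r in range(len(image))
            for c in range(len(image[0])) if mask[r][c]]


def first_order_features(pixels):
    """Mean, population variance, skewness and kurtosis of a list of grey levels."""
    n = len(pixels)
    mean = sum(pixels) / n
    m2 = sum((p - mean) ** 2 for p in pixels) / n
    m3 = sum((p - mean) ** 3 for p in pixels) / n
    m4 = sum((p - mean) ** 4 for p in pixels) / n
    skewness = m3 / m2 ** 1.5 if m2 > 0 else 0.0
    kurtosis = m4 / m2 ** 2 if m2 > 0 else 0.0
    return {"mean": mean, "variance": m2, "skewness": skewness, "kurtosis": kurtosis}


def glcm(image, mask, levels, offset=(0, 1)):
    """Normalised symmetric co-occurrence matrix for one pixel offset.
    A pair is counted only when both pixels lie inside the mask."""
    dr, dc = offset
    rows, cols = len(image), len(image[0])
    counts = [[0.0] * levels for _ in range(levels)]
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols and mask[r][c] and mask[r2][c2]:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1
                counts[j][i] += 1
    total = sum(sum(row) for row in counts)
    if total == 0:
        return counts
    return [[v / total for v in row] for row in counts]


def glcm_contrast(p):
    """Weighted sum of squared grey-level differences."""
    return sum(p[i][j] * (i - j) ** 2 for i in range(len(p)) for j in range(len(p)))


def glcm_homogeneity(p):
    """Higher when co-occurring pixels have similar grey levels."""
    return sum(p[i][j] / (1 + abs(i - j)) for i in range(len(p)) for j in range(len(p)))


# Hypothetical 3 x 3 image quantised to 3 grey levels; one pixel is masked out.
image = [[0, 0, 1], [0, 1, 1], [2, 2, 2]]
mask = [[True, True, True], [True, True, True], [True, True, False]]
matrix = glcm(image, mask, levels=3)
print(first_order_features(masked_pixels(image, mask)))
print(glcm_contrast(matrix), glcm_homogeneity(matrix))
```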

2.5 Statistical analysis and classification

The statistical analysis represents the fourth step of the quantitative image analysis applied with radiomics. The US texture features previously described were extracted from both left and right testes, considered separately. The statistical analysis first evaluated the reliability of the texture features by measuring the correlation between left and right testis. Second, the Pearson correlation of texture features with the andrologist's visual definition of inhomogeneity was computed. Third, the Pearson correlation of each texture feature with conventional semen and hormone parameters was measured. In this stage, the LH-to-testosterone ratio was calculated to better understand the potential predictive role of US texture features on the steroidogenic compartment. Finally, a multivariate linear regression (MLR) was applied using textural features as measurements and semen and hormone parameters as responses. R squared and p-values were computed. Since the analysis was performed on a large number of US textural features, the final statistical significance of the Pearson correlation was adjusted using the Bonferroni correction. During these analyses, sperm motility and morphology were considered as absolute numbers and not as percentages. In order to detect any pattern among the semen and hormonal parameters, a principal component analysis (PCA) was performed. The components extracted were included as responses in an MLR analysis, using US texture features as predictors.
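The correlation step with Bonferroni adjustment can be sketched as follows. This is a minimal illustration under stated assumptions: the significance level of 0.05 and the use of 44 tests (one per texture feature) are not stated in the text, and the function names and p-values are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


def bonferroni_significant(p_values, n_tests=None, alpha=0.05):
    """Flag the p-values that remain significant after Bonferroni correction,
    i.e. those below alpha divided by the number of tests."""
    m = n_tests if n_tests is not None else len(p_values)
    threshold = alpha / m
    return {name: p < threshold for name, p in p_values.items()}


# Hypothetical p-values for two features, corrected for 44 tests (ASSUMPTION).
print(bonferroni_significant({"Mean": 0.0004, "Variance": 0.01}, n_tests=44))
```

With 44 tests and a level of 0.05, the corrected threshold is about 0.0011, so a feature with p = 0.01 would no longer be reported as significant.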

Finally, patients were classified according to conventional semen analysis. Classification analyses were performed to measure the performance of a US texture features-based system in the detection of abnormal values of semen parameters. In detail, oligozoospermia was defined as a sperm concentration lower than 15 million/ml, asthenozoospermia as a progressive motility lower than 32%, and teratozoospermia as normal forms below 4%. In order to have a robust classification system, only texture features that showed good agreement between right and left testis were considered. Further, to reduce the dimensionality and complexity of the classification analysis, feature reduction was performed using PCA explaining a variance ratio of 99%. On the final dataset, classification analyses were performed by (i) multivariate linear regression analysis, and (ii) two machine learning methods, a support vector machine (SVM) and an artificial neural network (NN). The classification analysis was performed using a 10-fold cross-validation scheme. In detail, the dataset was randomly divided into 10 folds; nine folds were used for training the classifiers and one fold for testing the classification performance. This operation was performed 10 times, changing the test fold each time. This allows the generalization ability of the classifier to be evaluated (i.e., how the classifier performs with unseen data). Further, to avoid any potential bias in the random selection of folds, the entire procedure was repeated 50 times.
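The diagnostic thresholds and the repeated 10-fold scheme described above can be sketched in a few lines of Python. This is an illustration only: the classifiers themselves (MLR, SVM, NN) are not implemented, the function names and the seed are hypothetical, and the text does not state whether folds were drawn per patient or per image, so the sample count of 85 is an assumption.

```python
import random


def semen_labels(concentration_million_per_ml, progressive_motility_pct, normal_forms_pct):
    """Binary labels from the thresholds given in the text."""
    return {
        "oligozoospermia": concentration_million_per_ml < 15,
        "asthenozoospermia": progressive_motility_pct < 32,
        "teratozoospermia": normal_forms_pct < 4,
    }


def repeated_kfold(n_samples, n_folds=10, n_repeats=50, seed=0):
    """Yield (repeat, fold, train_indices, test_indices).
    Each repeat reshuffles the samples and uses every fold once as the test set."""
    rng = random.Random(seed)
    for repeat in range(n_repeats):
        order = list(range(n_samples))
        rng.shuffle(order)
        folds = [order[k::n_folds] for k in range(n_folds)]
        for k, test_idx in enumerate(folds):
            train_idx = [i for j, fold in enumerate(folds) if j != k for i in fold]
            yield repeat, k, train_idx, test_idx


# 85 samples (ASSUMPTION: one sample per patient), 10 folds, 50 repeats.
n_splits = sum(1 for _ in repeated_kfold(85))
print(n_splits)  # 50 repeats x 10 folds
```

In practice a classifier would be fitted on `train_indices` and scored on `test_indices` inside the loop, and the scores averaged over all splits.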

3 RESULTS

3.1 Cohort characteristics

Two hundred and twenty-four US images were collected from 112 patients (mean age 38.6 ± 9.1 years). Among these, 27 patients (54 images) were not considered for statistical analysis due to lack of hormonal or semen analyses (n = 10) or diagnosis of azoospermia (n = 17) (Figure 2). Finally, a cohort of 85 subjects (mean age 38.4 ± 9.2 years) with seminal, hormonal, and testis US data was obtained (Figure 2). Thus, the analysis was performed on 170 US images.

FIGURE 2. Flow chart of patients enrolled (US, ultrasonography)

The average calculated testicular volume was 15.8 ± 7.5 ml for the left and 14.8 ± 6.7 ml for the right testis. Testicular inhomogeneity was clinically described during US evaluation in 21 subjects (24.7%) for the left and in 20 subjects (23.5%) for the right testis. Microlithiasis was detected in four patients (4.7%) for the left and in four (4.7%) for the right testis. No solid testicular nodules were detected in our cohort, only a testicular cyst in two patients (2.3%). The epididymal head was within reference ranges, with a mean value of 8.0 ± 2.1 mm on the left and 8.6 ± 2.3 mm on the right side. Epididymal inhomogeneity was detected in 20 patients (23.5%) on the left and in 22 patients (25.9%) on the right. Hydrocele was detected in 26 patients (30.6%), and varicocele was detected in 2.3% of the dataset (two patients) on the right side and in 18.8% (16 patients) on the left side. The highest varicocele grade detected was grade 3, in two patients.

Considering conventional semen analysis, oligozoospermia was recorded in 52.9% of the dataset (45 patients), asthenozoospermia in 56.5% (48 patients), and teratozoospermia in 69.4% (64 patients). Table 1 summarizes the semen analysis and hormonal values for the entire cohort analyzed.

TABLE 1. Patients' characteristics considering both semen and hormonal examinations

Parameter                            Reference range   Value
Semen analysis
  Semen volume (ml)                  >1.5              2.5 (1.9)
  Semen pH                           >7.2              8.0 ± 0.3
  Sperm concentration (million/ml)   >15               17.3 (18.0)
  Total sperm number (million)       >39               41.3 (38.1)
  Progressive sperm motility (%)     >32               20.0 (40.0)
  Total sperm motility (%)           >40               30.0 (43.0)
  Normal forms (%)                   >4                1.0 (4.0)
Hormonal assessment
  Testosterone (ng/ml)               2.2–7.8           5.1 ± 2.4
  LH (IU/L)                          1–9               3.5 (2.5)
  FSH (IU/L)                         1–12              4.6 (4.5)
Clinical characteristics
  Unilateral varicocele, n (%)       –                 14 (16.5%)
  Bilateral varicocele, n (%)        –                 2 (2.3%)
  History of cryptorchidism, n (%)   –                 1 (1.2%)

Data are expressed as mean ± standard deviation or median (interquartile range).

3.2 Texture analysis and texture features reproducibility

During the first step, the images were corrected for consistency of echogenicity. This step was fundamental to adjust for the different gains used to acquire the images. Then, longitudinal scans were analyzed by the engineering approach. During the pre-processing phase, each image underwent segmentation and artefact removal (Figure 1). The second phase consisted of the texture features extraction. As reported in the methods section, both first-order and advanced features were extracted, for a total of 44 variables.

Since each testis was considered separately, the reproducibility of texture features between left and right testis in the same patient was evaluated first. A threshold of 0.5 on the left/right correlation was used to define acceptable reproducibility. With this approach, 35 of 44 US texture features showed good agreement between right and left testis. This result suggests that texture features extraction is a good model to describe testicular US characteristics in the same patient, with good reproducibility between the two testes.
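A reproducibility filter of this kind can be sketched as follows. The text does not specify which correlation coefficient was used for the left/right comparison, nor whether the threshold is inclusive, so the Pearson coefficient and the `>=` comparison below are assumptions. The function names and feature values are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def reproducible_features(left, right, threshold=0.5):
    """Keep the features whose left/right correlation across patients reaches the threshold.
    `left` and `right` map a feature name to one value per patient, in the same patient order."""
    return [name for name in left
            if pearson_r(left[name], right[name]) >= threshold]


# Hypothetical values for two features measured in four patients.
left = {"feature_a": [1.0, 2.0, 3.0, 4.0], "feature_b": [1.0, 2.0, 3.0, 4.0]}
right = {"feature_a": [2.0, 4.0, 6.0, 9.0], "feature_b": [4.0, 1.0, 3.0, 2.0]}
print(reproducible_features(left, right))
```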

3.3 Correlation with visual defined inhomogeneity

The US inhomogeneity defined by the engineering approach was significantly correlated with the inhomogeneity defined by the andrologist. In particular, after Bonferroni correction, 19 texture features were identified as predictors of the clinical definition of inhomogeneity (Table 2). Further, multiple linear regression analysis highlighted that US texture features significantly predict visually defined inhomogeneity (R-squared 0.379, p < 0.001).

TABLE 2. Correlation analysis between ultrasound (US) texture features and visual inhomogeneity as defined by the andrologist

Features               Pearson correlation   p-Value   Beta coefficient
Mean                   −0.34                 <0.001    −0.03
Skewness               0.31                  <0.001    0.03
GLCM_SumAverage        −0.30                 <0.001    −2.47
GLCM_AutoCorrelation   −0.29                 <0.001    11.50
LGRE                   0.32                  <0.001    0.42
HGRE                   −0.28                 <0.001    25.80
SRLGE                  0.32                  <0.001    0.23
SRHGE                  −0.28                 <0.001    −35.20
LRLGE                  0.26                  <0.001    1.09
LRHGE                  −0.30                 <0.001    0.59
GLVR                   −0.31                 <0.001    0.10
LGZE                   0.31                  <0.001    −0.39
HGZE                   −0.28                 <0.001    1.46
SZLGE                  0.30                  <0.001    −1.14
SZHGE                  −0.28                 <0.001    −0.13
LZHGE                  −0.29                 <0.001    −2.19
GLVZ                   −0.25                 <0.001    −0.01
ZSV                    0.25                  <0.001    −0.12
Strength               0.26                  <0.001    0.07

Note: Bold values represent parameters significantly correlated with visual US inhomogeneity. The last column shows the beta coefficients of the multivariate analysis.

Abbreviations: GLCM, grey-level co-occurrence matrix; GLN, grey-level non-uniformity; GLV, grey-level-variability; GLVR, grey-level variability of runs; GLVZ, grey-level variability of zones; HGRE, high grey-level run emphasis; HGZE, high grey-level zone emphasis; LGRE, low grey-level run emphasis; LGZE, low grey-level zone emphasis; LRE, long run emphasis; LRHGE, long run high grey-level emphasis; LRLGE, long run low grey-level emphasis; LZHGE, large zone high grey-level emphasis; LZE, length size emphasis; LZLGE, large zone low grey level emphasis; RP, run percentage; RLN, run length non-uniformity; RL, run length velocity; SRHGE, short run high grey-level emphasis; SRE, short run emphasis; SRLGE, short run low grey-level emphasis; SZHGE, small zone high grey-level emphasis; SZE, small zone emphasis; SZLGE, small zone low grey-level emphasis; ZP, zone percentage; ZSN, zone size non-uniformity; ZSV, zone size variability.

3.4 Correlation with semen parameters

Thirteen US texture features were significantly correlated with semen parameters (Table 3). In particular, 12 US texture features correlated with total sperm number, 12 texture features with progressive motility, and 12 texture features with total motility (Table 3). Interestingly, no US texture features correlated with sperm morphology (Table S1).

TABLE 3. Correlation analysis between ultrasound texture features and semen parameters obtained by conventional semen analysis

Semen parameters      Concentration                       Total number                        Progressive motility                Total motility
Features              Correlation p-Value  Beta           Correlation p-Value  Beta           Correlation p-Value  Beta           Correlation p-Value  Beta
Mean                  0.2         0.009    14.5           0.2         0.020    24.3           0.1         0.119    2,261.3        0.1         0.075    1,918.4
Variance              −0.1        0.251    1.1            −0.1        0.412    37.5           −0.1        0.482    −223.9         0.0         0.537    376.3
Skewness              −0.1        0.041    −17,470.0      −0.1        0.059    −25.1          −0.1        0.234    −97,916.0      −0.1        0.157    −185,297.0
Kurtosis              0.1         0.213    27,861.0       0.0         0.487    −9.3           0.0         0.925    −144,060.0     0.0         0.818    −31,256.0
GLCM_Energy           −0.1        0.102    −3,032.0       −0.1        0.111    144.3          −0.1        0.116    −15,827.0      −0.1        0.123    −1,821,791.0
GLCM_Contrast         0.3         <0.001   −10,387.0      0.3         <0.001   −347.3         0.3         <0.001   −821,212.0     0.3         0.000    −192,075.0
GLCM_Entropy          0.1         0.082    −10,964.0      0.1         0.094    284.8          0.1         0.097    256,873.0      0.1         0.101    078,507.0
GLCM_Homogeneity      −0.3        <0.001   449,981.0      −0.3        <0.001   174.5          −0.2        0.001    5,328,277.0    −0.2        0.001    7,023,606.0
GLCM_Correlation      −0.2        0.002    136,775.0      −0.2        0.002    −21.8          −0.2        0.002    −612,505.0     −0.2        0.002    −606,790.0
GLCM_SumAverage       0.1         0.111    515,620.0      0.1         0.254    334.5          0.0         0.510    2,755,099.0    0.1         0.394    1,843,415.0
GLCM_Variance         0.0         0.650    266,897.0      0.0         0.706    −4.4           0.0         0.716    −401,372.0     0.0         0.751    −363,877.0
GLCM_Dissimilarity    0.3         <0.001   79,253.0       0.3         <0.001   −10.0          0.3         <0.001   −1,917,661.0   0.2         <0.001   −74,734.0
GLCM_AutoCorrelation  0.1         0.163    −150,463.0     0.1         0.341    −323.0         0.0         0.621    986,885.0      0.0         0.496    3,878,548.0
SRE                   0.3         <0.001   −48,162.0      0.3         <0.001   −312.0         0.3         <0.001   −1,940,323.0   0.3         <0.001   −1,590,729.0
LRE                   −0.2        0.001    −309,302.0     −0.2        0.002    −1,218.4
