Elsevier

Medical Image Analysis

Volume 35, January 2017, Pages 58-69
Computing group cardinality constraint solutions for logistic regression problems

https://doi.org/10.1016/j.media.2016.05.011

Highlights

  • Model concurrent disease classification and temporal-consistent pattern selection.

  • Minimize model by directly solving logistic regression confined by group cardinality.

  • Correctly identify ROIs differentiating the cine MRs of 44 TOF from 38 controls.

  • Generally significantly more accurate than approaches relaxing group sparsity.

Abstract

We derive an algorithm to directly solve logistic regression with a group cardinality constraint (group sparsity) and use it to classify intra-subject MRI sequences (e.g., cine MRIs) of healthy from diseased subjects. Group cardinality constraint models are often applied to medical images in order to avoid overfitting of the classifier to the training data. Solutions within these models are generally determined by relaxing the cardinality constraint to a weighted feature selection scheme. However, these solutions relate to the original sparse problem only under specific assumptions, which generally do not hold for medical image applications. In addition, inferring clinical meaning from features weighted by a classifier is an ongoing topic of discussion. To avoid weighting features, we propose to directly solve the group cardinality constraint logistic regression problem by generalizing the Penalty Decomposition method. To do so, we assume that an intra-subject series of images represents repeated samples of the same disease patterns. We model this assumption by combining series of measurements created by a feature across time into a single group. Our algorithm then derives a solution within that model by decoupling the minimization of the logistic regression function from enforcing the group sparsity constraint. The minimum of the smooth and convex logistic regression problem is determined via gradient descent, while we derive a closed-form solution for finding a sparse approximation of that minimum. We apply our method to cine MRI of 38 healthy controls and 44 adult patients who received reconstructive surgery of Tetralogy of Fallot (TOF) during infancy. Our method correctly identifies regions impacted by TOF and generally obtains statistically significantly higher classification accuracy than alternative solutions to this model, i.e., ones relaxing group cardinality constraints.

Introduction

An important topic in medical image analysis is to identify image phenotypes by automatically classifying time series of 3D Magnetic Resonance Images (MRIs). For example, intra-subject MRI sequences are used to analyze cardiac motion (Osman, Kerwin, McVeigh, Prince, 1999, Sermesant, Forest, Pennec, Delingette, Ayache, 2003, Chandrashekara, Mohiaddin, Rueckert, 2004, Huang, Shen, Zhang, Makedon, Hettleman, Pearlman, 2005, Besbes, Komodakis, Glocker, Tziritas, Paragios, 2007, Sundar, Litt, Shen, 2009, Zhang, Wahle, Johnson, Scholz, Sonka, 2010, Margeta, Geremia, Criminisi, Ayache, 2012, Wang, Amini, et al., 2012, Yu, Zhang, Li, Metaxas, Axel, 2014), and brain development (Chetelat, Landeau, Eustache, Mezenge, Viader, de La Sayette, Desgranges, Baron, 2005, Zhang, Peng, Li, Jahanshad, Hou, Jiang, Masuda, Langbehn, Miller, Mori, et al., 2010, Aljabar, Wolz, Srinivasan, Counsell, Rutherford, Edwards, Hajnal, Rueckert, 2011, Serag, Gousias, Makropoulos, Aljabar, Hajnal, Boardman, Counsell, Rueckert, 2012, Toews, Wells, William, Zöllei, 2012, Bernal-Rusiel, Reuter, Greve, Fischl, Sabuncu, 2013, Schellen, Ernst, Gruber, Mlczoch, Weber, Brugger, Ulm, Langs, Salzer-Muhar, Prayer, Kasprian, 2015). However, the automatic classification of medical images is generally challenging. First, the number of features extracted from medical images is usually much larger than the number of samples. This generally results in overfitting of the method to the data, i.e., much higher classification accuracy during training than on test data (Ryali, Supekar, Abrams, Menon, 2010, Marques, Clemmensen, Dam, 2012, Deshpande, Maurel, Barillot, 2014). In addition, the image phenotypes identified by automatic classifiers are often difficult to relate to the medical literature (Qu et al., 2003). In this article, we propose an algorithm that addresses both issues by directly solving the so-called logistic regression problem with group sparsity constraints.

Classifiers based on sparse models reduce the dense image data to a small number of features by counting the number of selected features via the l0-“norm” and are configured so that the count is below a predefined threshold (Yamashita, Sato, Yoshioka, Tong, Kamitani, 2008, Carroll, Cecchi, Rish, Garg, Rao, 2009, Rao, Lee, Gass, Monsch, 2011, Liu, Zhang, Shen, 2012, Lv, Jiang, Li, Zhu, Chen, Zhang, Zhang, Hu, Han, Huang, et al., 2015). Group sparsity models generalize that concept: they first group image features based on predefined rules and then count the number of non-zero groupings (Ng, Vahdat, Hamarneh, Abugharbieh, 2010, Wu, Yuan, Zhuang, 2010, Ryali, Supekar, Abrams, Menon, 2010). To solve the underlying minimization problem, however, these methods relax the feature selection process from (group) cardinality constraints to weighting features by, for example, replacing the l0-“norm” with the l2-norm (Meier, Van De Geer, Bühlmann, 2008, Friedman, Hastie, Tibshirani, Ryali, Supekar, Abrams, Menon, 2010, Li, Yin, Fang, 2012). The solution of those methods relates to the original sparse problem only under specific assumptions, e.g., the data entry matrix needs to satisfy the restricted isometry property in compressed sensing problems (Candes, Tao, 2005, Candès, Romberg, Tao, 2006). However, matrices generally do not satisfy this property; examples are those in the appendix of (Lu and Zhang, 2013) and most data matrices of medical image applications, e.g., matrices defined by the regional volume scores of subjects. Thus, with the exception of sparse models applied to compressed sensing, the solution obtained with respect to the relaxed norm generally does not recover the one of the original sparse model defined by the l0-“norm”. In addition, the number of measures selected by the classifier now depends on the training data due to the soft selection scheme. One can select a predefined number by choosing measures whose weight is above a certain threshold. However, in the case of sparse logistic regression the corresponding classifier depends on the measures below the threshold and the relevance of those weights with respect to the disease under study is an ongoing topic of discussion (Haufe, Meinecke, Görgen, Dähne, Haynes, Blankertz, Bießmann, 2014, Sabuncu, 2014). Alternatively, the upper bound associated with the sparse constraint is set so that the classifier returns the wanted number of measures for a given training data set (Vounou, Janousova, Wolz, Stein, Thompson, Rueckert, Montana, 2012, Zhang, Shen, 2012, Ma, Huang, 2008, Zhang, Zhan, Metaxas, 2012). The tuning is now data dependent, i.e., each training set is generally associated with a different upper bound so that the selected number of scores is constant across training sets. Even comparing the patterns of different subsets of the same data set, i.e., folds, is non-trivial as each pattern is the solution to a minimization problem whose sparsity constraint is unique to a fold. Avoiding soft feature selection and thus these issues, our algorithm solves the original group sparsity constrained, logistic classification problem defined by the l0-“norm” by extending the Penalty Decomposition (PD) method (Lu and Zhang, 2013). By doing so, our method uses a single model to not only classify samples but also directly select patterns (without thresholding or changing upper bounds) that potentially are image phenotypes meaningful to the medical community.

To further investigate its potential, we now generalize PD from solving sparse logistic regression problems with group size one to more than one. Specifically, we assume that an intra-subject series of images represents repeated samples of the same disease patterns. In other words, selecting an image feature for disease identification needs to account for the entire series of measurements created by that feature across time. We model this assumption by combining each “feature series” into a single group. The proposed PD algorithm then derives a solution within that model by decoupling the minimization of the logistic regression function from enforcing the group sparsity constraint. Applying Block Coordinate Descent (BCD), the minimum to the smooth and convex logistic regression problem is determined via gradient descent while we derive a closed form solution for finding a sparse approximation of that minimum.
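The decoupling described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the quadratic coupling term (rho/2)·||w − y||², the step-size rule, and the schedule for increasing rho are all assumptions chosen for illustration. The sparse approximation step uses the standard closed form for projecting onto a group cardinality constraint, which keeps the k groups with the largest l2 norm.

```python
import math

def group_hard_threshold(w, groups, k):
    """Closest vector to w (in l2) with at most k non-zero groups:
    keep the k groups with the largest l2 norm, zero out the rest."""
    norms = [math.sqrt(sum(w[i] ** 2 for i in g)) for g in groups]
    keep = sorted(range(len(groups)), key=lambda j: norms[j], reverse=True)[:k]
    y = [0.0] * len(w)
    for j in keep:
        for i in groups[j]:
            y[i] = w[i]
    return y

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def penalty_decomposition(X, labels, groups, k, rho=0.1, rho_growth=2.0,
                          outer_iters=8, inner_iters=10, gd_steps=20):
    """Illustrative PD loop (all hyperparameters are assumptions).
    X: list of feature vectors, labels: 0/1, groups: lists of indices."""
    n, d = len(X), len(X[0])
    w, b, y = [0.0] * d, 0.0, [0.0] * d
    # Upper bound on the curvature of the average logistic loss.
    lipschitz = 0.25 * max(sum(v * v for v in x) + 1.0 for x in X)
    for _ in range(outer_iters):
        step = 1.0 / (lipschitz + rho)
        for _ in range(inner_iters):
            # Block 1: gradient descent on loss + (rho/2) * ||w - y||^2.
            for _ in range(gd_steps):
                grad_w = [rho * (w[i] - y[i]) for i in range(d)]
                grad_b = 0.0
                for x, t in zip(X, labels):
                    err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - t
                    for i in range(d):
                        grad_w[i] += err * x[i] / n
                    grad_b += err / n
                w = [w[i] - step * grad_w[i] for i in range(d)]
                b -= step * grad_b
            # Block 2: closed-form group-sparse approximation of w.
            y = group_hard_threshold(w, groups, k)
        rho *= rho_growth  # tighten the coupling between w and y
    return y, b
```

In this sketch each group would hold the indices of one feature measured at every time point, so selecting a group selects that feature across the whole series.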

We apply our method to cine MRI of 38 healthy adults and 44 adult patients who received reconstructive surgery of Tetralogy of Fallot (TOF) during infancy. The data sets fulfill the assumption of the group sparsity model as the residual effects of TOF mostly impact the shape of the right ventricle (Atrey, Hossain, El Saddik, Kankanhalli, 2010, Bailliard, Anderson, 2009) so that the regions impacted by TOF should not change across the time series captured by a cine MRI. During training, we automatically set all important parameters of our approach by first training a separate regressor for each setting of the parameter space. We then reduce the risk of overfitting by combining those classifiers into a single ensemble of classifiers (Rokach, 2010). This ensemble of classifiers correctly favors subregions of the ventricles most likely impacted by TOF. For most experiments, it also produces statistically significantly higher accuracy scores than ensembles of classifiers that relax the group cardinality constraint.
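As a rough illustration of combining one classifier per parameter setting into an ensemble, the following Python sketch uses a simple majority vote. The combination rule is not stated in this excerpt, so majority voting, the helper names, and the example weights are all assumptions rather than the authors' procedure.

```python
def make_linear_classifier(w, b):
    """Return a function mapping a feature vector to a 0/1 label."""
    def predict(x):
        score = sum(wi * xi for wi, xi in zip(w, x)) + b
        return 1 if score > 0 else 0
    return predict

def ensemble_predict(classifiers, x):
    """Majority vote over the member classifiers (ties go to label 1)."""
    votes = sum(clf(x) for clf in classifiers)
    return 1 if 2 * votes >= len(classifiers) else 0

# Hypothetical members, e.g. one trained regressor per parameter setting.
members = [
    make_linear_classifier([1.0, 0.0], 0.0),
    make_linear_classifier([0.8, 0.0], -0.1),
    make_linear_classifier([0.0, 1.0], 0.0),
]
```

Calling `ensemble_predict(members, x)` then returns the label chosen by most members for the feature vector `x`.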

We first proposed to generalize PD to group sparsity constraints at MICCAI 2015 (Zhang and Pohl, 2015). This article provides a more in-depth view of this idea. Specifically, we expand PD to guarantee convergence of the sparse approximation to a local minimum of the group-sparsity confined, logistic regression problem, which is the primary contribution of this manuscript. We also modify the experiments by replacing the morphometric encodings of heart regions based on the average of the Jacobian determinants with simple volumetric scores. This simplifies preprocessing as alignment of each cine MRI to a template is unnecessary. It also reduces the size of the parameter search space, which now omits the smoothing parameters associated with the alignment process. Moreover, instead of recording a single accuracy score for each implementation, we generate distributions of scores by varying the number of training samples. For each training size, we apply the method to 10 different training and testing sets. Finally, we distinguish the ventricular septum from the left ventricle to refine our findings from the previous publication (Zhang and Pohl, 2015) and support those findings with new plots that visualize the selection of regions across the entire heart.

Beyond our MICCAI publication, a possible alternative regression approach for simultaneous classification and pattern extraction is the random forest method (Lempitsky et al., 2009). However, it is unclear how to expand this technology to group-wise selection schemes that enforce temporal consistency in selecting regions, i.e., the same regions are picked across all time points. Due to these difficulties, most machine learning approaches applied to cine MRI just focus on disease classification, such as (McLeod, Mansi, Sermesant, Pongiglione, Pennec, 2013, Afshin, Ben Ayed, Punithakumar, Law, Islam, Goela, Peters, Li, 2014, Bai, Peressutti, Oktay, Shi, O’Regan, King, Rueckert, 2015). They often improve results by manually selecting regions thought to be impacted by the disease before performing classification (Wald et al., 2009). Exceptions are (Qian, Liu, Metaxas, Axel, 2011, Ye, Desjardins, Hamm, Litt, Pohl, 2014, Bhatia, Rao, Price, Wolz, Hajnal, Rueckert, 2014), which separately perform disease classification and weigh individual regions possibly impacted by disease. The disconnect between the two steps and the weighing of individual regions make clinical interpretation of the findings more difficult as, in addition to the earlier mentioned issues associated with the interpretation of weights, they increase the risk of false positive findings compared to directly identifying patterns of regions. Our experimental results echo these issues: logistic regression with relaxed sparsity constraints was generally significantly less accurate than our proposed solution to the original sparsity constraint. We conclude that our proposed approach is the first to solve a single optimization problem for simultaneous disease classification and group-based pattern identification based on segmentation of cine MRIs.

The rest of this paper is organized as follows. Section 2 provides an in-depth description of the PD algorithm and its convergence properties. Section 3 summarizes the experiments on the TOF dataset and Section 4 concludes the paper with final remarks.

Section snippets

Solving sparse group logistic regression

We first describe the logistic regression model with group cardinality constraint, which accurately assigns subjects to cohorts based on features extracted from intra-subject image sequences. We then generalize the PD approach to find a solution within that model. We end the section deriving convergence properties of the resulting algorithm.
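For orientation, the model and the decoupled problem can be written as follows. The notation is assumed for illustration and is not taken from this excerpt: ℓ(w, b) denotes the average logistic loss over the training subjects, g_1, …, g_M are the index sets that collect each feature across time, k is the bound on the number of selected groups, and ρ > 0 is a penalty parameter that is increased over the iterations. The quadratic coupling term follows the usual Penalty Decomposition scheme and is likewise an assumption here.

```latex
% Assumed notation: w weights, b bias, y auxiliary sparse copy of w,
% g_1..g_M feature groups across time, k sparsity bound, rho penalty weight.
\begin{align}
  &\min_{w,\,b}\; \ell(w, b)
   \quad \text{s.t.} \quad
   \bigl|\{\, m : \|w_{g_m}\|_2 \neq 0 \,\}\bigr| \le k, \\
  &\min_{w,\,b,\,y}\; \ell(w, b) + \frac{\rho}{2}\,\|w - y\|_2^2
   \quad \text{s.t.} \quad
   \bigl|\{\, m : \|y_{g_m}\|_2 \neq 0 \,\}\bigr| \le k.
\end{align}
```

The first line is the group cardinality constrained problem. The second moves the constraint onto the auxiliary variable y, so that the update of (w, b) is a smooth convex problem and the update of y is a projection onto the constraint set.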

Testing algorithms on correctly classifying TOF

To better understand the strengths and weaknesses of our proposed Algorithm 1, we compare the accuracy of our approach to alternative solvers of sparsity-constrained logistic regression problems on a data set consisting of regional volume scores extracted from cine MRIs of 44 TOF cases and 38 healthy controls. The dataset provides an ideal test bed for such a comparison as it contains ground-truth diagnosis, i.e., each subject received reconstructive surgery for TOF during infancy or not.

Conclusion

We generalized the PD approach to directly solve group cardinality constraint logistic regression, i.e., simultaneously performing disease classification and temporal-consistent pattern identification. To do so, we assumed that an intra-subject series of images represents repeated samples of the same disease patterns. We modeled this assumption by combining series of measurements created by a feature across time into a single group. Unlike existing approaches, our algorithm then derived a

Acknowledgment

We would like to thank Drs. Benoit Desjardins and DongHye Ye for their help on generating the cardiac dataset. This research was supported by NIH grants (R01 HL127661, K05 AA017168) and the Creative and Novel Ideas in HIV Research (CNIHR) Program through a supplement to the University of Alabama at Birmingham (UAB) Center For AIDS Research funding (P30 AI027767). This funding was made possible by collaborative efforts of the Office of AIDS Research, the National Institute of Allergy and

References (58)

  • M. Toews et al.

    A feature-based developmental model of the infant brain in structural MRI

    Medical Image Computing and Computer-Assisted Intervention – MICCAI 2012

    (2012)
  • R.M. Wald et al.

    Effects of regional dysfunction and late gadolinium enhancement on global right ventricular function and exercise capacity in patients with repaired tetralogy of Fallot

    Circulation

    (2009)
  • D.H. Ye et al.

    Regional manifold learning for disease classification

    IEEE Trans. Med. Imaging

    (2014)
  • D. Zhang et al.

    Multi-modal multi-task learning for joint prediction of multiple regression and classification variables in Alzheimer’s disease

    NeuroImage

    (2012)
  • H. Zhang et al.

    4-D cardiac MR image analysis: left and right ventricular morphology and function

    IEEE Trans. Med. Imaging

    (2010)
  • S. Zhang et al.

    Deformable segmentation via sparse representation and dictionary learning

    Med. Image Anal.

    (2012)
  • Y. Zhang et al.

    Solving logistic regression with group cardinality constraints for time series analysis

    Medical Image Computing and Computer-Assisted Intervention – MICCAI 2015

    (2015)
  • M. Afshin et al.

    Regional assessment of cardiac left ventricular myocardial function via MRI statistical features

    IEEE Trans. Med. Imaging

    (2014)
  • P. Aljabar et al.

    A combined manifold learning analysis of shape and appearance to characterize neonatal brain development

    IEEE Trans. Med. Imaging

    (2011)
  • P.K. Atrey et al.

    Multimodal fusion for multimedia analysis: a survey

    Multim. Syst.

    (2010)
  • W. Bai et al.

    Learning a global descriptor of cardiac motion from a large cohort of 1000+ normal subjects

    Functional Imaging and Modeling of the Heart

    (2015)
  • F. Bailliard et al.

    Tetralogy of Fallot

    Orphanet J. Rare Dis.

    (2009)
  • R. Bartle et al.

    Introduction to Real Analysis. Matemáticas (Limusa)

    (1982)
  • D. Bertsekas

    Nonlinear programming

    (1999)
  • A. Besbes et al.

    4D ventricular segmentation and wall motion estimation using efficient discrete optimization

    Advances in Visual Computing

    (2007)
  • K.K. Bhatia et al.

    Hierarchical manifold learning for regional image analysis

    IEEE Trans. Med. Imaging

    (2014)
  • E.J. Candès et al.

    Robust uncertainty principles: exact signal reconstruction from highly incomplete frequency information

    IEEE Trans. Inf. Theory

    (2006)
  • E.J. Candes et al.

    Decoding by linear programming

    IEEE Trans. Inf. Theory

    (2005)
  • R. Chandrashekara et al.

    Analysis of 3-D myocardial motion in tagged MR images using nonrigid image registration

    IEEE Trans. Med. Imaging

    (2004)