A comparison of penalised regression methods for informing the selection of predictive markers

Greenwood, CJ; Youssef, GJ; Letcher, P; Macdonald, JA; Hagg, LJ; Sanson, A; McIntosh, Jennifer; Hutchinson, DM; Toumbourou, JW; Fuller-Tyszkiewicz, M; Olsson, Craig A

doi:10.26181/5ff5314663705

File(s) stored somewhere else

https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0242730

Please note: Linked content is NOT stored on La Trobe and we can't guarantee its availability, quality, security or accept any liability.

journal contribution

posted on 2021-01-06, 03:40 authored by CJ Greenwood, GJ Youssef, P Letcher, JA Macdonald, LJ Hagg, A Sanson, Jennifer McIntosh, DM Hutchinson, JW Toumbourou, M Fuller-Tyszkiewicz, Craig A Olsson

© 2020 Greenwood et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Background

Penalised regression methods are a useful atheoretical approach for both developing predictive models and selecting key indicators within an often substantially larger pool of available indicators. In comparison to traditional methods, penalised regression models improve prediction in new data by shrinking the size of coefficients and retaining those with coefficients greater than zero. However, the performance and selection of indicators depends on the specific algorithm implemented. The purpose of this study was to examine the predictive performance and feature (i.e., indicator) selection capability of common penalised logistic regression methods (LASSO, adaptive LASSO, and elastic-net), compared with traditional logistic regression and forward selection methods.

Design

Data were drawn from the Australian Temperament Project, a multigenerational longitudinal study established in 1983. The analytic sample consisted of 1,292 (707 women) participants. A total of 102 adolescent psychosocial and contextual indicators were available to predict young adult daily smoking.

Findings

Penalised logistic regression methods showed small improvements in predictive performance over logistic regression and forward selection. However, no single penalised logistic regression model outperformed the others. Elastic-net models selected more indicators than either LASSO or adaptive LASSO. Additionally, more regularised models included fewer indicators, yet had comparable predictive performance. Forward selection methods dismissed many indicators identified as important in the penalised logistic regression models.

Conclusions

Although overall predictive accuracy was only marginally better with penalised logistic regression methods, benefits were most clear in their capacity to select a manageable subset of indicators. Preference among competing penalised logistic regression methods may therefore be guided by feature selection capability, and thus interpretative considerations, rather than predictive performance alone.
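
The shrinkage-and-selection behaviour described in the abstract can be illustrated with a minimal sketch. This is not the authors' analysis code, and this record does not state which software the study used. It assumes a plain proximal gradient solver for elastic-net penalised logistic regression, written with the Python standard library only; the function names, the simulated data, and the penalty values are all hypothetical. Setting alpha to 1.0 gives the LASSO penalty; adaptive LASSO is not covered.

```python
import math
import random


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; return 0.0 if it would cross zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_penalised_logistic(X, y, lam, alpha=1.0, step=0.1, n_iter=300):
    """Proximal gradient descent for elastic-net penalised logistic regression.

    Penalty (an assumed parameterisation):
        lam * (alpha * |b|_1 + (1 - alpha) / 2 * |b|_2^2)
    alpha=1.0 is the LASSO. The intercept is not penalised.
    """
    n, p = len(X), len(X[0])
    intercept, coefs = 0.0, [0.0] * p
    for _ in range(n_iter):
        # Predicted probability minus observed outcome for each participant.
        residuals = [
            sigmoid(intercept + sum(b * x for b, x in zip(coefs, row))) - target
            for row, target in zip(X, y)
        ]
        intercept -= step * sum(residuals) / n
        new_coefs = []
        for j in range(p):
            grad = sum(r * row[j] for r, row in zip(residuals, X)) / n
            grad += lam * (1.0 - alpha) * coefs[j]  # ridge part of the penalty
            # The L1 part is applied by soft-thresholding, which produces exact zeros.
            new_coefs.append(
                soft_threshold(coefs[j] - step * grad, step * lam * alpha)
            )
        coefs = new_coefs
    return intercept, coefs


def simulate(n=200, p=10, seed=0):
    """Hypothetical data: only the first two indicators affect the outcome."""
    rng = random.Random(seed)
    X = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
    y = [
        1 if rng.random() < sigmoid(1.5 * row[0] - 1.5 * row[1]) else 0
        for row in X
    ]
    return X, y


X, y = simulate()
for lam in (0.0, 0.05, 10.0):
    _, coefs = fit_penalised_logistic(X, y, lam)
    selected = [j for j, b in enumerate(coefs) if b != 0.0]
    print(f"lambda={lam}: {len(selected)} indicators retained -> {selected}")
```

In this sketch an indicator counts as retained when its coefficient is not exactly zero. A sufficiently heavy penalty removes every indicator, while an unpenalised fit keeps all of them, which mirrors the trade-off between regularisation strength and the number of selected indicators reported in the findings.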

Funding

Data collection for the ATP study was supported primarily through Australian grants from the Melbourne Royal Children's Hospital Research Foundation, National Health and Medical Research Council, Australian Research Council, and the Australian Institute of Family Studies. Funding for this work was supported by grants from the Australian Research Council [DP130101459; DP160103160; DP180102447] and the National Health and Medical Research Council of Australia [APP1082406]. Olsson, C.A. was supported by a National Health and Medical Research Council fellowship (Investigator grant APP1175086). Hutchinson, D.M. was supported by the National Health and Medical Research Council of Australia [APP1197488].

The ATP study is located at The Royal Children's Hospital Melbourne and is a collaboration between Deakin University, The University of Melbourne, the Australian Institute of Family Studies, The University of New South Wales, The University of Otago (New Zealand), and the Royal Children's Hospital (further information available at www.aifs.gov.au/atp).

The views expressed in this paper are those of the authors and may not reflect those of their organizational affiliations, nor of other collaborating individuals or organizations. We acknowledge all collaborators who have contributed to the ATP, especially Professors Ann Sanson, Margot Prior, Frank Oberklaid, John Toumbourou and Ms Diana Smart. We would also like to sincerely thank the participating families for their time and invaluable contribution to the study.

History

Publication Date

2020-11-20

Journal

PLoS ONE

Volume

15

Issue

11

Article Number

e0242730

Pagination

14p. (p. 1-14)

Publisher

Public Library of Science (PLOS)

ISSN

1932-6203

Rights Statement

The Author reserves all moral rights over the deposited text and must be credited if any re-use occurs. Documents deposited in OPAL are the Open Access versions of outputs published elsewhere. Changes resulting from the publishing process may therefore not be reflected in this document. The final published version may be obtained via the publisher’s DOI. Please note that additional copyright and access restrictions may apply to the published version.