Predicting High-Cost Pediatric Patients: Derivation and Validation of a Population-Based Model

Publisher: Medical Care, vol. 53, no. 8
Aug 01, 2015
Lindsey J. Leininger, Brendan Saloner, and Laura R. Wherry

Key Finding:

  • A model comprising the Children with Special Health Care Needs Screener and the prior year’s health care utilization predicts future health care expenditures well enough to serve as a clinical prediction tool for prospectively identifying children with elevated health care needs.

Background: Health care administrators often lack feasible methods to prospectively identify new pediatric patients with high health care needs, which prevents them from proactively targeting appropriate population health management programs to these children.


Objective: To develop and validate a predictive model identifying high-cost pediatric patients using parent-reported health (PRH) measures that can be easily collected in clinical and administrative settings.


Design: Retrospective cohort study using 2-year panel data from the 2001 to 2011 rounds of the Medical Expenditure Panel Survey.


Subjects: A total of 24,163 children aged 5–17 with family incomes below 400% of the federal poverty line were included in this study.


Measures: Predictive performance, including the c-statistic, sensitivity, specificity, and predictive values, of multivariate logistic regression models predicting top-decile health care expenditures over a 1-year period.
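
As an illustration of the performance measures named above, the following minimal Python sketch computes a c-statistic, sensitivity, specificity, and predictive values for a binary top-decile outcome. It is not the authors' code: the function names (`top_decile_flags`, `c_statistic`, `classification_metrics`), the toy data, and the 0.45 classification threshold are all assumptions made for the example, and the sketch ignores survey weights and confidence intervals.

```python
import math

def top_decile_flags(costs):
    """Flag children whose expenditure is at or above the top-decile cutoff.

    Ties at the cutoff are all flagged, so slightly more than 10% may be marked.
    """
    ranked = sorted(costs, reverse=True)
    k = max(1, round(0.10 * len(costs)))  # number of children in the top decile
    cutoff = ranked[k - 1]
    return [1 if c >= cutoff else 0 for c in costs]

def c_statistic(y_true, scores):
    """Share of (high-cost, not-high-cost) pairs in which the high-cost child
    has the higher predicted score; tied scores count as half."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        return math.nan  # undefined without both outcome classes
    concordant = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    return concordant / (len(pos) * len(neg))

def _ratio(numerator, denominator):
    """Safe division that returns NaN when the denominator is zero."""
    return numerator / denominator if denominator else math.nan

def classification_metrics(y_true, scores, threshold):
    """Sensitivity, specificity, PPV, and NPV at a chosen score threshold."""
    tp = fp = tn = fn = 0
    for y, s in zip(y_true, scores):
        predicted = 1 if s >= threshold else 0
        if predicted == 1 and y == 1:
            tp += 1
        elif predicted == 1 and y == 0:
            fp += 1
        elif predicted == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    return {
        "sensitivity": _ratio(tp, tp + fn),
        "specificity": _ratio(tn, tn + fp),
        "ppv": _ratio(tp, tp + fp),
        "npv": _ratio(tn, tn + fn),
    }

# Toy data (hypothetical): observed outcomes and model-predicted probabilities.
y_true = [1, 1, 0, 0, 0]
scores = [0.9, 0.4, 0.5, 0.2, 0.1]
auc = c_statistic(y_true, scores)
metrics = classification_metrics(y_true, scores, threshold=0.45)
```

The pairwise definition used in `c_statistic` is equivalent to the area under the ROC curve for a binary outcome. It is quadratic in sample size, so for a cohort of this size a rank-based implementation would be the more practical choice.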


Results: Seven independent domains of PRH measures were tested for predictive capacity relative to basic sociodemographic information: the Children with Special Health Care Needs (CSHCN) Screener; subjectively rated health status; prior-year health care utilization; behavioral problems; asthma diagnosis; access to health care; and parental health status and access to care. The CSHCN Screener and prior-year utilization domains exhibited the highest incremental predictive gains over the baseline model. A model including sociodemographic characteristics, the CSHCN Screener, and prior-year utilization had a c-statistic of 0.73 (95% confidence interval, 0.70–0.74), surpassing the commonly used threshold for sufficient predictive capacity (c-statistic > 0.70).