    • Application of Machine Learning Algorithms for Predicting Missing Cost Data

      Rueda, Juan-David; Slejko, Julia F; 0000-0002-0907-7106 (2019)
      Objective: To compare new machine-learning (ML) based alternatives for estimating health care costs in the presence of missing data. Introduction: Costs must be correctly estimated for value assessment and budget calculations. Biased cost estimates can lead to wrong decisions that affect population health. Cost estimation is a challenging task, and it is more challenging in the presence of missing data. Methods: We used the Surveillance, Epidemiology, and End Results (SEER)-Medicare database, including patients newly diagnosed with multiple myeloma from 2007 to 2013. We explored the problem of missing data using different approaches to creating artificial missing data. We hypothesized that the use of ML techniques improves the prediction of mean total medical costs in the presence of missingness. ML methods included support vector machines, boosting, random forest, and classification and regression trees. First, we analyzed the problem in one dimension, when one variable is missing in a cross-sectional scenario, using generalized linear models as a comparator against ML. Then, we added time as a factor for missingness, comparing reweighted estimators against ML. Finally, we explored different levels of censoring and determined how each censoring level affected our cost estimates. In this case, we created multiple linear spline models to establish the effect of censoring on the bias of the estimator. Results: ML algorithms predicted better when data were missing completely at random and missing at random. All the methods performed poorly in the missing not at random scenario. In the second aim, we showed that ML-based methods predict just as well as reweighted estimators for the five-year total cost of a patient with multiple myeloma. Lastly, we found that ML methods are consistent and robust at low and moderate levels of censoring; however, we could not show that they are better than the reweighted estimators. Conclusions: ML-based methods are a good alternative for the prediction of missing cost data in the case of cross-sectional and longitudinal data.
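      The three missingness mechanisms named in the abstract above (missing completely at random, missing at random, and missing not at random) can be illustrated with a small simulation. The sketch below is not the authors' method and does not use SEER-Medicare data: the cost model, the age covariate, the missingness probabilities, and the function names (`simulate_costs`, `make_missing`, `complete_case_mean`) are all assumptions chosen for illustration. It only shows how a naive complete-case mean behaves under each mechanism.

```python
import math
import random


def simulate_costs(n, seed=0):
    """Simulate (age, cost) pairs; costs are log-normal and rise with age (assumed model)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        age = rng.uniform(65, 90)
        cost = math.exp(9.0 + 0.02 * (age - 65) + rng.gauss(0, 0.5))
        rows.append((age, cost))
    return rows


def make_missing(rows, mechanism, seed=1):
    """Return rows with cost set to None according to the chosen mechanism.

    MCAR: missingness is unrelated to any variable.
    MAR:  missingness depends only on the observed covariate (age).
    MNAR: missingness depends on the unobserved cost itself.
    The probabilities below are illustrative assumptions.
    """
    rng = random.Random(seed)
    median_cost = sorted(cost for _, cost in rows)[len(rows) // 2]
    result = []
    for age, cost in rows:
        if mechanism == "MCAR":
            p_missing = 0.3
        elif mechanism == "MAR":
            p_missing = 0.5 if age > 77.5 else 0.1
        elif mechanism == "MNAR":
            p_missing = 0.5 if cost > median_cost else 0.1
        else:
            raise ValueError(f"unknown mechanism: {mechanism}")
        result.append((age, None if rng.random() < p_missing else cost))
    return result


def complete_case_mean(rows):
    """Mean cost over rows whose cost is observed (naive complete-case estimate)."""
    observed = [cost for _, cost in rows if cost is not None]
    return sum(observed) / len(observed)


full = simulate_costs(20000)
true_mean = sum(cost for _, cost in full) / len(full)

# Relative bias of the complete-case mean under each mechanism.
bias = {}
for mechanism in ("MCAR", "MAR", "MNAR"):
    estimate = complete_case_mean(make_missing(full, mechanism))
    bias[mechanism] = (estimate - true_mean) / true_mean
    print(f"{mechanism}: relative bias of complete-case mean = {bias[mechanism]:+.3f}")
```

      In this toy setup the complete-case mean stays close to the true mean under MCAR and drifts downward under MAR and MNAR, because the higher-cost patients are the ones more likely to be missing. Under MAR the drift could in principle be corrected by modelling cost on age; under MNAR it cannot be corrected from the observed data alone, which is in line with the abstract's report that all methods performed poorly in that scenario.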
    • Association of Patient Cost Sharing and Area Deprivation with Multiple Myeloma Treatment Receipt and Outcomes

      Hong, Yoon Duk; Slejko, Julia F; 0000-0002-9548-5770 (2021)
      Introduction: Advances in multiple myeloma (MM) treatment have improved survival, but there are increased concerns about treatment affordability and access. This study assessed 1) how cost-sharing assistance and area deprivation affect treatment receipt, 2) changes in the patient cost responsibility and disparities in treatment over time, and 3) how the low-income subsidy (LIS), which lowers Part D cost sharing, and area deprivation affect treatment access and survival. Methods: Using the Surveillance, Epidemiology, and End Results-Medicare database, we identified patients diagnosed with MM. The effect of cost-sharing assistance and area deprivation on treatment was estimated using multilevel logistic regression. We estimated the monthly incremental patient cost responsibility among MM patients compared to non-cancer controls and examined changes over time (2007-2011, 2012-2016). The effect of diagnosis period and area deprivation on treatment was estimated using multilevel logistic regression. The association between LIS, area deprivation, and mortality was estimated from a mixed-effects Cox proportional hazards model. We assessed whether treatment mediates the association between LIS and mortality. Results: Individuals receiving Medicare Parts A, B and D cost-sharing assistance had higher odds of receiving treatment compared with non-recipients (OR=1.21; 95%CI: 1.01–1.45). Living in the most deprived area (Quintile 5) was associated with lower odds of receiving treatment compared with the least deprived area (Quintile 1; OR=0.81; 95%CI: 0.65–0.99), but there was no difference in the other quintiles. The difference in the estimated monthly incremental patient cost responsibility between 2012-2015 and 2007-2011 was $58 (average marginal cost; 95%CI: $12–$105). The difference in the likelihood of any treatment receipt between Quintile 1 and 5 decreased, but the difference in the likelihood of receiving a novel agent-based regimen increased. The mortality hazard was higher for LIS recipients relative to non-recipients in Quintiles 1, 3 and 4 (HR=1.50, 1.38, 1.28; p=0.0001), and there was no difference in the other two quintiles. This association was partially mediated by treatment receipt. Conclusions: The patient cost responsibility for MM care increased over time. The type of cost-sharing assistance and area deprivation affect treatment receipt, although not across all quintiles. LIS receipt did not confer a survival benefit.