dc.description.abstract |
This dissertation investigates existing methods for predicting the risk of a sequence of events from longitudinal studies with continuous-time data and proposes a simple alternative method. These outcomes (events) can change status at different follow-ups, which may produce a large number of paths or trajectories. Regressive models for multinomial and ordinal outcomes are also proposed for discrete-time data to obtain a joint model for a sequence of events for risk prediction. A key challenge is simplifying and generalizing the existing methods for continuous-time data so that risk can be predicted for a large sequence of events at different stages. Most existing models were proposed to solve problems arising from the progression of specific disease processes. The proposed alternative multistage procedure simplifies the transition models for risk prediction of a sequence of events for continuous-time data. This framework provides conditional estimates for each stage in the process, and these estimates are linked through marginal and conditional models to obtain the joint probabilities needed to predict disease status from potential risk factors. The proposed method of prediction is a new development that uses a series of events in a conditional setting, from the beginning to the endpoint. In addition, a general form of the integral is developed for predicting the joint probability of a sequence of events from longitudinal studies for (i) different types of trajectories and (ii) any segment of a trajectory, together with a generalization to any number of stages, which is also a new development. In follow-up or panel studies, multinomial outcomes may occur within an interval where transition times are not exactly known, or the time of the event is itself discrete. 
Available models for risk prediction of multinomial outcomes with specified risk factors handle only a single response and have not been extended to predict a sequence of events at different stages for discrete-time data. Regressive models for multinomial outcomes are proposed, and a modeling framework is then developed to predict the joint probabilities for a sequence of events. The proposed models link the marginal model and a sequence of conditional models to provide the joint model needed for predicting the probability of a trajectory based on specified covariate patterns. The marginal model uses the outcome variable at baseline, and the models at the subsequent follow-ups provide the estimates of the parameters of the conditional models. The major improvement of the proposed framework is that one needs to fit significantly fewer models than with conditional models such as Markov models. If the repeated outcomes are independent, simpler models can be used, and a goodness-of-fit test of the joint model is needed to assess model performance. The proposed goodness-of-fit test for the joint model is obtained by linking the marginal and conditional models. The test for independence uses marginal models for each repeated outcome. A simulation study and an application using real data demonstrate the usefulness and illustrate the performance of these tests. For ordinal outcomes from longitudinal studies, a regressive proportional odds model is proposed, along with a regressive partial proportional odds model for cases where the proportional odds assumption is violated. A framework is then developed to predict joint probabilities for a sequence of ordinal outcomes. The major improvement of the proposed model is that only one model is required for each repeated outcome, compared to the sequence of conditional models required by approaches such as Markov models. Results from these two models are compared with those from the proposed regressive multinomial logistic model. 
Tests for goodness-of-fit and for independence are also presented. The proposed models provide conditional estimates for each stage in the process, and the joint model can be obtained for any order to predict the risk of a sequence of events. The proposed regressive partial proportional odds model and regressive multinomial model performed better than the regressive proportional odds model when the proportional odds assumption is violated. Simulation studies showed satisfactory performance of the proposed regressive models for ordinal outcomes. All the proposed models and the risk prediction framework for both continuous-time and discrete-time data are new developments. The major improvement of the proposed model is that it reduces over-parameterization. One can easily add interaction terms among previous outcomes and predictors in the proposed framework, which may provide a better understanding of the underlying process and the relationships between outcomes and risk factors. Using the developed framework, modeling and risk prediction for a sequence of events can be performed in many fields of study, such as epidemiology, public health, survival analysis, genetics, reliability, and environmental studies. This model would be very useful for analyzing big data. Existing software can be used for model fitting and risk prediction of a sequence of events. |
en_US |