Abstract:
In practice, count data may contain too many zeros, which can cause the zero-augmentation
issue. If such data are analyzed using standard count models, the results can be misleading.
Traditionally, zero-inflated data are analyzed using a statistical model which assumes that
the data arise from a standard count population as well as a degenerate population. Since
zero-truncated count models provide results similar to those obtained from traditional
zero-inflated count models, in this study we propose a marginalized statistical model based
on a mixture of two Poisson components for analyzing zero-inflated longitudinal count data
(clustered and repeated measures data), in order to draw inference regarding the effects of
the covariates on the marginal mean (marginalized over the Poisson components) of the count
response.
To analyze zero-inflated clustered data, our proposed marginalized Poisson-Poisson
(REMPois-Pois) mixture model takes the intra-cluster correlation into account by
incorporating random effects into the models for the marginal mean and the component-1 mean
of the existing marginalized Poisson-Poisson (MPois-Pois) mixture model, which was suggested
for the cross-sectional setup. The parameters of the REMPois-Pois model were estimated using
the maximum likelihood (ML) technique. The Gauss–Hermite quadrature (GHQ) technique was
employed to approximate the integrals appearing in the likelihood function. The performance of the
proposed marginalized model was assessed through extensive simulation studies. It was
observed that the proposed model performs well under different simulation scenarios.
Finally, the proposed REMPois-Pois model was illustrated using a nationally representative
data set on the number of antenatal care (ANC) visits extracted from the Bangladesh
Demographic and Health Survey (BDHS), 2014.
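
As an illustration of the Gauss–Hermite quadrature step mentioned above, the following is a
minimal sketch, not the REMPois-Pois likelihood itself. It assumes a plain Poisson model with
a single normally distributed random intercept per cluster; the function names, the linear
predictor `eta`, and the standard deviation `sigma` are hypothetical and chosen only for
illustration. The integral of g(b) against a N(0, sigma^2) density is approximated by
(1/sqrt(pi)) * sum_k w_k * g(sqrt(2) * sigma * x_k), where x_k and w_k are the
Gauss–Hermite nodes and weights.

```python
from math import lgamma

import numpy as np


def gh_expectation(g, sigma, n_nodes=20):
    """Approximate E[g(b)] for b ~ N(0, sigma^2) by Gauss-Hermite quadrature.

    g must accept a 1-D array of b values and return an array of the same length.
    """
    # Nodes and weights for the weight function exp(-x^2).
    x, w = np.polynomial.hermite.hermgauss(n_nodes)
    # Change of variable b = sqrt(2) * sigma * x maps the normal density
    # onto the Hermite weight function.
    b = np.sqrt(2.0) * sigma * x
    return np.sum(w * g(b)) / np.sqrt(np.pi)


def poisson_cluster_likelihood(y, eta, sigma, n_nodes=20):
    """Marginal likelihood of one cluster under an assumed random-intercept
    Poisson model: y_j | b ~ Poisson(exp(eta_j + b)), b ~ N(0, sigma^2).

    This is an illustrative stand-in, not the mixture model of this study.
    """
    y = np.asarray(y, dtype=float)
    eta = np.asarray(eta, dtype=float)
    log_fact = np.array([lgamma(v + 1.0) for v in y])  # log(y_j!)

    def conditional_likelihood(b):
        # Rows index observations in the cluster, columns index quadrature nodes.
        log_mu = eta[:, None] + b[None, :]
        log_p = y[:, None] * log_mu - np.exp(log_mu) - log_fact[:, None]
        # Observations are independent given b, so log-probabilities add.
        return np.exp(log_p.sum(axis=0))

    return gh_expectation(conditional_likelihood, sigma, n_nodes)
```

In an ML fit, a quantity of this kind would be evaluated for every cluster and the sum of
its logarithms maximized over the model parameters; the number of nodes trades accuracy
against computing time.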
To analyze zero-inflated longitudinal repeated measures count data, a marginalized mixture
of two-component longitudinal Poisson models (RMMPois-Pois model) has also been proposed in
this study. Since observations obtained from the same subject are likely to be correlated in
such instances, the regression parameters of the model were estimated by the generalized
quasi-likelihood (GQL) approach, taking the true correlation into account. To examine the
performance of the RMMPois-Pois model, we conducted extensive simulation studies. The
results of the simulation studies indicate that the proposed model performs remarkably well.
To illustrate the RMMPois-Pois model, a real-life repeated count data set on the number of
episodes of a certain side effect, acquired from a pharmaceutical company, was used.