🕷️ Crawler Inspector

URL Lookup

Direct Parameter Lookup

Raw Queries and Responses

1. Shard Calculation

Query:
Response:
Calculated Shard: 106 (from laksa081)

2. Crawled Status Check

Query:
Response:

3. Robots.txt Check

Query:
Response:

4. Spam/Ban Check

Query:
Response:

5. Seen Status Check

ℹ️ Skipped - page is already crawled

📄 INDEXABLE
✅ CRAWLED (2 days ago)
🤖 ROBOTS ALLOWED

Page Info Filters

| Filter | Status | Condition | Details |
| --- | --- | --- | --- |
| HTTP status | PASS | download_http_code = 200 | HTTP 200 |
| Age cutoff | PASS | download_stamp > now() - 6 MONTH | 0.1 months ago |
| History drop | PASS | isNull(history_drop_reason) | No drop reason |
| Spam/ban | PASS | fh_dont_index != 1 AND ml_spam_score = 0 | ml_spam_score=0 |
| Canonical | PASS | meta_canonical IS NULL OR = '' OR = src_unparsed | Not set |
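
The five filter conditions listed above can be read as one conjunction: a page counts as indexable only if every row passes. The sketch below restates them in Python for illustration. The function name `page_info_filters`, the 183-day approximation of "6 MONTH", the "current" time, and the `fh_dont_index` value are assumptions; only the column names and conditions come from the table.

```python
from datetime import datetime, timedelta
from typing import Dict, Optional


def page_info_filters(
    download_http_code: int,
    download_stamp: datetime,
    history_drop_reason: Optional[str],
    fh_dont_index: int,
    ml_spam_score: float,
    meta_canonical: Optional[str],
    src_unparsed: str,
    now: datetime,
) -> Dict[str, bool]:
    """Evaluate each filter row and return PASS (True) or FAIL (False) per filter."""
    six_months = timedelta(days=183)  # assumption: "6 MONTH" approximated in days
    return {
        "HTTP status": download_http_code == 200,
        "Age cutoff": download_stamp > now - six_months,
        "History drop": history_drop_reason is None,
        "Spam/ban": fh_dont_index != 1 and ml_spam_score == 0,
        "Canonical": meta_canonical is None or meta_canonical in ("", src_unparsed),
    }


# Values taken from the page details where shown; the rest are assumed.
result = page_info_filters(
    download_http_code=200,
    download_stamp=datetime(2026, 4, 12, 0, 1, 44),
    history_drop_reason=None,
    fh_dont_index=0,  # assumed: the page only reports ml_spam_score
    ml_spam_score=0,
    meta_canonical=None,
    src_unparsed="com,theanalysisfactor!www,/zero-inflated-poisson-models-for-count-outcomes/ s443",
    now=datetime(2026, 4, 14, 0, 0, 0),  # assumed "now", two days after the crawl
)
indexable = all(result.values())
print(result, indexable)
```

Returning one boolean per filter, rather than a single flag, mirrors the table layout and makes it obvious which condition caused a failure.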

Page Details

| Property | Value |
| --- | --- |
| URL | https://www.theanalysisfactor.com/zero-inflated-poisson-models-for-count-outcomes/ |
| Last Crawled | 2026-04-12 00:01:44 (2 days ago) |
| First Indexed | 2017-12-21 06:27:10 (8 years ago) |
| HTTP Status Code | 200 |
| Meta Title | Zero-Inflated Poisson Models for Count Outcomes - The Analysis Factor |
| Meta Description | null |
| Meta Canonical | null |
Boilerpipe Text
There are quite a few types of outcome variables that will never meet an ordinary linear model’s assumption of normally distributed residuals. A non-normal outcome variable can have normally distributed residuals, but it does need to be continuous, unbounded, and measured on an interval or ratio scale. Categorical outcome variables clearly don’t fit this requirement, so it’s easy to see that an ordinary linear model is not appropriate.

Neither do count variables. It’s less obvious, because they are measured on a ratio scale, so it’s easier to think of them as continuous, or close to it. But they’re neither continuous nor unbounded, and this really affects assumptions.

Continuous variables measure how much. Count variables measure how many. Count variables can’t be negative (0 is the lowest possible value), and they’re often skewed, so severely that 0 is by far the most common value. And they’re discrete, not continuous. All those jokes about the average family having 1.3 children have a ring of truth in this context.

Count variables often follow a Poisson distribution or one of its related distributions. The Poisson distribution assumes that each count is the result of the same Poisson process: a random process in which each counted event is independent and equally likely. If this count variable is used as the outcome of a regression model, we can use Poisson regression to estimate how predictors affect the number of times the event occurred.

But the Poisson model has very strict assumptions. One that is often violated is that the mean equals the variance. When the variance is too large because there are many 0s as well as a few very high values, the negative binomial model is an extension that can handle the extra variance.

But sometimes it’s just a matter of having more zeros than a Poisson would predict. In this case, a better solution is often the Zero-Inflated Poisson (ZIP) model. (And when extra variation occurs too, its close relative is the Zero-Inflated Negative Binomial model.)

ZIP models assume that some zeros occurred by a Poisson process, but others were not even eligible to have the event occur. So there are two processes at work: one that determines if the individual is even eligible for a non-zero response, and the other that determines the count of that response for eligible individuals.

The tricky part is that either process can result in a 0 count. Since you can’t tell which 0s were eligible for a non-zero count, you can’t tell which zeros were results of which process.

The ZIP model fits, simultaneously, two separate regression models. One is a logistic or probit model that models the probability of being eligible for a non-zero count. The other models the size of that count.

Both models can use the same predictor variables, but estimate their coefficients separately. So the predictors can have vastly different effects on the two processes.

But a ZIP model requires it be theoretically plausible that some individuals are ineligible for a count. For example, consider a count of the number of disciplinary incidents in a day in a youth detention center. True, there may be some youth who would never instigate an incident, but the unit of observation in this case is the center. It is hard to imagine a situation in which a detention center would have no possibility of any incidents, even if they didn’t occur on some days.

Compare that to the number of alcoholic drinks consumed in a day, which could plausibly be fit with a ZIP model. Some participants do drink alcohol, but will have consumed 0 that day, by chance. But others just do not drink alcohol, so will never have a non-zero response. The ZIP model can determine which predictors affect the probability of being an alcohol consumer and which predictors affect how many drinks the consumers consume.

They may not be the same predictors for the two models, or they could even have opposite effects on the two processes.
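
The two-process description above can be made concrete with a small simulation. This is a minimal sketch with no predictors, not the article's own code: `pi` (the probability of being ineligible) and `lam` (the Poisson mean for eligible individuals) are illustrative names and values chosen here, and only the Python standard library is used. A zero can come from either process, so P(Y = 0) = pi + (1 - pi) * exp(-lam), while for k > 0, P(Y = k) = (1 - pi) * Poisson(k; lam).

```python
import math
import random


def zip_pmf(k: int, pi: float, lam: float) -> float:
    """P(Y = k) under a zero-inflated Poisson.

    pi  : probability of being ineligible (a structural zero)
    lam : Poisson mean for eligible individuals
    """
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    if k == 0:
        # A zero can come from either process.
        return pi + (1.0 - pi) * poisson
    return (1.0 - pi) * poisson


def draw_poisson(lam: float, rng: random.Random) -> int:
    """Draw one Poisson(lam) value with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, product = 0, rng.random()
    while product > threshold:
        k += 1
        product *= rng.random()
    return k


def draw_zip(pi: float, lam: float, rng: random.Random) -> int:
    """Draw one ZIP value: eligibility first, then the count."""
    if rng.random() < pi:
        return 0  # ineligible: always zero
    return draw_poisson(lam, rng)  # eligible: may still be zero by chance


rng = random.Random(0)
pi, lam = 0.4, 2.0  # illustrative values only
sample = [draw_zip(pi, lam, rng) for _ in range(20_000)]

observed_zero_share = sample.count(0) / len(sample)
zip_zero_share = zip_pmf(0, pi, lam)
# Zero share of an ordinary Poisson with the same overall mean, (1 - pi) * lam.
plain_poisson_zero_share = math.exp(-(1.0 - pi) * lam)
print(observed_zero_share, zip_zero_share, plain_poisson_zero_share)
```

The simulated share of zeros should sit close to the ZIP formula and well above what an ordinary Poisson with the same mean would predict, which is the "more zeros than a Poisson would predict" situation the article describes.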
Markdown
# Zero-Inflated Poisson Models for Count Outcomes

by Karen Grace-Martin [10 Comments](https://www.theanalysisfactor.com/zero-inflated-poisson-models-for-count-outcomes/#comments)
Tagged With: [Count data](https://www.theanalysisfactor.com/tag/count-data/), [Discrete Counts](https://www.theanalysisfactor.com/tag/discrete-counts/), [Poisson Regression](https://www.theanalysisfactor.com/tag/poisson-regression/), [Zero Inflated](https://www.theanalysisfactor.com/tag/zero-inflated/)

### Related Posts

- [A Few Resources on Zero-Inflated Poisson Models](https://www.theanalysisfactor.com/a-few-resources-on-zero-inflated-poisson-models/)
- [Poisson Regression Analysis for Count Data](https://www.theanalysisfactor.com/poisson-regression-analysis-for-count-data/)
- [Member Event: Count Models Accelerator](https://www.theanalysisfactor.com/count-models-accelerator/)
- [When Linear Models Don’t Fit Your Data, Now What?](https://www.theanalysisfactor.com/when-dependent-variables-are-not-fit-for-glm-now-what/)

## Reader Interactions

### Comments

1. Yitagesu Habtu says [June 5, 2020 at 1:31 pm](https://www.theanalysisfactor.com/zero-inflated-poisson-models-for-count-outcomes/#comment-356302)

   How can I perform “Zero inflated Poisson regression, Zero inflated negative binomial regression”, the Generalized Poisson Regression, using SPSS?

2. Jon says [November 17, 2016 at 12:55 pm](https://www.theanalysisfactor.com/zero-inflated-poisson-models-for-count-outcomes/#comment-183149)

   I have been searching everywhere to answer what is hopefully a simple question. Maybe you have an answer? The question is: Are the excess zeroes included when calculating the second model? To say it another way: If the logit model predicts you are not a candidate for a non-zero count, are your data included in the subsequent Poisson model?

3. Isha says [October 10, 2016 at 10:34 am](https://www.theanalysisfactor.com/zero-inflated-poisson-models-for-count-outcomes/#comment-181614)

   Hi Karen, I am a student. I am doing my thesis on GLM for a non-life insurance data set. I am using a simulated and a real motor insurance dataset to estimate number of claims, claims severity and total payment. To model number of claims, which is a count process, I first tried Poisson regression and found it inappropriate because of overdispersion, and then I tried Negative Binomial, which gives a scale close to 1. I have a question: my dummy data set contains 35 records, and 3 of the 35 have zero for the number of claims variable. These 3 observations have 0 claims, 0 premium and they have some amount for actual premium. So do I need to use ZINB, or is the NB model fine for this analysis? I have gone through your videos and The Analysis Factor pages; they are very helpful for understanding Poisson and NB. But for ZINB and ZIP I am still confused. Can you give me some suggestions on this topic? Thank you very much. Best Regards, Isha Kamboj

4. Zahid Javed says [July 31, 2016 at 1:26 am](https://www.theanalysisfactor.com/zero-inflated-poisson-models-for-count-outcomes/#comment-177626)

   Can you please tell me about the multilevel Poisson regression?

5. aziz says [May 27, 2015 at 9:46 am](https://www.theanalysisfactor.com/zero-inflated-poisson-models-for-count-outcomes/#comment-154839)

   Dear Karen, how can I know the data is zero inflated apart from descriptive statistics? Is there any test statistic to confirm zero inflation? Thanks

   - [Karen](http://www.analysisfactor.com/) says [June 3, 2015 at 9:13 am](https://www.theanalysisfactor.com/zero-inflated-poisson-models-for-count-outcomes/#comment-155197)

     Really, the only way is to run a model with and without the zero inflation and check if the ZI model fits better.

6. Cara says [March 27, 2014 at 3:52 pm](https://www.theanalysisfactor.com/zero-inflated-poisson-models-for-count-outcomes/#comment-13275)

   Hi Karen, I have started following your webinars and have found your information to be very helpful and remarkably straightforward compared to many other sources out there. I was wondering about fitting ZINB and ZIP models as compared to their negative binomial and Poisson counterparts. I have data that could conceivably work with a ZINB model and indeed the goodness of fit metrics indicate that it works better than a standard negative binomial (AIC, Vuong’s test, Likelihood Ratio Test). However, my predictor variables are not significant in predicting the logistic part of the model. How is it possible for the model to be a better fit if my predictors cannot predict the added part of the model? Are there any models that use a negative binomial regression with an added-on function that is just a constant probability of zero? Is that what I should be considering? Thanks!

   - [Karen](http://www.analysisfactor.com/) says [April 4, 2014 at 9:36 am](https://www.theanalysisfactor.com/zero-inflated-poisson-models-for-count-outcomes/#comment-13542)

     Hi Cara, Perhaps a Hurdle model is what you need? That is indeed puzzling. <http://www.polmeth.wustl.edu/media/Paper/zorn96.pdf>

7. Brian Chen says [April 1, 2013 at 11:02 pm](https://www.theanalysisfactor.com/zero-inflated-poisson-models-for-count-outcomes/#comment-5404)

   Dear Karen, How’ve you been? I am one of your students who are very thankful for your excellent online stats courses. I have a question from this article. Based on your words below: “But the Poisson model has very strict assumptions. One that is often violated is that the mean equals the variance. When the variance is too large because there are many 0s as well as a few very high values, the negative binomial model is an extension that can handle the extra variance…” Due to the different level or degree of variance, are there rules of thumb that can help researchers choose a model for count data (particularly, I am most interested in hospital visits or inpatient/outpatient visits): (1) Poisson model, (2) Negative binomial model, (3) Zero-inflated Poisson model, (4) Other model that can fit hospital visits well? Thank you very much. Best wishes, Brian

   - [Karen](http://www.analysisfactor.com/) says [April 2, 2013 at 5:39 pm](https://www.theanalysisfactor.com/zero-inflated-poisson-models-for-count-outcomes/#comment-5420)

     Hi Brian, Thanks for the kind words about my workshops. I’m glad you found it helpful. There aren’t really rules of thumb. There is a scale parameter that can either be set to 1 (this is often the default) or allowed to be estimated. Allow it to be estimated in a Poisson. If it’s much greater than 1, you are violating Poisson assumptions and need to use either NB or ZIP (or even ZINB). Also keep an eye on model fit statistics like -2LL. As for whether to fit NB or ZIP, take a look at the frequency of zeros. If you’re running NB and still have many more zeros than the model is accounting for (this will happen if, say, 50% or more of observations are 0, but I’m making up 50%; it could be even lower), then you’ll need a zero-inflated model. There are a few good books on this, one by J Scott Long and one by Joseph Hilbe. I would suggest reading as much as you can. These get tricky. Karen
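
Karen's reply to Brian suggests two checks: whether the scale (dispersion) is much greater than 1, and whether there are more zeros than the model accounts for. The sketch below is a simplified, predictor-free stand-in for those checks, not the estimated scale parameter from a fitted regression. It compares the sample variance with the sample mean, and the observed share of zeros with the share a Poisson with that mean would produce. The function name `count_diagnostics` and the `visits` data are made up for illustration.

```python
import math
from statistics import mean, variance
from typing import Dict, List


def count_diagnostics(counts: List[int]) -> Dict[str, float]:
    """Marginal checks for overdispersion and excess zeros in count data."""
    m = mean(counts)
    v = variance(counts)  # sample variance (n - 1 denominator)
    return {
        "mean": m,
        "variance": v,
        # Poisson implies variance == mean, so a ratio well above 1 hints at overdispersion.
        "variance_to_mean": v / m,
        "observed_zero_share": counts.count(0) / len(counts),
        # Share of zeros an intercept-only Poisson with this mean would predict.
        "poisson_zero_share": math.exp(-m),
    }


# Made-up visit counts: many zeros plus a few large values.
visits = [0] * 12 + [1, 1, 2, 2, 3, 4, 6, 9]
diagnostics = count_diagnostics(visits)
for name, value in diagnostics.items():
    print(f"{name}: {value:.3f}")
```

These numbers are only a first look. As the replies above note, the decision between Poisson, NB, ZIP, and ZINB ultimately rests on fitting the candidate models and comparing their fit.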
Copyright © 2008–2026 [The Analysis Factor, LLC](https://www.theanalysisfactor.com/). All rights reserved.
| Shard | 106 (laksa) |
| Root Hash | 13704746990480630306 |
| Unparsed URL | com,theanalysisfactor!www,/zero-inflated-poisson-models-for-count-outcomes/ s443 |
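
The "Unparsed URL" value looks like a reordered form of the page URL: the domain labels reversed and comma-separated, a `!` before the subdomain, then the path, then a scheme-and-port suffix. The sketch below reproduces that one observed string and nothing more. The layout is inferred from this single example, so the function name `unparsed_key`, the "last two labels" rule for the domain, and the `s443` reading (https on port 443) are all assumptions rather than the crawler's documented format.

```python
from urllib.parse import urlsplit


def unparsed_key(url: str) -> str:
    """Build a key shaped like the 'Unparsed URL' field (format inferred, not documented)."""
    parts = urlsplit(url)
    if parts.scheme != "https" or parts.hostname is None:
        # Only the https form appears on this page; other schemes are unknown.
        raise ValueError("only https URLs are covered by this sketch")
    labels = parts.hostname.split(".")
    # Assumption: the registrable domain is the last two labels (not true for all TLDs).
    domain = ",".join(reversed(labels[-2:]))
    # Assumption: remaining labels are the subdomain, also reversed.
    subdomain = ",".join(reversed(labels[:-2]))
    port = parts.port or 443
    return f"{domain}!{subdomain},{parts.path} s{port}"


key = unparsed_key(
    "https://www.theanalysisfactor.com/zero-inflated-poisson-models-for-count-outcomes/"
)
print(key)
```

Query strings, hosts without a subdomain, and multi-part public suffixes are not handled, because the page gives no example of how the real key treats them.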