Simplify until your fake-data check works, then add complications until you can figure out where the problem is coming from

Statistical Modeling, Causal Inference, and Social Science 2013-05-28

I received the following email:

I am trying to develop a Bayesian model to represent the process through which individual consumers make online product rating decisions. In my model each individual faces total J product options and for each product option (j) each individual (i) needs to make three sequential decisions:

- First he decides whether to consume a specific product option (j) or not (choice decision)

- If he decides to consume a product option j, then after consumption he decides whether to rate it or not (incidence decision)

- If he decides to rate product j then what finally he decides what rating (k) to assign to it (evaluation decision)

We model this decision sequence in terms of three equations. A binary response variable in the first equation represents the choice decision. Another binary response variable in the second equation represents the incidence decision that is observable only when first selection decision is 1. Finally, an ordered response variable in the third stage captures the extent of preference of individual i for product j. This ordered response (rating) is observed only when both first and second decisions are 1. Each of these response variables in turn are dictated by a corresponding latent variable that is assumed to be linearly related to a set of product characteristics.

I have been able to implement the estimation algorithm in R. However, when I tried to apply the algorithm to a simulated data set with known parameter values it failed to recover the parameters. I was wondering if there is something wrong with the estimation method. I am attaching a document outlining the model and the proposed estimation framework. It would be immense help if you kindly have a look at the model and the proposed estimation strategy and suggest any improvement or modification needed.

I replied: I don’t have time to read this, but just to give some general advice: if your fake-data check does not recover your model, I recommend you simplify your model. Go simpler and simpler until you can get it to work, then from there you can try to identify what is the problem.

The post Simplify until your fake-data check works, then add complications until you can figure out where the problem is coming from appeared first on Statistical Modeling, Causal Inference, and Social Science.