Understanding whether a variable’s missingness from a dataset is related to the underlying value of the data is a key concept in the field of missing data analysis. We distinguish three broad categories: missing completely at random (MCAR), missing at random (MAR), and missing not at random (MNAR). In his book Statistical Rethinking, McElreath1 gives an amusing example to illustrate this concept: he considers variants of a dog eating homework and how the dog chooses - if at all - to eat the homework. The examples he give show substantial shifts in observed values, which make for a good illustration of the types of problems you might encounter. A lecture corresponding to the example from the book can be found on YouTube. In this post, I will first briefly review the different missing data mechanisms before implementing McElreath’s examples in SAS.
...