The outcomes of our models are not always what we intend them to be. In my opinion, we should pay more attention to attenuation bias (also referred to as measurement error bias or regression dilution). When there is noise in the independent variables (i.e. the features), the parameters of your model will be biased towards 0. You might think this is only an issue when doing inference, but from a machine learning perspective you can suffer the exact same problems, depending on how the predictions are used.
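To see the effect concretely, here is a minimal simulation (all numbers are illustrative assumptions, not from any real dataset): a linear model with true slope 2 is fit once on the clean covariate and once on a noisy measurement of it. With measurement noise as large as the signal, the fitted slope shrinks by half.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
beta = 2.0

x = rng.normal(size=n)             # true covariate
y = beta * x + rng.normal(size=n)  # outcome with unit observation noise

sigma_u = 1.0                      # assumed measurement-noise scale on x
x_noisy = x + rng.normal(scale=sigma_u, size=n)

# OLS slope in the univariate case is cov(x, y) / var(x)
slope_clean = np.cov(x, y)[0, 1] / np.var(x, ddof=1)
slope_noisy = np.cov(x_noisy, y)[0, 1] / np.var(x_noisy, ddof=1)

# Theoretical attenuation factor: var(x) / (var(x) + sigma_u^2) = 0.5 here,
# so slope_clean is close to 2.0 while slope_noisy is close to 1.0.
print(slope_clean, slope_noisy)
```

Note that adding more data does not help: the attenuation factor is a property of the noise, not of the sample size.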
I will explain orthogonal regression, which is traditionally used to correct for attenuation bias. I will use this example to explain why you can only correct for attenuation bias when you have at least some information about the noise in your independent variables.
The second part will be about how we handle attenuation bias in geo experiments at our company. I will first introduce geo experiments for causal inference and explain where attenuation bias can arise. We will then dive into the code to show how to account for this attenuation. To my knowledge, our method is not used outside of our company; resources like Google's white paper do refer to orthogonal regression, but do not mention the simple but powerful solution we propose.
As a bonus, I will explain how this relates to imputation of missing variables, and argue why you should be careful in following Andrew Gelman in applying random regression imputation, because of attenuation bias.