The article seems to analyze a statistical practice from a theoretical perspective.

Using the same perspective, another way to formulate this discussion is:

1. Look at all the data in the universe.

2. Choose some to examine (using a non-random procedure).

3. To those, apply a variable selection procedure (the article argues against stepwise selection and, to some extent, in favor of the Lasso).

4. Fit a model using the variables that remain.

In reality, there are at least two rounds of variable selection. In the first (choosing which data to examine from the universe of data), we choose variables by some procedure that is itself ultimately grounded in data.

This is a catch-22: unless you look at all the data that exists, you are choosing a subset on the basis of all the data that exists.
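To make the two rounds of selection concrete, here is a minimal simulation sketch, not anything taken from the article. Everything in it is an assumption chosen for illustration: the function name `two_stage_selection`, its parameters, a "universe" made entirely of noise variables, a correlation screen standing in for the informal first selection, and a simple "keep the top few" rule standing in for the formal procedure (stepwise or Lasso) in step 3.

```python
import numpy as np


def two_stage_selection(n=100, p_universe=1000, k_examined=50, k_final=5, seed=0):
    """Return in-sample R^2 for (screened variables, blindly chosen variables)."""
    rng = np.random.default_rng(seed)

    # Step 1: the "universe" of candidate variables.
    # Assumption: every variable is pure noise, unrelated to the response y.
    X = rng.standard_normal((n, p_universe))
    y = rng.standard_normal(n)

    # Step 2: informal, non-random choice of what to examine.
    # Here: keep the variables most correlated with y (a data-driven screen).
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    corr = (Xc.T @ yc) / (np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc))
    examined = np.argsort(-np.abs(corr))[:k_examined]

    # Step 3: formal variable selection among the examined variables.
    # Stand-in rule: keep the k_final strongest; a real analysis might use Lasso.
    final = examined[:k_final]

    def r_squared(cols):
        # Step 4: ordinary least squares with an intercept on the chosen columns.
        design = np.column_stack([np.ones(n), X[:, cols]])
        beta, *_ = np.linalg.lstsq(design, y, rcond=None)
        resid = y - design @ beta
        return 1.0 - (resid @ resid) / (yc @ yc)

    # Baseline: the same number of variables, chosen without looking at y.
    blind = rng.choice(p_universe, size=k_final, replace=False)
    return r_squared(final), r_squared(blind)


screened_r2, blind_r2 = two_stage_selection()
print(f"screened: {screened_r2:.2f}, blind: {blind_r2:.2f}")
```

Under these assumptions no variable has any real relationship with the response, so any gap between the two R² values comes from selection alone. A model that accounts only for the formal step 3 would still miss the selection that already happened in step 2.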