PG: I wonder how well you account for the fact that even though you have 564 data points, you end up with far fewer for each type of subsample. Since startup data is by definition highly non-linear and non-normally distributed, you could end up drawing sweeping conclusions from far too few data points.
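
To make the concern concrete, here is a minimal sketch of how unstable per-subsample averages can be when the underlying data is heavy-tailed. Everything except the total of 564 is an assumption made for illustration: the Pareto distribution, the shape parameter `alpha`, the split into 8 equal groups, and the function name `subsample_summary` are hypothetical and are not taken from the dataset being discussed.

```python
import random
import statistics


def subsample_summary(n_total=564, n_groups=8, alpha=1.2, seed=0):
    """Split synthetic heavy-tailed data into groups and summarize each one.

    Assumptions (illustrative only): outcomes follow a Pareto distribution
    with shape `alpha`, and the 564 points fall into `n_groups` subsamples
    of roughly equal size.
    """
    rng = random.Random(seed)
    # Pareto draws are always >= 1 and occasionally extremely large.
    values = [rng.paretovariate(alpha) for _ in range(n_total)]
    # Deal the points round-robin into n_groups subsamples (~70 each).
    groups = [values[i::n_groups] for i in range(n_groups)]
    return [
        {
            "n": len(group),
            "mean": statistics.fmean(group),
            "median": statistics.median(group),
        }
        for group in groups
    ]


if __name__ == "__main__":
    for i, row in enumerate(subsample_summary()):
        print(f"group {i}: n={row['n']}, "
              f"mean={row['mean']:.2f}, median={row['median']:.2f}")
```

Every group here is drawn from the same distribution, so any gap between group means is pure sampling noise. With roughly 70 points per group, a single outlier can dominate a group's mean, while the medians tend to stay much closer together. That is one reason a difference between subsamples may not reflect a real effect.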