The $ATT$, or Average Treatment Effect on the Treated, is defined as:

$$ ATT = E[Y(1) - Y(0) | T=1] $$

for potential outcomes $Y(1), Y(0)$ and treatment indicator $T \in \{0,1\}$. It is my understanding that the above is an estimand and that, in observational studies, the $ATT$ is not in general equal to the $ATE$, the average treatment effect $E[Y(1) - Y(0)]$. The two are, however, equal in randomized studies.
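To make the comparison concrete, here is a minimal simulation sketch of the two quantities under randomized and non-randomized assignment. Everything in it is an assumption made up for illustration: the covariate `x`, the effect `1 + x`, the logistic assignment rule, and the large `n` standing in for a population. Both potential outcomes are visible only because the data are simulated.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000  # large n, used here as a stand-in for the population

# Hypothetical data-generating process (assumed, not from any real study):
# x drives both the individual treatment effect and, in the observational
# case, the probability of being treated.
x = rng.normal(size=n)
y0 = x + rng.normal(size=n)      # potential outcome without treatment
y1 = y0 + 1.0 + x                # potential outcome with treatment
effect = y1 - y0                 # individual effect Y(1) - Y(0) = 1 + x

# Randomized assignment: T is independent of (Y(0), Y(1)).
t_rand = rng.random(n) < 0.5

# Observational assignment: units with larger x are more likely treated.
p_treat = 1.0 / (1.0 + np.exp(-2.0 * x))
t_obs = rng.random(n) < p_treat

ate = effect.mean()               # E[Y(1) - Y(0)]
att_rand = effect[t_rand].mean()  # E[Y(1) - Y(0) | T = 1], randomized
att_obs = effect[t_obs].mean()    # E[Y(1) - Y(0) | T = 1], selected

print(f"ATE:                 {ate:.3f}")
print(f"ATT (randomized):    {att_rand:.3f}")
print(f"ATT (observational): {att_obs:.3f}")
```

Under these assumptions the randomized $ATT$ should land close to the $ATE$, while the observational $ATT$ should be noticeably larger, because the treated units are the ones with larger effects.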

The $ATT$ seems to measure the average effect of the treatment on an individual drawn at random from the treated population, rather than on an arbitrary member of the whole population. It therefore seems to depend on the sample itself. Why, then, is it talked about as an estimand, which hints at the population level?

It seems to me that the reason the $ATT$ is not equal to the $ATE$ is that treatment assignment is not sufficiently randomized. This in turn has nothing to do with the original groups themselves, but rather with what led to an individual being put into "treatment". This selection seems to be a property of how the sample came about. Why, then, is the $ATT$ a population-level estimand?