Abstract: The vast majority of commonly invoked statistical properties are derived for a fixed, a priori known sample size. In both univariate and multivariate statistics, familiar results then follow, such as the consistency, asymptotic normality, and efficiency of the sample average for the mean parameter when the data follow either a normal distribution or a member of a large class of non-normal distributions. Matters change when the sample size itself becomes random. This can take various forms: the sample size may or may not depend on the data collected and, when it does, it can be governed by either a deterministic or a probabilistic rule. Key situations include sequential trials, missing data, and a completely random sample size. While a lot of work has been done in this area, some of it relatively early (Grambsch 1983; Barndorff-Nielsen and Cox 1984), it is insightful to place these and related settings into a general joint-modeling-based framework and to derive generic results. From there, both parametric (likelihood-based) and semi-parametric inferences can be drawn. It will be shown that counterintuitive results may follow, such as the fact that the sample average may exhibit small-sample bias in many cases and that, even when it is unbiased, as with a completely random sample size, it is not optimal, and no uniform optimum exists. We will demonstrate that such results depend critically on key attributes, such as the (non-)ancillarity of the sample size and the fact that the sample sum combined with the sample size is never a so-called complete sufficient statistic, as long as at least two different sample sizes have a non-zero probability of occurring. Our results have direct implications for estimation after group sequential trials. Moreover, there are ramifications for other settings, such as random cluster sizes, censored time-to-event data, and the joint modeling of longitudinal and time-to-event data.
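The small-sample bias of the sample average under a data-dependent sample size can be illustrated with a minimal simulation sketch. The two-stage design below, with stage sizes n1 = n2 = 5 and a stopping threshold of 0, is a hypothetical choice for illustration and is not taken from the paper; it merely mimics a simple group sequential rule.

```python
import numpy as np

rng = np.random.default_rng(2024)

# Hypothetical two-stage design (an assumption for illustration):
# draw n1 observations from N(mu, 1); stop if their mean exceeds 0,
# otherwise draw n2 further observations. The reported estimate is
# the sample average over the realized, data-dependent sample size.
mu, n1, n2, reps = 0.0, 5, 5, 200_000
stage1 = rng.normal(mu, 1.0, size=(reps, n1))
stage2 = rng.normal(mu, 1.0, size=(reps, n2))

m1 = stage1.mean(axis=1)
stop = m1 > 0.0                                  # data-dependent stopping
m_full = (n1 * m1 + n2 * stage2.mean(axis=1)) / (n1 + n2)
estimate = np.where(stop, m1, m_full)

print(f"true mean mu      : {mu:.3f}")
print(f"mean of estimates : {estimate.mean():.4f}")  # about 0.09, not 0
```

Because favorable early results freeze the estimate at the small first-stage size while unfavorable ones are diluted by additional data, the expected sample average here is roughly 0.09 rather than the true value 0, consistent with the bias phenomenon described in the abstract.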