The problem of detecting a few anomalous processes among a large number of M processes is considered. At each time, aggregated observations can be taken from a chosen subset of processes, where the chosen subset conforms to a given binary tree structure. The random observations are i.i.d. over time with a general distribution that may depend on the size of the chosen subset and the number of anomalous processes in the subset. The objective is a sequential search strategy that minimizes the sample complexity (i.e., the expected number of observations which represents detection delay) subject to a reliability constraint. A sequential test that results in a biased random walk on the tree is developed and is shown to be asymptotically optimal in terms of detection accuracy. Furthermore, it achieves the optimal logarithmic-order sample complexity in M provided that the Kullback-Liebler divergence between aggregated observations in the presence and the absence of anomalous processes are bounded away from zero at all levels of the tree structure as M approaches infinity. Sufficient conditions on the decaying rate of the aggregated observations to pure noise under which a sublinear scaling in M is preserved are also identified for the Bernoulli case.