Collective Animal Behavior from Bayesian Estimation and Probability Matching by Alfonso Perez-Escudero and Gonzalo G. de Polavieja

Overview of this paper.

Animals make decisions based on both local sensory information and social information from their neighbors (Couzin 2009). One common goal of these decisions is to choose the environmental locations that are best for foraging. Decision making by individuals that collect exclusively non-social information (e.g., availability of food, threats from predators, or shelter) has been modeled extensively using both heuristic and Bayesian inference frameworks (Bogacz et al 2006). In some instances, optimal decision strategies can be identified by applying Bayesian inference methods to relate accumulated evidence to the underlying truth of the environment. However, principled models of decision making that use both social and non-social information have yet to be fully developed. Most collective decision making models are heuristic equations that are then fit to data, ignoring essential components of probabilistic inference.


This paper aims to develop a probabilistic model of decision making in individuals using both local information and knowledge of their neighbors’ behaviors. For the majority of the paper, the authors focus on decision making between two options. This is meant to model recent experiments on stickleback foraging between two feeding sites (Ward et al 2008). The framework can be extended to other contexts, which the authors also consider, including more than two options and history-dependent group decisions. They start with the assumption that each animal computes the probability that option Y is the “best” one (e.g., safest or highest yielding) given non-social information C and social information B.


Bayes’ theorem can then be used to compute

P(Y|C,B) = \frac{\displaystyle P(B|Y,C)P(Y|C)}{\displaystyle P(B|X,C)P(X|C)+P(B|Y,C)P(Y|C)}

A major insight of the paper is that dividing both the numerator and the denominator by the numerator separates the effects of non-social information from those of social information:

P(Y|C,B) = \frac{\displaystyle 1}{\displaystyle 1+aS}

where a=P(X|C)/P(Y|C) is the ratio of the probabilities of the two options given the non-social information alone, and S=P(B|X,C)/P(B|Y,C) is a likelihood ratio that contains all the social information.
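As a quick numerical check of this rearrangement, the following minimal Python sketch computes P(Y|C,B) both directly from Bayes’ theorem and from the 1/(1+aS) form. The function names and the numbers are illustrative assumptions, not values from the paper, and the sketch assumes exactly one of the two options is best, so that P(X|C)+P(Y|C)=1.

```python
def posterior_direct(p_b_given_x, p_b_given_y, p_x_given_c):
    """P(Y|C,B) from Bayes' theorem, for two exclusive options X and Y."""
    p_y_given_c = 1.0 - p_x_given_c  # assumption: exactly one option is best
    numerator = p_b_given_y * p_y_given_c
    return numerator / (p_b_given_x * p_x_given_c + numerator)


def posterior_factored(p_b_given_x, p_b_given_y, p_x_given_c):
    """The same quantity written as 1 / (1 + a * S)."""
    a = p_x_given_c / (1.0 - p_x_given_c)  # non-social term P(X|C)/P(Y|C)
    S = p_b_given_x / p_b_given_y          # social term P(B|X,C)/P(B|Y,C)
    return 1.0 / (1.0 + a * S)


# Hypothetical numbers, chosen only for illustration.
direct = posterior_direct(0.2, 0.6, 0.5)
factored = posterior_factored(0.2, 0.6, 0.5)
print(direct, factored)
```

The two functions should agree up to floating-point rounding, which is all the rearrangement claims.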

Now, one issue with the social information term S is that it is composed of behavioral information from all the other animals, and these behaviors are likely to be correlated. However, the authors assume for simplicity that the focal individual ignores these correlations. It would be interesting to examine what is missed by making this independence assumption. In general, independence assumptions allow joint densities to be split into products, P(x_1,x_2)=P(x_1)P(x_2), so writing B=\{b_i\}_{i=1}^N for the behaviors of the N other animals, we have

S=\prod_{i=1}^N\frac{\displaystyle P(b_i|X,C)}{\displaystyle P(b_i|Y,C)}
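The product above can be sketched in a few lines of Python. This is a minimal illustration under the independence assumption. The function name `social_term` and the example likelihoods are hypothetical, and summing log-ratios is a numerical convenience rather than anything taken from the paper.

```python
import math


def social_term(likelihood_pairs):
    """S = prod_i P(b_i|X,C) / P(b_i|Y,C), assuming independent neighbors.

    likelihood_pairs: iterable of (P(b_i|X,C), P(b_i|Y,C)) tuples.
    Log-ratios are summed to avoid underflow when the group is large.
    """
    log_s = sum(math.log(px) - math.log(py) for px, py in likelihood_pairs)
    return math.exp(log_s)


# Three hypothetical neighbors: two behaviors favor Y, one favors X.
pairs = [(0.2, 0.4), (0.2, 0.4), (0.4, 0.2)]
S = social_term(pairs)
print(S)
```

A value of S below 1 shifts the posterior toward Y, and a value above 1 shifts it toward X.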

For the majority of the paper, the authors focus on three specific behaviors: \beta_x, choosing site x; \beta_y, choosing site y; and \beta_u, remaining undecided. This means that the main parameters of the model are the likelihood ratios

s_k = \frac{\displaystyle P(\beta_k|X,C)}{\displaystyle P(\beta_k|Y,C)}

indicating how informative each behavior is about the quality of a particular site. Since the model has so few parameters, it is straightforward to fit it to existing data.
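Under the independence assumption, only the number of animals showing each behavior matters. As a short worked step (the counts n_x, n_y, and n_u are notation assumed here for illustration), grouping the factors of S by behavior gives

S = s_x^{n_x}\, s_y^{n_y}\, s_u^{n_u}

If one further assumes that undecided animals are uninformative (s_u = 1) and that the two sites are treated symmetrically (s_y = 1/s_x), this reduces to S = s_x^{\,n_x - n_y}, so only the difference in counts enters.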

[Figure: data_perez]

The authors specifically fit the model to laboratory data on sticklebacks performing a binary choice task (Ward et al, 2008), where each option is equally good. In this case, the probability of a fish choosing site y simplifies considerably:

P_y = \left( 1 + s^{- \Delta n} \right)^{-1},

so there is only one free parameter s, which controls the strength of social interaction, and \Delta n = n_y - n_x is the number of fish that have already chosen y minus the number that have already chosen x. For large values of s, the population very quickly aligns itself with one of the two options, since animals make choices probabilistically based on P_y. Asymmetries are introduced in the experimental data by placing replica fish at one or both of the possible sites, and this initial condition influences the choices of the remaining fish. Remarkably, the single-parameter model fits the data quite well, as shown in the figure above.
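The choice rule can be simulated directly. The sketch below is a minimal illustration, not the authors’ fitting code: it assumes fish decide one at a time, that replica fish count as already-decided individuals, and that \Delta n is the number at y minus the number at x. The value s=2.5 and the group size are arbitrary illustrative choices.

```python
import random


def p_choose_y(delta_n, s):
    """P_y = 1 / (1 + s**(-delta_n)), with delta_n = n_y - n_x."""
    return 1.0 / (1.0 + s ** (-delta_n))


def simulate_group(n_fish, s, replicas_x=0, replicas_y=0, rng=None):
    """Fish decide one at a time, each choosing y with probability P_y.

    Replicas are treated as individuals that have already decided.
    Returns (n_x, n_y) counting the real fish only.
    """
    rng = rng or random.Random()
    n_x = n_y = 0
    for _ in range(n_fish):
        delta_n = (n_y + replicas_y) - (n_x + replicas_x)
        if rng.random() < p_choose_y(delta_n, s):
            n_y += 1
        else:
            n_x += 1
    return n_x, n_y


rng = random.Random(0)  # fixed seed so the run is repeatable
n_x, n_y = simulate_group(n_fish=8, s=2.5, replicas_y=1, rng=rng)
print(n_x, n_y)
```

Because each fish samples its choice with probability P_y rather than always taking the more probable option, repeated runs give a distribution of outcomes rather than a single answer.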

From here, the paper goes on to explore more nuances in the model, such as the case where one site is noticeably better than the other or where some replica fish are more attractive than others. All these effects can be captured by the model, which fits the data set from Ward et al (2008) fairly well. In general, social interactions in the model set up a bistable system that tends to one of two steady states, in each of which almost all animals choose the same site. This should not be surprising, since the function P_y has a very familiar sigmoidal form often taken as an interaction function in neural network models (Wilson and Cowan, 1972) and ferromagnetic systems. Again, these models tend to admit multistable solutions.
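The bistability can be seen without any random sampling by computing the exact distribution of final counts. The sketch below assumes the symmetric one-parameter rule P_y = 1/(1+s^{-\Delta n}) with \Delta n = n_y - n_x, sequential decisions, and no replicas. The function name and the values of s are illustrative assumptions.

```python
def final_count_distribution(n_animals, s):
    """Exact distribution of n_y after n_animals sequential choices.

    Each animal picks y with probability 1 / (1 + s**(-(n_y - n_x))).
    Returns a list dist where dist[k] = P(n_y == k).
    """
    dist = [1.0]  # before any choice, n_y = 0 with certainty
    for decided in range(n_animals):
        new = [0.0] * (decided + 2)
        for n_y, prob in enumerate(dist):
            n_x = decided - n_y
            p_y = 1.0 / (1.0 + s ** (-(n_y - n_x)))
            new[n_y + 1] += prob * p_y       # next animal chooses y
            new[n_y] += prob * (1.0 - p_y)   # next animal chooses x
        dist = new
    return dist


weak = final_count_distribution(10, 1.0)    # s = 1: neighbors are ignored
strong = final_count_distribution(10, 3.0)  # hypothetical strong interaction
print([round(p, 3) for p in weak])
print([round(p, 3) for p in strong])
```

With s=1 the result is a binomial distribution peaked at an even split. With s well above 1 the probability mass moves to the two ends, where nearly all animals are at the same site.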

One issue that the authors explore near the end of the paper is the effect of dependencies among behaviors on the final distribution of choices. Here, animals making subsequent decisions take into account the history of the choices that came before them. As a result, animals may actually pay more attention to dissenting individuals in the minority than to the majority of individuals that are aligned with the prevailing opinion. The general idea is that dissent could indicate some insight that the dissenting animal has and the others lack. The authors’ exploration of this phenomenon is cursory, but it seems like there is room to explore such extensions in more detail. For instance, animals could weight the opinions of their neighbors based on how recently those decisions were made. An analysis of the influence of the order of decisions on the ultimate group decision would also be a way to generate a more specific link between model and data.

Note: The model the authors develop is closely linked to Polya’s urn, a toy model of how inequalities in distributions are magnified over time. Essentially, the urn contains a collection of balls of different colors (say black and white). A ball is drawn randomly from the urn and replaced with two balls of that color. This step is then repeated. Thus, an asymmetry in the number of balls of each color makes the more prevalent color more likely to be selected, which in turn increases that color’s dominance. The probability matching in the Perez-Escudero and Polavieja (2011) model plays the role of the drawing and replacing process. The distribution of balls is effectively the probability distribution of selecting one of the two choices.
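The urn process described in this note can be sketched as follows. This is a minimal illustration with hypothetical starting counts and number of draws. One difference worth keeping in mind is that the urn’s draw probability is linear in the proportion of balls, whereas P_y in the paper’s model is sigmoidal in the count difference.

```python
import random


def polya_urn(n_black, n_white, n_draws, rng=None):
    """Draw a ball at random, then return it with one more of its color."""
    rng = rng or random.Random()
    for _ in range(n_draws):
        p_black = n_black / (n_black + n_white)
        if rng.random() < p_black:
            n_black += 1
        else:
            n_white += 1
    return n_black, n_white


rng = random.Random(1)  # fixed seed so the runs are repeatable
finals = [polya_urn(1, 1, 200, rng) for _ in range(5)]
for black, white in finals:
    print(black / (black + white))
```

Each run starts from the same balanced urn, yet the final fraction of black balls can differ widely between runs, which is the sense in which early random asymmetries get locked in.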


Pérez-Escudero, A., & De Polavieja, G. G. (2011). Collective animal behavior from Bayesian estimation and probability matching. PLoS Comput Biol, 7(11), e1002282.

Ward, A. J., Sumpter, D. J., Couzin, I. D., Hart, P. J., & Krause, J. (2008). Quorum decision-making facilitates information transfer in fish shoals. Proceedings of the National Academy of Sciences, 105(19), 6948-6953.

Bogacz, R., Brown, E., Moehlis, J., Holmes, P., & Cohen, J. D. (2006). The physics of optimal decision making: a formal analysis of models of performance in two-alternative forced-choice tasks. Psychological Review, 113(4), 700.

