Summary of "Misleading biases in methods for estimating wolf abundance using spatial models", a recent peer-reviewed scientific publication
- Robert Crabtree
- Sep 25
Updated: Oct 19
Abundance (population size) is the primary criterion used for management and conservation decision-making, such as setting hunting quotas and listing or delisting species under the Endangered Species Act (ESA). Abundance and its change over time are also primary inputs to decision-making models, which range from expert opinion to management, statistical, AI, and forecasting models that attempt to understand and predict how a population responds to impacts such as hunting, climate change, disturbance, and habitat alteration. The state of Montana, for example, has recently adopted a new method to determine the abundance of wolves, which directly informs statewide management decisions and projection models, including where and how many wolves can be killed each year, as well as how many wolves are projected to persist without reaching a critically low level. If wolf abundance is unknown or highly uncertain because the method used to determine it is unreliable or invalid, then the output of any model that relies on it is simply an exercise in futility. Such a situation poses a high risk to wolves, particularly when the method reports abundance as both accurate (unbiased) and precise, yet in fact overestimates the number of wolves and is so imprecise that it cannot detect a change.
With a managed species, agencies consider both its ecological role and its economic costs and benefits when making decisions for a resource held in the public trust. For gray wolves, the best available science indicates that wolves have a high value with a relatively small negative impact on livestock. In fact, local economies benefit from wolves for numerous reasons, and wolves play a crucial role in maintaining the healthy ecosystems that humans rely on. Direct benefits include suppressing overabundant elk, deer, and coyote populations, which damage livestock and crops; maintaining healthy game populations by culling the weak and limiting disease spillover involving wildlife and humans; restoring vegetation, which helps water quality, songbirds, and insect pollinators; making roads safer by reducing deer-vehicle collisions; and generating cash and jobs (an $82 million economy around Yellowstone). Naturally formed larger packs, those that are not killed, enhance these benefits. Because wolves sit at the top of the food chain, they play a disproportionately large role: small changes in their abundance have significant effects on species and processes further down the food chain. Not even a grizzly or a polar bear can outcompete a pack of wolves.
Unfortunately, estimating carnivore abundance is exceedingly difficult because carnivores exist at low densities, move widely over large landscapes, behave cryptically, and often avoid humans and capture methods. Nearly all wildlife studies employ some form of marking technique (radio collar, DNA, ear tag) because raw, unmarked counts are often misleading: individuals go unobserved, or the same animal or group is counted more than once. Marking is frequently used to correct for these errors, which, left uncorrected, bias abundance estimates either low or high.
In response to the increasing wolf population since the reintroduction of wolves in nearby Yellowstone and Idaho in 1995-96, the State of Montana developed a new approach called the Integrated Patch Occupancy Model (iPOM), which is claimed to estimate wolf abundance more accurately while reducing costs. It is a complex combination of several different models, each with its own submodels, and has dozens of model assumptions and variables. Although iPOM has been criticized by scientists, judges, and conservationists, we provided an independent, objective, and thorough evaluation by assessing its input data, potential biases, variance, sampling design, model coherence, validation, and reproducibility. Underlying these criteria are the two most important attributes used to assess the quality of a method, bias and precision, which are explained in Figures 1 and 2 of our recently accepted peer-reviewed paper in the scientific journal Academia Biology (see PDF below). It is important to note that the variance of abundance can be estimated from the sample data, but bias cannot. Still, bias can be assessed biologically (through common knowledge and research) and statistically by testing the method's assumptions. Computer simulations can also provide an estimate of the magnitude of the bias in abundance. Unfortunately, Montana Fish, Wildlife, and Parks (MFWP) did not conduct such testing; we did so where possible.
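The difference between bias and precision, and why only precision can be read off the estimates themselves, can be sketched with a small simulation. Everything here is synthetic and hypothetical: the "true" abundance of 500, the bias factor, and the noise levels are assumptions chosen for illustration, not values from iPOM or from the paper.

```python
import random
import statistics

TRUE_ABUNDANCE = 500  # hypothetical; never known in a real survey


def simulate_estimates(bias_factor, noise_sd, n_runs=1000, seed=0):
    """Return repeated abundance estimates from an imaginary method.

    bias_factor scales the truth (1.0 = unbiased, 1.5 = 50% too high);
    noise_sd controls how much the estimates scatter (precision).
    """
    rng = random.Random(seed)
    return [bias_factor * TRUE_ABUNDANCE + rng.gauss(0, noise_sd)
            for _ in range(n_runs)]


unbiased_but_noisy = simulate_estimates(bias_factor=1.0, noise_sd=100)
biased_but_tight = simulate_estimates(bias_factor=1.5, noise_sd=10)

for name, estimates in (("unbiased, imprecise", unbiased_but_noisy),
                        ("biased, precise", biased_but_tight)):
    spread = statistics.stdev(estimates)  # computable from the data alone
    bias = statistics.mean(estimates) - TRUE_ABUNDANCE  # needs the truth
    print(f"{name}: spread = {spread:.0f}, bias = {bias:.0f}")
```

The spread of each set of estimates can be computed from the estimates alone, but the bias line only works because the simulation knows the true value. A tightly clustered set of estimates can therefore look reassuring while sitting far from the truth.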
The concept behind iPOM's method of estimating abundance is relatively simple; however, the techniques employed are not. Conceptually, it uses two spatial models to determine the number of wolf packs (NP) across Montana and then multiplies NP by the average pack size. The first spatial model, originally designed to estimate a species' distribution by mapping the location of each sighting, estimates the total area occupied by all wolf pack territories. To arrive at NP, this summed area is then divided by the output of a second spatial model, which simulates the average size of a wolf territory. For example, if the non-overlapping territories of all wolf packs summed to 60,000 km², and the average territory size was 600 km², then 60,000 divided by 600 equals 100 wolf packs in Montana. Multiplying 100 packs by an average pack size of, say, 5 would result in an abundance of 500 wolves.
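The arithmetic in the example above can be written out as a short sketch. The function name and parameter names are illustrative assumptions, not part of iPOM; the numbers are the ones used in the text, plus one hypothetical case in which the occupied area is overstated by 20%.

```python
def estimate_abundance(occupied_area_km2, mean_territory_km2, mean_pack_size):
    """Conceptual iPOM-style calculation described in the text.

    number of packs = total occupied area / average territory size
    abundance       = number of packs * average pack size
    """
    if mean_territory_km2 <= 0:
        raise ValueError("mean territory size must be positive")
    n_packs = occupied_area_km2 / mean_territory_km2
    return n_packs, n_packs * mean_pack_size


# Values from the worked example in the text.
n_packs, wolves = estimate_abundance(60_000, 600, 5)
print(n_packs, wolves)  # 100.0 500.0

# Hypothetical: the occupied area is overstated by 20% (72,000 km²).
inflated_packs, inflated_wolves = estimate_abundance(72_000, 600, 5)
print(inflated_packs, inflated_wolves)  # 120.0 600.0
```

Because the three inputs are simply multiplied and divided, an error in any one of them passes straight through to the abundance figure: a 20% overstatement of occupied area yields a 20% overstatement of wolves.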
A wolf pack or a wolf territory is far harder to observe or sample than the location of a single wolf. Instead, MFWP relies on phone surveys that ask hunters to recall where, when, and what they saw during inadvertent, haphazard observations of groups of two or more wolves. The surveys also provide no valid means to distinguish wolves from coyotes, or to tell whether a group of two wolves belongs to a territorial pack. To make matters worse, iPOM requires the skill to observe and correctly classify two or more wolves traveling together as belonging to a stable territorial pack during the 5-week survey period, which coincides with the wolf hunting season set by MFWP. The method requires that an observation of two or more wolves be counted as belonging to only one stable territorial pack, rather than to two or more separate packs. This is highly problematic because counting any wildlife species results in double-counting unless there are individually recognizable marks that enable biologists to distinguish between individuals. Because iPOM lacks a sampling design, a hunter or biologist could classify unmarked individuals of the same pack as belonging to one, two, three, or even four different territories if they cross into more than one cell during a different week within the 5-week survey period in late fall, when shifting packs are naturally unstable. Misclassification, high mobility, and repeated counting of the same pack all cause overestimation of abundance, as do other factors we examined.
Another problem, related to hunters’ misclassification errors and double- or triple-counting, is how MFWP hand-marks the center or “centroid” of the assumed singular “known” wolf territories on a map. Previous studies have used statistical analysis of radio-collared wolves in a pack’s territory. In fact, the main model in iPOM requires that a second method, independent of the hunters’ recollections of wolves, be used to distinguish one “known” wolf territory from another. If not, science has repeatedly shown that even slight errors (1% to 10%) in this critically sensitive assumption lead to severe overestimation of abundance. Unfortunately, MFWP uses indirect “signs” such as tracks and kill sites of unmarked wolves and then subjectively determines centroid locations. As with hunter surveys, research has shown that even experienced wolf specialists cannot determine whether the “sign” is from a single wolf or multiple packs because, again, there are no unique marks to prevent double- or triple-counting, as evidenced in MFWP’s reported results. Finally, MFWP reported that it neither has nor archives the hunter survey data, and it did not provide us with the pack centroid data when requested. For this reason alone, iPOM fails a basic test of science: it cannot be reproduced or validated.
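One way to see why error rates as small as 1% to 10% matter is to look at how they compound over repeated survey weeks. The sketch below is a deliberately simple illustration, not iPOM's occupancy model: it assumes each of the 5 weeks is an independent occasion with the same per-week chance of a false detection in a cell that holds no pack, and the function name is hypothetical.

```python
def prob_falsely_flagged(false_positive_rate, n_occasions=5):
    """Chance that an empty cell gets at least one false detection.

    Assumes independent occasions with a constant false-positive rate,
    which is a simplification made only for this illustration.
    """
    if not 0.0 <= false_positive_rate <= 1.0:
        raise ValueError("rate must be between 0 and 1")
    return 1.0 - (1.0 - false_positive_rate) ** n_occasions


for rate in (0.01, 0.05, 0.10):
    flagged = prob_falsely_flagged(rate)
    print(f"{rate:.0%} per week -> {flagged:.1%} of empty cells flagged")
```

Under these assumptions, per-week error rates of 1%, 5%, and 10% leave roughly 5%, 23%, and 41% of empty cells with at least one false detection over a 5-week survey. When each flagged cell is 600 km², those false detections translate directly into inflated occupied area.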
In addition to the problem of multiple counting, two other sources of bias also result in overestimation of abundance: resolution and closure violations. Resolution bias is due to the large grid cell size iPOM uses (600 km²). This bias can be minimized in two ways: (1) by sampling at smaller grid cell sizes, as is done in Wisconsin, or (2) by using proper measures of habitat associations, for example, examining the relationship between snow and prey (deer and elk) during the survey period. iPOM does neither, but it does offer a flawed model of landscape variables that attempts to correct for bias. Furthermore, the chosen variables are static (fixed), changing little, if at all, from year to year. This renders iPOM unable to respond to the numerous changes affecting wolves and their prey, resulting in unrealistic, constant output.
Another problem is iPOM’s sensitive assumption of “closure,” which requires that wolves and their territories be stable both geographically and demographically (no mortalities) during the survey. It is biologically clear that this assumption is violated, which in turn leads to an additional overestimation of abundance. Geographic closure is violated because wolves are highly mobile and range widely over large landscapes during the 5-week survey. Demographic closure is severely violated in numerous ways. First, hunters probably kill 150 or more wolves just before or during the 5-week late-fall survey period, and hunters themselves disturb wolves and create attractive gut piles and injured deer and elk. Second, a small (5 to 12 percent) but highly significant percentage of wolf packs disintegrate when an alpha breeder is killed. The remaining pack members then disperse across many 600-km² grid cells during the survey period, which further compounds the closure problem. Unfortunately, these closure violations, along with false-positive identifications and insufficiently corrected resolution bias, all result in a severe overestimation of abundance.
We found that the bias in the iPOM variance used to report the confidence interval is more severe than the bias in the point estimate of abundance, for several reasons, including mathematical errors and the omission of many variables necessary to capture the actual variation in abundance. This results in substantial underreporting of uncertainty, to the point where iPOM cannot detect a change in abundance from year to year, nor whether Montana’s wolf population falls below a critical level of, say, 150 individuals. We chronicle numerous other problems with iPOM, which appears to be an incoherent mash-up of methods, including a simulation model that ingests no annual empirical information from Montana’s wolf population to determine the territory size needed to estimate abundance. Its methods to estimate pack size are also flawed, and a recent analysis demonstrated that analysis errors are an additional reason why iPOM overestimates abundance. Other problems with the iPOM method, along with suggested solutions, are covered in the technical paper.
With many recommendations for improvement, we conclude that iPOM is incoherent, lacks the fundamental elements of scientific inference, and produces unreliable predictions. Particularly problematic is that iPOM is misleading: it reports its abundance predictions as accurate and precise when they are in fact severely biased (overestimated) and imprecise. We simply don’t know how many wolves there are in Montana.

A simplified schematic illustrating some of the problems with iPOM that cause overestimation of abundance.
Four territorial wolf packs are depicted, where wolves are not individually marked, as is the case in iPOM. Territorial boundaries are in red with the breeding adults in the center. The average size of a wolf territory is the same as the area of the grid cell: 600 km². Two sensitive assumptions in MFWP’s iPOM method to estimate abundance require that (1) wolves and their territories are stable, that is, geographically and demographically “closed” during the 5-week late-fall hunting season, and (2) there are no detection errors when classifying a wolf pack to a singular territory in a singular grid cell in a given week. As wolves move about during the 5-week late-fall hunting season, hunters might observe and record two or more wolves (or coyotes) of the same pack as belonging to two or more packs in more than one grid cell during that one week or in another week. Note that each of the three packs (solid lines) includes individuals that overlap along the boundaries in two, three, or four of iPOM’s grid cells. This could result in the recording of nine 600 km² grid cells for only three packs. It could also cause wolf specialists, using “sign” from unmarked wolf pack members, to record two or more territories where there is only one. The one pack with a dashed boundary depicts a “pack dissolution” where a hunter killed an alpha breeder, and the remaining pack members dispersed to other grid cells where they might be mistaken for another pack, yet no pack exists.
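The scenario in the schematic can be put into numbers with a minimal sketch. The constants and function name are illustrative assumptions; the cell size, territory size, and the "nine cells for three packs" case come from the caption, and the calculation follows the conceptual area-divided-by-territory-size step rather than iPOM's actual code.

```python
CELL_AREA_KM2 = 600       # grid cell size given in the caption
MEAN_TERRITORY_KM2 = 600  # the schematic sets territory size equal to cell size


def packs_from_recorded_cells(n_cells_recorded):
    """Packs implied when every recorded cell counts as occupied area."""
    occupied_area = n_cells_recorded * CELL_AREA_KM2
    return occupied_area / MEAN_TERRITORY_KM2


true_packs = 3       # the three solid-line packs in the schematic
recorded_cells = 9   # cells recorded because packs straddle cell boundaries

implied_packs = packs_from_recorded_cells(recorded_cells)
inflation = implied_packs / true_packs
print(implied_packs, inflation)  # 9.0 3.0
```

In this simplified case, three packs recorded in nine cells would be counted as nine packs, a threefold overestimate before pack size is even applied.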
Read the full publication here:
