Research Document - 2015/034
Identification of duplicate sightings from the 2013 double-platform High Arctic Cetacean Survey
By D. Pike and T. Doniol-Valcroze
Abstract
One of the key assumptions of distance sampling is that all animals on the track line are detected by observers. Double-platform methods have been developed to address incomplete detection at the track line, but they require identifying which sightings were made by both observers. However, there is no means to determine independently and unequivocally whether a given pair of sightings is in fact a duplicate pair, or to select the most likely duplicate among a set of candidate sightings observed in close proximity. Most previous studies have used ad hoc methods and arbitrary thresholds. Here, we develop a data-driven approach to identify single and duplicate sightings made during the 2013 High Arctic Cetacean Survey (HACS). We use four covariates to compare sightings made by front and rear observers: difference in time of sighting, difference in declination angle, difference in group size and difference in species identity. To estimate the relative weights of these covariates, we compared two datasets in a logistic regression framework: a set of same-side sighting pairs that contains both duplicates and non-duplicates, and a similar dataset known to contain no true duplicates (observations made at the same time but on the other side of the plane). This allowed us to determine which combinations of factors were most successful at discriminating duplicates and to rate each candidate pair within the same-side data with an index of dissimilarity. Candidates with the lowest scores were identified as duplicates using two different methods and a range of threshold values for each covariate. Depending on the procedure used, 19% to 30% of narwhal sightings in the HACS dataset were seen by both observers, whereas 36% to 50% of bowhead whale sightings were seen by both observers. However, the aggregated nature of the sightings, and particularly the relatively high proportion of missing primary data such as declination and group size, made the identification of duplicates uncertain in many cases.
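The comparison of same-side and opposite-side pairs described above can be illustrated with a minimal sketch. It is not the authors' implementation. The covariate values are invented toy numbers rather than HACS data, and the function names `fit_logistic`, `dissimilarity` and `best_candidate` are hypothetical. It also assumes that the linear predictor of the fitted model serves as the dissimilarity index, which the abstract does not specify. Opposite-side pairs (known non-duplicates) are labelled 1 and same-side pairs (a mix of duplicates and non-duplicates) are labelled 0, so a higher score means a pair looks more like a non-duplicate.

```python
import math

# Toy covariate rows, invented for illustration only (not HACS data):
# [time difference (s), declination difference (deg),
#  group size difference, species mismatch (0 or 1)]
same_side = [
    [0.5, 1.0, 0, 0], [1.0, 2.0, 0, 0], [0.8, 0.5, 1, 0], [1.5, 1.5, 0, 0],
    [5.5, 7.0, 2, 1], [7.0, 9.5, 3, 0],   # same-side pairs that are not duplicates
]
opposite_side = [
    [5.5, 7.0, 2, 1], [7.0, 9.5, 3, 0], [4.5, 6.0, 1, 1],
    [6.5, 8.5, 4, 0], [3.0, 10.0, 2, 1], [8.0, 5.0, 1, 0],
]

X = same_side + opposite_side
y = [0] * len(same_side) + [1] * len(opposite_side)


def fit_logistic(X, y, lr=0.01, n_iter=5000):
    """Fit logistic regression by batch gradient descent.

    Returns [intercept, w_1, ..., w_k].
    """
    w = [0.0] * (len(X[0]) + 1)
    n = len(X)
    for _ in range(n_iter):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))
            p = 1.0 / (1.0 + math.exp(-z))
            err = p - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * g / n for wj, g in zip(w, grad)]
    return w


def dissimilarity(w, x):
    """Linear predictor of the fitted model, used here as the index (assumption)."""
    return w[0] + sum(wj * xj for wj, xj in zip(w[1:], x))


def best_candidate(w, candidates):
    """Index of the candidate pair with the lowest dissimilarity score."""
    scores = [dissimilarity(w, c) for c in candidates]
    return scores.index(min(scores))


weights = fit_logistic(X, y)
```

In practice a threshold on the score, or on each covariate, would still be needed to decide whether the best candidate is accepted as a duplicate at all. The abstract reports results over a range of such thresholds, which this sketch does not attempt to reproduce.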
Accessibility Notice
This document is available in PDF format. If the document is not accessible to you, please contact the Secretariat to obtain another appropriate format, such as regular print, large print, Braille or audio version.