Advances in non-invasive tracking of wave-type electric fish in natural and laboratory settings

1. Introduction

Unraveling causal factors driving various animal behaviors in experimental and in particular in observational studies is challenging, since most behaviors result from an integration of a broad range of social and environmental stimuli, internal states, and past experiences (Chapman et al., 1995; Sapolsky, 2005; Boon et al., 2007; Markham et al., 2015). In laboratory studies, environments and contexts are systematically simplified in order to minimize the number of potential factors influencing behaviors (e.g., Bastian et al., 2001; Pantoni et al., 2020). Such studies are tailored to specific behaviors and well-defined contexts. However, behaviors in such constrained settings often deviate from behaviors in natural environments and thus have to be interpreted with care (Cheney et al., 1995; Rendall et al., 1999; Henninger et al., 2018). To discover behavioral traits of interest in the first place, field studies or laboratory experiments with complex, more naturalistic designs are needed. Recent technological advances in remote recording techniques, tags, and data loggers, as well as advances in data analysis, facilitate the collection and evaluation of comprehensive and viable data in naturalistic settings with freely moving and interacting animals (Dell et al., 2014; Hughey et al., 2018; Mathis et al., 2018; Jolles, 2021). These big-data approaches open up new opportunities in behavioral research, because they potentially allow animal behaviors to be studied quantitatively in more complex and naturalistic settings (Gomez-Marin et al., 2014; Egnor and Branson, 2016).

A suitable recording technique can be selected from a large variety of available devices and sensors to match the requirements imposed by the model species, environmental conditions, and the scientific question (Hughey et al., 2018). This allows for studying various aspects of animal behaviors across species (e.g., Nagy et al., 2010; Robinson et al., 2012; Strandburg-Peshkin et al., 2015, 2018). A commonly used technique to study animals in their natural habitats is the utilization of animal mounted bio-loggers, e.g., small devices equipped with different sensors like GPS-trackers or microphones (Nagy et al., 2010; Strandburg-Peshkin et al., 2017; Hughey et al., 2018). However, bio-loggers require frequent animal handling and animals are required to carry devices, both inducing a potential bias (Saraux et al., 2011). Furthermore, bio-loggers might miss relevant information, since not all interacting animals might be equipped with a logger (e.g., Strandburg-Peshkin et al., 2019), signal detection range is limited, or data is recorded discontinuously to extend the overall recording period (Strandburg-Peshkin et al., 2017; Hughey et al., 2018).

Alternatively, behaving animals can be tracked by means of remote sensing devices (Kühl and Burghardt, 2013; Theriault et al., 2014; Henninger et al., 2018, 2020; Hughey et al., 2018; Torney et al., 2018; Raab et al., 2019; Aspillaga et al., 2021). In this approach, recorded signals can originate from small micro-transmitters that get affixed to animals (e.g., acoustic telemetry system for fish: Aspillaga et al., 2021) or from the animals themselves (photography, video recordings: Sherley et al., 2010; Lahiri et al., 2011; Theriault et al., 2014; Nourizonoz et al., 2020; ultrasound vocalizations: Surlykke and Kalko, 2008; Seibert et al., 2013; Hügel et al., 2017; electric signals: Henninger et al., 2018; Raab et al., 2019; Fortune et al., 2020). These methods benefit from minimal interference with the animals themselves. On the other hand, covering large observation areas is costly. Also, tracking animal identities can be quite challenging and requires sophisticated and computationally demanding pre-processing of the data (Lahiri et al., 2011; Kühl and Burghardt, 2013; Hughey et al., 2018; Henninger et al., 2020). Here, specific animal biometrics, certain aspects of an animal's appearance or signaling properties, have been shown to allow for individual identification and tracking (Kühl and Burghardt, 2013). However, in order to enable reliable tracking, selected animal biometrics need to be displayed universally throughout the study population whilst showing sufficient variation between single individuals (i.e., biometric profiles that allow for reliable individual identification). If individuals do not have specific invariant characteristics, like, for example, the stripes of a zebra (Lahiri et al., 2011), then tracking algorithms need to handle temporally changing biometric profiles that often overlap in their characteristics (e.g., spatial position and orientation, Madhav et al., 2018).

Electric fish are particularly well-suited for being tracked in the laboratory and in their natural habitats based on remote sensing (Jun et al., 2013; Henninger et al., 2018; Madhav et al., 2018; Raab et al., 2019; Fortune et al., 2020). These fish produce an electric field through discharges of an electric organ (electric organ discharge, EOD; Turner et al., 2007) that is used for electrolocation (Fotowat et al., 2013) and communication (Albert and Crampton, 2005; Smith, 2013; Benda, 2020). The EODs of many electric fish can be recorded by means of an array of submerged electrodes without the need to catch and tag the fish (Henninger et al., 2018). From these recordings, electric signals of individual fish have to be identified and tracked over time. Depending on the species, EODs are either emitted as short, discrete pulses (pulse-type electric fish; Hagedorn, 1988; Albert and Crampton, 2005; Smith, 2013) or in a sinusoidal fashion (wave-type electric fish; Moortgat et al., 1998). For pulse-type electric fish, tracking individual EODs is rather challenging, since signal features largely overlap between individual fish, i.e., EOD frequencies are highly variable and context dependent (Hagedorn, 1988). In order to nevertheless track electric behaviors of pulse-type fish, additional spatio-temporal tracking using video recordings and elaborate machine-learning approaches are usually required (Jun et al., 2013; Pedraja et al., 2021). In wave-type electric fish, however, the frequency of EODs is individual specific and remarkably stable over minutes to hours (Moortgat et al., 1998), providing a characteristic biometric cue which facilitates individual signal tracking. Previous tracking approaches were either based on EOD frequency (Henninger et al., 2020) or on spatial electric field properties that can be reconstructed from signal powers across recording electrodes (Madhav et al., 2018). However, both signal features are sensitive to temporal changes. The latter, spatial electric field properties, depend on the fish's spatial position and orientation. The former, EOD frequency, is sensitive to temperature changes (Dunlap et al., 2000) and is actively modulated for electrocommunication (Smith, 2013). Accordingly, both tracking features might fail when fish are close by, either in their EOD frequency or spatially, especially in recordings of electric fish in high densities.

In the following, we describe and evaluate an improved tracking algorithm for wave-type electric fish recorded with electrode arrays. By combining, refining, and extending previous approaches, our algorithm is capable of tracking EODs of individual fish with unprecedented accuracy, i.e., tracking errors occur less often in complex tracking scenarios (e.g., when EOD frequency traces cross each other, Figure 5) which tremendously reduces required post-processing time to manually correct flawed connections. Since both movement behaviors (Madhav et al., 2018; Henninger et al., 2020) and communication (Smith, 2013; Henninger et al., 2018; Fortune et al., 2020) can be analyzed based on EOD recordings, our algorithm is a fundamental advancement for a wide range of behavioral studies on freely moving and interacting electric fish (Raab et al., 2019, 2021). Finally, we demonstrate the performance of our tracking algorithm on recordings of Apteronotus leptorhynchus taken with an array of 64 electrodes in a stream in the Llanos in Colombia.

2. Materials and equipment

2.1. Data acquisition

EODs of freely swimming fish were recorded with arrays of monopolar electrodes at low-noise buffer headstages (1× gain, 10 × 5 × 5 mm³, Figure 1B) arranged in grid-like structures (Figures 1A,C,E). Electric signals were amplified (100× gain, 100 Hz high-pass filter, 10 kHz low-pass filter), digitized at 20 kHz with 16 bit resolution, and stored on external data storage devices for later offline analysis. The custom-built recording systems (npi-electronics GmbH, Tamm, Germany) were powered by car batteries (12 V, 80 Ah). Various configurations of the electrode arrays have been successfully used to record populations of electric fish in the wild (Henninger et al., 2018, 2020, unpublished field-trips: Colombia 2016, 2019, Figures 1A,C,D), as well as in the laboratory (Raab et al., 2019, 2021, Figures 1E,F). The first 64-channel amplifier system required an external computer with two data acquisition boards (PCI-6259, National Instruments, Austin, Texas, USA) for digitizing and storing the data (Henninger et al., 2018, 2020, Colombia 2016). For this first setup, data acquisition was controlled by C++ software (https://github.com/bendalab/fishgrid). For the 2019 recordings in Colombia we used a modular 16-channel system based on a Raspberry Pi 3B (Raspberry Pi Foundation, UK) that stores the data digitized by a USB data acquisition board (USB-1608GX, Measurement Computing, Norton, MA, USA) on a 256 GB USB stick, controlled by Python software (https://whale.am28.uni-tuebingen.de/git/raab/Rasp_grid.git) (Figure 1A).

Figure 1. Recording systems, electrode arrangements, and corresponding signals of recorded electric fish. (A) Two of the Raspberry Pi-based 16-channel amplifiers and recorders used for an array with 32 electrodes. (B) Monopolar stainless-steel electrode on headstage used for recordings in the field and laboratory experiments (after Henninger, 2015). (C) Recording setup used to record a population of A. leptorhynchus in the Rio Rubiano, Colombia, in 2016. Sixty-four electrodes were mounted on PVC-tubes and arranged in an 8 × 8 grid covering an area of 3.5 × 3.5 m². (D) Snapshot of the electric signals recorded with the setup shown in (C). The top left panel corresponds to the most upstream electrode mounted on the tube closest to the river bank. (E) Recording setup used to record electric signals of pairs of A. leptorhynchus during competitions in a laboratory experiment (Raab et al., 2021). Fifteen electrodes were uniformly distributed at the bottom of the aquarium and one electrode was placed in the central tube the fish compete for. (F) Snapshot of electric signals recorded during the competition experiment shown in (E). The signal framed in gray is from the central electrode located in the optimal tube. The EOD waveform shows the characteristic shoulder that is typical of EODs of A. leptorhynchus.

2.2. Spectrograms

EODs of individual fish are identified and extracted from the electric recordings based on their EOD frequency and respective harmonic structure (Figure 2C). For each electrode we compute power spectral densities (PSDs) of overlapping data snippets shifted by $\Delta t \approx 300$ ms (Figure 2A). The size of the fast Fourier transform (FFT) windows was set to $n_\mathrm{fft} = 2^{15}$ samples (≈ 1.64 s; e.g., Raab et al., 2021) or $n_\mathrm{fft} = 2^{16}$ samples (≈ 3.28 s; e.g., Raab et al., 2019; field recordings displayed in Figure 9), resulting in frequency resolutions of 0.6 and 0.3 Hz, respectively, as needed to resolve EOD frequencies in high fish densities.
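The relation between sampling rate, FFT window size, and frequency resolution quoted above can be checked with a few lines of Python. This is only an illustrative sketch: the function name `fft_window_properties` is hypothetical and not part of the published software, and the sampling rate of 20 kHz is taken from Section 2.1.

```python
def fft_window_properties(samplerate_hz, nfft, step_s=0.3):
    """Return window duration (s), frequency resolution (Hz), and overlap fraction."""
    duration_s = nfft / samplerate_hz             # length of one FFT window
    resolution_hz = samplerate_hz / nfft          # spacing of the frequency bins
    step_samples = round(step_s * samplerate_hz)  # shift between consecutive windows
    overlap = 1.0 - step_samples / nfft           # fraction shared by neighboring windows
    return duration_s, resolution_hz, overlap


for exponent in (15, 16):
    duration, resolution, overlap = fft_window_properties(20000, 2 ** exponent)
    print(f"nfft = 2^{exponent}: window {duration:.2f} s, "
          f"resolution {resolution:.2f} Hz, overlap {overlap:.0%}")
```

The sketch makes the trade-off explicit: doubling the window halves the bin width, but each spectrum then integrates over twice as much time, which is why consecutive windows overlap strongly at a shift of about 300 ms.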

Figure 2. EOD frequency extraction from recordings with an electrode array. As an example, a 3 min snippet of a recording with the 8 × 8 array from Rio Rubiano, Colombia, taken during the day of April 10th, 2016 is shown. (A) Spectrograms from three different electrodes. Warmer colors represent increased power in respective frequencies. EOD frequencies of individual A. leptorhynchus remain rather stable, except during electrocommunication (e.g., EOD frequency trace starting at ~ 917 Hz). A non-logarithmic PSD extracted at time 50 s indicated by the dotted line is shown at the side of each panel. (B) The summed up spectrogram over all electrodes contains distinct traces from many different fish. (C) Peaks are detected in the summed up power spectra that are then clustered into frequency groups of a fundamental frequency and at least two of its harmonics, corresponding to a specific fish (Henninger et al., 2020). Fundamental EOD frequencies, their corresponding powers in each electrode and their detection times are stored for subsequent tracking.

2.3. Extraction of EOD frequencies and feature vector

In order to detect EOD frequencies of all recorded fish, for each time point $t_i$ PSDs from all electrodes were summed up (Figure 2B). The summed PSDs were transformed to decibel levels, $L(f) = 10 \log_{10}(P(f)/P_0)$, relative to a power of $P_0 = 1$ mV²/Hz. In these logarithmic power spectra, peaks were detected (Todd and Andrews, 1999) and groups of harmonics were assigned to their corresponding fundamental frequencies (Figure 2C). See Henninger et al. (2020) and the harmonics.py module in the thunderfish package (https://github.com/bendalab/thunderfish) for details.

Harmonic groups were extracted from the summed power spectra in order to save computing time. Extracting fundamental frequencies from each of $n$ electrodes separately would take $n$ times longer, but might be more advantageous for separating distant fish that are close by in EOD frequency. We are therefore working on improving the performance of the harmonic-group extraction. The tracking algorithm described in Section 3 is independent of whether fundamental frequencies were obtained from the individual spectra or the summed one.

For each time point $t_i$ and each signal indexed by $k$, a feature vector

$$\vec{X}_{k_i} = \left(f_{k_i}, L_{k_i}(1), \ldots, L_{k_i}(n)\right) \tag{1}$$

is assembled that includes the fundamental EOD frequency, $f_{k_i}$, and the corresponding logarithmic powers, $L_{k_i}(x)$, in the PSDs of all $n$ recording electrodes $x$. Based on this feature vector the individual fish are tracked as described in the following methods.
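As a minimal sketch of how such a feature vector could be assembled, the following Python snippet converts per-electrode powers to decibel levels and prepends the fundamental frequency. The function names and the example values are hypothetical and chosen for illustration only; they are not taken from the thunderfish or wavetracker packages.

```python
import math


def power_to_decibel(power, ref_power=1.0):
    """Decibel level of a power (in mV^2/Hz) relative to ref_power."""
    return 10.0 * math.log10(power / ref_power)


def feature_vector(eod_frequency, electrode_powers):
    """Assemble (f, L(1), ..., L(n)) for one signal at one time step, Equation (1)."""
    return (eod_frequency, *[power_to_decibel(p) for p in electrode_powers])


# Made-up example: a 917 Hz signal picked up by four electrodes.
x = feature_vector(917.0, [1e-3, 1e-4, 1e-6, 1e-5])
print(x)
```

With $n$ electrodes the vector has $n + 1$ entries, so a 64-electrode array yields 65 values per detected signal and time step.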

3. Methods

In the following we present an algorithm for tracking wave-type electric fish in electrode-array recordings. The algorithm merges and extends two complementary approaches that are based on EOD frequency (Henninger et al., 2018, 2020) or on primarily the spatial distribution of signal powers (Madhav et al., 2018). We then test the performance of the tracking algorithm against manually tracked data. Open-source Python scripts for tracking and post-processing of analyzed data can be obtained from https://github.com/bendalab/wavetracker.

3.1. Algorithm for tracking wave-type electric fish

Both EOD frequency and the spatial distribution of EOD power across electrodes change with time and potentially overlap between fish. EOD frequencies can be actively altered in the context of communication (Smith, 2013; Benda, 2020) and the signal powers across electrodes change with the fish's motion (Madhav et al., 2018). This variability and potential overlap in signal features challenges reliable tracking, especially in recordings with many fish.

Furthermore, the existing algorithms track signals in the order of their temporal detection, i.e., signals detected in consecutive time steps are directly assigned to already tracked EOD frequency traces (Madhav et al., 2018; Henninger et al., 2020). This potentially leads to tracking errors, because even with an electrode array, EODs of freely moving and interacting electric fish are rarely detected continuously, i.e., in every one of a series of consecutive time steps. Low signal-to-noise ratios, resulting from large distances between fish and recording electrodes or from objects like rocks or logs distorting or even blocking electric fields, frequently lead to detection losses. When multiple fish with similar EOD frequencies are recorded simultaneously, EOD frequency traces can cross each other (e.g., in the context of emitted communication signals, Benda, 2020). It is on these occasions in particular that detection losses frequently result in tracking errors.

In order to improve on these issues, we developed a tracking algorithm which, first, is based on feature vectors that include both EOD frequency and signal power across electrodes (Figure 3) and, second, is less constrained by the temporal sequence of detected signals (Figure 5).

Figure 3. Frequency and field errors. (A) Summed spectrogram of a 30 s long part of the recording shown in Figure 2B. For each electric fish signal, potential connection partners are limited by a time difference threshold, $\Delta t_\mathrm{thresh} = 10$ s, and a frequency difference threshold, $\Delta f_\mathrm{thresh} = 2.5$ Hz. For a given signal $\alpha$ with EOD frequency $f_{\alpha_i}$ at time step $i$ (dark blue dot), potential connection candidates $\beta$ at different times $j$ (light blue dots) need to be within these thresholds (box), whereas signals beyond these thresholds (black dots) are not considered. (B) Absolute frequency differences, $\Delta f$, Equation (2), are mapped (red lines) to frequency errors, $\varepsilon_f$, using a logistic function, Equation (4) (line), favoring small frequency differences. (C) The field error as the second tracking parameter is based on spatial profiles, Equation (5), of signal powers over all electrodes (black dots). The field difference, $\Delta S$, is computed as the Euclidean distance, Equation (6), between the spatial profiles of potential signal pairs (columns). With decreasing similarity (columns left to right) the field difference increases. Displayed signal pairs (columns) were selected to illustrate the full range of possible field differences and are unrelated to (A). Spatial profiles were interpolated with a Gaussian kernel for illustrative purposes. (D) To obtain normalized field errors, $\varepsilon_S$, in a range similar to the one of the frequency errors, $\varepsilon_f$, each field difference is related to a representative cumulative distribution [Equation (7), black line] of field differences obtained by collecting all potential field differences of a manually selected 30 s window in the recording. The cumulative distribution of potential field differences is computed only once per recording for a 30 s window where fish are active (night time). This way we incorporate a broad distribution of possible field differences when determining field errors. The examples from (C) are marked by respectively colored dots.

3.1.1. Distance measure

We start out with extracting feature vectors $\vec{X}_{k_i}$, Equation (1), containing an EOD frequency, $f_{k_i}$, and its powers, $L_{k_i}(x)$, on all electrodes $x$, for all signals $k$ and each time step $i$. In a first step the distances between all pairs of feature vectors, $\vec{X}_{\alpha_i}$ and $\vec{X}_{\beta_j}$, of signals $\alpha$ and $\beta$ at times $i \neq j$ are quantified. Only pairs within a time difference of $|t_j - t_i| \le \Delta t_\mathrm{thresh} = 10$ s and with a difference between the two EOD frequencies of the feature vectors,

$$\Delta f_{\alpha_i,\beta_j} = \left|f_{\alpha_i} - f_{\beta_j}\right| \tag{2}$$

of at most $\Delta f_\mathrm{thresh} = 2.5$ Hz are considered (Figure 3A).

The distance between the two signals $\alpha_i$ and $\beta_j$

$$\varepsilon_{\alpha_i,\beta_j} = \frac{1}{3}\,\varepsilon_f + \frac{2}{3}\,\varepsilon_S \tag{3}$$

is computed as a weighted sum of the frequency error, $\varepsilon_f$, and the field error, $\varepsilon_S$. Both errors range from 0 to 1 and are explained in the following sections. The field error gets twice the weight of the frequency error, because tracking issues usually arise in spite of low frequency errors. Nevertheless, the frequency error remains a relevant tracking feature, especially when fish are in close proximity resulting in low field errors.
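The candidate selection and the weighted sum of Equation (3) can be sketched as follows. The function and constant names are hypothetical; only the threshold values and the weights are taken from the text.

```python
DT_THRESH_S = 10.0   # maximum time difference between two signals (from the text)
DF_THRESH_HZ = 2.5   # maximum EOD frequency difference, Equation (2)


def is_candidate_pair(t_i, f_i, t_j, f_j):
    """True if two detections at different times are close enough in time and frequency."""
    return (t_i != t_j
            and abs(t_j - t_i) <= DT_THRESH_S
            and abs(f_i - f_j) <= DF_THRESH_HZ)


def combined_distance(freq_error, field_error):
    """Equation (3): the field error gets twice the weight of the frequency error."""
    return freq_error / 3.0 + 2.0 * field_error / 3.0


print(is_candidate_pair(0.0, 900.0, 5.0, 901.0))   # within both thresholds
print(is_candidate_pair(0.0, 900.0, 12.0, 900.0))  # too far apart in time
print(combined_distance(0.3, 0.6))
```

Because both errors lie between 0 and 1 and the weights sum to one, the combined distance is bounded by the same interval.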

3.1.2. Frequency error

The frequency error is based on the difference in EOD frequencies, Equation (2), and has been used previously to track signals of electric fish (Henninger et al., 2018, 2020). We transform the EOD frequency difference, Equation (2), into the frequency error

$$\varepsilon_f(\Delta f) = \frac{1}{1 + e^{-\frac{\Delta f - f_0}{df}}} \tag{4}$$

via a logistic function that maps the EOD frequency difference, $\Delta f$, onto the interval from zero to one. The turning point of the logistic function at $f_0 = 0.35$ Hz and the corresponding inverse slope, $df = 0.08$ Hz, ensure that the frequency error approaches its maximum already at small EOD frequency differences of about 0.8 Hz (Figure 3B). This transformation de-emphasizes very small frequency differences and equalizes larger frequency differences in the assessment of whether two signals $\alpha$ and $\beta$ originate from the same or different fish.
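A direct transcription of Equation (4) into Python illustrates how quickly the frequency error saturates. The function name is hypothetical; the default parameters are the values stated above.

```python
import math


def frequency_error(delta_f, f0=0.35, df=0.08):
    """Logistic mapping of an absolute EOD frequency difference (Hz) onto (0, 1), Equation (4)."""
    return 1.0 / (1.0 + math.exp(-(delta_f - f0) / df))


for delta_f in (0.0, 0.2, 0.35, 0.5, 0.8, 2.5):
    print(f"delta_f = {delta_f:4.2f} Hz -> frequency error = {frequency_error(delta_f):.3f}")
```

At the turning point the error is exactly 0.5, and at 0.8 Hz it already exceeds 0.99, so all larger differences up to the 2.5 Hz threshold are treated as practically equivalent.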

3.1.3. Field error

EOD frequency traces of electric fish occasionally cross each other, e.g., when individuals actively alter their EOD frequency in the context of communication (e.g., Zupanc, 2002; Triefenbach and Zakon, 2008; Raab et al., 2021, Figure 3A). In these situations, frequency as a tracking feature fails. This is where the spatial properties of a signal, i.e., signal powers across recording electrodes that reflect the position and orientation of a fish, come into play (Madhav et al., 2018, Figure 3C). The signal powers, $L_{k_i}(x)$, are rescaled to the spatial profile

$$S_{k_i}(x) = \frac{L_{k_i}(x) - \min_x L_{k_i}(x)}{\max_x L_{k_i}(x) - \min_x L_{k_i}(x)}, \tag{5}$$

ranging between 0 and 1, for the smallest and largest power of that signal, respectively.

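The field difference $\Delta S$, i.e., the difference between the spatial profiles of two signals $\alpha$ and $\beta$ at times $i$ and $j$, is computed as their Euclidean distance according to

$$\Delta S_{\alpha_i,\beta_j} = \sqrt{\sum_{x=1}^{n} \left(S_{\alpha_i}(x) - S_{\beta_j}(x)\right)^2} \tag{6}$$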
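Equations (5) and (6) translate into a few lines of Python. This is a sketch with hypothetical function names; how a completely flat power profile (maximum equal to minimum) should be treated is not specified in the text, so mapping it to zeros is an assumption made here only to avoid a division by zero.

```python
import math


def spatial_profile(powers_db):
    """Rescale the decibel powers of one signal across electrodes to [0, 1], Equation (5)."""
    lowest, highest = min(powers_db), max(powers_db)
    if highest == lowest:
        # Assumption (not from the text): a flat profile is mapped to all zeros.
        return [0.0] * len(powers_db)
    return [(p - lowest) / (highest - lowest) for p in powers_db]


def field_difference(powers_a, powers_b):
    """Euclidean distance between the spatial profiles of two signals, Equation (6)."""
    profile_a = spatial_profile(powers_a)
    profile_b = spatial_profile(powers_b)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(profile_a, profile_b)))


print(spatial_profile([-60.0, -40.0, -50.0]))
print(field_difference([-60.0, -40.0], [-40.0, -60.0]))
```

Since every profile entry lies between 0 and 1, the field difference of an array with $n$ electrodes cannot exceed $\sqrt{n}$, which illustrates the dependence on electrode number addressed next.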

However, the magnitude of this difference depends on the configuration of the electrode array, especially on the number of recording electrodes. To obtain field errors, $\varepsilon_S$, that are independent of electrode configuration, we map the field differences through a cumulative distribution of field differences extracted from a manually selected and representative 30 s window:

$$\varepsilon_S(\Delta S_{\alpha_i,\beta_j}) = \int_0^{\Delta S_{\alpha_i,\beta_j}} p(\Delta S)\, d\Delta S \tag{7}$$

The distribution of field differences, $p(\Delta S)$, is estimated from the field differences between potential signal pairs ($|\Delta t_{i,j}| \le 10$ s, any frequency difference) within a 30 s data snippet where fish can be assumed to be active, i.e., during night time (Figure 3D). This way we incorporate a broad distribution of possible field differences when determining field errors.
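One simple way to evaluate Equation (7) in practice is an empirical cumulative distribution of the collected reference field differences. The following sketch assumes this empirical estimate; the function names and the reference values are hypothetical, and the published implementation may estimate the distribution differently.

```python
from bisect import bisect_right


def make_field_error(reference_differences):
    """Return a function mapping a field difference onto its empirical cumulative probability."""
    ordered = sorted(reference_differences)
    if not ordered:
        raise ValueError("need at least one reference field difference")

    def field_error(delta_s):
        # Fraction of reference differences that are smaller than or equal to delta_s.
        return bisect_right(ordered, delta_s) / len(ordered)

    return field_error


# Made-up reference values standing in for one representative 30 s window.
field_error = make_field_error([0.5, 1.0, 1.5, 2.0])
print(field_error(0.0), field_error(1.0), field_error(3.0))
```

The mapping is computed once per recording and then reused for every signal pair, so the normalization itself adds little to the cost of tracking.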

3.1.4. Tracking within a data window

Now that we have a quantification for the distance $\varepsilon$, Equation (3), between two signals, we can proceed with the actual tracking algorithm. Based on the distances, the algorithm decides which signal pairs belong together in order to track individual fish throughout a recording. Computing the distances between all pairs of signals of a recording at once, however, is not feasible. Instead we break down the tracking into tracking windows of 30 s at a time (Figure 5). Within these tracking windows, we first compute the distances, $\varepsilon$, between each potential signal pair $\alpha_i$ and $\beta_j$ and store them in a three-dimensional distance cube, where the first two dimensions refer to signals $\alpha_i$ and $\beta_j$ and the third dimension to the time steps $i$ where signals $\alpha_i$ have been detected (Figure 4). Accordingly, we have different numbers of signals $\alpha_i$ for each time step $i$ and, consequently, the number of elements in the second dimension, referring to signals $\beta_j$ from all time steps $j > i$, is also variable. Note that each signal considered in the distance cube is only referred to as $\alpha$ once, but potentially multiple times as $\beta$. For example, a signal that is referred to as $\beta_j = \beta_{i+1}$ in the first layer of the distance cube (see Figure 4) is referred to as $\alpha_{i+1}$ in the next layer of the distance cube.

Figure 4. Distance cube containing all distances, $\varepsilon_{\alpha,\beta}$, Equation (3), for possible signal pairs $\alpha$ and $\beta$ within the current tracking window. Each layer, referring to a time step $i$, contains the distances between all signals $\alpha_i$ detected at this time and their potential signal partners $\beta_j$ detected maximally 10 s after signal $\alpha_i$ ($\Delta I$ time steps after $i$). Distances in gray layers correspond to signal pairs where one signal partner could potentially have a smaller distance to a signal outside the distance cube. Only connections based on the distances in the central black layers can be assumed to be valid, since all potential connections of both signal partners are within the distance cube. Connections established for the black layers are assigned to signal traces obtained in previous tracking steps in a second step.

For the actual tracking step, signal pairs are connected and assigned to potential fish identities based on the values in the distance cube. The algorithm described in the following (Figure 5) is a kind of clustering algorithm that has a notion of temporal sequence. The resulting clusters are traces of different fish identities (“labels”) tracked over time.

Figure 5. Tracking within a data window. Signals detected in a 30 s data window are connected to each other and assigned to fish identities according to their distance $\varepsilon$, Equation (3). Signal pairs with smaller distances are connected first. With increasing distance values, more connections and identities are formed, complemented, or merged, ensuring no temporal overlap. Different stages of this tracking step are displayed in (A–C). (A) Twenty percent of all possible connections of the displayed tracking window are formed. At this tracking stage a multitude of separate signal traces (different colors) are still present. (B) Forty percent of all possible connections of the displayed tracking window are formed. (C) Final output of the tracking step. All possible connections of the displayed tracking window are formed. The remaining three EOD frequency traces (in the displayed time and frequency segment) correspond to three different fish identities. Only signal pairs within the central 10 s of a 30 s tracking window (vertical lines) are assigned to already established fish identities from previous tracking windows. The summed spectrogram of a 30 s long part of the recording shown in Figure 2B is shown in the background.

The signal pairs are traversed in order of ascending distances. If one of $\alpha_i$ or $\beta_j$ has already been assigned to a fish identity, then the other signal of the pair is added to this fish identity. If $\alpha_i$ coincides with one fish identity and $\beta_j$ with another one, then the two fish identities are merged. If neither $\alpha_i$ nor $\beta_j$ matches an existing fish identity, the pair is assigned to a new fish identity. Assignment to or merging of fish identities is only possible in the absence of temporal conflicts, i.e., a fish identity cannot have more than one signal at the same time. In case of temporal conflicts, the signal pair is ignored and the algorithm proceeds with the next one. As a result, we obtain signal traces built upon minimal signal errors within a 30 s tracking window (Figure 5).
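The traversal described above can be sketched in plain Python. This is a simplified illustration and not the wavetracker implementation: signals are represented by integer indices, `times[s]` is the time step at which signal `s` was detected, and `pairs` is a flat list of `(distance, a, b)` tuples standing in for the entries of the distance cube. All names are hypothetical.

```python
def track_window(times, pairs):
    """Greedily connect signal pairs in order of ascending distance.

    Returns a dict mapping each connected signal index to an identity label.
    """
    label_of = {}   # signal index -> identity label
    occupied = {}   # identity label -> set of time steps already taken
    next_label = 0
    for _, a, b in sorted(pairs):
        label_a, label_b = label_of.get(a), label_of.get(b)
        if label_a is None and label_b is None:
            if times[a] == times[b]:
                continue                      # same time step: cannot be one fish
            label_of[a] = label_of[b] = next_label
            occupied[next_label] = {times[a], times[b]}
            next_label += 1
        elif label_a is not None and label_b is not None:
            if label_a == label_b or occupied[label_a] & occupied[label_b]:
                continue                      # already joined, or temporal conflict
            for signal, label in list(label_of.items()):
                if label == label_b:
                    label_of[signal] = label_a   # merge identity b into identity a
            occupied[label_a] |= occupied.pop(label_b)
        else:
            new, label = (b, label_a) if label_b is None else (a, label_b)
            if times[new] in occupied[label]:
                continue                      # identity already has a signal at that time
            label_of[new] = label
            occupied[label].add(times[new])
    return label_of


# Two fish over three time steps; signals 0, 2, 4 and signals 1, 3 belong together.
times = [0, 0, 1, 1, 2]
pairs = [(0.1, 0, 2), (0.2, 1, 3), (0.3, 2, 4), (0.5, 3, 4), (0.6, 0, 3), (0.7, 1, 2)]
print(track_window(times, pairs))
```

In the example, the three pairs with the largest distances are rejected because accepting them would give one identity two signals at the same time step.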

Since signals within the first and last 10 s of a tracking window could have lower distances to signals outside the current tracking window, these connections are potentially flawed (gray layers in Figure 4; gray bars in Figure 5). Only connections established within the central 10 s take all other potential signal partners into account. Accordingly, only the section of assembled signal traces corresponding to these central 10 s of the current tracking window is considered for further processing, where the signal traces are appended to already validated, previously detected ones (Figure 6).

Figure 6. Assembly of tracking results over data windows. (A) New fish identities established within the current tracking window (gray and black bar on top). Only the central 10 s of these EOD frequency traces (solid traces; black bar) can be assumed to be valid since signals before and after (transparent traces; gray bars) have potential signal partners outside the tracking window. (B) Additional display of EOD frequency traces established in previous iterations of the tracking algorithm. (C) Signal traces are connected according to the smallest distance between any signal within the last 10 s of the established fish identities (0 s < t < 10 s) and any signal within the central 10 s of the new fish identities (10 s < t < 20 s). In the example shown, the distance between the origin signal (black dot) and the target signal (green dot) is the smallest between these two signal traces; accordingly, the two signal traces are merged (green and orange lines). An alternative signal (red dot) has a larger distance to the origin signal. (D) Final result of the tracking algorithm that will be used for the next iteration.

3.1.5. Assembly of tracking results over data windows

The assignment of the 10 s long signal traces obtained by the tracking algorithm from 30 s long data windows (Figure 6A) to preceding tracking steps (Figure 6B) proceeds, similar to the algorithm described above, based on the smallest distances between them.

First, the distances between those signals $\alpha$ of already established fish identities that lie within the first 10 s of the current tracking window and new signals $\beta$ from the central 10 s of the current tracking window are computed. Then, starting with the pair with the smallest distance, the new signal trace containing signal $\beta$ (for example, the green dot in Figure 6C) is connected to the established signal trace (from previous tracking steps) containing signal $\alpha$ (for example, the black dot in Figure 6C). This step is repeated with signal pairs of increasing distance until all possible connections are established (Figure 6D).

The described tracking within a data window and the subsequent assignment to previously established fish identities is repeated with data windows shifted by 10 s until the end of the recording is reached. In each iteration, the distance cube is updated. The first layers corresponding to the first 10 s of the previous tracking window are removed (frontal gray layers in Figure 4) and new layers for the next 10 s beyond the last tracking window are appended to the distance cube in preparation for the next iteration of tracking.
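The windowing scheme can be summarized by a small generator that yields each 30 s tracking window together with its central 10 s section. This is a sketch with hypothetical names. How the very first and last 10 s of a recording are handled is not described in the text, so this sketch simply yields full windows only, which is an assumption.

```python
def tracking_windows(recording_duration_s, window_s=30.0, shift_s=10.0):
    """Yield (window_start, window_end, valid_start, valid_end) in seconds.

    Only connections within the central section [valid_start, valid_end)
    are kept and appended to previously established traces.
    """
    margin_s = (window_s - shift_s) / 2.0   # 10 s before and after the central section
    start = 0.0
    while start + window_s <= recording_duration_s:   # assumption: full windows only
        yield (start, start + window_s,
               start + margin_s, start + margin_s + shift_s)
        start += shift_s


for window in tracking_windows(60.0):
    print(window)
```

Because the shift equals the length of the central section, consecutive central sections tile the recording without gaps or overlap.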

3.2. GUI for checking and correcting tracking results

Even though the introduced algorithm is capable of resolving most tracking conflicts correctly when tracking EODs of wave-type electric fish, occasional tracking errors still remain. We developed a GUI that allows the user to visually inspect and validate tracked EOD frequency traces and to fix flawed connections (Figure 7). Flawed connections can easily be identified by their clear deviation from the spectrogram displayed in the background. Furthermore, signal traces with a detection gap beyond the temporal threshold of $\Delta t_\mathrm{thresh} = 10$ s of the tracking algorithm can be manually connected based on visual cues from the spectrogram. The resulting validated signal traces are then stored and further analyzed (e.g., Raab et al., 2019, 2021).

Figure 7. Graphical user interface for validating and fixing tracking results. The user is presented with the tracked signal traces (EOD frequency traces) displayed on top of a spectrogram summed up across recording electrodes. The user can delete, cut, and connect signal traces or delete signals not originating from electric fish.

4. Results

The complexity of the data set we recorded in Colombia in 2016 led us to the development of the presented tracking algorithm. The high density of fish in this data set (about 25 fish within 3.5 × 3.5 m²) resulted in many individual EOD frequency traces that were often very similar in frequency and frequently crossed each other, in particular in the context of communication (Figure 9). This severely challenged previous tracking approaches (Madhav et al., 2018; Henninger et al., 2020), thus a better tracking algorithm was required. The improved algorithm resolves many tracking issues resulting from crossing EOD frequency traces and facilitates the evaluation of wave-type electric fish recordings even in abundant populations. In the following we evaluate the performance of the developed algorithm and highlight how it can be used to advance our knowledge about the behavior of freely moving and interacting electric fish by facilitating laboratory studies as well as natural field observations.

4.1. Performance of the tracking algorithm

In order to quantify the performance of the presented tracking algorithm, we evaluated potential tracking conflicts that occurred during the analysis of a dataset we recorded with an 8 × 8 electrode array in Colombia during the day of April 10th, 2016, for 10 h 50 min. First, we tracked the fish with the presented algorithm and then visually inspected, corrected, and validated the tracking results using the GUI (Figure 7). Second, we ran the tracking algorithm again and compared the connections made by the algorithm with the manually corrected ones. That is, for each signal αi we inspected all possible connections with a signal βj (one row in the distance cube) within the central 10 s of the current tracking window. If all the βj for a given αi were assigned to the same fish identity in the visually corrected tracking results, there was no potential conflict and these connections were not considered further for quantifying the performance of the algorithm, because these are the simple cases with a single fish within the maximum EOD frequency difference, Δfthresh, of 2.5 Hz. If, however, the possible connections involved two or more fish identities, a tracking conflict was possible. For each such potential tracking conflict, we extracted the EOD frequency difference Δf, Equation (2), field difference ΔS, Equation (6), frequency error εf, Equation (4), field error εS, Equation (7), and resulting distance measure ε, Equation (3), between the signal αi and the best signal partner βj (the one with the smallest distance ε) associated with the same fish identity as in the visually corrected signal traces (true connection), as well as between the signal αi and the best signal partner βj belonging to a different fish identity (false connection). Further fish identities of the βj with larger distances were ignored.
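
The selection of true and false connections described above can be illustrated with a minimal sketch. It is not the code used for the analysis: the function name `best_true_and_false`, the use of NaN for impossible connections, and the decision to skip cases in which no βj carries the identity of αi are assumptions made for illustration only.

```python
import numpy as np

def best_true_and_false(distances, beta_ids, alpha_id):
    """Pick the best true and best false connection for one signal alpha_i.

    distances : distance measure between alpha_i and every candidate beta_j
                (one row of the distance cube); NaN marks impossible
                connections (assumed convention).
    beta_ids  : validated fish identity of every beta_j.
    alpha_id  : validated fish identity of alpha_i.

    Returns (best_true, best_false), or None if there is no potential
    conflict to evaluate.
    """
    distances = np.asarray(distances, dtype=float)
    beta_ids = np.asarray(beta_ids)

    # Keep only connections that are possible at all.
    valid = ~np.isnan(distances)
    d = distances[valid]
    ids = beta_ids[valid]

    same = ids == alpha_id
    # No conflict if all candidates share the identity of alpha_i.
    # Assumption: cases without any candidate of that identity are skipped.
    if same.all() or not same.any():
        return None

    # Smallest distance within the same identity (true connection) and
    # smallest distance among all other identities (false connection).
    return float(d[same].min()), float(d[~same].min())

# Hypothetical example: alpha_i belongs to fish 3, candidates to fish 3 and 7.
conflict = best_true_and_false([0.2, np.nan, 0.5, 0.1, 0.9], [3, 3, 7, 7, 3], 3)
simple = best_true_and_false([0.2, 0.4], [3, 3], 3)
```

In this made-up example the best false connection (0.1) has a smaller distance than the best true connection (0.2), which is the kind of case that would lead to a tracking error.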

In order to assess the performance of each signal feature difference (Δf and ΔS) and distance measure (εf, εS, ε) in separating true from false connections, we computed the fraction of signal differences or errors of true connections that were smaller than those of the corresponding false connections. If this fraction were 100%, the tracking algorithm would always have connected the right signals. In addition, we quantified the overlap of the two distributions by the area under the curve (AUC) of a receiver-operating characteristic (ROC). Even with an overlap (low AUC values), 100% correct connections would in principle be possible, because each true connection is only compared with its corresponding false connection. An overlap demonstrates, however, that fixed decision thresholds are not feasible.
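
The two performance measures can be sketched as follows. This is a minimal illustration rather than the analysis code: the function names and the example values are hypothetical, and the AUC is computed through its rank-based (Mann-Whitney) formulation instead of an explicit ROC curve.

```python
import numpy as np

def fraction_true_smaller(true_vals, false_vals):
    """Fraction of conflicts in which the true connection has the smaller
    value than its corresponding (paired) false connection."""
    t = np.asarray(true_vals, dtype=float)
    f = np.asarray(false_vals, dtype=float)
    return float(np.mean(t < f))

def roc_auc(true_vals, false_vals):
    """AUC of the ROC, computed as the probability that a randomly drawn
    true value is smaller than a randomly drawn false value (ties count
    as one half). Compares all pairs, so memory grows with n_true * n_false."""
    t = np.asarray(true_vals, dtype=float)[:, None]
    f = np.asarray(false_vals, dtype=float)[None, :]
    return float(np.mean((t < f) + 0.5 * (t == f)))

# Hypothetical distances for three conflicts (paired by position).
true_d = [0.1, 0.2, 0.6]
false_d = [0.5, 0.3, 0.7]

frac = fraction_true_smaller(true_d, false_d)  # every pair is resolved correctly
auc = roc_auc(true_d, false_d)                 # but the distributions overlap
```

In this example every true connection is smaller than its own false alternative, although the pooled distributions overlap (AUC of 7/9). No single threshold would separate the two groups, yet a pairwise comparison still picks the right connection in each conflict.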

We start by evaluating the 464 tracking conflicts from a 5 min snippet that is especially challenging to track because of several crossings of EOD frequency traces (Figure 8). A small frequency range of this 5 min data snippet is displayed in Figure 7. The least reliable tracking feature appears to be the difference in EOD frequency (Δf and εf). Frequency differences of true connections were smaller than those of false connections in only 94.83% (440/464) of the cases (Figures 8A,C). Better results can be achieved with the field difference (ΔS and εS) as a tracking feature. The field differences of true connections were smaller in 99.57% (462/464) of the cases (Figures 8B,D). However, this performance can be improved further by using the distance measure, ε, that combines both the frequency error, εf, and the field error, εS. In 99.87% (462/464) of the tracking conflicts, true connections had smaller distances than false connections (Figure 8E). The AUC values for all measures were similar to the fractions of correct connections (Δf and εf: AUC = 95.16%, ΔS and εS: AUC = 99.77%, ε: AUC = 99.86%), indicating a small but existing overlap between the two distributions.

Figure 8. Performance of the tracking algorithm. Conflicts appear if signals could be connected to multiple different fish identities, as determined from the manually corrected and post hoc checked tracking results (Figure 7). In most but not all cases, correct connections have smaller signal differences or errors (blue) than wrong connections (red). Shown are kernel density estimates (KDE) for the various signal differences, errors, and distances. The overlap of the distributions was quantified by the AUC of an ROC analysis as indicated in the right column. (A) EOD frequency differences, Δf, Equation (2). A logistic function, Equation (4) (black line), translates EOD frequency differences to frequency errors, εf. (B) Field differences, ΔS, Equation (6). The cumulative distribution (black line) of field differences of all pairings, not only from conflicts, translates field differences to field errors, εS, Equation (7). (C) Frequency error, εf, Equation (4). (D) Field error, εS, Equation (7). (E) Combined distance measure, ε, Equation (3). Note that frequency and field errors (C,D) are mapped via monotonically increasing functions from signal differences (A,B) and thus result in the same fraction of correct connections and AUC values. However, the distance measure combining both field and frequency error performs best.

The 261,344 tracking conflicts of the whole recording yield similar results. However, the higher proportion of “easy” tracking conflicts increased the performance of the various features in general and differences between them were less pronounced. Nevertheless, EOD frequency still performed worse (99.73% correct connections) than field difference (99.81% correct connections). Again, combining both into the distance measure, Equation (3), resulted in the best performance (99.95% correct connections). Correspondingly, the overlap between the two distributions was reduced (Δf and εf: AUC = 99.79%, ΔS and εS: AUC = 99.85%, ε: AUC = 99.98%).

In order to put these high numbers in perspective, we estimate the time required to post-process signal traces obtained for the whole dataset recorded during the day of April 10th, 2016, in Colombia (including 261,344 potential tracking conflicts) when using (i) only frequency difference, (ii) only electric field difference, or (iii) the combined distance measure ε as tracking parameter. Finding and correcting a single tracking error using our GUI (Figure 7) requires about 15 s (personal experience). Accordingly, post-processing signal traces of the whole recording would require about 3 h when frequency alone is used as tracking feature, 2 h when the field difference alone is used, and only about 30 min when our combined distance measure ε is used. However, note that the dataset used here to illustrate the performance of the algorithm is, to our knowledge, the most complex ever recorded. With decreasing complexity, i.e., fewer fish in a recording, the number of potential tracking conflicts, and thereby the required post-processing time, decreases rapidly.
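
These estimates follow directly from the number of potential conflicts, the fractions of correct connections reported above, and the assumed 15 s per correction. The following sketch reproduces the arithmetic; the function and variable names are hypothetical.

```python
N_CONFLICTS = 261_344    # potential tracking conflicts in the whole recording
SECONDS_PER_FIX = 15.0   # approximate time to find and correct one error

def postprocessing_hours(fraction_correct,
                         n_conflicts=N_CONFLICTS,
                         seconds_per_fix=SECONDS_PER_FIX):
    """Estimated manual post-processing time in hours."""
    n_errors = (1.0 - fraction_correct) * n_conflicts
    return n_errors * seconds_per_fix / 3600.0

# Fractions of correct connections for the whole recording.
fractions = {
    "frequency difference": 0.9973,
    "field difference": 0.9981,
    "combined distance": 0.9995,
}

estimates = {name: postprocessing_hours(p) for name, p in fractions.items()}

for name, hours in estimates.items():
    print(f"{name}: {hours:.2f} h ({hours * 60:.0f} min)")
```

With these inputs the estimates come out at roughly 2.9 h, 2.1 h, and 0.5 h, matching the rounded values given in the text.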

Furthermore, a major advancement of the presented algorithm is the tracking process itself, i.e., tracking signals in discrete tracking windows according to the similarity of signal pairs (Figure 5). However, this advancement was validated only by human observers, since re-implementing previous tracking approaches would be too demanding for the sole purpose of comparing accuracy.

4.2. Applications of the developed algorithm

By means of the developed algorithm we were able, for the first time, to track electric signals of individual fish for multiple consecutive days in a natural, high density population of A. leptorhynchus recorded in a stream in Colombia (Figure 9). This allowed for novel insights into the natural behavior of these fish in the wild, including their communication and m
