Task-related modulation in the monkey inferotemporal cortex
Introduction
The inferotemporal cortex (IT) of the monkey is the last stage in the ventral visual stream (Ungerleider and Mishkin, 1982). The cells in this area exhibit stimulus selectivity (Desimone et al. 1984; Logothetis and Sheinberg 1996; Tanaka 1996) and are essential for the recognition of complex stimuli (Dean, 1976). Behavioral modifications of the neuronal responses in the different cortical visual areas have recently received considerable attention (Reynolds and Chelazzi, 2004). A recent study reported that behavioral response latencies are predictable on the basis of the degree of gamma-band synchronization and are accompanied by reduced neuronal response latencies in V4 (Womelsdorf et al., 2006). Tanaka et al. (2001a) concluded that global and local attention activate the posterior and anterior IT cortices differently. Socially relevant cues are known to be processed more rapidly in the IT (Kiani et al., 2005), but less is known about how IT cortical neurons change their activity when behaviorally relevant stimuli are presented to the animals (Fuster and Jervey 1981; Richmond and Sato 1987; Vogels et al. 1995). In this study we examined whether the behavioral relevance of a stimulus alters the response of IT neurons to that stimulus.#
Results
Two monkeys were engaged in a fixation task and in a recognition task (see the Experimental procedure) involving 20 color stimuli while single cell activity was recorded in the IT. The two tasks were run in blocks; every stimulus was presented at least 10 times in a semi-random sequence, so one block consisted of >200 trials.#
The activities of 110 IT cells were recorded from the two monkeys. Cells for which our algorithm failed to determine the exact latency were excluded from the analysis. Our study is based on the data from the remaining 87 neurons (Table 1).#
In the recognition task the monkeys achieved an average correct performance rate of 91.13%.#
There were no inter-individual differences between the two monkeys in either task in the baseline activity levels or in the mean responses (t-test for independent samples, not significant). Comparison of the neuronal activities did not reveal task-related differences in either the baseline activity or the response to visual stimulation in monkey C (paired t-test; p=0.584 (n.s.) and p=0.991 (n.s.); n=50). In monkey S, however, there was a small but significant difference in the baseline (8.1 spikes/s vs. 9.1 spikes/s; p=0.047), but not in the response level (p=0.347 (n.s.); n=37).#
SP, used as a measure of stimulus selectivity, ranged between 0.50 and 0.54 across monkeys and tasks. Again, there were no statistical differences.#
The main finding of our study was that, while there was no difference between the firing rates or the SPs, the response latencies were shorter in the recognition task than in the fixation task. This latency reduction was observed during recording from the same cell, with the same stimuli, while the animals performed the two tasks. Fig. 1 presents 4 cells as examples.#
Similar differences were found at the population level. The population histogram of all 110 cells demonstrates a slight difference in the baseline activity, an earlier response onset and an earlier decline of the response in the recognition task (Fig. 2). The distributions of the spike times for all 110 cells in the two tasks were significantly different (Kolmogorov–Smirnov test, fixation task vs. recognition task, p<0.0001). The average latency for all cells was 144.9 ms in the fixation task, and 135.0 ms in the recognition task (paired t-test; p<0.0001). The mean latencies across cells in the fixation task and the recognition task were 132.6 vs. 122.4 ms, respectively, in monkey C and 168.3 vs. 159.1 ms in monkey S.#
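As a quick consistency check, the pooled latencies can be recomputed from the per-monkey means as a cell-count-weighted average. The sketch below assumes the cell counts listed in Table 1 (57 for monkey C, 30 for monkey S); the function and variable names are illustrative only and are not part of the original analysis.

```python
# Per-monkey mean latencies (ms) as reported in the text and in Table 1.
FIXATION_MS = {"C": 132.6, "S": 168.3}
RECOGNITION_MS = {"C": 122.4, "S": 159.1}

# Assumption: cell counts taken from Table 1.
CELL_COUNTS = {"C": 57, "S": 30}


def pooled_mean(means, counts):
    """Average per-monkey means, weighting each by its number of cells."""
    total_cells = sum(counts.values())
    return sum(means[m] * counts[m] for m in counts) / total_cells


pooled_fix = pooled_mean(FIXATION_MS, CELL_COUNTS)
pooled_rec = pooled_mean(RECOGNITION_MS, CELL_COUNTS)

print(f"fixation:    {pooled_fix:.2f} ms")
print(f"recognition: {pooled_rec:.2f} ms")
print(f"difference:  {pooled_rec - pooled_fix:.2f} ms")
```

With these inputs the weighted means come out at roughly 144.9 ms and 135.1 ms, within rounding of the pooled values quoted above.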
Fig. 3 presents a scatterplot of the individual latency data of the two monkeys in the two tasks. The majority of the values are below the line representing equal latencies, indicating shorter values for the recognition task as compared with the fixation task. Fig. 4 depicts the distribution of the differences of the pooled latency values for the two monkeys in the two tasks. The distribution is shifted to negative values (mean: −9.87 ms), indicating shorter latencies for the recognition task than for the fixation task.#
Discussion
We found that the neuronal response latencies in the IT of awake, behaving monkeys were shorter in a recognition task than in a fixation task. The sequence of the tasks cannot explain this effect because the tasks were run in an alternating fashion. Moreover, one or the other task (or even both) was sometimes repeated to ensure that the responses from the same cell had indeed been recorded.#
The difference in latencies could be caused by two factors: either by the change in color of the fixation spot between the two tasks or by the tasks themselves. The fixation spot is small (diameter: 6 min of arc; radius=3 pixels), the animal has to observe the whole stimulus, measuring 5×6°, and the fixation spot has less luminance in the recognition task than in the fixation task. For these reasons, we consider it very unlikely that such a small area of the stimulus (even if it changes color) could be responsible for the effects, although with the methods we used this cannot be ruled out completely.#
The attentional modulation of cellular responses has been reported in various cortical areas, e.g., in the MT (Martinez-Trujillo and Treue, 2004), the V1 (Roelfsema et al., 1998) and the IT (Moran and Desimone 1985; Desimone and Ungerleider 1986) but as far as we are aware, there have been no reports of a task-related reduction of latencies at the single cell level.#
There are a number of ways to evaluate neuronal response latency data (Sato 1988; Azouz and Gray 1999; Baylis et al. 1987; Rolls et al. 1993; Roelfsema et al. 1998; Liu and Richmond 2000; Tamura and Tanaka 2001; Edwards et al. 2003; Friedman and Priebe 1999; Hanes et al. 1995; Thompson et al. 1996; Sary et al. 2004; Kovacs et al. 2003; Commenges and Seal 1985). The typical latency values in the IT are around 100 ms, depending on the (in some cases undefined) method, though values around 150 ms have also been reported (Tanaka et al., 2001b). In this study, we concentrated on the biological aspects of the phenomenon rather than the mathematical details. We needed a robust and reliable method with which to calculate neuronal response latencies. Poisson spike train analysis yielded stable results, which fitted well with the estimations derived from inspection of the PSTHs (Sato 1988; Azouz and Gray 1999). Moreover, for spike trains where the latency value would have been dubious, it indicated so (it did not return a value).#
Poisson spike train analysis operates on a trial-by-trial basis, looking for possible modulation of the firing rate. Evidently, this mostly occurs at the onset/offset of the visual stimuli. Across several trials, an array of modulation times is used to calculate the latency values, thereby making the method sensitive enough to detect small differences (see also Fig. 5).#
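To make the idea of detecting a firing-rate modulation within a single trial more concrete, the sketch below computes a Poisson "surprise index" for a candidate burst. It uses one commonly used definition, S = −log P, where P is the probability that a Poisson process at the baseline rate would produce at least the observed number of spikes in the same interval. The exact variant used by the authors is not specified here, so the function names and the example numbers are assumptions.

```python
import math


def poisson_tail(n_spikes, expected):
    """Return P(X >= n_spikes) for X ~ Poisson(expected)."""
    if n_spikes <= 0:
        return 1.0
    # Accumulate P(X < n_spikes) term by term, then take the complement.
    term = math.exp(-expected)  # P(X = 0)
    lower = term
    for k in range(1, n_spikes):
        term *= expected / k    # P(X = k) from P(X = k - 1)
        lower += term
    # Floor the result so that rounding never produces log(0).
    return max(1.0 - lower, 1e-300)


def surprise_index(n_spikes, duration_s, baseline_rate_hz):
    """Surprise of seeing n_spikes in duration_s given the baseline rate."""
    expected = baseline_rate_hz * duration_s
    return -math.log(poisson_tail(n_spikes, expected))


# Hypothetical example: baseline of 8 spikes/s, 50-ms candidate interval.
weak = surprise_index(2, 0.05, 8.0)
strong = surprise_index(6, 0.05, 8.0)
print(f"2 spikes in 50 ms: S = {weak:.2f}")
print(f"6 spikes in 50 ms: S = {strong:.2f}")
```

A higher value means the interval is less compatible with baseline firing; a detection threshold on S would be a further parameter that is not given in the text.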
The color cue in the recognition task in our experiment signifies that the stimulus requires cognitive processing and appropriate selection of action from the monkey. In contrast, in the case of the fixation cue, no such processing is anticipated. The question may arise whether our monkeys covertly performed the recognition task during the fixation task, and how this would affect our data. Indeed, during the first trials of the fixation task, we observed eye movements revealing that the monkeys were attempting to do the recognition task. These attempts always appeared during the first few trials and disappeared very quickly since the animals were not rewarded for them. Each cell in each task participated in at least 200 successful trials, and thus the effect of the first few incorrect trials might not show up at all. Furthermore, any “mixing” of the two tasks would reduce the magnitude of any task-related effects; hence the differences reported in our study may be only lower-bound estimates of the real differences.#
Wherever the modulation originates from, the unchanged sparseness indices show that it does not change the selectivity of the neurons. The selectivity of the IT arises through V4, TEO and special local inhibitions in TE (Wang et al., 2000). A modulatory influence could act on the early visual pathways, or it could be a diffuse input to the abovementioned areas, acting uniformly on most neurons of the IT and thus not changing the selectivity of this area. The lack of altered selectivity could be compensated for by a temporal advantage, which might be more beneficial for the organism. Attention to a particular object promotes a quicker reaction in a situation requiring action, and it can therefore be expected to shorten the latency times in the areas responsible for the processing of information important for the given action. We believe that this is the first documentation of the behavioral modulation of response latencies in the IT. A 10-ms advantage was provided on the sensory side of the see–compute–react loop, and the individual might therefore react faster and have more time to decide on the appropriate action to take.#
Experimental procedure
Behavioral tasks
Details of the surgical procedures are to be found in recent publications (Kovacs et al. 2003; Tompa et al. 2005). Two animals (monkeys C and S) were trained to perform a fixation task and a recognition task with a set of 20 color images (Tompa et al., 2005). Stimuli were presented on a uniform gray background rectangle (side: 18°, luminance=8 cd/m2) positioned in the center of the screen. The 20 stimuli were simple geometrical images filled with a colored, textured pattern or photos of complex, natural or artificial objects. The stimuli occupied the same bounding box of 6×5° and had a mean luminance of 7.9 cd/m2 (SD=5.6 cd/m2). Stimuli were presented centrally.#
For the recognition task, during a training period 10 stimuli were associated with a left-side saccade, while the other 10 stimuli were associated with a right-side saccade.#
The sequence of events in the two tasks was as follows (Fig. 6). In the fixation task, a red fixation spot (arc diameter=6 min or radius=3 pixels, and luminance=5.5 cd/m2) was followed by a gray background (500 ms) and a stimulus (500 ms). The fixation spot remained on the screen for a further 100–300 ms. The animals were rewarded for keeping their gaze on the fixation spot. In the recognition task, the trials started with the onset of a blue fixation spot (same size, luminance: 3 cd/m2) followed by a gray background (500 ms) and the same images (500 ms) as in the fixation task. After the visual stimulus had been switched off, two target dots appeared on the sides of the screen. On the basis of the previous training, the animals had to decide whether the stimulus just seen belonged to the left or to the right target. The animals were rewarded for making a saccade to the correct side. Accordingly, the only differences between the tasks were the color of the fixation spot and the behavioral response requirement. The differences between the stimuli in the fixation task and the recognition task were so small that below we refer to them as the same (see also Fig. 6 and the Discussion). Eye movements were recorded with the scleral search coil method (Judge et al., 1980); the size of the fixation window was 0.5×0.5°. The tasks were presented in blocks, the sequence of which was randomized. The tasks were started in half of the cases with the fixation task and in the other half with the recognition task. If time permitted, the first task was repeated to ensure that the same neuron was recorded (see also the Discussion). For each neuron in each task, the 20 stimuli were presented at least 10 times (i.e., a minimum of 200 successful trials per task). If the monkey broke fixation or responded incorrectly, the trial was aborted. Only data from fully and correctly completed trials were included in the analysis.#
Recording and data analysis
Neuronal activity was recorded with standard electrophysiological methods, using tungsten electrodes (1–3 MΩ, FHC). Signals were amplified, bandpass filtered (FHC) and collected by using custom-made software. Analysis was performed off-line. An effective stimulus in the fixation task was selected for each cell, and the corresponding response was taken from the recognition task. Thus, the neuronal activities of the same cells in response to the same images in the two different tasks were compared and analyzed. The neuronal responses were analyzed in two time windows, one for baseline activity, from −300 ms to 0 ms (0 meaning the time of the stimulus onset), and the other for the response (100–400 ms). For the responses to the visual stimuli, net firing rates were taken: the baseline activity was subtracted from the spike count recorded during stimulus presentation. Net neuronal responses were compared by means of the t-test for dependent samples. For determination and comparison of the selectivity, the sparseness index (SP) was used (Rolls and Tovee, 1995). Latency times were calculated by a Poisson spike train analysis (Hanes et al., 1995) and were compared by using t-tests for dependent samples.#
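The two analysis windows described above can be illustrated with a minimal sketch that computes a net firing rate for one trial. It assumes spike times are given in milliseconds relative to stimulus onset and expresses both windows as rates in spikes/s before subtracting; the function name and the example spike train are hypothetical.

```python
def net_firing_rate(spike_times_ms,
                    baseline_window=(-300.0, 0.0),
                    response_window=(100.0, 400.0)):
    """Response-window rate minus baseline-window rate, in spikes/s.

    Spike times are in ms relative to stimulus onset (0 = onset).
    """
    def rate(window):
        start, end = window
        count = sum(1 for t in spike_times_ms if start <= t < end)
        return count / ((end - start) / 1000.0)

    return rate(response_window) - rate(baseline_window)


# Hypothetical trial: 3 baseline spikes, 12 spikes in the response window,
# and 2 spikes that fall outside both windows (50 ms and 450 ms).
trial = [-250, -150, -50, 50] + list(range(110, 350, 20)) + [450]
print(net_firing_rate(trial))  # 40 spikes/s response - 10 spikes/s baseline
```

Because both windows are 300 ms long, subtracting rates is equivalent, up to a constant factor, to subtracting spike counts.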
The Poisson spike train analysis described by Hanes et al. (1995) and Thompson et al. (1996) is a two-step process. In stage one, the intensity parameter of the Poisson process is estimated from the baseline activity and the starting times, and probabilities (“surprise indices”) of significant deviations (bursts) are detected. Stage two calculates a single latency time from these values. Since we had 10 repetitions per stimulus, stage two of the original method reduced to the degenerate case: this gave back the average of the first and the third earliest burst starting times. Hence we modified the algorithm: for each trial, we took the beginning of the first activation after the stimulus onset and corrected it with a weighted sum of the times of occurrence of the spikes preceding it (but still after stimulus presentation). The weights used were the reciprocals of the numbers of spikes after the given spike:

$$t_{\mathrm{lat}} = t_{\mathrm{burst1}} - \sum_{i=1}^{n} \frac{t_i}{N-i}$$

where $t_{\mathrm{lat}}$ is the latency, $t_{\mathrm{burst1}}$ is the beginning of the first activation, $t_i$ are the arrival times of spikes between the stimulus onset and $t_{\mathrm{burst1}}$, $n$ is the number of spikes between the stimulus onset and $t_{\mathrm{burst1}}$, and $N$ is the number of spikes after the stimulus onset.#
The latency values for each stimulus were the trial-wise averages of these values. This heuristic was remarkably stable and provided latencies consistent with the PSTHs for the widest range of cells (Fig. 5).#
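The per-trial correction and the trial-wise averaging can be sketched as follows. The code reads the formula as: the onset of the first activation minus the sum, over the n post-stimulus spikes preceding that onset, of each spike time divided by (N − i), with N the number of spikes after stimulus onset. Burst detection itself is not implemented; the burst onset is passed in as an input, and all names and example values are assumptions made for illustration.

```python
from statistics import mean


def corrected_latency(spike_times_ms, burst_onset_ms):
    """Latency for one trial: burst onset minus a weighted sum of the
    post-stimulus spike times that precede the burst.

    Spike times are in ms relative to stimulus onset (0 = onset).
    """
    post = sorted(t for t in spike_times_ms if t > 0)   # spikes after onset
    n_total = len(post)                                  # N
    early = [t for t in post if t < burst_onset_ms]      # t_1 .. t_n
    if len(early) >= n_total:
        raise ValueError("burst onset must be followed by at least one spike")
    # Weight of the i-th spike: reciprocal of the number of spikes after it.
    correction = sum(t / (n_total - i) for i, t in enumerate(early, start=1))
    return burst_onset_ms - correction


def stimulus_latency(trials):
    """Trial-wise average; trials without a detected burst (None) are skipped."""
    values = [corrected_latency(spikes, onset)
              for spikes, onset in trials if onset is not None]
    return mean(values)


# Hypothetical trial: two stray spikes (20, 60 ms) before a burst at 120 ms.
spikes = [-50, 20, 60, 120, 125, 130, 140, 150]
print(round(corrected_latency(spikes, 120), 2))  # 120 - (20/6 + 60/5)
```

In this example N = 7 and n = 2, so the correction is 20/6 + 60/5 ≈ 15.33 ms and the corrected latency is about 104.67 ms.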
SP is a measure of the proportion of effective stimuli based on the response to each of the 20 stimuli. It indicates the length of the tail of the distribution of the net firing rates for the different stimuli. Low values indicate a long tail of the distribution with only a few stimuli with high response rates. SP for n stimuli is computed by using the following formula:

$$SP = \frac{\left[\sum_{i=1}^{n} R_i/n\right]^2}{\sum_{i=1}^{n} R_i^2/n}$$

where $R_i$ is the response to the ith stimulus of a stimulus set containing n stimuli. SP may range up to 1.0, indicating the case when a neuron responds similarly to all of the stimuli in the stimulus set. Only firing rates during the stimulus presentation interval were used in the calculation of SP, and the negative net responses were clipped to zero.#
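A minimal sketch of the SP computation is given below, following the formula above and clipping negative net responses to zero as described. The function name and the two example response profiles are hypothetical.

```python
def sparseness_index(net_rates):
    """Sparseness index: (mean response)^2 / (mean squared response).

    Negative net responses are clipped to zero before the calculation.
    """
    clipped = [max(r, 0.0) for r in net_rates]
    n = len(clipped)
    mean_rate = sum(clipped) / n
    mean_square = sum(r * r for r in clipped) / n
    if mean_square == 0.0:
        raise ValueError("SP is undefined when all clipped responses are zero")
    return mean_rate ** 2 / mean_square


# Hypothetical profiles for a 20-stimulus set.
uniform = [5.0] * 20             # responds equally to every stimulus
selective = [10.0] + [0.0] * 19  # responds to a single stimulus only
print(sparseness_index(uniform))
print(sparseness_index(selective))
```

The uniform profile gives the upper bound of 1.0, while a cell driven by a single stimulus gives 1/n (0.05 for 20 stimuli), the lowest value attainable with non-negative responses.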
All statistical comparisons were considered significant if the corresponding p value was less than 0.05. Both animals are still participating in experiments and histological verification of the recording sites is therefore not available at this time; however, on the basis of CT images, the neuronal responses, the alternation of white and gray matter during the movement of the electrode, the selectivity for complex color stimuli, the response latency values and previous histology made on our monkeys with similar parameters, we are confident that the recordings were made in the anterior part of the IT, in area TE, both from the lower bank of STS and the lateral part of TE (just before the electrode tip reached the bone). All procedures conformed to the guidelines of the NIH and of the Animal Welfare Committee of the University of Szeged.#
Acknowledgments
This work was supported by the following grants: OTKA T-042610, OTKA-F048396 and ETT 429/2003. The authors thank L. Gehér for his comments on the Poisson analysis, G. Dósai and P. Liszli for their technical assistance, K. Hermann for maintaining the laboratory equipment, J. Kóródi for taking care of the laboratory animals, M. Janáky and T. Gyetvai for their help in the eye surgery, and E. Vörös for the NMR and CT images.#
Supplementary data
Supplementary data associated with this article can be found, in the online version, at doi:10.1016/j.brainres.2006.08.106.#
Figures and Tables
Table 1
| Monkey and task (no. of cells) | Mean (SEM) of latency [ms] | Mean (SEM) of baseline [spikes/s] | Mean (SEM) of net response [spikes/s] | Mean (SEM) of SP |
| C FIX (57) | 132.6 (3.9) | 7.7 (0.18) | 29.3 (2.37) | 0.52 (0.03) |
| C DIS (57) | 122.4 (4.5) | 8.0 (0.21) | 29.2 (2.58) | 0.50 (0.03) |
| S FIX (30) | 168.3 (8.4) | 8.1 (0.26) | 34.2 (4.91) | 0.51 (0.04) |
| S DIS (30) | 159.1 (6.4) | 9.1 (0.27) | 31.9 (3.84) | 0.54 (0.04) |
References
4. Dean, P., 1976. Effects of inferotemporal lesions on the behavior of monkeys. Psychol. Bull. 83, 41–71.
15. Logothetis, N.K., Sheinberg, D.L., 1996. Visual object recognition. Annu. Rev. Neurosci. 19, 577–621.
18. Reynolds, J.H., Chelazzi, L., 2004. Attentional modulation of visual processing. Annu. Rev. Neurosci. 27, 611–647.
26. Tanaka, K., 1996. Inferotemporal cortex and object vision. Annu. Rev. Neurosci. 19, 109–139.