Understanding subprocesses of working memory through the lens of model-based cognitive neuroscience

Working memory (WM) refers to a set of processes that makes task-relevant information accessible to higher-level cognitive processes. Recent work suggests WM is supported by a variety of information gating, updating, and removal processes, which ensure only task-relevant information occupies WM. Current neurocomputational theory suggests WM gating is accomplished via ‘go/no-go’ signalling in basal ganglia-thalamus-prefrontal cortex pathways, but is less clear about other subprocesses and brain structures known to play a role in WM. We review recent efforts to identify the neural basis of WM subprocesses using the recently developed reference-back task as a benchmark measure of WM subprocesses. Targets for future research using the methods of model-based cognitive neuroscience and novel extensions to the reference-back task are suggested.


Working memory and its subprocesses
Working memory (WM) refers to a set of processes that makes task-relevant information accessible to higher-level cognitive processes such as learning, decision making, reasoning, and reading comprehension [1][2][3]. Working memory is extremely capacity-limited, with current research suggesting that between one and four items (see footnote 4) can be maintained in an activated state in WM at a time [4][5][6][7]. This strict limit demands a high degree of control over WM content, such that WM must strike a balance between stability (i.e. protecting the current contents of WM from irrelevant or distracting information) and flexibility (i.e. keeping WM up-to-date with new relevant information and removing outdated information). This trade-off between stability and flexibility [8][9][10][11] is a core feature of executive control processes (e.g. cognitive control, conflict monitoring/resolution, task switching; [12]) and managing the trade-off strongly depends on the brain's dopamine systems [13,14].
Prominent computational theories suggest that WM resolves the stability-flexibility trade-off by operating in two modes: an updating (gate-open) mode, which allows new information to enter WM, and a maintenance (gate-closed) mode, which prevents irrelevant and distracting information from interfering with the current contents of WM [15][16][17][18][19][20][21][22]. In the gate-open mode, updating is further supported by two main subprocesses, item removal and item substitution, which together ensure that only relevant information is kept active in WM [23,24]. Together, these processes allow WM to alternate between a flexible mode (when new information is encountered) and a stable mode (when distractors are encountered). This enables successful performance in dynamic environments in which distractions are common and the relevance of information frequently changes.
To date, the most detailed neurocomputational account of the gating mechanism controlling the trade-off between updating and maintenance is the prefrontal cortex-basal ganglia WM (PBWM) model (Figure 1; [25][26][27]). In this model, gating is implemented via basal ganglia (BG)-thalamus-prefrontal cortex (PFC) circuits that control 'go/no-go' signalling. As illustrated in Figure 1, gate opening is controlled by a striatal 'go' signal which inhibits substantia nigra pars reticulata (SNr) and disinhibits thalamus, which in turn excites PFC. This allows information to enter WM and updating to occur. Gate closing (see footnote 5) is controlled by a striatal 'no-go' signal which inhibits external globus pallidus (GPe), disinhibits SNr, and inhibits thalamus. This in turn inhibits PFC, which prevents WM from being updated (Figure 1; [26]). In short, the 'go' signal passes through two inhibitory connections (striatum-SNr-thalamus), which excites PFC, while the 'no-go' signal passes through three inhibitory connections (striatum-GPe-SNr-thalamus), which inhibits PFC.
These circuits have also been implicated in updating value representations in reinforcement learning and value-based decision making, suggesting a general neural mechanism for accomplishing information gating ([16,28,29,26,20,30,22]). However, as will be discussed, recent work highlights that WM also depends on several important subprocesses not accounted for in the PBWM, and on neural substrates outside of the PBWM's BG-thalamus-PFC pathways. Modelling these processes and their neural basis is necessary to achieve a complete neurocomputational understanding of WM. This review discusses recent progress toward this goal. We focus on recent efforts to link brain measurements with behaviour on the reference-back task (Figure 2; [24,21]), a WM-based decision-making task that provides separate behavioural measures of gate opening and closing, as well as updating and substitution processes not accounted for in the PBWM. In doing so, we suggest that further progress can be made by applying the methods of model-based cognitive neuroscience [41,42], which links brain activity to behaviour via detailed computational models of cognitive and neural processes [43][44][45]. Model-based cognitive neuroscience generates detailed quantitative theories that span multiple levels of abstraction (e.g. behavioural, cognitive, neural). This provides greater constraint on theory and leads to more robust and detailed inferences. In particular, combining model-based approaches with developments in ultra-high field fMRI enables testing neurocomputational theories of WM (such as the PBWM) with greater spatial and psychometric precision than has previously been possible. Applying these methods to the reference-back task promises a more detailed neurocomputational understanding of WM than is currently available.

Footnote 4. We use the term 'item' to refer to an individual representation held in WM. 'Item' is thus synonymous with 'chunk' [97] and 'cognitive object' [98,7] which denote the same concept. There is ongoing debate about whether items in WM are represented in discrete slots (items held with high precision in a number of discrete memory locations), allocation of continuous resources (items allocated limited resources in inverse proportion to the total number of items in WM), or some hybrid of the two frameworks (e.g. Refs. [78,74,79,80]). The models and general approach that we discuss in this paper are not committed to either architecture but could be used to test between the competing accounts (see Section 'Current directions' below).

Footnote 5. The PBWM model assumes that WM sits in the 'gate-closed'/maintenance mode by default. We note that this assumption is likely too strong, since it implies that gate opening must always accompany updating. Under this assumption the PBWM would fail to predict the different gating costs to WM updating that occur in behavioural data (e.g. Refs. [24,21]).

Footnote 6. The PBWM model suggests a phasic dopaminergic signal from the midbrain dopamine structures only in the early phases of a WM task when the BG must learn when to update. Once WM updating rules are learned, BG nuclei no longer rely on a phasic dopaminergic response but control WM gating via the non-dopaminergic SNr. Any additional dopaminergic input reflects either reward associations or a feedback-based response which evaluates the updating process based on the reward prediction error coded by the same neurons [84]. This response, in the form of bursts and dips in dopaminergic release onto striatal neurons, is thought to reinforce 'go' and 'no-go' activation, respectively.
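
The sign logic of the 'go' and 'no-go' pathways described in this section can be sketched in a few lines of Python. This is only an illustration and not an implementation of the PBWM: reducing each connection to a sign of +1 (excitatory) or -1 (inhibitory), and the names `GO_PATHWAY`, `NO_GO_PATHWAY`, `net_effect_on_pfc`, and `gate_state`, are assumptions made for this example.

```python
# Sign-propagation sketch of the 'go' and 'no-go' pathways described in the text.
# Assumption: each connection is reduced to a sign, +1 (excitatory) or -1 (inhibitory).

INHIBITORY = -1
EXCITATORY = +1

# Connections as listed in the text; the final thalamus -> PFC link is excitatory.
GO_PATHWAY = [
    ("striatum", "SNr", INHIBITORY),
    ("SNr", "thalamus", INHIBITORY),
    ("thalamus", "PFC", EXCITATORY),
]
NO_GO_PATHWAY = [
    ("striatum", "GPe", INHIBITORY),
    ("GPe", "SNr", INHIBITORY),
    ("SNr", "thalamus", INHIBITORY),
    ("thalamus", "PFC", EXCITATORY),
]


def net_effect_on_pfc(pathway):
    """Multiply connection signs along the pathway: +1 excites PFC, -1 inhibits it."""
    sign = 1
    for _source, _target, connection_sign in pathway:
        sign *= connection_sign
    return sign


def gate_state(pathway):
    """Map the net effect on PFC onto the WM gate: excitation opens it, inhibition closes it."""
    return "open" if net_effect_on_pfc(pathway) > 0 else "closed"


for name, pathway in [("go", GO_PATHWAY), ("no-go", NO_GO_PATHWAY)]:
    print(name, "->", gate_state(pathway))
```

An even number of inhibitory links (the 'go' pathway) yields net excitation of PFC, whereas an odd number (the 'no-go' pathway) yields net inhibition, which is the disinhibition logic summarised in the text.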

Measuring WM subprocesses with the reference-back paradigm
Most laboratory tasks used to study WM (e.g. n-back, delayed-match-to-sample) are designed to investigate the capacity and temporal properties of WM but are unable to differentiate the contribution of WM subprocesses to observed behaviour ([24,46,47,48,21,22]). A recently developed exception is the reference-back task [24,21], which provides dissociable measures of core WM subprocesses (gate opening, gate closing, updating, substitution) from behavioural choice-response time (RT) data.
To perform the reference-back, participants hold one of two stimuli (e.g. an 'X' or 'O') in WM while deciding whether a series of probes match the current item in WM (Figure 2). On reference trials (indicated by a red frame around the stimulus), the participant must update WM with the currently displayed stimulus. On comparison trials (indicated by a blue frame), the participant simply compares the current stimulus to the one held in WM (the one appearing in the most recent red frame) without updating WM. Both reference and comparison trials require a same/different decision but only reference trials require updating. Comparing performance on reference and comparison trials thus provides a behavioural measure of the cost of updating. By similar logic, switching from comparison to reference trials requires opening the WM gate (to allow for updating), while switching from reference to comparison trials requires closing the WM gate (to maintain the current contents). Gate opening is measured by comparing trials on which participants switch towards a reference trial to those where reference trials are repeated. Likewise, gate closing is measured by comparing trials on which participants switch towards a comparison trial to those where comparison trials are repeated. Finally, substitution is measured via the interaction effect of trial type (reference/comparison) and match type (same/different) and represents the cost of updating a new item into WM.
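
The four behavioural contrasts described above can be written out as a minimal Python sketch. The trial format (dictionaries with `trial_type`, `switch`, `match`, and `rt` fields), the function names, and the toy RT values are all hypothetical and chosen only for illustration. Following the description in the text, updating is computed here from all reference versus all comparison trials; restricting that contrast to non-switch trials would be a reasonable alternative.

```python
from statistics import mean


def mean_rt(trials, **conditions):
    """Mean RT over the trials whose fields match every given condition."""
    selected = [
        trial["rt"]
        for trial in trials
        if all(trial[key] == value for key, value in conditions.items())
    ]
    return mean(selected)


def reference_back_costs(trials):
    """Compute the four behavioural costs as differences between condition means."""

    def mismatch_effect(trial_type):
        # RT difference between 'different' and 'same' probes within one trial type.
        return (mean_rt(trials, trial_type=trial_type, match="different")
                - mean_rt(trials, trial_type=trial_type, match="same"))

    return {
        # Reference versus comparison trials.
        "updating": (mean_rt(trials, trial_type="reference")
                     - mean_rt(trials, trial_type="comparison")),
        # Switching to a reference trial versus repeating a reference trial.
        "gate_opening": (mean_rt(trials, trial_type="reference", switch=True)
                         - mean_rt(trials, trial_type="reference", switch=False)),
        # Switching to a comparison trial versus repeating a comparison trial.
        "gate_closing": (mean_rt(trials, trial_type="comparison", switch=True)
                         - mean_rt(trials, trial_type="comparison", switch=False)),
        # Interaction of trial type and match type.
        "substitution": mismatch_effect("reference") - mismatch_effect("comparison"),
    }


# Hypothetical toy data: one trial per design cell, RT in milliseconds.
toy_trials = [
    {"trial_type": "reference", "switch": True, "match": "same", "rt": 700},
    {"trial_type": "reference", "switch": True, "match": "different", "rt": 780},
    {"trial_type": "reference", "switch": False, "match": "same", "rt": 620},
    {"trial_type": "reference", "switch": False, "match": "different", "rt": 700},
    {"trial_type": "comparison", "switch": True, "match": "same", "rt": 600},
    {"trial_type": "comparison", "switch": True, "match": "different", "rt": 620},
    {"trial_type": "comparison", "switch": False, "match": "same", "rt": 560},
    {"trial_type": "comparison", "switch": False, "match": "different", "rt": 580},
]

costs = reference_back_costs(toy_trials)
print(costs)
```

The same contrasts could be applied to error rates by replacing the `rt` field with an accuracy indicator.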
The benchmark behavioural finding from the reference-back task is that trials requiring additional WM processes tend to have slower RTs and/or more frequent errors than trials that do not require such processes [24,47,49,21,50,51,52]. These costs are typically interpreted as reflecting a combination of time required for additional subprocesses to run outside of the same/different decision stage, and subprocesses interfering with the primary task (e.g. creating noisier WM representations due to drawing attention/capacity away from the decision process) [53]. However, distinguishing these accounts requires detailed choice-RT models of the latent cognitive processes underlying memory-based decision making (e.g. the highly successful evidence accumulation framework, [54,55]), which are yet to be applied to the reference-back paradigm. Before discussing approaches to modelling the reference-back task, we first review recent efforts to identify the neural substrates of WM subprocesses by correlating brain activity with behavioural measures derived from the reference-back.

Figure 2. Illustration of the reference-back task. On each trial, participants indicate whether the presented letter is same or different from the letter in the most recent red frame. On reference (red frame) trials, participants must also update WM with the currently displayed letter. On comparison (blue frame) trials, participants make the same/different decision but do not update WM. Comparing behavioural outcomes (e.g. response time, error rate) between different trial types measures the cost of gate opening, gate closing, updating, and item substitution processes (see text for details). Explaining these behavioural phenomena via computational cognitive models and establishing further links to neural data is a key goal of current WM research. Adapted from Rac-Lubashevsky and Kessler [21] with permission.

Neural correlates of the reference-back task
Two initial studies investigated EEG correlates of the reference-back task. Rac-Lubashevsky and Kessler [51] found that gate closing was associated with increased theta power, a neural signature of cognitive control [56][57][58], while gate opening and updating were associated with increased delta power, a signature of reactive (event-driven) control and action selection processes that engage in response to reward prediction errors [59][60][61]. This suggests a functional role for delta and theta signals in the control of WM consistent with 'go/no-go' signalling in the PBWM model [25,39,26]. A follow-up study explored the role of the P3b EEG signal (a positive event-related potential that signals task-relevant events and peaks 300 ms after stimulus onset) in gating and updating [52]. P3b amplitude spiked depending on whether the stimulus matched the WM reference item, implicating P3b in stimulus comparison/categorisation processes rather than updating per se. Greater negative activity (in an N2-like ERP component unrelated to the P3b) was found in anterior cortical regions on reference versus comparison trials. This signal has been implicated in controlled inhibition and action selection [62] and, in the context of the reference-back task, likely reflects a gate-opening or updating signal, consistent with the PBWM's assumption that reference trials trigger an update or 'go' signal to allow new information into WM. This initial work demonstrates that neural signatures of specific updating and gating processes are detectable in EEG oscillatory signals that show activity broadly consistent with 'go/no-go' signalling in BG-thalamus-PFC pathways involved in WM gating [25,26]. However, the poor spatial resolution of EEG limits our ability to draw conclusions about the specific structures associated with each WM subprocess.
Extending this work, Nir-Cohen et al. [47] used 3T fMRI to identify neural substrates of WM subprocesses using a modified reference-back with more complex face-morph stimuli. BG, frontoparietal cortex, and task-relevant sensory areas such as visual cortex were involved in gate opening. Gate closing activated parietal cortex and substitution elicited activation in left dorsolateral PFC and inferior parietal lobule. A whole-brain conjunction analysis revealed shared activity in the supplementary motor area for updating and substitution, while updating and gating both activated the posterior parietal cortex. These results broadly agree with the PBWM model [26] and support the role of BG and PFC in controlling the flow of information into WM and replacing old with new information. However, parietal cortex activation during gate closing is not predicted by the PBWM. This suggests that additional brain structures are involved in controlling WM subprocesses and points to an opportunity to extend the PBWM to explain the neural basis of WM subprocesses beyond gate opening.
Jongkees [49] provided further evidence for the dopaminergic basis of WM gating and updating processes by administering dopamine precursor L-tyrosine to young adults and comparing reference-back performance to a placebo-control group. The L-tyrosine group had less variable gate opening times than placebo controls, suggesting that the drug improved WM performance for poor performers but impaired high performers. There was no effect on updating or gate closing, consistent with the role that the PBWM assigns to striatal dopamine signals in opening the gate to WM [25,26]. Further indirect support for striatal dopamine involvement comes from a study linking event-based eye-blink rate (a proxy measure of striatal dopamine) to WM updating in the reference-back task [50]. However, follow-up work combining this approach with ultra-high field fMRI is needed to identify how activity in small subcortical structures as well as layers in cortex (e.g. striatum, GP, thalamus, PFC) is modulated by dopamine.

Current directions
The work reviewed above has taken important first steps toward identifying the neural substrates of WM subprocesses beyond the BG-thalamus-PFC 'go/no-go' gating mechanism of the PBWM [39,26]. However, existing work has so far been limited to relating brain activity directly to the reference-back's behavioural measures rather than the latent cognitive processes that give rise to behaviour. Model-based approaches that link brain and behaviour via computational cognitive models offer numerous advantages over traditional statistical analyses of mean RT and error rate in understanding the cognitive and neural basis of WM. For example, applying evidence accumulation models of choice-RT (e.g. Refs. [54,55]) to reference-back data would reveal whether performance costs occur because WM subprocesses add time outside of the decision stage (longer nondecision time), interfere with the decision process itself (reduced or noisier processing rate; [53]), or induce strategic adjustments engaging top-down cognitive control (increased response caution). Decomposing behavioural effects (e.g. gating, updating costs) into a set of latent cognitive processes (e.g. accumulation rate, nondecision time, cognitive control of thresholds) rather than coarse behavioural-level summary statistics enables exploring the neural substrates of WM in greater detail than is possible with traditional methods [63,64]. This places stronger constraints on theory and ultimately produces more robust and detailed inferences about the latent processes that generate behaviour. Applying cognitive models to the reference-back holds great promise in this regard.
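
To make the three accounts of performance costs concrete, the following Python sketch uses the standard closed-form expressions for mean RT and accuracy of a simple unbiased two-boundary diffusion process. It is one member of the evidence accumulation family and not necessarily the model that would be fitted to reference-back data. The function name `predicted_performance` and all parameter values are hypothetical, and the noise scale is fixed at 1 by assumption.

```python
import math


def predicted_performance(drift, threshold, nondecision, noise=1.0):
    """Mean RT (seconds) and accuracy for an unbiased two-boundary diffusion process.

    Assumes the starting point is midway between the boundaries and that drift,
    threshold separation, and nondecision time are constant across trials.
    """
    k = drift * threshold / noise ** 2
    accuracy = 1.0 / (1.0 + math.exp(-k))
    mean_decision_time = (threshold / (2.0 * drift)) * math.tanh(k / 2.0)
    return mean_decision_time + nondecision, accuracy


# Hypothetical parameter values for a baseline trial type.
baseline = {"drift": 2.0, "threshold": 1.0, "nondecision": 0.30}

# Three ways a WM subprocess could slow responses relative to baseline.
scenarios = {
    "baseline": baseline,
    "longer nondecision time": {**baseline, "nondecision": 0.35},
    "lower accumulation rate": {**baseline, "drift": 1.5},
    "higher response caution": {**baseline, "threshold": 1.3},
}

results = {name: predicted_performance(**params) for name, params in scenarios.items()}
for name, (rt, accuracy) in results.items():
    print(f"{name:26s} mean RT = {rt:.3f} s, accuracy = {accuracy:.3f}")
```

All three manipulations slow mean RT, but under this model they leave different signatures in accuracy: a longer nondecision time leaves accuracy unchanged, a lower accumulation rate reduces it, and greater response caution increases it. This is why analyses of mean RT alone cannot separate the accounts.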
In its standard form, the reference-back paradigm ignores several important additional WM processes. These include mechanisms that operate on information already active in WM [65][66][67], such as object selection and retrieval [7], item-specific removal ([23]; but see Ref. [68], for evidence of removal in the reference-back), and grouping and reorganization operations (e.g. sorting items into alphabetical or chronological order, chunking or grouping items together to form a single accessible representation, changing the serial position of items; [69][70][71][72]). These mechanisms support effective remembering by restructuring information into more memorable formats and ensuring only relevant information is maintained and retrieved from WM. The standard reference-back also ignores phenomena associated with WM's limited capacity (e.g. WM load/set-size effects; [73][74][75][7]) and the temporal degradation (e.g. by decay or interference) of WM representations (for a review, see Ref. [76]). Analyses that do not account for these processes risk misattributing their effects to other processes, resulting in biased inferences.
Simple extensions to the reference-back task (e.g. using multiple-item WM sets, inserting delays between the update cue and stimulus presentation), however, enable testing such effects alongside the gating and updating processes of the standard reference-back. For example, Verschooren et al. [77] developed a modified reference-back paradigm where one among several items in long-term memory or perception is gated into WM. This allows for comparing gating dynamics for perceptual versus long-term memory information. Similar multiple-item modifications can be used to investigate some of the WM phenomena described above, including informing the ongoing debate about whether items in WM are held in a small number of discrete high-precision slots [74] or allocated capacity from a limited pool of continuous resources [78][79][80]. In discrete slots models, the fidelity of items in WM only degrades once all memory slots are full (e.g. when n > 4). In continuous resource models, an item's fidelity is determined by its share of the available resources and thus should degrade in inverse proportion to the total number of items in WM (see footnote 7). Evidence accumulation models are well suited to test between these competing accounts (e.g. via accumulation rate parameters) as they can be used to assess the fidelity of WM representations and measure capacity-sharing effects [74,81]. Varying set size in the reference-back and assessing the effects on decision-making and WM processes (as measured by cognitive models) could test between slots and resource architectures. Similarly, combining a multiple-item reference-back task with reinforcement learning (e.g. by reinforcing some items but not others) could shed light on the interplay between WM and learning (e.g. Refs. [73,75]) and the role of expected value in WM-based decisions. Overall, we believe that detailed choice-RT modelling will play an important role in resolving these important questions and in explaining additional WM phenomena captured by variants of the reference-back task.
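
The contrasting set-size predictions of the slots and resource accounts described above can be illustrated with a short Python sketch. Fidelity is expressed in arbitrary relative units, and the function names are hypothetical. The rule used for the slots account beyond capacity (average fidelity equals the fraction of items that occupy a slot) is a simplifying assumption made for this example, not a claim about any specific published model.

```python
def slot_fidelity(set_size, capacity=4):
    """Average per-item fidelity under a discrete-slots account (relative units).

    Fidelity is unaffected until the slots are full. Beyond capacity, this sketch
    assumes items without a slot contribute zero fidelity, so the average equals
    the fraction of items that are stored.
    """
    if set_size <= capacity:
        return 1.0
    return capacity / set_size


def resource_fidelity(set_size):
    """Per-item fidelity under a continuous-resource account (relative units).

    A fixed resource pool is shared equally, so fidelity falls in inverse
    proportion to the number of items held in WM.
    """
    return 1.0 / set_size


print("set size | slots | resources")
for n in range(1, 9):
    print(f"{n:8d} | {slot_fidelity(n):5.2f} | {resource_fidelity(n):9.2f}")
```

Under these assumptions the two accounts diverge most clearly at set sizes below capacity, where the slots account predicts no change in fidelity and the resource account predicts a steady decline. In a choice-RT model this difference would be expected to appear in the accumulation rate estimates across set sizes.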
Combining computational approaches with recent developments in ultra-high field fMRI (7T and higher; e.g. increased resolution and better signal- and contrast-to-noise ratios) holds great promise for identifying activity in small subcortical structures (e.g. GP, SN, subthalamic nucleus, VTA; [82,83]) and gaining a deeper understanding of their functional role in WM than is currently available. For example, this would enable a stronger test of the so-called 'third phase' response of the PBWM model [27], which evaluates the updating process via dopaminergic midbrain neurons that code reward prediction errors [84]. Under the PBWM, midbrain dopamine responses that train the BG when to update should no longer occur once updating-related task rules have been learned. This mechanism has proven difficult to verify with low field strength fMRI [85,86]. However, imaging reference-back performance with ultra-high field fMRI and linking neural measurements to cognitive model parameters would enable identifying these anatomical and functional mechanisms in greater detail and provide additional constraint on cognitive models of WM. Specifically, when modelling two or more sources of data (e.g. fMRI and choice-RT) simultaneously, the power to detect joint effects (e.g. correlations between BOLD signal and cognitive model parameters) is determined by the signal-to-noise ratios of each data source. Increasing the signal-to-noise ratio of neural data (e.g. via 7T fMRI; [82]) reduces uncertainty throughout the model, as does including data from additional modalities (e.g. EEG + fMRI + behavioural; [87]; see footnote 8). A further benefit is that connecting neural signals to cognitive model parameters allows for selecting between cognitive models that make identical predictions at the level of choice-RT but differ in their internal dynamics [45,64,88,89]. That is, different internal mechanisms can be distinguished by evaluating which is most consistent with the additional structure provided by the neural data. Combining such approaches with the reference-back task has potential to shed light on other structures known to be involved in WM (e.g. hippocampus; [90,91,92]), dopaminergic response evaluation (e.g. VTA; [93,94]), and cognitive control (e.g. anterior cingulate cortex; [95]), which are not yet accounted for in existing neurocomputational models. Linking state-of-the-art fMRI to the latent cognitive processes engaged by the reference-back would offer particular insight into the function of small dopamine-producing midbrain structures, with implications for understanding WM impairments in a range of clinical disorders involving abnormal dopamine function [96]. Overall, we believe that viewing the reference-back task through the lens of model-based cognitive neuroscience promises a more detailed understanding of the subprocesses that support WM and their neural substrates.
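
The dependence of joint effects on the signal-to-noise ratio of each data source, noted above, can be illustrated with the classical attenuation formula for correlations between noisy measures. This Python sketch is a simplification and not the joint modelling approach itself: it assumes independent additive noise in each measure, defines signal-to-noise ratio as signal variance divided by noise variance, and uses hypothetical function names and values.

```python
import math


def reliability(snr):
    """Proportion of observed variance that is signal, for snr = var(signal) / var(noise)."""
    return snr / (1.0 + snr)


def expected_observed_correlation(true_r, snr_neural, snr_parameter):
    """Expected correlation between two noisy measures (classical attenuation formula).

    Assumes the noise in each measure is additive and independent of the
    signal and of the noise in the other measure.
    """
    return true_r * math.sqrt(reliability(snr_neural) * reliability(snr_parameter))


# Hypothetical example: a true brain-parameter correlation of 0.5 observed
# with a lower versus a higher signal-to-noise ratio in the neural measure.
for snr_neural in (1.0, 3.0):
    r_obs = expected_observed_correlation(0.5, snr_neural, snr_parameter=3.0)
    print(f"neural SNR = {snr_neural:.1f}: expected observed r = {r_obs:.3f}")
```

Under these assumptions, raising the signal-to-noise ratio of either measure moves the observed correlation closer to its true value, which in turn raises the power to detect it at a given sample size.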

Concluding remarks
This review discussed recent efforts to identify the neural basis of subprocesses that support WM in the recently developed reference-back task. Current empirical work supports the idea that WM gating is controlled by striatal 'go/no-go' signalling in BG-thalamus-PFC pathways. However, the neural substrates of several additional WM subprocesses are yet to be established, pointing to a need for ultra-high field functional imaging combined with detailed computational cognitive modelling. Targets for future research include extending the reference-back task to account for additional WM subprocesses (e.g. removal, selection, and reorganization operations) and effects of WM load and capacity (e.g. longer retrieval times, noisier WM representations), as ignoring such processes leads to mis-specified models and potentially biased inferences. Applying the methods of model-based cognitive neuroscience to the reference-back task would provide a major advance in understanding WM at neural, cognitive, and behavioural levels. A comprehensive understanding of WM subprocesses and their neural basis is within reach, with implications for both cognitive and clinical neuroscience.

Author contributions
RJB and ACT conducted the literature review and led the writing of the manuscript. All authors provided feedback at different stages, reviewed, edited, and revised the manuscript.

Conflict of interest statement
Nothing declared.

47. Nir-Cohen G, Kessler Y, Egner T: Distinct neural substrates for opening and closing the gate from perception to working memory. bioRxiv 2019 http://dx.doi.org/10.1101/853630. Used 3T fMRI to identify neural substrates of WM subprocesses using the reference-back paradigm. This work identifies brain regions involved in gate opening, gate closing, updating, and substitution processes, broadly consistent with the PBWM model in which BG-thalamus-PFC pathways control the flow of information into and out of WM.

48. Lewis-Peacock JA, Kessler Y, Oberauer K: The removal of information from working memory. Ann N Y Acad Sci 2018, 1424:33-44. Reviews behavioural and neural evidence for a selective removal process that is engaged to remove outdated information from WM. The removal mechanism serves to limit working memory load by ensuring only goal-relevant information is maintained in WM. The ways in which selective removal is distinct from temporal decay and interference are also discussed.

49. Jongkees BJ: Baseline-dependent effect of dopamine's precursor L-tyrosine on working memory gating but not updating. Cogn Affect Behav Neurosci 2020:1-15. Investigated the dopaminergic basis of WM gating and updating processes by administering dopamine precursor L-tyrosine to young adults and comparing reference-back performance to a placebo-control group. Updating and gating were differentially affected by the dopaminergic manipulation, highlighting the importance of the brain's dopamine systems for controlling WM.

50. Rac-Lubashevsky R, Slagter HA, Kessler Y: Tracking real-time changes in working memory updating and gating with the event-based eye-blink rate. Sci Rep 2017, 7:1-9. Linked event-based eye-blink rate (a proxy for striatal dopamine activity) to reference-back task performance to examine how gating and updating facilitate flexible WM updating. The relation of event-based eye-blink rate to striatal dopamine provides indirect evidence in support of the striatal 'go/no-go' updating signal proposed by neurocomputational models of WM like the PBWM.

51. Rac-Lubashevsky R, Kessler Y: Oscillatory correlates of control over working memory gating and updating: an EEG study using the reference-back paradigm. J Cogn Neurosci 2018, 30:1870-1882. Linked brain measurements (EEG) with behaviour on the reference-back task. Gating and updating were associated with EEG signatures of cognitive control and action selection processes. Demonstrates that neural signatures of specific updating and gating processes are detectable in EEG oscillatory signals that show activity consistent with 'go/no-go' signalling in BG-thalamus-PFC pathways, as proposed by prominent neurocomputational theories of WM updating.

52. Rac-Lubashevsky R, Kessler Y: Revisiting the relationship between the P3b and working memory updating. Biol Psychol 2019, 148:107769. Used EEG with the reference-back task to explore the role of the P3b oscillatory signal in WM updating. P3b amplitude was related to stimulus comparison/categorisation but not to updating itself. Suggests that P3b is a neural signature of a goal-directed target identification mechanism that improves WM-based decision making in line with task goals.