Main

Autologous CAR T cell therapies deliver sustained clinical benefit in less than half of treated patients who have relapsed or refractory B cell malignancies, including large B cell lymphoma (LBCL)1,2,3 and B cell acute lymphoblastic leukemia4. Axicabtagene ciloleucel (axi-cel), a commercially available CD19-CAR incorporating a CD28 co-stimulatory domain, mediates disease control at 6 months in approximately 40% of patients with LBCL, and disease progression after this timepoint is uncommon5,6,7,8. Neurological toxicity9 occurs in two-thirds of patients after axi-cel therapy and can be severe, causing seizures, cerebral edema and, in rare cases, death10. Clinically validated biomarkers of disease progression and neurotoxicity are limited.

Current understanding of resistance to CD19-CAR therapies in LBCL implicates both inadequate CAR T cell potency and tumor resistance due to antigen modulation and potentially other mechanisms11. Polyfunctionality, increased stemness and decreased exhaustion features of pre-infusion CAR T cells correlate with increased CD19-CAR T cell potency and increased rates of disease control12,13,14,15. On the tumor side, high pre-treatment tumor burden1,5,15,16 associates with diminished disease control, likely due to induction of CAR T cell exhaustion, and potentially higher risk for antigen loss or other features associated with resistance to CAR T cell killing17,18. CAR T cell potency and tumor resistance are interdependent, because higher tumor burdens require greater CAR T cell proliferation and serial killing for tumor eradication, whereas diminished antigen expression may be overcome in some cases with higher-potency CAR T cells19,20. Current concepts hold that CD19-CAR-associated immune effector cell‐associated neurotoxicity syndrome (ICANS) is linked to cytokine-mediated endothelial activation21,22, ICANS-associated cells in infusion products12 and possible on-target toxicity due to CD19 expression by brain mural cells23, resulting in blood–brain barrier disruption and CAR T cell infiltration10,22,24. Biomarkers of severe neurotoxicity are limited12,15,25,26.

Identification of biomarkers associated with increased risk of disease progression or severe neurotoxicity after CAR T cell therapy could enhance understanding of the biological basis for resistance and toxicity and enable earlier and potentially more effective treatment interventions. Several groups have measured circulating CAR T cell levels after infusion using quantitative polymerase chain reaction (qPCR) or flow cytometry, and some studies have reported that higher peak or cumulative CAR T cell expansion associates with durable disease control or severe neurotoxicity6,15, although a similar relationship was not found in other studies2. In the present study, using high-dimensional single-cell proteomic profiling, we tested the hypothesis that post-infusion circulating CAR T cell subsets could identify patients at risk of disease progression or severe neurotoxicity after CD19-CAR therapy. In our discovery cohort, we studied peripheral blood CAR T cells from 32 patients with LBCL receiving commercial axi-cel. Data-driven analyses identified three populations of CAR T cells circulating on day 7 after infusion that associated with durable disease control or severe neurotoxicity. A higher frequency of CD4+ CAR T cells expressing Helios on day 7 after infusion associated with lower rates of severe neurotoxicity and a higher likelihood of disease progression. These cells were non-clonal and exhibited a TReg cell phenotype, with expression of FOXP3 and CD25. Prospective study of a second validation cohort comprising 31 patients with LBCL confirmed that increased CAR TReg cells early after infusion predict clinical progression and a lowered risk for severe neurotoxicity. Frequency of CAR TReg cells was inversely correlated with total CAR T cell expansion but did not correlate with tumor burden quantified by lactate dehydrogenase (LDH) levels. A model combining CAR TReg and LDH levels predicted clinical response better than either feature alone. Together, these data implicate post-infusion CAR T cells with hallmarks of TReg cells in modulating response and toxicity after CAR T cell therapy.

Results

Circulating CAR T cell counts do not associate with response

We studied 32 consecutive patients with LBCL who were treated with commercial axi-cel at Stanford Hospital between 1 December 2017 and 1 November 2018 and followed until 3 March 2020 (Fig. 1a and Supplementary Table 1). Despite comorbidities that would have rendered 12 of these patients ineligible for the ZUMA-1 trial3 and other differences (Supplementary Table 2), progression-free survival (PFS) and overall survival (OS) were similar between the Stanford and ZUMA-1 cohorts, and, similarly to ZUMA-1, relapses after 6 months were rare6 (Fig. 1b).

Fig. 1: Clinical response in patients with LBCL treated with axi-cel is not associated with peripheral CAR T cell expansion.

a, Summary of clinical parameters for the study cohort of patients with LBCL treated with axi-cel (n = 32). b, PFS (left) and OS (right) for the patient cohort after axi-cel infusion (n = 32). Patient 032 died in remission at 4.5 months after infusion. Gray boxes show 6-month and 12-month follow-up visit windows. Dashed lines indicate the end of each visit period, where survival estimates were calculated. c, Flow cytometry assay for monitoring absolute counts of circulating CD4+ and CD8+ CAR T cells in axi-cel patients, gated as CD3ε+CD4+CD8α−CAR+ or CD3ε+CD4−CD8α+CAR+ events among single viable CD45+ lymphocytes. An example partial gating scheme for patient 040 is shown. d, CAR T cell absolute blood counts on days 7, 14, 21 and 28 after axi-cel infusion (n = 32 patients and 128 observations). e, Spearman correlation in peripheral CAR T cell expansion between flow cytometry and qPCR assays (n = 31 patients and 109 observations). f, CAR T cell AUC0–28 for patients in CR or PD at 6 months (n = 29; patients 042 and 058 had PR and SD at 6 months, respectively, and patient 032 died from a non-progression-related cause before 6 months). g, CAR T AUC0–28 stratified by the maximum ICANS grade: 0–1 versus 2–4 (n = 31; no AUC0–28 data were collected for patient 058). Box plots in f,g show quartiles with a band at median; whiskers indicate 1.5× interquartile range; and all observations are overlaid as dots. P values are from two-sided Mann–Whitney U-tests. DLBCL, diffuse large B cell lymphoma; ECOG, Eastern Cooperative Oncology Group; GCB, germinal center B cell; IPI, International Prognostic Index; PMBCL, primary mediastinal B cell lymphoma; SE, standard error; TFL, transformed follicular lymphoma.

A flow cytometry assay incorporating an antibody that specifically detects the FMC63 scFv target-binding domain in axi-cel27 was used to measure circulating CAR T cells sequentially after infusion (Fig. 1a,c). Median CAR T cell expansion peaked at 28.2 CAR T cells per µl on day 7 and steadily declined thereafter (Fig. 1d), with similar behavior of CD4+ and CD8+ CAR+ T cell subsets (Extended Data Fig. 1a) and significant correlation between flow cytometry and qPCR tracking of circulating CAR T cell numbers (Fig. 1e). CAR T area under the curve (AUC) or area under the moment curve (AUMC) quantified for days 0–28 did not differ significantly between patients with complete response (CR) or progressive disease (PD) at 6 months (Fig. 1f and Extended Data Fig. 1b) and did not significantly associate with the best response (Extended Data Fig. 1c). Similarly, neither CAR T cell peak expansion nor expansion on days 7, 14, 21 or 28 was associated with clinical response at 6 months (Extended Data Fig. 1d,e). These results were confirmed with qPCR (Extended Data Fig. 1f–h).
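The exposure metrics named above can be illustrated with a minimal sketch. It assumes the linear trapezoidal rule and a count of zero on day 0; neither choice is stated in this section, and the counts below are made-up values, not patient data.

```python
def trapezoid_auc(days, counts):
    """Area under the count-versus-time curve by the linear trapezoidal rule."""
    if len(days) != len(counts) or len(days) < 2:
        raise ValueError("need at least two matched time points")
    total = 0.0
    for i in range(len(days) - 1):
        width = days[i + 1] - days[i]
        total += width * (counts[i] + counts[i + 1]) / 2.0
    return total


def trapezoid_aumc(days, counts):
    """Area under the first-moment curve: integrate t * C(t) the same way."""
    moments = [t * c for t, c in zip(days, counts)]
    return trapezoid_auc(days, moments)


# Hypothetical CAR T cells per microliter; day 0 is assumed to be zero.
days = [0, 7, 14, 21, 28]
counts = [0.0, 30.0, 10.0, 4.0, 2.0]

auc_0_28 = trapezoid_auc(days, counts)    # cells per microliter x days
aumc_0_28 = trapezoid_aumc(days, counts)  # weights later time points more heavily
print(auc_0_28, aumc_0_28)
```

AUMC multiplies each count by its time point before integrating, so two patients with the same AUC can differ in AUMC if one expands later than the other.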

Maximum cytokine release syndrome (CRS) was grade 2 (Fig. 1a), and we could not study high-grade CRS, although higher CAR T cell levels were associated with higher CRS (Extended Data Fig. 1i,j), as previously reported28. ICANS occurred later than CRS (max CRS: days 0–8 and max ICANS: days 4–12), and severe ICANS (grades 2–4) was significantly associated with CAR T AUC0–28, AUMC0–28, peak levels and CAR expansion on day 21 and day 28 (Fig. 1g and Extended Data Fig. 1k,l). In summary, higher post-infusion CAR T cell levels did not associate with disease control but did associate with CRS and severe neurotoxicity.
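The group comparisons in this section rely on two-sided Mann–Whitney U-tests. The sketch below shows how such a test works on small samples, using an exact permutation distribution over rank assignments. The two-sided P value is taken as twice the smaller tail, which is one common convention and an assumption here; the input values are hypothetical.

```python
from itertools import combinations


def midranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def mann_whitney_exact(x, y):
    """Return (U for group x, exact two-sided P). Practical for small groups only."""
    n1 = len(x)
    ranks = midranks(list(x) + list(y))
    offset = n1 * (n1 + 1) / 2.0
    u_obs = sum(ranks[:n1]) - offset
    lower = upper = total = 0
    # Enumerate every way of assigning n1 of the pooled ranks to group x.
    for idx in combinations(range(len(ranks)), n1):
        u = sum(ranks[i] for i in idx) - offset
        total += 1
        if u <= u_obs + 1e-12:
            lower += 1
        if u >= u_obs - 1e-12:
            upper += 1
    p_value = min(1.0, 2.0 * min(lower, upper) / total)
    return u_obs, p_value


# Hypothetical AUC values for ICANS grade 0-1 versus grade 2-4.
u_stat, p_two_sided = mann_whitney_exact([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
print(u_stat, p_two_sided)
```

Enumeration grows combinatorially, so cohorts of the size studied here would normally use a statistics package with an asymptotic or optimized exact method.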

Three metaclusters associate with response or neurotoxicity

Mass cytometry by time-of-flight (CyTOF) enables high-throughput proteomic monitoring of single-cell phenotypes29. To identify whether expansion of specific CAR T cell subsets associates with clinical response, we used CyTOF to assess expression of 34 surface or intracellular markers relevant to T cell function (Fig. 2a and Supplementary Table 3) and 14 quality control parameters (Methods). Batched blood samples collected before axi-cel infusion, on day 7 (peak expansion) and on day 21 after infusion (late expansion) from 31 patients were analyzed. In this study, we did not have access to the commercial axi-cel infusion products for analysis.

Fig. 2: CyTOF identifies metaclusters of circulating CAR T cells associated with long-term clinical response or neurotoxicity.

a, CyTOF panel covering 34 proteins. Quality control channels are not shown. b, MST showing hierarchical consensus clustering of circulating CAR+ T cells on day 7 after axi-cel infusion (n = 31 patients), with 25 clusters grouped automatically into ten metaclusters. No CyTOF data were obtained for patient 038. c, Star plot showing MST from b, with each cluster size scaled to represent its average abundance, and colors indicating expression of each marker used for clustering. d, Schematic for building a lasso model to predict clinical response as CR or PD at 6 months based on metacluster abundance from c. e, Cross-validation results for model from d, with red lines showing optimal model parameters. f, Relative abundance and β coefficients of three metaclusters selected by the lasso model for patients in CR or PD at 6 months (n = 28). g, Expression of eight proteins overlaid onto MST from c. h, Schematic for building a lasso model to predict maximum ICANS grade as 0–1 versus 2–4 based on metaclusters from c. i, Lasso cross-validation results for the model in h, with red lines showing optimal model parameters. j, Relative abundance and β coefficients of two metaclusters selected by the lasso model in h (n = 31 patients). Box plots in f,j show quartiles with a band at median; whiskers indicate 1.5× interquartile range; and all observations are overlaid as dots.

We focused our analysis on day 7 after infusion, which was the time of peak CAR T cell expansion (Fig. 1d). Single-cell clustering identified 25 clusters of day 7 circulating CAR T cells that were grouped automatically into ten metaclusters (Fig. 2b,c). Clusters were connected to their most similar neighbors based on average marker expression, forming a minimum spanning tree (MST). To pinpoint potentially predictive metaclusters for CR versus PD at 6 months, we built a lasso model and applied ten-fold cross-validation to estimate its performance on unseen samples (Fig. 2d). Lasso is a multivariate regression model that uses sparsity to select a set of features that together can predict a given response30. Cross-validation identified optimal model performance at λ = 0.102, corresponding to three metaclusters (Fig. 2e). Relative abundance of CAR T cells in each metacluster and β coefficients of the final lasso model identified metaclusters 3 and 6 as predictive of CR at 6 months, whereas metacluster 4 was predictive of PD at 6 months (Fig. 2f). The magnitude of β coefficients indicates relative importance of each metacluster, and the sign specifies its class (negative for CR and positive for PD).
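The way a lasso penalty selects a sparse set of features can be sketched with coordinate descent and soft-thresholding. This is an illustration only: it uses a least-squares loss on centered toy data for brevity, whereas the model described above predicts a binary outcome, and the loss function and software used are not stated in this section. All feature values are hypothetical.

```python
def soft_threshold(z, lam):
    """Shrink z toward zero by lam; values within [-lam, lam] become exactly zero."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=100):
    """Minimize (1/2n) * ||y - X b||^2 + lam * ||b||_1 with no intercept.

    X is a list of rows; columns are assumed to be centered.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0
            norm = 0.0
            for i in range(n):
                # Prediction from all features except j (partial residual).
                partial = sum(X[i][k] * beta[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - partial)
                norm += X[i][j] ** 2
            beta[j] = soft_threshold(rho / n, lam) / (norm / n) if norm > 0 else 0.0
    return beta


# Hypothetical centered abundances of three metaclusters in six patients.
X = [
    [-1.0, 1.0, 1.0],
    [-1.0, -1.0, 0.0],
    [-1.0, 0.0, -1.0],
    [1.0, 0.0, -1.0],
    [1.0, -1.0, 0.0],
    [1.0, 1.0, 1.0],
]
y = [-1.0, -1.0, -1.0, 1.0, 1.0, 1.0]  # toy response driven by the first feature

beta_small = lasso_coordinate_descent(X, y, lam=0.1)
beta_large = lasso_coordinate_descent(X, y, lam=2.0)
print(beta_small, beta_large)
```

With a small penalty only the informative feature keeps a nonzero coefficient, and with a large penalty every coefficient is zero. In practice λ is chosen by cross-validation, as in the analysis above.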

Relative to other CAR T cells, the two metaclusters associated with CR were CD57+ subsets of CD4+ (metacluster 3) and CD8+ (metacluster 6) CAR T cells that were enriched for Blimp-1 and T-bet transcription factors but not the exhaustion marker CD39 (Fig. 2g and Extended Data Fig. 2). Inhibitory proteins PD1, LAG3 and TIM3, which are linked to activation and exhaustion, were expressed at lower levels in the CD8+CD57+ subset (metacluster 6), and CD28 was also expressed at lower levels in this subset, as previously reported31. Metacluster 3 also expressed high levels of PD1 and ICOS, suggesting similarity to T follicular helper (TFH) cells. In contrast, metacluster 4, which associated with PD, comprised Helios+CD4+ CAR T cells that expressed high levels of CD25, CTLA4 and TIGIT, indicating similarity to an immunosuppressive TReg subset. All three metaclusters expressed Ki-67, a marker of proliferating cells, in at least one of their clusters, indicating that at least a fraction of the cells had recently cycled or were continuing to cycle. Differential abundance analysis32 confirmed these results and also identified CAR T cells with exhaustion features, including CD39, CD101 and CD244 expression, as associated with PD at 6 months (Extended Data Fig. 3).

We repeated the lasso analysis to identify day 7 metaclusters of circulating CAR T cells associated with severe neurotoxicity (Fig. 2h). Here, optimal model performance was achieved at λ = 0.090, corresponding to two metaclusters (Fig. 2i). The final model’s β coefficients and relative abundance of CAR T cells revealed that metaclusters 4 and 8 were associated with reduced neurotoxicity (Fig. 2j). Metacluster 4 was of special interest, because it was the same metacluster associated with PD at 6 months and contained CD4+ CAR T cells with TReg-associated protein expression. Metacluster 8 was similar to metacluster 6, as it contained CD57+CD8+ CAR T cells enriched for Blimp-1 and T-bet, but it also displayed features of exhaustion, including CD39, CD101 and CD244 expression. These findings were consistent with the results of differential abundance analysis (Extended Data Fig. 4).

Together, single-cell proteomic analyses of day 7 circulating CAR T cells in patients treated with axi-cel for LBCL identified three metaclusters that distinguished patients with long-term clinical response and two metaclusters associated with severe neurotoxicity. CD4+ T cells bearing hallmarks of TReg cells correlated in both analyses, being increased in patients with less severe neurotoxicity and those experiencing PD, whereas CD4+ and CD8+ CAR T cells bearing a T effector (TEFF) phenotype with senescence features were increased in patients who experienced durable disease control.

CAR TReg cells linked to progression and lower neurotoxicity

Because cell assignment into metaclusters relies on all 31 markers used for clustering (Fig. 2c), we tested whether a simplified gating strategy using the salient proteins identified in the metaclusters would associate with clinical response at 6 months in the CyTOF dataset (Fig. 3a). We confirmed that, on day 7 after infusion, CD4+ and CD8+ subsets of CD57+T-bet+ CAR T cells were significantly higher in patients experiencing CR at 6 months, whereas CD57−Helios+CD4+ CAR T cells were higher in patients with PD (Fig. 3b). Notably, no difference in levels of these subsets was present before axi-cel infusion in blood of patients who experienced CR versus PD (Extended Data Fig. 5a). At day 21 after infusion, CAR T cells showed trends similar to those observed at day 7 (Extended Data Fig. 5b). We also saw significantly higher levels of CAR−CD57+T-bet+ T cells in patients with durable CR, which could include CAR-transduced cells that have downregulated their receptor33 and/or may indicate stronger lymphopenia-induced expansion in responding patients (Extended Data Fig. 5c,d). We next assessed relationships between each CAR T cell population (separated by dotted lines; Fig. 3b) and time to progression (TTP) using Kaplan–Meier analysis and observed significant differences in TTP between patients divided into two groups based upon high versus low expansion of each population (Fig. 3c). Although TTP and PFS results are identical here, we used TTP for subsequent analyses to avoid confounding results due to death from infection or causes unrelated to LBCL.
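A minimal sketch of the Kaplan–Meier (product-limit) estimate for two groups split at a threshold follows. It assumes that, for TTP, only progression counts as an event and all other patients are censored at their last follow-up. The threshold, function names and patient values are hypothetical.

```python
def kaplan_meier(times, events):
    """Product-limit estimate. Returns [(time, survival)] at each event time.

    events[i] is 1 for progression and 0 for a censored observation.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Collect every observation that shares this time point.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events > 0:
            survival *= 1.0 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_leaving
    return curve


def split_by_threshold(fractions, threshold):
    """Indices of patients at or above the threshold, and below it."""
    high = [i for i, f in enumerate(fractions) if f >= threshold]
    low = [i for i, f in enumerate(fractions) if f < threshold]
    return high, low


# Hypothetical cohort: percent of a CAR T cell population, follow-up, outcome.
population_pct = [2.0, 5.0, 20.0, 30.0, 8.0, 15.0]
ttp_months = [12.0, 10.0, 2.0, 3.0, 9.0, 4.0]
progressed = [0, 0, 1, 1, 0, 1]

high, low = split_by_threshold(population_pct, threshold=12.0)
km_high = kaplan_meier([ttp_months[i] for i in high], [progressed[i] for i in high])
km_low = kaplan_meier([ttp_months[i] for i in low], [progressed[i] for i in low])
print(km_high, km_low)
```

A group with no events yields an empty list, meaning its estimated progression-free fraction stays at 1.0 throughout follow-up. Comparing the two curves formally would require a log-rank test, which is not shown.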

Fig. 3: Helios-expressing population of circulating CD4+ CAR T cells on day 7 is associated with clinical progression and reduced neurotoxicity.

a, Gates defined based on three metaclusters identified by the lasso model for predicting clinical response (Fig. 2f). Contour plots show CyTOF data for CAR+ T cells on day 7 from patient 008 (CR at 6 months) and patient 017 (PD at 6 months). b, Summary statistics for three populations as defined in a for patients with CR or PD at 6 months (n = 28). Dotted lines indicate separation between high and low percentages of CAR T cells in each population, with the thresholds selected based on the optimal response separation between the groups. c, Kaplan–Meier analysis of TTP stratified by high versus low percentage of CAR T cells in each population, as shown in b (n = 28). Because clinical outcome was known during patient stratification, P values need to be interpreted with caution. d, Gate from a defined based on metacluster 4 identified by the lasso model for predicting severe neurotoxicity (Fig. 2j). Contour plots show CyTOF data for CAR+ T cells on day 7 from patient 042 (max ICANS grade 0) and patient 050 (max ICANS grade 3). e, Percentage of circulating CD57−Helios+ cells among CD4+ CAR T cells on day 7 stratified by maximum ICANS grade (n = 31). f, Percentage of CD57+T-bet+ cells among CD4+ (left) and CD8+ (right) CAR T cells on day 7 for patients with maximum ICANS grade 0–1 or 2–4 (n = 31). Box plots in b,e,f show quartiles with a band at median; whiskers indicate 1.5× interquartile range; and all observations are overlaid as dots. P values are from two-sided Mann–Whitney U-tests. Pt, patient; SE, standard error.

To assess whether a simplified gating strategy would identify correlations between higher TReg-like CAR T cells and less severe neurotoxicity, we examined CD4+ CAR T cells in the CD57−Helios+ gate (Fig. 3a,d) and confirmed significantly lower frequencies of CD4+CD57−Helios+ CAR T cells in patients with severe neurotoxicity (Fig. 3e). Of note, a gate based on metacluster 8 was not significantly associated with severe neurotoxicity (Extended Data Fig. 5e,f), which is consistent with its lower coefficient magnitude (β = −2.17) as compared to metacluster 4 (β = −5.31) in the lasso model (Fig. 2j). We did not observe differences in the frequency of the circulating CD57−Helios+ population among CD4+ T cells before axi-cel infusion in patients stratified by severe neurotoxicity nor among CD4+ CAR T cells on day 21, at which time neurotoxicity was largely resolved (Extended Data Fig. 5g). Notably, both CD57+T-bet+ populations of CD4+ and CD8+ CAR T cells on day 7, which were associated with durable CR, had no association with neurotoxicity (Fig. 3f).

Cytomegalovirus (CMV) is a common pathogen, and chronic CMV infection has been implicated in the accumulation of CD57+ T cells34. We observed a significantly higher frequency of circulating CD57+ T cells in patients with evidence for prior CMV infection before axi-cel infusion (Extended Data Fig. 6a), although these differences became less pronounced in CAR T cells after infusion, likely due to lymphopenia-induced proliferation (Extended Data Fig. 6b,c). CMV status was not significantly associated with any of the CD57+ populations examined among CAR T cells on days 7 and 21 (Extended Data Fig. 6d,e) and did not associate with TTP or OS in this cohort (Extended Data Fig. 6f).

Together, these results establish relationships between three easily defined populations of circulating CAR T cells on day 7 and long-term clinical outcome or severe neurotoxicity in patients with LBCL treated with commercial axi-cel. Early expansion of circulating TReg-like CAR T cells associates with both disease progression and lower levels of severe neurotoxicity, whereas CD57+T-bet+ populations associate with greater disease control.

CAR TReg cells express FOXP3 and lack cytotoxic potential

We next sought to interrogate the functional properties of the three identified CAR T cell populations in response to activation. Peripheral blood mononuclear cells (PBMCs) were stimulated either using small molecule mimics of TCR signaling (PMA and ionomycin) or through CAR (using plate-bound anti-FMC63 antibody) for 6 hours before analysis by intracellular flow cytometry, with gating on CAR+ populations. As controls, we used healthy donor–derived T cells analyzed at baseline and after transduction with a retroviral construct encoding a CD19-CD28ζ CAR, thereby replicating the axi-cel construct. As expected, before transduction, CD4+CD57+T-bet+ T cells were nearly absent, and CD4−CD57+T-bet+ T cells were infrequent (Extended Data Fig. 7a). Among CD19-CD28ζ CAR-transduced T cells analyzed on day 11 or day 12 in culture, CD57+T-bet+ CD4+ and CD4− CAR T cells were rare (Extended Data Fig. 7b), in agreement with prior reports that CAR T cells in the infusion product typically do not express CD57 (ref. 35). CD57−Helios+ cells were detectable both in healthy donor CD4+ T cells and in CD19-CD28ζ CAR-transduced CD4+ T cells. Of note, we observed a large FOXP3+CD25High population among CD19-CD28ζ CAR-transduced CD4+ T cells, which did not express Helios, in contrast to CD4+ T cells from healthy donors (Extended Data Fig. 7c).

Intracellular flow cytometry of cryopreserved day 7 post-infusion PBMCs from 27 patients revealed that all three populations were readily detectable after activation (Fig. 4a and Extended Data Fig. 7d). The CD4+CD57−Helios+ population expressed both CD25 and FOXP3, a key transcription factor of TReg cells, and did not produce IL-2 after activation (Fig. 4a–c). The CD57+T-bet+ populations were strong producers of granzyme B (GZMB), a cytotoxic granule protein, and CD8+CD57+T-bet+ cells expressed the largest amount of surface CD107a, a marker of recent degranulation. The CD4+CD57+T-bet+ population did not produce IL-2, whereas a small fraction (median 3.1%) of the CD4−CD57+T-bet+ population produced IL-2 in response to CAR activation only. These data demonstrate that the CD4+CD57−Helios+ CAR T cell population is phenotypically similar to TReg cells, providing a basis for its association with progressive disease and reduced neurotoxicity. In contrast, GZMB is expressed by the majority of CD4−CD57+T-bet+ (mean ± s.e.m.: 65.8 ± 6.8%) and a large fraction of CD4+CD57+T-bet+ (35.5 ± 9.0%) CAR T cells, consistent with high cytotoxic potential, potentially explaining their association with durable CR.

Fig. 4: CD4+CD57−Helios+ CAR T cells express FOXP3 and are not cytotoxic.

Cryopreserved PBMCs from the patient cohort (day 7 after axi-cel infusion; n = 27) were incubated with either PMA and ionomycin or plate-bound anti-idiotype antibody for 6 hours and analyzed by flow cytometry. a, Top: Contour plots show three identified populations among CAR+ T cells from patient 017 and patient 050. Bottom: percent of cells from each population falling into the FOXP3+CD25High TReg gate. b,c, Summary statistics for CAR+ T cells falling into TReg, GZMB+ or IL-2+ gates and surface CD107a MFI after stimulation with PMA and ionomycin (b) or plate-bound anti-idiotype antibody (c). Box plots show quartiles with a band at median; whiskers indicate 1.5× interquartile range; and all observations are overlaid as dots. P values are from the Friedman test, followed by pairwise two-sided Wilcoxon signed-rank tests with Bonferroni correction to adjust for multiple hypothesis testing. MFI, mean fluorescence intensity.

CAR TReg cells are polyclonal with TReg gene signature

To more deeply characterize the outcome-associated CAR T cell populations, we performed single-cell RNA-sequencing (scRNA-seq) with targeted single-cell T cell receptor sequencing (scTCR-seq) of sorted CAR+ T cells from nine patients with LBCL on day 7 after axi-cel infusion (Fig. 5a,b and Supplementary Table 1). As CD57 itself is not encoded by a gene but, rather, represents a carbohydrate epitope induced by the beta-1,3-glucuronyltransferase 1 (B3GAT1) enzyme, we performed antibody-based single-cell surface analysis by cellular indexing of transcriptomes and epitopes by sequencing (CITE-seq)36 to resolve single-cell expression of CD57 as well as a panel of 42 surface proteins (Supplementary Table 3). We found that CD4+CD57−Helios+ CAR T cells were enriched for FOXP3, IL2RA, TIGIT, CTLA4 and RTKN2 at the transcript and/or surface protein level, consistent with our prior data and their TReg phenotype (Fig. 5c–g and Extended Data Fig. 8). Furthermore, CD4+CD57−Helios+ CAR T cells were similar to CD4+FOXP3+ CAR T cells and to those classified as TReg by projecting our data onto an annotated reference dataset from blood of healthy donors using Azimuth37 (Fig. 5e). Systematic differential gene expression and pathway analyses identified TReg development as one of the top enriched pathways when we compared CD4+CD57−Helios+ cells to both CD57+T-bet+ populations, to each CD57+T-bet+ population separately or to all other CAR T cells (Fig. 5f and data not shown).

Fig. 5: Deep phenotyping with single-cell sequencing upholds TReg identity of CD4+CD57−Helios+ CAR T cells.

a, CAR+ T cells were sorted from nine patients with LBCL on day 7 after axi-cel infusion and analyzed by scRNA-seq, scTCR-seq and CITE-seq. b, Left: Contour plot and schematic show definitions of the three identified CAR T cell populations. Right: identities of CAR T cell populations projected onto the wnnUMAP coordinates based on scRNA-seq and CITE-seq data (n = 6,316 cells). c, Expression of selected genes overlaid onto the wnnUMAP coordinates from b. Protein encoded by each gene is shown in parentheses. d, Surface expression of selected proteins and CD57 (carbohydrate epitope) overlaid onto the wnnUMAP coordinates from b. e, Indicated CAR T cell subsets highlighted in red on the wnnUMAP coordinates from b. f, Volcano plot showing differentially expressed genes (left) and dot plot showing the top ten differentially regulated pathways (right) comparing CD4+CD57−Helios+ to CD4+CD57+T-bet+ and CD8+CD57+T-bet+ CAR T cell populations (n = 952 cells). Differentially upregulated genes and pathways are in red; downregulated genes and pathways are in blue; genes used to define each population and pathways with genes both upregulated and downregulated are in black. g, Violin plots showing selected gene and surface protein expression in populations defined in b (n = 6,316 cells). Stars denote significant (P < 0.05) upregulation in the indicated population relative to all populations without stars. Other significant relationships are not denoted. h, TCR clonal expansion overlaid onto the wnnUMAP coordinates from b (left) and shown as a box plot for each CAR T cell population (right). P values in g,h were calculated using the Kruskal–Wallis H-test, followed by unpaired two-sided Wilcoxon–Mann–Whitney U-test applied to each treatment pair, with Bonferroni correction for multiple hypothesis testing.

Both CD4+ and CD8+ subsets of CD57+T-bet+ CAR T cells expressed high levels of GZMA, GZMB, GZMH, PRF1 and NKG7, consistent with a cytolytic TEFF state (Fig. 5c–g and Extended Data Fig. 8). Although CD4+CD57+T-bet+ CAR T cells resembled TFH cells based on our CyTOF data, they did not express BCL6, which encodes a key transcription factor in TFH cell differentiation (2.8% BCL6+; Extended Data Fig. 8d). Based on their mRNA expression of TBX21 (T-bet transcription factor) and CD40LG (CD40L co-stimulatory receptor), CD4+CD57+T-bet+ CAR T cells displayed similarities to both TH1 and TEFF cells, whereas CD8+CD57+T-bet+ CAR T cells most closely resembled TEFF. T cell senescence is characterized by expression of CD57 and loss of CD28 by CD8+ T effector memory (TEM) cells, some of which re-express CD45RA (TEMRA)34,38. CD8+CD57+T-bet+ CAR T cells identified here expressed reduced levels of CD27 and CD28 (Fig. 5g and Extended Data Fig. 8d–f) and contained a TEMRA subset (22.0% versus 8.5% for CD4+CD57+T-bet+, 0.6% for TReg-like and 10.1% for other CAR T cells). However, most CD57+T-bet+ CAR T cells identified here did not demonstrate increased mRNA expression of other senescence-associated genes, such as CDKN1A (P21), CDKN2A (P16), GLB1 (β-galactosidase) and USP16 (Extended Data Fig. 8d,f), indicating that these populations may not have reached terminal senescence. Both CD4+ and CD8+ populations of CD57+T-bet+ CAR T cells expressed variable levels of PDCD1 (PD1) and HAVCR2 (TIM3) inhibitory receptors and low mRNA levels of TOX (TOX)39,40,41,42, NR4A2 (NR4A2)43 and ENTPD1 (CD39)44 terminal T cell exhaustion markers. Together, the profile of the CD57+T-bet+ cells identified here closely resembles that described in terminally differentiated effector T cells with early senescence features rather than exhausted T cells.

Cell cycle analysis using 97 canonical cell cycle markers45 showed that all three populations contained a subset of proliferating cells (18–26%; Extended Data Fig. 8g). TCR clonotype analysis revealed that CD4+CD57−Helios+ CAR T cells consistently demonstrated high repertoire diversity and were mostly composed of unique TCR clones. In contrast, both CD57+T-bet+ CAR T cell populations comprised highly clonally expanded cells (Fig. 5h). Altogether, single-cell transcriptomic and proteomic profiling conducted alongside clonality analysis demonstrated that CD4+CD57−Helios+ CAR T cells, which associated with poor disease control and diminished neurotoxicity, were largely clonally unique cells that demonstrated hallmark features of TReg cells. In contrast, CD4+ and CD8+ populations of CD57+T-bet+ CAR T cells comprise clonally expanded cells with high cytotoxic potential.

CAR TReg validation and evidence of in vivo suppression

To validate our findings, we prospectively analyzed a cohort of 31 additional patients with LBCL treated with axi-cel at our institution between 2 November 2018 and 1 July 2021 and followed until 17 February 2022 (Supplementary Table 1). The baseline demographics of our validation cohort were similar to our discovery cohort (Supplementary Table 4). Cryopreserved PBMCs collected on day 7 after infusion from this second patient cohort were examined using flow cytometry (Supplementary Table 3). These data confirmed that CAR TReg cells were associated with PD at 6 months and lower neurotoxicity, with nearly identical results for CD57−Helios+, CD57−FOXP3+ and CD57−Helios+FOXP3+ subsets of CD4+ CAR T cells (Fig. 6a,b). When we separated patients into groups with high and low CD4+CD57−Helios+ (TReg-like) CAR T cell fractions using the threshold from the discovery cohort (12%; dotted line in Fig. 3b), we observed significant differences in TTP using Kaplan–Meier analysis (Fig. 6c), with results matching those seen earlier (Fig. 3c).

Fig. 6: CAR TReg cells and tumor burden surrogate identify patients with clinical progression.

a,b, Percentage of indicated populations among CD4+ CAR T cells in blood on day 7 after infusion in patients from the validation cohort (n = 23), separated by response at 6 months (a) or maximum ICANS grade (b). Box plots show quartiles with a band at median; whiskers indicate 1.5× interquartile range; and all observations are overlaid as dots. P values are from two-sided Mann–Whitney U-tests. c, Kaplan–Meier analysis of TTP stratified by high versus low percentage of CD57−Helios+ population among CD4+ (TReg-like) CAR T cells using the threshold from the discovery cohort (Fig. 3c) (n = 23). d, Pearson correlation between percentage of CAR TReg cells and log10 AUC0–28 expansion of CD8+ CAR T cells (n = 54). P value is from the correlation test. e, Radar plot showing mean values for all patients in CR or PD at 6 months (n = 64). Each axis starts at 0 and ends at the maximum value observed. f, Logistic regression model for predicting response at 6 months based on two parameters: percent of CAR TReg cells and whether pre-lymphodepletion LDH levels were above normal. Model was fit using the discovery cohort (n = 28), with parameters shown below the formula. VIF ≥ 5 indicates severe multi-collinearity. g, Performance of the model from f on discovery (n = 28) and validation (n = 23) cohorts. AUROC, area under the receiver operating characteristic. h,i, Kaplan–Meier analysis of TTP (h) and OS (i) stratified by high versus low risk using the model from f in cohorts from g. ECOG, Eastern Cooperative Oncology Group; FPR, false-positive rate; IPI, International Prognostic Index; LD, lymphodepletion; SE, standard error; TPR, true-positive rate.

We further sought to determine whether there was additional evidence of immune suppression by CAR TReg cells in vivo. An important TReg mechanism of action is suppression of T cell proliferation, especially within the effector CD8+ subset. Here, analysis of all available data showed a clear inverse and significant correlation between the CAR TReg fraction and CAR T cell expansion (Extended Data Fig. 9a), with the strongest trend observed within CD8+ CAR T cells (Fig. 6d). We also observed a trend of delayed peak neurotoxicity in patients with higher CAR TReg fraction on day 7 when stratified by ICANS severity (Extended Data Fig. 9b).

Higher levels of CD57+T-bet+ cells measured by flow cytometry in the validation cohort did not associate with improved outcomes (Extended Data Fig. 9c). The basis for this lack of validation is not entirely clear, but close examination revealed that CD57+T-bet+ populations were nearly mutually exclusive with the CAR TReg population in the discovery cohort (Extended Data Fig. 9d and data not shown), whereas, in the validation cohort, a significant fraction of CD57+T-bet+ cells expressed Helios in patients who progressed (Extended Data Fig. 9e). We, therefore, compared Helios+ to Helios− CAR T cells within the CD57+T-bet+ population using our single-cell sequencing data. Helios+CD57+T-bet+ cells did not express TReg markers FOXP3, IL2RA or RTKN2 but were enriched for expression of TIGIT and markers of natural killer (NK) cells (Extended Data Fig. 9f–h), indicating similarity to the NK-like subset recently associated with poor outcomes after CAR T cell therapy46. Thus, we observed heterogeneity within CD57+T-bet+ CAR T cells that appears relevant to clinical response, suggesting that these populations may include multiple cell states.

Synergy between CAR TReg cells and a tumor burden surrogate

In both discovery and validation cohorts, we observed a few patients who progressed despite having a low fraction of CAR TReg cells (Figs. 3b and 6a). Increased pre-lymphodepletion LDH level, which provides a surrogate for tumor burden, has emerged as a key correlate of progression5. We, therefore, examined CAR TReg cell levels in the context of LDH and other clinical parameters. CAR TReg fraction was not significantly correlated with LDH (Extended Data Fig. 10a) and was similar in patients with normal or high LDH (Extended Data Fig. 10b). However, both CD4+CD57−Helios+ CAR T cell and LDH levels effectively distinguished patients with CR or PD at 6 months (Fig. 6e). Moreover, patients who progressed at 6 months had more circulating CAR TReg cells than those in CR at 6 months in both low and high LDH groups (Extended Data Fig. 10c). Therefore, high LDH and CAR TReg fraction are two independent determinants of clinical progression, prompting us to combine these two features in a logistic regression model (Fig. 6f). We fit this model and identified an optimal classification threshold using the discovery cohort and then tested on the validation cohort, with excellent performance in both cohorts (Fig. 6g). We used the model risk (Supplementary Table 5) to stratify patients for Kaplan–Meier analysis and observed that low-risk patients had a significantly better TTP (Fig. 6h) and OS (Fig. 6i). Notably, this model had better predictive performance with superior TTP and OS separation than models using each feature alone (Extended Data Fig. 10d–g).

Together, these results demonstrate that increased CAR TReg cells circulating early after infusion predict clinical progression and mild neurotoxicity and provide evidence for diminished CAR T cell expansion in patients with high circulating CAR TReg cells. Combining a tumor burden surrogate with enumeration of post-infusion CAR TReg cells yields a powerful predictor of durable complete response versus progression.

Discussion

Identification of biomarkers associated with response or toxicity after CAR T cell therapy could enable earlier interventions to improve outcomes and provide novel mechanistic insights into response and toxicity. Immunokinetic measurements of total CAR T cell levels in LBCL have been performed in numerous studies2,6,8,15,47,48,49,50,51,52 and generally demonstrate inconsistent or weak correlations with response. In line with this, in 32 patients with LBCL receiving commercial axi-cel at our institution, we observed that neither flow cytometric nor qPCR-based measurement of CAR T cell expansion associated with clinical response or long-term disease control (Figs. 1f and 6e and Extended Data Fig. 1b–h). This could reflect the divergent mechanisms of resistance to CAR T cells11, which include inadequate CAR T cell potency associated with low CAR T cell levels but also intrinsic tumor resistance due to antigen-negative or antigen-low variants. Furthermore, efficient CAR T cell homing to the tumor may result in misleadingly low counts in blood, and patients with lower tumor burdens can experience disease control with relatively low levels of CAR T cell expansion in the peripheral blood.

We discovered that CD4+CD57−Helios+ CAR T cells circulating at peak expansion (day 7) associate with progression at 6 months and milder neurotoxicity (Fig. 3b,e). Prospective analysis of the second cohort comprising 31 patients with LBCL validated these associations (Fig. 6a,b) and identified an inverse correlation with CAR T cell expansion in vivo (Fig. 6d and Extended Data Fig. 9a). These cells are nearly exclusively FOXP3+CD25High and polyclonal and have a TReg gene expression program and low cytotoxic potential (Figs. 2g, 4 and 5g and Extended Data Figs. 2a and 8f), consistent with thymic-derived classical TReg cells53, whose suppressive function would be expected to diminish anti-tumor response, toxicity and expansion. Although we initially identified this population using Helios, quantification of CAR TReg based on FOXP3 led to nearly identical results (Fig. 6a,b). CAR TReg expansion in vivo was independent of the tumor burden quantified by LDH levels and synergistic with LDH as a predictor of response (Fig. 6f–i and Extended Data Fig. 10). Specific modulation of this subset in the graft or in vivo could provide a novel approach to either enhance anti-tumor effect or diminish neurotoxicity. Although we did not observe CAR TReg differences before infusion (Extended Data Fig. 5a), given the lack of access to axi-cel products, we could not determine whether expansion correlates with levels contained in the manufactured product. Future studies will examine the epigenetic state of the TReg-specific demethylated region (TSDR) within the FOXP3 gene54 and additional functional properties of this population.

The functional and phenotypic features of CD57+T-bet+ CAR T cells described here (Figs. 2, 4 and 5) are consistent with CD57+ terminal effector T cells55,56 in the early stages of senescence that remain highly cytolytic34,38,57. The senescence program may eventually limit the longevity and renewal of these cells, leading to a rapid decline of circulating CAR T cell counts by week 4 after infusion (Fig. 1d). In addition, higher expression of exhaustion marker CD101 in this subset or CD39 in CD57− cells was linked to progression (Extended Data Fig. 3). Helios could trigger an NK cell transition program46 in CD57+T-bet+ subsets in patients who progress (Extended Data Fig. 9). However, this was not the case in our discovery cohort, and a CAR T cell cluster expressing CD57, T-bet and Helios was observed in a patient with chronic lymphocytic leukemia with a decade-long remission58, indicating that the biology of this subset still needs to be unraveled. Overall, our work highlights the importance of senescence and NK cell transition, in addition to exhaustion, as potential sources of CAR T cell dysfunction.

Previous studies focused on the CAR T cell products have demonstrated that stemness features correlate with improved outcome12,15, and CAR T cells in infusion products typically do not express CD57 (ref. 35) (Extended Data Fig. 7b). We postulate that CAR T cells with high levels of fitness proliferate and differentiate into cytolytic CD57+T-bet+ CAR T cells early after infusion, leading to a sustained malignant cell clearance but accompanied by severe toxicity when peripheral CAR T cell expansion is high. However, polyclonal CAR TReg cells limit expansion, toxicity and response. In addition, TEFF-intrinsic exhaustion, senescence and NK cell transition may limit CAR T cell function. Here, CAR T cell efficacy still depends on tumor burden, adequate expression of the targeted antigen by the tumor and effective CAR T cell trafficking and function within the tumor microenvironment. Indeed, although CAR T cell products are typically polyclonal12,59, CD57+T-bet+ CAR T cells demonstrated significant clonal restriction, consistent with a recent proliferative burst (Fig. 5h). Together, our data are consistent with CD57 marking clonally expanded CAR T cell populations, thereby providing a surrogate for CAR T cells that have undergone extensive proliferation in vivo.

In summary, despite the early successes of CAR T cell immunotherapies, many challenges remain to develop CAR T cells that induce durable and complete remissions in all patients with cancer with acceptable toxicity. Our results highlight the power of deep single-cell correlative studies to interrogate the biology of a successful CAR T cell response and toxicity. Using this approach, we identified circulating CAR TReg cells as a novel biomarker of progression and reduced toxicity and, thus, defined T cell characteristics that may be targets of manipulation in advanced therapies.

Methods

Patients

Patients with LBCL who received axi-cel as a standard-of-care treatment at the Stanford Hospital Center were eligible for this study. Patients provided informed consent for research using their blood samples and de-identified health information as a part of the Clinical Outcomes Biorepository protocol that was approved by the institutional review board (IRB) of Stanford University (IRB no. 43375). Clinical data were obtained retrospectively from chart review (Supplementary Table 1). Treatment response was assessed radiographically according to the Lugano criteria60, with date of progression defined as the date of radiographic progression. Toxicity was evaluated by the Lee criteria61 for CRS and by ICANS grading10 for neurotoxicity.

CAR T cell production

Where indicated, healthy donor T cells expressing the CD19-CD28ζ CAR were used as controls. As in axi-cel, the CD19-CD28ζ CAR contains the CD19-targeted FMC63 scFv, the transmembrane and intracellular domains of CD28 and the intracellular domain of CD3ζ. CAR T cells were produced as described previously20. In brief, de-identified human buffy coats were obtained from healthy adult donors under informed consent according to an IRB-exempt protocol (Stanford Blood Center). T cells were isolated using the RosetteSep Human T Cell Enrichment Kit (STEMCELL Technologies) according to the manufacturer’s instructions using Lymphoprep density gradient medium and SepMate-50 tubes. For long-term storage, T cells were cryopreserved at 1–2 × 10⁷ T cells per vial in 1 ml of CryoStor CS10 cryopreservation medium (STEMCELL Technologies) and stored in liquid nitrogen. The retroviral vector encoding CD19-CD28ζ CAR was described previously62. This vector was produced using the 293GP packaging cell line co-transfected with RD114 envelope plasmid and MSGV plasmid encoding the CD19-CD28ζ CAR using Lipofectamine 2000 (Thermo Fisher Scientific). Supernatant was collected at 48 hours and 72 hours after transfection, centrifuged to remove cell debris and frozen at −80 °C for future use. Cryopreserved T cells were thawed and activated on the same day with Human T-Expander CD3/CD28 Dynabeads (Gibco) at a 3:1 bead:cell ratio in T cell medium containing AIMV (Gibco) supplemented with 5% heat-inactivated FBS (Sigma-Aldrich), 10 mM HEPES (Gibco), 2 mM GlutaMAX (Gibco), 100 U ml−1 of penicillin and 100 μg ml−1 of streptomycin (Gibco). Recombinant human IL-2 (PeproTech) was added at 100 U ml−1. T cells were transduced with the retroviral vector on days 2 and 3, maintained in T cell medium with IL-2 and cryopreserved on days 11 or 12 after activation.

Flow cytometry assay to monitor CAR T cell expansion

Blood samples were collected before lymphodepletion, on day 0 (the day of axi-cel infusion), day 7 (± 2), day 14 (± 4), day 21 (± 4) and day 28 (± 4) after infusion. PBMCs were isolated from 8 ml of fresh whole blood by density gradient centrifugation using Ficoll-Paque Plus (GE Healthcare) according to the manufacturer’s instructions. PBMCs were stained with fixable Live/Dead Aqua amine-reactive viability stain (Invitrogen, L-34965). Fc receptors were blocked with Human TruStain FcX (BioLegend, 422302) for 5 minutes to prevent non-specific antibody binding. Cells were then stained at room temperature with the fluorochrome-conjugated antibody panel (Supplementary Table 3). In-house CAR T cells were used as a positive control included in daily staining experiments. Stained and fixed cells were acquired on a four-laser LSRII flow cytometer (BD Biosciences; blue: 488-nm, violet: 405-nm, red: 640-nm and green: 532-nm lasers; 21 parameters). At least 10⁶ cells were acquired unless restricted by the number of cells isolated from blood. CD4+ and CD8+ CAR T cells were gated using Cytobank software as CD3ε+CD4+CD8α−CAR+ or CD3ε+CD4−CD8α+CAR+ events among single viable CD45+ lymphocytes, as defined by the forward and side scatter gates. Absolute CAR T cell numbers were calculated by multiplying the percentages of CAR T cells among lymphocytes by the absolute lymphocyte count obtained on the same day. The assay limit of detection (LOD) was calculated as 1 in 10⁴ of total acquired PBMCs or 0.0125 cells per µl.
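
The absolute count calculation described above can be illustrated with a minimal Python sketch. The function name and the example values are hypothetical and are not taken from the study; the sketch only restates the arithmetic (CAR T cell fraction among lymphocytes multiplied by the same-day absolute lymphocyte count).

```python
def absolute_car_t_count(car_t_percent_of_lymphocytes, alc_cells_per_ul):
    """Absolute CAR T cells per µl = (% CAR T among lymphocytes / 100) x ALC.

    Both argument names are illustrative; ALC is the absolute lymphocyte
    count (cells per µl) obtained on the same day as the flow sample.
    """
    if not 0 <= car_t_percent_of_lymphocytes <= 100:
        raise ValueError("percentage must be between 0 and 100")
    if alc_cells_per_ul < 0:
        raise ValueError("absolute lymphocyte count cannot be negative")
    return car_t_percent_of_lymphocytes / 100.0 * alc_cells_per_ul


# Hypothetical example: 12.5% CAR T cells among lymphocytes, ALC of 800 cells per µl
print(absolute_car_t_count(12.5, 800))  # 100.0
```

Values falling below the assay LOD stated above would need separate handling, which this sketch does not attempt.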

qPCR assay to monitor CAR T cell expansion

DNA was extracted from 2–5 × 10⁶ PBMCs using QIAamp DNA Blood Mini Kit (Qiagen, 51306). FMC63 sequence of the CD19-CD28ζ CAR and albumin control sequence were quantified by qPCR using the primer and probe sets listed in Supplementary Table 3. For the standard curve, a custom minigene plasmid was designed containing a partial CAR sequence and a partial albumin sequence, which served as a control for normalization. The standard curve contained a ten-fold serial dilution of plasmid between 5 and 5 × 10⁶ copies. Both plasmid and patient DNA from each timepoint were run in triplicate, with the mean of three replicates reported. Each reaction contained 5 µl (50 ng) of DNA, 100 nM forward and reverse primers and 150 nM probe resuspended in 10 µl of TaqMan Fast Universal PCR Master Mix (2×), No AmpErase UNG or equivalent (Thermo Fisher Scientific) and 5 µl of TE buffer (Invitrogen, AM9935). The Bio-Rad CFX96 Touch Real-Time PCR Detection System was used for qPCR with 20 µl per reaction. The quality metrics for all qPCR standard curve results were R² > 0.99, −3.38 > slope > −3.71 and efficiency > 90%. The assay LOD was calculated as eight copies per 50 ng of DNA reaction with 95% confidence.
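
To make the standard curve quality metrics concrete, the following Python sketch fits Ct against log10(copies) by ordinary least squares and applies the three criteria stated above. It assumes the conventional efficiency formula E = 10^(−1/slope) − 1, which the text does not spell out; all function names and the synthetic Ct values are hypothetical.

```python
import math


def fit_standard_curve(copies, ct_values):
    """Least-squares fit of Ct against log10(copies); returns slope, intercept, R^2."""
    x = [math.log10(c) for c in copies]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(ct_values) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in ct_values)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


def amplification_efficiency(slope):
    """Conventional qPCR efficiency (assumed here): 1.0 corresponds to 100%."""
    return 10 ** (-1.0 / slope) - 1.0


def curve_passes_qc(slope, r_squared):
    """Apply the three stated criteria: R^2, slope window and efficiency."""
    return (r_squared > 0.99
            and -3.71 < slope < -3.38
            and amplification_efficiency(slope) > 0.90)


def copies_from_ct(ct, slope, intercept):
    """Invert the standard curve to estimate copies for a sample Ct."""
    return 10 ** ((ct - intercept) / slope)


# Synthetic ten-fold dilution series from 5 to 5 x 10^6 copies (hypothetical Ct values)
copies = [5 * 10 ** k for k in range(7)]
cts = [38.0 - 3.45 * math.log10(c) for c in copies]
slope, intercept, r2 = fit_standard_curve(copies, cts)
print(curve_passes_qc(slope, r2))
```

In practice, the mean of the triplicate Ct values would be converted with `copies_from_ct` and normalized to the albumin control; that step is omitted here.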

CyTOF

PBMCs cryopreserved before axi-cel infusion (before lymphodepletion or on day 0), on day 7 (± 2) and on day 21 (± 4) were analyzed by CyTOF in batches of 18 samples (six patients). Each batch also contained PBMCs from a healthy donor and in-house CAR T cells as controls. Samples were processed as previously described29. In brief, cryopreserved PBMCs were thawed into 9 ml of cell culture medium (CCM; RPMI 1640 containing 10% FBS, 10 mM HEPES, 2 mM GlutaMAX, 100 U ml−1 of penicillin and 100 μg ml−1 of streptomycin) supplemented with 25 U ml−1 of benzonase (Sigma-Aldrich, E8263-5KU). Cells were then pelleted for 5 minutes at 300g, resuspended in 10 ml of CCM, filtered through a 70-µm strainer and counted.

To stain cells for viability63, cisplatin (Sigma-Aldrich, P4394) was reconstituted to 100 mM in DMSO and incubated at 37 °C for 3 days to prepare a stock solution, which was then stored in aliquots at −20 °C. Cell pellets were resuspended in 1 ml of PBS containing 0.5 µM cisplatin, gently vortexed, incubated for 5 minutes at room temperature, quenched with 3 ml of CCM, pelleted and resuspended in 1 ml of CCM. Cells were fixed by adding 16% paraformaldehyde (PFA; Electron Microscopy Sciences) to a final concentration of 2%, gently vortexed, incubated for 10 minutes at room temperature and washed twice with cell staining media (CSM; PBS with 0.5% BSA and 0.02% sodium azide) to remove residual PFA. All centrifuging steps for fixed cells were done for 5 minutes at 600g and 4 °C. With the exception of antibody titrations, samples were palladium-barcoded and pooled as described64 to improve staining consistency.

CyTOF antibody panels are listed in Supplementary Table 3. With the exception of antibodies purchased from Fluidigm, all antibodies were conjugated to reporter metal isotopes in-house and titrated to determine optimal staining concentrations before incorporating into a staining panel. Antibodies were conjugated using MaxPar Antibody Conjugation Kit (Fluidigm) and titrated on cells both positive and negative for the target antigen expression to identify concentration yielding the best signal-to-noise ratio. Fc receptor blocking was performed with Human TruStain FcX (BioLegend, 422302) following the manufacturer’s instructions to prevent non-specific antibody binding. Antibodies against surface antigens were pooled into a master mix in CSM yielding 50 µl (350 µl if barcoded) of final reaction volumes per sample and filtered through a 0.1-µm filter (Millipore, UFC30VV00) for 5 minutes at 1,000g to remove antibody aggregates. Antibody master mix was then added to each sample and resuspended, and cells were incubated for 30 minutes at room temperature. After the surface stain, cells were washed with CSM, permeabilized with 4 °C methanol for 10 minutes on ice, washed twice with CSM, stained with an antibody master mix (prepared as above) against intracellular antigens in 50 μl (350 µl if barcoded) of CSM for 30 minutes at room temperature and washed once with CSM. To stain DNA, cells were incubated in PBS containing 1:5,000 191Ir/193Ir MaxPar Nucleic Acid Intercalator (Fluidigm) and 1.6% PFA for 1–3 days at 4 °C. Just before analysis, cells were washed once with CSM and twice with filtered double-distilled water, resuspended in normalization beads65 (EQ Beads, Fluidigm), filtered and placed on ice. During event acquisition, cells were kept on ice and analyzed on the Helios mass cytometer (Fluidigm). 
In addition to reporter metal isotopes listed in antibody panels, we recorded event length, width and channels 102Pd, 104Pd, 105Pd, 106Pd, 108Pd and 110Pd (barcoding); 140Ce, 151Eu, 153Eu, 165Ho and 175Lu (bead normalization); 191Ir and 193Ir (DNA); 195Pt and 196Pt (dead cells); and 138Ba (to help define single cells).

CyTOF data processing

CyTOF data were processed using R (www.r-project.org) and Bioconductor (www.bioconductor.org) software (Supplementary Table 3). Raw FCS files were bead normalized65 using premessa and concatenated using CATALYST. Panels were harmonized and samples were debarcoded64 using premessa. Cells were gated using Cytobank software based on event length and 191Ir/193Ir (DNA) content29. Single live non-apoptotic cells were gated based on 195Pt (viability)63, cleaved PARP (cPARP) (apoptotic cells) and 138Ba (single cells). Lymphocytes were selected based on CD45 expression. Next, data were transformed using inverse hyperbolic sine (asinh) with a cofactor of 5 (and cofactor of 30 for CD4 and CD57) using flowCore66 and ncdfFlow. Batch normalization was performed using cydar32. T cells were gated as CD4+ or CD8α+ events, and CAR+ and CAR− T cells were selected based on CAR expression using Cytobank.
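
The asinh transformation with marker-specific cofactors can be sketched as follows. This is an illustrative Python restatement, not the flowCore code used in the study; the dictionary and function names are hypothetical, and only the cofactors (5 by default, 30 for CD4 and CD57) come from the text.

```python
import math

DEFAULT_COFACTOR = 5
# Markers given a larger cofactor in the text
MARKER_COFACTORS = {"CD4": 30, "CD57": 30}


def asinh_transform(intensity, marker):
    """Return asinh(intensity / cofactor) using the cofactor assigned to the marker."""
    cofactor = MARKER_COFACTORS.get(marker, DEFAULT_COFACTOR)
    return math.asinh(intensity / cofactor)


# Hypothetical raw ion counts for one event
event = {"CD3": 250.0, "CD4": 900.0, "CD57": 0.0}
transformed = {marker: asinh_transform(value, marker) for marker, value in event.items()}
print(transformed)
```

A larger cofactor widens the near-linear region around zero, which compresses low-level signal for the markers it is applied to.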

Circulating CAR T cells on day 7 after infusion were organized with self-organizing maps (SOMs) into 25 clusters, connected into an MST and grouped into ten metaclusters using FlowSOM67 implementation in Cytobank. Differential abundance analysis was performed using cydar32 on CAR T cells in blood on day 7 that were subsampled to 400 cells per patient.

CAR T cell functional assay

PBMCs cryopreserved on day 7 after axi-cel infusion were thawed and counted (as in CyTOF section). Up to 2 × 10⁶ live cells were stimulated with 5 ng ml−1 of phorbol 12-myristate 13-acetate (PMA; Sigma-Aldrich, P1585) and 500 ng ml−1 of ionomycin (Sigma-Aldrich, I0634). To stimulate CAR T cells through CAR, a flat-bottom non-TC-treated 96-well plate (Falcon, 351172) was coated with 5 µg ml−1 of anti-FMC63 scFv idiotype antibody27 in 100 µl of PBS (Gibco) for 16 hours at 4 °C. Just before seeding the cells, coated wells were washed twice with 200 µl of PBS and immediately seeded with 300,000 live cells. All test wells contained 1× monensin (Thermo Fisher Scientific, 00-4505-51) and one test of CD107a-BV785 antibody (Supplementary Table 3) and were filled to 200 µl with CCM. All samples were analyzed in one batch. Healthy donor T cells and in-house CAR T cells were included as controls. Once the assay was set up, the 96-well plate was centrifuged at 200g for 3 minutes and incubated at 37 °C for 6 hours. Cells were then transferred into a 96-well V-bottom plate (Corning, 3894) and analyzed by intracellular flow cytometry (see below).

Intracellular flow cytometry assay

After CAR T cell functional assay, cells were stained with Fixable Viability Stain 780 (BD Biosciences, 565388) for 5 minutes at room temperature. For the validation cohort analysis, we stained day 7 thawed and counted cryopreserved PBMCs with Fixable Aqua Dead Cell Stain Kit (Thermo Fisher Scientific, L34957). Fc receptors were blocked with Human Fc block (BD Biosciences, 564220) in FACS buffer (PBS with 2% FBS). Next, cells were stained with surface and intracellular antibody master mixes (Supplementary Table 3) using Foxp3 / Transcription Factor Staining Buffer Set (eBioscience, 00-5523-00) following the manufacturer’s instructions. Finally, cells were resuspended in 200 µl of FACS buffer and analyzed on an LSRFortessa flow cytometer (BD Biosciences; for CAR T cell functional assay) or a five-laser LSRII flow cytometer (BD Biosciences; for validation cohort analysis). Data were analyzed using Cytobank software. For the CAR T cell functional assay, single live CD4+ and CD4− CAR T cells were gated as CD19−CD33−CAR+ events among single viable lymphocytes, as defined by the forward and side scatter gates. For the validation cohort analysis, we gated CD3ε+CD4+CD8α−CAR+ or CD3ε+CD4−CD8α+CAR+ events among single viable CD14−CD19−CD33−CD45+ lymphocytes, as defined by the forward and side scatter gates. Owing to considerable sample degradation and paucity of cells in the validation cohort, we only reported values for CD4+ and CD8+ CAR T cell populations with ≥100 cells detected. When analyzing Helios+ cells within a rare CD57+T-bet+ population, we reported values for all patients with ≥10 CD4+ or CD8+ CD57+T-bet+ CAR T cells detected.

Single-cell sequencing

5′ single-cell sequencing of whole transcriptome (scRNA-seq), αβTCR (scTCR-seq) and cell surface epitope expression (CITE-seq) of sorted CAR T cells was performed using the 5′ Immune Profiling with Feature Barcoding technology (10x Genomics) according to the manufacturer’s protocol. In brief, fresh PBMCs were obtained on day 7 after axi-cel infusion, resuspended in PBS with 0.2% BSA and stained with Human TruStain FcX (BioLegend, 422302), LIVE/DEAD Fixable Blue Dead Cell Stain Kit (Thermo Fisher Scientific, L34962), antibodies for fluorescence-activated cell sorting (FACS) (Supplementary Table 3) and CITE-seq antibodies (Supplementary Table 3) in the presence of 1% dextran sulfate sodium salt (Thermo Fisher Scientific, AC441490050) to prevent oligo-labeled antibody aggregation. Next, 50,000–70,000 CAR T cells (single live CD4+ or CD8α+CD235a−CAR+ events) were sorted on a FACSAria Fusion cell sorter (BD Biosciences) to ≥95% purity. Debris was excluded by adjusting the threshold setting and applying a scatter gate during FACS. Sorted CAR T cells were counted, resuspended to 700–1,200 cells per µl and captured using Single Cell Chip A on the 10x Chromium Controller (10x Genomics) to generate gel bead-in emulsions (GEMs). Reverse transcription inside GEMs was performed using a C1000 Touch Thermal Cycler (Bio-Rad). Barcoded complementary DNA (cDNA) was recovered through post-GEM-RT cleanup and PCR amplification. Recovered cDNA was amplified and used to construct 5′ whole transcriptome, αβTCR and cell surface epitope libraries. Quality of cDNA and each library was assessed using an Agilent 2100 Bioanalyzer. The libraries were indexed using a Chromium i7 Sample Index Kit, pooled and sequenced on HiSeq 4000 or NovaSeq 6000 systems (Illumina).

Single-cell sequencing data processing

Raw single-cell sequencing data were processed using Cell Ranger software (10x Genomics). Sequencer’s base call files (BCLs) were demultiplexed into FASTQ files using the cellranger mkfastq pipeline. FASTQ reads for scRNA-seq and CITE-seq were aligned to the GRCh38 human genome reference (for scRNA-seq) and to a custom Feature Barcode reference (for CITE-seq) using the cellranger count pipeline. FASTQ reads for scTCR-seq were aligned to the vdj-GRCh38 reference using the cellranger vdj pipeline.

Unique molecular identifier (UMI) count matrices from Cell Ranger were analyzed using Seurat37,68. Clonotype data were included into the metadata for each sample. Dead cells and cell debris with >15% of UMI counts mapping to mitochondrial genes or <300 genes detected were excluded from the analysis. Cell doublets containing >10,000 genes or >100,000 UMI counts were also excluded. Next, scRNA-seq data were normalized by the sequencing depth using the SCTransform pipeline69 using a single model for all samples. CITE-seq data were first normalized using centered log ratio (CLR) transformation with unsatisfactory results. We, therefore, normalized CITE-seq data by the sequencing depth for all samples together using SCTransform, followed by asinh transformation of SCT count data with custom factors for each marker. We predicted surface protein expression for the missing CD8α CITE-seq antibody in CAR T cells from patient 116 (Supplementary Table 3). To do this, we built a weighted nearest neighbor (WNN) embedding reference and a mapping between the scRNA-seq and CITE-seq data37, using the data from all patients except 116 with FindMultiModalNeighbors, RunSPCA, FindNeighbors and FindTransferAnchors. We then predicted CITE-seq data for patient 116 by projecting its scRNA-seq data using MapQuery and the above reference using the transfer anchors calculated above. We only used predicted CITE-seq data for surface CD8α and used measured CITE-seq for all other surface epitopes for patient 116.
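
The cell-level exclusion criteria above can be summarized in a short Python sketch. It is a plain restatement of the stated thresholds rather than the Seurat code used in the study, and the function and argument names are hypothetical.

```python
def passes_cell_qc(n_genes, n_umis, n_mito_umis):
    """Return True if a cell barcode survives the stated QC filters.

    n_genes: number of genes detected in the cell
    n_umis: total UMI counts in the cell
    n_mito_umis: UMI counts mapping to mitochondrial genes
    """
    mito_fraction = n_mito_umis / n_umis if n_umis else 1.0
    # Dead cells and debris: >15% mitochondrial UMIs or <300 genes detected
    if mito_fraction > 0.15 or n_genes < 300:
        return False
    # Likely doublets: >10,000 genes or >100,000 UMI counts
    if n_genes > 10_000 or n_umis > 100_000:
        return False
    return True


# Hypothetical barcodes: (genes detected, total UMIs, mitochondrial UMIs)
cells = [(1500, 6000, 300), (250, 900, 30), (1800, 7000, 1400), (12000, 90000, 900)]
print([passes_cell_qc(*cell) for cell in cells])  # [True, False, False, False]
```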

Principal component analysis (PCA) of normalized scRNA-seq data was performed using all 2,000 variable genes with the exception of TCR and BCR genes to prevent clonotypes from driving the final layout, as well as sex-associated genes (XIST, RPS4Y1 and RPS4Y2) and mitochondrial genes (MT-*). In addition, a set of curated genes relevant to T cell function or cell type identification was included into the list of variable genes. PCA of normalized CITE-seq data was performed using all measured T cell epitopes. WNN-based uniform manifold approximation and projection (wnnUMAP) embedding was constructed using the first 30 principal components from scRNA-seq data and the first ten principal components from CITE-seq data. Cell cycle score and stage were assigned to each cell using the CellCycleScoring pipeline based on 97 canonical cell cycle markers45. Cell subsets and expression of additional CITE-seq markers were identified by projecting data onto an annotated reference dataset using Azimuth37. We only included predicted expression of surface CTLA4 in Extended Data Fig. 8f and did not use other predicted values. Cell population identities were assigned as shown in Fig. 5b to all cells after removal of non-T cells and aggregates of CAR T cells bound to myeloid cells. Differential expression analysis was performed using the FindMarkers pipeline for all genes expressed in ≥5% cells, with EnhancedVolcano used to visualize results in a volcano plot. Pathway enrichment analysis was performed on results from differential expression analysis using Reactome70.

Statistical analysis

Statistical analysis was performed using R statistical software. Data were summarized using dplyr and plotted with either ggplot2 or ComplexHeatmap71. To test for statistical significance in 2 × 2 tables, we used two-sided Fisher’s exact test in exact2x2. To test association of two factors with more than two levels, we used general Cochran–Mantel–Haenszel chi-squared test in vcdExtra. We applied unpaired two-sided Wilcoxon–Mann–Whitney U-tests to assess statistical significance between two groups of unpaired samples. When more than two groups were compared, we first used the Kruskal–Wallis H-test (one-way analysis of variance on ranks) to check whether there are differences among treatment groups, followed by unpaired two-sided Wilcoxon–Mann–Whitney U-test applied to each treatment pair and Bonferroni correction to correct for multiple hypothesis testing. If samples were paired, we used Friedman test, followed by pairwise two-sided Wilcoxon signed-rank tests and Bonferroni correction to adjust for multiple hypothesis testing. To assess correlation, we calculated either Spearman’s rank correlation coefficient or Pearson’s correlation coefficient, as indicated, and a P value using the correlation test. Radar plots were built using fmsb.
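
As a small worked example of the multiple-testing step, the sketch below applies a Bonferroni correction to a set of pairwise P values. It is written in Python for illustration only (the analysis itself was performed in R), the function name is hypothetical and the P values are invented.

```python
def bonferroni_adjust(p_values):
    """Multiply each P value by the number of comparisons and cap the result at 1."""
    m = len(p_values)
    return {pair: min(1.0, p * m) for pair, p in p_values.items()}


# Invented pairwise P values for three groups (three comparisons)
pairwise = {("A", "B"): 0.004, ("A", "C"): 0.03, ("B", "C"): 0.6}
adjusted = bonferroni_adjust(pairwise)
for pair, p_adj in adjusted.items():
    print(pair, round(p_adj, 3))
```

With three comparisons, an unadjusted P value must fall below 0.05 / 3 to remain significant at the 0.05 level after correction.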

AUC and AUMC were calculated using pkr72 on days 0 (no CAR T cells), 7, 14, 21 and 28. AUC0–28, AUMC0–28 and peak CAR T cell expansion were calculated only when data were obtained for at least two timepoints.

OS, PFS and TTP were estimated with the Kaplan–Meier method using survival and GGally in R. OS was determined from the date of infusion to the date of death or the last follow-up. TTP was calculated from the date of infusion to the date of progression or the last follow-up. PFS was computed from the date of infusion to the date of progression, death or the last follow-up, with both progression and death scored as an event. Owing to an increased rate of death unrelated to progression during the global pandemic, we present data as TTP, except for Fig. 1b, where comparison to PFS in a prior study is made.
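
The Kaplan–Meier estimate underlying these endpoints can be written out in a few lines. The Python sketch below is a generic product-limit estimator, not the survival package code used in the study; the function name and the example follow-up times are hypothetical. An event flag of 1 marks the endpoint event (for example, progression for TTP) and 0 marks censoring at last follow-up.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    times: follow-up time from infusion for each patient
    events: 1 if the endpoint event occurred at that time, 0 if censored
    Patients censored at time t are treated as at risk at t.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_removed = 0
        # Group all patients sharing the same follow-up time
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_removed += 1
            i += 1
        if n_events:
            survival *= 1.0 - n_events / n_at_risk
            curve.append((t, survival))
        n_at_risk -= n_removed
    return curve


# Hypothetical follow-up in months for five patients
curve = kaplan_meier([2, 3, 3, 5, 8], [1, 0, 1, 1, 0])
for t, s in curve:
    print(t, round(s, 3))
```

Switching between OS, PFS and TTP changes only how the event flag is defined for each patient, following the definitions given above.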

To construct a lasso logistic regression (‘binomial’) model, we used glmnet73 with α = 1 and ten-fold cross-validation to tune the L1 regularization parameter λ. Metaclusters 2 (cell doublets) and 10 (cell fragments) were excluded before the model construction. To fit a logistic regression model, we applied glm function in R to the discovery cohort data. We plotted the receiver operating characteristic (ROC) curve and determined the optimal probability threshold using InformationValue on the discovery cohort data. Variable importance in each model was assessed using caret74. Variance inflation factor (VIF) to assess multi-collinearity was calculated using car. The model with the selected threshold was then applied to the validation cohort to assess performance on unseen samples, including sensitivity, specificity and the misclassification error rate.
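
The two-parameter logistic model and the threshold selection step can be sketched as follows. This Python illustration makes several assumptions that should be read as such: the coefficients are placeholders (the fitted values appear only in Fig. 6f), the cutoff is chosen by minimizing the misclassification rate (the criterion used with InformationValue is not stated in the text), and all function names are hypothetical.

```python
import math


def predict_probability(treg_percent, ldh_above_normal, intercept, beta_treg, beta_ldh):
    """Logistic model: p = 1 / (1 + exp(-(b0 + b1 * %TReg + b2 * [LDH high])))."""
    z = intercept + beta_treg * treg_percent + beta_ldh * (1 if ldh_above_normal else 0)
    return 1.0 / (1.0 + math.exp(-z))


def auroc(scores, labels):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def best_threshold(scores, labels):
    """Return (cutoff, error rate) minimizing misclassification; ties keep the lowest cutoff."""
    best_cut, best_errors = None, None
    for cut in sorted(set(scores)):
        errors = sum((s >= cut) != (y == 1) for s, y in zip(scores, labels))
        if best_errors is None or errors < best_errors:
            best_cut, best_errors = cut, errors
    return best_cut, best_errors / len(scores)


# Hypothetical predicted probabilities and outcomes (1 = PD, 0 = CR at 6 months)
scores = [0.1, 0.2, 0.35, 0.6, 0.8, 0.9]
labels = [0, 0, 1, 0, 1, 1]
cutoff, error_rate = best_threshold(scores, labels)
print(cutoff, round(auroc(scores, labels), 3))
```

A cutoff chosen this way on the discovery cohort would then be applied unchanged to the validation cohort, mirroring the procedure described above.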

Figures were created with Illustrator (Adobe) and BioRender (https://biorender.com).

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.