Mutation of TP53 is the most common genetic abnormality in head and neck squamous cell carcinoma (HNSCC) and results in an accumulation and expression of p53 protein in tumor cells. Disruptive TP53 mutations are consistently associated with poor prognosis but correlations of p53 expression with mutation or prognosis have been variable and the usefulness of p53 as a target for immunotherapy is unknown. Favorable prognosis is associated with the accumulation of T lymphocytes (TILs) in the tumor microenvironment and an immune response to p53 has been suggested by demonstration of antibodies to p53 and p53-restricted cytotoxic cells in patients with HNSCC. To investigate if p53 expression is related to accumulation of TILs, p53 expression was measured in a prospective cohort of patients with HNSCC and correlations with TILs and prognosis were determined.
The study included 534 previously untreated patients with cancers of the oral cavity (n = 273), oropharynx (n = 158), larynx (n = 81) or hypopharynx (n = 22). Expression of p53, p21, and p16 and levels of T cell infiltrates (CD4, CD8, FoxP3) were assessed by immunohistology in tissue microarrays from biopsy specimens. HPV testing by routine pathology was available for 401 patients. Associations with clinical variables were tested using Kruskal-Wallis tests (p53, p21) or Chi-square test (p16). Kaplan-Meier and Cox regression methods were used to evaluate univariable and multivariable associations of protein expression with TIL levels, overall survival (OS) and disease specific survival (DSS) after adjusting for known prognostic factors. Median follow up was 44 months.
Higher p53 protein expression was associated with worse OS in univariable (HR = 1.05; 95% C.I. = 1.02, 1.09; p = 0.002) but not in multivariable analysis and did not correlate with increased TIL infiltration. Combined TIL weighted score was a significant independent prognostic factor for OS and DSS (HR = 0.95; 95% C.I. = 0.93, 0.98; p = 0.0003 and HR = 0.96; 95% C.I. = 0.93, 0.99; p = 0.005, respectively). All of the biomarkers (p53, p21, p16, TILs) differed by HPV status and tumor site (p < 0.0001 each). Further analysis by specific tumor site was unremarkable for an association between p53 expression and TILs or prognosis. However, in multivariable analysis for oropharynx cancer, p53 expression was associated with increased risk of death (HR = 2.33; 95% C.I. = 1.00, 5.43; p = 0.05), in particular for the p16 negative oropharyngeal subgroup (HR = 8.62; 95% C.I. = 1.94, 38.4; p = 0.005).
Over-expression of p53 is an unreliable biomarker for prognosis and does not correlate with levels of TILs, suggesting that p53 neoantigens are unlikely to be useful targets for future immunotherapy. These findings were confirmed when patients were analyzed by specific tumor site. However, in the subset of HPV-negative oropharyngeal cancers, p53 expression was associated with treatment outcomes and could be a useful biomarker.
p53, Tumor infiltrating lymphocytes, Head and neck cancer
The TP53 gene and its gene products control the cell cycle by inducing arrest in response to DNA damage and apoptosis if the damage is irreparable. Inactivation of TP53 through mutations plays a critical role in early cancer development and progression. TP53 is a key tumor suppressor and TP53 mutation is the most common genetic change in malignancies, involved in 50% of human cancers as well as up to 70% of head and neck squamous cell carcinomas (HNSCC) [4-6]. A spectrum of TP53 mutations has been identified, but disruptive mutations are probably the most important and have consistently been associated with aggressive tumor behavior and poor survival in HNSCC [6-9]. The most common mutations are missense mutations in the p53 DNA-binding domain that change the binding properties or conformation of the protein, inactivate its function, and frequently result in prolonged protein half-life and accumulation in the cytoplasm, allowing detection by immunohistochemistry in tumor cells. These gene products have been thought to be excellent targets for immunotherapy. However, overexpression of p53 is an unreliable indicator of TP53 mutation because of other loss of function, dominant-negative or gain of function alterations, degradation of p53 by HPV-16 and other causes of post transcriptional modifications.
The clinical relevance of p53 protein overexpression as a biomarker and target for immunotherapy in HNSCC has not been fully elucidated and remains controversial. A major goal of the current study was to determine in a large number of patients the potential usefulness of p53 expression as an indicator of immune response and reflection of tumor infiltrating lymphocytes. It is clear that abnormalities in p53 can contribute to aggressive tumor behavior. In laryngeal carcinoma, several investigations showed no association with clinical outcome [13-17]. Some have reported a correlation with reduced survival [18-20], and others with prolonged survival. Similar conflicting data have been reported for other sites in the head and neck region [22,23]. Some investigators have reported a correlation with recurrences but not with overall survival [24,25]. Others have shown an association with improved survival. We previously reported a statistical trend for p53 mutations to predict poor survival and for p53 expression to predict response to chemotherapy. Conclusions from all of these studies have been limited by the small numbers of patients studied and by varying methods of staining and expression scoring.
Because of the frequency and important biologic effects of TP53, it is clear that biomarkers such as p53 expression could provide useful clinical information if reliable and reproducible correlations with prognosis or treatment response could be demonstrated in large homogenous groups of patients. Most recently, we have shown that the immune response in the tumor microenvironment as reflected by tumor infiltrating lymphocytes (TILs) is a critical factor in predicting prognosis [28-30]. The role of p53 protein as a neoantigen in the tumor microenvironment is currently unclear and it is unknown if over-expression of p53 protein can stimulate or suppress this critical cellular immune response. Evidence has suggested that p53-mediated responses may be responsible for the recruitment of TILs to the tumor microenvironment and can alter the immune response [31,32]. In addition, serum antibodies to p53 are frequently identified in patients with HNSCC. However, studies looking at the ability of serum p53 auto-antibodies to trigger a cellular immune response or predict prognosis have yielded contradicting findings [33-35]. The current investigation was undertaken in a large cohort of patients enrolled in a prospective epidemiology study to determine if p53 overexpression is in fact related to TIL levels in the tumor microenvironment in HNSCC subjects or associated with patient outcomes after treatment.
From November 2008 to October 2014, 1042 previously untreated patients with HNSCC were screened and subsequently approached to enroll in an institutionally approved prospective epidemiology study that included tumor tissue collection and baseline survey of epidemiological characteristics, behavioral modules, and demographics. Of all patients approached, a total of 534 patients were included in the current study (Table 1). To enhance tumor site homogeneity, only patients with the most common head and neck cancer sites and suitable biopsy tissue available were selected: oral cavity (n = 273), oropharynx (n = 158), larynx (n = 81), or hypopharynx (n = 22). Testing for human papilloma virus-16 (HPV) was performed on biopsy specimens in 401 patients, either in the research lab by an ultrasensitive method using real time competitive polymerase chain reaction and matrix assisted laser desorption/ionization time of flight mass spectroscopy, or in our routine pathology labs by in situ hybridization for HPV-16 or p16 IHC, with those results obtained from medical record review of pathology reports. Tumors were considered HPV-related in 143 patients. For the oropharynx site, 86% of tumors were HPV positive. Formalin-fixed paraffin-embedded tissue blocks from diagnostic biopsies were gathered to create tissue microarrays (TMA). Standardized treatment recommendations were made by a multidisciplinary tumor board according to tumor site and extent (Table 2). Extensive clinical data, new tumor events, and tumor status were prospectively collected and updated at each patient visit and annually until death or until patients were lost to follow-up. Median follow up was 44 months.
Table 1: Demographic Characteristics and Clinical Variables.
Formalin fixed, paraffin embedded (FFPE) biopsy tissue blocks were retrieved from the Department of Pathology at UM or outside hospitals as necessary for patients enrolled in the longitudinal epidemiology study approved by the UM Institutional Review Board. An expert head and neck pathologist (JM) confirmed tumor histology and screened for areas of > 70% cellularity and no necrosis. Areas for sampling were circled for creation of the array. Tissue microarrays were created from the biopsy specimens with sufficient tumor present. Tumor cores (0.7 mm) were taken in triplicate for each patient and representative 5-micron sections were prepared from the TMA block for immunohistochemical (IHC) staining of p53, p21 and p16 protein expression, of TIL subsets (CD4, CD8, FoxP3), and of beta-4 integrin (CD104) for tumor localization. Staining and scoring were performed as previously described and are summarized briefly below. The TMA slides were digitally scanned and stored on the public server of the Pathology department, from which they could later be retrieved with Aperio ImageScope v12 software. Cores were scored at 200x magnification.
The TMA slides were incubated in a hot-air oven at 65℃ overnight, then deparaffinized and rehydrated through xylene, graded alcohols, and buffer immersion steps. Antigen retrieval was carried out using standard HIER (heat-induced epitope retrieval) methodology in which slides were incubated in a preheated pressure cooker with Citrate buffer pH6 or Tris-EDTA buffer pH9 and blocked with horse serum (30 minutes at 25℃). Immunohistochemical staining was completed on the DAKO autostainer (DAKO, Carpinteria, CA) using LSAB+ and DAB (DAKO labeled avidin-biotin-peroxidase kits) as the chromogens. The four monoclonal antibodies and dilutions used for T cell assessment were CD4 (1:250, BioCare Medical CM153), CD8 (1:40, BioCare Medical CM154), CD104 (1:50, eBioscience 13-1049) and FoxP3 (1:200, Abcam Ab-20034). Other antibodies were used to assess expression of p53 (1:50, DAKO, M7001, clone DO-7), p21 (1:50, DAKO, M7202, clone SX118) and p16 (1:1200, Epitomics, 3562-1, clone EPR1473). Incubations were for 60 minutes and included control tissues of normal tonsil, cervix, or breast cancer as appropriate.
Based on extensive prior experience with counting infiltrating lymphocytes in both tumor parenchyma and peritumoral stroma, only TILs infiltrating in tumor parenchyma were quantified since it has proven most representative of the prognostically important immune response parameter for TILs [28,29]. TILs in the stroma or adjacent lymphoid aggregates were excluded. Appropriate negative (without primary antibodies) and positive (tonsillar tissue and various carcinomas) controls were stained concurrently on the same slides of tumors of interest.
TMA slides were stained, scanned for digital imaging, and retrieved using Aperio ImageScope version 12 software. Grid software (Measure, C Thing Software 2.01) was employed to overlay each tissue core image before counting. A technician naive to clinical status scored stained TMA slides. CD104 (beta-4 integrin) staining was used before counting TILs to identify the location and extent of carcinoma in each core. Cores were scored as having 25%, 50%, 75%, or 100% of tumor present. Cores deemed to have < 25% tumor parenchyma were not scored. Using 200X magnification, TILs stained with CD4, CD8, and FoxP3 antibodies were manually counted. Only those TILs that infiltrated the tumor parenchyma were counted. The TIL count for each core was normalized by dividing each raw value by the fraction of each 0.7 mm core considered to be tumor as indicated by CD104 staining. Normalizing TIL counts ensured that variation in counted cells was representative of increased TIL density within the tumor parenchyma and not biased by tumor proportion within cores. A semi quantitative TIL weighted sum score (TILws) that included averages of all three T cell subsets was calculated by principal component analysis as previously reported (Spector M, et al. in press) and used in statistical analysis. By using a TMA for these studies, staining conditions were uniform for each antibody tested, limiting variability from immunohistological and scoring techniques.
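The normalization step described above can be sketched in a few lines of Python. This is a minimal illustration, not the study's actual software: the function name `normalize_til_count` is hypothetical, and the handling of cores below 25% tumor (returning `None`) is an assumption based on the statement that such cores were not scored.

```python
from typing import Optional

# Tumor fractions that the text says were assigned to scored cores.
SCORED_FRACTIONS = (0.25, 0.50, 0.75, 1.00)


def normalize_til_count(raw_count: float, tumor_fraction: float) -> Optional[float]:
    """Divide a raw TIL count by the fraction of the core that is tumor.

    Returns None for cores with < 25% tumor, which were not scored.
    (Hypothetical helper; the name and None convention are assumptions.)
    """
    if tumor_fraction < 0.25:
        return None
    if tumor_fraction not in SCORED_FRACTIONS:
        raise ValueError("tumor_fraction must be 0.25, 0.50, 0.75 or 1.00")
    return raw_count / tumor_fraction


# Example: 10 CD8+ cells counted in a core that is 50% tumor
# correspond to 20 cells per full core of tumor parenchyma.
print(normalize_til_count(10, 0.50))
```

Under this sketch, two cores with the same raw count but different tumor fractions yield different normalized values, which is the bias correction the paragraph describes.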
A modified Quick score was used in the immunohistochemical evaluation of p53 expression that included averaging the product of the intensity and percentage of cells staining positive within carcinoma tissue for each tissue core as previously described. Briefly, staining was read in tissue microarrays (TMAs) by a technician naive to patient clinical status and outcome. Nuclear staining for p53 was scored for both proportion and intensity within the tumor parenchyma. Proportion of cells scored positive for nuclear staining was scored as 1 (< 25%), 2 (25-50%), 3 (50-75%), and 4 (75-100%). Intensity of nuclear staining was scored on a scale of 0-3 as follows: 0 (no staining), 1 (weak staining), 2 (moderate staining), 3 (high staining). The product of the two parameters was used to obtain an overall score with values ranging from 0-12. For analysis, scores were considered on a continuous scale and tumors were also classified as having low expression with a score of 0-5.99, or strong expression with a score of 6-12. The overall scores of triplicate core samples from each patient were averaged and the resulting mean was used for statistical analysis. Only cores in the TMA with greater than 1% tumor were scored, while cores with less than 1% tumor were reported as no tumor present. Expression of p21 IHC staining was scored according to nuclear staining intensity. Cores with less than 1% tumor cellularity were not scored and reported as no tumor present. Intensities were scored as follows: 0 (no staining), 1 (weak staining), 2 (moderate staining), 3 (high staining). The average of triplicate scores was reported and used in analysis of p21 expression. Expression of p16 was scored as either positive or negative.
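As an illustration of the p53 scoring scheme, the following Python sketch computes a per-core score (proportion score times intensity score), averages the cores for one patient, and applies the low/high cut-off at 6. The function names are hypothetical, and the assignment of exact boundary percentages (25%, 50%, 75%) to the higher category is an assumption, since the text does not say how boundary values were handled.

```python
from statistics import mean


def proportion_score(percent_positive: float) -> int:
    """Map percent of positively stained tumor nuclei to a 1-4 score."""
    if not 0 <= percent_positive <= 100:
        raise ValueError("percent_positive must be between 0 and 100")
    if percent_positive < 25:
        return 1
    if percent_positive < 50:   # boundary handling is an assumption
        return 2
    if percent_positive < 75:
        return 3
    return 4


def core_score(percent_positive: float, intensity: int) -> int:
    """Product of proportion score (1-4) and intensity score (0-3)."""
    if intensity not in (0, 1, 2, 3):
        raise ValueError("intensity must be 0, 1, 2 or 3")
    return proportion_score(percent_positive) * intensity


def patient_score(cores) -> float:
    """Mean score over a patient's scored cores (triplicate in the study)."""
    return mean([core_score(pct, inten) for pct, inten in cores])


def expression_class(score: float) -> str:
    """Low expression for scores below 6, high for scores of 6-12."""
    return "low" if score < 6 else "high"


# Three hypothetical cores: (percent positive, intensity)
cores = [(80, 3), (30, 2), (10, 1)]
score = patient_score(cores)           # (12 + 4 + 1) / 3
print(round(score, 2), expression_class(score))
```

Because the final value is a mean of up to three integer products, a patient's score can be non-integer, which is why the cut-off is stated as 0-5.99 versus 6-12.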
Associations with clinical variables were tested using Kruskal-Wallis tests (p53, p21) or Chi-square test (p16) depending on the marker distribution. Time-to-event outcomes were defined from date of initial diagnosis to date of last follow-up, or date of death for overall survival (OS) and disease specific survival (DSS). For DSS, a death from unknown or other cause was censored at date of death. The Kaplan-Meier method was used to evaluate univariable associations with outcomes based on a priori set cut offs for levels of each biomarker: p53 expression defined as low [< 6] and high [6-12]; p53 expression defined as low [0-3], medium [3-6], and high [6-12]; and p21 expression defined as none, weak [0-1], moderate [1-2], and high [2-3]. Additionally, in order to represent overall immune response, a semi-quantitative TIL weighted sum score (TILws) was calculated from a principal components analysis of individual T cell subset scores, following the methodology of Spector, et al. (in press): TILws = 0.35*CD4 + 0.35*CD8 + 0.3*FoxP3. Single variable and multivariable Cox proportional hazard models were used to test associations between markers and time-to-event outcomes. Multivariable models included covariate adjustments for age, stage, disease site, HPV status, comorbidities, treatment, batch effect and smoking history. All statistical analyses were conducted in SAS v9.3 and graphed in R (v3.0.3).
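The TIL weighted sum score is a simple linear combination, and a minimal Python sketch makes the weighting explicit. The weights are those stated in the text; the function name `til_weighted_score` is hypothetical, and the assumption that the inputs are the normalized per-subset TIL values (rather than transformed or standardized values) is not confirmed by the text.

```python
# Weights as given in the text (derived there from a principal
# components analysis reported by Spector et al., in press).
TILWS_WEIGHTS = {"CD4": 0.35, "CD8": 0.35, "FoxP3": 0.30}


def til_weighted_score(cd4: float, cd8: float, foxp3: float) -> float:
    """Compute TILws = 0.35*CD4 + 0.35*CD8 + 0.3*FoxP3.

    Inputs are assumed to be per-patient TIL values for each subset;
    any prior scaling is outside this sketch.
    """
    return (TILWS_WEIGHTS["CD4"] * cd4
            + TILWS_WEIGHTS["CD8"] * cd8
            + TILWS_WEIGHTS["FoxP3"] * foxp3)


# Hypothetical patient with CD4 = 10, CD8 = 20, FoxP3 = 5
print(til_weighted_score(10, 20, 5))
```

The weights sum to 1, so TILws stays on the same scale as the individual subset values, with CD4 and CD8 contributing equally and FoxP3 slightly less.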
Biomarkers and Clinical Characteristics. Scoring for p53 expression was on a modified Quick 12 point composite scale (proportion score x intensity score). Low expression (score < 6) included 64% of patients. High expression (score 6-12) was present in 36% of patients. Expression of p53 did not differ significantly by tumor stage, T class or N class. Expression of downstream targets of p53 included p21 (scored as none, weak, moderate or high intensity) and p16 (scored as positive or negative) and showed that a total of 26% of patients had no or weak p21 expression, 41% had moderate and 33% had high expression of p21. Positive HPV status assessed on biopsy tissue differed by tumor site with 13% of larynx, 7% of oral cavity, 11% of hypopharynx and 86% of oropharynx cancers positive (Table 3). As expected, HPV status and p16 scores were highly correlated (p < 0.01). Only 5% of HPV positive cases on biopsy tissue were p16 negative on the TMAs in the overall cohort and likewise for the oropharynx subsite.
Table 3: HPV Status by Disease Site.
All of the biomarkers (p53, p21, p16, and TILws) differed significantly by HPV status (p < 0.0001 each). Because HPV status also differed significantly by tumor site, each of the markers differed by tumor site (p < 0.0001). Likewise, differences in primary treatment (Table 2) were evident comparing oral, oropharynx and larynx tumor sites and therefore p53, p16 and p21 expression differed significantly by initial treatment modality (surgery or chemoradiation) for p53 (p = 0.03), p16 (p < 0.0001) and p21 (p = 0.0004). Expression of p53 also differed by gender (p = 0.02) which was likely due to the fact that there was an imbalance in gender distribution for oropharyngeal patients with significantly higher rates of HPV positivity in males.
Table 2: Treatments by Disease Site and Stage.
The degree of lymphocyte (TILs) infiltration in the microenvironment also differed by tumor site with highest infiltrates in oropharynx cancers and lowest in larynx cancers for each T cell subset. Each T cell subset was also significantly higher in HPV positive cancers compared to HPV negative (data not shown). Because of the association of TILs and oropharynx/HPV status, higher levels of each TIL subset were associated with chemoradiation treatment compared to primary surgery since the majority of oropharynx cancer patients were treated with primary chemoradiation (Table 2).
Scores for p53 expression were significantly lower in patients with oropharyngeal cancers (p = 0.001), HPV positive (p < 0.0001) and p16 positive (p = 0.0001) cancers (Figure 1). Similarly, p21 expression was highest in oropharynx (p < 0.0001) and p16 positive cancers (p < 0.0001). Percentage of positive p16 tumors was lowest for HPV negative oropharynx cancers, since p16 is often a surrogate marker for HPV status.
Figure 1: Box plots showing differences in p53 expression scores by tumor site, HPV status and p16 expression. P53 expression was significantly higher in oropharynx/hypopharynx subjects (p = 0.001) and in HPV and p16 negative cancers (p < 0.0001 and p = 0.0001 respectively). LA = larynx, OC = oral cavity, OP = oropharynx, HP = hypopharynx.
For the entire cohort, overall (OS) and disease specific survival (DSS) differed by tumor stage and site (Figure 2). As expected, highest survival rates were evident for oropharyngeal patients and worst for oral cavity and hypopharynx cancer patients. OS and DSS also differed by high vs. low p53 expression (Figure 3). When analyzed by univariable Cox regression modeling, p53 expression score was associated with slightly higher risk of failure for OS (HR = 1.05 [95% C.I. = 1.02, 1.09] p = 0.002) and DSS (HR = 1.06 [95% C.I. = 1.01, 1.10] p = 0.009) for every unit change. Likewise, positive p16 expression was predictive of improved OS (HR = 0.39 [95% C.I. = 0.27, 0.57] p < 0.0001) and DSS (HR = 0.33 [95% C.I. = 0.20, 0.53] p < 0.0001). In multivariable Cox modeling controlling for age, tumor stage, tumor site, co-morbidity, HPV status and smoking, p53 was no longer a significant prognostic factor and p16 was only significant for DSS (HR = 0.39 [95% C.I. = 0.18, 0.86] p = 0.02) when HPV status was in the model. If HPV status was not included, p16 was associated with significantly reduced risk of death (OS). To determine if there was an interaction of p53 with p16 or p21 expression, we tested the multivariable Cox model that included main effects of p16 and p21 and their interaction with p53. The p53/p16 interaction was significant for OS (p = 0.05) but when p16 positive and negative patients were separately analyzed and the effect of p53 estimated, we saw no significant effect of p53 within either subgroup. We did not find strong evidence for an interaction of p53, p16 or p21 expression and other survival outcomes. We examined whether p53 expression and outcomes differed according to primary treatment modality. In univariable Cox modeling stratified by initial treatment (surgery +/- radiation, n = 317; chemoradiation, n = 164) high p53 expression was associated with worse OS and DSS (HR = 1.54, 95% C.I. = 1.06, 2.23; p = 0.02 and HR = 1.64, 95% C.I. = 1.06, 2.54; p = 0.03 respectively) for patients treated with primary surgery but not for patients treated with chemoradiation. This differed from results in the oropharynx subgroup analysis where high p53 expression was associated with increased risk of death (HR = 2.30, 95% C.I. = 1.07, 4.91; p = 0.03). In multivariable analysis after controlling for age, stage, site, comorbidity, HPV and smoking, no significant association of p53 with outcomes was seen.
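To put the per-unit hazard ratio for the continuous p53 score in context, the short Python sketch below scales a per-unit HR to a larger score difference. It relies only on the standard property of the Cox model that the log hazard is linear in the covariate, so the HR for a difference of k units is the per-unit HR raised to the power k. The function name is hypothetical and the 6-point difference is an illustrative choice, not a comparison reported in the study.

```python
def hazard_ratio_for_difference(hr_per_unit: float, units: float) -> float:
    """Scale a per-unit Cox hazard ratio to a difference of `units`.

    In a Cox model the log hazard is linear in the covariate, so
    HR(k units) = HR(1 unit) ** k.
    """
    if hr_per_unit <= 0:
        raise ValueError("hazard ratio must be positive")
    return hr_per_unit ** units


# Univariable OS estimate from the text: HR = 1.05 per unit of p53 score.
# Illustrative 6-point difference on the 0-12 scale:
print(round(hazard_ratio_for_difference(1.05, 6), 2))
```

On this reading, a per-unit HR that looks small can still correspond to a noticeably higher hazard across the width of the 0-12 scale, although the multivariable results indicate that this association did not persist after adjustment.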
Figure 2: Kaplan-Meier curves for overall and disease specific survival according to tumor site and stage. LA = larynx, OC = oral cavity, OP = oropharynx, HP = hypopharynx.
Figure 3: Overall survival and disease specific survival by high (score 6-12) versus low (score 0-5.99) p53 expression score.
We previously confirmed the prognostic significance of TIL levels in HNSCC [28, (Spector M, et al., in press)]. In Cox analysis that included p53, TILs remained a significant independent prognostic variable for both OS and DSS (HR = 0.95, 95% C.I. = 0.93, 0.98; p = 0.0003 and HR = 0.96, 95% C.I. = 0.93, 0.99; p = 0.005 respectively, Table 4). Levels of CD4 and CD8 infiltrating T cells tended to correlate directly with p21 expression (p = 0.01 and p < 0.01 respectively). Positive p16 expression correlated directly with levels of each T cell infiltrate (Table 5; p < 0.01 for each). Because we had a large number of patients in our overall cohort, we were able to separately examine the potential prognostic role of p53 expression and correlation with tumor infiltration for oral cavity, oropharynx and larynx subsites individually.
Table 4: Cox proportional hazards models (univariable, multivariable) for p53, p21, p16, TILws and OS, DSS and model results for clinical variables for OS and DSS.
Table 5: p53, p21, p16 Protein Expression Correlations with TIL Subsets.
In 273 oral cavity patients with median follow up of 41 months, 53% had low or absent p53 expression. Expression of p16 was negative in 89%. Expression of p53 was associated with p16 (p = 0.01) and p21 (p = 0.03) expression. Although there was a trend for p53 and p21 to be associated with HPV status (only 7% of patients positive), this was only significant for p16 (p < 0.0001). None of p53 expression, p16 positivity or p21 score was associated with OS or DSS in univariable or multivariable Cox modeling.
In 156 oropharynx cancer patients, p53 expression was absent or low in 85% and high in only 15%. Likewise, p16 was negative in only 19%. Positive p16 was significantly associated with male gender, advanced tumor stage, HPV status, p21 and p53 score. In univariable analysis, high p53 expression was associated with decreased OS (HR = 2.30 [95% C.I. = 1.07, 4.91] p = 0.03) while p16 positivity was associated with decreased risk of death (HR = 0.27 [95% C.I. = 0.14, 0.53] p < 0.01) or death from cancer (HR = 0.31 [95% C.I. = 0.14, 0.17] p < 0.01).
In multivariable analysis adjusting for age, tumor stage, co-morbidity, HPV status and smoking, high p53 score was associated with increased risk of death and death from cancer (HR = 2.33; 95% C.I. = 1.00, 5.43; p = 0.05 and HR = 2.53; 95% C.I. = 0.94, 6.85; p = 0.07, respectively). Not surprisingly, with HPV status removed from the model, p16 positive cancer was associated with significantly reduced risk of death, relapse or death from cancer. Interestingly, with HPV in the model, low or absent p21 expression was associated with reduced risk of death from cancer (HR = 0.10 [95% C.I. = 0.02, 0.48] p = 0.004). When we analyzed multivariable models with the p16/p53 interaction in the model and fixed effects for p53, p16, age, stage, co-morbidity and smoking, we showed a trend for significant interaction for OS and DSS (p = 0.16 and p = 0.09). Analyzing p16 positive cases (N = 158) separately, p53 was not prognostic. However, for p16 negative cases (n = 30), high p53 was associated with significantly increased risk of death (HR = 8.62 [95% C.I. = 1.94, 38.37] p = 0.005) and death from cancer (HR = 21.8 [95% C.I. = 2.35, 202.14] p = 0.007). Among all of the T cell subsets, only CD8 T cell infiltrates correlated significantly with p16 score (rho = 0.3, p < 0.01) and p53 score (rho = 0.2, p = 0.02).
In the smaller subset of larynx cancer patients (n = 81), none of p53, p16 or p21 was a significant prognostic biomarker by univariable or multivariable analysis. A total of 68% of cancers had low p53 expression and 87% were p16 negative. Although not associated with outcome, p53 score correlated directly with level of CD4 infiltrates (rho = 0.3, p = 0.03).
Mutation of TP53 is one of the most common genetic changes associated with head and neck cancer and results in over-expression of abnormal p53 proteins in tumor tissues [38,39]. It has been suggested that p53 proteins could play a role as neoantigens and present targets for immunotherapy in HNSCC and other solid malignancies [11,40]. Although TP53 mutations have been demonstrated in up to 70% of patients, and deleterious mutations are associated with poor prognosis, over-expression of p53 is variable and correlations with prognosis or with TP53 mutations have been difficult to demonstrate due to variability in assays and small numbers of studied patients [8,12,41]. HPV status and levels of TILs have recently been suggested as two of the most important new prognostic factors that are gaining wide clinical acceptance [28,42,43]. With current interest in the development of effective immunotherapy regimens for head and neck cancer, it would be useful to determine if an adaptive immune response to p53 expression is clinically relevant or could provide a target biomarker for selecting patients for immune modulation. We undertook an investigation to determine if expression of p53 was associated with a clinically meaningful immune response in the tumor microenvironment (TME) as manifest by levels of TILs. The most important finding in this large prospective cohort study was the overall lack of correlation with specific levels of TILs and confirmation of the minimal usefulness of p53 over-expression as a prognostic biomarker when other variables such as tumor site and HPV status were considered.
Older studies demonstrating p53-reactive antibodies in the serum of HNSCC patients and p53-specific cytotoxic lymphocytes suggest that p53 proteins might stimulate an immune response [31,32]. Recent in vitro studies implicate p53 as a neoantigen in a variety of neoplasms that could be a good target for immunotherapy or generation of p53-specific TILs [40,43,44]. Unfortunately, little is known regarding the presentation of abnormal p53 proteins to the immune system or how variations in HLA presentation for products of differing TP53 mutations might affect attraction and retention of cytotoxic T lymphocytes.
Our data do not support a clinically meaningful host reaction to p53 over-expression in the tumor microenvironment as indicated by a lack of correlation of p53 expression with semi-quantitative TIL levels. Expression of p53 was not associated with either high or low levels of TILs and was only prognostic for worse outcomes as a univariable when tumor site and HPV status were not considered. Even without functional assessment of TILs in the current study, the significant correlation of TIL numbers with prognosis suggests that higher infiltrate levels are beneficial [28-30]. In multivariable analysis taking into account known prognostic factors, TIL levels remained a significant prognostic biomarker and p53 expression did not, while p16 positivity was significant for favorable outcome only for DSS when HPV status remained in the model.
Because of the large number of patients in this study, we were able to assess p53 expression in three major tumor sites (oral cavity, oropharynx, larynx) and associations with other clinical variables and outcomes were examined. As expected, p53 expression was low and p16 expression high for the oropharynx site due to the high proportion of HPV-related cancers which typically lack p53 mutation and show inactivation of p53 by viral E6. In general, TIL levels were higher than at other sites but did not correlate with the low expression of p53. It might have been suspected that high p53 expression would have correlated with the lower levels of TILs since we previously found lower levels associated with HPV negative cancers [28,29 (Spector M, et al, in press)]. In fact, we did see a weak direct correlation of p53 expression with CD8 TIL levels not corrected for multiple testing (R = 0.19, p = 0.05). We did find that p53 expression was associated with OS for the oropharynx site, but this seemed heavily influenced by the worse prognosis associated with the small number of HPV negative oropharynx cases where high p53 was a significant negative prognostic factor. It was interesting to note that we had dissimilar results in multivariable analysis when patients were stratified by initial treatment. Since most oropharynx patients were treated with primary chemoradiation, it was surprising to find that high p53 expression was not associated with a significant risk of death (HR = 1.32, 95% C.I. = 0.57, 3.07; p = 0.52) in patients treated with chemoradiation. Further, for patients treated with initial surgery (n = 317), high p53 was associated with significantly increased risk of death and death from cancer in univariable analysis; however, this association disappeared on multivariable analysis adjusted for known prognostic factors.
The data also indicate the lack of clinical usefulness of p53 expression for predicting prognosis for larynx and oral cavity tumor sites. This confirms and extends the findings of others that show variation in expression by tumor site and loss of prognostic significance when other clinical prognostic factors are considered. It was interesting to note that for some sites, p53 expression had weak correlations with specific TIL subsets that differed by site. For the oropharynx site, p53 expression score as a continuous variable correlated with increased CD8 infiltrates among p16 positive patients (rho = 0.2, p = 0.01), but the correlation was weak and the analysis suffered from relatively small numbers of patients. For the larynx site (n = 78), only CD4 levels correlated with p53 scores (rho = 0.3, p = 0.03). Additional studies of homogenous groups of patients combined with functional studies will be needed to determine the significance of these observations. It is interesting to note that p53 specific CD4 positive T cell subsets have been reported to respond to p53 proteins in cell line studies from both wild type and mutated cell lines, while others have shown no correlation between p53 expression and anti-p53 T cell receptor transduced T cells in a study of 48 cell lines.
The major limitations in this study included the inability to characterize specific changes in p53 proteins or TP53 mutations that were responsible for IHC overexpression. Likewise, TIL levels were based on IHC phenotype, not on function. The activity of TIL subsets was unknown, but based on functional studies by others, many TILs represent "exhausted" T cells. Although our data suggest that variant p53 protein expression is unlikely to represent neoantigens related to TIL accumulation, some forms of aberrant p53 could still be antigenic based on prior observations of autoantibodies and other in vitro evidence [31,32,47]. As with most retrospective biomarker studies, the associations with treatment are inherently biased since treatment decisions were based on clinical tumor characteristics and not on biomarkers. Thus, correlations of p53 expression need further confirmation in other large studies.
Overexpression of p53 was confirmed to be an unreliable biomarker for prognosis and did not correlate with overall levels of TILs suggesting that p53 neoantigens would be unlikely targets for future immunotherapy. These findings were also confirmed when patients were analyzed by specific tumor sites except for the small subset of HPV-negative oropharynx patients where p53 expression was associated with worse outcomes and might be a useful biomarker.
The authors thank the many investigators in the University of Michigan Head and Neck Specialized Program of Research Excellence for their contributions to patient recruitment, assistance in data collection and encouragement including Carol R. Bradford, MD, Thomas E. Carey, PhD, Douglas B. Chepeha, MD, Sonia Duffy, PhD, Avraham Eisbruch, MD, Joseph Helman, DDS, Kelly M. Malloy, MD, Scott A. McLean, MD, Tamara H. Miller, RN, Jeff Moyer, MD, Lisa Peterson, MPH, Mark E. Prince, MD, Nancy Rogers, RN, Nancy E. Wallace, RN, Heather Walline, PhD, Brent Ward, DDS, and Francis Worden, MD. We greatly thank our patients and their families who tirelessly participated in our survey and specimen collections. This work was supported by the University of Michigan Head and Neck Specialized Program of Research Excellence NIH/NCI P50CA097248, UM Rogel Cancer Center P30CA046592 and NIH/NIDCD T32 DC005356.