TCPA: FAQ

Frequently Asked Questions

What is functional proteomics?

Functional proteomics is the large-scale study of proteins at the level of functional activity, such as expression and modification. Studies of complex diseases such as cancer have shown that genetic alterations do not account for all causes of the disease. Changes in protein levels and structure, which are not reflected by genetic changes, also play critical roles in tumor development and progression. In cancers, several genetic and epigenetic changes are often required for the disease to develop. Studying large-scale protein modifications such as phosphorylation or cleavage will greatly aid in understanding the causes of cancers and other complex diseases and in determining effective treatments.
What is RPPA?

Reverse phase protein array (RPPA) is a high-throughput, antibody-based technique with procedures similar to those of Western blots. Proteins are extracted from tumor tissue or cultured cells, denatured by SDS, printed on nitrocellulose-coated slides, and then probed with antibodies. Our RPPA platform currently allows for the analysis of >1000 samples using at least 130 different antibodies.
What are the advantages of RPPA?
- Inexpensive, high-throughput method utilizing automation for increased quality and reliability.
- Sample preparation requirements are similar to those for Western blots.
- Complete assay requires only 40 microliters of each sample for 150 antibodies.
- Robust quantification due to serial dilution of samples.
How are the RPPA data processed?
- Level 1 data
  
  Cellular proteins are first denatured by 1% SDS (with beta-mercaptoethanol) and diluted in five 2-fold serial dilutions in dilution buffer (lysis buffer containing 1% SDS). Serially diluted lysates are arrayed on nitrocellulose-coated slides (Grace Biolabs) by the Aushon 2470 Arrayer and probed with validated antibodies. Signals are amplified by TSA and visualized by DAB colorimetric reaction. The slides are then scanned, analyzed, and quantified by ArrayPro Analyzer to generate spot intensities.
- Level 2 data
  
  Based on Level 1 data, each dilution curve of spot intensities is fitted using the monotone increasing B-spline model in the SuperCurve R package. This fits a single curve using all the samples on a slide, with the signal intensity as the response variable and the dilution step as the independent variable. For diagnostic purposes, the fitted curve is plotted with the signal intensities on the y-axis and the log2-concentration of proteins on the x-axis.
- Level 3 data
  
  Level 2 data are normalized as follows:
  1. Calculate the median for each protein across all the samples.
  2. Subtract the median (from step 1) from values within each protein.
  3. Calculate the median for each sample across all proteins.
  4. Subtract the median (from step 3) from values within each sample.
- Level 4 data
  
  As with any other biological assay, there is batch variation between RPPA assays. At this time, it is not possible to directly combine the raw or normalized (Level 3) protein values from different batches. We have developed a replicate-based method to combine RPPA data from different slides; use the RPPA dataset marked with L4 (e.g., Pan-Can 19 L4) when analyzing data across different batches.
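
The four Level 3 normalization steps listed above amount to median-centering a matrix twice, first per protein and then per sample. The following is a minimal sketch, not the TCPA pipeline itself. It assumes the Level 2 values are arranged as a samples-by-proteins NumPy array and that the sample medians in step 3 are computed on the values already centered in step 2. The function name `median_center` and the toy numbers are hypothetical.

```python
import numpy as np

def median_center(values: np.ndarray) -> np.ndarray:
    """Median-center a samples x proteins matrix, per protein and then per sample."""
    # Steps 1-2: subtract each protein's (column) median across all samples.
    centered = values - np.nanmedian(values, axis=0, keepdims=True)
    # Steps 3-4: subtract each sample's (row) median across all proteins.
    # Assumption: these medians are taken on the protein-centered values.
    centered = centered - np.nanmedian(centered, axis=1, keepdims=True)
    return centered

# Toy input: 3 samples (rows) x 3 proteins (columns), made-up numbers.
level2 = np.array([
    [1.0, 5.0, 2.0],
    [2.0, 7.0, 2.5],
    [4.0, 6.0, 6.0],
])
level3 = median_center(level2)
print(level3)
```

After the second step every sample has a median of zero, while the protein medians may drift slightly away from zero again. `np.nanmedian` is used so that missing values, if encoded as NaN, do not propagate.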
How do we quantify protein expression and modification?

We use the approach of "Supercurve Fitting" developed by the Department of Bioinformatics and Computational Biology at MD Anderson Cancer Center to quantify protein expression and modification. Briefly, a "standard curve" is constructed from 5808 spots on each slide (each slide is probed with one antibody). These spots include 5 serial dilutions of each sample plus 528 QC spots of standard lysates at different concentrations. Relative levels of protein expression and modification for each sample are determined by interpolating each sample's dilution curve onto the "standard curve" (supercurve) of the slide (antibody).
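
The interpolation step can be illustrated with a small sketch. This is not the SuperCurve algorithm, which estimates the curve and the sample concentrations jointly in R. It only shows the idea of reading a sample's relative log2 concentration off an already-fitted monotone increasing curve. The sigmoid curve, the log2 dilution offsets `0, -1, -2, -3, -4`, the function name `estimate_log2_conc`, and all numbers are assumptions made for illustration.

```python
import numpy as np

def estimate_log2_conc(intensities, curve_x, curve_y, steps):
    """Estimate a sample's relative log2 concentration from its dilution series.

    curve_x, curve_y: points on a monotone increasing standard curve
    (log2 concentration -> intensity). steps: log2 dilution offset per spot.
    """
    intensities = np.asarray(intensities, dtype=float)
    steps = np.asarray(steps, dtype=float)
    # Invert the curve: intensity is the lookup axis, which np.interp
    # requires to be increasing. Values outside the curve are clamped.
    spot_log2 = np.interp(intensities, curve_y, curve_x)
    # Remove the dilution offset so each spot estimates the undiluted sample,
    # then summarize the five estimates with the median.
    return float(np.median(spot_log2 - steps))

# Hypothetical standard curve: a sigmoid in log2 concentration.
curve_x = np.linspace(-8.0, 8.0, 161)
curve_y = 100.0 + 60000.0 / (1.0 + 2.0 ** (-curve_x))

# Simulated sample with a true relative log2 concentration of 1.5,
# spotted at five 2-fold dilutions (assumed offsets 0, -1, -2, -3, -4).
steps = np.array([0.0, -1.0, -2.0, -3.0, -4.0])
observed = 100.0 + 60000.0 / (1.0 + 2.0 ** (-(1.5 + steps)))

estimate = estimate_log2_conc(observed, curve_x, curve_y, steps)
print(round(estimate, 2))
```

Using several dilutions per sample means that at least some spots usually fall on the steep, informative part of the curve, even when the undiluted spot is near saturation.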
Can I combine all RPPA data together or RPPA data from different cancers for analysis?
When analyzing the RPPA data:
- For a single disease profiled in a single RPPA batch, either Level 3 (L3) or Level 4 (L4) data are suitable.
- For a single disease profiled in multiple batches, L4 is preferable to L3 because of batch effects.
- For multiple-disease analysis, the merged Pan-Can 19 L4 data should be used. Pan-cancer analysis beyond Pan-Can 19 is an ongoing effort; interested users are encouraged to join the TCGA Pancan working group.
Who should I contact with questions about TCPA?
- If your question is about how the RPPA data are generated or which antibodies are used, contact Dr. Yiling Lu (yilinglu@mdanderson.org).
- If your question is about a bug found on the TCPA site, contact Dr. Han Liang (hliang1@mdanderson.org).
