Training Exercises
Exercises
This document provides some example files and tutorial exercises. To run the exercises, you need to have an account on a Phenyx server. Log in to Phenyx and enjoy...
Exercise 1: simple identification
This exercise lets you perform a simple identification, use the results comparison feature, and use the calibration status export.
The first sample is a spot from a 2DE gel, digested with trypsin and analysed in data-dependent mode on an LC-QqTOF (Micromass QTOF1). Proteins are from human nuclei and were alkylated with iodoacetamide (Cys_CAM modification). The file format is dta.
Get the file test1.dta (right-click on the link and save the file locally) and perform a submission. From the Main Desktop, select Submission. In the submission window, either select all appropriate parameters manually or, for this exercise, select the "QTOF_std_1rnd_tryp_CAM" submission profile from the profiles drop-down menu at the top of the page. Add a title of your choice and add the peaklist file in the bottom part of the submission page (browse for the file). Also select dta as the file format. Click on submit.
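If you want to look inside a peaklist before submitting it, the short Python sketch below reads a dta spectrum. It assumes the common single-spectrum layout (first line: singly protonated precursor mass MH+ and charge; following lines: fragment m/z and intensity). A file that concatenates several spectra would have to be split first, and the function names are illustrative, not part of Phenyx.

```python
from typing import List, Tuple

PROTON = 1.007276  # mass of a proton in Da


def parse_dta(text: str) -> Tuple[float, int, List[Tuple[float, float]]]:
    """Parse the text of one dta spectrum.

    Assumed layout: first non-empty line = "MH+ charge",
    every following line = "fragment_mz intensity".
    """
    rows = [ln.split() for ln in text.splitlines() if ln.strip()]
    if not rows:
        raise ValueError("empty dta content")
    mh_plus, charge = float(rows[0][0]), int(rows[0][1])
    peaks = [(float(r[0]), float(r[1])) for r in rows[1:]]
    return mh_plus, charge, peaks


def precursor_mz(mh_plus: float, charge: int) -> float:
    """Convert the stored MH+ value back to the observed precursor m/z."""
    return (mh_plus + (charge - 1) * PROTON) / charge


# Made-up values, for illustration only (not taken from test1.dta).
example = "1000.5 2\n100.1 10\n200.2 20\n"
mh, z, fragment_peaks = parse_dta(example)
print(mh, z, len(fragment_peaks), round(precursor_mz(mh, z), 4))
```

To try it on a real file, read the file content with `open(path).read()` and pass the text to `parse_dta`.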
- As the submission is running, you can see a blue flag in the job history table of the Main Desktop page. To visualise the status of the submission, you can click on the blue flag.
- What proteins do you identify?
- Can you see the human nucleophosmin (P06748)? Why is it appearing in the subset?
- How many forms of the peptide DELHIVEAEAMNYEGSPIK did you find (modifications, charge states)?
- Does it make sense to identify both nucleophosmin and HNRNPC? (Remember the sample is from one unique 2DE spot)
- How many spectra gave rise to identification? (see the compounds overview page)
- Are all proteins in the nucleophosmin group covered with the same set of peptides?
- Use the result comparison: Select the job number in the desktop, choose Compare in the 'Actions on Selected' drop-down menu. You will see the result comparison main window. Select the proteins of group 1, and choose open in detailed view (bottom of the left hand menu). In the detailed result comparison view, select Best Peptide Match to see each peptide once and Job ID in the grouping choice. Click on Update. You will see proteins covered with all peptides and others with only a portion of them.
- What is the calibration status of the instrument? (export text, select error distribution report)
- Run the same dataset with Mascot and import into Phenyx (both standard and mudpit mode). Activate the result comparison feature. Can you see the same proteins?
- You can upload the results with the following urls:
- url standard scoring: http://www.matrixscience.com/cgi/master_results.pl?file=../data/20081015/FtTorzuee.dat
- url mudpit scoring: http://www.matrixscience.com/cgi/master_results.pl?file=..%2Fdata%2F20081015%2FFtTorzuee.dat&REPTYPE=peptide&_sigthreshold=0.05&REPORT=AUTO&_server_mudpit_switch=0.000000001&_ignoreionsscorebelow=0&_showsubsets=0&_showpopups=TRUE&_sortunassigned=scoredown&_requireboldred=0
- Can you explain the differences?
- Hint: some peptides missing in Phenyx have low scores in Mascot. Some additional peptides in Mascot are below the homology threshold and should be invalidated (shown in red in the detailed result comparison view).
- Recover the lower-scoring peptides by lowering the score threshold in Phenyx (new submission).
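The two Mascot URLs given in this exercise point to the same result file; the mudpit one only adds report parameters. The sketch below uses Python's standard `urllib.parse` to make that visible. The reading that `_server_mudpit_switch` is the parameter selecting MudPIT scoring is an assumption based on its name, not something this page states.

```python
from urllib.parse import parse_qs, urlsplit

STANDARD_URL = (
    "http://www.matrixscience.com/cgi/master_results.pl"
    "?file=../data/20081015/FtTorzuee.dat"
)
MUDPIT_URL = (
    "http://www.matrixscience.com/cgi/master_results.pl"
    "?file=..%2Fdata%2F20081015%2FFtTorzuee.dat&REPTYPE=peptide"
    "&_sigthreshold=0.05&REPORT=AUTO&_server_mudpit_switch=0.000000001"
    "&_ignoreionsscorebelow=0&_showsubsets=0&_showpopups=TRUE"
    "&_sortunassigned=scoredown&_requireboldred=0"
)


def query_params(url: str) -> dict:
    """Return the decoded query parameters of a URL as a flat dict."""
    parsed = parse_qs(urlsplit(url).query, keep_blank_values=True)
    return {key: values[0] for key, values in parsed.items()}


standard = query_params(STANDARD_URL)
mudpit = query_params(MUDPIT_URL)

# %2F decodes to "/", so both URLs reference the same .dat file.
same_file = standard["file"] == mudpit["file"]

# Parameters present only in the mudpit URL.
extra = {k: v for k, v in mudpit.items() if k not in standard}

print("same result file:", same_file)
for key, value in extra.items():
    print(f"{key} = {value}")
```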
Exercise 2
Identification with assessment of conflicts and use of manual validation
The sample is a mixture of proteins, treated with trypsin and analysed with a Quad-TOF instrument: mix_prot. File format: Mascot Generic Format.
- Why is rabbit myosin Q28641 identified with a methylated peptide?
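For readers who have not met Mascot Generic Format before: each spectrum sits between a `BEGIN IONS` and an `END IONS` line, with `KEY=value` headers (such as `TITLE`, `PEPMASS`, `CHARGE`) followed by one peak per line. The reader below is a minimal sketch under that assumption; it ignores file-level parameters outside the spectrum blocks, and the sample text is made up rather than taken from mix_prot.

```python
from typing import Dict, Iterator, List, Tuple

Spectrum = Tuple[Dict[str, str], List[Tuple[float, float]]]


def read_mgf(text: str) -> Iterator[Spectrum]:
    """Yield (headers, peaks) for every BEGIN IONS / END IONS block."""
    headers: Dict[str, str] = {}
    peaks: List[Tuple[float, float]] = []
    inside = False
    for raw in text.splitlines():
        line = raw.strip()
        if line == "BEGIN IONS":
            headers, peaks, inside = {}, [], True
        elif line == "END IONS":
            if inside:
                yield headers, peaks
            inside = False
        elif inside and line:
            if "=" in line and not line[0].isdigit():
                # Header line such as PEPMASS=500.25
                key, value = line.split("=", 1)
                headers[key.upper()] = value
            else:
                # Peak line: m/z, then an optional intensity
                parts = line.split()
                intensity = float(parts[1]) if len(parts) > 1 else 0.0
                peaks.append((float(parts[0]), intensity))


# Made-up two-spectrum example, for illustration only.
SAMPLE = """BEGIN IONS
TITLE=spectrum 1
PEPMASS=500.25
CHARGE=2+
100.1 10
200.2 20
END IONS
BEGIN IONS
TITLE=spectrum 2
PEPMASS=610.30
CHARGE=3+
150.0 5
END IONS
"""

for hdr, pk in read_mgf(SAMPLE):
    print(hdr["TITLE"], hdr["PEPMASS"], hdr["CHARGE"], len(pk), "peaks")
```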
- Why do we have so many Albumins?
- Human Myosin 15 is in the list of identified proteins. After manual inspection, it can be rejected: why? Hint: conflicting peptide AEVAESQVNK
- Open the protein details view for Q9Y2K3 (Human Myosin 15). In the compounds overview, reject the 2 occurrences of the conflicting peptide AEVAESQVNK and save the manual validation. What happens when you reload the protein details view for Q9Y2K3? What is the impact in the proteins overview?
- Effect of the parent tolerance on the number of identified proteins: the default parent tolerance is 0.4 Da. Given the number of conflicts and a probably more appropriate limit (see the error distribution report or the estimated error from one of the main proteins in the sample), what happens if you decrease it to 0.1 Da?
- Using queries against a reverse DB, what score threshold should you choose for, say, a 1% FDR?
- Hint: you can submit a search to a forward and a reverse version of Swiss-Prot. Then export text and choose FDR export, select uniprot_sprot and uniprot_sprot_rev as DBs, and submit. You can read the values in the resulting Excel file.
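The idea behind reading a score threshold off a forward/reverse search can be sketched in a few lines of Python. This is not the calculation performed by the Phenyx FDR export; it assumes the common estimate for separate searches, FDR(t) = (reverse hits scoring at least t) / (forward hits scoring at least t), and the function name and score values are illustrative.

```python
from bisect import bisect_left
from typing import Iterable, Optional


def score_threshold_for_fdr(
    forward_scores: Iterable[float],
    reverse_scores: Iterable[float],
    max_fdr: float = 0.01,
) -> Optional[float]:
    """Return the lowest score cut-off whose estimated FDR is <= max_fdr.

    Returns None when no cut-off satisfies the requested FDR.
    """
    forward = sorted(forward_scores)
    reverse = sorted(reverse_scores)
    best = None
    # Walk candidate cut-offs from the highest score downwards.
    for cutoff in sorted(set(forward), reverse=True):
        n_forward = len(forward) - bisect_left(forward, cutoff)
        n_reverse = len(reverse) - bisect_left(reverse, cutoff)
        if n_reverse / n_forward <= max_fdr:
            best = cutoff
    return best


# Made-up scores, for illustration only.
forward_hits = [10, 9, 8, 7, 6, 5]
reverse_hits = [5.5, 4]
print(score_threshold_for_fdr(forward_hits, reverse_hits, max_fdr=0.1))
```

With only a handful of hits the estimate is very coarse; a 1% FDR needs at least a hundred forward hits above the cut-off before a single reverse hit can be tolerated.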
Exercise 3: ETD-CID
Illustrates the complementarity of ETD and CID in a single acquisition, and the use of Swiss-Prot annotations.
The sample consists of recombinant human proteins expressed in E. coli, analysed with an Orbitrap. No chemical modifications; trypsin as cleavage enzyme.
Notes: Survey scan on the Orbitrap (parent tolerance is 0.01 Da), CID and ETD scans in the LTQ (Orbitrap - CID_LTQ_scan_LTQ and ETD ion trap - ETD-LTQ scorings, respectively). File name is MH_K2.zip (right-click on the link to download the file); it is a zip file containing 2534 dta files (7 MB). Perform the identification using both an ETD and a CID scoring (here Instrument type=ESI-LTQ-Orbitrap; scoring for parent tolerance at 0.01 Da) on the same dataset.
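Because the archive is fairly large, it can be worth checking that the download is complete before submitting it. The sketch below counts the dta entries with Python's standard `zipfile` module and compares the count with the 2534 files stated above; the function names and the local path in the usage note are assumptions.

```python
import zipfile
from typing import List

EXPECTED_DTA_FILES = 2534  # number of dta files stated for MH_K2.zip


def list_dta_members(archive) -> List[str]:
    """Return the names of all .dta entries in a zip archive.

    `archive` may be a path or a binary file-like object.
    """
    with zipfile.ZipFile(archive) as zf:
        return [name for name in zf.namelist() if name.lower().endswith(".dta")]


def download_looks_complete(archive, expected: int = EXPECTED_DTA_FILES) -> bool:
    """True when the archive holds exactly the expected number of dta files."""
    return len(list_dta_members(archive)) == expected
```

Usage, assuming the file was saved in the current directory: `download_looks_complete("MH_K2.zip")`.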
- What is the main protein identified?
- Can you see automatically identified phosphorylation sites?
- Can you verify if they make sense? (Swiss-Prot entry)
- Is Q16539 (Mitogen activated protein kinase 14) in its active form? (PTM annotation in Swiss-Prot CC lines)
- Can you verify the phosphorylation sites on both CID and ETD?
- What is the calibration status of the instrument? (export text, select error distribution report)
- Which peptides are identified specifically with CID, and which with ETD? (use the result comparison feature)
- Run the identification in 2 rounds, adding half-specific cleavage, M and W oxidation, and deamidations. Can you differentiate semi-tryptic cleavage from in-source decay? (look at the retention time / scan number in the compounds overview)
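When sorting through the second-round results, it helps to be explicit about what "half-specific" means. The sketch below counts how many ends of a peptide are consistent with trypsin, assuming the usual rule (cleavage after K or R, but not before P). The flanking residues come from the protein sequence, with "-" marking a protein terminus; the function names and the example flanks are hypothetical.

```python
TRYPSIN_SITES = {"K", "R"}


def tryptic_termini(prev_aa: str, peptide: str, next_aa: str) -> int:
    """Count the peptide ends (0, 1 or 2) that match trypsin specificity.

    prev_aa / next_aa: residues flanking the peptide in the protein,
    or "-" when the peptide starts or ends at a protein terminus.
    """
    n_term_ok = prev_aa == "-" or (prev_aa in TRYPSIN_SITES and peptide[0] != "P")
    c_term_ok = next_aa == "-" or (peptide[-1] in TRYPSIN_SITES and next_aa != "P")
    return int(n_term_ok) + int(c_term_ok)


def cleavage_class(prev_aa: str, peptide: str, next_aa: str) -> str:
    """Label a peptide as tryptic, semi-tryptic or non-specific."""
    labels = {2: "tryptic", 1: "semi-tryptic", 0: "non-specific"}
    return labels[tryptic_termini(prev_aa, peptide, next_aa)]


# Hypothetical flanking residues, for illustration only.
print(cleavage_class("R", "AEVAESQVNK", "L"))   # both ends follow the rule
print(cleavage_class("A", "EVAESQVNK", "L"))    # only the C-terminus does
```

The classification alone cannot tell a genuine semi-tryptic peptide from an in-source fragment, which is why the exercise points to the retention time and scan number: a fragment formed in the source would be expected to elute together with its fully tryptic parent.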
Exercise 4: simple iTRAQ analysis
Simple quantitation run
The following dataset is an extract of the ABRF 2006 sample processed with 4-plex iTRAQ reagent (the 114 and 117 reagents were used). The sample was analysed with an ABI 4700 TOF-TOF instrument. The peaklist file format is Mascot Generic Format. The file is named small_itraq.txt (right-click on the link to download the file).
- What protein do you identify?
- Export the reporter intensities to an Excel table for manual quantitation: text_export, select iTRAQ.
- Select the job and activate the iTRAQ quantitation menu (by default it uses the Librus workflow). Select the number of samples (2) and assign the 114 and 117 reporters to samples 1 and 2, respectively. Put reporters 115 and 116 in the unused labels. Choose to calculate 2 ratios: State 2 / State 1 and State 1 / State 1. Run the calculation.
- In the result page, see which proteins were quantified and the ratios obtained. Select the show peptide chart option. You will see the normalised intensity profile of the 1/1 ratio (State 1 / State 1) as a diagonal of green dots. State 2 / State 1 ratios are shown as orange dots. Upregulated proteins appear above the diagonal, downregulated proteins below it (note that the x and y scales might not be identical).
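For the manual quantitation route (the exported reporter intensities), a State 2 / State 1 ratio per protein can be sketched as follows. This is not the workflow used by the Phenyx quantitation menu: it simply takes the median of the peptide-level log2(117/114) ratios, with no normalisation and no isotope-impurity correction. The function name, data layout and intensities are made up for illustration.

```python
from math import log2
from statistics import median
from typing import Dict, Iterable, Tuple

PeptideRow = Tuple[str, Dict[str, float]]  # (protein accession, reporter intensities)


def protein_ratios(
    peptides: Iterable[PeptideRow],
    numerator: str = "117",    # sample 2 (State 2)
    denominator: str = "114",  # sample 1 (State 1)
) -> Dict[str, float]:
    """Median State 2 / State 1 ratio per protein from peptide reporter intensities."""
    log_ratios: Dict[str, list] = {}
    for protein, reporters in peptides:
        num = reporters.get(numerator, 0.0)
        den = reporters.get(denominator, 0.0)
        # Skip peptides with a missing or zero reporter signal.
        if num > 0 and den > 0:
            log_ratios.setdefault(protein, []).append(log2(num / den))
    return {protein: 2 ** median(values) for protein, values in log_ratios.items()}


# Made-up intensities, for illustration only.
rows = [
    ("PROT_A", {"114": 100.0, "117": 200.0}),
    ("PROT_A", {"114": 50.0, "117": 100.0}),
    ("PROT_B", {"114": 100.0, "117": 50.0}),
    ("PROT_B", {"114": 0.0, "117": 10.0}),  # ignored: no 114 signal
]
for accession, ratio in protein_ratios(rows).items():
    print(accession, round(ratio, 2))
```

Working in log space before taking the median treats a two-fold increase and a two-fold decrease symmetrically, which a median of raw ratios would not.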
