Keywords
(English)
|
mass spectra; data mining; data preprocessing; data quality; proteomics; biomarkers; diagnosis
|
Research programme
(English)
|
COST-Action 282 - Knowledge Exploration in Science and Technology
|
Short description
(English)
|
The long-term goal of the project is to apply data mining techniques to mass spectra-based diagnosis and biomarker discovery. Its short-term objective is to find ways of preprocessing mass spectra, both to assess and control data quality and to transform the spectra into a representation appropriate for biomarker discovery.
|
Further information
(English)
|
Full name of research institution/enterprise: Hôpitaux universitaires de Genève, Laboratoire central de chimie clinique et examens biologiques
|
Partners and international organizations
(English)
|
AT, BE, BG, CY, EE, FR, DE, IE, IT, MT, NO, PL, PT, SK, ES, CH, UK
|
Summary of results (Abstract)
(English)
|
Mass spectrometers are an essential part of any proteomic workflow. They generate large amounts of complex data, and data mining technologies are well suited to unravelling this complexity. The efficacy of such techniques in extracting meaningful patterns depends strongly on the quality of the acquired spectra and their reproducibility. To support quality control and reproducibility of mass spectra, we studied the effects of acquisition conditions and parameters on the raw mass spectra, as well as the stability of the output signal over time and its reproducibility. The study was limited to a single device, namely a SELDI time-of-flight mass spectrometer (PCB-II of Ciphergen) installed at the CMU. We observed a very strong dependence of the mass spectra on hardware parameters such as the intensity of the incident laser beam, the gain of the output amplifier and the high-voltage operating values of the detector. These effects, coupled with a reproducibility of approximately 25%, lead us to strongly recommend the use of specialized calibrants on a daily basis before, during, and immediately after the experiment.
In mass spectrometry, post-acquisition data processing is an essential part of data analysis protocols. In addition to the standard tools supplied by the manufacturer, specialized software for mass spectra post-processing has been developed by our project partners (Geneva AI Lab) to enhance the extraction of peaks that are linked to the presence of a protein and its concentration. This software allowed us to evaluate the impact of instrument parameter tuning on the quality of the mass spectra produced. Using a standard Ciphergen-supplied kit containing seven peptides with known masses and concentrations in the range of 1000-7500 Da, we measured the detection sensitivity of each peptide and its m/z value, the proportion of detected peaks that did not match any of the peptides, and the number of peaks found for each peptide.
In addition, we recorded time-dependent shifts in the m/z values of the different peptides; this information will provide valuable landmarks to computer scientists who need to develop mass alignment algorithms prior to mass spectral data analysis.
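The per-spectrum quality measures described above (peaks found per peptide, proportion of unmatched peaks, and m/z shift relative to the known mass) could be computed along the following lines. This is only a minimal sketch and not the Geneva AI Lab software: the function name `evaluate_spectrum`, the relative tolerance `rel_tol`, the nearest-mass matching rule and the calibrant masses in the usage example are all assumptions made for illustration.

```python
def evaluate_spectrum(detected_mz, calibrant_mz, rel_tol=0.002):
    """Match detected peaks to known calibrant masses (hypothetical helper).

    A peak is assigned to the nearest calibrant mass if it lies within
    rel_tol * mass of it (assumed matching rule); otherwise it counts
    as unmatched.
    """
    counts = {m: 0 for m in calibrant_mz}    # number of peaks found per peptide
    shifts = {m: [] for m in calibrant_mz}   # observed minus expected m/z
    unmatched = 0

    for peak in detected_mz:
        nearest = min(calibrant_mz, key=lambda m: abs(peak - m))
        if abs(peak - nearest) <= rel_tol * nearest:
            counts[nearest] += 1
            shifts[nearest].append(peak - nearest)
        else:
            unmatched += 1

    detected = sum(1 for c in counts.values() if c > 0)
    return {
        "peaks_per_calibrant": counts,
        # fraction of calibrant peptides with at least one matching peak
        "detected_fraction": detected / len(calibrant_mz),
        # fraction of detected peaks that match no calibrant peptide
        "unmatched_fraction": unmatched / len(detected_mz) if detected_mz else 0.0,
        # mean m/z shift per peptide; None if the peptide was not detected
        "mean_shift": {m: (sum(s) / len(s) if s else None)
                       for m, s in shifts.items()},
    }


# Usage with placeholder masses (NOT the actual kit values).
calibrants = [1000.0, 2500.0, 5000.0]
peaks = [1001.0, 1000.5, 2600.0, 5002.0]
report = evaluate_spectrum(peaks, calibrants)
print(report["peaks_per_calibrant"])
print(report["unmatched_fraction"])
```

Running the same function on spectra acquired at successive time points and comparing the `mean_shift` values per peptide would give one simple view of the time-dependent m/z drift mentioned above.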
|
Database references
(English)
|
Swiss Database: COST-DB of the State Secretariat for Education and Research, Hallwylstrasse 4, CH-3003 Berne, Switzerland. Tel. +41 31 322 74 82. Swiss Project-Number: C04.0107
|