- Research Article
- Open access
Clustering Time-Series Gene Expression Data Using Smoothing Spline Derivatives
EURASIP Journal on Bioinformatics and Systems Biology volume 2007, Article number: 70561 (2007)
Abstract
Microarray data acquired during time-course experiments allow temporal variations in gene expression to be monitored. An original postprandial fasting experiment was conducted in the mouse, and the expression of 200 genes was monitored with a dedicated macroarray at 11 time points between 0 and 72 hours of fasting. The aim of this study was to provide a relevant clustering of the temporal profiles of gene expression. This was achieved by focusing on the shapes of the curves rather than on the absolute level of expression. Specifically, we combined spline smoothing and first-derivative computation with hierarchical and partitioning clustering. A heuristic approach was proposed to tune the spline smoothing parameter using both statistical and biological considerations. Clusters were illustrated a posteriori through principal component analysis and heatmap visualization. Most results were in agreement with the literature on the effects of fasting on the mouse liver, and they provide promising directions for future biological investigations.
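The pipeline summarized in the abstract (smooth each profile with a spline, take the first derivative, then cluster the derivative curves) can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes SciPy's `UnivariateSpline`, whose smoothing factor `s` is parameterized differently from the smoothing parameter tuned in the paper, and it uses Ward hierarchical clustering only. The evenly spaced time grid, the synthetic profiles, and the helper names `derivative_profiles` and `cluster_by_shape` are hypothetical placeholders, not data or code from the study.

```python
import numpy as np
from scipy.interpolate import UnivariateSpline
from scipy.cluster.hierarchy import linkage, fcluster


def derivative_profiles(times, expr, smooth, grid_size=50):
    """Fit a cubic smoothing spline to each row of `expr` and return
    its first derivative evaluated on a common time grid."""
    grid = np.linspace(times.min(), times.max(), grid_size)
    rows = []
    for y in expr:
        spline = UnivariateSpline(times, y, k=3, s=smooth)
        rows.append(spline.derivative(n=1)(grid))
    return np.vstack(rows)


def cluster_by_shape(times, expr, n_clusters, smooth):
    """Cluster profiles on their derivative curves, so that a constant
    offset in expression level does not affect the grouping."""
    deriv = derivative_profiles(times, expr, smooth)
    tree = linkage(deriv, method="ward")
    return fcluster(tree, t=n_clusters, criterion="maxclust")


# Assumption: 11 evenly spaced time points between 0 and 72 hours.
# The abstract gives only the count and the range, not the actual design.
times = np.linspace(0.0, 72.0, 11)

# Synthetic profiles: two rising and two falling curves at different
# absolute levels, with a little seeded noise.
rng = np.random.default_rng(0)
trend = 2.0 * times / 72.0
expr = np.vstack([
    1.0 + trend,   # rising, low level
    5.0 + trend,   # rising, high level
    3.0 - trend,   # falling, low level
    8.0 - trend,   # falling, high level
]) + rng.normal(0.0, 0.01, size=(4, times.size))

labels = cluster_by_shape(times, expr, n_clusters=2, smooth=0.1)
print(labels)
```

Because the clustering operates on derivatives, the two rising profiles are grouped together despite their different baseline levels, which is the shape-over-level behaviour the abstract describes. The value of `smooth` here is arbitrary; the paper tunes its smoothing parameter with a heuristic that this sketch does not reproduce.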
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
About this article
Cite this article
Déjean, S., Martin, P., Baccini, A. et al. Clustering Time-Series Gene Expression Data Using Smoothing Spline Derivatives. J Bioinform Sys Biology 2007, 70561 (2007). https://doi.org/10.1155/2007/70561
DOI: https://doi.org/10.1155/2007/70561