Genes are transcribed into mRNAs, which in turn are translated into proteins. Some of these proteins act as transcription factors (TFs) that activate or inhibit the transcription of a number of other genes, creating a complex *gene regulatory network*. The number of transcription factors is believed to be much smaller than the number of regulated genes. Moreover, most genes are known to be regulated by only a very restricted number of transcription factors. This induces a sparse connectivity matrix for the representation of the connections between the TFs and the regulated genes. Microarray experiments measure the expression level of thousands of genes simultaneously. Unfortunately, a similar method that would allow us to measure simultaneously the abundance or activities of a large number of proteins that act as TFs is not yet available. Some progress has been made with measurements of protein abundance by flow cytometry [1], which follows a dozen or so proteins of interest that need to be identified in advance. Still, such experiments are less widely available than gene expression experiments and cannot compete in terms of the number of tracked genes. ChIP-on-chip experiments, on the other hand, provide only static binding information about transcription factors. Thus, current approaches that use microarray experiments make a strong assumption: the protein levels are proportional to the mRNA levels. This assumption is not necessarily true, owing to the complexity of transcription, translation, and posttranslational modification. In more recent studies, two-level networks have been studied, with hidden profiles of the transcription factors at the top level and the observed expression levels of the regulated genes at the lower level. Some of these studies [2–4] are concerned with factor analysis algorithms.

Factor analysis (FA) is often used as a dimensionality reduction approach, assuming that the large number of observed variables becomes uncorrelated given a much smaller number of hidden variables called *factors*. Some of the advantages of FA over principal component analysis are the incorporation of independent additive measurement errors on the observed variables, the identification of an underlying structure, and the assignment of the factors as defined entities (in our case transcription factors). Finally, in contrast to independent component analysis, the factors are not assumed to be statistically independent.
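To make the model structure concrete, the following is a minimal simulation sketch of a two-level factor model with a sparse connectivity (loading) matrix and independent additive measurement noise. It is an illustration only, not the algorithm evaluated in this paper: the function name `simulate_factor_model`, its parameters, the default sizes, and the use of standard normal draws for loadings and factor profiles are all assumptions made here for convenience.

```python
import numpy as np


def simulate_factor_model(n_genes=50, n_factors=3, n_samples=20,
                          density=0.1, noise_sd=0.1, seed=0):
    """Simulate observed = loadings @ factors + noise (hypothetical helper).

    All sizes and distributions are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    # Sparse connectivity matrix: each gene/TF entry is nonzero
    # with probability `density`, so most entries are exactly zero.
    mask = rng.random((n_genes, n_factors)) < density
    loadings = np.where(mask, rng.normal(size=(n_genes, n_factors)), 0.0)
    # Hidden factor profiles (TF activities), one column per sample.
    factors = rng.normal(size=(n_factors, n_samples))
    # Independent additive measurement error on every observed variable.
    noise = rng.normal(scale=noise_sd, size=(n_genes, n_samples))
    observed = loadings @ factors + noise
    return loadings, factors, observed


loadings, factors, observed = simulate_factor_model()
print(observed.shape, np.count_nonzero(loadings))
```

In a reconstruction setting only `observed` would be available, and both `loadings` and `factors` would have to be inferred; the sparsity of `loadings` is what a sparsity prior would aim to recover.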

In a recent paper [4], we examined the suitability of five FA algorithms for reconstructing both gene regulatory networks and TF activity profiles. We showed that FA faithfully reconstructs TF activity profiles, as do other, more widely known reconstruction approaches such as network component analysis (NCA) [5] (see also [6, 7]) and a partial least squares (plsgenomics) algorithm [8]. The advantage of FA over these algorithms is the ability to also reconstruct the connectivity matrix. NCA and plsgenomics rely heavily on the availability of connectivity information. Nonzero positions in the connectivity matrix, which describes the connections between the factors and the genes, need to be specified in advance. The algorithms then estimate the values at these positions, which might actually turn out to be zero as well. This is a strong limitation, since often only little information about the genes regulated by specific TFs is available. A further advantage of the Bayesian FA models is that any information from literature surveys, ChIP-on-chip experiments, or sequence analyses about the underlying structure can be easily incorporated through priors.

A serious concern regarding the currently applied FA algorithms, as well as the NCA and plsgenomics algorithms, is that they do not incorporate any of the time information provided by the experiments. In fact, most available microarray data, such as the *E. coli*, yeast, and *Arabidopsis* data, are obtained from time series experiments. Unfortunately, the time correlation present within the TF profiles is ignored in the above algorithms. Time information can act as a smoothing approach on the TF profiles and thus can improve the reconstruction process. As in our previous paper, here we are still concerned with sparse connectivity matrices, but we also aim to include time correlation within the factors. For this purpose, we extend the algorithm by Fokoué and Titterington [9], which performed well and was computationally efficient in our comparison [4], to handle time correlation information. However, the extensions we suggest can be easily applied to the alternative FA algorithms analysed in [4]. If we allowed a general form for the correlation matrix between the factors, we would run into the problem of estimating a large number of unknown parameters given only a small number of data points. We investigated a number of possible correlation structures, and in this paper we present one that performs well on gene regulatory networks.
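The parameter-count argument can be illustrated with a short sketch. It contrasts an unrestricted correlation matrix over time points with a one-parameter structure in which the correlation between time points i and j is rho to the power |i - j|. This first-order autoregressive form is an assumption chosen here for illustration and is not necessarily the structure adopted in this paper; the function names are hypothetical as well.

```python
import numpy as np


def ar1_correlation(n_timepoints, rho):
    """Correlation matrix with entries rho ** |i - j| (assumed AR(1) form)."""
    idx = np.arange(n_timepoints)
    return rho ** np.abs(idx[:, None] - idx[None, :])


def n_free_parameters_general(n_timepoints):
    """Number of off-diagonal entries in an unrestricted correlation matrix."""
    return n_timepoints * (n_timepoints - 1) // 2


def sample_profiles(n_factors, n_timepoints, rho, seed=0):
    """Draw factor profiles whose time points are correlated (needs |rho| < 1)."""
    rng = np.random.default_rng(seed)
    chol = np.linalg.cholesky(ar1_correlation(n_timepoints, rho))
    white = rng.normal(size=(n_factors, n_timepoints))
    # Each row of `white` is uncorrelated noise; multiplying by chol.T
    # gives rows with covariance chol @ chol.T, the AR(1) correlation matrix.
    return white @ chol.T


print(n_free_parameters_general(20))   # unrestricted: grows quadratically
print(ar1_correlation(4, 0.5))         # restricted: one parameter, rho
profiles = sample_profiles(n_factors=3, n_timepoints=20, rho=0.8)
```

With 20 time points the unrestricted matrix has 190 free correlations, whereas the restricted form has a single parameter. Setting `rho` to zero recovers uncorrelated time points, so data with little or no time correlation remain covered by the same structure.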

Other algorithms, such as linear dynamical systems or Kalman filter models, have also been suggested for estimating the parameters of a time series model with hidden states. Ghahramani and Hinton [10] presented an EM algorithm for the estimation of the parameters of linear dynamical systems. This is an extension of the factor analysis algorithm [11] that was evaluated in our previous paper and performed less well than some alternative FA algorithms, in particular a Bayesian version. A Bayesian version of an FA algorithm allows one to use sparsity priors on the connectivity matrix and also to integrate prior information regarding the system under study. More recently, Beal et al. [12] presented a state space model for the reconstruction of transcriptional networks from gene expression time series data. The focus of their algorithm is to reconstruct a complete regulatory interaction network and not only the connectivity between TFs and genes. Thus, the hidden states do not represent TFs but any hidden variables that cannot be directly measured by gene expression experiments, such as missing genes, protein activity profiles, and protein degradation.

Barenco et al. [13] reconstruct the transcription factor activity of p53 from time series expression profiles of known target genes and a differential equation model of gene induction. Based on the same data set and a similar induction model, Sanguinetti et al. [14] suggest using Gaussian processes to estimate the activity profile of the p53 transcription factor. In both cases, only one profile is reconstructed, albeit in great detail, and knowledge of the dependent genes is assumed.

In this paper, we show how to incorporate time information in the factor analysis approach. Factor analysis is attractive, since it is one of the most straightforward ways to link hidden transcription factor activities to observed outputs without knowledge of the connectivity. However, time series information is ignored in all the methods discussed in our previous paper. Here, we explore an extension to factor analysis that integrates time series correlation. Since some data might show very little correlation or none at all, we estimate the posterior distribution of the strength of correlation of TF activities from one time point to the next. This information is useful in several respects, as we show for gene expression data for *E. coli* from [6] and for yeast from Spellman et al. [15]. Based on these datasets, we highlight some important points: (a) the correlation parameter within the factors reveals whether the time step during experimental sampling is large or small in relation to gene regulatory processes, and what effect this choice has on the reconstruction process; (b) our analysis also indicates that data obtained under different experimental conditions can show quite different dynamics, as reflected in the correlation, and caution is required when combining such data sets for joint inference of regulatory relationships.