Precise periodic components estimation for chronobiological signals through Bayesian Inference with sparsity enforcing prior
 Mircea Dumitru^{1, 2},
 Ali Mohammad-Djafari^{1} and
 Simona Baghai Sain^{1, 3}
DOI: 10.1186/s13637-015-0033-6
© Dumitru et al. 2016
Received: 15 February 2015
Accepted: 7 December 2015
Published: 20 January 2016
Abstract
The toxicity and efficacy of more than 30 anticancer agents present very high variations, depending on the dosing time. Therefore, biologists studying the circadian rhythm require a very precise method for estimating the periodic component (PC) vector of chronobiological signals. Moreover, in recent developments, not only the dominant period or the PC vector is of crucial interest but also their stability or variability. In cancer treatment experiments, the recorded signals corresponding to different phases of treatment are short, from 7 days for the synchronization segment to 2 or 3 days for the after-treatment segment. When studying the stability of the dominant period, we have to consider signals that are very short relative to the prior knowledge of the dominant period, placed in the circadian domain. The classical approaches, based on Fourier transform (FT) methods, are inefficient (i.e., they lack precision) given the particularities of the data (i.e., the short length). Another particularity of the signals considered in such experiments is the level of noise: such signals are very noisy, and establishing which periodic components are associated with the biological phenomena and distinguishing them from those associated with the noise are difficult tasks. In this paper, we propose a new method for the estimation of the PC vector of biomedical signals, using the biological prior information and considering a model that accounts for the noise. The experiments developed in the cancer treatment context record signals expressing a limited number of periods. This prior information can be translated as the sparsity of the PC vector. The proposed method considers the PC vector estimation as an inverse problem (IP), using the general Bayesian inference in order to infer the unknowns of our model, i.e., the PC vector but also the hyperparameters (i.e., the variances). The sparsity prior information is modeled using a sparsity enforcing prior law.
In this paper, we propose a Student’s t distribution, viewed as the marginal distribution of a bivariate normal-inverse gamma distribution. We build a general infinite Gaussian scale mixture (IGSM) hierarchical model where we also assign prior distributions for the hyperparameters. The expression of the joint posterior law of the unknown PC vector and hyperparameters is obtained via Bayes rule, and then, the unknowns are estimated via joint maximum a posteriori (JMAP) or posterior mean (PM). For the PM estimator, the expression of the posterior distribution is approximated by a separable one, via variational Bayesian approximation (VBA), using the Kullback-Leibler (KL) divergence. For the PM estimation, two possibilities are considered: an approximation with a partially separable distribution and an approximation with a fully separable one. Both algorithms corresponding to the PM estimation and the one corresponding to the JMAP estimation are iterative. The algorithms are presented in detail and are compared with the ones corresponding to the Gaussian model. We examine the convergence of the algorithms and give simulation results to compare their performances. Finally, we show simulation results on synthetic and real data in cancer treatment applications. The real data considered in this paper examine the rest-activity patterns of a KI/KI Per2::luc mouse, aged 10 weeks, singly housed in a RealTime-Biolumicorder (RT-BIO).
Keywords
Periodic component (PC) vector estimation; Sparsity enforcing; Bayesian parameter estimation; Variational Bayesian approximation (VBA); Kullback-Leibler (KL) divergence; Infinite Gaussian scale mixture (IGSM); Normal-inverse gamma; Inverse problem; Joint maximum a posteriori (JMAP); Posterior mean (PM); Chronobiology; Circadian rhythm; Cancer treatment
1 Introduction
Several biological processes in living organisms follow oscillations that repeat themselves about every 24 h. These oscillations are called circadian rhythms and, together with other periodic phenomena, they are the object of study of chronobiology [1–3]. In mammals, circadian rhythms involve all organs, tissues, and cells and are supervised by the circadian timing system (CTS), a set of molecular clock genes that cross-regulate each other by positive and negative feedback loops [4–6]. More precisely, the CTS consists of a central pacemaker, the suprachiasmatic nuclei (SCN) in the hypothalamus, which is made sensitive to light by retinal afferents and which coordinates the molecular clocks in the peripheral organs by releasing diffusible and neurophysiological signals [3]. The period of the CTS, which is about 24 h, is therefore regularly calibrated by the succession of light and dark and can be influenced by other environmental factors, such as socio-professional interactions and feeding times [5]. The resulting circadian physiologic fluctuations are observed in sleep-wakefulness and rest-activity alternations, body temperature, cortisol secretion by the adrenal gland, and melatonin secretion by the pineal gland, and they also involve the sympathetic and the parasympathetic systems [6].
Former studies have already shown how taking chronobiology into account can improve the efficacy of anticancer treatments and at the same time reduce their toxicity (therefore increasing their tolerability), contrary to the previous “the worse the toxicity, the better the efficacy” paradigm [7–10]. The molecular clocks are involved in the regulation of important processes such as cell cycle and proliferation, DNA damage sensing and repair, apoptosis, angiogenesis, pharmacodynamics, and pharmacokinetics; therefore, they can greatly influence the metabolism, transportation, and detoxification of drugs [11].
Tolerability to anticancer treatments has been proven to depend significantly on their timing with respect to the circadian rhythms, with up to tenfold changes in the tolerability to drug administration at different circadian times measured for 40 anticancer drugs in rodents and up to fivefold changes in patients [11, 12]. Notably, chemotherapeutic agents proved to be at their best efficacy, both administered alone and combined, when they are also at their best tolerability level, i.e., when they are least toxic to the healthy tissues. Furthermore, relevant interpatient variability in circadian rhythms has been observed and can be due to factors such as gender, age, and genetic polymorphisms; therefore, the dosing and timing of anticancer drugs need to be personalized, at least for subtypes of patients with similar key chronotoxicity features. Modulating drug administration according to the patient’s circadian rhythms is known as chronotherapy [13, 14]. On the other hand, administering anticancer drugs at their most toxic time disrupts the synchronization of the molecular clocks, which has been shown to accelerate the evolution of the cancer [15–20].
In order to optimize cancer treatment, once the effects of a certain drug have been proven to be susceptible to circadian rhythms, we want to identify its best administration time. First, for each drug, the correlation with the circadian rhythms is established in a rodent model, which has been shown to represent the human circadian physiology well [11]. This is achieved by studying the chronotoxicity of the drug, inferred from body weight loss and histopathologic lesions, at different circadian times (CT or ZT, from Zeitgeber time). The circadian clock of the mice is synchronized by exposure to light for 12 h, followed by 12 h of dark, repeating this cycle, and its rhythm is detected by tracking the expression of one or more of its core genes (normally Bmal1, Per2, Rev-erbα, or Clock are used). Mice with a disrupted clock (clock-defective mice, obtained via the functional knockout of one of its genes, normally Per2) are used to confirm the relevance of the molecular clock for the drug toxicity. At the same time, the main characteristics of the circadian expression of these observed genes are studied to observe whether the administration of the drug modifies them.
Once the CTs at which the best and worst tolerability of the drug are observed have been defined, we can look for the molecular mechanisms that influence it. Genes influencing the pharmacokinetics (absorption, distribution, metabolism, and excretion) of the drug are a good starting point, and we can follow how their expression correlates with the higher or lower drug chronotoxicity. For instance, the transporter abcc2, involved in the cellular efflux of several drugs, has been shown to influence irinotecan chronotolerance in ileum, according to the circadian changes in abcc2 local expression [21]. The circadian clocks of the mice used in the experiments whose data we analyse are first synchronized to the same day-night alternation where 12 h of light are followed by 12 h of dark (LD12:12). After synchronization, the mice are kept in constant darkness (DD), i.e., the light is removed. Throughout the experiment, gene expression and rest-activity are measured to establish how the basic parameters of their circadian rhythms (period, acrophase, amplitude) vary with respect to the drug treatment. Both measurements are made possible by an innovative monitoring device, the RealTime-Biolumicorder (RT-BIO) [22]. The locomotor activity is detected by an infrared sensor, whereas the gene expression is measured at the post-translational level in mice engineered to express the gene of interest together with luciferase (fLUC), so that the gene activity is marked by bioluminescence detected by a photomultiplier tube. Common mouse strains used are C57BL/6-based [7, 21] and 129S1/SvImJ [23]. The acrophase and amplitude depend on the periodic component (PC) vector, so a major interest is the study of the periodicity of such time series, i.e., the estimation of the PC vector and the stability or the variability of the dominant period, requiring a precise PC vector variation analysis.
Periodic phenomena have been studied with different approaches under different particular conditions [24–40], in general using fast Fourier transform (FFT)-based methods. The major limitation when studying such data is their reduced length, due to the duration of the experiments. The objective of an accurate description of the variation of the periodic components during the experiments can be formulated as the need for a method that gives a precise estimation of the PC vector from a limited number of data. Also, the method must be able to distinguish the peaks of the PC vector due to the biological phenomena from the peaks due to the measurement errors. The real data considered in this article are a chronobiological time series measuring the locomotor activity. In order to observe the variation or the stability of the dominant periods, very short intervals of the recorded time series are considered. The prior knowledge is the presence of the circadian rhythm: the PC vector is sparse, having a limited number of nonzero elements, inside the circadian interval.
This article addresses the need for a method capable of estimating the PC vector of a time series under the following conditions: (a) a very limited number of data (4-day length) for the estimation of circadian periodic components (24 ± 6 h) and (b) a precision that can be adjusted depending on the chronobiological context, with 1-h precision required in the particular experiment discussed in this article. The method proposed in this article formulates the estimation of the PC vector as an inverse problem, using the general Bayesian inference to infer the unknowns of the considered linear model. This approach is presented in Section 3. A hierarchical prior model is considered, using the Student’s t distribution (expressed as the marginal of a normal-inverse gamma bivariate distribution) as the sparsity enforcing prior law for the PC vector and assigning prior distributions for the hyperparameters involved in the model, namely the variances associated with the PC vector and the noise (Subsection 3.2 and Section 4). From the analytical expression of the joint posterior law of the unknown PC vector and hyperparameters, obtained via Bayes rule, the unknowns are estimated via joint maximum a posteriori (JMAP) (Subsection 4.1) or posterior mean (PM). For the PM estimator, the expression of the posterior law is approximated by a separable one, via the variational Bayesian approximation (VBA), using the Kullback-Leibler (KL) divergence. For the PM estimation, two possibilities are considered: an approximation with a partially separable law (Subsection 4.2) and one with a fully separable law (Subsection 4.3). Simulation results on synthetic data (5 dB) and real data in cancer treatment applications are presented in Section 5. More simulations for the synthetic case (10 and 15 dB) are presented in the Additional file 1.
2 Classical Fourier transform methods
More generally, if the prior knowledge sets the dominant period around a value P, then in order to obtain a PC vector that contains the period P and also the periods P − 1 and P + 1, the signal must be observed for (P−1)(P+1) periods. In chronobiology applications, where the circadian period is around 24 h, a signal should be recorded for 575 days in order to obtain a periodic component vector that contains the 23-, 24-, and 25-h periods.
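As a quick check of this arithmetic, the sketch below lists the periods that fall exactly on the DFT grid of an hourly sampled record and computes the shortest record whose grid contains 23, 24, and 25 h. It assumes hourly sampling and integer periods in hours; the helper names are illustrative and are not taken from the article.

```python
from math import lcm


def fft_periods(n_samples, dt_hours=1.0):
    """Periods (in hours) lying exactly on the DFT grid of an n_samples-long record."""
    duration = n_samples * dt_hours
    # Bin k corresponds to k full cycles over the record, i.e. a period of duration / k.
    return [duration / k for k in range(1, n_samples // 2 + 1)]


def min_hours_for_periods(periods_hours):
    """Shortest hourly sampled record (in hours) whose DFT grid contains every given integer period."""
    # A period p is on the grid only if the record length is a multiple of p.
    return lcm(*periods_hours)


four_day_grid = fft_periods(96)                      # 4 days sampled every hour
circadian = [p for p in four_day_grid if 18 <= p <= 30]
needed_hours = min_hours_for_periods([23, 24, 25])
needed_days = needed_hours // 24
```

Under these assumptions, a 4-day record places only 24 h and 19.2 h between 18 and 30 h, and the shortest record containing all three periods is 13,800 h, i.e., 575 days.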
3 Inverse problem approach and general Bayesian inference

g represents the observed data, i.e., the chronobiological time series: \(\boldsymbol {g}\;=\; \left [{g}_{1}, {g}_{2}, \ldots, {g}_{{N}}\right ]^{T} \in \mathcal {M}_{N\times 1},\) an N-dimensional vector

f represents the unknowns, i.e., the PC vector: \(\boldsymbol {f}\;=\;\left [{f}_{1}, {f}_{2}, \ldots, {f}_{M}\right ]^{T} \in \mathcal {M}_{M \times 1},\) an M-dimensional vector

ε represents the errors: \(\boldsymbol {\epsilon }\;=\;\left [{\epsilon }_{1}, {\epsilon }_{2}, \ldots, {\epsilon }_{N}\right ]^{T} \in \mathcal {M}_{N \times 1},\) an N-dimensional vector
The goal is to estimate the unknowns of the model, Eq. (3), i.e., the PC vector f and the error vector ε. In this paper, we propose an inversion based on general Bayesian inference, building a hierarchical model and estimating the unknowns from the posterior probability density function, using the available data g.
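To make the roles of g, H, f, and ε concrete, here is a minimal sketch of the forward model g = H f + ε. The exact form of H is an assumption: each column is taken as the sum of a cosine and a sine at one candidate period, following the description of a "cosine plus sine matrix" in Section 5. The period grid (8 to 32 h), the noise level, and the function name are hypothetical.

```python
import numpy as np


def build_cos_sin_matrix(n_samples, periods_hours, dt_hours=1.0):
    """Assumed form of H: column m holds cos + sin at period periods_hours[m]."""
    t = np.arange(n_samples)[:, None] * dt_hours                          # sample times, shape (N, 1)
    omega = 2.0 * np.pi / np.asarray(periods_hours, dtype=float)[None, :]  # angular frequencies, shape (1, M)
    return np.cos(omega * t) + np.sin(omega * t)                          # shape (N, M)


periods = np.arange(8, 33)               # hypothetical period grid: 8 h to 32 h, 1-h step
H = build_cos_sin_matrix(96, periods)    # 4-day signal sampled every hour

f = np.zeros(periods.size)               # sparse PC vector
f[periods == 23] = 1.0                   # a single 23-h component, for illustration

rng = np.random.default_rng(0)
eps = 0.1 * rng.standard_normal(96)      # illustrative error vector
g = H @ f + eps                          # forward model g = H f + eps

cond_H = np.linalg.cond(H)               # condition number of H (ratio of extreme singular values)
```

With closely spaced periods and a short record, neighbouring columns of such a matrix are nearly collinear, which is why the condition number is worth inspecting before any direct inversion.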
For the application considered in this paper, the matrix H used in the model presented in Eq. (3) has a very high condition number, so the problem is ill-conditioned. As mentioned above, in this paper, we focus on an inversion based on general Bayesian inference. Nevertheless, many other approaches are described in the literature. One particular case of the considered linear model is the case where the error vector is neglected (ε=0) and the matrix H is invertible and orthogonal, i.e., \(\boldsymbol {H}^{T} \boldsymbol {H}=\boldsymbol {I}\). This is the case of the FT matrix with M=N. Then, the solution is given by \({\widehat {\boldsymbol {f}}}=\boldsymbol {H}^{T}\boldsymbol {g}\), which corresponds to IFT. However, in general, as in our case, M≠N. When M<N, the classical solution is the least squares solution \({\widehat {\boldsymbol {f}}}_{\text {LS}} = \left (\boldsymbol {H}^{T} \boldsymbol {H} \right)^{-1} \boldsymbol {H}^{T} \boldsymbol {g}\), and when M>N, a minimum norm solution \({\widehat {\boldsymbol {f}}}_{\text {MN}} = \boldsymbol {H}^{T} \left (\boldsymbol {H} \boldsymbol {H}^{T} \right)^{-1} \boldsymbol {g}\) can be obtained. Since, in the case of chronobiological time series, the matrix H has a very high condition number, those generalized inverse solutions are, in general, too sensitive to the errors. Regularization methods can partially solve this difficulty. For example, truncated singular value decomposition (TSVD) or Tikhonov regularization methods (TRM) can be used, but the solutions depend on the threshold in the first case (TSVD) and on the regularization parameter in the second case.
When M≠N and when the error vector is not neglected (ε≠0), the regularization methods can still be applied and an estimation can be obtained for f and ε, but with the following drawbacks: in general, determining the regularization parameters is difficult and there is not a good way to handle other a priori knowledge we may have on the noise statistics and on the unknowns.
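The sensitivity of the generalized inverse and the effect of the two regularizers mentioned above can be illustrated with a small sketch. The test matrix (two nearly collinear columns), the noise level, the threshold, and the regularization parameter are all illustrative assumptions chosen to make the effect visible; they are not values from the article.

```python
import numpy as np


def tikhonov(H, g, lam):
    """Tikhonov-regularized solution: argmin ||g - H f||^2 + lam * ||f||^2."""
    M = H.shape[1]
    return np.linalg.solve(H.T @ H + lam * np.eye(M), H.T @ g)


def tsvd(H, g, rel_threshold):
    """Truncated-SVD solution: drop singular values at or below rel_threshold * s_max."""
    U, s, Vt = np.linalg.svd(H, full_matrices=False)
    keep = s > rel_threshold * s[0]
    return Vt[keep].T @ ((U[:, keep].T @ g) / s[keep])


rng = np.random.default_rng(1)
H = rng.standard_normal((30, 10))
H[:, 9] = H[:, 8] + 1e-6 * rng.standard_normal(30)   # nearly collinear columns: ill-conditioned H
g = H @ np.ones(10) + 0.01 * rng.standard_normal(30)

f_ls = np.linalg.lstsq(H, g, rcond=None)[0]          # generalized inverse (least squares) solution
f_tik = tikhonov(H, g, lam=1e-3)                     # depends on the regularization parameter
f_tsvd = tsvd(H, g, rel_threshold=1e-4)              # depends on the truncation threshold
```

Both regularized solutions have a smaller norm than the least squares one, but changing `lam` or `rel_threshold` changes the answer, which is the drawback noted in the text.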
3.1 Bayesian inference
where θ represents the hyperparameters that appear in the model.
Such an extension presents two particular advantages: the first is the possibility of estimating the hyperparameters and obtaining numerical values for the variances, and the second is that such an approach can be developed into an unsupervised algorithm.
3.2 Hierarchical prior models
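The hierarchical model represents the set of probability density functions assigned for the probabilities involved in (6), namely the assignment of the likelihood \(p(\boldsymbol {g} \mid \boldsymbol {f}, \boldsymbol {\theta }_{1})\), the prior \(p(\boldsymbol {f} \mid \boldsymbol {\theta }_{2})\), and the hyperparameter priors \(p(\boldsymbol {\theta }_{1})\), \(p(\boldsymbol {\theta }_{2})\).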
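For reference, these four factors combine through Bayes rule as sketched below. The factorization assumes that the two hyperparameter sets are a priori independent and that each enters only one conditional density; this is a reading aid consistent with the description above rather than a quotation of Eq. (6).

```latex
p(\boldsymbol{f}, \boldsymbol{\theta}_1, \boldsymbol{\theta}_2 \mid \boldsymbol{g})
  = \frac{p(\boldsymbol{g} \mid \boldsymbol{f}, \boldsymbol{\theta}_1)\,
          p(\boldsymbol{f} \mid \boldsymbol{\theta}_2)\,
          p(\boldsymbol{\theta}_1)\, p(\boldsymbol{\theta}_2)}
         {p(\boldsymbol{g})}
  \;\propto\;
  p(\boldsymbol{g} \mid \boldsymbol{f}, \boldsymbol{\theta}_1)\,
  p(\boldsymbol{f} \mid \boldsymbol{\theta}_2)\,
  p(\boldsymbol{\theta}_1)\, p(\boldsymbol{\theta}_2)
```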
The prior biological knowledge leads to the search for good sparsity enforcing priors. In the literature [54], certain classes of distributions (heavy-tailed, mixture models) are well known as good sparsity enforcing priors. In this paper, we consider a general infinite Gaussian scale mixture (IGSM) hierarchical model [55]. The prior distribution for the PC vector is a Student’s t distribution expressed via a normal-inverse gamma distribution. The error vector is also modeled using the IGSM, considering non-stationary variances for the noise, generalizing the results from [56]. In Section 5, in the simulation results, we include comparisons with the Gaussian hierarchical model for the synthetic data.
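The statement that a Student's t distribution arises as the marginal of a normal-inverse gamma pair can be checked by simulation. The sketch below draws a variance from an inverse gamma distribution and then a zero-mean normal value with that variance. The shape and scale values are arbitrary illustrative choices, and the parameterization InvGamma(alpha, beta) with density proportional to v^(-alpha-1) exp(-beta/v) is an assumption.

```python
import random
import statistics


def sample_igsm(alpha, beta, n, seed=0):
    """Draw n samples of f where v ~ InvGamma(alpha, beta) and f | v ~ Normal(0, v)."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        # If x ~ Gamma(shape=alpha, scale=1/beta), then 1/x ~ InvGamma(alpha, beta).
        v = 1.0 / rng.gammavariate(alpha, 1.0 / beta)
        samples.append(rng.gauss(0.0, v ** 0.5))
    return samples


alpha, beta = 3.0, 2.0                       # illustrative shape and scale
draws = sample_igsm(alpha, beta, n=200_000)
empirical_var = statistics.pvariance(draws)
# The marginal is a Student's t with 2*alpha degrees of freedom and squared scale beta/alpha,
# whose variance is beta / (alpha - 1) for alpha > 1.
theoretical_var = beta / (alpha - 1.0)
```

The empirical variance of the draws should be close to `theoretical_var`; the heavier-than-Gaussian tails of this marginal are what make it useful as a sparsity enforcing prior.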
4 Hierarchical model infinite Gaussian scale mixture
The error variance priors, Eq. (9), the likelihood, Eq. (11), and the prior, Eq. (13), represent the IGSM hierarchical model. The analytical form is presented in Eq. (14):
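As a reading aid, a hierarchical model consistent with this description (Gaussian likelihood with non-stationary noise variances, Gaussian prior on the PC vector, inverse gamma priors on all variances) can be sketched as follows. The names of the prior hyperparameters, \(\alpha_{\epsilon 0}, \beta_{\epsilon 0}, \alpha_{f0}, \beta_{f0}\), are assumptions introduced here for illustration.

```latex
\begin{aligned}
p(\boldsymbol{g} \mid \boldsymbol{f}, \boldsymbol{v}_{\epsilon})
  &= \mathcal{N}\!\left(\boldsymbol{g} \mid \boldsymbol{H}\boldsymbol{f}, \boldsymbol{V}_{\epsilon}\right),
  &\boldsymbol{V}_{\epsilon} &= \operatorname{diag}(\boldsymbol{v}_{\epsilon}), \\
p(\boldsymbol{v}_{\epsilon})
  &= \prod_{i=1}^{N} \mathcal{IG}\!\left(v_{\epsilon_i} \mid \alpha_{\epsilon 0}, \beta_{\epsilon 0}\right), \\
p(\boldsymbol{f} \mid \boldsymbol{v}_{f})
  &= \mathcal{N}\!\left(\boldsymbol{f} \mid \boldsymbol{0}, \boldsymbol{V}_{f}\right),
  &\boldsymbol{V}_{f} &= \operatorname{diag}(\boldsymbol{v}_{f}), \\
p(\boldsymbol{v}_{f})
  &= \prod_{j=1}^{M} \mathcal{IG}\!\left(v_{f_j} \mid \alpha_{f0}, \beta_{f0}\right).
\end{aligned}
```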
4.1 Joint MAP estimation
4.2 Posterior mean (via VBA) IGSM (partial separability)
Equation 24a provides the dependency between the parameters of the multivariate normal distribution q _{1}(f) and the other hyperparameters involved in the hierarchical model: the mean \({\widehat {\boldsymbol {f}}}_{\text {PM}}\) and the covariance matrix \({\widehat {\boldsymbol {\Sigma }}}\) depend on \({\widehat {\boldsymbol {V}_{\boldsymbol {\epsilon }}^{-1}}}\) and \({\widehat {\boldsymbol {V}_{\boldsymbol {f}}^{-1}}}\). Eq. (70) (in Appendix 2) defines \({\widehat {\boldsymbol {V}_{\boldsymbol {\epsilon }}^{-1}}}\) and \({\widehat {\boldsymbol {V}_{\boldsymbol {f}}^{-1}}}\) via \(\left \{ \alpha _{\epsilon _{i}},\beta _{\epsilon _{i}}\right \}, i \in \left \{ 1, 2, \ldots, N \right \}\) and \(\left \{ \alpha _{f_{j}},\beta _{f_{j}}\right \}, j \in \left \{ 1, 2, \ldots, M \right \}\). For the mean and the variance, we obtain the following dependency:

Initialization

(a) Use Eqs. (24a) and (70) to compute \({\widehat {\boldsymbol {f}}}_{\text {PM}}, {\widehat {\boldsymbol {\Sigma }}}\)

(b) Use Eq. (24b) to compute \(\left \{\alpha _{\epsilon _{i}},\beta _{\epsilon _{i}}\right \}\) and \({\widehat {\boldsymbol {V}_{\boldsymbol {\epsilon }}^{-1}}}\)

(c) Use Eq. (24c) to compute \(\left \{\alpha _{f_{j}},\beta _{f_{j}}\right \}\) and \({\widehat {\boldsymbol {V}_{\boldsymbol {f}}^{-1}}}\)
In order to initialize the algorithm, we define the matrices \({\widehat {\boldsymbol {V}_{\boldsymbol {\epsilon }}^{-1}}^{(0)}}\) and \({\widehat {\boldsymbol {V}_{\boldsymbol {f}}^{-1}}^{(0)}}\), corresponding to iteration zero of the algorithm. For the first iteration, using those matrices, the algorithm updates the estimate of the PC vector and the corresponding covariance matrix (a). Apart from the two matrices used, the other terms involved in the equations are known: the recorded signal g and the matrix H. After the PC vector and the covariance matrix are updated, they are used as terms in the equations updating the hyperparameters involved in the model. For updating the hyperparameters corresponding to the noise variances (b) and PC variances (c), the algorithm uses the estimate of the PC vector and the covariance matrix corresponding to the first iteration, obtained in (a). Then, the estimates corresponding to the noise variances (b) are used as input in (a), corresponding to the second iteration, via (d) and (e).
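The iteration described above can be sketched as follows. This is a minimal illustration that uses the standard mean-field updates for a Gaussian likelihood with inverse gamma variance priors; it is not a transcription of Eqs. (24a) to (24c) and (70), which may differ in detail. The default hyperparameter values, the all-ones initialization, and the test problem are assumptions.

```python
import numpy as np


def vba_partial(H, g, a_eps0=2.0, b_eps0=1.0, a_f0=2.0, b_f0=1.0, n_iter=50):
    """Mean-field VBA sketch: q1(f) multivariate normal, inverse gamma factors for all variances."""
    N, M = H.shape
    inv_v_eps = np.ones(N)      # iteration-zero expected inverse noise variances (assumed)
    inv_v_f = np.ones(M)        # iteration-zero expected inverse PC variances (assumed)
    # In these updates the inverse gamma shape parameters stay constant across iterations.
    a_eps, a_f = a_eps0 + 0.5, a_f0 + 0.5
    for _ in range(n_iter):
        # (a) update q1(f): covariance matrix and mean
        Sigma = np.linalg.inv(H.T @ (inv_v_eps[:, None] * H) + np.diag(inv_v_f))
        f_hat = Sigma @ (H.T @ (inv_v_eps * g))
        # (b) noise variances: the scale uses E[(g_i - H_i f)^2] under q1
        resid = g - H @ f_hat
        b_eps = b_eps0 + 0.5 * (resid**2 + np.einsum("ij,jk,ik->i", H, Sigma, H))
        inv_v_eps = a_eps / b_eps          # E[1/v] of an inverse gamma is shape / scale
        # (c) PC variances: the scale uses E[f_j^2] under q1
        b_f = b_f0 + 0.5 * (f_hat**2 + np.diag(Sigma))
        inv_v_f = a_f / b_f
    return f_hat, Sigma, inv_v_eps, inv_v_f


# Illustrative, well-conditioned test problem (not the chronobiological H).
rng = np.random.default_rng(0)
H_demo = rng.standard_normal((60, 20))
f_true = np.zeros(20)
f_true[[3, 8, 15]] = [1.0, 2.0, -1.5]
g_demo = H_demo @ f_true + 0.05 * rng.standard_normal(60)
f_hat, Sigma, inv_v_eps, inv_v_f = vba_partial(H_demo, g_demo)
```

Besides the point estimate, the loop returns the covariance matrix `Sigma`, which gives a measure of uncertainty on each PC value.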
4.3 Posterior mean (via VBA) IGSM (full separability)
5 Simulations
This section presents the simulations corresponding to synthetic and real data. For synthetic data, we compare five algorithms: joint MAP with Gaussian prior, posterior mean with Gaussian prior, joint MAP with IGSM prior, posterior mean (via VBA) with IGSM prior (partial separability), and posterior mean (via VBA) with IGSM prior (full separability). For each iterative algorithm, we present a comparison between the algorithm’s estimate and the synthetic data, i.e., a comparison between \(\widehat {\boldsymbol {f}}_{\text {Method}}\) and f, between \(\widehat {\boldsymbol {g}}_{\text {Method}}\) and g, and between \(\widehat {\boldsymbol {g}}_{\text {Method}}\) and the theoretical signal g _{0} (g without noise). For every algorithm considered, we present the convergence analysis of the parameters and hyperparameters involved. Then, we present a comparison between the estimates of the proposed algorithms and the classical FFT method. Finally, the proposed algorithms are tested 10 times over the same data, but with different noise realizations, in order to obtain the L _{2} error vector (the normalized difference between data and estimated data, considered for f, g, and the theoretical signal g _{0}) and compare the performances of each algorithm. These comparisons between the error vectors corresponding to each algorithm are presented at the end of the subsection. For the synthetic data, we consider the following protocol: we consider a theoretical PC vector f and the corresponding theoretical signal H f, and we obtain the corresponding signal g=H f+ε by adding noise to the theoretical signal. In this article, we consider three different levels of noise for the synthetic case: 15, 10, and 5 dB. In this section, we include only the detailed simulations for the 5-dB case. The other two cases are presented in the Additional file 1. The considered signal represents a 4-day signal, sampled every hour. The matrix H considered in this set of simulations is a cosine plus sine matrix.
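The two ingredients of this protocol, adding noise at a prescribed level and computing a normalized L _{2} error, can be sketched as follows. The SNR definition used here (ratio of mean signal power to mean noise power, in dB) and the single 23-h test signal are assumptions; the function names are illustrative.

```python
import numpy as np


def add_noise_at_snr(g0, snr_db, rng):
    """Return g0 plus white Gaussian noise rescaled so that 10*log10(P_signal / P_noise) equals snr_db."""
    noise = rng.standard_normal(g0.shape)
    p_signal = np.mean(g0**2)
    p_noise_target = p_signal / 10.0 ** (snr_db / 10.0)
    noise *= np.sqrt(p_noise_target / np.mean(noise**2))   # rescale to the exact target power
    return g0 + noise


def relative_l2_error(x, x_hat):
    """Normalized squared L2 error ||x - x_hat||^2 / ||x||^2."""
    return np.sum((x - x_hat) ** 2) / np.sum(x**2)


rng = np.random.default_rng(0)
t = np.arange(96)                                   # 4 days sampled every hour
g0 = np.cos(2.0 * np.pi * t / 23.0)                 # illustrative noiseless signal
g = add_noise_at_snr(g0, snr_db=5.0, rng=rng)

measured_snr_db = 10.0 * np.log10(np.mean(g0**2) / np.mean((g - g0) ** 2))
noise_to_signal = relative_l2_error(g0, g)          # equals 10 ** (-snr_db / 10) by construction
```

Running the same estimator over several calls to `add_noise_at_snr` with different random generators gives one error value per noise realization, which is how an error vector of the kind described above can be assembled.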
5.1 Synthetic data 05 dB
For testing, we have considered a 4-day signal, corresponding to a sparse PC vector, having nonzero values for 11, 15, and 23 h. We consider this particular structure for the following reason: we want to verify whether the proposed method can precisely distinguish the peaks inside the circadian domain. As we have mentioned, for such signals, via the FFT, we obtain a high peak corresponding to 24 h and the corresponding harmonics, but this method offers no information for certain values in the circadian domain. We have shown in Section 2 (Fig. 1) that a dominant period corresponding to 23 h is wrongly estimated at 24 h via the FFT method, which offers no other information in the interval [20–31].
5.1.1 Data 05 dB
Figure 7 a shows the theoretical PC, having the nonzero periods corresponding to 11, 15, and 23 h. All the other values in the PC vector are zero. Figure 7 b presents the signal corresponding to the linear model considered in Eq. (3), neglecting the errors, g _{0}=H f. We note that the condition number of the matrix H is cond(H)=56,798,792,591. All the simulations are done using as input the noisy signal g corresponding to the linear model, Eq. (3), presented in Fig. 7 c. We compare the estimated PC vector with the theoretical one (Fig. 7 a) and the corresponding reconstructed signal with g _{0} and g. The comparison with the theoretical signal g _{0} is important in order to verify whether the proposed algorithm can distinguish the peaks corresponding to the biological phenomena from the ones corresponding to the noise.
5.1.2 JMAP IGSM 05 dB
The proposed method is searching for a sparse solution corresponding to the linear model, Eq. (3). The comparison between the theoretical signal g _{0} and \({\widehat {\boldsymbol {g}}}_{\text {JMAP}}\) (Fig. 8 b) shows that the proposed algorithm is converging to a solution that leads to a fairly accurate reconstruction, having the L _{2} norm error \(\delta \boldsymbol {g}_{0} = \frac {\|\boldsymbol {g}_{0} - {\widehat {\boldsymbol {g}}}_{\textit {JMAP}}\|_{2}^{2}}{\|\boldsymbol {g}_{0}\|_{2}^{2}}=0.0524\). For the PC vector, the reconstruction error is \(\delta \boldsymbol {f} = \frac {\|\boldsymbol {f} - \widehat {\boldsymbol {f}}_{\textit {JMAP}}\|_{2}^{2}}{\|\boldsymbol {f}\|_{2}^{2}} = 0.0726\). For the JMAP estimation, the condition imposed for the searched solution, i.e., the sparsity is not respected (Fig. 8 a). In fact, the alternate optimization algorithm considered for searching the JMAP solution is converging to a local minimum and the estimation errors corresponding to the JMAP estimation might be far from the example presented.
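An alternate optimization of the kind referred to here can be sketched as follows. The update formulas are derived from an assumed model (Gaussian likelihood and prior with element-wise variances, inverse gamma priors on those variances) by setting each partial derivative of the negative log posterior to zero; they are not quoted from Section 4.1 or Appendix 1. Hyperparameter values, initialization, and the test problem are illustrative.

```python
import numpy as np


def neg_log_posterior(H, g, f, v_eps, v_f, a_e, b_e, a_f, b_f):
    """Negative log of the assumed joint posterior, up to an additive constant."""
    r = g - H @ f
    return (np.sum((a_e + 1.5) * np.log(v_eps) + (b_e + 0.5 * r**2) / v_eps)
            + np.sum((a_f + 1.5) * np.log(v_f) + (b_f + 0.5 * f**2) / v_f))


def jmap(H, g, a_e=2.0, b_e=1.0, a_f=2.0, b_f=1.0, n_iter=30):
    """Alternate minimization over f and the variances; each step has a closed form."""
    N, M = H.shape
    v_eps, v_f = np.ones(N), np.ones(M)
    history = []
    for _ in range(n_iter):
        # f-step: the criterion is quadratic in f, so the minimizer solves a linear system.
        A = H.T @ (H / v_eps[:, None]) + np.diag(1.0 / v_f)
        f = np.linalg.solve(A, H.T @ (g / v_eps))
        # Variance step: given f, each variance is minimized independently.
        r = g - H @ f
        v_eps = (b_e + 0.5 * r**2) / (a_e + 1.5)
        v_f = (b_f + 0.5 * f**2) / (a_f + 1.5)
        history.append(neg_log_posterior(H, g, f, v_eps, v_f, a_e, b_e, a_f, b_f))
    return f, v_eps, v_f, history


rng = np.random.default_rng(0)
H_demo = rng.standard_normal((60, 20))               # illustrative, well-conditioned matrix
f_true = np.zeros(20)
f_true[[3, 8, 15]] = [1.0, 2.0, -1.5]
g_demo = H_demo @ f_true + 0.05 * rng.standard_normal(60)
f_jmap, v_eps_jmap, v_f_jmap, history = jmap(H_demo, g_demo)
```

Each pass can only lower the criterion, so `history` is non-increasing, but the criterion is not convex in all unknowns jointly, so the point reached depends on the initialization; this matches the remark above about convergence to a local minimum.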
5.1.3 PM (via VBA, partial separability) IGSM 05 dB
In the case of the posterior mean estimation via VBA, both the PC estimation and theoretical signal g _{0} reconstruction are very accurate (Fig. 10 a, b). For the reconstruction of the theoretical signal g _{0}, the L _{2} error norm is \(\delta \boldsymbol {g}_{0} = \frac {\|\boldsymbol {g}_{0} - \widehat {\boldsymbol {g}}_{\textit {PM}}\|_{2}^{2}}{\|\boldsymbol {g}_{0}\|_{2}^{2}}=0.0275\). For the PC vector, the reconstruction error is \(\delta \boldsymbol {f} = \frac {\|\boldsymbol {f} - \widehat {\boldsymbol {f}}_{\textit {PM}}\|_{2}^{2}}{\|\boldsymbol {f}\|_{2}^{2}} = 0.0283\). The algorithm is converging to a sparse solution where all the nonzero peaks are detected. The residual error computed between g and the reconstructed signal is consistent with the error considered in the model, 5 dB (Fig. 10 c). During the algorithm, both inverse gamma shape parameters are constant (Eqs. (24b) and (24c)).
5.1.4 PM (via VBA, full separability) IGSM 05 dB
Numerically, for the reconstruction of the theoretical signal g _{0}, the L _{2} error norm is \(\delta \boldsymbol {g}_{0} = \frac {\|\boldsymbol {g}_{0} - \widehat {\boldsymbol {g}}_{\text {PM}}\|_{2}^{2}}{\|\boldsymbol {g}_{0}\|_{2}^{2}} = 0.0247\). For the PC vector, the reconstruction error is \(\delta \boldsymbol {f} = \frac {\|\boldsymbol {f} - \widehat {\boldsymbol {f}}_{\text {PM}}\|_{2}^{2}}{\|\boldsymbol {f}\|_{2}^{2}} = 0.0234\).
5.1.5 Methods comparison 05 dB
The L _{2} estimation error for the PC vector is very high for the two Gaussian models. Also, the estimations are not sparse. For the IGSM models, the JMAP estimator is providing a good estimation, but it is unstable. PM via VBA estimation, both partial and fully separable, provides very accurate stable estimations.
5.1.6 Error comparison 05 dB
For the PC vector estimation, the L _{2} errors in Fig. 16 a show the performance of the proposed PM via VBA algorithms for the IGSM model compared to the Gaussian model and to the JMAP estimation for the IGSM model.
5.2 Real data
For the LD segment, 7 days are available. We compute the PC corresponding to the signal using the proposed method and also using the FFT.
Via the FFT, the highest peak is set at 24 h and the next highest peak is set at 8 h. Given the short length of the signal, 3 days, and the limitations of the FFT method, none of the values inside the interval (18, 36) except 24 is present in the estimated vector, so the values are uncertain. Via the proposed method, the dominant period is set at 22 h. For the during-treatment part of the data, a 5-day-length signal is available.
6 Conclusions
In this article, we have proposed a new method for a precise estimation of the PC vector for biomedical signals, based on the general Bayesian inference and using a hierarchical model with a sparsity enforcing prior. The prior considered was a Student’s t distribution expressed as the marginal of an infinite Gaussian scale mixture. The context of our work was signals that are short relative to the prior knowledge of the dominant period (4-day signals and a 24-h period). In Subsection 5.2, we also applied the proposed method to 2- and 3-day-length signals. The objective was to develop a method that can improve the precision given by the FFT method and also account for the possible effects of the measurement errors and the uncertainties. The method was tested first on synthetic data, in order to be validated. The algorithms corresponding to the Gaussian model (JMAP and PM estimators) fail to accurately reconstruct the sparse theoretical PC vector. When using the JMAP estimator for the IGSM hierarchical model, the estimation is unstable. The error vectors corresponding to the JMAP-IGSM estimation (Fig. 16 a, b) show the drawbacks of the method. Both PM-IGSM models accurately estimate the theoretical PC vector (Fig. 10 a, SNR = 05 dB). The comparison between the reconstructed signal and the theoretical input (Fig. 10 b, SNR = 05 dB) and the comparison between the reconstructed signal and the noisy input (Fig. 10 c, SNR = 05 dB) show a good reconstruction and a good residual error, consistent with the noise added to obtain the noisy signal g. These algorithms allow the estimation of the covariance matrix. The evolution of f and of the hyperparameters over the iterations shows a fast convergence of the PM algorithms. The proposed method, PM via VBA with the IGSM model, was validated on different sets of data, at different noise levels, and the estimate was accurate in all the cases. For the real data, a comparison between the outputs is impossible.
We have presented a comparison between the PC estimate corresponding to the PM-IGSM algorithm and the FFT estimate. The proposed method offers more precision compared to the FFT and is able to select the peaks corresponding to the biological phenomena. Via the proposed method, the conclusion imposed by the FFT method that the considered experiment presents a stability of the dominant period at 24 h is invalidated, showing a variation of the dominant period between 22 and 25 h.
7 Appendices
7.1 Appendix 1
7.1.1 Computations for JMAP estimation
7.2 Appendix 2
7.2.1 Computations for PM estimation via VBA, partial separability
The criterion J(f) introduced in Eq. (42) is quadratic in f. Equation 43 establishes a proportionality relation between q _{1}(f) and an exponential function having as argument a quadratic criterion. This leads to the following:
Intermediate conclusion 1.
The probability distribution function q _{1}(f) is a multivariate normal distribution.
∙ Expression of \(\boldsymbol {q_{2i}\!\left ({v}_{{\epsilon }_{i}}\right):}\)
Equation (59) leads to the following.
Intermediate conclusion 2.
The probability distribution function \(q_{2i}\left ({v}_{{\epsilon }_{i}}\right)\) is an inverse gamma distribution, with the parameters \(\alpha _{\epsilon _{i}}\) and \(\beta _{\epsilon _{i}}\):
∙ Expression of \(\boldsymbol {q_{3j}({v}_{{f}_{j}}):}\)
Equation (67) leads to the following.
Intermediate conclusion 3.
The probability distribution function \(q_{3j}\left ({v}_{{f}_{j}}\right)\) is an inverse gamma distribution, with the parameters \(\alpha _{f_{j}}\) and \(\beta _{f_{j}}\):
Expressions (51), (60), and (68) summarize the distribution families and the corresponding parameters for q _{1}(f), \(q_{2i}\left (v_{{\epsilon }_{i}}\right)\), i∈{1,2,…,N} and \(q_{3j}\left (v_{f_{j}}\right)\), j∈{1,2,…,M}. However, the parameters corresponding to the multivariate normal distribution are expressed via \({\widetilde {\boldsymbol {V}_{\boldsymbol {\epsilon }}^{-1}}}\) and \({\widetilde {\boldsymbol {V}_{\boldsymbol {f}}^{-1}}}\) (and by extension, all the elements forming these matrices, \({\widetilde {v_{{\epsilon }_{i}}^{-1}}}\), i∈{1,2,…,N} and \({\widetilde {v_{f_{j}}^{-1}}}\), j∈{1,2,…,M}).
Remark.
In Eq. (70), we have introduced other notations for \({\widetilde {\boldsymbol {V}_{\boldsymbol {f}}^{1}}}\) and \({\widetilde {\boldsymbol {V}_{\boldsymbol {\epsilon }}^{1}}}\). All three values were expressed during the model via unknown expectancies, but at this point, we arrive at expressions that do not contain any more integrals to be computed. Therefore, the new notations represent the final expressions for the density functions q that depend only on numerical hyperparameters, set in the prior modeling.
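To see why no integral remains, recall the standard expectation of the inverse of an inverse gamma variable. The shape and scale parametrization below is an assumption about the convention in use, and the symbols \(\alpha\) and \(\beta\) stand for any of the posterior parameter pairs above:

\[
p(v) = \frac{\beta^{\alpha}}{\Gamma(\alpha)}\, v^{-\alpha-1} \exp\left\{-\frac{\beta}{v}\right\}
\quad\Longrightarrow\quad
\mathrm{E}\left[v^{-1}\right] = \frac{\beta^{\alpha}}{\Gamma(\alpha)} \int_{0}^{\infty} v^{-\alpha-2} \exp\left\{-\frac{\beta}{v}\right\} \mathrm{d}v = \frac{\beta^{\alpha}}{\Gamma(\alpha)} \cdot \frac{\Gamma(\alpha+1)}{\beta^{\alpha+1}} = \frac{\alpha}{\beta}.
\]

With this convention, each expected inverse variance reduces to a ratio of the corresponding parameters, for example \(\widetilde{v_{\epsilon_{i}}^{-1}} = \alpha_{\epsilon_{i}} / \beta_{\epsilon_{i}}\).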
7.3 Appendix 3
7.3.1 Computations for PM estimation via VBA, full separability
This section presents the computations for the PM estimation via VBA with full separability (Subsection 4.3). The expression of the logarithm \(\ln p(\boldsymbol{f}, \boldsymbol{v}_{\boldsymbol{\epsilon}}, \boldsymbol{v}_{\boldsymbol{f}} \mid \boldsymbol{g})\) was established in the previous section (Eq. (33)).
Defining the criterion \(J\left(f_{j}\right) = \left(\left\| \left(\widetilde{\boldsymbol{V}_{\boldsymbol{\epsilon}}^{-1}}\right)^{1/2} \boldsymbol{H}^{j} \right\|^{2} + \widetilde{v_{f_{j}}^{-1}}\right) f_{j}^{2} - 2\, \boldsymbol{H}^{jT}\, \widetilde{\boldsymbol{V}_{\boldsymbol{\epsilon}}^{-1}} \left(\boldsymbol{g} - \boldsymbol{H}^{-j}\, \widetilde{\boldsymbol{f}^{-j}}\right) f_{j}\), we arrive at the following.
Intermediate conclusion 4.
The probability distribution function \(q_{1j}\left(f_{j}\right)\) is a normal distribution.
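The mean and variance of this normal distribution follow from completing the square in the criterion. The sketch below is a minimal illustration, assuming that the criterion reads as a quadratic in \(f_{j}\) whose leading coefficient is the weighted squared norm of column j of H plus the expected inverse prior variance, that the noise precision matrix is diagonal, and that \(q_{1j}(f_{j}) \propto \exp\{-J(f_{j})/2\}\). The function name `update_q1j` and all argument names are hypothetical.

```python
import numpy as np

def update_q1j(j, g, H, f_mean, v_eps_inv, v_f_inv):
    """Mean and variance of q_1j(f_j), obtained by completing the square in J(f_j).

    v_eps_inv: length-N vector of expected inverse noise variances (assumed diagonal).
    v_f_inv:   length-M vector of expected inverse prior variances.
    f_mean:    current means of all components; the value at index j cancels out.
    """
    h_j = H[:, j]
    # Residual with component j removed: g - H^{-j} f^{-j}.
    residual = g - H @ f_mean + h_j * f_mean[j]
    # Coefficient of f_j^2 in the criterion, i.e., the posterior precision.
    precision = h_j @ (v_eps_inv * h_j) + v_f_inv[j]
    variance = 1.0 / precision
    # The linear term of the criterion, divided by the precision, gives the mean.
    mean = variance * (h_j @ (v_eps_inv * residual))
    return mean, variance

# Small assumed example with N = 3 samples and M = 2 components.
H = np.array([[1.0, 2.0], [0.0, 1.0], [1.0, 0.0]])
g = np.array([1.0, 2.0, 3.0])
f_mean = np.array([0.5, -1.0])
v_eps_inv = np.array([1.0, 2.0, 0.5])
v_f_inv = np.array([4.0, 1.0])
mean_0, var_0 = update_q1j(0, g, H, f_mean, v_eps_inv, v_f_inv)
print(mean_0, var_0)
```

Sweeping j over all components and repeating until the means stabilize gives a coordinate-wise update scheme. That is one common way to use such a factorized posterior, not necessarily the exact iteration order used in the article.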
Equation (82) leads to the following.
Intermediate conclusion 5.
The probability distribution function \(q_{2i}\left ({v}_{{\epsilon }_{i}}\right)\) is an inverse gamma distribution, with the parameters \(\alpha _{\epsilon _{i}}\) and \(\beta _{\epsilon _{i}}\).
Equation (85) leads to the following.
Intermediate conclusion 6.
The probability distribution function \(q_{3j}\left ({v}_{{f}_{j}}\right)\) is an inverse gamma distribution, with the parameters \(\alpha _{f_{j}}\) and \(\beta _{f_{j}}\).
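The inverse gamma form is the usual conjugate result for a Gaussian term with unknown variance. The sketch below shows how such parameters could be computed under full separability. It assumes inverse gamma priors with shape and scale hyperparameters shared across components (named `alpha_eps0`, `beta_eps0`, `alpha_f0`, `beta_f0` here), one scalar observation per noise variance, and independent normal factors for the \(f_{j}\). These names and the exact update form are assumptions, not a transcription of the article's equations.

```python
import numpy as np

def update_variance_posteriors(g, H, f_mean, f_var,
                               alpha_eps0, beta_eps0, alpha_f0, beta_f0):
    """Inverse gamma parameters of q_2i(v_eps_i) and q_3j(v_f_j), factorized q(f)."""
    residual = g - H @ f_mean
    # E[(g_i - H_i f)^2] for independent f_j: squared mean residual
    # plus the variance propagated through row i of H.
    expected_sq_residual = residual**2 + (H**2) @ f_var
    alpha_eps = alpha_eps0 + 0.5
    beta_eps = beta_eps0 + 0.5 * expected_sq_residual
    # E[f_j^2] = mean^2 + variance.
    alpha_f = alpha_f0 + 0.5
    beta_f = beta_f0 + 0.5 * (f_mean**2 + f_var)
    return alpha_eps, beta_eps, alpha_f, beta_f

# Small assumed example with N = 2 samples and M = 2 components.
H = np.array([[1.0, 2.0], [0.0, 1.0]])
g = np.array([1.0, 2.0])
f_mean = np.array([1.0, 0.0])
f_var = np.array([0.5, 1.0])
alpha_eps, beta_eps, alpha_f, beta_f = update_variance_posteriors(
    g, H, f_mean, f_var, alpha_eps0=2.0, beta_eps0=1.0, alpha_f0=2.0, beta_f0=1.0)
print(alpha_eps, beta_eps, alpha_f, beta_f)
```

The shape parameters increase by one half because each variance is tied to a single scalar Gaussian term; the scale parameters absorb half of the corresponding expected squared quantity.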
7.4 Appendix 4
7.4.1 List of symbols and abbreviations
List of symbols
1. \(\boldsymbol{H}\) is the matrix used in the linear model throughout the article, \(\boldsymbol{H} \in \mathcal{M}_{N\times M}\). The matrix corresponds to the IFT and can be derived from Eq. (2).
2. \(\boldsymbol{H}_{i}\) represents the i-th row of the matrix \(\boldsymbol{H}\), \(\boldsymbol{H}_{i} \in \mathcal{M}_{1\times M}\).
3. \(\boldsymbol{g}_{0}\) represents the “theoretical” signal, i.e., the signal corresponding to the considered model (2) without the noise, \(\boldsymbol{g}_{0}=\boldsymbol{H}\boldsymbol{f}\). In the synthetic simulation section, the comparison between the estimated signal \({\widehat{\boldsymbol{g}_{0}}}\) and the theoretical signal \(\boldsymbol{g}_{0}\) is particularly important, since it measures whether the proposed algorithm selects the solution corresponding to the biological phenomena.
4. \(\boldsymbol{f}\) represents the PC vector, \(\boldsymbol{f} \in \mathcal{M}_{M\times 1}\). This is the fundamental unknown of our model. All the estimates of the PC vector are denoted \({\widehat{\boldsymbol{f}}}\), and in specific cases the particular estimation used is indicated: \({\widehat{\boldsymbol{f}}_{\text{JMAP}}}\) or \({\widehat{\boldsymbol{f}}_{\text{PM}}}\). The subscript used for indicating an element of the PC vector is j: \(f_{j}\); the element is not bold, being a scalar.
5. \(\boldsymbol{\epsilon}\) represents the errors, \(\boldsymbol{\epsilon} = \left[\epsilon_{1}, \epsilon_{2}, \ldots, \epsilon_{N}\right]^{T} \in \mathcal{M}_{N\times 1}\), an N-dimensional vector.
List of abbreviations
1. CT: circadian time
2. CTS: circadian timing system
3. FFT: fast Fourier transform
4. IGSM: infinite Gaussian scale mixture
5. IP: inverse problem
6. JMAP: joint maximum a posteriori
7. KL: Kullback-Leibler
8. PC vector: periodic component vector
9. PM: posterior mean
10. RT-BIO: Real-Time Biolumicorder
11. TSVD: truncated singular value decomposition
12. TRM: Tikhonov regularization methods
13. VBA: variational Bayesian approximation
14. ZT: Zeitgeber time
Declarations
Acknowledgements
The authors wish to gratefully acknowledge the reviewers for critically reading the manuscript and suggesting substantial improvements.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.