Documentation of the function sequence_trend

Last Update: 2021/12/8


◆Purpose

Calculate the slope and intercept of the straight line that best fits a time series.


◆Format

#include <sequence/statistics.h>
inline double *sequence_trend(struct sequence seq)


◆Arguments

seq A structure representing the time series used for the computation.


◆Return value

An array holding the slope and the intercept, in that order. The intercept is the value of the fitted line at time \(t=0\), regardless of the time range over which the time series is defined.
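
As a small worked illustration of this convention (the numbers are invented for this note and do not come from the library), consider three samples at \(t=10,11,12\) with values \(25,27,29\). They lie exactly on one straight line:

\[ f = 2t + 5, \qquad a = 2, \qquad b = f(0) = 5 \]

Under the definition above the intercept would be 5, the value of the line at \(t=0\), and not 25, the value at the first sample time \(t_0=10\).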


◆Example

struct sequence seq;
double *trend=sequence_trend(seq);

In this example, trend[0] is the slope and trend[1] is the intercept.


◆Formula

Let the time series be defined at times \(t=t_0,t_1,\cdots,t_{N-1}\), and let its values at these times be \(f_0,f_1,\cdots,f_{N-1}\). Writing the straight line that best explains the time series as \(f=at+b\), the slope \(a\) and the intercept \(b\) are found by least squares fitting. The observation equations are
\[\begin{equation} f_i=at_i+b \hspace{1em} (i=0,1,\cdots,N-1) \label{eq.solve} \end{equation}\]
which can be written in matrix form as
\[\begin{equation} \begin{pmatrix} f_0 \\ \vdots \\ f_{N-1} \end{pmatrix} = \begin{pmatrix} t_0 & 1 \\ \vdots & \vdots \\ t_{N-1} & 1 \end{pmatrix} \begin{pmatrix} a \\ b \end{pmatrix} \label{eq.solve.matrix} \end{equation}\]
The least squares solution of eq. (\ref{eq.solve.matrix}) is obtained by solving the normal equations
\[\begin{equation} \begin{pmatrix} t_0 & \cdots & t_{N-1} \\ 1 & \cdots & 1 \end{pmatrix} \begin{pmatrix} f_0 \\ \vdots \\ f_{N-1} \end{pmatrix} = \begin{pmatrix} t_0 & \cdots & t_{N-1} \\ 1 & \cdots & 1 \end{pmatrix} \begin{pmatrix} t_0 & 1 \\ \vdots & \vdots \\ t_{N-1} & 1 \end{pmatrix} \begin{pmatrix} a \\ b \end{pmatrix} \label{eq.solve.matrix.leastsquare} \end{equation}\]
Evaluating the matrix products on both sides gives
\[\begin{equation} \begin{pmatrix} \sum_{i=0}^{N-1} t_i f_i \\ \sum_{i=0}^{N-1} f_i \end{pmatrix} = \begin{pmatrix} \sum_{i=0}^{N-1} t_i^2 & \sum_{i=0}^{N-1} t_i \\ \sum_{i=0}^{N-1} t_i & N \end{pmatrix} \begin{pmatrix} a \\ b \end{pmatrix} \label{eq.solve.matrix.leastsquare.arrange} \end{equation}\]
and Cramer's rule then yields
\[\begin{eqnarray} a &=& \frac{\begin{vmatrix} \sum_{i=0}^{N-1} t_i f_i & \sum_{i=0}^{N-1} t_i \\ \sum_{i=0}^{N-1} f_i & N \end{vmatrix}} {\begin{vmatrix} \sum_{i=0}^{N-1} t_i^2 & \sum_{i=0}^{N-1} t_i \\ \sum_{i=0}^{N-1} t_i & N \end{vmatrix}} \nonumber \\ &=& \frac{N\sum_{i=0}^{N-1} t_i f_i -\left(\sum_{i=0}^{N-1} t_i\right)\left(\sum_{i=0}^{N-1} f_i\right)} {N\sum_{i=0}^{N-1} t_i^2-\left(\sum_{i=0}^{N-1} t_i\right)^2} \label{eq.a} \end{eqnarray}\]
\[\begin{eqnarray} b &=& \frac{\begin{vmatrix} \sum_{i=0}^{N-1} t_i^2 & \sum_{i=0}^{N-1} t_i f_i \\ \sum_{i=0}^{N-1} t_i & \sum_{i=0}^{N-1} f_i \end{vmatrix}} {\begin{vmatrix} \sum_{i=0}^{N-1} t_i^2 & \sum_{i=0}^{N-1} t_i \\ \sum_{i=0}^{N-1} t_i & N \end{vmatrix}} \nonumber \\ &=& \frac{\left(\sum_{i=0}^{N-1} t_i^2\right) \left(\sum_{i=0}^{N-1} f_i\right) -\left(\sum_{i=0}^{N-1} t_i\right) \left(\sum_{i=0}^{N-1} t_i f_i\right)} {N\sum_{i=0}^{N-1} t_i^2-\left(\sum_{i=0}^{N-1} t_i\right)^2} \label{eq.b} \end{eqnarray}\]
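
Eqs. (\ref{eq.a}) and (\ref{eq.b}) can be sketched directly in C. The block below is only an illustration under stated assumptions: the helper name fit_line, its array-based signature, and its error convention are hypothetical and are not taken from the library, whose struct sequence layout is not described here. The shared denominator vanishes when fewer than two distinct sample times are given, so the sketch reports that case instead of dividing.

```c
#include <stddef.h>

/* Hypothetical helper (not part of the library): fits f = a*t + b to the
 * n samples (t[i], f[i]) by least squares, following eqs. (eq.a), (eq.b).
 * On success writes the slope to out[0] and the intercept to out[1] and
 * returns 0. Returns -1 when the fit is undetermined. */
int fit_line(const double *t, const double *f, size_t n, double out[2])
{
    double sum_t = 0.0, sum_t2 = 0.0, sum_f = 0.0, sum_tf = 0.0;

    for (size_t i = 0; i < n; i++) {
        sum_t  += t[i];
        sum_t2 += t[i] * t[i];
        sum_f  += f[i];
        sum_tf += t[i] * f[i];
    }

    /* Common denominator: determinant of the 2x2 normal matrix. */
    double det = (double)n * sum_t2 - sum_t * sum_t;
    if (n < 2 || det == 0.0)
        return -1;

    out[0] = ((double)n * sum_tf - sum_t * sum_f) / det;  /* slope a */
    out[1] = (sum_t2 * sum_f - sum_t * sum_tf) / det;     /* intercept b */
    return 0;
}
```

Called with t = {10, 11, 12} and f = {25, 27, 29}, the sums are \(\sum t_i=33\), \(\sum t_i^2=365\), \(\sum f_i=81\), \(\sum t_i f_i=895\), the denominator is 6, and the result is a slope of 2 and an intercept of 5.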

Let \(\Delta t\) be the sampling interval of the time series, so that
\[\begin{equation} t_i=t_0+i\Delta t \label{eq.ti} \end{equation}\]
Substituting this into the sums over \(t_i\) gives
\[\begin{eqnarray} \sum_{i=0}^{N-1} t_i &=& \sum_{i=0}^{N-1}(t_0+i\Delta t) \nonumber \\ &=& Nt_0+\Delta t\sum_{i=1}^{N-1}i \nonumber \\ &=& Nt_0+\Delta t\frac{N(N-1)}{2} \label{eq.sumti} \end{eqnarray}\]
\[\begin{eqnarray} \sum_{i=0}^{N-1} t_i^2 &=& \sum_{i=0}^{N-1}(t_0+i\Delta t)^2 \nonumber \\ &=& \sum_{i=0}^{N-1}(t_0^2 + 2t_0 i \Delta t + i^2\Delta t^2) \nonumber \\ &=& Nt_0^2 + 2t_0\Delta t\sum_{i=1}^{N-1}i + \Delta t^2\sum_{i=1}^{N-1}i^2 \nonumber \\ &=& Nt_0^2 + t_0\Delta t N(N-1) +\Delta t^2\frac{N(N-1)(2N-1)}{6} \label{eq.sumti2} \end{eqnarray}\]
Because \(\sum_{i=0}^{N-1} t_i\) and \(\sum_{i=0}^{N-1} t_i^2\) can be evaluated with these closed-form expressions, only the sums that involve \(f_i\), namely \(\sum_{i=0}^{N-1} f_i\) and \(\sum_{i=0}^{N-1} t_i f_i\), need to be computed numerically in eqs. (\ref{eq.a}) and (\ref{eq.b}).
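
For uniformly sampled data, the closed-form sums of eqs. (\ref{eq.sumti}) and (\ref{eq.sumti2}) can be combined with eqs. (\ref{eq.a}) and (\ref{eq.b}) as in the following sketch. The helper name fit_line_uniform and its parameters (t0 for \(t_0\), dt for \(\Delta t\)) are hypothetical, chosen for illustration; this is not claimed to be how the library function is implemented internally.

```c
#include <stddef.h>

/* Hypothetical helper (not part of the library): fits f = a*t + b to n
 * samples f[i] taken at t_i = t0 + i*dt. The sums over t_i use the
 * closed forms (eq.sumti), (eq.sumti2); only the sums containing f_i
 * are accumulated in the loop. Writes the slope to out[0] and the
 * intercept (value of the line at t = 0) to out[1].
 * Returns 0 on success, -1 if n < 2 or dt == 0. */
int fit_line_uniform(const double *f, size_t n, double t0, double dt,
                     double out[2])
{
    if (n < 2 || dt == 0.0)
        return -1;

    double N = (double)n;
    double sum_i  = N * (N - 1.0) / 2.0;                    /* sum of i   */
    double sum_i2 = N * (N - 1.0) * (2.0 * N - 1.0) / 6.0;  /* sum of i^2 */
    double sum_t  = N * t0 + dt * sum_i;
    double sum_t2 = N * t0 * t0 + 2.0 * t0 * dt * sum_i + dt * dt * sum_i2;

    double sum_f = 0.0, sum_tf = 0.0;
    for (size_t i = 0; i < n; i++) {
        sum_f  += f[i];
        sum_tf += (t0 + (double)i * dt) * f[i];
    }

    double det = N * sum_t2 - sum_t * sum_t;
    out[0] = (N * sum_tf - sum_t * sum_f) / det;       /* slope a */
    out[1] = (sum_t2 * sum_f - sum_t * sum_tf) / det;  /* intercept b */
    return 0;
}
```

In exact arithmetic the denominator equals \(\Delta t^2 N^2(N^2-1)/12\), which is nonzero for \(N\ge 2\) and \(\Delta t\neq 0\). In floating point it is formed as a difference of two large numbers when \(|t_0|\) is much larger than \(N\Delta t\), so some loss of precision should be expected in that regime.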