Documentation of function sequence_cumul_probability_diff_Gaussian

Last Update: 2025/7/16

◆Purpose

Compute the root mean square (RMS) residual between the observed cumulative probability distribution of the absolute amplitudes of a time series and the theoretical cumulative probability distribution of absolute amplitudes expected when the time series is random noise that obeys a normal distribution.

This function was developed as a helper for computing the background noise level of a time series using the method proposed by Maeda et al. (2020). The basic idea of the algorithm is that amplitudes below the background noise level obey a normal distribution, while amplitudes above the background noise level deviate from the normal distribution.

Let \(|v_n|\) (\(n=0,1,\cdots,N-1\)) be the absolute values of a time series at all \(N\) time samples, sorted in ascending order. Suppose that only the smallest \(N'\) of these absolute values are used, and let \(P_{obs}(|v|;N')\) be the cumulative probability distribution of absolute amplitudes based on this subset. This distribution can be defined only at amplitudes where data exist, and for \(n=0,1,\cdots,N'-1\) it is given by \[\begin{equation} P_{obs}(|v_n|;N')=\frac{n}{N'-1} \label{eq.Pobs} \end{equation}\]
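As a minimal sketch of Eq. (\ref{eq.Pobs}), the observed cumulative probability depends only on the rank \(n\) and on \(N'\). The helper name p_obs below is hypothetical and is not part of the library; the assertion makes explicit the assumption that \(N'\) is at least 2, since the denominator would otherwise be zero.

```c
#include <assert.h>

/* Hypothetical helper (not part of the library): observed cumulative
   probability n / (N' - 1) at the n-th smallest absolute amplitude,
   where n = 0, ..., N'-1. */
double p_obs(int n, int ndash)
{
    assert(ndash >= 2);            /* N' = 1 would divide by zero */
    assert(n >= 0 && n < ndash);   /* rank must lie inside the subset */
    return (double)n / (double)(ndash - 1);
}
```

The smallest amplitude therefore maps to 0 and the largest of the \(N'\) amplitudes maps to 1.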

Now suppose that this subset (the smallest \(N'\) absolute amplitudes) obeys a normal distribution with zero mean. Its standard deviation is then \[\begin{equation} \sigma(N')= \sqrt{\frac{1}{N'}\sum_{n=0}^{N'-1} v_n^2} \label{eq.sigma} \end{equation}\] With this standard deviation, the normal distribution of amplitude \(v\) is \[\begin{equation} G(v;N') =\frac{1}{\sqrt{2\pi}\sigma(N')} \exp\left[-\frac{v^2}{2\sigma(N')^2}\right] \label{eq.G} \end{equation}\] and the cumulative probability distribution of the absolute amplitude \(|v|\) is \[\begin{equation} P_{syn}(|v|;N') =\int_0^{|v|}2G(v';N')dv' =\mathrm{erf}\left[\frac{|v|}{\sqrt{2}\sigma(N')}\right] \label{eq.Psyn} \end{equation}\] where \(\mathrm{erf}\) is the error function.

The RMS residual between these two cumulative probability distributions is \[\begin{equation} \mu(N')= \sqrt{\frac{1}{N'}\sum_{n=0}^{N'-1} \left[P_{obs}(|v_n|;N')-P_{syn}(|v_n|;N')\right]^2} \label{eq.RMSresidual} \end{equation}\] This function computes \(\mu(N')\).

◆Format

#include <sequence/statistics.h>
inline double sequence_cumul_probability_diff_Gaussian
(const int Ndash,const double sigma,
const struct sequence waveform,const int *order)

◆Arguments

Ndash	The value of \(N'\). It must be positive and no greater than the number of time samples of the time series given by argument waveform.
sigma	The value of \(\sigma(N')\). This value is not computed inside the function from the time series of argument waveform; it must be computed beforehand and passed in. The reason is that Maeda et al. (2020) divide the time series into odd- and even-numbered time samples, and use \(\sigma(N')\) computed from the odd-numbered samples for the time series of even-numbered samples, and vice versa. Consequently, the given \(\sigma(N')\) is not exactly equal to the \(\sigma(N')\) that Eq. (\ref{eq.sigma}) would yield from argument waveform.
waveform	A time series of absolute amplitudes. To follow the algorithm of Maeda et al. (2020) exactly, give a time series composed of only the odd-numbered or only the even-numbered time samples.
order	A list of array indices that sorts the absolute amplitudes of argument waveform in ascending order; that is, waveform.value[order[n]] must be ascending in n.
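One way to prepare such an index list is to sort (value, index) pairs with the standard qsort and then copy out the indices. The sketch below is an assumption-laden illustration: the names indexed_value, compare_indexed, and build_ascending_order are hypothetical, and the amplitudes are taken from a plain double array rather than from struct sequence, whose full layout is not described here.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical pair type used only for sorting. */
struct indexed_value { double value; int index; };

/* Compare by value; break ties by original index for a reproducible order. */
int compare_indexed(const void *a, const void *b)
{
    const struct indexed_value *x = a;
    const struct indexed_value *y = b;
    if (x->value < y->value) return -1;
    if (x->value > y->value) return 1;
    return (x->index > y->index) - (x->index < y->index);
}

/* Fill order[0..n-1] so that value[order[k]] is ascending in k.
   Returns 0 on success, -1 if memory allocation fails. */
int build_ascending_order(const double *value, int n, int *order)
{
    struct indexed_value *tmp;
    assert(n >= 1);
    tmp = malloc((size_t)n * sizeof *tmp);
    if (tmp == NULL) return -1;
    for (int k = 0; k < n; k++) {
        tmp[k].value = value[k];
        tmp[k].index = k;
    }
    qsort(tmp, (size_t)n, sizeof *tmp, compare_indexed);
    for (int k = 0; k < n; k++)
        order[k] = tmp[k].index;
    free(tmp);
    return 0;
}
```

Sorting pairs avoids the need for a global pointer to the amplitude array, which a plain index comparator passed to qsort would otherwise require.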

◆Return value

The value of \(\mu(N')\) in Eq. (\ref{eq.RMSresidual}).

◆Example

◆References

Maeda Y, Yamanaka Y, Ito T, Horikawa S (2020) Machine learning based detection of volcano seismicity using the spatial pattern of amplitudes, Geophys J Int 225(1), 416-444. https://doi.org/10.1093/gji/ggaa593

This paper is also available from the Nagoya University repository:
https://nagoya.repo.nii.ac.jp/records/2001674