Documentation of the function calculate_average_remove_outliers

Last Update: 2025/8/7

◆Purpose

Calculate the average of a list of real numbers after excluding a given fraction of the largest values and of the smallest values.

Let \(d_1,\cdots,d_N\) be \(N\) real numbers, and let \(\hat{d}_1,\cdots,\hat{d}_N\) be the same numbers sorted in descending order. When \(N_{cut}\) samples are removed from each of the largest and smallest ends, the average is given by \[\begin{equation} \bar{d}=\frac{1}{N-2N_{cut}}\sum_{i=N_{cut}+1}^{N-N_{cut}}\hat{d}_i \label{eq.average} \end{equation}\] In this function, \(N_{cut}\) is not specified directly. Instead, the caller gives the ratio \(r\) of data samples to be removed from each end, and \(N_{cut}\) is obtained by rounding \(rN\) to the nearest integer (halves are rounded up).
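The procedure described by eq. (\ref{eq.average}) can be sketched in C++ as follows. This is only a minimal illustration, not the code in statistics.h: the name trimmed_mean_sketch, the use of std::lround for the rounding of \(rN\), and the assert-based argument checks are all assumptions made for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <functional>
#include <vector>

// Hypothetical stand-in for illustration; not the library's implementation.
double trimmed_mean_sketch(const int N, const double *d, const double ratio)
{
    assert(N > 0 && d != nullptr);

    // N_cut = r*N rounded to the nearest integer.
    // std::lround rounds halves away from zero, which for non-negative
    // values is the same as rounding halves up.
    const int n_cut = static_cast<int>(std::lround(ratio * N));
    assert(ratio >= 0.0 && N - 2 * n_cut >= 1);

    // Sort a copy in descending order so the caller's array is left untouched.
    std::vector<double> sorted(d, d + N);
    std::sort(sorted.begin(), sorted.end(), std::greater<double>());

    // Sum \hat{d}_i for i = N_cut+1 .. N-N_cut (1-based),
    // i.e. zero-based indices n_cut .. N-n_cut-1.
    double sum = 0.0;
    for (int i = n_cut; i < N - n_cut; ++i) {
        sum += sorted[i];
    }
    return sum / (N - 2 * n_cut);
}
```

Because the sketch sorts a copy, the input array keeps its original order, which matches the behavior this manual states for the real function.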

◆Format

#include <statistics.h>
inline double calculate_average_remove_outliers
(const int N,const double *d,const double ratio)

◆Arguments

N	The number of data samples, \(N\).
d	An array holding the real numbers \(d_1,\cdots,d_N\).
ratio	Ratio, \(r\), of the data samples excluded from each of the largest and smallest ends of the data when computing the average. Because the fraction \(r\) is removed from each end, a total fraction of \(2r\) is excluded. The value must therefore satisfy \(0\leq r < 0.5\), and at least one data sample must remain after the exclusion.
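The two conditions on ratio can be checked before calling the function. The helper below is a hypothetical sketch (remaining_samples_sketch is not part of statistics.h) and assumes that \(rN\) is rounded as std::lround does; how the actual function reacts to invalid arguments is not stated in this manual.

```cpp
#include <cmath>

// Hypothetical helper: returns the number of samples that would remain
// after trimming, or -1 if the arguments violate the documented constraints.
int remaining_samples_sketch(const int N, const double ratio)
{
    // Documented range: 0 <= r < 0.5 (also rejects NaN).
    if (N <= 0 || !(ratio >= 0.0 && ratio < 0.5)) {
        return -1;
    }
    // N_cut = r*N rounded to the nearest integer.
    const int n_cut = static_cast<int>(std::lround(ratio * N));
    const int remaining = N - 2 * n_cut;
    // At least one sample must remain after the exclusion.
    return remaining >= 1 ? remaining : -1;
}
```

The second condition is not implied by the first. For instance, \(N=2\) with \(r=0.3\) satisfies \(0\leq r < 0.5\), yet \(rN=0.6\) rounds to \(N_{cut}=1\) and no sample would remain, so the helper reports this combination as invalid.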

◆Return value

The average calculated with eq. (\ref{eq.average}).

◆Example

const int N=10;
const double d[]={1.2,3.4,5.6,7.8,9.1,2.3,4.5,6.7,8.9,0.1};
double a=calculate_average_remove_outliers(N,d,0.2);

In this example, sorting the real numbers in “d” in descending order gives:

9.1
8.9
7.8
6.7
5.6
4.5
3.4
2.3
1.2
0.1

Removing the fraction \(r=0.2\) (i.e., 20%) from each of the largest and smallest ends leaves:

7.8
6.7
5.6
4.5
3.4
2.3

The average of this list is 5.05, so the program above yields a=5.05.
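
As a check on the arithmetic, eq. (\ref{eq.average}) can be written out for this example with \(N=10\) and \(r=0.2\): \[ N_{cut}=\mathrm{round}(0.2\times 10)=2,\qquad \bar{d}=\frac{1}{10-2\cdot 2}\sum_{i=3}^{8}\hat{d}_i=\frac{7.8+6.7+5.6+4.5+3.4+2.3}{6}=\frac{30.3}{6}=5.05 \]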

◆Validation

Running the calculation in the “Example” above with this function yielded the correct result (5.05).

◆Additional notes

The samples of ratio \(r\) at each of the largest and smallest ends are merely excluded from the calculation of the average; the elements of the array “d” are the same before and after the function call.