Results overview

Training set

The following figure presents the performance of the top 5 aggregation strategies within each group on the training set (ranked by lowest average total cost). For reference, the original and uncertaintified models are also plotted.

Test set

The left part of the following figure compares total cost on the test set among the original models, the uncertaintified models, and the best aggregation strategy from each group (by lowest cost). The right part compares accuracy, sensitivity, specificity and decisiveness on the test set among the same models and strategies.

Detailed performance measures with 95% confidence intervals on the test set for the original and uncertaintified models and the best aggregation methods.
Method	Accuracy	Acc. 95% CI	Cost matrix	Cost m. 95% CI	Decisiveness	Dec. 95% CI	Sensitivity	Sen. 95% CI	Specificity	Spec. 95% CI
orig. Alc	0.889	(0.773-0.975)	189.0	(170.000-209.233)	0.206	(0.146-0.260)	0.882	(0.708-1.000)	0.895	(0.739-1.000)
orig. LR1	0.771	(0.650-0.875)	184.5	(163.000-204.500)	0.274	(0.217-0.337)	0.926	(0.830-1.000)	0.571	(0.333-0.796)
orig. LR2	0.828	(0.717-0.914)	164.0	(144.533-188.000)	0.331	(0.269-0.394)	0.943	(0.857-1.000)	0.652	(0.432-0.857)
orig. RMI	0.838	(0.769-0.913)	156.5	(128.067-183.233)	0.566	(0.483-0.646)	0.759	(0.592-0.918)	0.871	(0.799-0.943)
orig. SM	0.791	(0.713-0.856)	142.5	(117.500-169.233)	0.629	(0.560-0.703)	0.946	(0.861-1.000)	0.712	(0.607-0.806)
orig. Tim	0.916	(0.852-0.972)	159.0	(137.000-181.500)	0.474	(0.400-0.549)	0.667	(0.406-0.900)	0.971	(0.925-1.000)
unc. Alc	0.851	(0.780-0.918)	149.0	(124.533-177.500)	0.537	(0.466-0.614)	0.862	(0.714-0.967)	0.846	(0.754-0.925)
unc. LR1	0.721	(0.643-0.798)	154.5	(132.767-179.733)	0.594	(0.531-0.663)	0.957	(0.889-1.000)	0.534	(0.404-0.669)
unc. LR2	0.712	(0.639-0.786)	157.0	(134.533-180.733)	0.594	(0.531-0.669)	0.957	(0.889-1.000)	0.517	(0.388-0.650)
unc. RMI	0.873	(0.811-0.924)	123.0	(96.000-150.967)	0.766	(0.700-0.826)	0.767	(0.587-0.903)	0.904	(0.837-0.951)
unc. SM	0.790	(0.723-0.843)	114.0	(86.500-147.667)	0.926	(0.886-0.960)	0.913	(0.826-0.980)	0.741	(0.646-0.811)
unc. Tim	0.901	(0.846-0.949)	127.0	(100.267-157.733)	0.691	(0.623-0.754)	0.724	(0.537-0.885)	0.957	(0.908-0.990)
cho_auc_cen_0.025	0.901	(0.853-0.945)	80.0	(53.267-110.967)	0.920	(0.880-0.960)	0.826	(0.713-0.932)	0.930	(0.881-0.974)
i_cho_auc_(cen_0.025)	0.901	(0.853-0.945)	80.0	(53.267-110.967)	0.920	(0.880-0.960)	0.826	(0.713-0.932)	0.930	(0.881-0.974)
iMean_(dec_(owa_cen))_(cen_0.025)	0.886	(0.833-0.934)	70.0	(44.033-102.233)	0.949	(0.914-0.983)	0.902	(0.814-0.980)	0.878	(0.816-0.934)
iMean_wid_(cen_0.025)_2	0.865	(0.813-0.911)	75.5	(52.500-105.000)	0.971	(0.943-0.994)	0.918	(0.837-0.980)	0.843	(0.780-0.904)
i_(soft_s.min_0.25)_(cen_0.025)	0.867	(0.815-0.912)	78.0	(55.000-108.233)	0.943	(0.903-0.971)	0.918	(0.840-0.980)	0.845	(0.776-0.907)
mean_(dec_(owa_cen))_cen_0.025	0.886	(0.833-0.934)	70.0	(44.033-102.233)	0.949	(0.914-0.983)	0.902	(0.814-0.980)	0.878	(0.816-0.934)
mean_ep_min_0.0_3	0.876	(0.824-0.919)	72.0	(47.267-102.233)	0.971	(0.943-0.994)	0.900	(0.806-0.979)	0.867	(0.806-0.925)
(soft_s.min_0.25)_cen_0.025	0.861	(0.811-0.904)	82.0	(57.767-112.733)	0.949	(0.914-0.971)	0.898	(0.811-0.980)	0.846	(0.779-0.907)
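
To illustrate how the four measures in the table above relate to each other, here is a minimal Python sketch. It rests on assumptions the report does not state: that a method may abstain from a decision, that decisiveness is the fraction of cases on which it does decide, and that accuracy, sensitivity and specificity are computed on the decided cases only. The function name, the 1/0 label encoding and the use of None for an abstention are hypothetical choices made for this example.

```python
import math
from typing import Dict, Optional, Sequence


def diagnostic_metrics(
    y_true: Sequence[int], y_pred: Sequence[Optional[int]]
) -> Dict[str, float]:
    """Compute decisiveness, accuracy, sensitivity and specificity.

    y_true: 1 for a positive case, 0 for a negative case.
    y_pred: 1, 0, or None when the method abstains (assumed encoding).
    """
    # Keep only the cases on which the method actually made a decision.
    decided = [(t, p) for t, p in zip(y_true, y_pred) if p is not None]

    tp = sum(1 for t, p in decided if t == 1 and p == 1)
    tn = sum(1 for t, p in decided if t == 0 and p == 0)
    fp = sum(1 for t, p in decided if t == 0 and p == 1)
    fn = sum(1 for t, p in decided if t == 1 and p == 0)

    def ratio(num: int, den: int) -> float:
        # An empty denominator means the measure is undefined.
        return num / den if den else math.nan

    return {
        "decisiveness": ratio(len(decided), len(y_true)),
        "accuracy": ratio(tp + tn, len(decided)),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
    }


example = diagnostic_metrics(
    y_true=[1, 1, 0, 0, 1, 0],
    y_pred=[1, 0, 0, None, None, 1],
)
```

Under this reading, a method with low decisiveness can still show high accuracy, because the cases it declines to classify do not count against it. Those cases would instead be reflected in the total cost.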

Selected aggregation strategies

The left part of the following figure compares total cost on the test set for the aggregation strategies that fulfil the medical requirements of the diagnostic process. The right part compares accuracy, sensitivity, specificity and decisiveness for these aggregation strategies.

Detailed performance measures with 95% confidence intervals on the test set for the selected aggregation methods.
Method	Accuracy	Acc. 95% CI	Cost matrix	Cost m. 95% CI	Decisiveness	Dec. 95% CI	Sensitivity	Sen. 95% CI	Specificity	Spec. 95% CI
iMean_(dec_(owa_min))_(cen_0.025)	0.876	(0.822-0.923)	72.0	(44.767-103.467)	0.966	(0.937-0.994)	0.902	(0.814-0.980)	0.864	(0.805-0.924)
iMean_ep_(max_0.0)_3	0.859	(0.805-0.906)	79.5	(53.033-111.000)	0.971	(0.943-0.994)	0.900	(0.806-0.979)	0.842	(0.777-0.904)
iMean_wid_(cen_0.025)_2	0.865	(0.813-0.911)	75.5	(52.500-105.000)	0.971	(0.943-0.994)	0.918	(0.837-0.980)	0.843	(0.780-0.904)
mean_(dec_(owa_min))_cen_0.025	0.876	(0.822-0.923)	72.0	(44.767-103.467)	0.966	(0.937-0.994)	0.902	(0.814-0.980)	0.864	(0.805-0.924)
mean_ep_cen_0.0_3	0.871	(0.820-0.913)	74.5	(49.267-105.000)	0.971	(0.943-0.994)	0.900	(0.806-0.979)	0.858	(0.798-0.916)
mean_ep_max_0.0_3	0.859	(0.805-0.906)	79.5	(53.033-111.000)	0.971	(0.943-0.994)	0.900	(0.806-0.979)	0.842	(0.777-0.904)
mean_ep_min_0.0_3	0.876	(0.824-0.919)	72.0	(47.267-102.233)	0.971	(0.943-0.994)	0.900	(0.806-0.979)	0.867	(0.806-0.925)

Difference between aggregation strategies and uncertaintified models

The following table shows the results of McNemar’s test among the selected aggregation strategies and between those strategies and the uncertaintified models. NaN values indicate that the two methods in a given pair classify identically.

Legend: short ids for selected aggregation strategies
Id	Method	Class	Subclass	Subsubclass
A	iMean_(dec_(owa_min))_(cen_0.025)	Aggregation	OWA	Interval
B	iMean_ep_(max_0.0)_3	Aggregation	Mean	Interval
C	iMean_wid_(cen_0.025)_2	Aggregation	Mean	Interval
D	mean_(dec_(owa_min))_cen_0.025	Aggregation	OWA	Numeric
E	mean_ep_cen_0.0_3	Aggregation	Mean	Numeric
F	mean_ep_max_0.0_3	Aggregation	Mean	Numeric
G	mean_ep_min_0.0_3	Aggregation	Mean	Numeric

McNemar’s test with Benjamini-Hochberg correction among the selected aggregation strategies and between the strategies and the uncertaintified models
	A	B	C	D	E	F	G
A
B	0.955
C	1.000	1.000
D	NaN	0.955	1.000
E	1.000	0.657	1.000	1.000
F	0.955	NaN	1.000	0.955	0.657
G	1.000	0.355	0.955	1.000	1.000	0.355
Alc	0.000	0.000	0.000	0.000	0.000	0.000	0.000
LR1	0.000	0.000	0.000	0.000	0.000	0.000	0.000
LR2	0.000	0.000	0.000	0.000	0.000	0.000	0.000
RMI	0.000	0.000	0.000	0.000	0.000	0.000	0.000
SM	0.001	0.002	0.002	0.001	0.001	0.002	0.001
Tim	0.000	0.000	0.000	0.000	0.000	0.000	0.000
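
The following Python sketch shows one way p-values like those in the table above could be obtained, and why identical classifiers yield NaN. It is an illustration under assumptions, not the authors' procedure: it uses the chi-square form of McNemar's test with a continuity correction on paired binary outcomes, and it leaves undefined p-values out of the Benjamini-Hochberg adjustment. The function names and the 1/0 encoding are hypothetical.

```python
import math
from typing import List, Sequence


def mcnemar_p(a: Sequence[int], b: Sequence[int]) -> float:
    """McNemar p-value for two paired sequences of binary outcomes (1/0)."""
    # Only discordant pairs carry information about a difference.
    n10 = sum(1 for x, y in zip(a, b) if x == 1 and y == 0)
    n01 = sum(1 for x, y in zip(a, b) if x == 0 and y == 1)
    if n10 + n01 == 0:
        # Identical outcomes: the statistic is 0/0, hence NaN.
        return math.nan
    # Continuity-corrected statistic (an assumed variant of the test).
    stat = max(abs(n10 - n01) - 1, 0) ** 2 / (n10 + n01)
    # Survival function of the chi-square distribution with 1 degree of freedom.
    return math.erfc(math.sqrt(stat / 2))


def benjamini_hochberg(pvalues: Sequence[float]) -> List[float]:
    """Benjamini-Hochberg adjusted p-values; NaN inputs stay NaN."""
    valid = [i for i, p in enumerate(pvalues) if not math.isnan(p)]
    m = len(valid)
    adjusted = [math.nan] * len(pvalues)
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for offset, i in enumerate(sorted(valid, key=lambda k: pvalues[k], reverse=True)):
        rank = m - offset  # 1-based rank in ascending order
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


method_x = [1, 1, 1, 1, 0, 0]
method_y = [1, 0, 0, 0, 0, 1]
raw = [mcnemar_p(method_x, method_y), mcnemar_p(method_x, method_x)]
adjusted = benjamini_hochberg(raw)
```

Here the second comparison pairs a method with itself, so there are no discordant cases and the p-value is NaN, which matches the explanation given for the NaN cells in the table.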


Andrzej Wójtowicz, Patryk Żywica
