Document generation date: 2016-03-20 12:17:28

Executive summary

This document presents an overview of the results obtained by the aggregation strategies in medical diagnosis support under data incompleteness.

Introduction

In the Analytic datasets construction document we described how to obtain the training and test sets for the evaluation process. In the Training and evaluation of aggregation strategies document we described how to evaluate different aggregation operators and thresholding strategies on these data.

In this document we present an overview of the performance of the aggregation strategies. To decode the names of the aggregation strategies, please refer to the comments and code in the aggregators.R and aggregators-helpers.R scripts.

Results

This section presents the results obtained on the training and test sets. More detailed results are given for selected aggregation strategies that may fulfil medical requirements in the diagnostic process.

Training set

The following figure presents the performance of the top 5 aggregation strategies within each group on the training set (ranked by the lowest average total cost). For reference, the original and uncertaintified models are also plotted.

Test set

The left part of the following figure compares the total cost performance on the test set among the original and uncertaintified models and each aggregation group (by the lowest cost). The right part compares accuracy, sensitivity, specificity and decisiveness on the test set among the original and uncertaintified models and each aggregation group.

Detailed performance measures with 95% confidence intervals on the test set for the original and uncertaintified methods and the best aggregation methods.
Method                             Accuracy  Acc. 95% CI    Cost matrix  Cost m. 95% CI     Decisiveness  Dec. 95% CI    Sensitivity  Sen. 95% CI    Specificity  Spec. 95% CI
orig. Alc                          0.889     (0.773-0.975)  189.0        (170.000-209.233)  0.206         (0.146-0.260)  0.882        (0.708-1.000)  0.895        (0.739-1.000)
orig. LR1                          0.771     (0.650-0.875)  184.5        (163.000-204.500)  0.274         (0.217-0.337)  0.926        (0.830-1.000)  0.571        (0.333-0.796)
orig. LR2                          0.828     (0.717-0.914)  164.0        (144.533-188.000)  0.331         (0.269-0.394)  0.943        (0.857-1.000)  0.652        (0.432-0.857)
orig. RMI                          0.838     (0.769-0.913)  156.5        (128.067-183.233)  0.566         (0.483-0.646)  0.759        (0.592-0.918)  0.871        (0.799-0.943)
orig. SM                           0.791     (0.713-0.856)  142.5        (117.500-169.233)  0.629         (0.560-0.703)  0.946        (0.861-1.000)  0.712        (0.607-0.806)
orig. Tim                          0.916     (0.852-0.972)  159.0        (137.000-181.500)  0.474         (0.400-0.549)  0.667        (0.406-0.900)  0.971        (0.925-1.000)
unc. Alc                           0.851     (0.780-0.918)  149.0        (124.533-177.500)  0.537         (0.466-0.614)  0.862        (0.714-0.967)  0.846        (0.754-0.925)
unc. LR1                           0.721     (0.643-0.798)  154.5        (132.767-179.733)  0.594         (0.531-0.663)  0.957        (0.889-1.000)  0.534        (0.404-0.669)
unc. LR2                           0.712     (0.639-0.786)  157.0        (134.533-180.733)  0.594         (0.531-0.669)  0.957        (0.889-1.000)  0.517        (0.388-0.650)
unc. RMI                           0.873     (0.811-0.924)  123.0        (96.000-150.967)   0.766         (0.700-0.826)  0.767        (0.587-0.903)  0.904        (0.837-0.951)
unc. SM                            0.790     (0.723-0.843)  114.0        (86.500-147.667)   0.926         (0.886-0.960)  0.913        (0.826-0.980)  0.741        (0.646-0.811)
unc. Tim                           0.901     (0.846-0.949)  127.0        (100.267-157.733)  0.691         (0.623-0.754)  0.724        (0.537-0.885)  0.957        (0.908-0.990)
cho_auc_cen_0.025                  0.901     (0.853-0.945)  80.0         (53.267-110.967)   0.920         (0.880-0.960)  0.826        (0.713-0.932)  0.930        (0.881-0.974)
i_cho_auc_(cen_0.025)              0.901     (0.853-0.945)  80.0         (53.267-110.967)   0.920         (0.880-0.960)  0.826        (0.713-0.932)  0.930        (0.881-0.974)
iMean_(dec_(owa_cen))_(cen_0.025)  0.886     (0.833-0.934)  70.0         (44.033-102.233)   0.949         (0.914-0.983)  0.902        (0.814-0.980)  0.878        (0.816-0.934)
iMean_wid_(cen_0.025)_2            0.865     (0.813-0.911)  75.5         (52.500-105.000)   0.971         (0.943-0.994)  0.918        (0.837-0.980)  0.843        (0.780-0.904)
i_(soft_s.min_0.25)_(cen_0.025)    0.867     (0.815-0.912)  78.0         (55.000-108.233)   0.943         (0.903-0.971)  0.918        (0.840-0.980)  0.845        (0.776-0.907)
mean_(dec_(owa_cen))_cen_0.025     0.886     (0.833-0.934)  70.0         (44.033-102.233)   0.949         (0.914-0.983)  0.902        (0.814-0.980)  0.878        (0.816-0.934)
mean_ep_min_0.0_3                  0.876     (0.824-0.919)  72.0         (47.267-102.233)   0.971         (0.943-0.994)  0.900        (0.806-0.979)  0.867        (0.806-0.925)
(soft_s.min_0.25)_cen_0.025        0.861     (0.811-0.904)  82.0         (57.767-112.733)   0.949         (0.914-0.971)  0.898        (0.811-0.980)  0.846        (0.779-0.907)
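
The measures reported in the table above can be illustrated with a minimal sketch. It rests on assumptions that this report does not confirm: decisiveness is taken to be the share of cases for which a method returns a decision, accuracy, sensitivity and specificity are computed over the decided cases only, and the 95% intervals are percentile bootstrap intervals. The function names `measures` and `bootstrap_ci` and the toy data are hypothetical, and Python is used for illustration only (the project's own scripts are written in R).

```python
import random


def measures(actual, predicted):
    """Performance measures for binary labels (1 = positive, 0 = negative).

    `predicted` may contain None, meaning the method gave no decision.
    Assumption: all measures except decisiveness use decided cases only.
    """
    decided = [(a, p) for a, p in zip(actual, predicted) if p is not None]
    tp = sum(1 for a, p in decided if a == 1 and p == 1)
    tn = sum(1 for a, p in decided if a == 0 and p == 0)
    fp = sum(1 for a, p in decided if a == 0 and p == 1)
    fn = sum(1 for a, p in decided if a == 1 and p == 0)

    def ratio(num, den):
        # Undefined ratios (empty denominator) are reported as NaN.
        return num / den if den else float("nan")

    return {
        "decisiveness": ratio(len(decided), len(actual)),
        "accuracy": ratio(tp + tn, len(decided)),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
    }


def bootstrap_ci(actual, predicted, name, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for one measure (assumed CI method)."""
    rng = random.Random(seed)
    n = len(actual)
    stats = []
    for _ in range(n_boot):
        # Resample cases with replacement and recompute the measure.
        idx = [rng.randrange(n) for _ in range(n)]
        value = measures([actual[i] for i in idx],
                         [predicted[i] for i in idx])[name]
        if value == value:  # drop NaN replicates
            stats.append(value)
    if not stats:
        return float("nan"), float("nan")
    stats.sort()
    lower = stats[int((alpha / 2) * (len(stats) - 1))]
    upper = stats[int((1 - alpha / 2) * (len(stats) - 1))]
    return lower, upper


# Toy data, not taken from the study.
actual = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 1, None, 0, 0, 1, None, 0]
print(measures(actual, predicted))
print(bootstrap_ci(actual, predicted, "accuracy"))
```

Under these assumptions a method can score high accuracy while deciding on only a small share of cases, which is why decisiveness is reported next to the other measures.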

Selected aggregation strategies

The left part of the following figure compares the total cost performance on the test set of the aggregation strategies which fulfil medical requirements in the diagnostic process. The right part compares accuracy, sensitivity, specificity and decisiveness of these aggregation strategies.

Detailed performance measures with 95% confidence intervals on the test set for the selected aggregation methods.
Method                             Accuracy  Acc. 95% CI    Cost matrix  Cost m. 95% CI     Decisiveness  Dec. 95% CI    Sensitivity  Sen. 95% CI    Specificity  Spec. 95% CI
iMean_(dec_(owa_min))_(cen_0.025)  0.876     (0.822-0.923)  72.0         (44.767-103.467)   0.966         (0.937-0.994)  0.902        (0.814-0.980)  0.864        (0.805-0.924)
iMean_ep_(max_0.0)_3               0.859     (0.805-0.906)  79.5         (53.033-111.000)   0.971         (0.943-0.994)  0.900        (0.806-0.979)  0.842        (0.777-0.904)
iMean_wid_(cen_0.025)_2            0.865     (0.813-0.911)  75.5         (52.500-105.000)   0.971         (0.943-0.994)  0.918        (0.837-0.980)  0.843        (0.780-0.904)
mean_(dec_(owa_min))_cen_0.025     0.876     (0.822-0.923)  72.0         (44.767-103.467)   0.966         (0.937-0.994)  0.902        (0.814-0.980)  0.864        (0.805-0.924)
mean_ep_cen_0.0_3                  0.871     (0.820-0.913)  74.5         (49.267-105.000)   0.971         (0.943-0.994)  0.900        (0.806-0.979)  0.858        (0.798-0.916)
mean_ep_max_0.0_3                  0.859     (0.805-0.906)  79.5         (53.033-111.000)   0.971         (0.943-0.994)  0.900        (0.806-0.979)  0.842        (0.777-0.904)
mean_ep_min_0.0_3                  0.876     (0.824-0.919)  72.0         (47.267-102.233)   0.971         (0.943-0.994)  0.900        (0.806-0.979)  0.867        (0.806-0.925)

Difference between aggregation strategies and uncertaintified models

The following table shows the results of McNemar’s test among the selected aggregation strategies and between these strategies and the uncertaintified models. NaN values indicate that the two methods in a given pair classify identically.

Legend: short ids for selected aggregation strategies
Id  Method                             Class        Subclass  Subsubclass
A   iMean_(dec_(owa_min))_(cen_0.025)  Aggregation  OWA       Interval
B   iMean_ep_(max_0.0)_3               Aggregation  Mean      Interval
C   iMean_wid_(cen_0.025)_2            Aggregation  Mean      Interval
D   mean_(dec_(owa_min))_cen_0.025     Aggregation  OWA       Numeric
E   mean_ep_cen_0.0_3                  Aggregation  Mean      Numeric
F   mean_ep_max_0.0_3                  Aggregation  Mean      Numeric
G   mean_ep_min_0.0_3                  Aggregation  Mean      Numeric
McNemar’s test with Benjamini-Hochberg correction among the selected aggregation strategies and between the strategies and the uncertaintified models
     A      B      C      D      E      F      G
A
B    0.955
C    1.000  1.000
D    NaN    0.955  1.000
E    1.000  0.657  1.000  1.000
F    0.955  NaN    1.000  0.955  0.657
G    1.000  0.355  0.955  1.000  1.000  0.355
Alc  0.000  0.000  0.000  0.000  0.000  0.000  0.000
LR1  0.000  0.000  0.000  0.000  0.000  0.000  0.000
LR2  0.000  0.000  0.000  0.000  0.000  0.000  0.000
RMI  0.000  0.000  0.000  0.000  0.000  0.000  0.000
SM   0.001  0.002  0.002  0.001  0.001  0.002  0.001
Tim  0.000  0.000  0.000  0.000  0.000  0.000  0.000
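
The NaN entries and the multiple-testing correction in the table above can be illustrated with a minimal sketch. It makes assumptions this report does not state: the test is the chi-squared form of McNemar's test with continuity correction, it is applied to whether each method classifies a case correctly, and a missing decision (None) counts as incorrect. The function names `mcnemar_p` and `benjamini_hochberg` and the toy data are hypothetical, and Python is used for illustration only.

```python
import math


def mcnemar_p(pred_a, pred_b, actual):
    """p-value of McNemar's chi-squared test with continuity correction.

    Assumption: the test compares per-case correctness of two methods.
    """
    # b: A correct and B wrong; c: A wrong and B correct.
    b = sum(1 for x, y, t in zip(pred_a, pred_b, actual) if x == t and y != t)
    c = sum(1 for x, y, t in zip(pred_a, pred_b, actual) if x != t and y == t)
    if b + c == 0:
        # The methods never disagree, so the statistic is 0/0 (undefined).
        return float("nan")
    stat = (abs(b - c) - 1) ** 2 / (b + c)
    # Survival function of the chi-squared distribution with 1 degree of freedom.
    return math.erfc(math.sqrt(stat / 2))


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values; NaN entries are left as NaN
    and are not counted among the tested hypotheses."""
    valid = [i for i, p in enumerate(pvalues) if not math.isnan(p)]
    m = len(valid)
    adjusted = [float("nan")] * len(pvalues)
    running_min = 1.0
    # Walk from the largest to the smallest p-value, keeping a running minimum.
    order = sorted(valid, key=lambda i: pvalues[i], reverse=True)
    for rank, i in zip(range(m, 0, -1), order):
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Toy data, not taken from the study.
actual = [1, 1, 0, 0, 1, 0]
method_a = [1, 1, 0, 0, 1, 0]
method_b = [1, 1, 0, 0, 1, 0]   # identical to method_a
method_c = [0, 0, 1, 1, 0, 1]   # disagrees with method_a on every case
raw = [mcnemar_p(method_a, method_b, actual),
       mcnemar_p(method_a, method_c, actual)]
print(raw)
print(benjamini_hochberg(raw))
```

In this sketch two methods that classify every case identically have no discordant pairs, so the statistic has a zero denominator and the p-value is undefined. This matches the interpretation of NaN given for the A-D and B-F pairs.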