Document generation date: 2016-03-20 12:17:28
This document presents an overview of the results obtained by the aggregation strategies in medical diagnosis support under data incompleteness.
In the Analytic datasets construction document we described how to obtain the training and test sets for the evaluation process. In the Training and evaluation of aggregation strategies document we described how to evaluate different aggregation operators and thresholding strategies on these data.
In this document we present an overview of the performance of the aggregation strategies. To decode the names of the aggregation strategies, please refer to the comments and code in the aggregators.R and aggregators-helpers.R scripts.
This section presents the results obtained on the training and test sets. The more detailed results consider selected aggregation strategies that may fulfil medical requirements in the diagnostic process.
The following figure presents the performance of the top 5 aggregation strategies within each group on the training set (ranked by lowest average total cost). For reference, the original and uncertaintified models are also plotted.
The left part of the following figure compares the total cost on the test set for the original and uncertaintified models and for each aggregation group (selected by lowest cost). The right part compares accuracy, sensitivity, specificity and decisiveness on the test set for the same models and aggregation groups.
Method | Accuracy | Acc. 95% CI | Cost matrix | Cost m. 95% CI | Decisiveness | Dec. 95% CI | Sensitivity | Sen. 95% CI | Specificity | Spec. 95% CI |
---|---|---|---|---|---|---|---|---|---|---|
orig. Alc | 0.889 | (0.773-0.975) | 189.0 | (170.000-209.233) | 0.206 | (0.146-0.260) | 0.882 | (0.708-1.000) | 0.895 | (0.739-1.000) |
orig. LR1 | 0.771 | (0.650-0.875) | 184.5 | (163.000-204.500) | 0.274 | (0.217-0.337) | 0.926 | (0.830-1.000) | 0.571 | (0.333-0.796) |
orig. LR2 | 0.828 | (0.717-0.914) | 164.0 | (144.533-188.000) | 0.331 | (0.269-0.394) | 0.943 | (0.857-1.000) | 0.652 | (0.432-0.857) |
orig. RMI | 0.838 | (0.769-0.913) | 156.5 | (128.067-183.233) | 0.566 | (0.483-0.646) | 0.759 | (0.592-0.918) | 0.871 | (0.799-0.943) |
orig. SM | 0.791 | (0.713-0.856) | 142.5 | (117.500-169.233) | 0.629 | (0.560-0.703) | 0.946 | (0.861-1.000) | 0.712 | (0.607-0.806) |
orig. Tim | 0.916 | (0.852-0.972) | 159.0 | (137.000-181.500) | 0.474 | (0.400-0.549) | 0.667 | (0.406-0.900) | 0.971 | (0.925-1.000) |
unc. Alc | 0.851 | (0.780-0.918) | 149.0 | (124.533-177.500) | 0.537 | (0.466-0.614) | 0.862 | (0.714-0.967) | 0.846 | (0.754-0.925) |
unc. LR1 | 0.721 | (0.643-0.798) | 154.5 | (132.767-179.733) | 0.594 | (0.531-0.663) | 0.957 | (0.889-1.000) | 0.534 | (0.404-0.669) |
unc. LR2 | 0.712 | (0.639-0.786) | 157.0 | (134.533-180.733) | 0.594 | (0.531-0.669) | 0.957 | (0.889-1.000) | 0.517 | (0.388-0.650) |
unc. RMI | 0.873 | (0.811-0.924) | 123.0 | (96.000-150.967) | 0.766 | (0.700-0.826) | 0.767 | (0.587-0.903) | 0.904 | (0.837-0.951) |
unc. SM | 0.790 | (0.723-0.843) | 114.0 | (86.500-147.667) | 0.926 | (0.886-0.960) | 0.913 | (0.826-0.980) | 0.741 | (0.646-0.811) |
unc. Tim | 0.901 | (0.846-0.949) | 127.0 | (100.267-157.733) | 0.691 | (0.623-0.754) | 0.724 | (0.537-0.885) | 0.957 | (0.908-0.990) |
cho_auc_cen_0.025 | 0.901 | (0.853-0.945) | 80.0 | (53.267-110.967) | 0.920 | (0.880-0.960) | 0.826 | (0.713-0.932) | 0.930 | (0.881-0.974) |
i_cho_auc_(cen_0.025) | 0.901 | (0.853-0.945) | 80.0 | (53.267-110.967) | 0.920 | (0.880-0.960) | 0.826 | (0.713-0.932) | 0.930 | (0.881-0.974) |
iMean_(dec_(owa_cen))_(cen_0.025) | 0.886 | (0.833-0.934) | 70.0 | (44.033-102.233) | 0.949 | (0.914-0.983) | 0.902 | (0.814-0.980) | 0.878 | (0.816-0.934) |
iMean_wid_(cen_0.025)_2 | 0.865 | (0.813-0.911) | 75.5 | (52.500-105.000) | 0.971 | (0.943-0.994) | 0.918 | (0.837-0.980) | 0.843 | (0.780-0.904) |
i_(soft_s.min_0.25)_(cen_0.025) | 0.867 | (0.815-0.912) | 78.0 | (55.000-108.233) | 0.943 | (0.903-0.971) | 0.918 | (0.840-0.980) | 0.845 | (0.776-0.907) |
mean_(dec_(owa_cen))_cen_0.025 | 0.886 | (0.833-0.934) | 70.0 | (44.033-102.233) | 0.949 | (0.914-0.983) | 0.902 | (0.814-0.980) | 0.878 | (0.816-0.934) |
mean_ep_min_0.0_3 | 0.876 | (0.824-0.919) | 72.0 | (47.267-102.233) | 0.971 | (0.943-0.994) | 0.900 | (0.806-0.979) | 0.867 | (0.806-0.925) |
(soft_s.min_0.25)_cen_0.025 | 0.861 | (0.811-0.904) | 82.0 | (57.767-112.733) | 0.949 | (0.914-0.971) | 0.898 | (0.811-0.980) | 0.846 | (0.779-0.907) |
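
The columns of the table above can be illustrated with a minimal Python sketch (not the project's R code). It rests on several assumptions that this document does not confirm: a model may abstain (prediction `None`), decisiveness is the share of cases with a decision, accuracy, sensitivity and specificity are computed over decided cases only, and the three cost values are hypothetical placeholders rather than the cost matrix used in the study.

```python
from typing import Dict, Optional, Sequence

# Hypothetical costs, chosen only for illustration; the real cost matrix
# is defined in the evaluation scripts, not in this report.
COST_FALSE_NEGATIVE = 5.0
COST_FALSE_POSITIVE = 2.5
COST_NO_DECISION = 1.0


def _ratio(numerator: float, denominator: float) -> float:
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def summarize(y_true: Sequence[int],
              y_pred: Sequence[Optional[int]]) -> Dict[str, float]:
    """Summarize predictions where 1 = positive, 0 = negative, None = no decision."""
    tp = tn = fp = fn = undecided = 0
    for truth, pred in zip(y_true, y_pred):
        if pred is None:
            undecided += 1
        elif pred == 1 and truth == 1:
            tp += 1
        elif pred == 0 and truth == 0:
            tn += 1
        elif pred == 1 and truth == 0:
            fp += 1
        else:
            fn += 1
    decided = tp + tn + fp + fn
    return {
        # Assumption: quality measures are taken over decided cases only.
        "accuracy": _ratio(tp + tn, decided),
        "sensitivity": _ratio(tp, tp + fn),
        "specificity": _ratio(tn, tn + fp),
        "decisiveness": _ratio(decided, decided + undecided),
        "total_cost": (fn * COST_FALSE_NEGATIVE
                       + fp * COST_FALSE_POSITIVE
                       + undecided * COST_NO_DECISION),
    }


# Toy data: 8 cases, 2 of them left undecided.
example = summarize([1, 1, 1, 0, 0, 0, 0, 1],
                    [1, 1, 0, 0, 0, 1, None, None])
print(example)
```

Under these assumptions a model with low decisiveness can still show high accuracy, because undecided cases affect only the decisiveness and total cost columns.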
The left part of the following figure compares the total cost on the test set for the aggregation strategies that fulfil medical requirements in the diagnostic process. The right part compares the accuracy, sensitivity, specificity and decisiveness of these aggregation strategies.
Method | Accuracy | Acc. 95% CI | Cost matrix | Cost m. 95% CI | Decisiveness | Dec. 95% CI | Sensitivity | Sen. 95% CI | Specificity | Spec. 95% CI |
---|---|---|---|---|---|---|---|---|---|---|
iMean_(dec_(owa_min))_(cen_0.025) | 0.876 | (0.822-0.923) | 72.0 | (44.767-103.467) | 0.966 | (0.937-0.994) | 0.902 | (0.814-0.980) | 0.864 | (0.805-0.924) |
iMean_ep_(max_0.0)_3 | 0.859 | (0.805-0.906) | 79.5 | (53.033-111.000) | 0.971 | (0.943-0.994) | 0.900 | (0.806-0.979) | 0.842 | (0.777-0.904) |
iMean_wid_(cen_0.025)_2 | 0.865 | (0.813-0.911) | 75.5 | (52.500-105.000) | 0.971 | (0.943-0.994) | 0.918 | (0.837-0.980) | 0.843 | (0.780-0.904) |
mean_(dec_(owa_min))_cen_0.025 | 0.876 | (0.822-0.923) | 72.0 | (44.767-103.467) | 0.966 | (0.937-0.994) | 0.902 | (0.814-0.980) | 0.864 | (0.805-0.924) |
mean_ep_cen_0.0_3 | 0.871 | (0.820-0.913) | 74.5 | (49.267-105.000) | 0.971 | (0.943-0.994) | 0.900 | (0.806-0.979) | 0.858 | (0.798-0.916) |
mean_ep_max_0.0_3 | 0.859 | (0.805-0.906) | 79.5 | (53.033-111.000) | 0.971 | (0.943-0.994) | 0.900 | (0.806-0.979) | 0.842 | (0.777-0.904) |
mean_ep_min_0.0_3 | 0.876 | (0.824-0.919) | 72.0 | (47.267-102.233) | 0.971 | (0.943-0.994) | 0.900 | (0.806-0.979) | 0.867 | (0.806-0.925) |
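
This report does not state how the 95% confidence intervals in the tables above were obtained. One common option is a percentile bootstrap over test cases, sketched below in Python as an assumption rather than a description of the actual procedure. The function name `percentile_bootstrap_ci`, the number of resamples and the toy data are all hypothetical.

```python
import random
from statistics import mean
from typing import Callable, List, Sequence, Tuple


def percentile_bootstrap_ci(values: Sequence[float],
                            statistic: Callable[[List[float]], float],
                            n_resamples: int = 2000,
                            alpha: float = 0.05,
                            seed: int = 0) -> Tuple[float, float]:
    """Percentile bootstrap interval for `statistic` computed on `values`."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(values)
    replicates = []
    for _ in range(n_resamples):
        # Resample the cases with replacement and recompute the statistic.
        resample = [values[rng.randrange(n)] for _ in range(n)]
        replicates.append(statistic(resample))
    replicates.sort()
    lower_index = max(0, round((alpha / 2) * n_resamples))
    upper_index = min(n_resamples - 1, round((1 - alpha / 2) * n_resamples) - 1)
    return replicates[lower_index], replicates[upper_index]


# Toy data: per-case correctness indicators (1 = correct) for 100 decided cases.
correct = [1.0] * 80 + [0.0] * 20
ci_low, ci_high = percentile_bootstrap_ci(correct, mean)
print(f"accuracy = {mean(correct):.3f}, 95% CI = ({ci_low:.3f}-{ci_high:.3f})")
```

The same helper could be applied to any per-case quantity, for example per-case costs with `sum` as the statistic, which would give an interval for a total cost.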
The following tables show the results of McNemar’s test among the selected aggregation strategies and in relation to the uncertaintified models. NaN values indicate that the two methods in a given pair classify identically.
Id | Method | Class | Subclass | Subsubclass |
---|---|---|---|---|
A | iMean_(dec_(owa_min))_(cen_0.025) | Aggregation | OWA | Interval |
B | iMean_ep_(max_0.0)_3 | Aggregation | Mean | Interval |
C | iMean_wid_(cen_0.025)_2 | Aggregation | Mean | Interval |
D | mean_(dec_(owa_min))_cen_0.025 | Aggregation | OWA | Numeric |
E | mean_ep_cen_0.0_3 | Aggregation | Mean | Numeric |
F | mean_ep_max_0.0_3 | Aggregation | Mean | Numeric |
G | mean_ep_min_0.0_3 | Aggregation | Mean | Numeric |
Method | A | B | C | D | E | F | G |
---|---|---|---|---|---|---|---|
A | | | | | | | |
B | 0.955 | | | | | | |
C | 1.000 | 1.000 | | | | | |
D | NaN | 0.955 | 1.000 | | | | |
E | 1.000 | 0.657 | 1.000 | 1.000 | | | |
F | 0.955 | NaN | 1.000 | 0.955 | 0.657 | | |
G | 1.000 | 0.355 | 0.955 | 1.000 | 1.000 | 0.355 | |
Alc | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 |
LR1 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 |
LR2 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 |
RMI | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 |
SM | 0.001 | 0.002 | 0.002 | 0.001 | 0.001 | 0.002 | 0.001 |
Tim | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 |