Description

The empirical cumulative distribution function (ECDF) can be calculated from several runs of an optimization algorithm. Given the first-hit graph of each run, several targets are defined in terms of objective function values, and for each target it is measured when the algorithm first hits it. The proportion of targets hit within a given effort, taken over all runs and targets, forms the distribution.
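A minimal sketch of this calculation, in Python rather than the project's own language. The names `first_hit_time` and `ecdf`, and the representation of a first-hit graph as a list of `(effort, quality)` pairs, are assumptions made for illustration and are not taken from the implementation.

```python
def first_hit_time(graph, target, maximization=False):
    """Return the effort at which the run first reaches `target`, or None.

    `graph` is a first-hit graph: (effort, quality) pairs ordered by effort,
    with quality improving monotonically.
    """
    for effort, quality in graph:
        hit = quality >= target if maximization else quality <= target
        if hit:
            return effort
    return None


def ecdf(graphs, targets, maximization=False):
    """Build the ECDF over all (run, target) pairs.

    Returns a list of (effort, proportion) steps sorted by effort, where
    proportion is the fraction of all run/target pairs hit with at most
    that effort. Pairs that are never hit only count in the denominator.
    """
    hits = []
    for graph in graphs:
        for target in targets:
            t = first_hit_time(graph, target, maximization)
            if t is not None:
                hits.append(t)
    hits.sort()
    total = len(graphs) * len(targets)
    steps = {}
    for i, t in enumerate(hits):
        # For equal efforts the later (larger) proportion overwrites the earlier one.
        steps[t] = (i + 1) / total
    return list(steps.items())


# Two minimization runs, two targets
runs = [
    [(10, 5.0), (20, 3.0), (50, 1.0)],
    [(15, 4.0), (40, 2.0)],
]
curve = ecdf(runs, targets=[4.0, 2.0])
```

If a target is never reached in any run, the curve stays below 1.0, which is the expected shape for an ECDF over unsuccessful runs.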

RLD analysis is to be included through analyzer operators that can be attached to the Analyzers of an algorithm instance. When run, they track the monotonic convergence graph. In addition, a run collection view is to be created that constructs and displays the ECDF.

Review comments

IndexedDataRow

ItemDescription "A data row that contains time series." is somewhat misleading. Isn't it possible that the data row contains data other than time series?

The NameChange should be forwarded to the VisualProperty.DisplayProperties (compare DataRow lines 119-123).

IndexedDataTable

RegisterRowsEvents should not be virtual because it is called in the ctor. IMHO it is better to make it private and non-virtual, and to make the registered methods (e.g. rows_ItemsAdded) protected virtual. As a result, a potential subclass is automatically registered to the relevant events, but can decide how to react to them by overriding the registered methods.

QualityPerEvaluationsAnalyzer

The comments in lines 94 and 101 are misleading. The values are not replaced directly, but rather in the next application of the analyzer. Does this code work if two consecutive improvements are achieved? (I think so.) The code is IMHO pretty complex.

Line 101 checks for inequality between the last item and the best quality.

var improvement = values.Last().Item2 != bestQuality;

What if it's not an improvement but a decrease in quality? Should the graph still be updated and is the variable named incorrectly, or should the best values be kept? The current implementation depends on the bestQuality, which is updated only if an improvement has been achieved.

Why is the data row called first-hit graph? Is there a second hit graph as well?

What is the difference between newEntry and the newly created Tuple?

The same comments apply to the QualityPerExecutionTimeAnalyzer as well.

IndexedDataRow

ItemDescription "A data row that contains time series." is somewhat misleading. Isn't it possible that the data row contains data other than time series?

Changed the description.

The NameChange should be forwarded to the VisualProperty.DisplayProperties (compare DataRow lines 119-123).

Forwarded.

IndexedDataTable

RegisterRowsEvents should not be virtual because it is called in the ctor. IMHO it is better to make it private and non-virtual, and to make the registered methods (e.g. rows_ItemsAdded) protected virtual. As a result, a potential subclass is automatically registered to the relevant events, but can decide how to react to them by overriding the registered methods.

Changed the visibility and the virtual definition. Also fixed a potential bug: RegisterRowsEvents did not register individual row events (e.g. when called after the copy constructor).
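The suggested pattern can be sketched as follows. This is a Python analogue, not the IndexedDataTable code: `Event`, `Table` and `LoggingTable` are hypothetical names, and Python's double-underscore name mangling stands in for a private, non-virtual method.

```python
class Event:
    """Tiny stand-in for an event with subscribe/fire semantics."""

    def __init__(self):
        self._handlers = []

    def subscribe(self, handler):
        self._handlers.append(handler)

    def fire(self, *args):
        for handler in list(self._handlers):
            handler(*args)


class Table:
    def __init__(self):
        self.rows_added = Event()
        self.log = []
        # Name-mangled call: a subclass cannot override the registration
        # itself, so it is safe to call from the constructor.
        self.__register_rows_events()

    def __register_rows_events(self):
        # The handler is an ordinary overridable method; it only runs later,
        # when the event fires, i.e. after construction has finished.
        self.rows_added.subscribe(self._on_rows_added)

    def _on_rows_added(self, rows):
        self.log.append(("base", rows))


class LoggingTable(Table):
    def __init__(self):
        super().__init__()
        self.count = 0

    def _on_rows_added(self, rows):
        # The subclass is registered automatically and only decides how to react.
        super()._on_rows_added(rows)
        self.count += len(rows)


table = LoggingTable()
table.rows_added.fire(["a", "b"])
```

The point of the split is that no overridable code runs while the base constructor is still executing, so the subclass never sees a half-initialized object.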

ExpectedRunTimeHelper

I consider runs to be outliers if they're shorter than the mean runtime minus two standard deviations. I want to exclude them, because result analysis with respect to the shaded area is clearer if longer runs are used.
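The exclusion rule can be sketched like this. The function name `exclude_short_runs` is hypothetical, and the use of the population standard deviation is an assumption; the comment above does not say which variant the helper uses.

```python
from statistics import mean, pstdev


def exclude_short_runs(runtimes):
    """Drop runs shorter than mean - 2 * standard deviation.

    Assumption: population standard deviation (pstdev). With fewer than
    two runs there is nothing to compare against, so all runs are kept.
    """
    if len(runtimes) < 2:
        return list(runtimes)
    threshold = mean(runtimes) - 2 * pstdev(runtimes)
    return [r for r in runtimes if r >= threshold]


# Nine runs of length 100 and one very short run of length 10
kept = exclude_short_runs([100] * 9 + [10])
```

Only the lower tail is cut; unusually long runs are kept, in line with the stated preference for longer runs.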

ErtCalculationResult

Fields should be readonly.

Made them readonly.

ToString depends on the SuccessfulRuns. Wouldn't it be better to set the ExpectedRuntime directly to infinity and let .NET handle the string display?

Did as suggested.
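A sketch of the idea behind both points, in Python. The type `ErtResult` and the function `expected_runtime` are hypothetical, and the formula (total effort of all runs divided by the number of successful runs) is the common definition of expected runtime, assumed here rather than taken from ExpectedRunTimeHelper.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen: fields cannot be reassigned after construction
class ErtResult:
    successful_runs: int
    expected_runtime: float


def expected_runtime(efforts, successes):
    """Compute the expected runtime for one target.

    `efforts[i]` is the effort spent by run i (until the hit if successful,
    otherwise until the run ended); `successes[i]` says whether run i hit
    the target. Assumed formula: sum of all efforts / number of successes.
    """
    n_success = sum(1 for s in successes if s)
    if n_success == 0:
        # No special-casing in string formatting: store infinity directly.
        return ErtResult(0, math.inf)
    return ErtResult(n_success, sum(efforts) / n_success)


some_hits = expected_runtime([100, 200, 300], [True, False, True])
no_hits = expected_runtime([100], [False])
```

Storing infinity in the result means the display code does not need to inspect the number of successful runs.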

QualityPerEvaluationsAnalyzer

I don't understand the second part of the if clause in line 92.

Changed the analyzer to start recording values in the convergence graph only when evaluations is strictly greater than 0.

The comments in lines 94 and 101 are misleading. The values are not replaced directly, but rather in the next application of the analyzer. Does this code work if two consecutive improvements are achieved? (I think so.) The code is IMHO pretty complex.

Yes, the code works in case of consecutive improvements.

Changed the comments.

Line 101 checks for inequality between the last item and the best quality.

var improvement = values.Last().Item2 != bestQuality;

What if it's not an improvement but a decrease in quality? Should the graph still be updated and is the variable named incorrectly, or should the best values be kept? The current implementation depends on the bestQuality, which is updated only if an improvement has been achieved.

Our algorithms and analyzers use the best quality to track the overall progress of the algorithm. If it changes, it means that progress was made.

Why is the data row called first-hit graph? Is there a second hit graph as well?

First-hit means the first time a certain quality is achieved.

What is the difference between newEntry and the newly created Tuple?

The last entry tracks the maximum number of evaluations. I wanted to record that information inside the graph, because I need it in the analysis. Still, I wanted to keep the amount of data low, and thus the last entry is always overwritten with the current number of evaluated solutions.

The same comments apply to the QualityPerExecutionTimeAnalyzer as well.

Adapted the comments there as well.
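The recording scheme described in these answers can be sketched as follows. This is a simplified Python illustration under assumptions, not the analyzer's code: the function `record` and the list-of-tuples graph are hypothetical, and the trailing entry is modelled as a marker that is overwritten on every call.

```python
def record(graph, evaluations, best_quality):
    """Update a first-hit graph of (evaluations, best_quality) entries.

    The graph always ends with a trailing marker holding the current number
    of evaluations. Every call overwrites that marker; only a change in best
    quality adds a new entry, so the amount of data stays low.
    """
    if evaluations <= 0:
        return  # start recording only once something has been evaluated
    if not graph:
        graph.append((evaluations, best_quality))  # first hit
        graph.append((evaluations, best_quality))  # trailing marker
        return
    improvement = graph[-1][1] != best_quality
    # The old marker becomes the first hit (on improvement) or stays the marker.
    graph[-1] = (evaluations, best_quality)
    if improvement:
        graph.append((evaluations, best_quality))  # new trailing marker


graph = []
for evals, quality in [(0, 9.0), (10, 5.0), (20, 5.0), (30, 3.0), (35, 2.0), (50, 2.0)]:
    record(graph, evals, quality)
```

Two consecutive improvements (at 30 and 35 evaluations here) each leave their own first-hit entry, and the final entry always reports the total number of evaluations.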

IndexedDataTableView

As is the DataTableView. I remember that I tried deriving from DataTableView, but then had problems and reimplemented most functionality from scratch. Speaking of which, the improvements made by Philipp to DataTableView are not present in IndexedDataTableView.