The first option should be simpler, since it relies on a built-in mechanism of EMF Compare, but it leaves little room to go beyond what that mechanism was designed for. The second lets us define whatever we want, at the cost of a more complex implementation.

We now detail the two options, their pros and cons, and finally conclude and propose a solution.

Builtin extension mechanism

The built-in extension mechanism is flawed by design. It works as follows:

In your extension Ecore model, you define an EClass that subclasses both AbstractDiffExtension and DiffElement (or any of its subclasses).

You generate the code for this Ecore model and implement the visit(DiffModel), getText(), getImage() and provideMerger() methods.

When DiffService.doDiff() is called, the diff is computed by the best-suited diff engine among the registered ones.

Once the diff is complete, the visit() method of each registered AbstractDiffExtension is called.

There are two problems:

Each AbstractDiffExtension is instantiated only once. How can we assume that a model will contain only one difference of that kind? This cannot work.

Even if AbstractDiffExtension were instantiated the correct number of times, the visit() method takes the whole diff model as input, so the model has to be browsed once per extension. This will not scale.

The good part of this mechanism is the AbstractDiffExtension#hideElements reference and its opposite, DiffElement#isHiddenBy. Together they provide a neat way to hide differences created by the generic engines, since the GUI takes them into account.

AbstractDiffExtension#provideMerger() is also a good idea, as it lets us provide our own merger to the framework. However, it is better to provide mergers through the IMergerProvider extension point.

AbstractDiffExtension#getText() and AbstractDiffExtension#getImage() are bad ideas. Why does EMF Compare not use the EMF Edit framework and its providers to display labels and images? This would work for extensions too.

Implementing a specific Diff/Match engine

The GenericXXXEngine implementations (where XXX is Match or Diff) provide a useful base implementation of EMF model comparison, so we will use them as our direct superclasses.

The core algorithm is well split into several methods, some of which are protected to allow subclassing. We can change the reference/attribute checker to skip the comparison of certain features. The main point is that we can post-process the DiffModel by browsing it only once, which will scale much better than the extension mechanism.

Conclusion

We will use the best of both worlds: the extension mechanism for out-of-the-box UI support of diff extensions, and subclasses of the diff and match engines for working, scalable post-processing of the DiffModel.

The behavior specific to AbstractDiffExtension should be removed from the core, except for the GUI hiding. Moreover, the AbstractDiffExtension#visit(), AbstractDiffExtension#getImage(), AbstractDiffExtension#getText() and AbstractDiffExtension#provideMerger() methods should be deprecated in EMF Compare 1.2 and removed in 2.0.

UML Compare match engine

Only one requirement forces us to subclass the match engine: ignoring the EObjects that store stereotype properties. This is detailed in section #Profile / Stereotype support.

UML Compare diff engine

The GenericDiffEngine subclass will do two things:

Ignore references that are not needed for an accurate comparison.

Post-process the DiffModel to add UML extensions and hide other generic DiffElements.

Subset references

By default, the GenericDiffEngine of EMF Compare ignores the following kinds of EReference (via the ReferencesCheck class):

containment,

container,

derived,

transient.

It ignores containment references because it already checks the contents of each EObject through the reflective eContents() method, so checking containment references as well would produce duplicate differences. UML2 has the notion of subset references. Some of them are not derived (they are not automatically computed from the superset) and subset a containment reference. In EMF, an EObject can be held by only one containment reference, so the subsets of a containment reference are not themselves declared as containment. As a result, GenericDiffEngine checks them for differences, which leads to duplicates. To avoid this, we have to ignore those references.

Because the "subset" information is only available in the uml.ecore file (in an EAnnotation) and is not persisted in the generated code, we have to store it somewhere. We propose a property file containing a set of key/value pairs, where the key is the fully qualified name of the reference: <containingEClass.name>.<reference.name>=true. We reproduce here the file with the selected references:

We will implement a subclass of ReferencesCheck to ignore references described in this file.
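As a minimal sketch of how such a property file could be read and queried, the following plain Java example uses only java.util.Properties. The names SubsetReferenceFilter and shouldIgnore, as well as the two keys in the sample, are hypothetical placeholders that only illustrate the <containingEClass.name>.<reference.name>=true format; they are not taken from the actual file. In the real engine, a lookup of this kind would presumably be called from the ReferencesCheck subclass.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

/** Hypothetical helper: tells whether a reference is listed as an ignored subset reference. */
final class SubsetReferenceFilter {
    private final Properties ignored = new Properties();

    /** Loads entries of the form <containingEClass.name>.<reference.name>=true. */
    SubsetReferenceFilter(Reader source) {
        try {
            ignored.load(source);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns true only when the reference is present in the file with the value "true". */
    boolean shouldIgnore(String containingEClassName, String referenceName) {
        String key = containingEClassName + "." + referenceName;
        return Boolean.parseBoolean(ignored.getProperty(key, "false"));
    }
}

final class SubsetReferenceFilterDemo {
    public static void main(String[] args) {
        // Placeholder keys for illustration only, not the content of the real file.
        String sample = "ContainerA.subsetRef=true\nContainerB.otherRef=false\n";
        SubsetReferenceFilter filter = new SubsetReferenceFilter(new StringReader(sample));
        System.out.println(filter.shouldIgnore("ContainerA", "subsetRef"));
        System.out.println(filter.shouldIgnore("ContainerA", "unlisted"));
    }
}
```

Treating unlisted references as "not ignored" keeps the generic behavior as the default, so only the references explicitly selected in the file are skipped.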

Infrastructure to support UML extensions

EMF Compare extension mechanism does not scale well. We propose here a new set of APIs to extend EMF Compare.

This infrastructure is invoked right after the normal processing of doDiffTwoWay/doDiffThreeWay. Those methods will be overridden: they first call their super implementation and then hand over to our post-processing infrastructure.

First, the diff extension factory registry is initialized and returns a Set of IDiffExtensionFactory. The whole DiffModel is then browsed, and at each step every factory is asked whether it handles the current DiffElement. If it does, it is asked to create an AbstractDiffExtension and to provide the parent DiffElement of the newly created AbstractDiffExtension. The new element is then added to that parent. The create() method is responsible for hiding the DiffElements concerned.

For each AbstractDiffExtension, a factory is implemented, and its handles/create pair of methods defines when and how the extension is created. The handles method should return quickly to minimize the overhead of the extension when it is not relevant to the current element.

With this infrastructure, we are able to provide a scalable system to extend the GenericDiffEngine for UML difference computation.
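The post-processing loop described in this section can be sketched in plain Java as follows. This is an illustration under assumptions, not the EMF Compare API: DiffNode is a simplified stand-in for DiffElement, DiffExtensionFactory is a guess at the shape of IDiffExtensionFactory (the getParentDiff name is hypothetical), and the kind strings are arbitrary labels. The elements are collected in a single traversal and each one is then offered to every factory.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Simplified stand-in for an EMF Compare DiffElement (not the real API). */
class DiffNode {
    final String kind;
    DiffNode parent;
    final List<DiffNode> children = new ArrayList<>();
    final List<DiffNode> hiddenBy = new ArrayList<>();

    DiffNode(String kind) { this.kind = kind; }

    DiffNode add(DiffNode child) {
        child.parent = this;
        children.add(child);
        return child;
    }

    boolean isHidden() { return !hiddenBy.isEmpty(); }
}

/** Hypothetical shape of the factory contract: a cheap test, then the creation. */
interface DiffExtensionFactory {
    boolean handles(DiffNode element);
    DiffNode create(DiffNode element);        // also responsible for hiding the element
    DiffNode getParentDiff(DiffNode element); // where the new extension is attached
}

/** Example factory: every "ReferenceChange" gets a "UmlExtension" that hides it. */
class ReferenceChangeFactory implements DiffExtensionFactory {
    public boolean handles(DiffNode element) {
        return element.parent != null && "ReferenceChange".equals(element.kind);
    }
    public DiffNode create(DiffNode element) {
        DiffNode extension = new DiffNode("UmlExtension");
        element.hiddenBy.add(extension);
        return extension;
    }
    public DiffNode getParentDiff(DiffNode element) { return element.parent; }
}

final class DiffPostProcessor {
    /** Browses the tree once, then lets each factory react to each collected element. */
    static int process(DiffNode root, List<DiffExtensionFactory> factories) {
        // Snapshot first, so that extensions added below are not visited themselves.
        List<DiffNode> snapshot = new ArrayList<>();
        Deque<DiffNode> pending = new ArrayDeque<>();
        pending.push(root);
        while (!pending.isEmpty()) {
            DiffNode current = pending.pop();
            snapshot.add(current);
            for (DiffNode child : current.children) pending.push(child);
        }
        int created = 0;
        for (DiffNode element : snapshot) {
            for (DiffExtensionFactory factory : factories) {
                if (factory.handles(element)) {
                    DiffNode extension = factory.create(element);
                    factory.getParentDiff(element).add(extension);
                    created++;
                }
            }
        }
        return created;
    }
}

final class DiffPostProcessorDemo {
    public static void main(String[] args) {
        DiffNode root = new DiffNode("DiffGroup");
        DiffNode reference = root.add(new DiffNode("ReferenceChange"));
        DiffNode attribute = root.add(new DiffNode("AttributeChange"));
        List<DiffExtensionFactory> factories = new ArrayList<>();
        factories.add(new ReferenceChangeFactory());
        int created = DiffPostProcessor.process(root, factories);
        System.out.println(created + " extension(s) created; reference hidden: "
                + reference.isHidden() + ", attribute hidden: " + attribute.isHidden());
    }
}
```

Taking a snapshot before creating extensions avoids modifying the child lists while they are being traversed, and it keeps newly created extensions from being offered to the factories again.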

Profile / Stereotype support

Two issues must be addressed to properly support UML profiles in the UML compare engine:

Ignore the EObjects that store stereotype properties during the object matching phase of the match engine, and instead include their properties in the "base" EObject when it is matched.

Compute the differences of stereotype application properties and attach them to their base objects.

The second issue is simply addressed by the compare extension mechanism. We will define specific extensions to handle property differences on stereotype applications and to attach them to their base EObject.

The first issue forces us to subclass the GenericMatchEngine. We have two things to do:

First, we have to provide our own IMatchScopeProvider implementation (or a GenericMatchScopeProvider subclass) that returns our own IMatchScope implementation, which ignores the EObjects in the direct contents of the resources that correspond to stereotype applications. This test will be implemented in the isInScope() method of the IMatchScope.

Second, we have to extend the internal contents of each base EObject with its stereotype application EObjects, so that base EObjects are matched with all their properties (including those from the applied stereotypes).
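The two steps above can be illustrated with a small, self-contained Java sketch. It does not use the EMF or UML2 APIs: ModelObject is a simplified stand-in for EObject, the baseElement field is an assumed way of recognizing a stereotype application, and StereotypeAwareScope and StereotypeAwareContents are hypothetical names. The first class plays the role of the isInScope() test, the second extends the contents of a base element with the stereotype applications that target it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Simplified stand-in for an EObject (not the EMF API). */
class ModelObject {
    final String name;
    /** Non-null only for stereotype applications: the element the stereotype is applied to. */
    final ModelObject baseElement;
    final List<ModelObject> contents = new ArrayList<>();

    ModelObject(String name, ModelObject baseElement) {
        this.name = name;
        this.baseElement = baseElement;
    }
}

/** Plays the role of the isInScope() test: stereotype applications are not matched on their own. */
final class StereotypeAwareScope {
    boolean isInScope(ModelObject candidate) {
        return candidate.baseElement == null;
    }
}

/** Extends the contents of a base element with the stereotype applications that target it. */
final class StereotypeAwareContents {
    static List<ModelObject> contentsOf(ModelObject base, List<ModelObject> resourceRoots) {
        List<ModelObject> result = new ArrayList<>(base.contents);
        for (ModelObject root : resourceRoots) {
            if (root.baseElement == base) {
                result.add(root);
            }
        }
        return result;
    }
}

final class StereotypeScopeDemo {
    public static void main(String[] args) {
        ModelObject model = new ModelObject("Model", null);
        ModelObject customer = new ModelObject("Customer", null);
        model.contents.add(customer);
        // The stereotype application sits at the root of the resource, next to the model.
        ModelObject entityApplication = new ModelObject("Entity application", customer);
        List<ModelObject> resourceRoots = Arrays.asList(model, entityApplication);

        StereotypeAwareScope scope = new StereotypeAwareScope();
        for (ModelObject root : resourceRoots) {
            System.out.println(root.name + " in scope: " + scope.isInScope(root));
        }
        System.out.println("Customer contents: "
                + StereotypeAwareContents.contentsOf(customer, resourceRoots).size());
    }
}
```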

UML Extensions

The following tables list, for the elements of all diagrams supported by Papyrus, whether they need to be handled by a specific extension in the UML comparison engine.

No: there is no need to implement an extension.

?: it is still unknown whether an extension is needed (because we did not find a way to create the element from the Papyrus editor).

TODO: an extension needs to be implemented.

Done: an extension has been implemented.

Class diagram

Nodes | Addition | Removal
Class | No | No
ClassifierTemplateParameter | No | No
Comment | ? | ?
Component | No | No
Constraint | ? | ?
DataType | No | No
DurationObservation | ? | ?
Enumeration | No | No
EnumerationLiteral | No | No
Interface | No | No
InstanceSpecification | No | No
PrimitiveType | No | No
Model | No | No
Operation | No | No
OperationTemplateParameter | ? | ?
Package | No | No
Property | No | No
Reception | ? | ?
RedefinableTemplateSignature | ? | ?
Signal | ? | ?
Slot | ? | ?
TemplateParameter | ? | ?
TemplateSignature | ? | ?
TimeObservation | ? | ?
ConnectableElementTemplateParameter | ? | ?

Edges | Addition | Removal
Abstraction | TODO | TODO
Association | TODO | TODO
AssociationBranch | TODO | TODO
AssociationClass | TODO | TODO
ContainmentLink | TODO | TODO
Dependency | TODO | TODO
DependencyBranch | TODO | TODO
ElementImport | No | No
Generalization | No | No
GeneralizationSet | TODO | TODO
InstanceSpecificationLink | ? | ?
InterfaceRealization | TODO | TODO
Link | No | No
PackageImport | No | No
PackageMerge | No | No
ProfileApplication | cf. profile support | cf. profile support
Realization | TODO | TODO
Substitution | TODO | TODO
Usage | TODO | TODO
TemplateBinding | No | No

Package diagram

Nodes | Addition | Removal
Package | No | No

Edges | Addition | Removal
Dependency | TODO | TODO
PackageImport | No | No

Composite diagram

Nodes | Addition | Removal
Port | No | No
Collaboration | No | No
CollaborationRole | No | No
CollaborationUse | No | No
InformationItem | No | No
Parameter | No | No
Activity | No | No
Interaction | No | No
ProtocolStateMachine | No | No
StateMachine | No | No
FunctionBehavior | No | No
OpaqueBehavior | No | No

Edges | Addition | Removal
Connector | No | No
RoleBinding | ? | ?
Representation | ? | ?
InformationFlow | No | No

Use case diagram

Nodes | Addition | Removal
Subject | No | No
Actor | No | No
UseCase | No | No

Edges | Addition | Removal
Include | No | No
Extend | TODO | TODO
ConstrainedElement | No | No

Sequence diagram

Nodes | Addition | Removal
Lifeline | ? | ?
ActionExecutionSpecification | TODO | TODO
BehaviorExecutionSpecification | TODO | TODO
InteractionUse | No | No
CombinedFragment | No | No
InteractionOperand | No | No
Continuation | ? | ?
StateInvariant | No | No
CoRegion | No | No
TimeConstraint | TODO | TODO
DurationConstraint | ? | ?
DestructionEvent | TODO | TODO

Edges | Addition | Removal
MessageSync | TODO | TODO
MessageAsync | TODO | TODO
MessageReply | ? | ?
MessageCreate | TODO | TODO
MessageDelete | TODO | TODO
MessageLost | TODO | TODO
MessageFound | TODO | TODO
GeneralOrdering | ? | ?
CommentLink | No | No
ConstraintLink | No | No

Activity diagram

Nodes | Addition | Removal
InitialNode | No | No
ActivityNode | No | No
FlowFinal | No | No
DecisionNode | No | No
MergeNode | No | No
JoinNode | No | No
ForkNode | No | No
ActivityParameterNode | ? | ?
DataStoreNode | No | No
OpaqueAction | No | No
CallBehaviorAction | ? | ?
CallOperationAction | ? | ?
SendObjectAction | No | No
SendSignalAction | ? | ?
AcceptEventAction | ? | ?
ValueSpecificationAction | No | No
ReadSelfAction | No | No
InterruptibleActivityRegion | No | No
StructuredActivityNode | No | No
ConditionalNode | No | No
ExpansionRegion | No | No
LoopNode | No | No
SequenceNode | No | No
OutputPin | No | No
InputPin | No | No
ActionInputPin | No | No
ValuePin | No | No
InputExpansionNode | No | No
OutputExpansionNode | No | No
LocalPreconditionConstraint | No | No
LocalPreconditionIntervalConstraint | No | No
LocalPreconditionDurationConstraint | No | No
LocalPostconditionConstraint | No | No
LocalPostconditionIntervalConstraint | No | No
LocalPostconditionDurationConstraint | No | No
LocalPostconditionTimeConstraint | No | No
Activity | No | No

Edges | Addition | Removal
ControlFlow | No | No
ObjectFlow | No | No
ExceptionHandler | No | No
Link | No | No

State diagram

Nodes | Addition | Removal
Region | ? | No
State | No | No
Initial | No | No
FinalState | No | No
ShallowHistory | No | No
DeepHistory | No | No
Fork | No | No
Join | No | No
Choice | No | No
Junction | No | No
EntryPoint | No | No
ExitPoint | No | No
Terminate | No | No
ConnectionPointReference | ? | ?

Edges | Addition | Removal
Transition | No | No

Communication diagram

Nodes | Addition | Removal

Edges | Addition | Removal
Message | TODO | TODO

Merger

The mergers of extensions are traditionally provided by the AbstractDiffExtension#provideMerger() method. But, as we said in section [#Builtin_extension_mechanism], this is a poor substitute compared to the IMergerProvider extension point.

If we choose to do that, we have to change the behavior of the API method