Technical description

Algorithm

Selection

Set1 :

Set2 :

All the elements of Set1 with own text, or with more than 1 child, or with only one child not of type img or object (where "img, object[type^=image], object[data^=data:image], object[data$=png], object[data$=jpeg], object[data$=jpg], object[data$=bmp], object[data$=gif]" returns empty)

Set3 :

All the elements of Set2 with non-empty text.
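The Set2 and Set3 filters can be sketched as follows. This is a minimal illustration, not the rule's actual implementation: it takes Set1 as an already-built list of elements, parses a well-formed fragment with Python's `xml.etree.ElementTree`, and approximates the selector by testing only the single direct child. The helper names (`is_image_like`, `has_own_text`, `in_set2`, `in_set3`) are hypothetical.

```python
import xml.etree.ElementTree as ET

# Suffixes taken from the object[data$=...] parts of the selector.
IMAGE_SUFFIXES = ("png", "jpeg", "jpg", "bmp", "gif")


def is_image_like(el):
    """Approximate the selector: img, or an object that looks like an image."""
    if el.tag == "img":
        return True
    if el.tag == "object":
        type_attr = el.get("type", "")
        data_attr = el.get("data", "")
        return (type_attr.startswith("image")
                or data_attr.startswith("data:image")
                or data_attr.endswith(IMAGE_SUFFIXES))
    return False


def has_own_text(el):
    """Own text: text directly inside the element, not inside its children."""
    if (el.text or "").strip():
        return True
    return any((child.tail or "").strip() for child in el)


def in_set2(el):
    """Own text, or more than one child, or a single non-image child."""
    children = list(el)
    if has_own_text(el) or len(children) > 1:
        return True
    return len(children) == 1 and not is_image_like(children[0])


def in_set3(el):
    """Set2 element whose full text content is not empty."""
    return in_set2(el) and "".join(el.itertext()).strip() != ""


# Assumption: Set1 is given; here it is simply every child of a sample fragment.
fragment = ET.fromstring(
    '<div>'
    '<a href="a.html">my link</a>'
    '<a href="b.html"><img src="x.png"/></a>'
    '<a href="c.html"><span> </span></a>'
    '</div>'
)
set1 = list(fragment)
set2 = [a for a in set1 if in_set2(a)]
set3 = [a for a in set2 if in_set3(a)]
```

In this sample, the image-only link is dropped from Set2, and the link whose only child holds whitespace is kept in Set2 but dropped from Set3.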

Process

Test1

For each element of Set3, we check whether the link content doesn't belong to the text link blacklist.

For each element returning false in Test1, raise MessageA; otherwise, raise MessageB.

Test2

For each element of Set3, we check whether the link content doesn't only contain non-alphanumeric characters.

For each element returning false in Test2, raise MessageA; otherwise, raise MessageB.
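A minimal sketch of Test1 and Test2 applied to one link text. The blacklist entries, the whitespace and case normalization, and the function names are assumptions made for illustration; the rule's real blacklist and comparison logic are not given here.

```python
# Hypothetical blacklist entries; the real list belongs to the rule's configuration.
LINK_TEXT_BLACKLIST = {"click here", "here", "read more", "more"}

MESSAGE_A = "UnexplicitLink"
MESSAGE_B = "CheckLinkWithoutContextPertinence"


def passes_test1(link_text):
    """True when the normalized link text is not in the blacklist."""
    return link_text.strip().lower() not in LINK_TEXT_BLACKLIST


def passes_test2(link_text):
    """True when the link text has at least one alphanumeric character."""
    return any(ch.isalnum() for ch in link_text)


def messages_for(link_text):
    """One message code per test: MessageA on false, MessageB otherwise."""
    results = (passes_test1(link_text), passes_test2(link_text))
    return [MESSAGE_B if passed else MESSAGE_A for passed in results]
```

Under these assumptions, a text such as "click here" fails Test1 only, while a text such as ">>>" fails Test2 only.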

MessageA : Unexplicit Link

code : UnexplicitLink

status: Failed

parameter : link text, title attribute, snippet

present in source : yes

MessageB : Check link without context pertinence

code : CheckLinkWithoutContextPertinence

status: Need More Info

parameter : link text, title attribute, snippet

present in source : yes

Analysis

NA :

Set1 is empty (the page has no combined links)

Failed :

Test1 OR Test2 returns false for at least one element (at least one element of Set2 has a text content that is blacklisted or that only contains non-alphanumeric characters)

NMI :

In all other cases
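The decision order above (NA, then Failed, then NMI) can be sketched as a small function. The name `analyse` and the shape of its inputs are hypothetical: `set1_size` is the number of elements in Set1, and `test_results` holds one `(test1_passed, test2_passed)` pair per tested element.

```python
def analyse(set1_size, test_results):
    """Return the overall result: "NA", "Failed" or "NMI"."""
    # NA: Set1 is empty, so there is nothing to test.
    if set1_size == 0:
        return "NA"
    # Failed: Test1 OR Test2 returned false for at least one element.
    if any(not t1 or not t2 for t1, t2 in test_results):
        return "Failed"
    # NMI: all other cases.
    return "NMI"
```

Checking the NA condition first keeps the other two branches from having to handle an empty page.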

Notes

We assume here that the links are composed only of text. (<a href="https://asqatasun.org/target.html"> my link</a>)

All the links that have children other than img or object are considered combined links.