What are the elements of social media that can be used to facilitate automated discovery of high-quality content?

What is the utility of links between items, quality ratings from members of the community, and other non-content information for the task of estimating the quality of UGC?

How are these different factors related?

Is content alone enough for identifying high-quality items?

Can community feedback approximate judgments of specialists?

In this work, the authors used a judged question/answer collection, in which good questions usually have good answers, to train classifiers that predict good questions and good answers, obtaining an AUC (area under the curve of the precision-recall graph) of 0.76 and 0.88, respectively.
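To make the reported metric concrete, here is a minimal sketch of how an area under the precision-recall curve could be computed from judged labels and classifier scores. The function names and the toy data in the usage note are illustrative assumptions, not the authors' implementation or data. The step-wise summation used here is the average-precision approximation of the area, and tied scores are not treated specially.

```python
def precision_recall_points(labels, scores):
    """Return (recall, precision) points, one per item, ranked by descending score.

    labels: 1 for a high-quality item (per the judges), 0 otherwise.
    scores: classifier confidence that the item is high quality.
    Assumes at least one positive label.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    total_positives = sum(labels)
    points = []
    true_positives = 0
    for rank, (_, label) in enumerate(ranked, start=1):
        true_positives += label
        recall = true_positives / total_positives
        precision = true_positives / rank
        points.append((recall, precision))
    return points


def average_precision(labels, scores):
    """Step-wise area under the precision-recall curve."""
    area = 0.0
    previous_recall = 0.0
    for recall, precision in precision_recall_points(labels, scores):
        # Only steps where recall increases (a positive item) add area.
        area += (recall - previous_recall) * precision
        previous_recall = recall
    return area
```

For example, `average_precision([1, 1, 0, 0], [0.9, 0.8, 0.3, 0.1])` ranks both good items first and so yields the maximum value of 1.0; any good item ranked below a poor one lowers the score.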

The drawback is that the quality gap is balanced by volume: the larger the volume of the UGC, the less difficult the quality evaluation becomes.

The Online Entities Quality Challenge

The advent and openness of online social media platforms often leave them highly susceptible to abuse by suspicious entities. It therefore becomes increasingly important to automatically identify these suspicious entities and to mitigate or eliminate their threats.

Fundamental Definitions

“Data Quality” is described as data that is “Fit-for-use”: data considered appropriate for one use may not possess sufficient attributes for another use!

Common Dimensions of Information or Data Quality

Accuracy: extent to which data are correct, reliable and certified free of error

Consistency: extent to which information is presented in the same format and compatible with previous data

Security: extent to which access to information is restricted appropriately to maintain its security

Timeliness: extent to which the information is sufficiently up-to-date for the task at hand

Completeness: extent to which information is not missing and is of sufficient breadth and depth for the task at hand

Conciseness: extent to which information is compactly represented without being overwhelming (i.e. brief in presentation, yet complete and to the point)

Reliability: extent to which information is correct and reliable

Accessibility: extent to which information is available, or easily and quickly retrievable

Availability: extent to which information is physically accessible

Objectivity: extent to which information is unbiased, unprejudiced and impartial

Relevancy: extent to which information is applicable and helpful for the task at hand

Usability: extent to which information is clear and easily used

Understandability: extent to which data are clear without ambiguity and easily comprehended

Amount of data: extent to which the quantity or volume of available data is appropriate

Believability: extent to which information is regarded as true and credible

Navigation: extent to which data are easily found and linked to

Reputation: extent to which information is highly regarded in terms of source or content

Usefulness: extent to which information is applicable and helpful for the task at hand

Efficiency: extent to which data are able to quickly meet the information needs for the task at hand

Value-Added: extent to which information is beneficial, provides advantages from its use

These attributes of data quality can vary depending on the context in which the data is to be used.
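As a rough illustration of how two of these dimensions might be turned into measurable checks, the sketch below scores completeness as the share of required fields that are filled in, and timeliness as whether a record is recent enough for a given task. The function names, the field names, and the age thresholds are hypothetical, chosen only for illustration; real definitions depend on the use case.

```python
from datetime import datetime, timedelta


def completeness(record, required_fields):
    """Fraction of the required fields that are present and non-empty."""
    filled = sum(
        1 for field in required_fields if record.get(field) not in (None, "")
    )
    return filled / len(required_fields)


def is_timely(last_updated, now, max_age):
    """True when the record is sufficiently up to date for the task at hand."""
    return now - last_updated <= max_age


# Hypothetical record: the same data can be fit for one use and not another.
record = {"title": "How do I reset my router?", "body": "", "author": None}
updated = datetime(2024, 1, 1)
now = datetime(2024, 1, 10)

archive_ok = is_timely(updated, now, max_age=timedelta(days=30))  # lenient task
news_ok = is_timely(updated, now, max_age=timedelta(days=1))      # strict task
share_filled = completeness(record, ["title", "body", "author"])
```

Here the nine-day-old record passes the 30-day threshold but fails the one-day threshold, which mirrors the "fit-for-use" idea: the data is unchanged, only the context of use differs.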

Defining what Information Quality means in the context of Search Engines will depend greatly on whether dimensions are being identified for the producers of information, the storage and maintenance systems used for information, or for the searchers and users of information.

Consider the information user: the quality dimensions of interest to them include relevancy and usefulness. These dimensions are enormously important but extremely difficult to gauge.
