Abstract—Automatic classification of virus samples into a concept hierarchy has attracted considerable attention from the malware research community. Such a hierarchy would give anti-virus experts a clear and systematic view of the landscape of virus samples, whose numbers have been increasing rapidly. However, this is not a trivial task, since malware usually comes in binary form and its actions are complicated and obfuscated. Consequently, typical data mining approaches based on feature extraction are not easily applied.
In this paper, we introduce an approach that uses Formal Concept Analysis (FCA) to generate a malware hierarchy. Since virus behaviours are often described effectively in temporal logic, we extend the formal paradigm of FCA with Logical Concept Analysis (LCA), in which concepts are generalized by logic formulas. We further extend basic LCA to Viral Logical Concept Analysis (V-LCA), in which abstraction techniques are applied to the formal concepts representing virus samples. We have applied our approach to a real dataset of viruses and obtained promising experimental results.
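
As a point of reference, the basic FCA step of deriving a concept hierarchy from a binary object-attribute context can be illustrated on a toy example. The following is a minimal Python sketch and not the method of this paper: the sample names and behaviour attributes are hypothetical, and it works on plain attribute sets rather than on the temporal-logic formulas of LCA or the abstraction used in V-LCA. It enumerates concepts by brute force, which is only practical for very small contexts.

```python
from itertools import combinations

# Hypothetical toy context: each (invented) sample maps to the set of
# behaviours it exhibits. Names are illustrative assumptions only.
CONTEXT = {
    "sample_a": {"infects_files", "modifies_registry"},
    "sample_b": {"infects_files", "sends_email"},
    "sample_c": {"infects_files", "modifies_registry", "sends_email"},
}


def extent(attrs, context):
    """Return the objects that have every attribute in attrs."""
    return frozenset(o for o, owned in context.items() if attrs <= owned)


def intent(objs, context):
    """Return the attributes shared by every object in objs."""
    all_attrs = set().union(*context.values())
    return frozenset(a for a in all_attrs
                     if all(a in context[o] for o in objs))


def formal_concepts(context):
    """Enumerate all formal concepts as (extent, intent) pairs.

    Every attribute subset is closed via extent followed by intent, so
    the cost is exponential in the number of attributes.
    """
    all_attrs = sorted(set().union(*context.values()))
    concepts = set()
    for size in range(len(all_attrs) + 1):
        for subset in combinations(all_attrs, size):
            objs = extent(frozenset(subset), context)
            concepts.add((objs, intent(objs, context)))
    return concepts


if __name__ == "__main__":
    # Print concepts from the most general (largest extent) downwards.
    for objs, attrs in sorted(formal_concepts(CONTEXT),
                              key=lambda c: -len(c[0])):
        print(sorted(objs), sorted(attrs))
```

Ordering the resulting concepts by inclusion of their extents yields the hierarchy: the most general concept groups all samples under their shared behaviours, and more specific concepts group fewer samples under more behaviours.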