Ontology Classification for Semantic-Web-Based Software Engineering

Yajing Zhao, Jing Dong, Senior Member, IEEE, and Tu Peng

Abstract: The Semantic Web is the second generation of the Web, which helps share and reuse data across application, enterprise, and community boundaries. Ontology defines a set of representational primitives with which a domain of knowledge is modeled. The main purpose of the Semantic Web and ontology is to integrate heterogeneous data and enable interoperability among disparate systems. Ontology has been used to model software engineering knowledge by denoting the artifacts that are designed or produced during the engineering process. The Semantic Web allows publishing reusable software engineering knowledge resources and providing services for searching and querying. This paper classifies the ontologies developed for software engineering, reviews the current efforts on applying the Semantic Web techniques to different software engineering aspects, and presents the benefits of their applications. We also foresee the possible future research directions.

Index Terms: Semantic Web, ontology, business requirement modeling, requirements for enterprise modeling.

1 INTRODUCTION

The Semantic Web is the second generation of the Web to share and reuse data across application, enterprise, and community boundaries. One important component of the Semantic Web is ontology, which originally refers to the study of existence. In computer science and information technology, it is a formal representation of a set of concepts within a domain and the relationships between these concepts. The representation is at the semantic level, independent from data structure and implementation. Used to define a domain of knowledge, it enables reasoning about the properties of the domain. In addition, it enables the integration of heterogeneous data and the interoperability among disparate systems. The W3C standards for the Semantic Web include the Web Ontology Language (OWL) [46], Resource Description Framework (RDF) [44], etc. OWL
is an ontology specification language and RDF is a language for describing resources that exist on the Web. The Semantic Web overcomes the disadvantages of natural language by representing content formally, precisely, and unambiguously. In addition, it maintains information in a format that can be understood and processed by automated tools [9]. It provides an integrated framework so that information can be well organized, widely published, broadly shared, easily retrieved, and simply integrated.

Software development is a complex process which produces a large amount of information. Systematic, disciplined, and quantifiable approaches are required for the development, operation, and maintenance of software systems [43]. Effort has been made to improve the software process. For example, modeling languages have been proposed to better represent system designs; advanced Integrated Development Environments (IDEs) have been produced to facilitate implementation; and other types of Computer-Aided Software Engineering (CASE) tools have been developed to manage different types of software artifacts. However, software development remains difficult in several ways.

First, the software process involves a lot of effort in acquiring and producing information. Reusing existing information saves effort. The scope of reuse has been expanded from reusing pieces of code to reusing all kinds of information, such as requirements, project processes, and software designs. However, the current representation of such information makes it hard to manage, retrieve, and reuse. A method that facilitates information retrieval and promotes reuse is in high demand.

Second, due to globalization, software systems are usually developed among teams that are geographically dispersed. Consequently, diversity exists among the processes used by different teams. For example, teams at different locations may use diverse processes and may also possess different sets of knowledge. Information sharing helps to prevent inconsistency.

Information sharing and reuse have the
following benefits: improving productivity, shortening development life cycle, decreasing cost, and increasing product quality. The Semantic Web provides a way to improve information sharing and reuse. Ontology has been used to model software engineering knowledge by denoting the artifacts designed or produced during the engineering process. The Semantic Web allows publishing reusable software engineering knowledge resources. With the support of software engineering related ontologies, the Semantic Web allows the query and reasoning on software engineering knowledge manually or by software agents.

In this paper, we review the ontologies for software engineering, provide a classification of these ontologies, and discuss their relationships. The goal of this work is to provide a generic and systematic set of ontologies, which enables the application of the Semantic Web techniques. We review the current effort of using the ontologies in the Semantic-Web-based software engineering. We examine how the Semantic Web technologies are used to solve different problems in software engineering. We classify the existing approaches according to their usage of the ontologies. The goal of this work is to present the current state of this research area. We discuss our own view on the application of the Semantic Web technologies and point out the future research directions.

The rest of this paper is organized as follows: Section 2 surveys the ontologies that can be built for software engineering. Some of the ontologies are used or mentioned by existing research and others are identified by us. We also provide a schema to classify the ontologies. Section 3 discusses how the Semantic Web techniques can be used to solve problems in software engineering. Finally, Sections 4 and 5 cover the discussions and conclusions.

IEEE TRANSACTIONS ON SERVICES COMPUTING, VOL. 2, NO. 4, OCTOBER-DECEMBER 2009 303
The authors are with the University of Texas at Dallas, PO Box 830688, EC 31, Richardson, TX 75083. E-mail: {yxz045100, jdong, txp051000}@utdallas.edu.
Manuscript received 11 Dec. 2008; revised 7 Mar. 2009; accepted 24 June 2009; published online 1 July 2009.
For information on obtaining reprints of this article, please send e-mail to: tsc@computer.org, and reference IEEECS Log Number TSC-2008-12-0108.
Digital Object Identifier no. 10.1109/TSC.2009.20.
1939-1374/09/$25.00 2009 IEEE. Published by the IEEE Computer Society.

2 ONTOLOGIES FOR SOFTWARE ENGINEERING

The goal of the Semantic Web is to allow the Web contents to be understood by both human and software agents. Thus, Web contents not only have to be presented in a formal way but also need semantic information. Ontology is a formal representation for the concepts within a domain and the semantic relationship between concepts. It consists of a taxonomy and a set of inference rules. It has been used in the Semantic Web as a formal vocabulary. It allows the software agents to understand the Web contents through making logical conclusions.

Software engineering is a complex process that involves a large amount of information. The amount and complexity of the information make the process difficult to manage. The Semantic Web and ontology techniques have been applied in software engineering with the following advantages [18]: first, to formalize the information; second, to provide broad access from different
physical locations; third, to provide universal retrieving service that applies to all types of software knowledge; and fourth, to allow comparison and matching of knowledge or concepts. The approach of applying the Semantic Web and ontology techniques in software engineering typically includes the following steps: first, information resources about software development methodologies, techniques, supporting tools as well as the software itself are gathered; second, the concepts appearing in the resources are classified hierarchically and built into ontologies using ontology languages; and finally, information resources are published on the Web either for people to read or for software agents to process.

In the following sections, we survey the ontologies for software engineering. In each section, we discuss a software activity and the related ontologies. The organization of the ontologies according to their related software activities makes our presentation clear. In the last section, we discuss the relationship between the ontologies and provide a classification of them.

2.1 Software Process Ontology

Software engineering is a complex process. Inexperienced engineers may be unclear about their tasks, their goals, and the processes they should follow. A well-defined, complete, and disciplined process can be quite helpful for them. On the other hand, experienced engineers or managers may have different opinions about the project plans or processes, which causes conflicts. A well-defined process serves as a standard when there is a disagreement. In addition, since it becomes common that collaboration and cooperation are needed between software teams at different physical locations, communication becomes even harder. To improve the communication, process can be modeled and shared on the Web. The Semantic Web techniques also allow process automation, dynamic assembly and tailoring of process elements, formal reasoning and query about activity specifications, and reusable work products. It creates the opportunity to measure the process
usage, ensures the process conformance, and suggests future process improvement options [19].

Ontology 1. Software process ontology [28], [19], [33], which defines software activities, process phases, and existing process models. Each process phase can be defined by a sequence of activities, and each process model can be defined by process phases. In addition, each activity can be associated with software artifacts it produces. The structure of the ontology can be illustrated by Fig. 1.

In Fig. 1, each oval with solid border represents a concept, each directed line represents a relationship between two concepts, and the text on the line denotes the type of the relationship. Instead of a concrete concept, an oval with dashed lines represents an ontology, which can be seen as a metaconcept. These ontologies are connected to other concepts/metaconcepts by dashed lines with arrowheads. The reason that these ontologies are included in this figure will be explained in Section 2.11. For simplicity, some information has been omitted in Fig. 1. For example, the lines that do not have attached text represent the subClassOf relationships. The relationship that the WriteCode activity produces the Code artifact is not displayed in the figure.

Fig. 1. Partial software process ontology.

Building an ontology is not an easy task; it involves the selection of appropriate concepts and semantic relations. In order to achieve high-quality process ontology, existing software process models and related information have to be studied. The international standards of software process models, such as CMMI and ISO/IEC 15504, are good resources [28]. By learning and comparing the basic components and activities in the standard process model, slight differences between the models can be found. Different standards are defined
by using different sets of terms. Different terms from two vocabularies may have the same meaning, whereas the same term in two vocabularies may represent different meanings. Thus, the vocabulary sets cannot be simply combined. To establish a complete process model, a mapping has to be created for each pair of items that share the same meaning. In this way, the overlapping part of the standards as well as the different part can be identified, and a complete set of concepts can be achieved after combining the concept sets and filtering out the duplicate concepts.

Besides the CMMI model and the ISO/IEC 15504 model, there are many other standard process models. Variation exists among these models which cannot be covered by one process ontology. It is proposed to have a general process ontology, which can be extended to cover different existing models [28]. In addition, software teams create further process models by tailoring the standard models to serve their own needs. Accordingly, the general process ontology can be tailored to a project-specific process ontology [38] which defines rules and logic within a specific application development project. Compared with the general ontology, which has a broader application scope, the project-specific ontologies are applicable to fewer development processes and within smaller communities. More discussions about the ontology application scopes can be found in Section 2.11.

2.2 Domain Ontologies

Domain engineering is the process of defining a scope for software application domain, analyzing the important concepts within the domain and their relationships, specifying the domain structure, and building the reusable components. Domain engineering collects useful information within a specific domain, which can be maintained and reused in future application development in the same domain. Reusing domain information may reduce time and save the effort of gathering information. The following ontologies related to domain information can be created.

Ontology 2. Application domain ontology
[1], [4], [8], [17], [33], which represents the knowledge of an application domain and the business information required for building software applications in a specific domain. It also includes the semantic relationships established among their concepts from a real-world domain point of view.

Systems in the same domain share common features. Software features differentiate software applications in different domains as well as their engineering processes [36]. Capturing common features for domain applications facilitates the domain engineering as well as software engineering.

Ontology 3. Application domain feature model ontology, which models the features of software systems in the same application domain. In addition, it maintains the relationships between features, such as mandatory, optional, alternative, or, requires, and excludes [36].

2.3 Requirement Ontology

Requirement specifications are the descriptions of the desired software characteristics specified by the customer. There are two types of requirements: functional requirements (FRs) and nonfunctional requirements (NFRs). Each FR can be viewed as a sequence of behaviors/actions that the system performs under a particular context [8]. On the other hand, NFRs represent the quality-related characteristics of a system.

Ontology 4. System behavior ontology [8], which models system behaviors, the actions that the system performs under certain scenarios. The main concepts of this ontology include event, action, reservation, etc. Relationships include making agreement, making reservation, etc.

2.4 Architecture and Design Ontologies

Software architecture is a description of the system components and how they interact with each other at a high abstraction level. There are some common architecture styles, such as pipe and filter, data abstraction and object-oriented organization, and layered systems. Each architecture style includes a set of components and their interactions. It has some constraints on its components as well as on the interactions between the components. In addition, each style has
its advantages and disadvantages. In order to formally specify the architecture styles, software-architecture-related information can be defined by the following ontology.

Ontology 5. Software architecture ontology [23], which models architecture-related concepts, such as architectural styles, components that are required in each architectural style, and their interactions.

A number of modeling techniques have been proposed to facilitate the software design. For example, the control flow diagram (CFD) can be used to model the execution flow of a system; the data flow diagram (DFD) can be used to model data processing systems visually. Petri nets can be used to model discrete distributed systems by describing places and transitions. In addition, state diagrams together with preconditions and postconditions can also be used to model the behavior of software systems. Such modeling techniques and their related concepts can be defined in the application logic ontology.

Ontology 6. Application logic ontology [13], which defines concepts that are used to model the logic behind the application behaviors. For example, CFD concepts, such as asynchronous calls, synchronous calls, fork, and merge, could be included in this ontology.

Object-oriented programming languages have been adopted in many software applications. UML is a good modeling language for object-oriented software systems. The following ontology can be built to capture the information in UML diagrams.

Ontology 7. Object-Oriented design ontology [11], [22], which defines the vocabulary for describing object-oriented software designs. The concepts in this ontology include classes, interfaces, methods, and attributes. The relationships include inheritance, realization, etc.

2.5 Pattern Ontology

Software patterns capture expert experience and document flexible designs and development that may change over time [16]. Currently, there are several kinds of patterns, such as Gang of Four (GoF) design patterns, usability design patterns, and Web application patterns. GoF design patterns [16] provide experienced design practices for object-oriented software applications. Usability patterns [20] have been used to distribute usability knowledge and ensure a degree of consistency across applications. Web application patterns describe the experience of recurring problems in Web applications [25].

Patterns are usually described in natural languages or diagrams. Since the definition is informal, there are a number of variants for each design pattern. Developers apply design patterns according to their own understanding. Researchers discover design patterns from existing software source code based on their own interpretation. Thus, there is a demand for the formal definition of patterns.

The Semantic Web techniques, such as OWL and RDF, can be used to formally describe the patterns [11], [12]. Using OWL and RDF, a pattern definition itself can be an instance of the OWL class represented by a set of RDF statements. The characteristics of a pattern, such as the problem it solves, the solution it provides, the context it
is used in, and the alternatives, can be defined through RDF properties. Each pattern consists of several participants. For example, a design pattern usually involves several classes which interact with each other. These participants can also be represented as OWL classes and be associated with the pattern. Thus, the following ontology can be defined.

Ontology 8. Pattern ontology [20], [21], which aims at providing a catalogue for patterns, including software design patterns, usability patterns, Web application patterns, etc. Patterns are defined as OWL concepts and are placed as subconcepts of their corresponding categories. The patterns are connected with their participants by the hasParticipant relationship. The design of this ontology can be illustrated by Fig. 2.

Although the pattern ontology is not completely defined, we illustrate the idea of providing a unified pattern repository. Kamthan and Pai [25] try to define the ontology for Web Application Patterns (OWAP), which can be included as a subclass of the SoftwarePatterns.

2.6 Implementation Ontologies

The software development process requires a large amount of information and produces a large number of software artifacts. When information is not well organized, and thus becomes unsearchable or irretrievable, it becomes useless. Therefore, it is necessary to organize the information in an easy-to-retrieve manner. Defining the following ontologies helps to organize the artifacts produced during implementation.

Ontology 9. Software artifact ontology [1], [4], [17], which provides a set of concepts that permit the classification of different artifacts according to their formats and internal structures, such as text files, diagrams, maps, images, tables, and code, or according to their types, such as source code, design, and documentation. The ontology also includes artifact-related concepts, such as person, who might be the creator, and project, where an artifact might be produced.

As discussed in the previous section, object-oriented programming languages are commonly used in
the software industry. Object-oriented source code ontology can be created so that source code can be organized along with it. The difference between this ontology and the object-oriented design ontology is that this ontology involves more programming language constructs which are not used in the design, such as method parameters and local variables.

Ontology 10. Object-oriented source code ontology [26], [37], [39], which is designed to formally specify major concepts of object-oriented programming languages. The concepts of this ontology include package, class, attribute, method, parameter, variable, etc.

The software process is iterative: the software artifacts keep being improved and updated. In the process, there might be several versions created for each artifact. It is a complex activity to manage the versions. The following ontology can be created, according to which versions of files can be organized.

Ontology 11. Version ontology [26], which aims at modeling the relationships between files, releases, and revisions of software projects. The relationships in this ontology include hasRevision, isReleaseOf, etc. For example, a file which has a number of revisions should be connected with the corresponding revisions using the hasRevision relationship. At some point, the software is relatively stable and ready to be published as a new release. The release should contain all the revisions made until that point. In this case, the release is connected with its corresponding file revisions by using the isReleaseOf relationship.

Fig. 2. Part of pattern ontology.

Software component configuration is an activity to keep track of version changes and various dependency constraints imposed on the system. Dependencies between components include AND dependency and OR dependency. For example, component A may depend on either component B OR component C. Component D may depend on both component E AND component F. Version restrictions specify the range of versions of a component that works with the range of versions of another component. For example, there are situations such that component A version m works well with component B version n. Nevertheless, version m + 1 of A may no longer be compatible with version n of B. The following ontology can be created to facilitate the configuration between components.

Ontology 12. System configuration ontology [31], a model for component constraints and version restrictions. Normally, each system has its own configuration ontology.

2.7 Documentation Ontologies

Documentation is one of the important ways to communicate and share information. However, documentation is a tedious task. To provide some assistance to this task, it is desired that information that is expected to be documented can be collected and transformed into text format automatically. Thus, during the development process, the valuable information can be collected and defined in the documentation ontology.

Ontology 13. Documentation ontology [37], [39], [10], which consists of a large number of concepts that are expected to appear in the content of software documents. These concepts are based on various programming domains and design decisions.

A large number of documents are produced to record different types of information. For example, there are documents recording customer-related contents, documents describing the software artifacts, documents providing guidelines
for the development or testing tools, and documents recording the process information related to schedule, cost, and task breakdown structure [32]. To organize generated documents and maintain a historical overview of the document revisions, the following ontology can be developed.

Ontology 14. Document ontology [10], which models document types and their relationships. The relationships include useTemplate, related, refinedIn, update, etc. For example, a document written according to a template document can be connected to the template by the useTemplate relationship. The new version of a document can be connected to the previous versions of the same document by the update relationship.

The difference between the document ontology and the documentation ontology is that the document ontology aims at organizing documents, whereas the documentation ontology aims at organizing information recorded by documents. They differ in their intentions and application scopes. Besides, the concepts and restrictions defined in the two ontologies are different.

2.8 Quality Ontologies

Software quality can be measured along the following software attributes: capacity, usability, performance, reliability, installability, maintainability, availability, etc. Defining and measuring quality help to learn the current status of a system and identify the aspects that need to be improved. Currently, the most widely used software quality measurement tools are Ishikawa's seven basic tools, which include the Pareto diagram, histogram, scatter diagram, run chart, control chart, and cause-and-effect diagram. To improve software product quality and satisfy customer needs, quality models and standards have been developed, such as ISO 9001, Capability Maturity Model (CMM), and ISO/IEC 9126. Therefore, defining the quality concepts requires studying the existing quality models and constructing a superset. The same method as discussed in Section 2.1 can be applied. Thus, the following ontology is constructed.

Ontology 15. Quality ontology [3] (also called feature ontology
[13]), which represents reusable knowledge about different quality characteristics, subcharacteristics, and metrics. The basic structure of this ontology can be illustrated in Fig. 3 where the edges represent the subClassOf relationship.

Fig. 3. Partial quality ontology.

A software system is considered to be of high quality if it satisfies customer requirements and fits the use [29]. Software testing is the activity to assess the attributes of a software system and determine whether it meets the functional requirements and quality requirements before deployment. During the testing process, defects can be identified. For example, in the testing phase, a tester may find that a required function is not provided. Another example is that during an early market trial of the software product, customers may request some changes which are sent to the engineers. Resolving a defect involves the communication between several parties. To improve the management of software testing, the following ontologies can be defined.

Ontology 16. Testing ontology [40], which defines concepts related to testing, such as tester, environment, context, artifact on which the testing is performed, testing method, and activity. Each concept is further refined by introducing more subconcepts. For example, testing contexts include unit testing, integration testing, system testing, regression testing, etc. Activity includes planning, test case generation, execution, validation and verification, etc. The relationships in the ontology include subsume relation between testing methods, compatible-with relation between artifact formats, enhance relation between environments, include relation between activities, temporal ordering between activities, etc.

Ontology 17. Defect ontology [26], which defines the concepts used to describe the defects detected during the software testing phase. It includes concepts such as defect, action, person, and comment. Relationships include discover, assignedTo, hasBehavior, requiresBehavior, etc. For example, a tester who discovers and reports a defect should be connected by the discover relationship with the defect. The discovered defect will be assigned to a developer by using the assignedTo relationship. A defect usually means the system presents behaviors different from what has been required by customers. The defect can be associated with its current behavior using the hasBehavior relationship and its desired behavior using the requiresBehavior relationship. A defect is usually discovered in one revision of the source code and resolved in another revision, so it should be associated with the corresponding revisions (defined in Ontology 11) by the isDiscoveredIn and isResolvedIn relationships, respectively.

2.9 Maintenance Ontology

Software maintenance is the activity of modifying the system after delivery, so as to correct faults, improve performance, or adapt the system to a new environment. It is a complex task that involves system understanding, maintenance planning, etc. Software maintenance process models need to be defined to facilitate the task [30]. In addition, concepts involved in this task can be modeled by a maintenance ontology.

Ontology 18. Software maintenance process ontology [2], which provides maintenance-related concepts and their relationships. The concepts include activity, person, procedure, resource, etc. The activities are, for
example, maintenance activity, management activity, modification activity, etc. People involved in the maintenance process include maintenance engineer, maintenance manager, customer, etc. The relationships include uses, which connects a procedure with its corresponding activities; performs, which connects a maintenance engineer with the maintenance activities; negotiates-with, which connects maintenance manager and customers; etc.

2.10 Technology Ontology

A large number of technologies have been proposed for software development. Development environments and tools have been developed to automate or semiautomate the software engineering process. For each software engineering activity, there might be several candidates within the large pool of technologies. Lacking knowledge about the existing tools and technologies impacts the productivity of the software process. A technology ontology can be built to act as a library, to provide engineers with possible information, and to help engineers pick the most appropriate tools or technologies for their specific needs.

Ontology 19. Technology ontology [13], [17], which is a repository of software development technologies, environments, platforms, tools, etc. For example, programming environments include J2EE, .Net, etc.; application servers include Tomcat, JBoss, etc.; and Web Service execution engines include BPEL-engines, XL-engine, etc.

2.11 Classification of Ontologies

We have introduced 19 software-engineering-related ontologies. We have discussed the background, the motivation, and main contents of the ontologies. In this section, we discuss the relationship between the ontologies and provide a classification schema along which we classify them. This work provides our reviews of the ontologies and our categorization of the existing ontology-based approaches.

There are several ways to classify the software-engineering-related ontologies. For example, they can be classified according to their related engineering phase, according to the type of information they model, or according to
their application scope.

First, we classify ontologies according to their related activities in different software engineering phases. Typical software engineering phases include requirement engineering, architectural design, implementation, testing, and maintenance, as presented in Fig. 1. Since software process ontology defines phases and activities, we use Fig. 1 to show the relationships between the ontologies and the engineering phases and activities. For example, application domain ontology and application domain feature model ontology can be used to facilitate requirement elicitation, an activity in the requirement engineering phase; system behavior ontology can be used to facilitate requirement specification, which is also an activity in the requirement engineering phase; software architecture ontology, application logic ontology, object-oriented design ontology, and pattern ontology can be used to facilitate activities in the architectural design phase; software artifact ontology, object-oriented source code ontology, version ontology, system configuration ontology, document ontology, and documentation ontology are related to activities in the implementation phase; testing ontology and defect ontology can help testing; and software maintenance ontology can be used during maintenance. In addition, there are some ontologies, such as software process ontology, quality ontology, and technology ontology, which are related to all different activities and not shown in the figure. Since ontologies are not concrete concepts, we treat them as metaconcepts and use ovals with dashed border to denote them. The contributing ontologies can be connected to the related activities by the facilitatedBy relationship. However, this relationship connects metaconcepts. To differentiate it from the regular relationships, we create a new type of relationship denoted by a dashed line with arrowhead. In addition to illustrating the relationships by Fig. 1, we use a table to summarize the classification. The third column of Table 1 shows the classification of ontologies
according to their related software activities.

Second, we classify ontologies according to the type of information they model. Some ontologies model information related to processes, such as software process ontology and software maintenance process ontology. Some ontologies model concepts within an application domain, such as application domain ontology and application domain feature model ontology. Some ontologies, such as pattern ontology and technology ontology, aim at providing a library of standards and technologies. Some ontologies model information related to software products, which might be final products or intermediate products. The final product refers to the software artifacts that are requested by customers. The intermediate products refer to the software artifacts that are not directly requested but are internally produced along the product line. Example ontologies that model product-related information are system behavior ontology, software architecture ontology, software artifact ontology, system configuration ontology, etc. Some ontologies model information related to the management and support of software engineering. Normally, such information is also generated internally. For example, version ontology, document ontology, quality ontology, testing ontology, and defect ontology belong to this category. We use this criterion to classify the ontologies as shown in the fourth column of Table 1.

Third, we classify the ontologies according to their application scopes. Some ontologies model concepts that are common to almost all projects. For example, software process ontology, system behavior ontology, software architecture ontology, and pattern ontology are all generic, which means they can be used in almost any software engineering process. In addition, some
ontologies, such as application domain ontology and application domain feature model ontology, are domain-specific. These types of ontologies can be applied only to projects within the domain being modeled. Furthermore, some ontologies are applicable only to projects whose main programming language is object-oriented. For example, object-oriented design ontology and object-oriented source code ontology belong to this category. Finally, some ontologies are project-specific. For example, in Section 2.1, we discussed project-specific process ontologies, which are extensions of the generic software process ontology. They belong to this category. System configuration ontology is mainly designed to capture constraints and interrelationships between components of a system. Each software system should have its own configuration. Thus, this ontology is project-specific. Documentation ontology consists of a large body of concepts that are expected to appear in the content of documents. The information within this ontology is mainly related to a specific project. We use the last column of Table 1 to show the classification of the ontologies following this criterion.

We summarize the software-engineering-related ontologies in Table 1. The order of the ontologies listed in this table corresponds to the sequence of their occurrence in Section 2. Columns 3-5 list our classification of the ontologies along the three criteria mentioned above, namely, the related software engineering activity, the type of information modeled, and the application scope.

3 SEMANTIC WEB APPLICATIONS IN SOFTWARE ENGINEERING

Ontologies are used to formally express a shared understanding of a domain. As software engineering is a sequence of activities that involve a high amount of communication, it is better to have an agreement on the knowledge shared among different parties. In the previous section, we have described some software-engineering-related ontologies in the literature. In this section, we describe how these ontologies
and the Semantic Web technologies are used to improve software engineering, i.e., the software engineering problems that can be solved or improved by the Semantic Web technologies, the proposed approaches, and the ontologies that are required in each approach.

Problems in software engineering can be categorized in different ways. In this section, we categorize the problems from two perspectives: the life-cycle perspective and the critical issue perspective. From the life-cycle perspective, a problem may exist in a particular software engineering phase. On the other hand, there are certain critical issues that might need to be handled throughout the entire life cycle. Through this categorization schema, the Semantic-Web-based approaches to software engineering are also categorized accordingly.

TABLE 1 Classification of Software-Engineering-Related Ontologies

The rest of this section is organized in the following way. Section 3.1 introduces problems from the life-cycle perspective and discusses how they can be improved by using the Semantic Web technologies. Section 3.2 discusses certain critical issues that have to be dealt with throughout the entire engineering life cycle and introduces the Semantic-Web-based solutions.

3.1 From Life-Cycle Perspective

The software engineering process can be divided into several phases according to the main goal to be achieved from the life-cycle perspective. The distinguishable phases of a software engineering process include requirement engineering, architectural design, implementation, testing, and maintenance. In each phase, there are some ontologies that can be applied and utilized to improve the phase. In the following sections, we introduce the phases, discuss the ontologies that benefit each phase, and present the Semantic Web
technologies that can be used.

3.1.1 Requirement Engineering Phase

Requirement Engineering (RE) concerns the real-world goals for, functions of, and constraints on software systems. It also concerns the relationship of these factors to precise specifications of software behavior and to their evolution over time and across software families [35]. RE is a crucial phase in software development since the quality of the requirements directly affects the quality of the entire project.

Requirement elicitation is an important activity in this phase, in which customers are led to state their expected functionalities during interviews. It requires good communication between system analysts and customers. Misleading or misinterpreted statements during the interviews may cause incorrect, incomplete, and inconsistent requirements. Therefore, it is important to improve the communication and achieve an agreeable contract. Two ontologies can be used to support requirement elicitation: application domain ontology and quality ontology [3]. Both of them suggest to requirement analysts the questions that they should ask during the requirement elicitation interview.

Application domain ontology records the important information within the domain that is relevant to the application. It is able to suggest the elicitation questions for FRs, since FRs highly depend on the application domain information. Quality ontology records the different aspects of software qualities. It is able to suggest the elicitation questions for NFRs, since NFRs are much more independent of the application domain and define quality-related characteristics. The Semantic Web can be used to publish the domain and quality information, using domain ontology and quality ontology as the underlying vocabulary. Services such as query and reasoning can be provided on the Web. These Web services can help the software analysts to learn about the application domain and quality characteristics. They help the analysts come up with a more reasonable list of interview questions for requirement
elicitation so that they can be more active during the interview, elicit the correct information, and thus, achieve a high-quality requirement specification [3].

Use cases have been proposed to model software requirements in order to improve understandability. They describe the system operation scenarios in a graphical representation. Each scenario has its own context and contains actors, tasks, extension points, and sometimes preconditions and postconditions. In the development process of large software systems, there is a large library of use cases. Retrieving a use case created in the past is a complex and error-prone task. The Semantic-Web-based approach can be used to solve this problem [8]. Use cases can be annotated with semantic information, such as their context and the modeled system behaviors. A Web service can be built to retrieve use cases based on their semantic information. The application domain ontology and system behavior ontology can be used to support the query.

The quality of the requirements directly impacts that of the software system. Requirement assessment is the activity of measuring the quality of the requirements and discovering errors and potential issues. The task can be facilitated by the domain ontology [24], [41], [42]. A mapping between the application domain ontology and the requirement specification items can be created, so as to establish the relationships between requirement items according to the relationships between the corresponding domain concepts. Thus, inference on the domain ontology enables inference on the requirement specification. A query service can be provided which allows errors, such as incompleteness and inconsistency, to be detected through queries. The three typical errors in requirements are incompleteness, inconsistency, and ambiguity. Through the literature study, we found that more approaches focus on detecting inconsistencies than on incompleteness or ambiguity.

3.1.2 Software Design Phase

As software systems become larger and more complex, software architecture and
design are important for managing complexity and scalability. In addition, due to constant changes during the product life cycle, the design has to be adaptable to changes and easy to extend.

Software patterns document expert experience with recurring problems. They provide guidance for software design. Software pattern ontology provides a library of commonly used patterns. The Semantic-Web-based approach facilitates knowledge dissemination and encourages engineers to study patterns. With the support of the pattern ontology, query services can be provided on the Web, which further makes pattern training easier. This approach broadens the community of pattern users.

Feature modeling is part of domain engineering, as systems in a domain share common features and also differ in certain features. A feature model ontology describes the common features and constraints for systems in the domain. A specific system in a domain may have its own set of features, which should be consistent with the constraints defined in the domain feature model ontology. Thus, feature model ontology can be used in Semantic-Web-based approaches to provide design guidance [36]. A Web service can be provided to ensure the consistency between the feature design of a system and the feature model ontology for its application domain.
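The consistency check between a system's feature design and its domain feature model can be illustrated with a minimal sketch. It does not use OWL or a reasoner: the ontology is replaced by plain Python sets, and every name in it (FeatureModel, check_configuration, the example features) is hypothetical and chosen only for illustration. It assumes three kinds of constraints: mandatory features, "requires" dependencies, and mutual exclusions.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass
class FeatureModel:
    """Hypothetical, simplified stand-in for a domain feature model ontology."""
    mandatory: Set[str] = field(default_factory=set)
    # feature -> features that must also be selected
    requires: Dict[str, Set[str]] = field(default_factory=dict)
    # pairs of features that must not be selected together
    excludes: List[Tuple[str, str]] = field(default_factory=list)


def check_configuration(model: FeatureModel, selected: Set[str]) -> List[str]:
    """Return the constraint violations of a feature selection (empty if consistent)."""
    violations = []
    for feature in sorted(model.mandatory - selected):
        violations.append(f"missing mandatory feature: {feature}")
    for feature in sorted(selected):
        for needed in sorted(model.requires.get(feature, set()) - selected):
            violations.append(f"{feature} requires {needed}")
    for first, second in model.excludes:
        if first in selected and second in selected:
            violations.append(f"{first} excludes {second}")
    return violations


# Illustrative domain model and system configuration (made-up feature names).
model = FeatureModel(
    mandatory={"catalog"},
    requires={"online_payment": {"user_account"}},
    excludes=[("guest_checkout", "loyalty_points")],
)
print(check_configuration(model, {"catalog", "online_payment"}))
# ['online_payment requires user_account']
```

In an ontology-based setting, the same constraints would presumably be stated as axioms and the check delegated to a reasoner or query service; the sketch only shows the kind of answer such a service would return.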
3.1.3 Implementation and Integration Phase

The software implementation phase is when the source code is written and the data storage schema is designed. It involves a large amount of effort in producing different kinds of artifacts. Automating source code generation not only saves time and effort, but also ensures consistency and establishes traceability.

A mapping can be established between application domain concepts, object-oriented concepts, and data model concepts. According to the mapping, a transformation approach can be used to transform knowledge from the ontology to the programming language so that source code can be automatically generated from the domain ontology [5], [6]. In addition, transformation can also be used to derive relational data models from the application domain ontology [5].

Software integration is the activity performed after the implementation of individual components. It focuses on the problem of integrating two or more software components into a larger system, given that they function properly independently. The integrated system should satisfy the combined requirements within the new environment. Components should communicate with each other through interfaces without any conflict. The components depend on each other and there are usually certain constraints on their dependencies. For example, version m of component A requires version n, or above, of component B. The system configuration ontology can be built to capture the information about the component dependencies and version restrictions of the system [31]. A Web service can be developed to support query and reasoning over the system configurations. Supported by the ontologies, the service enables the validity of a given integration to be checked automatically.

3.1.4 Software Testing Phase

As software systems continue to grow in size and complexity, it becomes harder to control the quality. Software testing is an essential phase in software engineering, which is performed after
implementation to ensure the quality. Software testing ontology is designed especially to model testing tasks. Testing tasks are currently performed by software programs. Approaches have been proposed to develop software agents to perform testing tasks. The communication and interaction between agents can be supported by the software testing ontology [40].

A Semantic-Web-based approach can be built to detect potentially problematic areas in the source code [26]. The resource that needs to be analyzed in this case is the source code. A query service can be provided with the object-oriented source code ontology as the underlying vocabulary. The query service should be able to answer questions such as whether many objects mutually know each other and which classes contain methods with long parameter lists. Such queries provide a hint that there might be a problem around the classes involved in the query results. Other queries may be able to tell whether the problem could be solved.

3.1.5 Software Maintenance Phase

Software maintenance [35] describes activities that keep a deployed software system functioning correctly. The activities in the software maintenance phase include the modification of a software product to correct faults, to improve performance, user-friendliness, or other attributes of the system, to add extra functionalities, or to adapt the product to a new or changing environment.

A Semantic-Web-based maintenance planning and management tool can be built to facilitate the maintenance process [2]. The tool can be supported by the software maintenance ontology, which models maintenance-related activities. The relationships between identified software maintenance problems and appropriate activities can be established so that the tool is able to provide guidance for software maintainers to make correct plans and decisions.

As the maintenance phase requires changes, there is a demand for the automation of change propagation. Thus, mappings between software artifacts can be established. For example, requirements, tests, and
metrics can be associated with software components. In this way, once there is a change request on a requirement item, the affected components and test cases can be identified. A Semantic-Web-based environment can be built to manage the software artifacts and ensure the consistency between them [22]. The object-oriented design ontology, domain ontology, test ontology, and quality ontology can be used as the underlying vocabulary of the environment. Queries can be performed to monitor the changes happening to the system. In this way, the changes can be tracked and developers can be notified when changes require actions.

Reverse engineering and system understanding are normally required during system maintenance or evolution. Software patterns help system understanding by providing the design intents. However, they are usually not retrievable once transformed into source code. Pattern discovery is an approach to recover the initial design intents. Pattern discovery can be described as finding a piece of the system that satisfies the characteristics and constraints of a pattern. Thus, the discovery is performed according to the pattern definitions. As we have introduced, pattern ontology defines all the patterns in a formal way, by using ontology languages such as OWL. Such pattern descriptions are accessible by tools. If the system source code can also be transformed into OWL or RDF format, the discovery of patterns can be realized by querying and reasoning about system properties according to the pattern definitions [11].

3.1.6 Summary

In this section, we discussed the Semantic-Web-based approaches proposed for each software engineering phase. We have introduced the ontologies that can be used to solve each problem. Now, we summarize them in Table 2.

Table 2 shows the usage of ontologies in solving problems related to different software engineering phases. Software engineering phases are listed in the first row. Ontologies are listed in the first column in the order of their occurrence in Section 2. A cross mark in the table indicates
that the ontology has been used by some research work to solve problems in the phase. The table shows that the application domain ontology has been adopted by the largest number of research works to solve different problems. The table also suggests that a broader range of ontologies have been used to support the software maintenance phase. A question mark in the table indicates that the ontology can be used to solve the problem but there is no work on it yet. The question marks suggest future work trends, which will be discussed in Section 4.

3.2 From Critical Issues Perspective

There are some critical issues that are considered throughout the software life cycle. For example, documentation needs to be produced in every software phase; changes have to be predicted so that the system will be adaptable; traceability needs to be established; and quality has to be assured. These issues have to be addressed to ensure product quality and process efficiency. The following sections introduce the critical issues and discuss how the Semantic Web technologies can be applied to address them.

3.2.1 Documentation

Documentation is a necessary activity during the entire software process. Documents are critical artifacts in the maintenance phase and the reverse engineering phase, since they help the understanding of the software product. However, documentation is tedious work. Due to the complexity of software systems and tight schedules, documentation is often omitted. To ease documentation and provide, to some extent, automation of this activity, some of the ontologies, such as software artifact ontology and application domain ontology, can be used.

Software artifacts can be classified according to the software artifact ontology. Relationships can also be established between the software
artifacts and domain ontology concepts so that the software artifacts can be annotated with additional background information. Query services can be provided so that information is easier to access. This provides guidance for the documentation team and allows documents to be created with an integrated perspective. It also helps to ensure that the documents remain consistent and up-to-date over time as the system evolves [1].

3.2.2 Traceability

There are many artifacts developed in the software engineering process. Sometimes, people may be confused about the sources of the artifacts, i.e., what causes the artifacts to be produced. Without knowing the source, engineers are not clear on the goal they are to achieve. Establishing traceability for the software artifacts helps to clarify the entire product line.

Semantic links are where the traceability is established. For example, concepts in domain ontology can be associated with requirement specification items [24], source code [26], and software components [7]. Requirement specifications can be associated with software components [22] and test cases. Software metrics and test cases can also be associated with software components [22]. Establishing the semantic relations fills in the gaps between high-level artifacts and low-level artifacts [23] as well as the communication gaps between different teams.

Traceability is also important information for system understanding during the maintenance phase. Usually, the traceability information can no longer be found after the deployment of a software system. Reestablishing traceability can facilitate system understanding [37], [39]. Common concepts residing in system source code and documents can be discovered and associated with the concepts defined in object-oriented source code ontology and documentation ontology. Web-based queries can be performed to gather information regarding the source code entities, design-level concepts, and their occurrences in documents.

3.2.3 Change Control

Change happens constantly in software
systems. Change in one part of the system may have an impact on others. Even small changes could result in disasters. Therefore, a good strategy is to prepare for changes, instead of waiting until they take place. Predicting future changes is beneficial in the sense that designers can create system designs that are adaptable to the potential changes.

Application domain ontology should be a good knowledge base for change prediction [24]. Concepts and their relationships in the domain ontology usually correlate with items in the requirement specification; therefore, a mapping can be established between them. Usually, there are a number of existing projects for an application domain, and thus, an effort can be made to study the evolution of existing requirement specifications. Sometimes, it can be found that certain types of requirement changes are correlated with certain domain ontology relationships. Thus, these types of relationships can be annotated to indicate that they suggest certain types of requirement changes. In future projects, if a requirement item matches an annotated ontology relationship, it is highly likely to undergo the predicted changes.

A main goal of design patterns is to design for change. Appropriately applying design patterns while programming normally yields a well-structured, easy to understand, and easy to extend/change software system. Publishing the library of software patterns on the Web makes training in pattern knowledge much easier. It also allows easy access to the pattern knowledge and promotes the production of change-adaptable code.

TABLE 2 Usage of Ontologies to Support Engineering Phases

An automatic generation of source code from the domain ontology by transferring knowledge from the ontology to the programming language [6] has been discussed in
Section 3.1.3. The transformation actually establishes the mapping between domain concepts and the source code, which makes change propagation easier. Semantic Web services can be provided to constantly check the consistency between the domain model and the source code. They suggest the areas of the code that require changes when the requirements change. This ensures the synchronization between the domain model and the source code. They can also suggest the amount of code change required for a model evolution, which allows the business to consider whether to perform the evolution immediately, perform it later, or come up with an alternative.

When a change is required, the amount of change needs to be measured so as to make a well-informed decision. For an object-oriented software project, all the Java classes of one major release should be compared with those of another major release, so as to visualize the code changes within a certain time span in the software life cycle. To analyze the changes according to the history of a single Java class, it is necessary to calculate the class similarity between each pair of immediately consecutive releases. The calculation can be done for all classes, and the results can be plotted. Thus, the amount of change of the entire project over the entire life cycle can be presented. A Semantic Web service can be provided to perform the calculation and visualize the results [26]. Object-oriented source code ontology and version ontology can be used as the formal knowledge base for the service.

3.2.4 Quality Control

Quality is one of the most important metrics of software products. Software testing is the activity that ensures that the product system delivers the desired functionality. In Section 3.1.4, we have introduced how the Semantic-Web-based approaches can be used for software testing. In this section, we address other quality control aspects that can be improved.

Software process model defines concepts related
to software activities and provides several commonly used processes. A Semantic-Web-based process management environment helps to achieve a process of high efficiency [19]. The formality of the underlying process ontology allows process automation, dynamic assembling, and tailoring of process elements from reusable process components. It also allows formal reasoning about activity specifications and semantic-based searching for activity-relevant standards and lessons learned.

A large number of artifacts are produced during the engineering process. To achieve high quality, assessment needs to be performed on each artifact. In particular, when the artifacts produced in one phase are inputs to the next phase, their quality directly affects that of other artifacts. Assessing software artifacts before delivering them to the next phase not only ensures the product quality, but also improves the process efficiency. For example, requirement specifications need to be measured before development and test cases need to be measured before testing. Existing approaches [24], [41], [42] to assessing requirement specifications have been discussed in the last paragraph of Section 3.1.1.

Calculating software metrics is one way in which quality can be measured. In an object-oriented software system, it is not a good practice to have objects that know too much, i.e., have too many instance variables and methods. Thus, the number of instance variables and methods can be calculated as an indicator of the quality. With the support of object-oriented source code ontology, a Semantic Web service can be provided to perform the measurement [26].

Quality can also be measured by performing calculations on the number of defects [26]. A Semantic Web service can be provided to perform the calculation. The service can be supported by the object-oriented source code ontology, defect ontology, and version ontology. It should be able to suggest the problematic areas by calculating the number of defects associated with each class. It should also be able to calculate
the defect density for each release, which is a good indicator of the software quality over time.

3.2.5 Reuse

Information reuse saves effort and time in the software engineering process. A critical step for reuse is information retrieval, which acquires the desired knowledge. Since the Semantic-Web-based approach attaches semantic information to the information resources, it is able to provide context-based search in addition to keyword-based search. Since most software artifacts have more complex internal structures than plain text, the Semantic-Web-based approaches have greater retrieval ability.

In a Semantic-Web-based information retrieval environment, software artifacts, such as specification documents, design diagrams, and source code, can be represented in RDF, RDFS, or OWL format [4], [10], [17]. The artifacts can be further annotated with implicit and explicit metadata, and indexed or classified according to the concepts defined in the software artifact ontology and the domain ontology. The use of these two ontologies empowers the search mechanisms and makes retrieval easier. This enables context-based searching, which is more powerful than keyword-based searching.

A software component is a piece of a system that works well independently. It can also be integrated with other components to achieve more functionality. Components are the most commonly reused software artifacts. Since there is a large pool of software components, including commercial ones, it is a complex task to select the right components. Similar to other software artifacts, RDF, RDFS, and OWL can be used to specify the components according to their UML specifications [27]. This allows the component repository to be shared through the Web. A Semantic-Web-based environment can be provided for component characterization, which helps to annotate the components so that they are easier to retrieve. This approach allows open-source software components to be commented on by experts worldwide. The characterization can be
done from different perspectives. From the architecture perspective [23], software components can be associated with the architectural mechanisms and architectural policies that they satisfy. From the application domain perspective, components can be associated with their domain [7]. In this way, each domain concept has certain associated components. Thus, the domain concepts provide a uniform view of the available components organized in a domain taxonomy. The characterization/annotation allows the searching of software components at the semantic level, which is different from the traditional keyword-based search or component-interface-based search. Architecture ontology and domain ontology are required as the underlying vocabulary.

3.2.6 Technology Selection and Process Support

Software engineering is a complex process and requires the assistance of existing technologies. There are a number of tools, environments, and platforms provided for software engineering. Using a suitable technology makes the software engineering process easier. However, without knowledge of the existing technologies, it is difficult to make a smart selection.

A Semantic-Web-based environment for technology selection can be provided [13]. Each tool can be treated as a registered service. The tool can be annotated with its application scope, its environment requirements, the types of inputs it processes, and the formats of the outputs it produces. Supported by the application logic ontology, quality ontology, and technology ontology, a query service and a tool composition service can be provided. For example, if a tool is needed for a single task, a request can be formulated accordingly. If a set of tools is needed for a complex task, tools can be selected and composed. The composition of the tools has to be
done with consideration of the consistency between the inputs, outputs, preconditions, and postconditions of two adjacent tools. This allows a software engineering process supported by tools to be planned automatically.

Knowledge-based software engineering environments can be developed [15], [34] with the help of the Semantic Web technologies. The environment may consist of multiple layers, such as the ontology data layer and the application service layer. The ontology data layer contains the ontologies, such as software process ontology, quality ontology, and software artifact ontology. The application service layer includes the software agents, which perform independent tasks. The tasks include interpreting requests from end users, breaking requests into subtasks and delegating them to other agents, formulating queries, reasoning, etc. These agents cooperate with each other to provide help for different software engineering tasks, such as software process definition, risk estimation, and object modeling.

3.2.7 Summary

In this section, we discussed the critical issues in software engineering that can be improved by using the Semantic-Web-based approaches. We have introduced the ontologies that can be used to solve the issues.

Table 3 shows the usage of ontologies by different approaches to solve or improve critical issues. Critical issues of software engineering are listed in the first row. Ontologies are listed in the first column in the order of their occurrence in Section 2. A cross mark in the table indicates that the ontology has been used to solve the issue. The table shows that the application domain ontology has been used broadly to improve different areas of software engineering. Quality control is a general task that may involve a greater number of ontologies. A question mark in the table indicates that the ontology can be used to solve the issue but there is no research on it yet. The question marks suggest future work trends, which will be discussed in Section 4.

4 DISCUSSIONS

In the previous sections, we discussed
the existing approaches for Semantic-Web-based software engineering. The main goal of these approaches is to better manage information. Previously, software engineers had little help in locating the information they needed within a large body of information, or in understanding it. By using Semantic Web techniques, the information within the software community can be formalized, the level of information reusability can be increased, and the accessibility and availability of information can be improved. On the other hand, such approaches also have some drawbacks. For example, defining ontologies is a difficult task. Developing Semantic Web applications and services requires time and effort. However, once the ontologies are defined and services are built, they can be reused many times and thus bring tremendous benefits. Continuing effort has been put into the Semantic Web area. The growth of Semantic Web technologies will further improve Semantic-Web-based software engineering. As more powerful tools are built and better standards are defined, they will be adopted to improve Semantic-Web-based software engineering.

In the following sections, we discuss some of our observations during our study of the literature.

4.1 Ontologies

In Section 2, we provide a list of ontologies that are beneficial for software engineering. The list is based on the existing work on building ontologies for software engineering. By observing the list, we found that ontologies can be built at different abstraction levels. For example, the pattern ontology is a library of patterns, which provides the information for individual patterns. On the other hand, the OO design ontology models the object-oriented design concepts. Design patterns can be described by object-oriented concepts, the relationships of which should conform to the OO design ontology. Since the OO design ontology defines the constraints for the description of design patterns, it can be considered as a metaontology for the design pattern ontology.

TABLE 3: Benefits of Ontologies to Critical Issues

A software system contains a large amount of information, which can be modeled by ontology. For example, the design of an object-oriented system, which is usually modeled by UML diagrams, can also be modeled by ontology. Such an ontology is here referred to as a project-specific design ontology. In UML, a model usually can be defined by a metamodel. This means that the design of an object-oriented system must conform to the UML metamodel, which represents the object-oriented design principle. Similarly, a project-specific design ontology should also comply with the object-oriented design principle, whose definition can be found in the OO design ontology. Therefore, the OO design ontology can be seen as the metaontology for any project-specific design ontology.

Some of the ontologies defined in Section 2 are metalevel ontologies, such as the OO design ontology and the OO source code ontology. Because of the complexity of building ontologies, only reusable ontologies are worth building. Metalevel ontologies can be reused as rules and guidance for knowledge-driven software engineering processes. Since they not only provide a formal definition, but also provide a way for software agents to learn and understand, they become foundations for the Semantic-Web-based software engineering environment. Although ontologies can be built for individual projects, they are not reusable; thus, it is not wise to do so.

There are several existing standards for creating ontologies. The most popular and mature ontology development language is OWL. There are a number of tools created for OWL. To name a few, Protégé and SWOOP are OWL editors; FaCT++, Pellet, and
Racer are reasoners for OWL; and Jena and OWL API are programming environments for building Semantic Web applications. Table 4 lists some supporting tools for OWL. Only the most popular ones, which are employed by the research studied in this paper, are listed. Another emerging language for ontology definition is the Web Service Modeling Language (WSML), which is intended for handling Web services. However, this language is not mature enough. The language itself has not been defined completely and there are fewer supporting tools. We only found WSMO Studio and WSMT as WSML editors and WSML DL Reasoner as a plug-in for reasoning. To the best of our knowledge, there is no integrated programming environment for WSML yet. Comparatively, OWL is preferable. The evaluation of the usage of WSML as an ontology language to solve software engineering problems can be a future direction.

4.2 Current Approaches and Future Directions

We introduced the application of the Semantic Web technologies in different areas of software engineering in Section 3. The discussions include the current research state in this area. For example, there is more work on improving requirement engineering than on other areas. Almost every approach discusses certain critical issues that it can address. In this section, we discuss some issues and point out some potential research directions.

Semantic-Web-based component characterization, discussed in Section 3.2.5, focuses on open-source software systems, which can be accessed and studied publicly. Proprietary systems, however, are not publicly accessible and can only be characterized by a limited number of people. Therefore, this approach cannot benefit from receiving comments on proprietary systems from around the world. However, this approach can be used to get comments from the engineers who develop the system and from existing users, and to deliver the opinions to other potential users.

There are a number of existing resources that assist software design. In addition to patterns, common architectural styles serve as
design guidelines. They help to achieve better high-level software architecture. Furthermore, existing design and modeling languages, such as UML, help to visualize software design so as to improve communication. In the future, the Semantic Web can be used to publish architectural styles and knowledge about existing modeling languages to encourage the dissemination of architecture and design knowledge.

Documentation ontology consists of a large body of concepts related to the software system. Any information that is expected to appear in the content of software documents goes into this ontology. Such information includes domain knowledge, design decisions, etc. In the future, the documentation ontology can be used to help automate the documentation process. Relationships can be established between the concepts in the documentation ontology and the software artifacts ontology. With the support of the ontologies, a Semantic Web service may be developed to generate documents automatically from existing resources.

Automatic generation of source code from the domain model helps to ensure the consistency between them. When there is a change in the domain model, an update of the source code is required. Applying the same approach to the automatic update of the requirement specification can be future work. With the mapping established between the application domain ontology concepts and the requirement specification items, any change of the domain ontology can be propagated to the requirement specification, or at least a notification about the change request can be sent to the engineers.

Concerning assessment, the Semantic Web technologies are good at detecting inconsistency. However, there are other types of errors, such as incompleteness and ambiguity in the requirement specification. Checking incompleteness and ambiguity can be another future work. The idea is to provide a one-to-many mapping between domain ontology concepts and requirement specifications. Every concept in the domain ontology shall be able to map to the
requirements; otherwise, incompleteness is detected. In addition, the Semantic-Web-based approaches provide a formal ontology as a definition for domain concepts. Every concept in the requirement specification is able to find its formal definition in the domain taxonomy. Thus, such approaches help to avoid ambiguity.

TABLE 4: Ontology Languages and Tools

Another possible work is to use the quality ontology to perform assessment of software artifacts along some predefined metrics. The quality ontology is an important source where metrics can be found. A Web service can be provided to measure software artifacts according to different metrics that are defined in the quality ontology. This service may also require the support of other ontologies, such as the source code ontology and the software artifact ontology.

5 CONCLUSIONS

There are many discussions and suggestions about improving the software engineering process by using ontology and the Semantic Web techniques. To the best of our knowledge, there is no classification or assessment of these approaches yet. Our goal in this paper is to provide a review of the current status of this field.

We discussed the ontologies proposed by different research works. For each ontology, we identify its background and motivation and present what needs to be modeled by the ontology. We provide a classification schema to further categorize the ontologies. Our review and classification of ontologies provide a clear view of the current state of ontology development for software engineering.

We study the Semantic-Web-based approaches to improving software engineering processes. We categorize the approaches based on the engineering phases and the critical issues to which they can be applied. We analyze the usage of ontologies in different approaches. This work
provides a comprehensive view of the current approaches proposed for software engineering, by presenting which areas have been fully covered and which have not. We also suggest future research directions in this field.

REFERENCES

[1] A.P. Ambrósio, D.C. de Santos, F.N. de Lucena, and J.C. de Silva, “Software Engineering Documentation: An Ontology-Based Approach,” Proc. WebMedia and LA-Web Joint Conf. 10th Brazilian Symp. Multimedia and the Web Second Latin Am. Web Congress, pp. 38-40, 2004.
[2] A. April, J.-M. Desharnais, and R. Dumke, “A Formalism of Ontology to Support a Software Maintenance Knowledge-Based System,” Proc. 18th Int’l Conf. Software Eng. and Knowledge Eng., 2006.
[3] T.H. Al Balushi, P.R.F. Sampaio, D. Dabhi, and P. Loucopoulos, “Performing Requirements Elicitation Activities Supported by Quality Ontologies,” Proc. 18th Int’l Conf. Software Eng. and Knowledge Eng., pp. 343-348, July 2006.
[4] B. Antunes, P. Gomes, and N. Seco, “SRS: A Software Reuse System Based on the Semantic Web,” Proc. Third Int’l Workshop Semantic Web Enabled Software Eng., June 2007.
[5] I.N. Athanasiadis, F. Villa, and A.-E. Rizzoli, “Enabling Knowledge-Based Software Engineering through Semantic-Object-Relational Mappings,” Proc. Third Int’l Workshop Semantic Web Enabled Software Eng., June 2007.
[6] M.V. Bossche, P. Ross, I. MacLarty, B.V. Nuffelen, and N. Pelov, “Ontology Driven Software Engineering for Real Life Applications,” Proc. Third Int’l Workshop Semantic Web Enabled Software Eng., June 2007.
[7] R.M.M. Braga, M. Mattoso, and C.M.L. Werner, “The Use of Mediation and Ontology Technologies for Software Component Information Retrieval,” Proc. Symp. Software Reusability: Putting Software Reuse in Context, pp. 19-28, 2001.
[8] J.C. Caralt and J.W. Kim, “Ontology Driven Requirement Query,” Proc. 40th Ann. Hawaii Int’l Conf. System Sciences, p. 197c, Jan. 2007.
[9] J. Davies, R. Studer, and P. Warren, Semantic Web Technologies: Trends and Research in Ontology-Based Systems. Wiley, July 2006.
[10] B. Decker, E. Ras, J. Rech, B. Klein, and C. Hoecht, “Self-Organized Reuse of Software Engineering Knowledge
Supported by Semantic Wikis,” Proc. First Int’l Workshop Semantic Web Enabled Software Eng., 2005.
[11] J. Dietrich and C. Elgar, “A Formal Description of Design Patterns Using OWL,” Proc. Australian Software Eng. Conf., pp. 243-250, 2005.
[12] J. Dietrich and C. Elgar, “Towards a Web of Patterns,” Proc. First Int’l Workshop Semantic Web Enabled Software Eng., 2005.
[13] U. Dinger, R. Oberhauser, and C. Reichel, “SWS-ASE: Leveraging Web Service-Based Software Engineering,” Proc. Int’l Conf. Software Eng. Advances, 2006.
[14] J. Dong, R. Paul, and L.-J. Zhang, “High Assurance Service-Oriented Architecture,” Computer, vol. 41, no. 8, pp. 22-23, Aug. 2008.
[15] R.A. Falbo, F.B. Ruy, and R.D. Moro, “Using Ontologies to Add Semantics to a Software Engineering Environment,” Proc. 17th Int’l Conf. Software Eng. and Knowledge Eng., pp. 151-156, 2005.
[16] E. Gamma, R. Helm, R. Johnson, and J. Vlissides, Design Patterns: Elements of Reusable Object-Oriented Software. Addison-Wesley, 1995.
[17] H.-J. Happel, A. Korthaus, S. Seedorf, and P. Tomczyk, “KOntoR: An Ontology-Enabled Approach to Software Reuse,” Proc. 18th Int’l Conf. Software Eng. and Knowledge Eng., pp. 349-354, July 2006.
[18] H.-J. Happel and S. Seedorf, “Applications of Ontologies in Software Engineering,” Proc. Second Int’l Workshop Semantic Web Enabled Software Eng., 2006.
[19] J.S. Hawker, H. Ma, and R.K. Smith, “A Web-Based Process and Process Models to Find and Deliver Information to Improve the Quality of Flight Software,” Proc. 22nd Digital Avionics Systems Conf. (DASC ’03), vol. 1, pp. 12-16, Oct. 2003.
[20] S. Henninger and P. Ashokkumar, “An Ontology-Based Infrastructure for Usability Design Patterns,” Proc. First Int’l Workshop Semantic Web Enabled Software Eng., Nov. 2005.
[21] S. Henninger and P. Ashokkumar, “An Ontology-Based Metamodel for Software Patterns,” Proc. 18th Int’l Conf. Software Eng. and Knowledge Eng., 2006.
[22] D. Hyland-Wood, D. Carrington, and S. Kaplan, “Toward a Software Maintenance Methodology Using Semantic Web Techniques,” Proc. Second Int’l IEEE Workshop Software Evolvability, pp. 23-30, Sept. 2006.
[23]
P. Inostroza and H. Astudillo, “Emergent Architectural Component Characterization Using Semantic Web Technologies,” Proc. Second Int’l Workshop Semantic Web Enabled Software Eng., Nov. 2006.
[24] H. Kaiya and M. Saeki, “Ontology Based Requirements Analysis: Lightweight Semantic Processing Approach,” Proc. Fifth Int’l Conf. Quality Software, pp. 223-230, 2005.
[25] P. Kamthan and H.-I. Pai, “An Experience in Ontological Representation of Web Application Patterns for the Semantic Web,” Proc. First Int’l Workshop Semantic Web Enabled Software Eng., Nov. 2005.
[26] C. Kiefer, A. Bernstein, and J. Tappolet, “Analyzing Software with iSPARQL,” Proc. Third Int’l Workshop Semantic Web Enabled Software Eng., June 2007.
[27] A. Korthaus, M. Schwind, and S. Seedorf, “Leveraging Semantic Web Technologies for Business Component Specification,” J. Web Semantics: Science, Services and Agents on the World Wide Web, vol. 5, no. 2, pp. 130-141, June 2007.
[28] L. Liao, Y. Qu, and H.K.N. Leung, “A Software Process Ontology and Its Application,” Proc. First Int’l Workshop Semantic Web Enabled Software Eng., Nov. 2005.
[29] S.H. Kan, Metrics and Models in Software Quality Engineering. Addison-Wesley, Sept. 2002.
[30] J. Rilling, Y. Zhang, W.J. Meng, R. Witte, V. Haarslev, and P. Charland, “A Unified Ontology-Based Process Model for Software Maintenance and Comprehension,” Proc. Workshop and Symp. at Models in Software Eng., pp. 56-65, Oct. 2006.
[31] H.H. Shahri, J.A. Hendler, and A.A. Porter, “Software Configuration Management Using Ontologies,” Proc. Third Int’l Workshop Semantic Web Enabled Software Eng., June 2007.
[32] I. Sommerville, Software Engineering, fourth ed. Addison-Wesley, 1992.
[33] S. Thaddeus and S.V.K. Raja, “Ontology-Driven Model for Knowledge-Based Software Engineering,” Proc. 18th Int’l Conf. Software Eng. and Knowledge Eng., pp. 337-341, July 2006.
[34] S. Thaddeus and S.V.K. Raja, “A Semantic Web Tool for Knowledge-Based Software Engineering,” Proc. Second Int’l Workshop Semantic Web Enabled Software Eng., 2006.
[35] D. Van Edelstein, “Report on the IEEE STD 1219-1993: Standard for Software Maintenance,” ACM SIGSOFT Software Eng. Notes, vol. 18, no. 4, pp. 94-95, Oct. 1993.
[36] H. Wang, Y.F. Li, J. Sun, H. Zhang, and J. Pan, “Verifying Feature Models Using OWL,” J. Web Semantics: Science, Services and Agents on the World Wide Web, vol. 5, no. 2, pp. 117-129, June 2007.
[37] R. Witte, Y. Zhang, and J. Rilling, “Empowering Software Maintainers with Semantic Web Technologies,” Proc. Fourth European Semantic Web Conf., pp. 37-52, June 2007.
[38] P. Wongthongtham, P.E. Chang, T.S. Dillon, and I. Sommerville, “Software Engineering Ontologies and Their Implementation,” Proc. IASTED Int’l Conf. Software Eng. (SE ’05), pp. 208-213, Feb. 2005.
[39] Y. Zhang, R. Witte, J. Rilling, and V. Haarslev, “An Ontology-Based Approach for Traceability Recovery,” Proc. Third Int’l Workshop Metamodels, Schemas, Grammars, Ontologies for Reverse Eng., 2006.
[40] H. Zhu and Q. Huo, “Developing a Software Testing Ontology in UML for a Software Growth Environment of Web-Based Applications,” Chapter IX of Software Evolution with UML and XML, Idea Group, Inc. (IGI), 2005.
[41] X. Zhu and Z. Jin, “Ontology-Based Inconsistency Management of Software Requirements Specification,” Proc. 31st Conf. Current Trends in Theory and Practice of Computer Science, pp. 340-349, Jan. 2005.
[42] X. Zhu and Z. Jin, “Inconsistency Measurement of Software Requirements Specifications: An Ontology-Based
Approach,” Proc. 10th IEEE Int’l Conf. Eng. of Complex Computer Systems, pp. 402-410, 2005.
[43] IEEE Std 610.12-1990, IEEE Standard Glossary of Software Engineering Terminology, IEEE, 1990.
[44] Resource Description Framework (RDF), http://www.w3.org/RDF, 2009.
[45] SPARQL Query Language for RDF, http://www.w3.org/TR/rdf-sparql-query, 2009.
[46] Web Ontology Language OWL, http://www.w3.org/2004/OWL, 2009.

Yajing Zhao received the BS degree in computer science from Nankai University in 2005, and the MS degree in software engineering from the University of Texas at Dallas in 2007. She is currently working toward the PhD degree at the University of Texas at Dallas, majoring in software engineering. Her research interests include software architecture and design, service-oriented architecture, semantic Web services, system modeling, and model transformation.

Jing Dong received the BS degree in computer science from Peking University in 1992, and the MMath and PhD degrees in computer science from the University of Waterloo, Canada, in 1997 and 2002, respectively. He is an assistant professor in the Department of Computer Science at the University of Texas at Dallas. His research and teaching interests include formal and automated methods for software engineering, software modeling and design, services computing, and visualization. He has more than 100 publications in these areas. He is a senior member of the ACM and a senior member of the IEEE.

Tu Peng received the BS and MS degrees from the School of Mathematics, Peking University, China. He is currently working toward the PhD degree, majoring in software engineering, at the Computer Science Department, University of Texas at Dallas. His research interests include formal modeling and verification of software design, security, services computing, and model checking.
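Illustrative sketch. Section 4.2 suggests checking requirement specifications for incompleteness and ambiguity through a mapping between domain ontology concepts and requirement items. The following minimal Python sketch shows one way such a check might look. It is not taken from any of the surveyed approaches: plain sets stand in for a real OWL ontology, and the names Requirement, find_unmapped_concepts, and find_undefined_terms, as well as the toy data, are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Set, Tuple


@dataclass
class Requirement:
    """A requirement item and the domain concepts it refers to (hypothetical structure)."""
    identifier: str
    concepts: Set[str] = field(default_factory=set)


def find_unmapped_concepts(domain_concepts: Iterable[str],
                           requirements: Iterable[Requirement]) -> List[str]:
    """Return domain concepts that no requirement refers to (possible incompleteness)."""
    covered: Set[str] = set()
    for req in requirements:
        covered |= req.concepts
    return sorted(set(domain_concepts) - covered)


def find_undefined_terms(domain_concepts: Iterable[str],
                         requirements: Iterable[Requirement]) -> List[Tuple[str, str]]:
    """Return (requirement id, term) pairs whose term has no definition in the
    domain ontology (possible ambiguity)."""
    known = set(domain_concepts)
    return sorted(
        (req.identifier, term)
        for req in requirements
        for term in req.concepts
        if term not in known
    )


# Toy data, invented for illustration only.
DOMAIN = {"Account", "Customer", "Transfer"}
REQUIREMENTS = [
    Requirement("R1", {"Customer", "Account"}),
    Requirement("R2", {"Account", "Ledger"}),
]

if __name__ == "__main__":
    print("Concepts without requirements:", find_unmapped_concepts(DOMAIN, REQUIREMENTS))
    print("Terms without definitions:", find_undefined_terms(DOMAIN, REQUIREMENTS))
```

In this toy setting, a concept that appears in the domain set but in no requirement is reported as a completeness gap, and a requirement term missing from the domain set is reported as lacking a formal definition. A realistic implementation would presumably query an ontology through a reasoner or an RDF library rather than compare string sets.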