About Me

Shinpei Hayashi is an assistant professor at School of Computing, Tokyo Institute of Technology.
He received a B.Eng. degree from Hokkaido University in 2004.
He also received M.Eng. and Dr.Eng. degrees from Tokyo Institute of Technology in 2006 and 2008, respectively.

Existing techniques for detecting code smells (indicators of source code problems) do not consider the current context, which renders them unsuitable for developers who have a specific context, such as modules within their focus. Consequently, the developers must spend time identifying relevant smells. We propose a technique to prioritize code smells using the developers' context. Explicit data of the context are obtained using a list of issues extracted from an issue tracking system. We applied impact analysis to the list of issues and used the results to specify the context-relevant smells. Results show that our approach can provide developers with a list of prioritized code smells related to their current context. We conducted several empirical studies to investigate the characteristics of our technique and factors that might affect the ranking quality. Additionally, we conducted a controlled experiment with professional developers to evaluate our technique. The results demonstrate the effectiveness of our technique.

In multi-layer systems such as web applications, locating features is a challenging problem because each feature is often realized through a collaboration of program elements belonging to different layers. This paper proposes a semi-automatic technique to extract correspondence between features and program elements among layers, by merging execution traces of every layer to feed into formal concept analysis. By applying this technique to a web application, not only modules in the application layer but also web pages in the presentation layer and table accesses in the data layer can be associated with features at once. To show the feasibility of our technique, we applied it to a web application which conforms to the typical three-layer architecture of Java EE and discuss its applicability to other layer systems in the real world.

The authors formulate the class responsibility assignment (CRA) problem as the fuzzy constraint satisfaction problem (FCSP) to automate CRA, and show the results of automatic assignments of examples. Responsibilities are contracts or obligations of objects that they should assume; by aligning them to classes appropriately, designs of high quality realize. Typical aspects of a desirable design are having a low coupling between highly cohesive classes. However, because of a trade-off among such aspects, solutions that satisfy the conditions moderately are desired, and computer assistance is needed. The authors represent the conditions of such aspects as fuzzy constraints, and formulate CRA as FCSP. That enables us to apply common algorithms that solve FCSP to the problem, and to derive solutions representing a CRA.

Refactoring large systems involves several sources of uncertainty related to the severity levels of code smells to be corrected and the importance of the classes in which the smells are located. Both severity and importance of identified refactoring opportunities (e.g. code smells) are difficult to estimate. In fact, due to the dynamic nature of software development, these values cannot be accurately determined in practice, leading to refactoring sequences that lack robustness. In addition, some code fragments can contain severe quality issues but they are not playing an important role in the system. To address this problem, we introduced a multi-objective robust model, based on NSGA-II, for the software refactoring problem that tries to find the best trade-off between three objectives to maximize: quality improvements, severity and importance of refactoring opportunities to be fixed. We evaluated our approach using 8 open source systems and one industrial project, and demonstrated that it is significantly better than state-of-the-art refactoring approaches in terms of robustness in all the experiments based on a variety of real-world scenarios. Our suggested refactoring solutions were found to be comparable in terms of quality to those suggested by existing approaches, better prioritization of refactoring opportunities and to carry an acceptable robustness price.

We propose a method of developing a thesaurus for requirements elicitation and its supporting tool. This proposed method consists of two parts - (1) elicitation of candidates of functional requirements to be registered in the thesaurus from technical documents and (2) registration of functional requirements with associated non-functional factors in the thesaurus from these candidates under the direction of domain experts. Our tool supports the first part. This method should satisfy the following two characteristics - (a) extracted functions are correct and (b) any analyst can extract all indispensable functions from technical documents. We show the above two characteristics through case studies and confirm the usability and effectiveness of the proposed method.

Change-aware development environments can automatically record fine-grained code changes on a program and allow programmers to replay the recorded changes in chronological order. However, since they do not always need to replay all the code changes to investigate how a particular entity of the program has been changed, they often eliminate several code changes of no interest by manually skipping them in replaying. This skipping action is an obstacle that makes many programmers hesitate when they use existing replaying tools. This paper proposes a slicing mechanism that automatically removes manually skipped code changes from the whole history of past code changes and extracts only those necessary to build a particular class member of a Java program. In this mechanism, fine-grained code changes are represented by edit operations recorded on the source code of a program and dependencies among edit operations are formalized. The paper also presents a running tool that slices the operation history and replays its resulting slices. With this tool, programmers can avoid replaying nonessential edit operations for the construction of class members that they want to understand. Experimental results show that the tool offered improvements over conventional replaying tools with respect to the reduction of the number of edit operations needed to be examined and over history filtering tools with respect to the accuracy of edit operations to be replayed.

Refactoring is the process of changing a software system in such a way that it does not alter the external behavior of the code yet improves its internal structure. Not only researchers but also practitioners need to know past instances of refactoring performed in a software development project. So far, a number of techniques have been proposed on the automatic detection of refactoring instances. Those techniques have been presented in various international conferences and journals, and it is difficult for researchers and practitioners to grasp the current status of studies on refactoring detection techniques. In this survey paper, we introduce refactoring detection techniques, especially in techniques based on change history analysis. At first, we give the definition and the categorization of refactoring detection in this paper, and then introduce refactoring detection techniques based on change history analysis. Finally, we discuss possible future research directions on refactoring detection.

This paper presents a survey on techniques to record and utilize developers’ operations on integrated development environments (IDEs). Especially, we let techniques treating fine-grained code changes be targets of this survey for reference in software evolution research. We created a three-tiered model to represent the relationships among IDEs, recording techniques, and application techniques. This paper also presents common features of the techniques and their details.

In software configuration management, it is important to separate source code changes into meaningful units before committing them (in short, Task Level Commit). However, developers often commit unrelated code changes in a single transaction. To support Task Level Commit, an existing technique uses an editing history of source code and enables developers to group the editing operations in the history. This paper proposes an automated technique for grouping editing operations in a history based on several criteria including source files, classes, methods, comments, and times editted. We show how our technique reduces developers' separating cost compared with the manual approach.

Goal-oriented requirements analysis (GORA) is a promising technique in requirements engineering, especially requirements elicitation. This paper aims at developing a technique to support the improvement of goal graphs, which are resulting artifacts of GORA. We consider that the technique of improving existing goals of lower quality is more realistic rather than that of creating a goal graph of high quality from scratch. To achieve the proposed technique, we define quality properties for each goal formally. Our quality properties result from IEEE Std 830 and past related studies. To define them formally, using attribute values of an attributed goal graph, we formulate predicates for deciding if a goal satisfies a quality property or not. We have implemented a supporting tool to show a requirements analyst the goals which do not satisfy the predicates. Our experiments using the tool show that requirements analysts can efficiently find and modify the qualitatively problematic goals.

Goal-oriented requirements analysis (GORA) is one of the promising techniques to elicit software requirements, and it is natural to consider its application to security requirements analysis. In this paper, we proposed a method for goal-oriented security requirements analysis using security knowledge which is derived from several security targets (STs) compliant to Common Criteria (CC, ISO/IEC 15408). We call such knowledge security ontology for an application domain (SOAD). Three aspects of security such as confidentiality, integrity and availability are included in the scope of our method because the CC addresses these three aspects. We extract security-related concepts such as assets, threats, countermeasures and their relationships from STs, and utilize these concepts and relationships for security goal elicitation and refinement in GORA. The usage of certificated STs as knowledge source allows us to reuse efficiently security-related concepts of higher quality. To realize our proposed method as a supporting tool, we use an existing method GOORE (goal-oriented and ontology-driven requirements elicitation method) combining with SOAD. In GOORE, terms and their relationships in a domain ontology play an important role of semantic processing such as goal refinement and conflict identification. SOAD is defined based on concepts in STs. In contrast with other goal-oriented security requirements methods, the knowledge derived from actual STs contributes to eliciting security requirements in our method. In addition, the relationships among the assets, threats, objectives and security functional requirements can be directly reused for the refinement of security goals. We show an illustrative example to show the usefulness of our method and evaluate the method in comparison with other goal-oriented security requirements analysis methods.

Software must be continually evolved to keep up with users’ needs. In this article, we propose a new taxonomy of software evolution. It consists of three perspectives: methods, targets, and objectives of evolution. We also present a literature review on software evolution based on our taxonomy. The result could provide a concrete baseline in discussing research trends and directions in the field of software evolution.

Software requirements elicitation is a cooperative work by stakeholders. It is important for project managers and analysts to understand stakeholder concerns and to identify potential problems such as imbalance or lack of stakeholders. This paper presents a technique and a tool which visualize the strength of stakeholders' interest of concerns on two dimensional screens. The tool generates anchored maps from an attributed goal graph by AGORA, which is an extended version of goal-oriented analysis methods. It has stakeholders' interest to concerns and its degree as the attributes of goals. Additionally an experimental evaluation is described, whose results show the user of the tool could identify imbalance and lack of stakeholders more accurately in shorter time than the case with a table of stakeholders and requirements.

Requirements changes frequently occur at any time of a software development process, and their management is a crucial issue to develop software of high quality. Meanwhile, goal-oriented analysis techniques are being put into practice to elicit requirements. In this situation, the change management of goal graphs and its support are necessary. This paper presents a technique related to the change management of goal graphs, realizing impact analysis on a goal graph when its modifications occur. Our impact analysis detects conflicts that arise when a new goal is added, and investigates the achievability of the other goals when an existing goal is deleted. We have implemented a supporting tool for automating the analysis. Two case studies suggested the efficiency of the proposed approach.

This paper proposes an interactive approach for efficiently understanding a feature implementation by applying feature location (FL). Although existing FL techniques can reduce the understanding cost, it is still an open issue to construct the appropriate inputs for the techniques. In our approach, the inputs of FL are incrementally improved by interactions between users and the FL system. By understanding a code fragment obtained using FL, users can find more appropriate queries from the identifiers in the fragment. Furthermore, the relevance feedback, obtained by partially judging whether or not a code fragment is required to understand, improves the evaluation score of FL. Users can then obtain more accurate results. We have implemented a supporting tool of our approach. Evaluation results using the tool show that our interactive approach is feasible and that it can reduce the understanding cost more effectively than the non-interactive approach.

Software engineers require knowledge about a problem domain when they elicit requirements for a system about the domain. Explicit descriptions about such knowledge such as domain ontology contribute to eliciting such requirements correctly and completely. Methods for eliciting requirements using ontology have been thus proposed, and such ontology is normally developed based on documents and/or experts in the problem domain. However, it is not easy for engineers to elicit requirements correctly and completely only with such domain ontology because they are not normally experts in the problem domain. In this paper, we propose a method and a tool to enhance domain ontology using Web mining. Our method and the tool help engineers to add additional knowledge suitable for them to understand domain ontology. According to our method, candidates of such additional knowledge are gathered from Web pages using keywords in existing domain ontology. The candidates are then prioritized based on the degree of the relationship between each candidate and existing ontology and on the frequency and the distribution of the candidate over Web pages. Engineers finally add new knowledge to existing ontology out of these prioritized candidates. We also show an experiment and its results for confirming enhanced ontology enables engineers to elicit requirements more completely and correctly than existing ontology does.

Object Constraint Language (OCL) is frequently applied in software development for stipulating formal constraints on software models. Its platform-independent characteristic allows for wide usage during the design phase. However, application in platform-specific processes, such as coding, is less obvious because it requires usage of bespoke tools for that platform. In this paper we propose an approach to generate assertion code for OCL constraints for multiple platform specific languages, using a unified framework based on structural similarities of programming languages. We have succeeded in automating the process of assertion code generation for four different languages using our tool. To show effectiveness of our approach in terms of development effort, an experiment was carried out and summarised.

It is difficult to estimate how a combination of implementation technologies influences quality attributes on an entire system. In this paper, we propose a technique to choose implementation technologies by modeling casual dependencies between requirements and technoloies probabilistically using Bayesian networks. We have implemented our technique on a Bayesian network tool and applied it to a case study of a business application to show its effectiveness.

This paper discusses recent studies on technologies for supporting software construction and maintenance by analyzing various software engineering data. We also introduce typical data mining techniques for analyzing the data.

This paper proposes a technique for selecting the most appropriate alternative of source code changes based on the commitment of a software development project by each developer of the project. In the technique, we evaluate the alternative changes by using an evaluation function with integrating multiple software metrics to suppress the influence of each developer’s subjectivity. By regarding the selection of the alternative changes as a multiple criteria decision making, we create the function with Analytic Hierarchy Process. A preliminary evaluation shows the efficiency of the technique.

This paper proposes a technique for detecting the occurrences of refactoring from source code revisions. In a real software development process, a refactoring operation may sometimes be performed together with other modifications at the same revision. This means that detecting refactorings from the differences between two versions stored in a software version archive is not usually an easy process. In order to detect these impure refactorings, we model the detection within a graph search. Our technique considers a version of a program as a state and a refactoring as a transition between two states. It then searches for the path that approaches from the initial to the final state. To improve the efficiency of the search, we use the source code differences between the current and the final state for choosing the candidates of refactoring to be applied next and estimating the heuristic distance to the final state. Through case studies, we show that our approach is feasible to detect combinations of refactorings.

ATTED-II (http://atted.jp) is a database of gene coexpression in Arabidopsis that can be used to design a wide variety of experiments, including the prioritization of genes for functional identification or for studies of regulatory relationships. Here, we report updates of ATTED-II that focus especially on functionalities for constructing gene networks with regard to the following points: (i) introducing a new measure of gene coexpression to retrieve functionally related genes more accurately, (ii) implementing clickable maps for all gene networks for step-by-step navigation, (iii) applying Google Maps API to create a single map for a large network, (iv) including information about protein-protein interactions, (v) identifying conserved patterns of coexpression and (vi) showing and connecting KEGG pathway information to identify functional modules. With these enhanced functions for gene network representation, ATTED-II can help researchers to clarify the functional and regulatory networks of genes in Arabidopsis.

One of the approaches to improve program understanding is to extract what kinds of design pattern are used in existing object-oriented software. This paper proposes a technique for efficiently and accurately detecting occurrences of design patterns included in source codes. We use both static and dynamic analyses to achieve the detection with high accuracy. Moreover, to reduce computation and maintenance costs, detection conditions are hierarchically specified based on Pree's meta patterns as common structures of design patterns. The usage of Prolog to represent the detection conditions enables us to easily add and modify them. Finally, we have implemented an automated tool as an Eclipse plug-in and conducted experiments with Java programs. The experimental results show the effectiveness of our approach.

A database of coexpressed gene sets can provide valuable information for a wide variety of experimental designs, such as targeting of genes for functional identification, gene regulation and/or protein-protein interactions. Coexpressed gene databases derived from publicly available GeneChip data are widely used in Arabidopsis research, but platforms that examine coexpression for higher mammals are rather limited. Therefore, we have constructed a new database, COXPRESdb (coexpressed gene database) (http://coxpresdb.hgc.jp), for coexpressed gene lists and networks in human and mouse. Coexpression data could be calculated for 19 777 and 21 036 genes in human and mouse, respectively, by using the GeneChip data in NCBI GEO. COXPRESdb enables analysis of the four types of coexpression networks: (i) highly coexpressed genes for every gene, (ii) genes with the same GO annotation, (iii) genes expressed in the same tissue and (iv) user-defined gene sets. When the networks became too big for the static picture on the web in GO networks or in tissue networks, we used Google Maps API to visualize them interactively. COXPRESdb also provides a view to compare the human and mouse coexpression patterns to estimate the conservation between the two species.

Publicly available database of co-expressed gene sets would be a valuable tool for a wide variety of experimental designs, including targeting of genes for functional identification or for regulatory investigation. Here, we report the construction of an Arabidopsis thaliana trans-factor and cis-element prediction database (ATTED-II) that provides co-regulated gene relationships based on co-expressed genes deduced from microarray data and the predicted cis elements. ATTED-II (http://www.atted.bio.titech.ac.jp) includes the following features: (i) lists and networks of co-expressed genes calculated from 58 publicly available experimental series, which are composed of 1388 GeneChip data in A.thaliana; (ii) prediction of cis-regulatory elements in the 200 bp region upstream of the transcription start site to predict co-regulated genes amongst the co-expressed genes; and (iii) visual representation of expression patterns for individual genes. ATTED-II can thus help researchers to clarify the function and regulation of particular genes and gene networks.

Refactoring is one of the promising techniques for improving program design by means of program transformation with preserving behavior, and is widely applied in practice. However, it is difficult for engineers to identify how and where to refactor programs, because proper knowledge and skills of a high order are required of them. In this paper, we propose the technique to instruct how and where to refactor a program by using a sequence of its modifications. We consider that the histories of program modifications reflect developers' intentions, and focusing on them allows us to provide suitable refactoring guides. Our technique can be automated by storing the correspondence of modification patterns to suitable refactoring operations. By implementing an automated supporting tool, we show its feasibility. The tool is implemented as a plug-in for Eclipse IDE. It selects refactoring operations by matching between a sequence of program modifications and modification patterns.

Goal refinement is a crucial step in goal-oriented requirements analysis to create a goal model of high quality. Poor goal refinement leads to missing requirements and eliciting incorrect requirements as well as less comprehensiveness of produced goal models. This paper proposes a technique to automate detecting \textit{bad smells} of goal refinement, symptoms of poor goal refinement. Based on the classification of poor refinement, we defined four types of bad smells of goal refinement and developed two types of measures to detect them: measures on the graph structure of a goal model and semantic similarity of goal descriptions. We have implemented a support tool to detect bad smells and assessed its usefulness by an experiment.

In the world of the Internet of Things, heterogeneous systems and devices need to be connected. A key issue for systems and devices is data interoperability such as automatic data exchange and interpretation. A well-known approach to solve the interoperability problem is building a conceptual model (CM). Regarding CM in industrial domains, there are often a large number of entities defined in one CM. How data interoperability with such a large-scale CM can be supported is a critical issue when applying CM into industrial domains. In this paper, evolved from our previous work, a meta-model equipped with new concepts of “PropertyRelationship” and “Category” is proposed, and a tool called FSCM supporting the automatic generation of property relationships and categories is developed. A case study in an industrial domain shows that the proposed approach effectively improves the data interoperability of large-scale CMs.

Goal-oriented requirements analysis (GORA) has been growing in the area of requirement engineering. It is one of the approaches that elicits and analyzes stakeholders’ requirements as goals to be achieved, and develops an AND-OR graph, called a goal graph, as a result of requirements elicitation. However, although it is important to involve stakeholders’ ideas and viewpoints during requirements elicitation, GORA still has a problem that their processes lack the deeper participation of stakeholders. Regarding stakeholders’ participation, creativity techniques have also become popular in requirements engineering. They aim to create novel and appropriate requirements by involving stakeholders. One of these techniques, the KJ-method is a method which organizes and associates novel ideas generated by Brainstorming. In this paper, we present an approach to support stakeholders’ participation during GORA processes by transforming an affinity diagrams of the KJ-method, into a goal graph, including transformation guidelines, and also apply our approach to an example.

Code smells are considered to be indicators of design flaws or problems in source code. Various tools and techniques have been proposed for detecting code smells. The number of code smells detected by these tools is generally large, so approaches have also been developed for prioritizing and filtering code smells. However, the lack of empirical data regarding how developers select and prioritize code smells hinders improvements to these approaches. In this study, we investigated professional developers to determine the factors they use for selecting and prioritizing code smells. We found that \textit{Task relevance} and \textit{Smell severity} were most commonly considered during code smell selection, while \textit{Module importance} and \textit{Task relevance} were employed most often for code smell prioritization. These results may facilitate further research into code smell detection, prioritization, and filtration to better focus on the actual needs of developers.

Utilizing software architecture patterns is important for reducing maintenance costs. However, maintaining code according to the constraints defined by the architecture patterns is time-consuming work. As described herein, we propose a technique to detect code fragments that are incompliant to the architecture as fine-grained architectural violations. For this technique, the dependence graph among code fragments extracted from the source code and the inference rules according to the architecture are the inputs. A set of candidate components to which a code fragment can be affiliated is attached to each node of the graph and is updated step-by-step. The inference rules express the components' responsibilities and dependency constraints. They remove candidate components of each node that do not satisfy the constraints from the current estimated state of the surrounding code fragment. If the current result does not include the current component, then it is detected as a violation. By defining inference rules for MVC2 architecture and applying the technique to web applications using Play Framework, we obtained accurate detection results.

Dynamic feature location techniques (DFLTs), which use execution profiles of scenarios that trigger a feature, are a promising approach to locating features in the source code. A sufficient set of scenarios is key to obtaining highly accurate results; however, its preparation is laborious and difficult in practice. In most cases, a scenario exercises not only the desired feature but also other features. We focus on the relationship between a module and multiple features that can be calculated with no extra scenarios, to improve the accuracy of locating the desired feature in the source code. In this paper, we propose a DFLT using the odds ratios of the multiple relationships between modules and features. We use the similarity coefficient, which is used in fault localization techniques, as a relationship. Our DFLT better orders shared modules compared with an existing DFLT. To reduce developer costs in our DFLT, we also propose a filtering technique that uses formal concept analysis. We evaluate our DFLT on the features of an open source software project with respect to developer costs and show that our DFLT outperforms the existing approach; the average cost of our DFLT is almost half that of the state-of-the-art DFLT.

To develop with lower costs information systems that do not violate regulations, it is necessary to elicit requirements compliant to the regulations. Automated supports allow us to avoid missing requirements necessary to comply with regulations and to exclude functional requirements against the regulations. In this paper, we propose a technique to detect goals relevant to regulations in a goal model and to add goals so that the resulting goal model can be compliant to the regulations. In this approach, we obtain the goals relevant to regulations by semantically matching goal descriptions to regulatory statements. We use Case Grammar approach to deal with the meaning of goal descriptions and regulatory statements, i.e., both are transformed to case frames as their semantic representations, and we check if their case frames can be unified. After detecting the relevant goals, based on the modality of matched regulatory statements, new goals to realize the compliance to regulatory statements are added to the goal model. We made case studies and had a result that 93\% of regulatory violations could be corrected.

Failures of precondition checking when attempting to apply automated refactorings often discourage programmers from attempting to use these refactorings in the future. To alleviate this situation, the postponement of the failed refactoring instead its cancellation is beneficial. This poster paper proposes a new concept of postponable refactoring and a prototype tool that implements postponable Extract Method as an Eclipse plug-in. We believe that this refactoring tool inspires a new field of reconciliation automated and manual refactoring.

Because numerous code smells are revealed by code smell detectors, many attempts have been undertaken to mitigate related problems by prioritizing and filtering code smells. We earlier proposed a technique to prioritize code smells by leveraging the context of the developers, i.e., the modules that the developers plan to implement. Our empirical studies revealed that the results of code smells prioritized using our technique are useful to support developers' implementation on the modules they intend to change. Nonetheless, in software change processes, developers often navigate through many modules and refer to them before making actual changes. Such modules are important when considering the developers' context. Therefore, it is essential to ascertain whether our technique can also support developers on modules to which they are going to refer to make changes. We conducted an empirical study of an open source project adopting tools for recording developers' interaction history. Our results demonstrate that the code smells prioritized using our approach can also be used to support developers for modules to which developers are going to refer, irrespective of the need for modification.

[Context & motivation] To develop information systems providing high business value, we should clarify As-is business processes and information systems supporting them, identify the problems hidden in them, and develop To-be information systems so that the identified problems can be solved. [Question/problem] In this development, we need a technique to support the identification of the problems, which can be seamlessly connected to the modeling techniques. [Principal ideas/results] In this paper, to define metrics to extract problems of the As-is system, following the domains specific to it, we propose the combination of Goal-Question-Metric (GQM) with existing requirements analysis techniques. Furthermore, we integrate goal-oriented requirements analysis (GORA) with problem frames approach and use case modeling to define the metrics of measuring the problematic efforts of human actors in the As-is models. This paper includes a case study of a reporting operation process at a brokerage office to check the feasibility of our approach. [Contribution] Our contribution is the proposal of using of GQM to identify the problems of an As-is model specified with the combination of GORA, use case modeling, and problem frames.

Behavior preservation often bothers programmers in refactoring. This poster paper proposes a new approach that tames the behavior preservation by introducing the concept of a frame. A frame in refactoring defines stakeholder's individual concerns about the refactored code. Frame-based refactoring preserves the observable behavior within a particular frame. Therefore, it helps programmers distinguish the behavioral changes that they should observe from those that they can ignore.

Feature location (FL) is an important activity for finding correspondence between software features and modules in source code. Although dynamic FL techniques are effective, the quality of their results depends on analysts to prepare sufficient scenarios for exercising the features. In this paper, we propose a technique for guiding identification of missing scenarios using the prior FL result. After applying FL, unexplored call dependencies are extracted by comparing the results of static and dynamic analyses, and analysts are advised to investigate them for finding missing scenarios. We propose several metrics that measure the potential impact of unexplored dependencies to help analysts sort out them. Through a preliminary evaluation using an example web application, we showed our technique was effective for recommending the clues to find missing scenarios.

A socio-technical system (STS) consists of many different actors such as people, organizations, software applications and infrastructures. We call actors except both people and organizations machines. Machines should be carefully introduced into the STS because the machines are beneficial to some people or organization but harmful to others. We thus propose a goal-oriented requirements modelling language called GDMA based on i* so that machines with the following characteristics can be systematically specified. First, machines make the goals of each people be achieved more and better than ever. Second, machines make people achieve goals fewer and easier than ever. We also propose analysis techniques of GDMA to judge whether or not the introduction of machines are appropriate or not. Several machines are introduced into an as-is model of GDMA locally with the help of model transformation techniques. Then, such an introduction is evaluated globally on the basis of metrics derived from the model structure. We confirmed that GDMA could evaluate the successful and failure of existing projects.

To find opportunities for applying prefactoring, several techniques for detecting bad smells in source code have been proposed. Existing smell detectors are often unsuitable for developers who have a specific context because these detectors do not consider their current context and output the results that are mixed with both smells that are and are not related to such context. Consequently, the developers must spend a considerable amount of time identifying relevant smells. As described in this paper, we propose a technique to prioritize bad code smells using developers' context. The explicit data of the context are obtained using a list of issues extracted from an issue tracking system. We applied impact analysis to the list of issues and used the results to specify which smells are associated with the context. Consequently, our approach can provide developers with a list of prioritized bad code smells related to their current context. Several evaluations using open source projects demonstrate the effectiveness of our technique.

Threats to existing systems help requirements analysts to elicit security requirements for a new system similar to such systems because security requirements specify how to protect the system against threats and similar systems require similar means for protection. We propose a method of finding potential threats that can be used for eliciting security requirements for such a system. The method enables analysts to find additional security requirements when they have already elicited one or a few threats. The potential threats are derived from several security targets (STs) in the Common Criteria. An ST contains knowledge related to security requirements such as threats and objectives. It also contains their explicit relationships. In addition, individual objectives are explicitly related to the set of means for protection, which are commonly used in any STs. Because we focus on such means to find potential threats, our method can be applied to STs written in any languages, such as English or French. We applied and evaluated our method to three different domains. In our evaluation, we enumerated all threat pairs in each domain. We then predicted whether a threat and another in each pair respectively threaten the same requirement according to the method. The recall of the prediction was more than 70% and the precision was 20 to 40% in three domains.

In order to develop secure information systems with less development cost, it is important to elicit the requirements to security functions (simply security requirements) as early in their development process as possible. To achieve it, accumulated knowledge of threats and their objectives obtained from practical experiences is useful, and the technique to support the elicitation of security requirements utilizing this knowledge should be developed. In this paper, we present the technique for security requirements elicitation using practical knowledge of threats, their objectives and security functions realizing the objectives, which is extracted from Security Target documents compliant to the standard Common Criteria. We show the usefulness of our approach with several case studies.

To check the consistency between requirements specification documents and regulations by using a model checking technique, requirements analysts generate inputs to the model checker, i.e., state transition machines from the documents and logical formulas from the regulatory statements to be verified as properties. During these generation processes, to make the logical formulas semantically correspond to the state transition machine, analysts should take terminology matching where they look for the words in the requirements document having the same meaning as the words in the regulatory statements and unify the semantically same words. In this paper, by using case grammar approach, we propose an automated technique to reason the meaning of words in requirements specification documents by means of cooccurrence constraints on words in case frames, and to generate from regulatory statements the logical formulas where the words are unified to the words of the requirements documents. We have a feasibility study of our proposal with two case studies.

In software configuration management using a version control system, developers have to follow the commit policy of the project. However, preparing changes according to the policy are sometimes cumbersome and time-consuming, in particular when applying large refactoring consisting of multiple primitive refactoring instances. In this paper, we propose a technique for re-organizing changes by recording editing operations of source code. Editing operations including refactoring operations are hierarchically managed based on their types provided by an integrated development environment. Using the obtained hierarchy, developers can easily configure the granularity of changes and obtain the resulting changes based on the configured granularity. We confirmed the feasibility of the technique by applying it to the recorded changes in a large refactoring process.

In this paper, we propose a multi-dimensional extension of goal graphs in goal-oriented requirements engineering in order to support the understanding the relations between goals, i.e., goal refinements. Goals specify multiple concerns such as functions, strategies, and non-functions, and they are refined into sub goals from mixed views of these concerns. This intermixture of concerns in goals makes it difficult for a requirements analyst to understand and maintain goal graphs. In our approach, a goal graph is put in a multi-dimensional space, a concern corresponds to a coordinate axis in this space, and goals are refined into sub goals referring to the coordinates. Thus, the meaning of a goal refinement is explicitly provided by means of the coordinates used for the refinement. By tracing and focusing on the coordinates of goals, requirements analysts can understand goal refinements and modify unsuitable ones. We have developed a supporting tool and made an exploratory experiment to evaluate the usefulness of our approach.

Existing techniques have succeeded to help developers implement new code. However, they are insufficient to help to change existing code. Previous studies have proposed techniques to support bug fixes but other kinds of code changes such as function enhancements and refactorings are not supported by them. In this paper, we propose a novel system that helps developers change existing code. Unlike existing techniques, our system can support any kinds of code changes if similar code changes occurred in the past. Our research is still on very early stage and we have not have any implementation or any prototype yet. This paper introduces our research purpose, an outline of our system, and how our system is different from existing techniques.

This paper presents Historef, a tool for automatin edit history refactoring on Eclipse IDE for Java programs. The aim of our history refactorings is to improve the understandability and/or usability of the history without changing its whole effect. Historef enables us to apply history refactorings to the recorded edit history in the middle of the source code editing process by a developer. By using our integrated tool, developers can commit the refactored edits into underlying SCM repository after applying edit history refactorings so that they are easy to manage their changes based on the performed edits.

A basic clue of feature location available to developers is a description of a feature written in a natural language. However, a description of a feature does not clearly specify the boundary of the feature, while developers tend to locate the feature precisely by excluding marginal modules that are likely outside of the boundary. This paper addresses a question: does a clearer description of a feature enable developers to recognize the same sets of modules as relevant to the feature? Based on the conducted experiment with subjects, we conclude that different descriptions lead to a different set of modules.

We formulate the class responsibility assignment (CRA) problem as the fuzzy constraint satisfaction problem (FCSP) for automating CRA of high quality. Responsibilities are contracts or obligations of objects that they should assume, by aligning them to classes appropriately, quality designs realize. Typical conditions of a desirable design are having a low coupling between highly cohesive classes. However, because of a trade-off among such conditions, solutions that satisfy the conditions moderately are desired, and computer assistance is needed. Additionally, if we have an initial assignment, the improved one by our technique should keep the original assignment as much as possible because it involves with the intention of human designers. We represent such conditions as fuzzy constraints, and formulate CRA as FCSP. That enables us to apply common FCSP solvers to the problem and to derive solution representing a CRA. The conducted preliminary evaluation indicates the effectiveness of our technique.

Software visualization has become a major technique in program comprehension. Although many tools visualize the structure, behavior, and evolution of a program, they have no concern with how a tool user has understood it. Moreover, they miss the stuff the user has left through trial-and-error processes of his/her program comprehension task. This paper presents a source code visualization tool called CodeForest. It uses a forest metaphor to depict source code of Java programs. Each tree represents a class within the program and the collection of trees constitutes a three-dimensional forest. CodeForest helps a user to try a large number of combinations of mapping of software metrics on visual parameters. Moreover, it provides two new types of support: leaving notes that memorize the current understanding and insight along with visualized objects, and automatically recording a user's actions under understanding. The left notes and recorded actions might be used as historical data that would be hints accelerating the current comprehension task.

Feature location is an activity to identify correspondence between features in a system and program elements in source code. After a feature is located, developers need to understand implementation structure around the location from static and/or behavioral points of view. This paper proposes a semi-automatic technique both for locating features and exposing their implementation structures in source code, using a combination of dynamic analysis and two data analysis techniques, sequential pattern mining and formal concept analysis. We have implemented our technique in a supporting tool and applied it to an example of a web application. The result shows that the proposed technique is not only feasible but helpful to understand implementation of features just after they are located.

The elicitation of security requirements is a crucial issue to develop secure business processes and information systems of higher quality. Although we have several methods to elicit security requirements, most of them do not provide sufficient supports to identify security threats. Since threats do not occur so frequently, like exceptional events, it is much more difficult to determine the potentials of threats exhaustively rather than identifying normal behavior of a business process. To reduce this difficulty, accumulated knowledge of threats obtained from practical setting is necessary. In this paper, we present the technique to model knowledge of threats as patterns by deriving the negative scenarios that realize threats and to utilize them during business process modeling. The knowledge is extracted from Security Target documents, based on the international Common Criteria Standard, and the patterns are described with transformation rules on sequence diagrams. In our approach, an analyst composes normal scenarios of a business process with sequence diagrams, and the threat patterns matched to them derives negative scenarios. Our approach has been demonstrated on several examples, to show its practical application.

Automated feature location techniques have been proposed to extract program elements that are likely to be relevant to a given feature. A more accurate result is expected to enable developers to perform more accurate feature location. However, several experiments assessing traceability recovery have shown that analysts cannot utilize an accurate traceability matrix for their tasks. Because feature location deals with a certain type of traceability links, it is an important question whether the same phenomena are visible in feature location or not. To answer that question, we have conducted a controlled experiment. We have asked 20 subjects to locate features using lists of methods of which the accuracy is controlled artificially. The result differs from the traceability recovery experiments. Subjects given an accurate list would be able to locate a feature more accurately. However, subjects could not locate the complete implementation of features in 83% of tasks. Results show that the accuracy of automated feature location techniques is effective, but it might be insufficient for perfect feature location.

Comparing and understanding differences between old and new versions of source code are necessary in various software development situations. However, if changes are tangled with refactorings in a single revision, then the resulting source code differences are more complicated. We propose an interactive difference viewer which enables us to separate refactoring effects from source code differences for improving the understandability of the differences.

Feature location (FL) in source code is an important task for program understanding. Existing dynamic FL techniques depend on sufficient scenarios for exercising the features to be located. However, it is difficult to prepare such scenarios because it involves a correct understanding of the features. This paper proposes an incremental technique for refining the identification of features integrated with the existing FL technique using formal concept analysis. In our technique, we classify the differences of static and dynamic dependencies of method invocations based on their relevance to the identified features. According to the classification, the technique suggests method invocations to exercise unexplored part of the features. An application example indicates the effectiveness of the approach.

When information systems are introduced in a social setting such as a business, the systems will give bad and good impacts on stakeholders in the setting. Requirements analysts have to predict such impacts in advance because stakeholders cannot decide whether the systems are really suitable for them without such prediction. In this paper, we propose a method based on model transformation patterns for introducing suitable information systems. We use metrics of a model to predict whether a system introduction is suitable for a social setting. Through a case study, we show our method can avoid an introduction of a system, which was actually bad for some stakeholders. In the case study, we use a strategic dependency model in i* to specify the model of systems and stakeholders, and attributed graph grammar for model transformation. We focus on the responsibility and the satisfaction of stakeholders as the criteria for suitability about systems introduction in this case study.

This paper proposes a concept for refactoring an edit history of source code and a technique for its automation. The aim of our history refactoring is to improve the clarity and usefulness of the history without changing its overall effect. We have defined primitive history refactorings including their preconditions and procedures, and large refactorings composed of these primitives. Moreover, we have implemented a supporting tool that automates the application of history refactorings in the middle of a source code editing process. Our tool enables developers to pursue some useful applications using history refactorings such as task level commit from an entangled edit history and selective undo of past edit operations.

Change-aware development environments have recently become feasible and reasonable. These environments can automatically record fine-grained code changes on a program and allow programmers to replay the recorded changes in chronological order. However, they do not always need to replay all the code changes to investigate how a particular entity of the program has been changed. Therefore, they often skip several code changes of no interest. This skipping action is an obstacle that makes many programmers hesitate in using existing replaying tools. This paper proposes a slicing mechanism that can extract only code changes necessary to construct a particular class member of a Java program from the whole history of past code changes. In this mechanism, fine-grained code changes are represented by edit operations recorded on source code of a program. The paper also presents a running tool that implements the proposed slicing and replays its resulting slices. With this tool, programmers can avoid replaying edit operations nonessential to the construction of class members they want to understand.

We propose a method to explore how to improve business by introducing information systems. We use a meta-modeling technique to specify the business itself and its metrics. The metrics are defined based on the structural information of the business model, so that they can help us to identify whether the business is good or not with respect to several different aspects. We also use a model transformation technique to specify an idea of the business improvement. The metrics help us to predict whether the improvement idea makes the business better or not. We use strategic dependency (SD) models in i* to specify the business, and attributed graph grammar (AGG) for the model transformation.

This paper proposes structured location, a semiautomatic technique and its supporting tool both for locating features and exposing their structures in source code, using a combination of dynamic analysis, sequential pattern mining and formal concept analysis.

Locating features in software composed of multiple layers is a challenging problem because we have to find program elements distributed over layers, which still work together to constitute a feature. This paper proposes a semi-automatic technique to extract correspondence between features and program elements among layers. By merging execution traces of each layer to feed into formal concept analysis, collaborative program elements are grouped into formal concepts and tied with a set of execution scenarios. We applied our technique to an example of web application composed of three layers. The result indicates that our technique is not only feasible but promising to promote program understanding in a more realistic context.

Comparing and understanding differences between old and new versions of source code are necessary in various software development situations. However, if refactoring is applied between those versions, then the source code differences are more complicated, and understanding them becomes more difficult. Although many techniques for extracting refactoring effects from the differences have been studied, it is necessary to exclude the extracted refactorings' effects and reconstruct the differences for meaningful and understandable ones with no refactoring effect. As described in this paper, we propose a novel technique to address this difficulty. Using our technique, we extract the refactoring effects and then apply them to the old version of source code to produce the differences without refactoring effects. We also implemented a support tool that helps separate refactorings automatically. An evaluation of open source software showed that our tool is applicable to all target refactorings. Our technique is therefore useful in real situations. Evaluation testing also demonstrated that the approach reduced the code differences more than 21\%, on average, and that developers can understand more changes from the differences using our approach than when using the original one in the same limited time.

Although a responsibility driven approach in object oriented analysis and design methodologies is promising, the assignment of the identified responsibilities to classes (simply, class responsibility assignment: CRA) is a crucial issue to achieve design of higher quality. The GRASP by Larman is a guideline for CRA and is being put into practice. However, since it is described in an informal way using a natural language, its successful usage greatly relies on designers' skills. This paper proposes a technique to represent GRASP formally and to automate appropriate CRA based on them. Our computerized tool automatically detects inappropriate CRA and suggests alternatives of appropriate CRAs to designers so that they can improve a CRA based on the suggested alternatives. We made preliminary experiments to show the usefulness of our tool.

We propose an ontology-based technique for recovering traceability links between a natural language sentence specifying features of a software product and the source code of the product. Some software products have been released without detailed documentation. To automatically detect code fragments associated with sentences describing a feature, the relations between source code structures and problem domains are important. We model the knowledge of the problem domains as domain ontologies having concepts of the domains and their relations. Using semantic relations on the ontologies in addition to method invocation relations and the similarity between an identifier on the code and words in the sentences, we locate the code fragments corresponding to the given sentences. Additionally, our prioritization mechanism which orders the located results of code fragments based on the ontologies enables users to select and analyze the results effectively. To show effectiveness of our approach in terms of accuracy, a case study was carried out with our proof-ofconcept tool and summarized.

This paper proposes a formalized technique for generating finer-grained source code deltas according to a developer's editing intentions. Using the technique, the developer classifies edit operations of source code by annotating the time series of the edit history with the switching information of their editing intentions. Based on the classification, the history is sorted and converted automatically to appropriate source code deltas to be committed separately to a version repository. This paper also presents algorithms for automating the generation process and a prototyping tool to implement them.

We propose iFL, an interactive environment that is useful for effectively understanding feature implementation by application of feature location (FL). With iFL, the inputs for FL are improved incrementally by interactions between users and the FL system. By understanding a code fragment obtained using FL, users can find more appropriate queries from the identifiers in the fragment. Furthermore, the relevance feedback obtained by partially judging whether or not a fragment is relevant improves the evaluation score of FL. Users can then obtain more accurate results. Case studies with iFL show that our interactive approach is feasible and that it can reduce the understanding cost more effectively than the non-interactive approach.

Software development is a cooperative work by stakeholders. It is important for project managers and analysts to understand stakeholder concerns and to identify potential problems such as imbalance of stakeholders or lack of stakeholders.\\This paper presents a tool which visualizes the strength of stakeholders' interest of concern on two dimensional screens. The proposed tool generates an anchored map from an attributed goal graph by AGORA, which is an extended version of goal-oriented analysis methods. It has information on stakeholders' interest to concerns and its degree as the attributes of goals.\\Results from the case study are that (1) some concerns are not connected to any stakeholders and (2) a type of stakeholders is interested in different concerns each other. The results suggest that lack of stakeholders for the unconnected concerns and need that a type of stakeholders had better to unify their requirements.

This paper presents an integrated supporting tool for Attributed Goal-Oriented Requirements Analysis (AGORA), which is an extended version of goal-oriented analysis. Our tool assists seamlessly requirements analysts and stakeholders in their activities throughout AGORA steps including constructing goal graphs with group work, utilizing domain ontologies for goal graph construction, detecting various types of conflicts among goals, prioritizing goals, analyzing impacts when modifying a goal graph, and version control of goal graphs.

This paper presents an integrated supporting tool for Attributed Goal-Oriented Requirements Analysis (AGORA), which is an extended version of goal-oriented analysis. Our tool assists seamlessly requirements analysts and stakeholders in their activities throughout AGORA steps including constructing goal graphs with group work, prioritizing goals, and version control of goal graphs.

The Object Constraint Language (OCL) carries a platform independent characteristic allowing it to be decoupled from implementation details, and therefore it is widely applied in model transformations used by model-driven development techniques. However, OCL can be found tremendously useful in the implementation phase aiding assertion code generation and allowing system verification. Yet, taking full advantage of OCL without destroying its platform independence is a difficult task. This paper proposes an approach for generating assertion code from OCL constraints by using a model transformation technique to abstract language specific details away from OCL high-level concepts, showing wide applicability of model transformation techniques. We take advantage of structural similarities of implementation languages to describe a rewriting framework, which is used to easily and flexibly reformulate OCL constraints into any target language, making them executable on any platform. A tool is implemented to demonstrate the effectiveness of this approach.

In this paper, we propose a model-driven development technique specific to the Model-View-Controller architecture domain. Even though a lot of application frameworks and source code generators are available for implementing this architecture, they do depend on implementation specific concepts, which take much effort to learn and use them. To address this issue, we define a UML profile to capture architectural concepts directly in a model and provide a bunch of transformation mappings for each supported platform, in order to bridge between architectural and implementation concepts. By applying these model transformations together with source code generators, our MVC-based model can be mapped to various kind of platforms. Since we restrict a domain into MVC architecture only, automating model transformation to source code is possible. We have prototyped a supporting tool and evaluated feasibility of our approach through a case study. It demonstrates model transformations specific to MVC architecture can produce source code for two different platforms.

This paper proposes an ontology-based technique for recovering traceability links between a natural language sentence specifying features of a software product and the source code of the product. Some software products have been released without detailed documentation. To automatically detect code fragments associated with the functional descriptions written in the form of simple sentences, the relationships between source code structures and problem domains are important. In our approach, we model the knowledge of the problem domains as domain ontologies. By using semantic relationships of the ontologies in addition to method invocation relationships and the similarity between an identifier on the code and words in the sentences, we can detect code fragments corresponding to the sentences. A case study within a domain of painting software shows that we obtained results of higher quality than without ontologies.

This paper proposes a systematic approach to derive feature models required in a software product line development. In our approach, we use goal graphs constructed by goal-oriented requirements analysis. We merge multiple goal graphs into a graph, and then regarding the leaves of the merged graph as the candidates of features, identify their commonality and variability based on the achievability of product goals. Through a case study of a portable music player domain, we obtained a feature model with high quality.

This paper proposes a novel technique to detect the occurrences of refactoring from a version archive, in order to reduce the effort spent in understanding what modifications have been applied. In a real software development process, a refactoring operation may sometimes be performed together with other modifications at the same revision. This means that understanding the differences between two versions stored in the archive is not usually an easily process. In order to detect these impure refactorings, we model the detection within a graph search. Our technique considers a version of a program as a state and a refactoring as a transition. It then searches for the path that approaches from the initial to the final state. To improve the efficiency of the search, we use the source code differences between the current and the final state for choosing the candidates of refactoring to be applied next and estimating the heuristic distance to the final state. We have clearly demonstrated the feasibility of our approach through a case study.

Gene coexpression provides key information to understand living systems because coexpressed genes are often involved in the same or related biological pathways. Coexpression data are now used for a wide variety of experimental designs, such as gene targeting, regulatory investigations and/or identification of potential partners in protein-protein interactions. We constructed two databases for Arabidopsis (ATTED-II, http://www.atted.bio.titech.ac.jp) and mammals (COXPRESdb, http://coxpresdb.hgc.jp). Based on pairwise gene coexpression, coexpressed gene networks were prepared in these databases. To support gene coexpression, known protein-protein interactions, common metabolic pathways and conserved coexpression were also represented on the networks. We used Google Maps API to visualize large networks interactively. The relationships of the coexpression database with other large-scale data will be discussed, in addition to data construction procedures and typical usages of coexpression data.

This paper proposes an automated technique to extract prehistories of software refactorings from existing software version archives, which in turn a technique to discover knowledge for finding refactoring opportunities. We focus on two types of knowledge to extract: characteristic modification histories, and fluctuations of the values of complexity measures. First, we extract modified fragments of code by calculating the difference of the Abstract Syntax Trees in the programs picked up from an existing software repository. We also extract past cases of refactorings, and then we create traces of program elements by associating modified fragments with cases of refactorings for finding the structures that frequently occur. Extracted traces help us identify how and where to refactor programs, and it leads to improve the program design.

In this poster, we discuss the need for collecting and analyzing program modification histories, sequences of fine-grained program editing operations. Then we introduce Eclipse plug-ins that can collect and analyze modification histories, and show its useful application technique that can suggest suitable refactoring opportunities to developers by analyzing histories.