Open CVS Search

Introduction

When working with large projects, version control systems are immensely helpful. However, developers who use them are often overwhelmed by the high volume of commits in a project. This wastes a lot of their time and makes their work harder. It would be very useful if the IDE itself could retrieve the needed data by searching for particular commits.

The Solution Overview

Eclipse is one of the most widely used IDEs, and CVS is a favored version control system in many organizations. It would therefore be convenient if the Eclipse IDE provided a facility to search CVS by commits and identify the exact artifact that is needed. The interfaces of the proposed CVS search plug-in should look like the images given below.

Search Interface

Search results

The artifacts related to a project should be indexed under several distinct fields, such as commits, created date, modified date, and name of the artifact. When connecting to the version control system, this module should index all files on the local machine that are not yet indexed. When it runs for the first time, the whole project is taken into the index. Since the bulk of the content is text files, the indexing process is not expected to take long. Indexing can also run as a background process, which lets users continue with their work while it is in progress. The files in the version control system will not be duplicated on the local machine; instead, an index built from them will be created and stored locally.
The system will search the index on demand, then retrieve and display the results in the format shown above.
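
The idea of indexing artifacts by field and searching on demand can be sketched with a small in-memory inverted index. This is only an illustration using the Java standard library. The proposal itself expects a framework such as Solr or Lucene to do this work, and the class and field names below (ArtifactEntry, CommitIndex, commitMessage, modifiedMillis) are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical record for one indexed artifact; field names are illustrative.
final class ArtifactEntry {
    final String name;          // name of the artifact (file path)
    final String commitMessage; // commit message attached to the revision
    final long modifiedMillis;  // modified date as epoch milliseconds

    ArtifactEntry(String name, String commitMessage, long modifiedMillis) {
        this.name = name;
        this.commitMessage = commitMessage;
        this.modifiedMillis = modifiedMillis;
    }
}

// Toy inverted index: lower-cased commit-message word -> artifact names.
final class CommitIndex {
    private final Map<String, TreeSet<String>> postings = new HashMap<>();

    // Adds every word of the commit message to the index for this artifact.
    void add(ArtifactEntry entry) {
        for (String token : tokenize(entry.commitMessage)) {
            postings.computeIfAbsent(token, k -> new TreeSet<>()).add(entry.name);
        }
    }

    // Returns, in sorted order, the artifacts whose commit message
    // contains every word of the query. An empty query matches nothing.
    List<String> search(String query) {
        TreeSet<String> result = null;
        for (String token : tokenize(query)) {
            TreeSet<String> hits = postings.get(token);
            if (hits == null) {
                return new ArrayList<>();
            }
            if (result == null) {
                result = new TreeSet<>(hits);
            } else {
                result.retainAll(hits);
            }
        }
        if (result == null) {
            return new ArrayList<>();
        }
        return new ArrayList<>(result);
    }

    // Splits text into lower-case alphanumeric words.
    private static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String part : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!part.isEmpty()) {
                tokens.add(part);
            }
        }
        return tokens;
    }
}
```

Only the index (words and artifact names) is held locally here, not the file contents, which mirrors the point that files are not duplicated on the local machine. A real index would also cover the date and name fields and would persist to disk.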

The Architectural Overview

The system uses a unit-based architecture. Each unit runs independently, and the units communicate with one another. The Architectural Overview is given under Appendix-A. The functions expected from each unit are summarized below.

Presentation Unit

This unit is responsible for dealing with the GUI of the system. It isolates the core of the system from the user.

Presentation Controller Unit

This unit passes the parameters supplied by the user to the core of the system. It also formats the response and hands it to the Presentation Unit for display.

Indexer and Search Engine Unit

This unit indexes data and retrieves it on the user's demand. Data should be fed to this unit in a format that it can index. The unit is expected to run an open source framework such as Solr or Lucene.
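
As an illustration of feeding data in an indexable format: if Solr is chosen, documents can be posted as XML update messages of the form `<add><doc><field name="...">...</field></doc></add>`. The sketch below builds such a message with the standard library only. The class name SolrAddMessage and the field names used in the usage note are assumptions; real field names would have to match the Solr schema chosen for the project.

```java
import java.util.Map;

// Hypothetical helper that renders one artifact as a Solr XML update message.
final class SolrAddMessage {
    // Escapes the characters that are significant in XML text and attributes.
    static String escapeXml(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }

    // Renders the given field/value pairs as a single <add><doc> message.
    // Fields appear in the iteration order of the map.
    static String toAddXml(Map<String, String> fields) {
        StringBuilder xml = new StringBuilder("<add><doc>");
        for (Map.Entry<String, String> f : fields.entrySet()) {
            xml.append("<field name=\"").append(escapeXml(f.getKey())).append("\">")
               .append(escapeXml(f.getValue()))
               .append("</field>");
        }
        return xml.append("</doc></add>").toString();
    }
}
```

Passing a LinkedHashMap with entries such as id, commit, and modified (assumed names) keeps the fields in insertion order. Sending the message to a Solr server is left out of this sketch.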

The Converter Unit

This unit converts files of different formats and feeds the results to the indexer unit. Converters from rich formats to plain text reside in this unit. PDFBox and POI are some of the frameworks that could be used here.
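
One possible shape for this unit is a registry that picks a converter by file extension. The sketch below is an assumption about the design, not a description of an existing implementation: TextConverter and ConverterRegistry are hypothetical names, and the PDFBox or POI based converters would be plugged in behind the same interface rather than called here.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical contract: turn raw file bytes into plain text for the indexer.
interface TextConverter {
    String toPlainText(byte[] content);
}

// Chooses a converter from the file extension (case-insensitive).
final class ConverterRegistry {
    private final Map<String, TextConverter> byExtension = new HashMap<>();

    // Registers a converter for an extension given without the dot, e.g. "txt".
    void register(String extension, TextConverter converter) {
        byExtension.put(extension.toLowerCase(Locale.ROOT), converter);
    }

    // Returns the converted text, or null when no converter handles the file type.
    String convert(String fileName, byte[] content) {
        int dot = fileName.lastIndexOf('.');
        String ext = dot < 0 ? "" : fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        TextConverter converter = byExtension.get(ext);
        return converter == null ? null : converter.toPlainText(content);
    }
}
```

A plain-text converter could be registered as `registry.register("txt", bytes -> new String(bytes, StandardCharsets.UTF_8))`. Returning null for unknown types lets the caller decide whether to skip the file or index only its metadata.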

The Watchdog Unit

This unit takes care of scheduling and related jobs. It identifies the modified files and adds them to the index. It also communicates with the version control system.
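
Identifying modified files can be sketched as a comparison between the current modification timestamps and those recorded at the previous scan. The example below assumes timestamps are supplied as a map from file name to epoch milliseconds; how they are obtained (from the local file system or from CVS) is left open, and ChangeDetector is a hypothetical name.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeSet;

// Remembers the timestamp seen for each file at the last scan.
final class ChangeDetector {
    private final Map<String, Long> indexedAt = new HashMap<>();

    // Returns the files that are new or have a newer timestamp than the one
    // recorded previously, and records the new timestamps for the next scan.
    SortedSet<String> scan(Map<String, Long> currentTimestamps) {
        SortedSet<String> changed = new TreeSet<>();
        for (Map.Entry<String, Long> e : currentTimestamps.entrySet()) {
            Long previous = indexedAt.get(e.getKey());
            if (previous == null || e.getValue() > previous) {
                changed.add(e.getKey());
                indexedAt.put(e.getKey(), e.getValue());
            }
        }
        return changed;
    }
}
```

On the first scan every file is reported, which corresponds to taking the whole project into the index on the first run. A scheduler such as java.util.concurrent.ScheduledExecutorService could call scan periodically in the background. Detecting deleted files is not covered by this sketch.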

The Plug-in Connector

This unit takes care of plugging the system into the central Eclipse system. This isolates the rest of the units from the task of integrating with the Eclipse IDE.

Technologies Used and the Implementation

The main technology used is Solr, which indexes the commits and other important fields. This lets users extract specific data from the index, and the results will carry pointers, links, etc. so that users can locate the original artifact in the version control system. The converter unit is expected to use the PDFBox and POI libraries, along with other technologies, to convert files in other formats to plain text.
The implementation will be done unit by unit, and the units will be integrated at the end. The system is designed to leave ample room for extension and to be flexible enough to be configured easily.

The Conclusion

Open CVS Search is expected to be a search tool that makes locating artifacts in the version control system hassle-free for developers. It is intended to be an easy-to-use, integrated, robust, and flexible plug-in for Eclipse that can search for artifacts in CVS by their commits.