Presentations

Remote Development, presented at the CDT Summit 2007, detailing the motivation behind the need for remote development tools and the status of current efforts.

Architecture Requirements

Dependencies

RDT should not introduce new dependencies for other projects (e.g. CDT should not have to depend on RSE)

Extensible Framework

APIs should not be tied to any specific protocol or service provider. We wish to allow for different protocols to be used, potentially at the same time.

E.g., we should not be assuming everything is SSH

E.g., we should not be assuming everything is done via RSE.

Vendor Neutrality

APIs and reference implementations should be vendor neutral. We wish to have frameworks and solutions that have value for the entire community.

Ease of Use

User experience for local scenarios should not change in any appreciable way in the presence of remote tools.

We should try to hide the remote functionality from the user wherever possible. For the average user, once they have chosen a remote project and set up their connections, it should "just work".

The primary use case we are trying to address is users that have the entirety of their project residing on a single remote machine. We should endeavour to make this scenario as simple to configure and use as possible. For example, although it is necessary for the user to set up a connection to the remote machine, they should not be required to set this up for each individual service if all of their services are coming from the same machine via the same connection method.

It should be easy to select a sensible configuration of services (perhaps automatically) that make up the user's environment. Choices of certain services should influence the defaults for other services.

It should be possible to specify service configurations separately for different build configurations.

Configuration should be done as much as possible through the New Project Wizard. Users should be prompted at this point to set up the required remote connections.

Use Cases

This needs fleshing out into some more detailed documents. Here are some basic use cases:

Entirely Local

Local edit with remote build and/or debug

User is developing for a single remote machine which houses all services (edit, build, debug, etc.).

User is developing for a remote machine but different services reside on different machines (e.g. build on one machine, debug on another).

User is doing multi-platform development for many different remote machines, each of which houses all services, but depending on what build configuration they are targeting, they wish the services to reside on the corresponding target machine.

General

We will develop some core remote services infrastructure to support remote projects in CDT. This infrastructure will include:

CModel

CEditor

Class Hierarchy View

Type Hierarchy View

Search

Content Assist

C Project Model

[Greg Watson]

Parsing and Indexing

[Chris Recoskie]

Many of the most compelling features Eclipse offers over other development environments require knowledge of the structure of the user's source code. Source code navigation, content assist, intelligent search, the call hierarchy and type hierarchy views, the include browser, refactoring, and other features all require parsing the user's source code and producing an index that allows name-based lookups of source code elements.

Parsing and indexing are both very CPU and memory intensive operations, and good performance is a key requirement if these features are to be used by the average user. The remote scenario provides for some unique, additional challenges which have to be overcome in order for the features to work quickly and correctly.

Some important points to consider:

Network mounting the files and operating on them "locally" has been proven to be slow, even on fast (100 megabit) connections with very few intermediate hops.

Downloading the entire set of remote files (both project files and included system headers, which are not generally found on the local machine) is similarly slow.

Sometimes the remote machine uses a different text codepage encoding than the local machine. This means that not only must the source files be transferred, but they may have to undergo a codepage conversion process, which slows things down even further.

Abstract Syntax Trees (ASTs) and indices are typically much larger than the original source code from which they are produced, because they store much more information. I.e., they store a certain amount of syntactic and/or semantic understanding, which is inherently more information than is imparted by the raw bytes that correspond to the source text. As such, it's even more impractical to transfer ASTs or indices than it is to just transfer the original source.

The way a file needs to be parsed in order to be interpreted correctly is often dependent upon the manner in which the file is compiled. E.g., macros and include paths may be defined/redefined on the command line of any individual compilation of any individual file. A successful parse requires that those same macros and include paths be provided to the parser when it runs.
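To make this concrete, the fragment below sketches how the macros and include paths embedded in a compile command might be recovered so the parser can be configured the same way the compiler was. The class name and parsing logic are purely illustrative, not CDT's actual scanner-info implementation.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: extract macro definitions (-D) and include paths (-I)
// from a compiler command line, so the same settings can be fed to the parser.
public class ScannerInfoSketch {
    public final Map<String, String> macros = new LinkedHashMap<>();
    public final List<String> includePaths = new ArrayList<>();

    public static ScannerInfoSketch fromCommandLine(String commandLine) {
        ScannerInfoSketch info = new ScannerInfoSketch();
        for (String arg : commandLine.trim().split("\\s+")) {
            if (arg.startsWith("-D")) {
                String def = arg.substring(2);
                int eq = def.indexOf('=');
                // A bare -DFOO conventionally defines FOO as 1
                if (eq < 0) info.macros.put(def, "1");
                else info.macros.put(def.substring(0, eq), def.substring(eq + 1));
            } else if (arg.startsWith("-I")) {
                info.includePaths.add(arg.substring(2));
            }
        }
        return info;
    }
}
```

A real implementation would also have to handle quoting, response files, and compiler-specific flags; the point is only that per-file build settings must travel with the file to wherever parsing happens.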

Often the remote machine has greater CPU power than the local machine, so it can complete parsing and indexing tasks more quickly than the local machine could.

Remote machines are often accessed from geographically separated locations. The intermediate topology of the network can be complicated, with many hops and slow links. As such, in order to maintain performance it is important for as little data as possible to be transferred back and forth between the local machine and the remote machine.

As such, we feel that if the Remote Development Tools are to be successful, then they must provide remote services that allow the user to do all of the parsing and indexing on the remote machine. The local machine can query the remote machine for data it is interested in, and only this data gets transferred over the remote connection.

Design

The IBM team has been working on a design for such a framework in the context of CDT.

The design document above is somewhat out of date. The design is currently being refactored to utilize the Service Model For Remote Projects discussed above rather than the DAO framework listed in the document. APIs will be made somewhat more concrete, as the service model will now specify what services are to be used, whereas the DAO framework would try to infer backwards from a given resource what provider was supposed to service it.

Prototype

A proof-of-concept of this design was implemented by the IBM team, using the Remote Systems Explorer's dstore protocol, that demonstrates the ability to index remote files and display the Call Hierarchy for C functions defined in the file. CDT's parser and indexer are deployed as standalone JARs on the remote machine, and the RSE dstore daemon is extended with indexing and call hierarchy commands which are translated into operations on the generated index.

The only data sent back and forth across the wire are:

A list of files to index

When call hierarchy is requested for a function, the name of the function and its location in the source file from which it was highlighted are sent. The call hierarchy service responds with a list of callers/callees, their locations in the source, and any other relevant properties (such as their type).
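The messages above are small and structured, which is what keeps the wire traffic light. The sketch below models them as plain Java data types, with a stand-in for the remote service that would normally answer from its index; all names and the hard-coded response are illustrative assumptions, not the prototype's actual dstore commands.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical message types for the call-hierarchy exchange described above.
// Only these small records cross the connection; the AST and index stay remote.
public class CallHierarchySketch {
    public static class Request {
        public final String functionName;
        public final String file;
        public final int offset; // location where the function was highlighted
        public Request(String functionName, String file, int offset) {
            this.functionName = functionName; this.file = file; this.offset = offset;
        }
    }
    public static class Caller {
        public final String name, file, type;
        public final int line;
        public Caller(String name, String file, int line, String type) {
            this.name = name; this.file = file; this.line = line; this.type = type;
        }
    }
    // Stand-in for the remote service: a real one would consult the index
    // built on the remote machine; this one returns a canned answer.
    public static List<Caller> queryCallers(Request req) {
        List<Caller> callers = new ArrayList<>();
        if (req.functionName.equals("main")) return callers; // nothing calls main
        callers.add(new Caller("main", "main.c", 42, "function"));
        return callers;
    }
}
```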

Update times of the Call Hierarchy UI for functions with 100 callers are in the sub-1-second range (not including the up-front indexing time, which is comparable to the time required to index in the local scenario), which seems acceptable.

Build

[Chris Recoskie]

There are currently two builders planned:

Remote Standard Make: a remote-enabled version of CDT's Standard Make. Essentially, a make-based builder that uses a remote shell protocol (e.g. SSH) to invoke a builder (e.g. GNU make) on a user-crafted buildfile (makefile). We are planning on tackling this after EFS support, hopefully in time for CDT 5.0, a.k.a. Ganymede.

Remote Managed Build: a remote-enabled version of CDT's Managed Build. In this case the Managed Build System would know about your compile settings and execute the required build commands directly via the shell protocol (e.g. SSH). The implementation date is currently TBD.
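As a rough illustration of the Remote Standard Make idea, the sketch below assembles the shell command such a builder might hand to a remote shell service. The host name, paths, and command layout are assumptions for illustration; a real builder would route this through the configured connection rather than a literal ssh invocation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the command Remote Standard Make might run over SSH.
public class RemoteMakeSketch {
    public static List<String> buildCommand(String host, String remoteDir, String target) {
        List<String> cmd = new ArrayList<>();
        cmd.add("ssh");
        cmd.add(host);
        // cd into the remote project directory, then invoke the user's makefile;
        // a real builder would stream the output back to the Error Parser Manager.
        cmd.add("cd " + remoteDir + " && make " + target);
        return cmd;
    }
}
```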

Scanner Info

[Jason Montojo]

Error Parser Manager

[Jason Montojo]

New Project Wizard

[Greg Watson]

Service Model

[Greg Watson]

Service Extension Point

Each service must provide:

service ID

service name

a list of project natures for which the service applies, or "all" if it is a universal service that applies to all natures

Service Provider Extension Points

Each provider extension point must minimally provide:

an ID field

a name field

ID of the service being implemented

a providerClass, although the type of this will vary between services since each service will require a different interface

a configurationUIClass which launches a dialog/wizard to allow the service to be configured (which may include setting up and associating any required connections)
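A plugin.xml contribution covering both extension points might look like the fragment below. This is a sketch only: the extension point IDs, element names, and attribute names are hypothetical placeholders, not a final schema, and the com.example classes stand in for a vendor's provider implementation.

```xml
<!-- Hypothetical plugin.xml fragment; extension point IDs and attribute
     names are illustrative, not the final schema. -->
<extension point="org.eclipse.ptp.rdt.services">
   <service
         id="org.eclipse.ptp.rdt.services.build"
         name="Build Service"
         natures="org.eclipse.cdt.core.cnature,org.eclipse.cdt.core.ccnature"/>
</extension>
<extension point="org.eclipse.ptp.rdt.serviceProviders">
   <provider
         id="com.example.rdt.sshBuildProvider"
         name="SSH Build Provider"
         serviceId="org.eclipse.ptp.rdt.services.build"
         providerClass="com.example.rdt.SSHBuildProvider"
         configurationUIClass="com.example.rdt.SSHBuildConfigurationUI"/>
</extension>
```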

Model

ServiceModelManager

IServiceConfiguration[] getConfigurations(project)

IServiceConfiguration getConfiguration(project, name)

IServiceConfiguration getActiveConfiguration(project)

setActiveConfiguration(project, configuration)

IService[] getServices(nature)

IService[] getServices(project)

IService[] getServices()

IService

IServiceProviderInfo[] getProviderInfos()

getID()

getName()

getNatures()

IServiceProviderInfo

getID()

getName()

IServiceProvider

Extends IServiceProviderInfo

Is an instance as opposed to extension data

boolean isConfigured() method which specifies whether this instance has been configured to the point where it can function (this may return true if the defaults can allow the service to operate as-is)

IServiceConfiguration

IServiceProvider getServiceProvider(service)

setServiceProvider(service, delegate)
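The outline above can be read as a small API surface. The sketch below is a minimal in-memory rendering of it, assuming string IDs for projects and services to stay self-contained; the storage and lookup logic is purely illustrative and not the planned implementation.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Minimal in-memory sketch of the service model outlined above.
public class ServiceModelSketch {
    public interface IServiceProvider { String getID(); boolean isConfigured(); }

    public static class ServiceConfiguration {
        public final String name;
        private final Map<String, IServiceProvider> providers = new HashMap<>();
        public ServiceConfiguration(String name) { this.name = name; }
        public void setServiceProvider(String serviceId, IServiceProvider p) { providers.put(serviceId, p); }
        public IServiceProvider getServiceProvider(String serviceId) { return providers.get(serviceId); }
    }

    private final Map<String, List<ServiceConfiguration>> byProject = new HashMap<>();
    private final Map<String, ServiceConfiguration> active = new HashMap<>();

    public void addConfiguration(String project, ServiceConfiguration c) {
        byProject.computeIfAbsent(project, k -> new ArrayList<>()).add(c);
        active.putIfAbsent(project, c); // first configuration becomes the active one
    }
    public List<ServiceConfiguration> getConfigurations(String project) {
        return byProject.getOrDefault(project, Collections.emptyList());
    }
    public ServiceConfiguration getActiveConfiguration(String project) { return active.get(project); }
    public void setActiveConfiguration(String project, ServiceConfiguration c) { active.put(project, c); }
}
```

Keeping configurations per project, with one marked active, is what lets the same project target different machines per build configuration, as described under Ease of Use.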

Launch and Debug

[Greg Watson]

Provide a launch configuration for remote launches. This will be roughly along the same lines as the RSE remote CDT launch configuration. The user should be able to specify the location of the executable (either local or remote) and the host on which the executable will be launched (either local or remote). Issues to consider include:

May need to copy the executable from the build location to the launch location

The architecture of the build and launch machines may be different

Where to get the environment from

Pre- and post-launch actions (e.g. to move input/output data files)

Provide a launch configuration for remote debugging. Although debugging an executable remotely is supported in CDT (using gdbserver), much of the launch must be performed manually. By extending the debug launch configuration, we will allow the remote services to be used to copy the executable to the remote target, and then invoke the debug server automatically. Issues to consider include:

The architecture differences between the executable and the debug server
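As a small illustration of the automated debug launch, the sketch below assembles the gdbserver invocation that would be run on the remote target after the executable has been copied over. The class is a hypothetical helper; only the gdbserver command syntax itself is standard.

```java
// Hypothetical sketch of the command a remote debug launch might run after
// copying the executable over: start gdbserver listening on a TCP port.
public class RemoteDebugSketch {
    public static String gdbserverCommand(int port, String remoteExecutable) {
        // gdbserver's ":port" form listens on that TCP port; the local gdb
        // then attaches with "target remote host:port".
        return "gdbserver :" + port + " " + remoteExecutable;
    }
}
```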