We have developed a relatively complex geoprocessing routine using a C# / ArcObjects (COM) / ArcGIS Server 9.3.1 / ArcSDE 9.3.1 / Oracle 11g stack. Running the routine on a complete dataset takes many, many days and often does not complete.

Troubleshooting is hindered by the opaque nature of COM and the many levels of abstraction.

Specifics aside, could anyone suggest a heuristic for troubleshooting performance problems with ArcObjects? Hopefully the answer could be useful to others.

Edit: Using a profiler (JetBrains dotTrace), I can see that my problem calls are: ITable.GetRow, ITable.Select, ICursor.NextRow, ITable.Search, ISelectionSet.Search, ISelectionSet.get_IDs, IFeatureClass.Select, and IFeatureClass.GetFeature.

5 Answers

You should broaden your search beyond ArcObjects, as suggested in the comments on one of the other answers. Every single one of the methods you list is database-access related.

Consider whether your data is properly indexed. Look at attribute indexes as well as the spatial index, depending on the type of query.

Are you bringing back too many fields? If you only need a few fields, limit your query to those. Bringing back the geometry for feature classes is especially expensive if you never use it anywhere in your code.
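As a sketch of the field-limiting idea above (this assumes the ArcObjects SDK is referenced; the field names and where clause are made up for illustration):

```csharp
// Sketch only -- restrict the returned columns via IQueryFilter.SubFields.
// Omitting SHAPE avoids fetching geometry you never use.
IQueryFilter filter = new QueryFilterClass();
filter.SubFields = "OBJECTID,PARCEL_ID,STATUS";   // hypothetical field names
filter.WhereClause = "STATUS = 'ACTIVE'";          // filter on the server side

// Recycling cursor (second argument true) is cheaper for read-only iteration.
ICursor cursor = table.Search(filter, true);
```

Note that the recycling flag reuses a single row object internally, so don't hold references to rows across iterations when it is set.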

Are you doing operations on the client (your ASP.NET app) that are better suited for the database? Do as much filtering as possible in your where-clause and/or spatial query to avoid transferring unneeded data across the wire from the database.

You're using queries on plain tables (ITable.Search, ITable.Select, etc.). It might make sense to use plain Oracle database access with ADO.NET instead of going through SDE via ArcObjects.

You're using selection sets, which may not be the most efficient way to do whatever you're trying to accomplish. Consider if transferring the data into memory and doing your filtering in plain .NET would be more efficient (even if it might use more memory in your web tier).

I also found it helpful to refactor the DLL so that it can be called by test functions without residing inside a GP service (assuming you are implementing IGpTool).

I wrote a class that implements IGpMessages and passed it to Execute; my implementation writes messages to Debug.Trace, which allows the GP tool to be tested without running inside the ArcSOC process. At the beginning of Execute I check whether the incoming IGpMessages implements IMyGpMessages; if not, I wrap it in an instance of MyGpMessages (which implements both IGpMessages and IMyGpMessages). The MyGpMessages constructor takes the original IGpMessages and keeps it as a contained reference, so messages are written both to Debug.Trace and to the contained IGpMessages.

If the bug only occurs within the GP process, it can be difficult to figure out which SOC process to attach to with the Visual Studio debugger, so you might try writing the process ID (and machine name) to a log file at the beginning of your Execute implementation. Fortunately, I never had to resort to that. If you don't have Visual Studio on your SOC machine, you may be able to set up remote debugging, though I've heard that can be challenging.
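A minimal .NET fragment for the PID/machine logging idea above (the log path is hypothetical; this would go at the start of your Execute implementation):

```csharp
// Record which SOC process is executing the tool, so you know which
// PID to attach the debugger to. Requires System, System.Diagnostics, System.IO.
File.AppendAllText(@"C:\temp\gptool.log",
    string.Format("{0:u} pid={1} machine={2}{3}",
        DateTime.Now,
        Process.GetCurrentProcess().Id,
        Environment.MachineName,
        Environment.NewLine));
```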

In most cases, performance issues are data-access related. Any other optimization is likely to have orders of magnitude less effect than optimizing database (or rather, geodatabase) access.

Also, map drawing will slow down if your symbology is overly complicated, but the database access is still more likely to be the culprit.

As suggested by others, make sure the data in your database is properly indexed. When on SDE, make sure the network throughput is not the bottleneck.

At the ArcObjects level, here are some common opportunities for optimizations:

Use ITable.GetRows instead of individual calls to ITable.GetRow whenever possible.

If you need to combine GetRows with a query expression, you can instead use ITable.Search with a where clause containing an IN list: "ATTR='value' AND OBJECTID IN (1,2,3,4...)".
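The two batching approaches above can be sketched like this (ArcObjects SDK assumed; the OIDs and attribute names are made up):

```csharp
// Fetch a batch of rows by OBJECTID in one round trip,
// instead of calling ITable.GetRow once per id.
int[] oids = new int[] { 1, 2, 3, 4 };
ICursor cursor = table.GetRows(oids, true);   // recycling cursor
IRow row;
while ((row = cursor.NextRow()) != null)
{
    // process row...
}

// Equivalent filtered form via ITable.Search with an IN clause:
IQueryFilter qf = new QueryFilterClass();
qf.WhereClause = "ATTR = 'value' AND OBJECTID IN (1,2,3,4)";
ICursor filtered = table.Search(qf, true);
```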

Use IRelationshipClass.GetObjectsRelatedToObjectSet instead of individual calls to GetObjectsRelatedToObject.
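A short sketch of the batched related-object lookup (SDK assumed; the origin rows are placeholders):

```csharp
// Batch the lookup: one call for a set of origin objects,
// rather than GetObjectsRelatedToObject once per object.
ISet origins = new SetClass();
origins.Add(originRow1);   // hypothetical origin rows
origins.Add(originRow2);
ISet related = relClass.GetObjectsRelatedToObjectSet(origins);
```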

Cache data whenever possible. It is not advisable to keep references to IRow, for example, but if it is applicable, cache OBJECTIDs, or anything else you can.

Determine field indexes beforehand, do not use IFields.FindField repeatedly in loops when iterating over cursor results, for example. This is one of the few cases where .NET COM interop CAN have significant performance impact, especially when working remotely with ArcGIS Server via DCOM.
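The field-index advice above looks roughly like this in practice (SDK assumed; field names are hypothetical):

```csharp
// Resolve field indexes once, outside the loop -- each FindField call
// crosses the COM interop boundary, which is costly over DCOM.
int nameIdx = table.FindField("NAME");
int statusIdx = table.FindField("STATUS");

IRow row;
while ((row = cursor.NextRow()) != null)
{
    object name = row.get_Value(nameIdx);
    object status = row.get_Value(statusIdx);
    // ...
}
```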

Store records (e.g. IRow.Store) only when an attribute value is actually changed.

Avoid selection sets, especially for large numbers of records (more than a few hundred), since they can get pushed to the SDE database.

Are you doing any logging in the application? I have always found it helpful to add verbose logging with timestamping in the different stages of my apps so if there is a bottleneck, that can help point it out.
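A minimal sketch of stage timing in .NET (the stage names are made up; requires System.Diagnostics):

```csharp
// Coarse per-stage timing with Stopwatch, written to the trace log.
var sw = Stopwatch.StartNew();
// ... stage 1: query ...
Trace.WriteLine(string.Format("query took {0} ms", sw.ElapsedMilliseconds));
sw.Restart();
// ... stage 2: processing ...
Trace.WriteLine(string.Format("processing took {0} ms", sw.ElapsedMilliseconds));
```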

You should check all layers in your app; focusing only on ArcObjects might lead you to wrong conclusions. Check that it's not the database causing the problem: do you have correct indexes on your tables? Check ArcSDE: are you using versioned data, for example?
– MathiasWestin, Feb 16 '11 at 21:14

Those are all database-oriented calls. Run a trace on your database server and see which of the queries are taking the longest. Compare the runtimes of those queries against the measured time of those methods. This will help you find the source of your problems.

The "opaqueness" you're experiencing is more-or-less the same that you would get from using a high-level database API or ORM library.

Edit: If you believe there are too many layers to diagnose, then run the GP tool from a trivial ArcEngine app, which will eliminate that "layer" (though, based on what you describe, it is not relevant). If you believe ArcSDE is another "layer", then run using Direct Connect and you will eliminate the out-of-process SDE process.

Again... until you run a database trace tool, you won't know if it's COM Interop, ArcObjects, or your database. Even if you know the generated SQL, you will still need to use a database trace to determine the performance of the queries (a query plan can be incredibly misleading at times hence the need for a trace).

COM interop, lack of a profiler tool for inspecting generated SQL, and the many layers at play (C#->COM->AGS->SDE->Oracle) make it rather more opaque than, say, NHibernate.
– nw1, Feb 18 '11 at 16:24

Yes, it is much easier to get some of that information from NHibernate... but as I said, if you run a trace on your DB you can see the generated SQL. And based on the methods you identified, the layers are C# -> ArcObjects (COM) -> Oracle.
– James Schek, Feb 19 '11 at 23:22