I have a line feature (see image) representing a river that I created using the Stream_to_Feature tool. The attribute table contains several records representing different lines - the problem is the longest line (easily distinguishable visually) is not represented as a single line in the table, it is actually made up of many smaller lines. The lines appear to be touching, although they do not cross through each other.

How can I merge these lines and then determine the length of the longest line using ArcObjects or manual methods that I can convert to ArcObjects? An even better solution would involve getting rid of all of the tributaries and leaving me with just the river channel as a line.

Do they connect at all? You said they do not cross, but does that mean they do not share a vertex?
–
NathanusJun 14 '11 at 17:14

1

Sorry - I should have been more clear. They do share vertices, but do not completely cross through each other.
–
Nick OchoskiJun 14 '11 at 17:30

Do you know where the mouth of the river is? Is the river alway a tree (one unique path from each headwater point to the mouth) ?
–
Kirk KuykendallJun 14 '11 at 18:18

3

Actually, you don't want the length of the "longest line." That could be a route from one upstream reach to another remote upstream reach. This will happen when two major branches of the stream join near its mouth. Instead, you want the longest route between the mouth and any other endpoint on the stream. (This characterization does not even require the stream be represented as a tree: it can braid and have islands.)
–
whuber♦Jun 14 '11 at 18:37

@whuber - your assessment is correct - any idea how I can accomplish this using routes?
–
Nick OchoskiJun 14 '11 at 18:48

4 Answers
4

First, a little background to indicate why this is not a hard problem. The flow through a river guarantees that its segments, if correctly digitized, can always be oriented to form a directed acyclic graph (DAG). In turn, a graph can be linearly ordered if and only if it is a DAG, using a technique known as a topological sort. Topological sorting is fast: its time and space requirements are both O(|E| + |V|) (E = number of edges, V = number of vertices), which is as good as it gets. Creating such a linear ordering would make it easy to find the major stream bed.

Here, then, is a sketch of an algorithm. The stream's mouth lies along its major bed. Move upstream along each branch attached to the mouth (there may be more than one, if the mouth is a confluence) and recursively find the major bed leading down to that branch. Select the branch for which the total length is the greatest: that's your "backlink" along the major bed.

To make this clearer, I offer some (untested) pseudocode. The input is a set of line segments (or arcs) S (comprising the digitized stream), each having two distinct endpoints start(S) and end(S) and a positive length, length(S); and the river mouth p, which is a point. The output is a sequence of segments uniting the mouth with the most distant upstream point.

We will need to work with "marked segments" (S,p). These consist of one of the segments S along with one of its two endpoints, p. We will need to find all segments S that share an endpoint with a probe point q, mark those segments with their other endpoints, and return the set:

Procedure Extract(q: point, A: set of segments): Set of marked segments.

When no such segment can be found, Extract must return the empty set. As a side effect, Extract must remove all the segments it is returning from the set A, thereby modifying A itself.

I'm not giving an implementation of Extract: your GIS will provide the capability to select segments S sharing an endpoint with q. Marking them is simply a matter of comparing both start(S) and end(S) with q and returning whichever of the two endpoints does not match.

The procedure "append(S, B0)" sticks the segment S at the end of the array B0 and returns the new array.

(If the stream really is a tree: no islands, lakes, braids, etc--then you can dispense with the step of copying A into A0.)

The original question is answered by forming the union of the segments returned by LongestUpstreamReach.

To illustrate, let's consider the stream in the original map. Suppose it is digitized as a collection of seven arcs. Arc a goes from the mouth at point 0 (top of the map, at right in the figure below, which is rotated) upstream to the first confluence at point 1. It's a long arc, say 8 units long. Arc b branches to the left (in the map) and is short, about 2 units long. Arc c branches to the right and is about 4 units long, etc. Letting "b", "d", and "f" denote the left-side branches as we go from top to bottom on the map, and "a", "c", "e", and "g" the other branches, and numbering the vertices 0 through 7, we can abstractly represent the graph as the collection of arcs

A = {a=(0,1), b=(1,2), c=(1,3), d=(3,4), e=(3,5), f=(5,6), g=(5,7)}

I will suppose they have lengths 8, 2, 4, 1, 2, 2, 2 for a through g, respectively. The mouth is vertex 0.

The first example is the call to Extract(5, {f, g}). It returns the set of marked segments {(f,6), (g,7)}. Note that vertex 5 is at the confluence of arcs f and g (the two arcs at the bottom of the map) and that (f,6) and (g,7) mark each of these arcs with their upstream endpoints.

The next example is the call to LongestUpstreamReach(0, A). The first action it takes is the call to Extract(0, A). This returns a set containing the marked segment (a,1) and it removes segment a from the set A0, which now equals {b,c,d,e,f,g}. There is one iteration of the loop, where (S,q) = (a,1). During this iteration a call is made to LongestUpstreamReach(1, A0). Recursively, it must return either the sequence (g,e,c) or (f,e,c): both are equally valid. The length (M) it returns is 4+2+2 = 8. (Note that LongestUpstreamReach does not modify A0.) At the end of the loop, segment a has been appended to the stream bed and the length has been increased to 8+8 = 16. Thus the first return value consists of either the sequence (g,e,c,a) or (f,e,c,a), with a length of 16 in either case for the second return value. This shows how LongestUpstreamReach just moves upstream from the mouth, selecting at each confluence the branch with the longest distance yet to go, and keeps track of the segments traversed along its route.

A more efficient implementation is possible when there are many braids and islands, but for most purposes there will be little wasted effort if LongestUpstreamReach is implemented exactly as shown, because at each confluence there is no overlap among the searches in the various branches: the computing time (and stack depth) will be directly proportional to the total number of segments.

The Unsplit Line tool can be helpful for what you are trying to do, although you will need to divine some method to differentiate one stream branch from another (for the dissolve field). This assumes you have an ArcInfo license, though.

If you do not have such a license, you might consider the ArcObjects approach taking the XY of each vertex, filling an IPointCollection with them, and then creating an IGeometry as a PolyLineClass.

Even with Multipart to Singlepart my output polyline has more vertices than the line created by Stream to Feature. On the plus side Raster to Polyline appends a length field.
–
Nick OchoskiJun 14 '11 at 17:48

You could use RivEX it's a 9.1 ArcGIS tool (that will work in 9.3 and 10). It has tools for identifying topological issues with river networks and many processing tools. One such tool finds the main stem.