December 12, 2012

I had an interesting question from a member of our Product Support team, last week. They’ve been working with a number of customers regarding large files resulting from importing DGN data into AutoCAD: it turns out that the internal structure for complex DGN linestyles created by the DGNIMPORT command – while working very well when the data is “live” – makes the data difficult to remove once no longer needed.

[Quickly, a big “thank you” to a couple of old friends and colleagues. Firstly to Markus Kraus, who helped explain the data structure created by DGNIMPORT, and then to Albert Rius, who provided the inspiration for the project and helped test the code against some DGN data imported into DWG files.]

Here’s my attempt at showing the types of object that participate in the display of DGN linestyle data inside AutoCAD:

(This is meant to be representative, of course – and is actually somewhat simplified.)

Basically the entities created by the DGNIMPORT command have references to entries in the Linetype Table – just as any standard AutoCAD entities do. Complex DGN linestyles – that presumably cannot be mapped directly to standard AutoCAD linetypes – have entries in their Extension Dictionaries named “DGNLSDEF”. These objects contain references to objects in the “ACAD_DGNLINESTYLECOMP” dictionary (which is inside the root Named Objects Dictionary).

The objects in this dictionary represent the “strokes” in the linetype. They might be simple (in the case of STROKE1, above) or more complex. Complex strokes can refer to each other and also to anonymous blocks in the Block Table.

All in all a little complicated, as you can tell. The main issue with this picture is that all these “hard” references between objects prevent them from being purged, even when the entities that refer to them have been erased. Which can be a problem as it turns out they can occupy quite a lot of space, as linestyles can often require lots of different strokes.

So what can be done to help purge the various unwanted objects?

Here’s the approach I took:

1. Find the “unreferenced” DGN linestyles in the drawing.
   a. Collect a list of the linetype records containing the tell-tale DGNLSDEF entry in the extension dictionary.
   b. Iterate through all the objects in the drawing, and if you find one that contains a hard reference to an entry in the list of linetypes, remove that linetype from the list (as long as it doesn’t come from an object inside an anonymous block – we can safely ignore those references) and add it to a list of “keepers”.
2. Find the “unreferenced” strokes in the drawing.
   a. Collect a list of the entries in the ACAD_DGNLINESTYLECOMP dictionary.
   b. Go through the list of linetypes we want to keep, following their object references recursively through the dictionary of strokes: any objects inside this reference network get removed from the stroke list.
3. Erase the linetypes and the strokes that are safe to remove.
4. Erase the containing stroke dictionary, if it now happens to be empty.
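To make the idea behind these steps a little more concrete, here is a minimal sketch of the "keep what is reachable, remove the rest" logic on a plain in-memory graph. It has no AutoCAD dependency: the names PurgeSketch and FindUnreferencedStrokes, and the use of strings in place of ObjectIds, are assumptions made purely for illustration. It also uses an explicit stack with a visited set rather than recursion, which is a variation on the approach described above rather than a copy of it.

```csharp
using System;
using System.Collections.Generic;

public static class PurgeSketch
{
    // Returns the strokes that cannot be reached from any kept linetype.
    // 'refs' maps an object name to the names it holds hard references to
    // (a hypothetical stand-in for the data gathered from the drawing).
    public static HashSet<string> FindUnreferencedStrokes(
        IEnumerable<string> keptLinetypes,
        IEnumerable<string> allStrokes,
        IDictionary<string, List<string>> refs)
    {
        // Start by assuming every stroke can be removed
        var toRemove = new HashSet<string>(allStrokes);
        var visited = new HashSet<string>();
        var pending = new Stack<string>(keptLinetypes);

        while (pending.Count > 0)
        {
            var current = pending.Pop();

            // Skip anything already processed: this also guards
            // against strokes that refer to each other in a cycle
            if (!visited.Add(current))
                continue;

            // Anything we reach is in use, so it must not be removed
            toRemove.Remove(current);

            List<string> targets;
            if (refs.TryGetValue(current, out targets))
            {
                foreach (var target in targets)
                    pending.Push(target);
            }
        }
        return toRemove;
    }
}
```

Whatever is left in the returned set corresponds to the strokes in step 3 that are safe to erase.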

There were two main problems that Philippe Leefsma helped me solve (as mentioned earlier this week). The first one was iterating through all the objects in the drawing: Philippe has posted a handy solution to this that works by iterating through the possible handle space and attempting to open the object associated with that particular handle. This is generally more efficient – and a lot simpler – than opening the various entity containers, etc.
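As a rough illustration of the handle-space walk, the sketch below generates the hexadecimal handle strings from 1 up to a seed value and counts how many of them resolve. The HandleSpace class and the tryOpen callback are hypothetical: tryOpen stands in for the "look up the handle and open the object" step, so that the loop can be shown without AutoCAD being present.

```csharp
using System;
using System.Collections.Generic;

public static class HandleSpace
{
    // Yields each candidate handle as a hex string, from 1 up to
    // (but not including) the seed value
    public static IEnumerable<string> CandidateHandles(long handseed)
    {
        for (long i = 1; i < handseed; i++)
            yield return Convert.ToString(i, 16);
    }

    // Calls 'tryOpen' for every candidate and counts the ones that
    // resolve to an object; failures are simply skipped
    public static int CountResolvable(long handseed, Func<string, bool> tryOpen)
    {
        int found = 0;
        foreach (var handle in CandidateHandles(handseed))
        {
            if (tryOpen(handle))
                found++;
        }
        return found;
    }
}
```

Most handles in a sparse drawing will not resolve, which is why the callback returns a simple success flag rather than throwing.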

The second issue was to collect the object references from the various dictionary objects (whether in the linetype’s extension dictionary or the stroke dictionary inside the NOD) that do not have managed wrappers. The technique is an old one – I’ve used it in the past from ObjectARX but didn’t have any equivalent .NET code handy, which is where Philippe came in – that involves defining a DwgFiler and passing it to the object’s DwgOut() method (dwgOutFields() in ObjectARX terms). The DwgFiler has its methods called for the various types of data that would typically get stored in the DWG file: in our case we simply store the reference data passed in.
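The filer technique is easier to see in isolation. The following sketch uses stand-in types only: SketchFiler, RecordingFiler and OpaqueStroke are invented for this illustration and are not part of the AutoCAD API. An object with no public property for its references is asked to "file itself out", and the filer records only the reference calls it receives.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a filer base class: one method per data type
public abstract class SketchFiler
{
    public abstract void WriteInt32(int value);
    public abstract void WriteString(string value);
    public abstract void WriteHardPointerId(long id);
}

// Records hard pointer references and ignores all other data
public sealed class RecordingFiler : SketchFiler
{
    public readonly List<long> HardPointerIds = new List<long>();

    public override void WriteInt32(int value) { }
    public override void WriteString(string value) { }
    public override void WriteHardPointerId(long id)
    {
        HardPointerIds.Add(id);
    }
}

// Hypothetical object that exposes its references only when filed out
public sealed class OpaqueStroke
{
    private readonly string name;
    private readonly long[] components;

    public OpaqueStroke(string name, params long[] components)
    {
        this.name = name;
        this.components = components;
    }

    // Writes the object's data to whichever filer is passed in
    public void FileOut(SketchFiler filer)
    {
        filer.WriteString(name);
        filer.WriteInt32(components.Length);
        foreach (var id in components)
            filer.WriteHardPointerId(id);
    }
}
```

After calling FileOut() with a RecordingFiler, its HardPointerIds list holds the references the object would have written to disk, which is all the purge logic needs.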

Here’s the source for the filer class, in a file I’ve named ReferenceFiler.cs:

using System;
using Autodesk.AutoCAD.DatabaseServices;
using Autodesk.AutoCAD.Geometry;
using Autodesk.AutoCAD.Runtime;

namespace DgnPurger
{
  public class ReferenceFiler : DwgFiler
  {
    public ObjectIdCollection HardPointerIds;
    public ObjectIdCollection SoftPointerIds;
    public ObjectIdCollection HardOwnershipIds;
    public ObjectIdCollection SoftOwnershipIds;

    public ReferenceFiler()
    {
      HardPointerIds = new ObjectIdCollection();
      SoftPointerIds = new ObjectIdCollection();
      HardOwnershipIds = new ObjectIdCollection();
      SoftOwnershipIds = new ObjectIdCollection();
    }

    public override ErrorStatus FilerStatus
    {
      get { return ErrorStatus.OK; }
      set { }
    }

    public override FilerType FilerType
    {
      get { return FilerType.IdFiler; }
    }

    public override long Position
    {
      get { return 0; }
    }

    public override IntPtr ReadAddress() { return new IntPtr(); }
    public override byte[] ReadBinaryChunk() { return null; }
    public override bool ReadBoolean() { return true; }
    public override byte ReadByte() { return new byte(); }
    public override void ReadBytes(byte[] value) { }
    public override double ReadDouble() { return 0.0; }
    public override Handle ReadHandle() { return new Handle(); }

    public override ObjectId ReadHardOwnershipId()
    {
      return ObjectId.Null;
    }

    public override ObjectId ReadHardPointerId()
    {
      return ObjectId.Null;
    }

    public override short ReadInt16() { return 0; }
    public override int ReadInt32() { return 0; }
    public override long ReadInt64() { return 0; }
    public override Point2d ReadPoint2d() { return new Point2d(); }
    public override Point3d ReadPoint3d() { return new Point3d(); }
    public override Scale3d ReadScale3d() { return new Scale3d(); }

    public override ObjectId ReadSoftOwnershipId()
    {
      return ObjectId.Null;
    }

    public override ObjectId ReadSoftPointerId()
    {
      return ObjectId.Null;
    }

    public override string ReadString() { return null; }
    public override ushort ReadUInt16() { return 0; }
    public override uint ReadUInt32() { return 0; }
    public override ulong ReadUInt64() { return 0; }

    public override Vector2d ReadVector2d()
    {
      return new Vector2d();
    }

    public override Vector3d ReadVector3d()
    {
      return new Vector3d();
    }

    public override void ResetFilerStatus() { }
    public override void Seek(long offset, int method) { }

    public override void WriteAddress(IntPtr value) { }
    public override void WriteBinaryChunk(byte[] chunk) { }
    public override void WriteBoolean(bool value) { }
    public override void WriteByte(byte value) { }
    public override void WriteBytes(byte[] value) { }
    public override void WriteDouble(double value) { }
    public override void WriteHandle(Handle handle) { }
    public override void WriteInt16(short value) { }
    public override void WriteInt32(int value) { }
    public override void WriteInt64(long value) { }
    public override void WritePoint2d(Point2d value) { }
    public override void WritePoint3d(Point3d value) { }
    public override void WriteScale3d(Scale3d value) { }
    public override void WriteString(string value) { }
    public override void WriteUInt16(ushort value) { }
    public override void WriteUInt32(uint value) { }
    public override void WriteUInt64(ulong value) { }
    public override void WriteVector2d(Vector2d value) { }
    public override void WriteVector3d(Vector3d value) { }

    public override void WriteHardOwnershipId(ObjectId value)
    {
      HardOwnershipIds.Add(value);
    }

    public override void WriteHardPointerId(ObjectId value)
    {
      HardPointerIds.Add(value);
    }

    public override void WriteSoftOwnershipId(ObjectId value)
    {
      SoftOwnershipIds.Add(value);
    }

    public override void WriteSoftPointerId(ObjectId value)
    {
      SoftPointerIds.Add(value);
    }

    public void reset()
    {
      HardPointerIds.Clear();
      SoftPointerIds.Clear();
      HardOwnershipIds.Clear();
      SoftOwnershipIds.Clear();
    }
  }
}

And here’s the source for the main command implementation:

using System;
using System.Runtime.InteropServices;
using Autodesk.AutoCAD.ApplicationServices.Core;
using Autodesk.AutoCAD.DatabaseServices;
using Autodesk.AutoCAD.Runtime;

namespace DgnPurger
{
  public class Commands
  {
    const string dgnLsDefName = "DGNLSDEF";
    const string dgnLsDictName = "ACAD_DGNLINESTYLECOMP";

    public struct ads_name
    {
      public IntPtr a;
      public IntPtr b;
    };

    [DllImport("acdb19.dll",
      CharSet = CharSet.Unicode,
      CallingConvention = CallingConvention.Cdecl,
      EntryPoint = "acdbHandEnt")]
    public static extern int acdbHandEnt(string h, ref ads_name n);

    [CommandMethod("DGNPURGE")]
    public void PurgeDgnLinetypes()
    {
      var doc =
        Application.DocumentManager.MdiActiveDocument;
      var db = doc.Database;
      var ed = doc.Editor;

      using (var tr = doc.TransactionManager.StartTransaction())
      {
        // Start by getting all the "complex" DGN linetypes
        // from the linetype table

        var linetypes = CollectComplexLinetypeIds(db, tr);

        // Store a count before we start removing the ones
        // that are referenced

        var ltcnt = linetypes.Count;

        // Remove any from the "to remove" list that need to be
        // kept (as they have references from objects other
        // than anonymous blocks)

        var ltsToKeep =
          PurgeLinetypesReferencedNotByAnonBlocks(db, tr, linetypes);

        // Now we collect the DGN stroke entries from the NOD

        var strokes = CollectStrokeIds(db, tr);

        // Store a count before we start removing the ones
        // that are referenced

        var strkcnt = strokes.Count;

        // Open up each of the "keeper" linetypes, and go through
        // their data, removing any NOD entries from the "to
        // remove" list that are referenced

        PurgeStrokesReferencedByLinetypes(tr, ltsToKeep, strokes);

        // Erase each of the NOD entries that are safe to remove

        foreach (ObjectId id in strokes)
        {
          var obj = tr.GetObject(id, OpenMode.ForWrite);
          obj.Erase();
        }

        // And the same for the complex linetypes

        foreach (ObjectId id in linetypes)
        {
          var obj = tr.GetObject(id, OpenMode.ForWrite);
          obj.Erase();
        }

        // Remove the DGN stroke dictionary from the NOD if empty

        var nod =
          (DBDictionary)tr.GetObject(
            db.NamedObjectsDictionaryId, OpenMode.ForRead
          );

        ed.WriteMessage(
          "\nPurged {0} unreferenced complex linetype records" +
          " (of {1}).",
          linetypes.Count, ltcnt
        );

        ed.WriteMessage(
          "\nPurged {0} unreferenced strokes (of {1}).",
          strokes.Count, strkcnt
        );

        if (nod.Contains(dgnLsDictName))
        {
          var dgnLsDict =
            (DBDictionary)tr.GetObject(
              (ObjectId)nod[dgnLsDictName],
              OpenMode.ForRead
            );

          if (dgnLsDict.Count == 0)
          {
            dgnLsDict.UpgradeOpen();
            dgnLsDict.Erase();

            ed.WriteMessage(
              "\nRemoved the empty DGN linetype stroke dictionary."
            );
          }
        }

        tr.Commit();
      }
    }

    // Collect the complex DGN linetypes from the linetype table

    private static ObjectIdCollection CollectComplexLinetypeIds(
      Database db, Transaction tr
    )
    {
      var ids = new ObjectIdCollection();

      var lt =
        (LinetypeTable)tr.GetObject(
          db.LinetypeTableId, OpenMode.ForRead
        );

      foreach (var ltId in lt)
      {
        // Complex DGN linetypes have an extension dictionary
        // with a certain record inside

        var obj = tr.GetObject(ltId, OpenMode.ForRead);
        if (obj.ExtensionDictionary != ObjectId.Null)
        {
          var exd =
            (DBDictionary)tr.GetObject(
              obj.ExtensionDictionary, OpenMode.ForRead
            );

          if (exd.Contains(dgnLsDefName))
          {
            ids.Add(ltId);
          }
        }
      }
      return ids;
    }

    // Collect the DGN stroke entries from the NOD

    private static ObjectIdCollection CollectStrokeIds(
      Database db, Transaction tr
    )
    {
      var ids = new ObjectIdCollection();

      var nod =
        (DBDictionary)tr.GetObject(
          db.NamedObjectsDictionaryId, OpenMode.ForRead
        );

      // Strokes are stored in a particular dictionary

      if (nod.Contains(dgnLsDictName))
      {
        var dgnDict =
          (DBDictionary)tr.GetObject(
            (ObjectId)nod[dgnLsDictName],
            OpenMode.ForRead
          );

        foreach (var item in dgnDict)
        {
          ids.Add(item.Value);
        }
      }
      return ids;
    }

    // Remove the linetype IDs that have references from objects
    // other than anonymous blocks from the list passed in,
    // returning the ones removed in a separate list

    private static ObjectIdCollection
      PurgeLinetypesReferencedNotByAnonBlocks(
        Database db, Transaction tr, ObjectIdCollection ids
      )
    {
      var keepers = new ObjectIdCollection();

      // To determine the references from objects in the database,
      // we need to open every object. One reasonably efficient way
      // to do so is to loop through all handles in the possible
      // handle space for this drawing (starting with 1, ending with
      // the value of "HANDSEED") and open each object we can

      // Get the last handle in the db

      var handseed = db.Handseed;

      // Copy the handseed total into an efficient raw datatype

      var handseedTotal = handseed.Value;

      // Loop from 1 to the last handle (could be a big loop)

      var ename = new ads_name();

      for (long i = 1; i < handseedTotal; i++)
      {
        // Get a handle from the counter

        var handle = Convert.ToString(i, 16);

        // Get the entity name using acdbHandEnt()

        var res = acdbHandEnt(handle, ref ename);

        if (res != 5100) // RTNORM
          continue;

        // Convert the entity name to an ObjectId

        var id = new ObjectId(ename.a);

        // Open the object and check its linetype

        var obj = tr.GetObject(id, OpenMode.ForRead, true);
        var ent = obj as Entity;
        if (ent != null && !ent.IsErased)
        {
          if (ids.Contains(ent.LinetypeId))
          {
            // If the owner does not belong to an anonymous
            // block, then we take it seriously as a reference

            var owner =
              (BlockTableRecord)tr.GetObject(
                ent.OwnerId, OpenMode.ForRead
              );

            if (
              !owner.Name.StartsWith("*") ||
              owner.Name.ToUpper() == BlockTableRecord.ModelSpace ||
              owner.Name.ToUpper().StartsWith(
                BlockTableRecord.PaperSpace
              )
            )
            {
              // Move the linetype ID from the "to remove" list
              // to the "to keep" list

              ids.Remove(ent.LinetypeId);
              keepers.Add(ent.LinetypeId);
            }
          }
        }
      }
      return keepers;
    }

    // Remove the stroke objects that have references from
    // complex linetypes (or from other stroke objects, as we
    // recurse) from the list passed in

    private static void PurgeStrokesReferencedByLinetypes(
      Transaction tr,
      ObjectIdCollection tokeep,
      ObjectIdCollection nodtoremove
    )
    {
      foreach (ObjectId id in tokeep)
      {
        PurgeStrokesReferencedByObject(tr, nodtoremove, id);
      }
    }

    // Remove the stroke objects that have references from this
    // particular complex linetype or stroke object from the list
    // passed in

    private static void PurgeStrokesReferencedByObject(
      Transaction tr, ObjectIdCollection nodIds, ObjectId id
    )
    {
      var obj = tr.GetObject(id, OpenMode.ForRead);
      if (obj.ExtensionDictionary != ObjectId.Null)
      {
        // Get the extension dictionary

        var exd =
          (DBDictionary)tr.GetObject(
            obj.ExtensionDictionary, OpenMode.ForRead
          );

        // And the "DGN Linestyle Definition" object

        if (exd.Contains(dgnLsDefName))
        {
          var lsdef =
            tr.GetObject(
              exd.GetAt(dgnLsDefName), OpenMode.ForRead
            );

          // Use a DWG filer to extract the references

          var refFiler = new ReferenceFiler();
          lsdef.DwgOut(refFiler);

          // Loop through the references and remove any from the
          // list passed in

          foreach (ObjectId refid in refFiler.HardPointerIds)
          {
            if (nodIds.Contains(refid))
            {
              nodIds.Remove(refid);
            }

            // We need to recurse, as linetype strokes can reference
            // other linetype strokes

            PurgeStrokesReferencedByObject(tr, nodIds, refid);
          }
        }
      }
    }
  }
}

When we run the DGNPURGE command, we should see some information on the number of linetypes and strokes that have been discovered, and how many of them were considered unreferenced enough to purge. :-)

Once completed, you should save and reopen the DWG – which will result in the various, now-unreferenced anonymous blocks being removed – at which point you can PURGE any remaining simple linetypes.

As mentioned earlier, Albert has been very helpful testing this code against a number of DWG files that contain imported DGN data. But I suspect there are cases we’re missing: if you have DWGs that feel unnecessarily large due to DGN linestyle data, it would be great if you could give this tool a try (feel free to ping me if you have trouble building the above code into a working, NETLOADable DLL). Be sure to try it on a copy of your data, of course.

If we can get this working to everyone’s satisfaction, I’ll probably send this over to the ADN team for consideration as a “Plugin of the Month”, which will make it easier for people to get hold of a compiled version of the code.

Update:

It seems that some drawings – which I assume to have suffered from some kind of corruption, as the ones I’ve tested against certainly need to be AUDITed/RECOVERed – have their DGN strokes converted to proxies. This may be due to round-trip with older product versions (or non-Autodesk products), but it leads the original implementation to throw an exception.

The below version doesn’t fix the drawing corruption, but it does catch the exception (which is thrown on an attempt to erase a proxy) and maintains a count of the strokes and linetypes that have been successfully erased.
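The catch-and-count idea can be shown without any AutoCAD types. In this sketch the SafeErase class and its erase callback are hypothetical: the callback stands in for opening an object for write and erasing it, and the broad catch of Exception is a simplification (real code would more likely catch the specific AutoCAD runtime exception).

```csharp
using System;
using System.Collections.Generic;

public static class SafeErase
{
    // Attempts to erase each item, returning how many succeeded.
    // Items whose erase call throws are skipped rather than
    // aborting the whole operation.
    public static int EraseAll<T>(IEnumerable<T> items, Action<T> erase)
    {
        int erased = 0;
        foreach (var item in items)
        {
            try
            {
                erase(item);
                erased++;
            }
            catch (Exception)
            {
                // Assumed to be a proxy (or similar) that refuses
                // to be erased: leave it alone and carry on
            }
        }
        return erased;
    }
}
```

The returned count, rather than the size of the input list, is then what gets reported as the number of items purged.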

It seems there’s an issue with compound components in DGN linestyles being purged inappropriately when they’re actually in use. I’ve coded a fix, but it will need to get integrated, tested and re-released. I suggest holding off on using the above tool, for now – I’ll post another update when a new version is available. Many thanks to Jimmy Bergmark for pointing out the issue.

Update 4:

We're getting closer to publishing an updated tool. The reviewed code has been posted here.