Description
Basically I need to store which pieces of information each actor has about another actor, similar to a boiled-down Facebook. The problem is that this has to scale to a couple hundred actors at once, i.e. it could get out of hand pretty fast.

Question
Is there a decent way to accomplish something like that, or do I have to boil the whole idea down a bit and make it just a "who knows who" sort of thing?

The properties within the EntityModel and ActorModel are the pieces of information in question. Derived from the ActorModel are several classes with additional information, but for "simplicity" I've omitted those child classes.

I've thought of converting those properties to classes and including a List that stores the EntityIDs of the actors who know that specific piece of information, but I'm not sure how this will work out on a larger scale.

Since the set of access-restricted elements (properties in this case) is limited and known at design time, it shouldn't be overly complex. In the database you could represent it as a table with the EntityIDs of the property-owning actor and the property-accessing actor as a compound primary key, plus boolean "is-property-access-allowed" columns. Outside the database you could map that to a dictionary with a Tuple&lt;PropertyOwningActorEntityID, PropertyAccessingActorEntityID&gt; as the key (I would subclass the Tuple class, though, and map the "Item1" and "Item2" properties to more expressively named ones). Could that be a solution for you?
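To make the idea concrete, here is a minimal sketch of that mapping in C#, assuming integer EntityIDs and two example restricted properties ("Name" and "Location" are placeholders, as are all the type names - a record struct stands in for the subclassed Tuple mentioned above):

```csharp
using System;
using System.Collections.Generic;

// Composite key: (owner, accessor) - mirrors the compound primary key in the DB.
public readonly record struct AccessKey(int OwningActorId, int AccessingActorId);

// One boolean per access-restricted property, mirroring the boolean columns.
public sealed class AccessFlags
{
    public bool CanSeeName { get; set; }
    public bool CanSeeLocation { get; set; }
}

public sealed class KnowledgeMap
{
    private readonly Dictionary<AccessKey, AccessFlags> map = new();

    public void Grant(int owner, int accessor, Action<AccessFlags> configure)
    {
        var key = new AccessKey(owner, accessor);
        if (!map.TryGetValue(key, out var flags))
            map[key] = flags = new AccessFlags();
        configure(flags);
    }

    // Absent entries simply mean "no access", so the map only grows
    // with actual relationships, not with every actor pair.
    public bool CanSeeName(int owner, int accessor) =>
        map.TryGetValue(new AccessKey(owner, accessor), out var f) && f.CanSeeName;
}
```

Because missing entries default to "no access", a couple hundred actors only cost you storage proportional to the relationships that actually exist, not the full N×N grid.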

For the generation of primary keys in n-tier applications I only ever see two strategies: either client-side generation of a GUID, or a temporary integer that gets replaced by the DAL, with the DAL reporting back to the client the value that was actually assigned.

My alternative idea is to let the client request 1..n new primary key value(s) from the DAL whenever it needs one/some. This way I wouldn't have to cope with ugly GUIDs (I don't plan for DB mergeability) and I avoid awkward client logic for replacing temporary primary keys.
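A minimal sketch of that strategy in C# (the `reserveBlock` delegate stands in for the round trip to the DAL/DB that reserves a contiguous range of keys - all names here are illustrative assumptions, not part of any existing framework):

```csharp
using System;

// Client-side allocator: asks the DAL for a block of keys up front and
// hands them out locally, requesting a new block only when exhausted.
public sealed class KeyAllocator
{
    private readonly Func<int, long> reserveBlock; // returns the first key of a reserved block
    private readonly int blockSize;
    private long next, end; // [next, end) = keys still available locally

    public KeyAllocator(Func<int, long> reserveBlock, int blockSize = 100)
    {
        this.reserveBlock = reserveBlock;
        this.blockSize = blockSize;
    }

    public long NextKey()
    {
        if (next >= end) // local block exhausted: one round trip reserves blockSize more
        {
            next = reserveBlock(blockSize);
            end = next + blockSize;
        }
        return next++;
    }
}
```

Creating 1000 entities with a block size of 100 would then cost 10 round trips instead of 1000, and the keys are real (database-reserved) from the start, so no replacement logic is needed.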

Have I simply not yet found some projects that do it this way or is there some flaw in that strategy that I'm not aware of?

I'd say this highly depends on the philosophy of the DAL. For example, why would you need to generate temporary primary keys at all? Let the key be empty until a value is returned from the database.

If the referential information isn't visible from the object hierarchy and you need a value for foreign keys before they are actually created, why not use the hash code in such situations? Even better, the object could return its hash code as the primary key until the actual primary key is generated in the database.

In all cases I'd always let the database generate the surrogate key value, never the DAL layer.

In all cases I'd always let the database generate the surrogate key value, never the DAL layer.

What I meant is that the client sends a request for key-provision to the DAL which then lets the database generate it through sequences.

Mycroft Holmes wrote:

Listen to Mika, let the database generate the primary keys, you may need to submit in sequence so you get the PK to be used as a foreign key on the children.

The application in which I will first use the framework I'm designing here has a use case where usually about 1000 entities plus sub-entities are created. I don't want to make that many remote calls, for performance reasons.

Mika Wendelius wrote:

If the referential information isn't seen from the object hierarchy and you need a value for foreign keys until they are actually created

Yes, I need some value for foreign keys - object references are "virtual" through keys, to facilitate lazy instantiation and cache expiration/GC. Not sure if I'm missing something, but I see some problems with the hash-code approach: in contrast to GUIDs, hash-code collisions are much more likely. On top of that, they could collide with pre-existing keys. A deterministic generation of temporary keys would appear safer to me - but I would like to avoid that altogether.

The application in which I will first use the framework I'm designing here has a use case where usually about 1000 entities plus sub-entities are created.

You appear to be suggesting that you are going to construct several thousand valid entities in the client first before sending them to the back end.

And the question would be...why?

I certainly wouldn't want to create a DAL and just assume that the client is going to send thousands of valid entries to me. That violates the primary purpose of constraints on a database, which is to protect from programmer errors, not user errors.

So you don't save anything in terms of validity.

You will still need to transport those thousands of entities to the back end. If you do it one at a time AND that is a problem, then your solution doesn't address it at all. If you are going to send them as a block, then there would in fact be FEWER calls if you let the DAL handle the IDs, since your DAL should be capable of recognizing dependencies (if nothing else, pseudo-IDs in the block accomplish that).

However, thousands of calls isn't a problem on any effective modern server as long as they are infrequent - unless you intend to do that every second (of course modern servers can handle that many calls per second, but doing so just to avoid batch handling would be silly).

I certainly wouldn't want to create a DAL and just assume that the client is going to send thousands of valid entries to me. That violates the primary purpose of constraints on a database, which is to protect from programmer errors, not user errors.

The entities are fully validated before they're sent to the DAL.

jschell wrote:

You will still need to transport those thousands of entities to the back end. If you do it one at a time AND that is problem then your solution doesn't address that at all.

No, they're sent in a batch/block.

jschell wrote:

If you are going to send them as a block then there would in fact be FEWER calls if you let the DAL handle the ids, since your DAL should be capable of recognizing dependencies (if nothing else pseudo ids in the block accomplish that.)

You mean there would be fewer calls because the client wouldn't have to request new keys from the DAL/DB before creating new entities? But the client can request more than one new key at once. If it's clear beforehand how many new entities are to be created, it's just one request; if it's not clear, it would still be considerably fewer than one request per key, because the client can simply request more whenever it runs out of new keys.

jschell wrote:

However thousands of calls, unless you intend to that every second, isn't a problem on any effective modern server as long as it is infrequent (of course modern servers can handle that many calls per second but doing that just to avoid batch handling would be silly.)

The application will run in a variety of environments, many of which won't have a very performant server or network. So I want to design it in a way that it puts the least stress on either.

One is more than none.
Batching solves that problem, but it doesn't explain why the client needs to provide the IDs.

I don't think I deserve your impatience, because you could have found the explanation in another of my posts in this thread:

[..] the main point why I need at least some kind of key is that I implement object-references "virtually" (don't know if there's a better term for it): Entities don't hold a direct reference to other entities but a key and on property access the key gets resolved into an object reference - that way I can easily implement lazy/implicit loading and cache expiration.

To put it into perspective: I'm developing a custom ORM, mainly because of one requirement that disqualifies existing ORMs: my users need to be able to extend the model with custom tables and fields (which of course need to be "non-intrusive" to the business logic by being nullable/optional). The first version of the application will only include desktop clients, and those will be rich clients where the part of the ORM that does the record-entity mapping resides in the client. So you could probably say that I split the DAL up into tiers. This will probably clear up the following:

jschell wrote:

Again I would not write a DAL nor a database (relational) that relied solely on a client for validity.

The DAL I've been talking about is essentially that part of the DAL you're thinking of which does the final step of saving the raw records. Any validation you would do in a DAL happens here, in the first layer on the client.

So I need IDs/keys in the client because they're required to resolve references.

You still need to write each record to the database individually. I would construct my object as a parent object containing a List&lt;&gt; of children. At this point there is no requirement for an FK value, as the parent has the children in its List&lt;&gt;.

When you decide to write the data into the database, you have a nested loop which inserts the parent record, gets the PK value back from the database, and uses it when inserting the children.
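That nested loop could be sketched like this in C# (the entity shapes and the two insert delegates are illustrative assumptions - in a real DAL the delegates would wrap an INSERT followed by, say, SCOPE_IDENTITY() on SQL Server):

```csharp
using System;
using System.Collections.Generic;

public sealed class Parent { public string Name = ""; public List<Child> Children = new(); }
public sealed class Child  { public long ParentId; public string Name = ""; }

public static class Saver
{
    public static void Save(IEnumerable<Parent> parents,
                            Func<Parent, long> insertParent,  // returns the DB-generated PK
                            Action<Child> insertChild)
    {
        foreach (var p in parents)              // outer loop: parents
        {
            long pk = insertParent(p);          // DB generates the PK and hands it back
            foreach (var c in p.Children)       // inner loop: children
            {
                c.ParentId = pk;                // FK is only needed now, at save time,
                insertChild(c);                 // not when the objects were constructed
            }
        }
    }
}
```

The point of the pattern is that no client-side key ever exists: the FK is assigned in the brief window between inserting the parent and inserting its children.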

It's not one simple parent-child relationship - there can be several inter-relationships. But the main point why I need at least some kind of key is that I implement object references "virtually" (I don't know if there's a better term for it): entities don't hold a direct reference to other entities but a key, and on property access the key gets resolved into an object reference - that way I can easily implement lazy/implicit loading and cache expiration.
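A minimal sketch of such a "virtual" reference in C# (all names and the resolver delegate are illustrative assumptions, not the poster's actual framework):

```csharp
using System;

// Holds only the key of a related entity; the key is resolved against a
// cache (or loaded lazily) on every property access, which is what makes
// lazy loading and cache expiration possible.
public sealed class EntityRef<T> where T : class
{
    private readonly long key;
    private readonly Func<long, T> resolve; // cache lookup / lazy load

    public EntityRef(long key, Func<long, T> resolve)
    {
        this.key = key;
        this.resolve = resolve;
    }

    public long Key => key;

    // Resolved at access time rather than stored, so the cache is free to
    // evict the target entity between accesses.
    public T Value => resolve(key);
}
```

This also shows why the key must exist before the referenced entity is persisted: the reference *is* the key, so a new, not-yet-saved entity cannot be pointed at without one.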

Mostly because of join performance. Querying is the main job of the application and there are a lot of related tables, which, even if they're not part of the query predicate, have to appear in the resultset.

Have you actually done any profiling to see if there is a hit that will be, in any way, significant using GUIDs? As these are presumably going to be indexed fields, you shouldn't be seeing much impact.

I'm using SQL Server 2008. Sequential GUIDs in general are GUIDs that are not completely random but created in a way that makes their values ascending (thus remedying many of the drawbacks of conventional fully random GUIDs when used as keys in a DB), while still providing the advantage that collisions are virtually non-existent. SQL Server provides a function to generate these since version 2005: NEWSEQUENTIALID

I'm pretty sure that's been available since 2005. But you're using the database to generate the GUID rather than generating it in code (which is why I was asking what a sequential GUID was, as this isn't native behaviour). If you don't want to preallocate the key, then you have no choice but to have your code react and post-allocate the IDs.