I posted this on Stack Overflow, but a user there recommended I post it here as well, so I'd like to get your recommendations too.

I'll say up front that I am not a programmer. I have a cursory knowledge of different types of AI and am just a businessman building a web app.

Anyway, the web app I am investing in is for a hobby of mine. There are many part manufacturers, product manufacturers, upgrade and add-on manufacturers, etc. for hardware/products in this hobby's industry. Currently, I am building a crowdsourced platform where knowledgeable people can go in and mark up compatibility between those parts, as it's not always clear-cut whether they are compatible. For example:

Manufacturer A makes an "A" class product, and manufacturer B makes an upgrade/part that generally goes with class "A" products, but is for one reason or another not compatible with Manufacturer A's particular "A" class product.

However, a good chunk (>60%-70%) of the products/parts in the database can have their compatibility inferred from their properties.

For example:

Part 1 is type "A" with an "X" mm receiver and part 2 is also type "A" with an "X" mm interface, and thus the two parts are compatible.

or

Part 1 is an 8mm gear, thus all 8mm bushings from any manufacturer are compatible with part 1. Furthermore, gears can only have compatibility relationships in the database with bushings and gearboxes; there can be no meaningful compatibility between a gear and a rail or receiver, since those parts don't interface.

Now what I want is an AI to be able to learn from the decisions of the crowdsourced platform community and be able to infer compatibility for new parts/products based on their tagged attributes, what type of part they are etc.

What would be the best form of AI to tackle this? I was thinking of an expert system, but explicitly engineering all of the knowledge rules would be daunting because of the complex relations between literally tens of thousands of parts, hundreds of part types and many manufacturers.

Would an ANN (artificial neural network) be ideal to learn from the many inputs/decisions of the crowdsourcing platform users?

5 Answers

Currently, I am in the process of building a crowd sourced platform
for people who are knowledgeable to go in and mark up compatibility
between those parts as its not always clear cut if they are for
example:

And this...

Now what I want is an AI to be able to learn from the decisions of the
crowdsourced platform community and be able to inference compatibility
for new parts/products based on their tagged attributes, what type of
part they are etc.

This leaves me with the assumption that the database of parts will already be established, more or less.

The primary crowd sourcing aspect of the process will be to confirm or deny compatibility claims: affirming claims already made. The secondary aspect will be to add new compatibility claims that weren't previously identified by other people, or inferred by the system.

The system could evaluate the claim including the conditions of the claim or the reliability of all associated claims, the nature of the part-to-part relationship, and identify if there is similarity in other part-to-part relationships to allow an additional claim to be inferred.
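
To make the "reliability of all associated claims" idea a little more concrete, here is a minimal sketch of weighing votes on a single compatibility claim by how reliable each voter is. Everything in it is an assumption for illustration: the `Vote` structure, the per-user reliability scores, the default of 0.5 for unknown users, and the smoothing prior are not part of the question or of any existing system.

```python
from dataclasses import dataclass


@dataclass
class Vote:
    user: str
    compatible: bool  # True = "these parts fit", False = "they do not"


def claim_confidence(votes, reliability, prior_weight=1.0):
    """Estimate, in [0, 1], how likely the claimed pair is compatible.

    Each vote counts in proportion to the voter's reliability score
    (hypothetical values in [0, 1]); unknown users default to 0.5.
    A symmetric prior keeps claims with few votes close to 0.5.
    """
    yes = prior_weight
    no = prior_weight
    for vote in votes:
        weight = reliability.get(vote.user, 0.5)
        if vote.compatible:
            yes += weight
        else:
            no += weight
    return yes / (yes + no)


# Hypothetical reliability scores, e.g. derived from past agreement.
reliability = {"alice": 0.9, "bob": 0.3}
votes = [Vote("alice", True), Vote("bob", False), Vote("carol", True)]
score = claim_confidence(votes, reliability)
print(round(score, 3))
```

The system's own inferred claims could be fed through the same function by treating the system as one more user with its own reliability score.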

The conditions of the claim would primarily be a matter of judging behavior, which ultimately would lead to something resembling an ANN. However, tuning this process to adjust the weighting factors of the variables under consideration will encourage the use, to some degree, of Markov chaining, simply because going into the project you won't have a model to tell you which factors to weigh most strongly and which to discard. There will be a lot of factors you will want to consider, drawn from the part-to-part relationship, the user activity profile, and the system activity profile (the system in this respect should be viewed as just another user who has made a bunch of claims).

The nature of the part-to-part relationship is tricky. There are only a few points in a part-to-part pairing that probably need to be considered as crucial in determining if a part-to-part relationship may be a reasonable candidate for compatibility. For this aspect, an expert system would probably fit, but it could be even a less complicated process, potentially succeeding well with a more rudimentary method of candidate identification.

The tricky part of identifying candidates would not be whether or not an expert system, or any other AI for that matter, should be used, but how strong and detailed the data model is. The stronger the data model, the simpler the system would probably end up being, since you are only interested in a small number of potential relationship structures and corresponding values in a particular pairing, namely physical compatibility: do they bolt or plug together; if communication is a fundamental aspect of the parts, do they have the same number of wires in each side of the plug and potentially operate in the same voltage range; when pneumatic, are they both rated for safe operation in the same relative ranges and have the same port counts, shapes, and sizes, or something to that effect.

This is where your crowdsourcing could do some of the 'expert system' work for you. Let the users tag the part aspects and interfaces that are important in determining compatibility. Essentially, allow them to make two kinds of claims on the system: 1) a particular relationship is compatible or not (or perhaps "maybe", if you are willing to let users indicate uncertainty), and 2) this aspect or interface contributes to determining compatibility. By doing that, and then feeding each stream of data back into the appropriate neural network, you can probably eliminate the need for an expert system.

Lastly, finding new candidates based on all the other data you have gathered is some kind of inference process (often described as a recommendation system).

NNs are often used for tasks where we're unsure about what "features" are important. I can recognize a handwritten "2" but it's difficult to describe the essence of a "2" given the enormous variation in handwriting. In your case, the important features of your items seem to be decided by the tags. Humans have already done the hard work.

Similarly, NNs are often used to classify objects, i.e. image of handwriting -> number. From what you said, it seems like that is something that's already been done for you. This isn't to say that NNs couldn't be applied, but it seems like the areas where NNs excel have already been covered. That being said, your model does seem to provide a wonderful amount of training data.

One question, you say:

Manufacturer A makes an "A" class product, and manufacturer B makes an upgrade/part that generally goes with class "A" products, but is for one reason or another not compatible with Manufacturer A's particular "A" class product.

What could "one reason or another" be? Is it something you could tell from the tags? From the picture? From playing with the physical parts? Unless it's something a human could tell from the tags, it's probably going to be difficult for an AI to guess.

To answer your question, the "one reason or another" could be many things: manufacturing tolerance for one, or sizes that aren't given in the product data. These types of issues are usually discovered when the two parts are physically put together by a human. I do not expect the AI to figure these out; these cases are relatively uncommon and I expect my crowdsourcing community to take care of these problems. What I do want is for the AI to learn from these inputs and infer that certain types of parts, manufacturers and models tend not to fit, or do tend to fit more often than not.
– user1154277 Nov 1 '12 at 21:03

Neural networks are more versatile than just visual recognition software.
– Philip Nov 1 '12 at 21:06

Basically I want it to guess with probability based on past events, where relations were made. These compatibility relations tend to change over time for one reason or another (for example manufacturing tolerances). So for a time Manufacturer A's magazines might fit Manufacturer B's receivers well, but a year later they might have changed something minor and they start to not fit. I expect my AI to figure out that it is starting to not be compatible and factor that in to future guesses on compatibility.
– user1154277 Nov 1 '12 at 21:06
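
One simple way to let recent reports outweigh old ones, so that a drift like the magazine/receiver example shows up in the estimate, is to decay each report's weight with its age. The sketch below is only an illustration under assumed inputs: the `(day, fits)` report format and the 180-day half-life are hypothetical choices, not something from the platform described here.

```python
def decayed_compatibility(reports, now, half_life_days=180.0):
    """Estimate the current chance that two parts fit.

    reports: list of (day, fits) tuples, where day is a number of days
    since some fixed start date and fits is True or False.
    A report loses half its weight every half_life_days.
    Returns None when there are no reports at all.
    """
    yes = 0.0
    no = 0.0
    for day, fits in reports:
        age = now - day
        weight = 0.5 ** (age / half_life_days)
        if fits:
            yes += weight
        else:
            no += weight
    total = yes + no
    return yes / total if total > 0 else None


# Three old "fits" reports followed by two recent "does not fit" reports.
reports = [(0, True), (10, True), (20, True), (380, False), (390, False)]
estimate = decayed_compatibility(reports, now=400)
print(estimate)
```

A plain majority vote over these five reports would still say "compatible" (3 of 5), while the decayed estimate falls below one half because the two recent negative reports carry most of the weight.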

In addition, because the part type "Gears" is mostly associated with "Gearboxes" in terms of compatibility by the users, I want the AI to know that if it gets a new product with "6mm Gears" in the title, the tags (without being explicitly added) are most likely "gears" and "6 mm", and that such parts are generally associated with "gearboxes" of "6mm".
– user1154277 Nov 1 '12 at 21:08
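
Guessing tags from a title such as "6mm Gears" does not necessarily need any learning; plain pattern matching may cover much of it. The following is a rough sketch under assumptions: the `KNOWN_PART_TYPES` vocabulary is a hypothetical hand-written table (in practice it could be built from tags users have already applied), and only millimetre sizes are recognised.

```python
import re

# Hypothetical vocabulary mapping words found in titles to canonical tags.
KNOWN_PART_TYPES = {
    "gear": "gears",
    "gears": "gears",
    "bushing": "bushings",
    "bushings": "bushings",
    "gearbox": "gearboxes",
    "gearboxes": "gearboxes",
}

# Matches sizes such as "6mm", "6 mm" or "7.5MM".
SIZE_PATTERN = re.compile(r"(\d+(?:\.\d+)?)\s*mm\b", re.IGNORECASE)


def guess_tags(title):
    """Return a set of likely tags for a product title."""
    tags = set()
    for match in SIZE_PATTERN.finditer(title):
        tags.add(f"{match.group(1)} mm")
    for word in re.findall(r"[a-z]+", title.lower()):
        if word in KNOWN_PART_TYPES:
            tags.add(KNOWN_PART_TYPES[word])
    return tags


print(guess_tags("6mm Gears"))
```

Titles the table does not cover simply get fewer tags, which leaves them for the community to tag by hand.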

Well, I haven't built too many AI systems before, but let's take a naive crack at this. If I were going to use a neural network for this, the tags assigned to new parts would be the input nodes, the (vast) list of items it could be compatible with would be the output nodes, and the hidden layer would be its user-confirmed compatibilities or incompatibilities.

So suppose users make the tags "8mm" and "gear" and assign them to some parts, and your users list these parts as compatible with item #4215 (an 8mm bushing). Then new parts that were tagged "8mm" "dog" and "gear" "head" would both be listed as likely compatible with this 8mm bushing... And hopefully the vast quantities of data would fix that sort of thing.

And, of course, users have never disagreed when it comes to technical details like this...

If I understand you correctly, you essentially have a data set (crowd-)sourced from actual people, which describes the compatibility between various parts. You would like to build something which analyses this information, and figures out the associations between the parts and the compatibility flag.

Essentially, extracting this type of information from a data set is known as Data Mining. A variety of techniques are applied (statistical analysis, etc), to essentially identify the various traits in the data set that result in various classifications ("compatible" vs "not compatible"). These rules can then be turned into an expert system either automatically, or by hand. It's worthwhile mentioning that several software packages exist to do this job, so you may not have to write everything yourself (although they may be pricey). I would say that this is the most straightforward approach to what you want to do (although it would still be very difficult, involving many issues, such as dealing with incomplete data, etc).
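
As a toy illustration of turning labelled data into rules automatically, the sketch below counts how often pairs with a given combination of traits were marked compatible and keeps only the combinations with enough evidence. The part representation (dicts with `type` and `size_mm` keys), the chosen traits, and the support and confidence thresholds are all assumptions made up for this example; real data mining packages do far more than this.

```python
from collections import defaultdict


def mine_rules(labelled_pairs, min_support=3, min_confidence=0.8):
    """Derive simple compatibility rules from user-labelled pairs.

    labelled_pairs: iterable of (part_a, part_b, compatible), where each
    part is a dict with hypothetical keys "type" and "size_mm".
    Returns {(type_a, type_b, same_size): True/False} for trait
    combinations seen at least min_support times with a clear majority.
    """
    counts = defaultdict(lambda: [0, 0])  # key -> [compatible, total]
    for a, b, compatible in labelled_pairs:
        key = (a["type"], b["type"], a["size_mm"] == b["size_mm"])
        counts[key][1] += 1
        if compatible:
            counts[key][0] += 1

    rules = {}
    for key, (yes, total) in counts.items():
        if total < min_support:
            continue  # not enough evidence either way
        rate = yes / total
        if rate >= min_confidence:
            rules[key] = True
        elif rate <= 1 - min_confidence:
            rules[key] = False
    return rules


gear8 = {"type": "gear", "size_mm": 8}
gear6 = {"type": "gear", "size_mm": 6}
bush8 = {"type": "bushing", "size_mm": 8}
bush6 = {"type": "bushing", "size_mm": 6}
data = [
    (gear8, bush8, True), (gear6, bush6, True), (gear8, bush8, True),
    (gear8, bush6, False), (gear6, bush8, False), (gear6, bush8, False),
]
rules = mine_rules(data)
print(rules)
```

The resulting dictionary is the kind of thing that could be loaded into an expert system, or reviewed by hand first.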

I should also mention that you should question what you want to achieve with this, exactly. The benefit of having an expert system is that it would be able to apply similar logic to determine compatibility on parts which are very similar to ones it already understands well. The ability of the AI to apply itself in this way is called generalisation, and it's pretty much the only thing you will get out of it, since it will never outperform the original data for the original items. I would definitely weigh up the implications of this. If you are expecting a huge data set with many similar items, it may still be very worthwhile. If, however, you will have only a few similar items in each category, the AI will be relatively useless.

Man, this is great info. To answer some of your questions and conditionals, what I ultimately want the AI to do is recommend relationships for people to vote on. Right now the crowdsourcing workflow is that users just get a random relationship to vote on. But I want the AI to recommend combinations that make more sense and are more likely to be correct, so that instead of having x
– user1154277 Nov 3 '12 at 23:53

number of parts combinatorial relationships to vote on, I will significantly reduce workload.
– user1154277 Nov 3 '12 at 23:54
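
Replacing the random queue with a ranked one could look roughly like the sketch below. It assumes some scoring function already exists that returns a plausibility between 0 and 1 for a pair of parts; here a hard-coded dictionary stands in for it. The part identifiers, the `min_score` cut-off and the function name are all hypothetical.

```python
def pick_pairs_to_vote_on(candidate_pairs, score_fn, settled, limit=10,
                          min_score=0.2):
    """Choose which part pairs to show to voters next.

    candidate_pairs: iterable of (part_id_a, part_id_b) tuples.
    score_fn: callable returning a plausibility in [0, 1] for a pair.
    settled: set of pairs the community has already decided.
    Pairs scoring below min_score are never shown, which is what
    cuts down the number of combinations people have to vote on.
    """
    scored = []
    for pair in candidate_pairs:
        if pair in settled:
            continue
        score = score_fn(pair)
        if score >= min_score:
            scored.append((score, pair))
    scored.sort(key=lambda item: item[0], reverse=True)
    return [pair for _, pair in scored[:limit]]


# Stand-in scores; a real system would compute these from part data.
scores = {
    ("gear-8", "bushing-8"): 0.9,
    ("gear-8", "rail-20"): 0.01,
    ("gear-8", "gearbox-v2"): 0.7,
    ("gear-6", "bushing-8"): 0.3,
}
settled = {("gear-8", "bushing-8")}
queue = pick_pairs_to_vote_on(scores.keys(), scores.get, settled)
print(queue)
```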

@user1154277 OK, that is slightly different to how I thought you'd want to use it, and it might be more sensible (the AI has a less critical role to play). I also just wanted to highlight the difficulties in data mining dirty data sets (missing data, mismatched units, etc). You really want to have separate, specialised layers of code and AI to clean it, before you train another layer to pick combinations to suggest. Don't expect one big ball of neural networks to do this :)
– Daniel B Nov 4 '12 at 7:18

It sounds like you want some sort of inference engine, which to me suggests an expert system. I don't think populating one would be that difficult if you can break things down into proper categories (manufacturers, parts, assemblies, etc). Then you can make general assertions ("gearboxes have multiple gears","gears need bushings","manufacturers make bushings"), have some code that classifies parts and assigns properties, then let your user base enter specific exclusions ("bushings from manufacturer X don't fit into gears from manufacturer Y"). It's not clear from your question what you will do with this system once you've got it, so it's hard to give any more specific examples, but you might want to search on "recommender systems" and see if that sounds like what you're trying to build.
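
A very small sketch of that split between general assertions and user-entered exclusions is shown below. The tables, the part dictionaries (with `maker`, `type` and `size_mm` keys) and the size-equality check are assumptions chosen to mirror the gear and bushing examples in the question; a real inference engine would be considerably richer.

```python
# General assertions: which part types can interface at all (hypothetical).
INTERFACES = {
    frozenset({"gear", "bushing"}),
    frozenset({"gear", "gearbox"}),
}

# Specific exclusions entered by users, e.g. "bushings from manufacturer X
# don't fit into gears from manufacturer Y".
EXCLUSIONS = {
    frozenset({("X", "bushing"), ("Y", "gear")}),
}


def compatible(a, b):
    """Apply the general rules first, then the user-entered exclusions."""
    if frozenset({a["type"], b["type"]}) not in INTERFACES:
        return False  # e.g. a gear and a rail never interface
    if a["size_mm"] != b["size_mm"]:
        return False
    pair = frozenset({(a["maker"], a["type"]), (b["maker"], b["type"])})
    return pair not in EXCLUSIONS


gear_y = {"maker": "Y", "type": "gear", "size_mm": 8}
bushing_x = {"maker": "X", "type": "bushing", "size_mm": 8}
bushing_z = {"maker": "Z", "type": "bushing", "size_mm": 8}
rail_z = {"maker": "Z", "type": "rail", "size_mm": 8}

print(compatible(gear_y, bushing_z))
print(compatible(gear_y, bushing_x))
print(compatible(gear_y, rail_z))
```

Storing pairs as `frozenset` values means the order in which two parts are passed in does not matter.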

What I would do with this is let people have a "builder" of sorts to build their own upgraded product from different manufacturers' parts, which a lot of people in this hobby do. Part compatibility is hard to research and usually ends up being a game of "ask on as many different forums as possible and hope someone who knows posts". This is why I want a universal database that has mostly correct part compatibilities, so people can pick the parts they want, know they're compatible, and order from vendors.
– user1154277 Nov 1 '12 at 21:28