Aviler wrote:Updates daily for the previous 7 days? Or updates once a week? Or trade secret?

I'd have to google around to tell you exactly, but it looks only at the data from the last 7 days to compute the average and other values.I'm not sure if it is exactly 168 from when you click or if it just changes each day.

I wanted to post this in an update hut but for some reason am not allowed to post there. There was a request that your query remove outliers from the query for things such as users selling 1 gold for 50 food for studying at a library or something. Someone made the comment that if it could be done efficiently they would and not sure why it needs to be exceptionally efficient and not being familiar with your database it will obviously need to be tweaked. Knowing the type of database it is (Oracle, SQLServer) could have helped but here is a basic stab at a solution for you...

select i.item as ItemName, Avg(i.cost) as AveragePrice, i.type as PriceType, Sum(i.units) as UnitsTotal from SoldItems i join ( select d.item, Avg(d.cost) as AveragePrice, d.type, 2*STDDEVP(d.cost) as StdDeviation from SoldItems d group by d.item, d.type) don i.item=d.item and i.type=d.type and i.cost between (d.AveragePrice-d.StdDeviation and d.AveragePrice+d.StdDeviation group by i.item, i.type

This is not at all tested and I just wrote it here. But, I do believe your statistics will show alot better if you remove things that fall out of a certain range of the standard deviation as I am doing here. Not sure of efficiency of the above and again believe it might need to be tweaked depending on database being used. Also, I made assumptions about your database layout that are also probably inacurate but think if you sit down and spend about 4 hours you could probably enhance your query to remove outliers and be alot more useful.

Sorry my reply was geared towards developers as in another post there was mention that the problem with the info of average trade price is that it should remove items that are way outside the norm. Meaning when people trade 1 gold for 50 food and 1 wood for 100 food, etc. Meaning a trade for a payback instead of a real trade. This means that the query that computes the averages has to be modified to remove these outliers which can get a bit hairy. As one who has written alot of SQL which stands for Simple Query Language (I Think) the language of Relational databases, I was trying to offer some assistance in how to modify the query so that it removes these outliers in the totals that are computed. Again my post has no meaning accept to the developers of the game as I am a developer historically myself.

Also, on this same note there is a bit of statistics in what to remove when.

Since I feel like geeking out for a moment and have a degree in math I will go ahead and geek this out for you.

Lets say you have a set of numbers. The total of the numbers is 500,000 and there are 100 numbers well you can easily calculate that the mean (average) of the numbers is 500,000 divided by 100 or 5,000 on average. However, there is another piece to this being that not every number is 5,000 but instead deviates from 5,000 by some amount. This deviation can be calculated using a formula which SQL nicely will do for you. The formula cannot write out so will describe.

It is square root of following...The summation of each data point minus the mean squared all divided by the number of data points.

So, a few points 5,005 4,990 5,001 5,000Would be (5005-5000)=5 squared=254990-5000=10 squared=1005001-5000=1 squared=15000-5000=0 squared=0and so forthLets say we get 250,000 and then we have 100 datapoints so divide that by 100 and we end up with 2500. Take square root and we get 50 so the standard deviation is 50. From here the standard deviation says that any data point within 50 of 5000 is within one standard deviation so 4950-5050, 100 two so 4900-5100. So, anything outside a predetermined number of standard deviations we will throw out as noise thus removing the values that are 1 or something else completely confusing. Once we do this our mean could dramatically change if there are a number of these values that fall way outside the standard deviation.

I am glad I confused you even more but doing this removing the values that fall way outside the normal deviation will make for much more accurate averages of sell price.

K!ngF!sh wrote:Sorry my reply was geared towards developers as in another post there was mention that the problem with the info of average trade price is that it should remove items that are way outside the norm. Meaning when people trade 1 gold for 50 food and 1 wood for 100 food, etc. Meaning a trade for a payback instead of a real trade. This means that the query that computes the averages has to be modified to remove these outliers which can get a bit hairy. As one who has written alot of SQL which stands for Simple Query Language (I Think) the language of Relational databases, I was trying to offer some assistance in how to modify the query so that it removes these outliers in the totals that are computed. Again my post has no meaning accept to the developers of the game as I am a developer historically myself.

Actually SQL = Structured Query Language.

Also, there is a fundamental problem. Trying to encode a business rule utilizing standard deviation to identify and exclude outlying items as statistically irrelevant / suspect is fine except for one thing. Standard deviation still assumes that while certain values are of relatively less value, that each transaction in and of itself is equal in meaning as any other transaction. Unfortunately that is a false assumption. For example, while 100 stone trading for 10 gold as a payback is suspect, what about 10 transactions consisting of 10 stone for 1 gold? Same value but different impact on the standard deviation. It's easy to assume they would be all ruled out but the problem is the standard deviation has been changed. So you would have to make it iterative. Calculate the stnd dev, throw out the outsiders, recalc the stnd dev, throw out the outsiders, etc, continuing until there are no more outsiders.

Big problem with this is 95% of the items in the list are already of such low volume to already be irrelevant as it it. Even if you one disagrees with me on this (which I have data to back up my assertion), the necessity to perform an iterative approach to achieve any meaningful improvement in the integrity of the data used will irrefutable reduce the volumes to the point where the only items worth looking at will be gold, crude, wood, oil, feathers, leather, berries, and stone.

Ignore crude, feathers, and berries. They are so cheap, common, and stable in price that the data will rarely if ever fluctuate.Ignore oil because its value is fixed (pegged) to the price of wood and food (in gold terms). The only real fluctation there will not be based on market dynamics but only on the number of people who bought/sold oil recently who flunked math in high school.

That leaves the only possible meaningful prices of potential fluctuation (sans bogus transactions) to be food (in gold terms), wood, leather, and stone.

Sorry but until the trade report is stretch to include at least a month (or 2) worth of data and an iterative approach is taken, and they filter out the ALL items whose volume level fail to exceed at least some minimal percentage (1/10th %?) of the TOTAL items of the same kind in existence, the report will be nothing but misleading and of no real value except to sellers/buyers trying to manipulate their potential counter-party.

Last edited by Micali on Thu May 20, 2010 12:35 pm, edited 1 time in total.