The Meade numbers are what I would expect from that player, though they may need a little adjustment. He has higher SPD than STR, so I would expect him to do better on the outside than the inside, and he does. It's tough to say just how much the outside rushing would need to be adjusted without knowing exactly what ratings he was running against.

I hate asking you to do more work, but I would be interested in seeing what these numbers look like for your RB Montgomery, as he moves about 10 points from SPD into STR compared to Meade. I would expect to see his inside numbers better than Meade and outside numbers worse.

Posted by norbert on 4/19/2013 2:23:00 PM: What part of those numbers shows that the results have worsened, and what part are you expecting to be different? I think the numbers aren't exactly balanced yet, but it looks like all of them are pointing in the direction they should be (e.g. lower rushing numbers for rushing defense, etc.). I don't claim or expect to have the engine to the point where it would be released, and that's part of the reason for having the beta. When I make adjustments in the engine, I can only do so much testing and look at certain matchups and balance to those, so it helps to see other tests where the numbers don't match up with what I have been seeing. That allows me to find where the results need to be tweaked to bring those back into balance. I can look at results of games, but having other people look at things like trey is doing helps out a lot more. I think I've been pretty up front about where the engine stands, and even now I'm still saying there needs to be more work done on the YAC. If you see something in the engine, whether aggregate stats like this or even in just one game, please post about it along with what you would expect. Not everything can be tweaked to perfection, but it can help in making improvements. I also have to look at results with more scrutiny than you guys do, as sometimes not all the factors are being considered. If I say to look at other points of the play, it's to rule out all the possibilities and narrow down where any issues might be. Believe me, if it were as easy as waving a magic wand and having a perfect engine spit out perfect results that everyone agrees with, that would be great (assuming I can find a magic wand).

But I believe if you actually look closer at the results and all the factors, you'll find the results are much better than they used to be and that settings and ratings actually do affect the outcomes. That doesn't mean the engine is where we want it to be.

From the numbers trey posted and the numbers I submitted to you earlier, it looks like running inside/outside vs balanced defense is better. There still seem to be significant differences running inside/outside vs all rush defense. I can't find my data on running vs all rush defenses, but from what I remember the rushing differentials between balanced and all rush defenses were similar in my testing (approx 1-2 YPC depending on offensive formation). I did not test vs blitzing defenses so I can't comment on that. That is why I say that it has possibly worsened depending on defensive setting.

I appreciate that you're putting in the time and effort to adjust the engine. I like that you are soliciting feedback from the beta testers. But it's hard for us to give you much besides "feel". As we discussed earlier, it is very difficult to make balanced adjustments using a compartmentalized setup going from step 1 to step 2 to step 3 etc. until the end of each play. It's even harder to isolate an individual factor's influence on a play without being able to turn off the other influences.

When you say "balanced" defense, I take that to mean 50/50 pass and rush defense. This means that half the time you are calling rush defense and half the time calling pass defense, so it's only going to be different than "all rush" defense about half the time. This is also why it's difficult to evaluate the results based on aggregate data alone. We really have to look at the individual results as well. It's nearly impossible to break down all the possible combinations of effects within the engine to get strict comparisons of results for ratings and settings. Really, at the end of the day, all any of us can do is evaluate the engine on feel, but my job is to make sure that everything is considered in that evaluation. There may be times where something looks way out of whack until it is pointed out that some other rating or setting that wasn't considered also affects the result. So when I question any of your observations, don't take that as "nah, you're wrong. everything is fine." but rather that I just have to make sure everything is considered and look a little closer at the results than "feel".

The biggest issue with 2.0 was how it factored ratings and settings into the results. The way it did it left no room to adjust the results based on ratings and settings. The 3.0 engine pulls that logic apart into many more moving parts, which allows us to focus more on ratings and settings for each of those little pieces of the play. However, this makes it WAY more complicated when it comes to trying to hit certain numbers with results for those ratings and settings. It's really just trying to get everything moving in the right direction, observing, adjusting, and repeating until we feel it is where it needs to be. It's a long process and we are closer than we were, but there still need to be adjustments.

Posted by norbert on 4/19/2013 4:09:00 PM: When you say "balanced" defense, I take that to mean 50/50 pass and rush defense. This means that half the time you are calling rush defense and half the time calling pass defense, so it's only going to be different than "all rush" defense about half the time. This is also why it's difficult to evaluate the results based on aggregate data alone. We really have to look at the individual results as well. It's nearly impossible to break down all the possible combinations of effects within the engine to get strict comparisons of results for ratings and settings. Really, at the end of the day, all any of us can do is evaluate the engine on feel, but my job is to make sure that everything is considered in that evaluation. There may be times where something looks way out of whack until it is pointed out that some other rating or setting that wasn't considered also affects the result. So when I question any of your observations, don't take that as "nah, you're wrong. everything is fine." but rather that I just have to make sure everything is considered and look a little closer at the results than "feel".

I realize that only 50% of the plays are comparable. But somewhere in the other 50% of the plays, something is happening to allow outside running to outperform inside running by approximately 2 yards. I think that's an issue that needs to be looked at. It "feels" wrong since it looks like stuff% is statistically even between the two. IMO, there needs to be an adjustment to stuff% and/or 20+ yard runs and/or the mode of the runs. Given the complexity of the zone results method, I have no idea how to accomplish this and I don't envy your position.
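To put the 50/50 point in numbers, here is a rough sketch. It assumes the balanced-defense YPC is simply a weighted average of YPC against rush calls (taken to be the same as the all rush YPC) and YPC against pass calls. The function name and the sample figures are made up for illustration and are not from anyone's test data or from the engine.

```python
def implied_pass_call_ypc(balanced_ypc, all_rush_ypc, rush_call_share=0.5):
    """Back out the YPC on plays where a balanced defense called pass.

    Assumption: balanced YPC is a weighted average of YPC vs rush calls
    (treated as equal to the all rush YPC) and YPC vs pass calls.
    """
    if not 0 <= rush_call_share < 1:
        raise ValueError("rush_call_share must be in [0, 1)")
    pass_call_share = 1 - rush_call_share
    return (balanced_ypc - rush_call_share * all_rush_ypc) / pass_call_share


# Hypothetical numbers: a 1.0 YPC gap in the aggregate totals.
print(implied_pass_call_ypc(balanced_ypc=4.5, all_rush_ypc=3.5))  # 5.5
```

Under that assumption, a 1 YPC difference in the aggregate numbers corresponds to a 2 YPC difference on the half of the plays where the defensive call actually differs, which is why the aggregate alone can hide how large the per-play effect is.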

Norbert - Here's the Montgomery data you requested. As you noted, he is much more suited to be an inside runner, so I was curious myself about the difference. I couldn't run the same average game totals, because as the 2nd RB he only averaged 6 carries per game and his carries were less consistent, but we can look at the totals.

As you'll see from the chart below, the inside vs. outside gap is smaller, but even in this case Montgomery averages 0.6 YPC higher on the outside. And again, as we saw with the lead RB Meade, the inside vs. outside gap is larger vs. an all rush defense than vs. a balanced defense. The average gap is about 0.4 YPC vs. a balanced D, and 0.7 YPC vs. an all rush defense. He also has a higher % of 20+ yarders on the outside (3.7% out vs. 2.8% in), which is in contrast to what you would expect with his ratings (54 spd, 50 elus, 65 strength, though the 60 ath could offset the lower spd/elus).
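For anyone who wants to run the same comparison on their own play-by-play, here is a minimal sketch of how these splits can be tallied. The carry log is invented, and counting a "stuff" as a run of 0 yards or fewer is my assumption, not necessarily how the engine defines it.

```python
from collections import defaultdict


def summarize_carries(carries):
    """Summarize (direction, yards) carries into per-direction stats."""
    by_direction = defaultdict(list)
    for direction, yards in carries:
        by_direction[direction].append(yards)

    summary = {}
    for direction, yards_list in by_direction.items():
        n = len(yards_list)
        summary[direction] = {
            "carries": n,
            "ypc": sum(yards_list) / n,
            # Assumption: a stuff is any run of 0 yards or fewer.
            "stuff_pct": 100 * sum(y <= 0 for y in yards_list) / n,
            "long_pct": 100 * sum(y >= 20 for y in yards_list) / n,
        }
    return summary


# Invented carry log, for illustration only.
log = [
    ("inside", 3), ("inside", 0), ("inside", 5), ("inside", 4),
    ("outside", -2), ("outside", 22), ("outside", 1), ("outside", 7),
]
stats = summarize_carries(log)
print(stats["outside"]["ypc"] - stats["inside"]["ypc"])  # 4.0
```

In this toy log both directions have the same stuff%, yet the outside YPC is much higher because of a single 20+ yard run, which is the kind of pattern worth checking for in the real data.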

At any rate, I still believe there is some tweaking to be done, specifically the impact of a rushing focused defense on inside vs outside runs.

FYI, your assumptions on balanced defense are basically correct. On rushing down/distance (i.e., 2nd and short, 3rd and short), defense is about 80% run, while on passing downs and formations (2nd/3rd and long, Pro set and trips), defense is 80% pass with nickel for trips.
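As a quick check on how those situational settings add up to an overall split, here is a small sketch. The 80% figures are the ones above; the share of snaps that count as rushing situations is a guess on my part, and the function name is just for illustration.

```python
def run_call_share(rushing_situation_share,
                   run_on_rushing_downs=0.8,
                   run_on_passing_downs=0.2):
    """Overall fraction of snaps on which the defense calls run.

    rushing_situation_share is an assumed input: the fraction of snaps
    that fall under the rushing down/distance settings.
    """
    passing_situation_share = 1 - rushing_situation_share
    return (rushing_situation_share * run_on_rushing_downs
            + passing_situation_share * run_on_passing_downs)


for share in (0.3, 0.5, 0.7):
    print(share, round(run_call_share(share), 2))
```

If rushing and passing situations come up about equally often, the overall mix works out to roughly 50/50, but it drifts toward pass calls or run calls as the situation mix shifts.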

It looks to me like outside rushing needs to be adjusted to have fewer long runs, which would improve the overall numbers you were showing as well as help bring Montgomery's numbers into balance. It is currently set to have slightly more stuffs on the outside (all things being equal) and slightly fewer rushes getting past the short area on the inside. This is what we are seeing in those numbers, but there probably need to be slightly more stops in the short area for outside runs. All of this trickles down - fewer stops short means more runs long, etc. - which is why it takes time to balance these things.
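The trickle-down can be shown with a toy model. This is only a sketch that assumes a run is checked for a stop in each zone in order (stuff, short, medium) and anything that survives is a long run. The zone list and every probability below are invented and are not the engine's actual values or logic.

```python
def zone_outcome_probs(stop_probs):
    """Chance a run ends in each zone, given per-zone stop probabilities.

    The last entry of the result is the chance of getting past every zone.
    """
    outcomes = []
    still_running = 1.0
    for stop_prob in stop_probs:
        outcomes.append(still_running * stop_prob)
        still_running *= 1 - stop_prob
    outcomes.append(still_running)
    return outcomes


# Hypothetical stop chances for the stuff, short and medium zones.
outside_now = zone_outcome_probs([0.12, 0.40, 0.60])
outside_more_short_stops = zone_outcome_probs([0.12, 0.48, 0.60])

# Last entry is the share of runs that break long.
print(round(outside_now[-1], 3), round(outside_more_short_stops[-1], 3))
```

Raising only the short-area stop chance lowers both the medium and the long outcomes without touching the stuff rate, so one change moves several of the reported numbers at once.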

The idea of more stuffs outside and fewer long runs inside is carried over from the 2.0 design. I believe the reasoning is that there may be more stuffs on the outside because the play takes longer to develop, and fewer inside rushes getting past short yardage because they have to run through the box. Any comments on that are welcome. We don't really have good numbers from real life, but I also think this adds a little strategy game-wise - you can try to rush inside when you need short yardage, or rush outside, risking a loss of yardage but improving your chance of longer yardage.