Benny Friedman NPR, and a word on statistical completeness

JameisLoseston
Posts: 391
Joined: Wed Oct 16, 2019 12:39 am

Re: Benny Friedman NPR, and a word on statistical completene

Post by JameisLoseston »

If you want to figure normalized passer ratings for Friedman, I would not include anything from "incomplete" games. In Friedman's case, I think Neft has "complete" stats for 42 of 61 games (that's for 1927-31, the years when the league wasn't counting), so that's more than two-thirds anyway. It's reasonable to assume that his stats for the other 19 games would be similar if we had them. But we don't have them, and 42 games is a reasonable sample size to work with. So I'd just use those stats and disregard the other games. The same goes for Dunn, since such a high percentage of his games with Green Bay are fully documented, or near enough.
Obviously that is ideal, but the issue is how labor-intensive it would be. Sure, I can sift the stats at that fine a grain for specific players like Friedman and Dunn, but I can't imagine doing so for the league compilations without spending dozens of hours picking through every single player's games to figure out which are completely recorded. Weighing only Friedman's and Dunn's complete games against everything for the league would work, but it might be even more skewed than trying to fill in the blanks.
rhickok1109
Posts: 1473
Joined: Sun Oct 12, 2014 8:57 am

Re: Benny Friedman NPR, and a word on statistical completene

Post by rhickok1109 »

One small caution about incomplete game stats: They tend to include only extreme results.

By that I mean that the newspaper accounts on which they're based would usually include long gains, such as a 15-yard run or a 20-yard pass play, and negative plays, such as an interception or an incomplete pass or a 1-yard loss that ended a drive and forced a team to punt, but nothing or very little in between. Such an account, for example, might say something like "The Giants gained 22 yards on five consecutive runs to cross midfield" without telling you what runners were involved, so that would help with team stats but not individual stats.
JeffreyMiller
Posts: 819
Joined: Wed Dec 17, 2014 11:28 am
Location: Birthplace of Pop Warner

Re: Benny Friedman NPR, and a word on statistical completene

Post by JeffreyMiller »

rhickok1109 wrote:One small caution about incomplete game stats: They tend to include only extreme results.

By that I mean that the newspaper accounts on which they're based would usually include long gains, such as a 15-yard run or a 20-yard pass play, and negative plays, such as an interception or an incomplete pass or a 1-yard loss that ended a drive and forced a team to punt, but nothing or very little in between. Such an account, for example, might say something like "The Giants gained 22 yards on five consecutive runs to cross midfield" without telling you what runners were involved, so that would help with team stats but not individual stats.
Case in point: I am currently working on a story about the All-Americans-Bulldogs game of 12/04/20. Jim Thorpe had a long punt return in the second quarter, as reported by all of the major papers. However, The New York Times described the run as a 40-yarder. The New York Herald called it a 55-yarder. The Buffalo Express said it was 50 yards. Good luck ...

As for attendance in this game, The New York Times estimated 15,000, as did the Buffalo Express. The New York Herald said 12,000. The Canton Daily News said 7,000!
"Gentlemen, it is better to have died a small boy than to fumble this football."
RichardBak
Posts: 814
Joined: Sun Aug 02, 2020 4:04 pm

Re: Benny Friedman NPR, and a word on statistical completene

Post by RichardBak »

JeffreyMiller wrote:
rhickok1109 wrote:One small caution about incomplete game stats: They tend to include only extreme results.

By that I mean that the newspaper accounts on which they're based would usually include long gains, such as a 15-yard run or a 20-yard pass play, and negative plays, such as an interception or an incomplete pass or a 1-yard loss that ended a drive and forced a team to punt, but nothing or very little in between. Such an account, for example, might say something like "The Giants gained 22 yards on five consecutive runs to cross midfield" without telling you what runners were involved, so that would help with team stats but not individual stats.
Case in point: I am currently working on a story about the All-Americans-Bulldogs game of 12/04/20. Jim Thorpe had a long punt return in the second quarter, as reported by all of the major papers. However, The New York Times described the run as a 40-yarder. The New York Herald called it a 55-yarder. The Buffalo Express said it was 50 yards. Good luck ...

As for attendance in this game, The New York Times estimated 15,000, as did the Buffalo Express. The New York Herald said 12,000. The Canton Daily News said 7,000!
When I run into these situations (competing papers with varying numbers, especially when it comes to attendance), I typically combine all the figures and arrive at an average. That's the number I wind up using. So that return by Burt Lancaster, um Jim Thorpe, works out to 48 yards (145 yards divided by 3 accounts). And attendance was about 11,300 (34,000 divided by the 3 distinct figures). Hey, it works for me.
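A minimal sketch of the averaging described above, using only the figures quoted in this thread. Note that four papers are cited for attendance (two of them agree on 15,000), so the result depends on whether each paper or each distinct figure is counted once; the variable names are just for illustration.

```python
from statistics import mean

# Figures as quoted in the thread.
return_yards = {"New York Times": 40, "New York Herald": 55, "Buffalo Express": 50}
attendance = {
    "New York Times": 15_000,
    "Buffalo Express": 15_000,
    "New York Herald": 12_000,
    "Canton Daily News": 7_000,
}

# Punt return: 145 yards over 3 accounts.
avg_return = mean(list(return_yards.values()))

# Attendance, counting every paper once (4 accounts).
avg_att_all = mean(list(attendance.values()))

# Attendance, counting each distinct figure once (15,000 + 12,000 + 7,000).
avg_att_distinct = mean(list(set(attendance.values())))

print(round(avg_return), avg_att_all, round(avg_att_distinct))
```

Counting each distinct figure once gives the roughly 11,300 used above; counting all four papers gives 12,250.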
JameisLoseston
Posts: 391
Joined: Wed Oct 16, 2019 12:39 am

Re: Benny Friedman NPR, and a word on statistical completene

Post by JameisLoseston »

Speaking of varying numbers, I came across a really old Coffin Corner article that gives Friedman 12 and 10 pass TDs in his first two years, instead of the common 11 and 9. Is this something that changed since like 1987 when this article was published, and if so, why?
Bob Gill
Posts: 559
Joined: Wed Oct 08, 2014 7:16 pm

Re: Benny Friedman NPR, and a word on statistical completene

Post by Bob Gill »

JameisLoseston wrote:Speaking of varying numbers, I came across a really old Coffin Corner article that gives Friedman 12 and 10 pass TDs in his first two years, instead of the common 11 and 9. Is this something that changed since like 1987 when this article was published, and if so, why?
I didn't know that. I still have him credited with 12 and 10 in my files. But I'm sure any change resulted from comparing different accounts in different newspapers. That was common in those days.

One paper will say "Friedman passed to Sedbrook at the 3, and then Sedbrook took it in." Sounds like a TD pass there, but then another paper will say "Friedman passed to Sedbrook on the 3, where Grange made the tackle. On the next play Sedbrook went over." So if you had 12 TD passes for Friedman and then found that second account in a paper you hadn't seen before, you'd probably change it to a rushing TD for Sedbrook.

And sometimes you really can't tell which seems more likely, so you have to decide on some other basis like preponderance of evidence -- say, two papers call it a TD pass and one calls it a run, so you decide it was a TD pass. But if you find two other papers that call it a run, you might change your mind.

I'm sure something like that accounts for the changes you mentioned, along with others that you'll probably come across.
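The "preponderance of evidence" rule described above amounts to a majority vote across newspaper accounts. Here is a rough sketch of that idea; the function name and the tie handling (returning None when the leading rulings are tied) are assumptions for illustration, not anyone's actual procedure.

```python
from collections import Counter

def preponderance(accounts):
    """Return the most common ruling, or None if there is no clear leader."""
    counts = Counter(accounts).most_common()
    if not counts:
        return None
    # A tie between the top two rulings means the evidence is not decisive.
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None
    return counts[0][0]

# Two papers call it a TD pass, one calls it a run.
first_look = ["TD pass", "TD pass", "run"]
print(preponderance(first_look))

# Two more papers turn up that call it a run, and the ruling flips.
second_look = first_look + ["run", "run"]
print(preponderance(second_look))
```

This is why a player's totals can shift by one or two as more papers are consulted.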
JameisLoseston
Posts: 391
Joined: Wed Oct 16, 2019 12:39 am

Re: Benny Friedman NPR, and a word on statistical completene

Post by JameisLoseston »

Project update! Red Dunn NPR raws:

1924: 74.3
1925: 97.75
1926: 74.07
1927: 66.78
1928: 63.07
1929: 69.13
1930: 133.5
1931: 141.68 (non-qualifying)
Career: 89.7

I honestly expected him to fare a bit better than this; I'd have predicted something closer to 100 than 90. It seems that simply adding Friedman back into the league data pool caused some problems for Dunn; Friedman alone impacted the league ratios very noticeably. His 1930 still makes it as the 4th-highest rated season ever, and 1931 obviously is non-qualifying on only 31 attempts, but it'd have been higher than Friedman '29 if not for the stipulation that the passer rating variables cap at 2.375, because his TD rate was astronomical.
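For reference, the 2.375 cap mentioned above comes from the standard NFL passer rating formula, in which each of the four components is limited to the range 0 to 2.375. The sketch below is that standard, unnormalized formula only; it is not the NPR calculation used in this thread, whose league-ratio adjustment isn't spelled out here, and the stat lines fed to it are hypothetical.

```python
def clamp(x, lo=0.0, hi=2.375):
    """Limit a rating component to the allowed range."""
    return max(lo, min(hi, x))

def passer_rating(att, comp, yards, td, ints):
    """Standard NFL passer rating (no league normalization)."""
    a = clamp((comp / att - 0.3) * 5)      # completion percentage
    b = clamp((yards / att - 3) * 0.25)    # yards per attempt
    c = clamp(td / att * 20)               # touchdown rate
    d = clamp(2.375 - ints / att * 25)     # interception rate
    return (a + b + c + d) / 6 * 100

# Hypothetical lines on 100 attempts: once the TD component hits the cap,
# extra touchdowns no longer raise the rating.
capped = passer_rating(att=100, comp=50, yards=700, td=12, ints=5)
more_tds = passer_rating(att=100, comp=50, yards=700, td=20, ints=5)
print(round(capped, 1), round(more_tds, 1))
```

In the unnormalized formula the touchdown component maxes out at a TD rate of 11.875%, so on 31 attempts it is already capped at 4 touchdown passes.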

1924 was inherently problematic due to the severe statistical underrecording of that year, and I don't know what to make of the result. It seems to me that Dunn had a very solid rookie season, and he's in my top 3 MVP picks for that year. But the underreporting of stats that would negatively affect the league ratios, together with random JAGs like Hoge Workman putting up numbers out of nowhere, held Dunn's efficiency output unexpectedly close to the league average. I'm not sure to what degree we can assume that the underrecording applies equally to Dunn, but if his stats are more complete than everyone else's, then he could have been better than shown here. He is my 1925 MVP, and his NPR of almost 100 on a (disputed) title team provides solid evidence for that selection.

I knew his middle seasons wouldn't fare too well, but somewhat surprisingly, he always managed to hang right around league average even in his off years. Interestingly, his highest volume season (1928) was his lowest rated. Overall, despite a slightly underwhelming result, Dunn still comes in as the #6 all time passer on the NPR leader board, just ahead of Aaron Rodgers. Red Dunn, highest rated Packer QB? Well, I'm not sure anymore, because Rodgers' 2020 probably pushed him up the board a good bit himself, but as of 2018, he sure was!

Actually, we'll still have to see about that one, because he isn't the last player who'll be made a part of this project. A certain other Packer legend is up next! Since Dunn performed slightly below expectation, though, I'm predicting Herber to fall somewhere right around him or slightly lower, because league passing had improved by the latter part of his career. However, Herber also didn't have a Friedman who was head and shoulders better than him to muck things up, so we'll see how it goes. In any case, this exercise has pretty much established that Friedman was an unbelievable player, whereas Dunn was merely a very good one. Very good... that's a perfect characterization! These calculations surely ought to bolster his candidacy for that Hall, and I will no doubt spend a part of the article advocating for his induction.
JameisLoseston
Posts: 391
Joined: Wed Oct 16, 2019 12:39 am

Re: Benny Friedman NPR, and a word on statistical completene

Post by JameisLoseston »

Just acquired my Neft encyclopedia; currently working on NPR calculations for Herber and interpolating incomplete games for Friedman. Regarding the former, I guess I'll save the lengthy discussions for the published article unless I see signs of activity picking back up in this thread, but I'll say that so far, it is corroborating my prior-held notion that Dunn and Herber are largely interchangeable players. As to the latter, it's fairly easy for one player like Friedman, but looks to be significantly more onerous to adjust the league totals by the same methodology. I think the encyclopedia gives me the resources to just make it work, though; Neft gives team passing totals for every year, divided between complete and incomplete games, so I thankfully don't have to manually add up every single player, which would not be worth the trouble at all. Instead, I can just add up the team stats, which isn't too bad since there's no more than 12 teams in the league by 1927.
JameisLoseston
Posts: 391
Joined: Wed Oct 16, 2019 12:39 am

Re: Benny Friedman NPR, and a word on statistical completene

Post by JameisLoseston »

I've run into a couple unexpected developments that I'd like some input on:

Herber's career NPR is 87.8, which puts him between Dawson and Staubach on the board. But I have to give special mention to the granddaddy of all Luckman Effects: the 9th-highest rated season with a minimum of 100 attempts, at 116.57, is now Arnie Herber... in 1934, which I don't even think is one of Herber's three best seasons (I'd put 32, 36, 39 all above it). This is ranked above 2011 Aaron Rodgers, and I absolutely cannot defend that. What's going on here is that the league passer rating in 1934 was below 20, by far the worst season ever, making Herber's meager 45 look like a masterpiece. I'm starting to be convinced that NPR really isn't that great a metric after all.

I decided to adjust Friedman's rushing stats for fun after finishing his passing (went well), but I'm not too confident in the method. What I do is essentially take the rate stats from the complete games each year and prorate them to the incomplete games, accounting for the numbers already compiled from them. But this gives Friedman over 1100 rushing yards for 1928, which is obviously a near impossibility. The absurdity here is that Friedman's yards per carry, which I use to interpolate the missing yards, is actually higher in his complete games that year (8.8) than in his incomplete games (7.5), so my adjustment method can't do anything but assume he was that dominant all year, which of course is incredibly unlikely. What I'd really need is a breakdown of the 3 individual complete games I'm using for the calculation; I have a feeling there's one game, probably against a bad team, where Friedman just went straight off on the ground, and if so, I could factor out that outlier game and adjust based on the Y/C of the two remaining games. I think the method is mathematically sound here, but it is dependent on some uncomfortably small sample sizes sometimes.
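One possible reading of the proration described above, as a rough sketch: take carries per game and yards per carry from the complete games, estimate how many carries went unrecorded in the incomplete games, and credit those missing carries at the complete-game Y/C. The function name and every number below are hypothetical and are not Friedman's actual stats; the exact method used in the post may differ.

```python
def estimate_incomplete_yards(rate_games, n_incomplete, recorded_carries, recorded_yards):
    """Estimate rushing yards in the incomplete games.

    rate_games: list of (carries, yards) for the complete games used to set rates.
    """
    carries = sum(c for c, _ in rate_games)
    yards = sum(y for _, y in rate_games)
    carries_per_game = carries / len(rate_games)
    yards_per_carry = yards / carries

    # Carries we would expect in the incomplete games, minus those already on record.
    expected_carries = carries_per_game * n_incomplete
    missing_carries = max(0.0, expected_carries - recorded_carries)

    # Recorded yards stay as they are; only the missing carries are interpolated.
    return recorded_yards + missing_carries * yards_per_carry

# Hypothetical season: 3 complete games, 7 incomplete games.
complete = [(8, 40), (10, 60), (12, 140)]
complete_yards = sum(y for _, y in complete)

full = estimate_incomplete_yards(complete, 7, recorded_carries=40, recorded_yards=280)

# Same estimate with the outlier game (12 carries, 140 yards) left out of the rates.
trimmed = estimate_incomplete_yards(complete[:2], 7, recorded_carries=40, recorded_yards=280)

print(complete_yards + full, round(complete_yards + trimmed, 1))
```

With only three games setting the rates, dropping a single big game changes the season estimate substantially, which is the small-sample concern raised above.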