|
I've worked with David's obfuscated data from HandHQ.com, and have managed to pump out some research which is (hopefully) interesting to academics, poker players and laypeople alike. The biggest challenge I encountered while conducting this research was that PokerTracker did not appear to be equipped to handle multiple datasets of this size. However, I'm sure the ingenuity of researchers and advances in software and hardware will ameliorate these challenges in the future.

Here's a link to a preliminary draft of my research: I welcome any and all feedback from anybody interested in this sort of research. This is a preliminary draft, so please check with me if you intend to use or cite it.

Kyle Siler
Cornell University Department of Sociology
ksiler[at]gmail[dot]com

"The lower proportion of shown down hands offsuit Broadway hands suggests that at higher stakes, players are more likely to fold these hands, which are very conducive to reverse implied odds. As a result, the profitability of these hands increases." I'm confused. Should that say that the profitability of those hands decreases?
(Nov 02 '09 at 18:44)
Matador
Any idea how the site owners feel about this kind of thing? Hopefully the fact that the players and tables are obfuscated prevents this from violating any rules. Thoughts?
(Nov 02 '09 at 19:05)
SleepyLaBeef
Few things: (1) Typo on page 2: 'eand strategic' (2) IMO, the first comma should be removed: 'Some players use poker as their sole, or a partial means of income, others use it as a lucrative avocation, while others play as recreational gamblers.'
(Nov 03 '09 at 02:30)
Matador
@Kyle Any chance of a Cliff's Notes on your findings?
(Nov 03 '09 at 02:34)
AA Every Time
|
|
Matador: What I meant there was that at higher limits, players are more likely to fold offsuit Broadway hands, and do so in more appropriate situations. As a result, the observed profitability of offsuit Broadway hands as a whole increases, due to players making more and better laydowns. Presumably, less skilled players lose value on these hands by being unable or unwilling to fold them at optimal frequencies for the games they're playing (e.g., the archetypal player who can't get away from TPTK or TPGK). The theoretical profitability of hands and implied odds may change depending on skill and table-dynamics equilibria in a given game, but these data are only able to speak to 'empirical' or 'actual' profitability. Down the road, I might do simulations to get at more theoretical issues. Also, the prose that you quoted is awfully clumsy; why do I always notice these things after putting them online? Thanks for your critical eye and query. There will be a revised paper posted in the coming weeks.

SleepyLaBeef: Since these poker hands are public events open to observers, and anonymity is being provided to all players, this project should be pretty defensible, at least from the standpoint of university IRBs. I know of at least one site that has tentatively started working with academics, but is moving very cautiously due to legal concerns. I'm not a lawyer, so I'm not sure what those concerns are, or if and how they can be assuaged. The internet is a new frontier for law, especially when tied to online poker, so these issues might take some time to iron out. Erring on the side of caution, in my own paper, I do not reveal the name of the site from which I received anonymous hand histories via HandHQ.com.

Good writing takes work. On the whole it's good writing IMO :)
(Nov 02 '09 at 22:51)
Matador
|
|
Apologizing in advance for the brevity of my explanations and my tongue-tied statistics writing voice, I have feedback about Kyle Siler’s paper.

There are measurement and data structure issues associated with the second regression analysis of big blind win rates (BBWR) on hand type (HT). The data have a nested structure: each individual has only one BBWR, which is used for every hand. So in the data file for the regression of BBWR on HT, the dependent variable BBWR is assumed to have persons * hands played observations instead of just as many observations as there are people. More seriously, the relationship between BBWR and HT could vary between people. Less likely, but also serious: players may learn over the course of the data. A multilevel regression analysis (consult your local statistics support people) should address these issues and increase the study’s power.

The big measurement issue is that in poker, most of the time we lack information about the hands players are holding. The sample of hands we see is not a random sample but a convenience sample of the hands people actually play that actually get shown down. Also see my comment about how the hole cards in hands where players go all in before the river may not have been recorded. The meaning of HT needs to be thought about and included in the paper. I think I can come up with a couple of pages on the topic myself.

A very small measurement issue that most would overlook, but that I would like to point out, is that categorizing hand type does have some consequences. Independent variables in regression analysis are assumed to be measured without error. Are sevens the same as jacks? Is hitting trips with sevens more likely to give a big payoff than hitting trips with jacks?

Another measurement issue is that rake should be calculated for each hand so that a player’s actual rake can be calculated. It’s possible that player strategies affect the average rake. Uncalled bets aren’t raked, so tight, aggressive players who make relatively few and large bets should have lower average rakes than loose players. |
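To make the rake point concrete, here is a minimal Python sketch of per-hand rake. Everything in it is an assumption for illustration: the 5% rate, the $3 cap, the two example hands and the helper name `hand_rake` are made up, and real sites apply further rules (for example, not raking hands that end before the flop) that are not modeled here.

```python
def hand_rake(bets, rake_rate=0.05, cap=3.00):
    """Rake for one hand, given each player's total amount bet.

    Hypothetical helper: rake_rate and cap are assumed values.
    The uncalled part of the largest bet goes back to the bettor,
    so only the matched portion of the pot is raked.
    """
    ordered = sorted(bets, reverse=True)
    # Whatever the top bettor put in beyond the second-largest bet is uncalled.
    uncalled = ordered[0] - ordered[1] if len(ordered) > 1 else ordered[0]
    raked_pot = sum(bets) - uncalled
    return min(raked_pot * rake_rate, cap)


# Tight-aggressive line: one large bet that nobody calls.
tag_hand = [20.0, 2.0, 1.0]
# Loose line: three players each put in the same amount.
loose_hand = [8.0, 8.0, 8.0]

print(hand_rake(tag_hand), hand_rake(loose_hand))
```

Summing a function like this over every hand a player was dealt into would give that player's actual rake, rather than an average applied to everyone.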
|
AA - Good idea: The main point I'm making is that poker is a challenging game, not just due to the architecture of the game itself (e.g., reading betting patterns, calculating pot odds, etc.) but due to the risk structures it presents players with. These incentives often go against how humans handle risk in their lives, particularly in regard to monetary issues. Empirically, the main findings are:

I. Winning a higher percentage of hands is negatively associated with profitability. This seems counter-intuitive, as you play every hand to win it. However, economic research has shown that humans tend to overweight frequent small wins vis-a-vis infrequent large losses. Relatedly, since humans are generally risk-averse with perceived gains and risk-loving with perceived losses, this also explains how losing/tilty sessions can spiral out of control for some players. Of note, this trend attenuates (but remains significantly negative) at higher stakes, pointing to the importance of stealing pots to retain an edge at these stakes. As an example, this suggests that one cannot merely set-mine and wait for monsters at higher stakes and remain profitable.

II. Different types of hands have different values at different stakes. The result that stood out most to me is that at low stakes (NL50), small pairs (22-77) actually were more profitable than medium pairs (88-JJ). This likely has to do with the skill of the players in these games, as opposed to special dynamics in NL50 games. Small pairs have less ambiguous value than medium pairs, and it is difficult for less skilled players to correctly take advantage of thin value bets under conditions of uncertainty. Put differently, folding twos after the flop without hitting a set is a much easier decision than deciding what to do with medium pairs. In other words, like a marginal utility curve in economics, it's fairly easy to skim value off of sets and flopped flushes (albeit not quite as easy to maximize value), but being able to optimally adjust one's betting, calling and folding frequencies with 99 on a board with two overcards is both trickier and a thinner source of value. However, as one moves up stakes, these thin and more uncertain sources of value are your only remaining sources of profit against stronger players. Further, suited connectors had less value at high stakes (NL1K), but since the anonymized data is restricted to shown down hands, this is likely a result of a greater propensity of players at NL1K to push draws, and of players at lower stakes to be given drawing odds to see if their draws come in. The greater value for other types of hands at high stakes may be derived from the quasi-sacrificial lambs of suited connectors (Phil Gordon identifies this as Prahlad Friedman's strategy in his Little Green Book).

III. TAG play is more profitable and prominent at low stakes. As one moves up stakes, LAG play becomes more prominent and profitable. Once again, this is likely primarily a function of the skill of the players involved. Low-stakes LAGs are more likely to be playing for fun; high-stakes LAGs are often aggressive and clever, exploiting the risk-aversion of other players. Changing dynamics of the game may also explain why there appears to be a growing niche for LAG play, as fold equity is probably more prominent with more skilled players. At higher levels, TAG strategies still can be successful, but occupy a smaller niche of the top winning players and a larger proportion of the top losing players than at lower stakes. This suggests that as one moves up levels, poker becomes less strategic and more tactical (since if played very well, a large variety of strategies at NL1K have been shown to be successful). Further, a grinding player may get to the big game by set-mining and playing patiently, but unless they're really good at it, they're going to have to modify their previous winning (and presumably ingrained) strategies in order to keep moving up limits.

Anyhow, those are the main points I'm inclined to emphasize. Thanks a lot for your interest!

P.S. - Is there any way I can continue to use my name as a login? I couldn't figure out how to log in to edit my old post.

Re logging in. Next time you log in, click the Google button. This will associate your account with your Google/Gmail OpenID account and you will always be able to log in as the same Outflopped user.
(Nov 03 '09 at 15:18)
Mr. Flibble
@Kyle. I've merged your accounts and associated them with your Google OpenID. Please let me know at http://outflopped.uservoice.com/ if there is any problem.
(Nov 04 '09 at 02:00)
Mr. Flibble
Doesn't your work depend on hands shown down, and therefore have a huge bias against hands which make bluff-catchers? You aren't seeing the times players miss draws and fold. Missing draws could be highly unprofitable, balancing the times those hands go to showdown after hitting.
(Mar 12 '10 at 20:26)
Douglas Zare
Douglas Zare is right. This is the big measurement problem I was referring to above. It's like the parable of the blind men and the elephant. In the parable, each blind man touches a different part of the elephant and comes to a conclusion about what animal the elephant resembles. When the blind men compare notes, they realize that they all came to different conclusions and none of them was right.
(Jun 13 '10 at 07:49)
mazoula
|
|
I have some notes about the poker files. First, the poker files do not contain starting chip information. Statistics like m and average table m, and player-is-all-in, cannot be computed.
Second, the hand numbers have been masked. The only source of time information I have been able to find in poker hands is the hand number. Presumably higher numbered hands start after lower numbered hands. For those who needed time information and couldn't find it, you're welcome :) Hand numbers could be masked almost as well by subtracting a particular hand number as a reference point. Then a hand with a larger hand number still occurs after a hand with a smaller hand number, and the time information contained in the hand number is retained. |
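The subtraction idea can be sketched in a few lines of Python. This is only an illustration of the suggestion above, not how HandHQ actually masks its files; the function name `mask_hand_numbers` and the hand numbers are made up, and using the smallest number as the default reference is an assumption (a secret reference value would work the same way).

```python
def mask_hand_numbers(hand_numbers, reference=None):
    """Shift hand numbers by a reference value (hypothetical helper).

    Subtracting the same value from every hand number hides the original
    numbers but keeps their order and the gaps between them.
    """
    if reference is None:
        # Assumed default: use the smallest hand number as the reference.
        reference = min(hand_numbers)
    return [n - reference for n in hand_numbers]


original = [15834200917, 15834200102, 15834207751]  # made-up hand numbers
masked = mask_hand_numbers(original)

# Ranking the hands by masked number gives the same order as the originals.
order_before = sorted(range(len(original)), key=lambda i: original[i])
order_after = sorted(range(len(masked)), key=lambda i: masked[i])
print(masked, order_before == order_after)
```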
|
Hi Kyle,

Interesting research; there is much to learn from such large data sets. One remark regarding the observation that the top 20 winners tend to have marginal win rates: one explanation is indeed that winners tend to move up in stakes, but there is also the multitabling tradeoff. I can win maybe 8 BB/100 at 50NL singletabling, but I can also play nine tables simultaneously with, say, 3 BB/100, obviously generating a huge profit in doing so, even while cutting my winrate. The players who win the most at any given stake are by no means necessarily the most skilled players, and they are not necessarily choosing the strategy that is most profitable if singletabling. The more tables you play, the tighter you'll have to play to keep it up, so certainly the players with the most hands played will be much tighter than average.

Regards, Jens

Jens - Good point! I'd imagine that multi-tabling is a bit more of an issue at lower stakes (are there people who 9-table NL1K profitably?). Players try to find the optimal point between win-rate and volume of tables. It's definitely an explanation I'll add to subsequent drafts of my paper.
(Nov 14 '09 at 15:54)
Kyle Siler
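Jens's multitabling arithmetic can be written out as a short Python sketch. The 8 and 3 BB/100 figures and the one versus nine tables come from his post; the 70 hands per table per hour and the function name `hourly_win_bb` are assumptions added for illustration.

```python
def hourly_win_bb(tables, winrate_bb_per_100, hands_per_table_hour=70):
    """Expected BB won per hour (hypothetical helper).

    hands_per_table_hour is an assumed pace, not a figure from the thread.
    """
    hands_per_hour = tables * hands_per_table_hour
    return hands_per_hour * winrate_bb_per_100 / 100


single = hourly_win_bb(tables=1, winrate_bb_per_100=8)
multi = hourly_win_bb(tables=9, winrate_bb_per_100=3)

print(single, multi, multi / single)
```

Whatever pace is assumed, it cancels out of the comparison: nine tables at 3 BB/100 earn 27/8, or roughly 3.4 times, what one table at 8 BB/100 earns per hour, even though the per-hand win rate is far lower.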
|
|
I have also been working with the data generously provided by HandHQ. I am mainly working towards building a website for rake(back) comparison. I will post a link when it is finished.

@Kyle: I have read your research and I really like the general idea behind it. Poker is in my opinion a beautiful example of complex game theory. Given that it is also one of the few examples of (confined) complex game theory for which empirical data is abundantly available, I am surprised there is not far more research on the subject. However... here it comes... I have some critical remarks:

You are completely right when you mention that: "As Boyd (1976) observes, someone labeled a 'tight' player in one game, might be a 'loose' player in another game, while implementing the same strategies." However, when you start rating players (Appendix A), you seem to have forgotten this earlier statement. I cannot find proper argumentation for the thresholds you suggest in Appendix A. From personal experience with the data (and playing poker myself), I know that the average % of hands played for 6max is approximately 25-26%. This means that by your definitions, players classified as tight are strongly overrepresented. Also, you do not adjust your definitions for the different stake levels you analyze, which again does not follow from your earlier statement. With looseness being relative, I would suggest working with percentiles (33rd and 66th) in defining your thresholds. For the record, I doubt it will significantly change the outcomes of your study.

Finally, I would suggest adding graphs of distributions for % hands played, % hands raised and aggression.

Hi DaWeef - My answer was too long for the comment box, so see below for my reply.
(Nov 20 '09 at 19:02)
Kyle Siler
|
|
@DaWeef - Thank you for your thoughtful comments. Substituting percentile ranks for raw scores is an interesting idea. My biggest reservation is that it would make comparing behaviors across stakes difficult (e.g., a 55%ile AF in NL50 might be 40%ile in NL1K). Still, computing percentile scores within stakes may be a good idea. I'll acknowledge that my cutoffs at PFR 25 and 35 were arbitrary guesstimates (and might be slightly on the high side), but as per the distribution of strategies shown in Tables 1a, 1b and 1c, I think they did a reasonable job of differentiating players in different pools. Further, having slightly high boundaries also gives me the latitude to better differentiate loose players. Another solution I've considered is running interaction effects between PFR, AF and Hands Raised, as opposed to the user-defined PokerTracker categories. My reservation with this is that it would make the paper less accessible and more esoteric for non-academics, which is the opposite of what I want to (and according to journal editors, need to) achieve in subsequent drafts of the paper. Thanks again for giving me some good food for thought. I'd be interested in seeing your research, and I'll definitely keep the theoretical and empirical issues you've raised in mind when writing my revisions. |
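For what it's worth, percentile cutoffs computed within one stake level could look something like the Python sketch below. It is only an illustration of the idea discussed here, not the paper's method: the player names and % hands played values are made up, and `tercile_labels` is a hypothetical helper built on the standard library's `statistics.quantiles`.

```python
from statistics import quantiles


def tercile_labels(pct_hands_played):
    """Label players tight/medium/loose by terciles within one stake level.

    pct_hands_played maps a player id to % of hands played. The cut points
    are roughly the 33rd and 67th percentiles of that stake's own pool.
    """
    low_cut, high_cut = quantiles(list(pct_hands_played.values()), n=3)
    labels = {}
    for player, pct in pct_hands_played.items():
        if pct <= low_cut:
            labels[player] = "tight"
        elif pct <= high_cut:
            labels[player] = "medium"
        else:
            labels[player] = "loose"
    return labels


# Made-up example pool for one stake level.
nl50 = {"p1": 18, "p2": 22, "p3": 25, "p4": 27, "p5": 31, "p6": 40}
nl50_labels = tercile_labels(nl50)
print(nl50_labels)
```

Running the same function separately on each stake's pool gives cutoffs that move with the pool, which is the tradeoff mentioned above: the labels are relative within a stake, so the same raw percentage can land in different categories at NL50 and NL1K.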
|
David B, would you mind if I created a torrent from these hand history files and posted it to this thread and a torrent indexing site? This would lighten the load on the HandHQ servers as well as give more people access to this useful data.

I've no problem with that. Thanks for asking. Please reference this thread and handhq.com in the torrent.
(Feb 07 '10 at 14:20)
David B
|
|
Huge props to the folks over at HandHQ for making these available. Currently working on a research project involving these hands, which will soon be available. |