Lesson 1 Modeling Data-Driven Strategy Simplifications

Uploaded by

akshay.heaton13
Lesson #1: Modeling Data-Driven Strategy Simplifications

Welcome to lesson #1 of the 30-Day Training Camp. Today, we will expand on what you
have already read in Nick's NL Hold ‘Em Success Guide. If you have not read it yet, be sure to read
through the guide before starting today’s lesson. In Part I, we will use Rock, Paper,
Scissors to model the performance of data-driven strategies. Then, in Part II, we will use
Pio Solver to model a river bluff-catching scenario.

Part I: Modeling GTO & Exploits using Rock, Paper, Scissors

If you’ve been around poker for long enough, you’ve probably heard somebody explain
GTO using Rock, Paper, Scissors. In Part I, we will use some EV calculations to build on
this example and demonstrate how exploitative adjustments are made. As always,
remember:
EV = (Probability of Winning * Amount Won) - (Probability of Losing * Amount Lost)
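
The EV equation above can be written as a small helper. This is only a minimal sketch: the function name `expected_value` and its parameter names are hypothetical, chosen here for illustration. Ties are assumed to return the bet, so they add nothing to the result.

```python
def expected_value(p_win, amount_won, p_lose, amount_lost):
    """EV = (P(win) * amount won) - (P(lose) * amount lost).

    Hypothetical helper for illustration; ties are assumed to contribute $0.
    """
    return p_win * amount_won - p_lose * amount_lost


# A $1 bet that wins half the time, ties a quarter, and loses a quarter.
print(expected_value(0.5, 1.0, 0.25, 1.0))  # 0.25
```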

Let’s imagine you are playing an opponent in a match of Rock, Paper, Scissors. Some
ground rules:

1) You must make a fixed bet of $1 per round.
2) You will play enough rounds that variance is negligible.
3) Your opponent is unaware of your strategy and will not make adjustments during the
match.

Scenario #1:
Let’s assume both players are trying to play optimally. In that event, each will throw
Rock, Paper, and Scissors at random with equal frequency (33% each). Using the standard
EV equation, we find that both players will break even at the end of the match. Neither
player is exploitable, and therefore the EV of each throw, and of the strategy as a whole, is $0.

Optimal vs Optimal Opponent

            Rock     Paper    Scissors   Total Strat EV
W%          33%      33%      33%
T%          33%      33%      33%
L%          33%      33%      33%
Weight      0.33     0.33     0.33
Total EV    $0       $0       $0         $0.00

Scenario #2:
Now, let’s assume that your opponent is wildly imbalanced and throws Paper 100% of the
time. If you continue to play an unexploitable strategy (throwing an even, random mix of
Rock, Paper, and Scissors), the match will still end with both players breaking even: your
Rock always loses $1, your Paper always ties, your Scissors always wins $1, and you throw
each one third of the time. The
strength of an unexploitable strategy is that nothing your opponent does can cause you to
lose EV. While this might seem safe, and even pretty appealing, keep this in mind: We are
not playing poker to break even. We are playing poker to make money. Let’s explore some
scenarios of how we might counter our opponent to generate a +EV strategy.

Optimal vs Paper Only

            Rock     Paper    Scissors   Total Strat EV
W%          0%       0%       100%
T%          0%       100%     0%
L%          100%     0%       0%
Weight      0.33     0.33     0.33
Total EV    -$1      $0       $1         $0.00
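
The break-even result can be checked numerically. The sketch below is an illustration under the ground rules above ($1 fixed bet, ties pay nothing). The names `throw_ev` and `strategy_ev` and the sample opponent mixes are hypothetical. It computes the EV of each throw against an opponent's mix, then weights those EVs by our own mix.

```python
THROWS = ("Rock", "Paper", "Scissors")
BEATS = {"Rock": "Scissors", "Paper": "Rock", "Scissors": "Paper"}  # key beats value


def throw_ev(throw, opp_mix, bet=1.0):
    """EV of one throw: bet * (P(we win) - P(we lose)). Hypothetical helper."""
    p_win = opp_mix[BEATS[throw]]
    p_lose = sum(p for opp, p in opp_mix.items() if BEATS[opp] == throw)
    return bet * (p_win - p_lose)


def strategy_ev(our_mix, opp_mix, bet=1.0):
    """Weight each throw's EV by how often we throw it."""
    return sum(w * throw_ev(t, opp_mix, bet) for t, w in our_mix.items())


uniform = {t: 1 / 3 for t in THROWS}
paper_only = {"Rock": 0.0, "Paper": 1.0, "Scissors": 0.0}
paper_heavy = {"Rock": 0.25, "Paper": 0.5, "Scissors": 0.25}

# Per-throw EVs versus Paper only: Rock -1, Paper 0, Scissors +1.
print({t: throw_ev(t, paper_only) for t in THROWS})

# The even mix breaks even (up to floating-point error) against every opponent mix.
for opp in (uniform, paper_only, paper_heavy):
    print(round(strategy_ev(uniform, opp), 10))
```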

Scenario #3:
Your opponent is still wildly imbalanced and only throws Paper. What is the optimal
counter, remembering that they will not re-adjust? Of course, the optimal counter is to
throw Scissors 100% of the time. Congratulations, you have now made a maximally
exploitative adjustment, and won a lot of money in the process.

Scissors Only vs Paper Only (MES)

            Rock     Paper    Scissors   Total Strat EV
W%          0%       0%       100%
T%          0%       0%       0%
L%          0%       0%       0%
Weight      0        0        1
Total EV    $0       $0       $1         $1.00

Scenario #4:
Let’s get slightly more realistic here. Your opponent is still imbalanced toward throwing Paper
too often, but also throws Rock and Scissors sometimes. For the sake of simplicity, let’s say
that they throw Paper 50%, Rock 25%, and Scissors 25%. How would you adjust to this? For
most people, the intuitive adjustment would look something like this: “I would start to
throw Scissors more often, but still mix in some other throws just in case.”

Let’s take a look at a counter strategy where we throw Scissors 50%, Rock 25%, and Paper
25%. As you can see, this adjustment is profitable. However, this strategy has some
pitfalls. First, it is still somewhat complex, as it involves randomizing each throw. Second,
and most importantly, while it is a profitable strategy, the winrate generated is relatively
small. We are only making about $0.06 per throw with this strategy. This may seem trivial in a
model like this, but imagine the money left on the table when this concept is applied to
poker. Let’s model this in our final scenario.

50% Scissors, 25% Rock, 25% Paper vs. 50% Paper, 25% Rock, 25% Scissors

            Rock     Paper    Scissors   Total Strat EV
W%          25%      25%      50%
T%          25%      50%      25%
L%          50%      25%      25%
Weight      0.25     0.25     0.5
Total EV    -$0.25   $0       $0.25      $0.06
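
The $0.06 figure is a rounding of $0.0625, which can be reproduced with a payoff matrix. This is a sketch for illustration only: the `PAYOFF` layout and the function name `strategy_ev` are assumptions, with both mixes listed in the order Rock, Paper, Scissors.

```python
# Payoff to us for a $1 bet. Rows are our throw, columns are the opponent's
# throw, both ordered Rock, Paper, Scissors. (Hypothetical layout.)
PAYOFF = [
    [0, -1, 1],   # Rock: ties Rock, loses to Paper, beats Scissors
    [1, 0, -1],   # Paper: beats Rock, ties Paper, loses to Scissors
    [-1, 1, 0],   # Scissors: loses to Rock, beats Paper, ties Scissors
]


def strategy_ev(our_mix, opp_mix):
    """Expected profit per round for one mixed strategy against another."""
    return sum(
        our_mix[i] * opp_mix[j] * PAYOFF[i][j]
        for i in range(3)
        for j in range(3)
    )


ours = [0.25, 0.25, 0.50]    # 25% Rock, 25% Paper, 50% Scissors
theirs = [0.25, 0.50, 0.25]  # 25% Rock, 50% Paper, 25% Scissors
print(strategy_ev(ours, theirs))  # 0.0625
```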

Scenario #5:
Now, we will make a maximally exploitative strategy (MES) adjustment to our opponent
who is throwing Paper 50%, Rock 25%, and Scissors 25%. As a counter, we will throw Scissors
100% of the time. After crunching the numbers, we see that this strategy generates
the highest EV! We have an expectation of $0.25 per throw, which is 4x the EV of our
strategy in Scenario #4 ($0.0625 before rounding). Additionally, this strategy maximizes
simplicity, as it only uses
one strategic option. Feel free to experiment with some more scenarios on your own, and
you will find that as soon as your opponent throws Paper more often than optimal while
splitting the rest evenly between Rock and Scissors, the
MES is to always throw Scissors. In general, the EV of Scissors per $1 bet is the opponent's
Paper frequency minus their Rock frequency.

Scissors Only vs 50% Paper, 25% Rock, 25% Scissors (MES)

            Rock     Paper    Scissors   Total Strat EV
W%          0%       0%       50%
T%          0%       0%       25%
L%          0%       0%       25%
Weight      0        0        1
Total EV    $0       $0       $0.25      $0.25
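
To experiment with other opponent mixes, one option is to compute the EV of each pure throw and pick the best one. The sketch below assumes the same $1 bet and a fixed opponent. `pure_evs` and `best_response` are hypothetical names, and the second opponent mix is an invented example.

```python
THROWS = ["Rock", "Paper", "Scissors"]

# Payoff to us for a $1 bet. Rows are our throw, columns are the opponent's
# throw, both ordered Rock, Paper, Scissors. (Hypothetical layout.)
PAYOFF = [
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
]


def pure_evs(opp_mix):
    """EV of always throwing each option against a fixed opponent mix."""
    return {
        THROWS[i]: sum(opp_mix[j] * PAYOFF[i][j] for j in range(3))
        for i in range(3)
    }


def best_response(opp_mix):
    """Return the single throw with the highest EV, and that EV."""
    evs = pure_evs(opp_mix)
    best = max(evs, key=evs.get)
    return best, evs[best]


# Scenario #5 opponent: 25% Rock, 50% Paper, 25% Scissors.
print(best_response([0.25, 0.50, 0.25]))    # ('Scissors', 0.25)

# Invented opponent who overthrows Rock even more than Paper.
print(best_response([0.50, 0.375, 0.125]))  # ('Paper', 0.375)
```

Because EV is linear in our own mix, no mixed counter can beat the best pure throw against a fixed opponent. Which throw that is depends on the opponent's whole distribution: Scissors earns the Paper frequency minus the Rock frequency.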

The same principles of exploitative adjustments hold true in poker. In order to model this,
let’s take a look at a river toy game scenario using Pio Solver.

Part II: Modeling exploitative adjustments in Pio Solver


In this scenario, we have a nuts/air vs. bluff catcher toy game. Both players have 30
combos in range, with the out-of-position (OOP) range containing nuts (full houses) and
pure air (nothing, 64o), and the in-position (IP) range containing pure bluff catchers. We
simplified this toy game to one all-in bet size of 100 chips into a 100-chip pot. You can see
the parameters here:

Scenario #1: River Play at Equilibrium


Let’s explore the out-of-position (OOP) player’s river betting strategy. Using the range
explorer tab, we can see OOP's strategy with different parts of its range. OOP is pushing its
nutted advantage hard by polarizing with its nuts and air, electing to bet 90% of the time.
Put frequency aside for now, and just look at the range composition. In the top right of the
range explorer tab, we can also see the construction of OOP's betting range. OOP is employing a
perfectly optimal value-to-bluff ratio of 2:1 (33.3% bluffs) for a pot-sized bet.

Let's look at IP's response to this strategy. As OOP is playing optimally, IP is forced to do
the same. IP is forced to indifference versus a completely unexploitable opponent whose
value-to-bluff ratio is perfectly optimal. We see that IP is calling 50% of the time, exactly the
minimum defense frequency (MDF) for a pot-sized bet. IP is completely indifferent
with every combo, and its overall strategy has 0 EV.
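
Both equilibrium numbers follow from the standard river formulas for a polarized range against bluff catchers. The sketch below uses hypothetical function names, with the pot and bet sizes from the toy game (100 into 100).

```python
def optimal_bluff_share(pot, bet):
    """Share of bluffs in the betting range that leaves a bluff catcher
    indifferent between calling and folding: bet / (pot + 2 * bet)."""
    return bet / (pot + 2 * bet)


def minimum_defense_frequency(pot, bet):
    """Calling frequency that leaves the bettor's bluffs indifferent:
    pot / (pot + bet)."""
    return pot / (pot + bet)


pot, bet = 100, 100  # pot-sized all-in bet from the toy game
print(round(optimal_bluff_share(pot, bet), 3))  # 0.333, i.e. 2:1 value to bluffs
print(minimum_defense_frequency(pot, bet))      # 0.5
```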

These results are expected in this toy game scenario. Both players are playing completely
optimally, and their overall frequencies and construction fall right in line with what our
basic Optimal Bluff and MDF formulas would predict. Now that we have a GTO baseline
established, let’s see how the solver adjusts if OOP bluffs too often on the river.

Scenario #2: Node Locking OOP to Overbluff


Using Pio Solver’s node lock feature, we have forced OOP to make a very slight overbluff
with its bluff candidate 64o, now betting this combo just 1% more often than at equilibrium.
We also locked its value range to stay the same. This has resulted in a 0.4% increase in
OOP's overall betting frequency. The share of bluffs in OOP's betting range has marginally
increased from 33.3% to 33.6%.
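
The shift to 33.6% can be reproduced from the figures quoted in this lesson: a 90% betting frequency, one third of it bluffs, and 0.4% of extra bluffs added. This is a rough arithmetic sketch with hypothetical names, not output from the solver.

```python
def bluff_share_after_overbluff(bet_freq, bluff_share, extra_bluff_freq):
    """New share of bluffs in the betting range after adding extra bluffs.

    All frequencies are fractions of OOP's whole range; value bets are
    assumed unchanged, as in the node lock described above.
    """
    bluffs = bet_freq * bluff_share
    return (bluffs + extra_bluff_freq) / (bet_freq + extra_bluff_freq)


new_share = bluff_share_after_overbluff(0.90, 1 / 3, 0.004)
print(round(new_share * 100, 1))  # 33.6
```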

This small shift may seem completely trivial, but it actually has a massive impact on IP's
strategy. How do you think IP’s bluff-catching strategy changes versus this node-locked
overbluff? Stop here and take a minute to make a prediction, using the models in Part I for
reference if needed. In other words, how will IP respond when it knows its opponent is
marginally imbalanced toward bluffing too often?

As you can see below, IP has made a hard adjustment to OOP’s overbluff. Similar to the
adjustment in our Rock, Paper, Scissors model, once an imbalance is present, the most
profitable counter is to always exploit that imbalance. IP now makes a pure call with every
single bluff catcher in its range! Additionally, IP's bluff catchers and overall strategy are no longer
indifferent; they are actually +EV! The solver throws MDF out of the window... it does not
care how low in its range it is... it does not care about counting combos... it is not afraid of
calling and losing to OOP's nuts, which make up ~66% of OOP's betting range... the solver only
cares about winning the most money by exploiting the imbalance of its opponent. With
millions of hands showing major imbalances in the player pool, poker strategy really can
be this simple.
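
The reason for the hard switch is that the EV of calling with a bluff catcher is linear in the bettor's bluff share, so any share above the indifference point makes every call profitable. The sketch below is illustrative only. `call_ev` and `best_call_frequency` are hypothetical names, and folding is taken as the 0 EV baseline.

```python
def call_ev(bluff_share, pot, bet):
    """EV of calling with a pure bluff catcher, relative to folding (EV 0).

    We win pot + bet when the bettor is bluffing and lose bet otherwise.
    """
    return bluff_share * (pot + bet) - (1 - bluff_share) * bet


def best_call_frequency(bluff_share, pot, bet, tol=1e-9):
    """Pure call when calling is +EV, pure fold when -EV, None if indifferent."""
    ev = call_ev(bluff_share, pot, bet)
    if ev > tol:
        return 1.0
    if ev < -tol:
        return 0.0
    return None  # indifferent: every calling frequency has the same EV


pot, bet = 100, 100
print(best_call_frequency(1 / 3, pot, bet))  # None (equilibrium, indifferent)
print(round(call_ev(0.336, pot, bet), 2))    # 0.8 chips per call
print(best_call_frequency(0.336, pot, bet))  # 1.0 (always call)
```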

Keep in mind these adjustments occurred after just a 1% increase in OOP’s bluffing
frequency. What do you think would happen in the real world where data indicates the
player pool is overbluffing by 3%, 5%, 8% or more? As a thought exercise, how about
when your opponents are folding too much themselves?

Conclusion:
We hope these modeling exercises have helped convince you that data-driven strategies
outperform by simultaneously maximizing simplicity and EV. If these results seem
surprising to you, that’s okay. These concepts are the cornerstone that Poker Detox has
used to outperform an industry that is too concerned with “balance.” Throughout this
30-Day Training Course, we will share key pieces of data for preflop and postflop spots. Of
course, when you move away from a toy game into a real poker scenario, other concepts
will be at play, such as resilience (disguising exploits), blocker effects, and range vs. range
interactions. Despite this, always remember that when data indicates an imbalance away
from unexploitable play, the GTO exploit is to counter that imbalance by making dramatic
adjustments that are both simpler and higher in EV. Take this concept with you for
every strategy spot you learn throughout the course, and you will come out the other side
as a true red line crusher.

Daily Mission #2 🚀
One struggle players sometimes have with simplified data-driven strategies is deviating
from the strategy too often. Using a simplified data-driven approach to the game, what
spots do you think you would feel yourself wanting to deviate, and why? Would the
deviations really be grounded?
Start a conversation or give a thoughtful reply to another teammate in the #daily-missions
channel. After you do so, as with the first mission, scroll to the appropriate column on the
'Reports' page in the Volume Challenge and give yourself a checkmark (✔️) by your name.

Min-requirement: Demonstrates critical thought on lesson concepts/tools, written
coherently, stays within subject boundaries, 250 words

(For quick access to the Volume Challenge, click the pin icon at the top of the
#daily-missions channel and follow the first link 📌)
