CS212 Unit 5 - Udacity Wiki
CS212 Unit 5
Contents
1. 01 Welcome Back
2. 02 Porcine Probability
3. 03 q The State of Pig
4. 03 s The State of Pig
5. 04 l Concept Inventory
6. 05 p Hold and Roll
7. 05 s Hold and Roll
8. 06 l Named Tuples
9. 07 p Clueless
10. 07 s Clueless
11. 08 p Hold At Strategy
12. 08 s Hold At Strategy
13. 09 p Play Pig
14. 09 s Play Pig
15. 10 l Dependency Injection
16. 11 p Loading the Dice
17. 11 s Loading the Dice
18. 12 q Optimizing Strategy
19. 12 s Optimizing Strategy
20. 13 l Utility
21. 14 q Game Theory
22. 14 s Game Theory
23. 15 q Break Even Point
24. 15 s Break Even Point
25. 16 q Whats your Crossover
26. 17 l Optimal Pig
27. 18 l Pwin
28. 19 p Maxwins
29. 19 s Maxwins
30. 20 l Impressing Pig Scouts
31. 21 p Maximizing Differential
32. 21 s Maximizing Differential
33. 22 l Being Careful
34. 23 p Legal Actions
35. 23 s Legal Actions
36. 24 l Using Tools
37. 25 l Telling A Story
38. 26 q Simulation vs Enumeration
39. 26 s Simulation vs Enumeration
40. 27 l Conditional Probability
41. 28 q Tuesday
42. 28 s Tuesday
43. 29 l Summary
1. 01 Welcome Back
Hey, welcome back. Now, as we've said, this class is all about managing complexity. Now many types of software manage complexity by trying to artificially rule out any type of uncertainty. That is, say you have a checkbook-balancing program, and it says you've got to enter the exact amount. You've got to say $39.27. You can't say, oh I don't know, about $40. It's easier to write programs that deal that way, but it constrains what you can do. So, in this unit we're going to learn about how the laws of probability can allow you to deal with uncertainty in your programs. Now, the truly amazing thing is that you can allow uncertainty in what you know about the world, that is, what's true right now, and uncertainty in your actions: if the program does something, what happens next? Even though both of those are uncertain, you can still use the laws of probability to calculate what it means to do the right thing. That is, we can have clarity of action. We can know exactly what the best thing to do is even though we're uncertain about what's going to happen. So follow along with this unit, and we'll learn how to do that.
2. 02 Porcine Probability
This unit is about probability, which is a tool for dealing with uncertainty. Once you understand probability, you'll be able to tackle a much broader range of problems than you could with programs that don't understand probability. Often when we have problems with uncertainty, we're dealing with search problems. Recall, in a search problem, we are in a current state. There are other states that we can transition into, and we're trying to achieve some goal, but we can't do it all in one step. We have to paste together a sequence
wiki.udacity.com/CS212 Unit 5
1/11
6/1/12
5. 04 l Concept Inventory
At the low level--I count as low-level things like the roll of a die, the implementation of scores, the implementation of the players and of the player to move, the goal--so these are all things that we're going to have to represent. And then at the high level, I'm going to have a function play_pig, that plays a game between two players, and I have the notion of a strategy--a strategy that a player is taking in order to play the game. Now let's think about how to implement these things, and when I'm doing the implementation, I'm going to move top-down. So I started sort of middle-out, saying these are the kinds of things I think I'm going to need; now I have a good enough feel for them that I feel confident in moving top-down. I don't see any difficulties in implementing any of these pieces. If I start at the top, then I'll be able to make choices later on without feeling constrained. If I thought there was something down here that was difficult to deal with, I might spend more time now, at the low level, trying to resolve what the right representation is for one of these difficult pieces, and that would inform my high-level decisions. But since I don't see any difficulty, I'm going to jump to the high level. Now, what's
8. 06 l Named Tuples
Now here's an alternative. Instead of defining a state by just creating a tuple and then getting at the fields of a state by doing an assignment, we can use something called a namedtuple, which gives a name to the tuple itself as well as to the individual elements. We can define a new data type called State; I use capitalized names for data types. Say State is equal to a namedtuple, and the name of the data type is State, and the fields of the data type are p, me, you, and pending. So I can just go ahead and make that assertion. namedtuple is in a module, so "from collections import namedtuple" gives me access to it. Now I can say s = State(1, 2, 3, 4), and I can ask for the components of s by name. How would I choose between this representation for states and the normal tuple representation? Well, the namedtuple has a couple of advantages. It's explicit about the types. It helps you catch errors. So if you ask for the p field of something that's not a State, that would give you an error, whereas if you just broke up something that had four elements into these components, that would work even if it didn't happen to be a proper state. There are a few negatives as well. It's a little bit more verbose, although not so much, and it may be unfamiliar to some programmers. It may take them a while to understand what namedtuples mean. I should say we could also do the same type of thing by defining a class. That has all the same positives, and it's certainly familiar to most Python programmers, but it would be even more verbose. Here's what hold and roll look like in this new notation. So, hold--where we're explicitly creating a new State. We look at the state.p, the state.you, the state.me, and the state.pending and so on, and similarly for roll. They look fairly similar. You notice the lines are a little bit longer because we're being more explicit. So, it takes a little bit more to say that.
I'm sort of up in the air whether this representation is better than the previous representation with tuples. I could go either way.
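As a rough sketch of what the namedtuple version might look like: the field names p, me, you, and pending come from the lecture, but the `other` lookup and the exact rules inside hold and roll (rolling a 1 scores one point and passes the turn) are assumptions based on the usual rules of Pig, not code shown in this transcript.

```python
from collections import namedtuple

# The data type is capitalized; its fields are the ones named in the lecture.
State = namedtuple('State', 'p me you pending')

# Assumption: players are numbered 0 and 1, and `other` maps each to the opponent.
other = {0: 1, 1: 0}

def hold(state):
    """Bank the pending points and pass the turn to the other player."""
    return State(other[state.p], state.you, state.me + state.pending, 0)

def roll(state, d):
    """Apply die roll d: a 1 forfeits pending (scoring 1 point) and passes the turn;
    any other roll is added to pending."""
    if d == 1:
        return State(other[state.p], state.you, state.me + 1, 0)
    return State(state.p, state.me, state.you, state.pending + d)

s = State(1, 2, 3, 4)
print(s.p, s.me, s.you, s.pending)   # components are accessed by name
print(hold(s))
print(roll(s, 6))
```

Because a namedtuple is still a tuple, code that unpacks a state with `(p, me, you, pending) = state` keeps working, so the two representations can coexist while you decide between them.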
9. 07 p Clueless
Now I'm going to talk about strategies for a minute. Remember, a strategy is a function that takes a state as input, and its output is one of the action names, 'roll' or 'hold'. I want you to write a strategy function, which we're calling clueless. So it's a function that takes a state as input, and it's going to return one of the possible moves, 'roll' or 'hold'. It does that by ignoring the state and just choosing one of the possible moves at random. So go ahead and write that.
10. 07 s Clueless
Here's my solution: I gave you the hint of importing the random module. I just call the random.choice function, which takes a sequence of possible moves and picks one at random.
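A minimal sketch of that solution might look like this. The name possible_moves is an assumption made for illustration; the strategy name clueless and the use of random.choice come from the lecture.

```python
import random

# Assumed name for the list of action names a strategy may return.
possible_moves = ['roll', 'hold']

def clueless(state):
    """A strategy that ignores the state and picks one of the possible moves at random."""
    return random.choice(possible_moves)

# The state argument is accepted but never inspected.
print(clueless((0, 0, 0, 0)))
```

Note that random.choice needs an indexable sequence such as a list or tuple, so the moves are kept in a list here.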
20. 13 l Utility
Now in economics and in game theory, the value of a state is called its utility. It's just a number. So we're at the end of the game, and if there's 1 state where we win, we'll give that a utility of 1. If there's another state where we lose, we'll give that a utility of 0. Now if I have a choice here--it's my turn to move--I have a choice to go either way. I'm going to maximize my choice, and I'm going to move there. That means the utility of this state is going to be 1 because I know I can get 1 by taking the optimal strategy. We keep backing up the tree that way. That's if it was my choice here. If it was my opponent's choice, and they could go either way, then my opponent is going to minimize my score or maximize their score and go in this direction, forcing me to lose and allowing them to win. So we'll say the utility of this state is 0 for me. And I want to also introduce here another idea called the quality, which is a function of a state and an action and gives us a number--a utility number. So that's saying, what's the quality of this action in this particular state? So if these were the actions, hold and roll, then we'd say for my opponent the quality of rolling from this state would give us this utility, and the quality of
27. 18 l Pwin
Now what's the probability of winning from a state? It seems complicated. It seems like we've got a lot of work to do, but actually, we've almost solved the whole thing. All we have to do is say, "What's the end point?" So remember, we start out in the start position, and then we have some end positions where the game is over, and we have to assign utilities, which is the same as probability of winning, which is either 0 or 1. So this is a losing state, so it gets a Pwin of 0. This is a winning state. It gets a Pwin of 1. We assign all of those, and then all the other states that depend on these-- we've already figured that out in terms of the Q function. Let's see how that works. So the probability of winning is 1 if my current score plus the pending is greater than or equal to goal. Then I win automatically just by reaping those pending points. My probability of winning is 0 if your score is greater than or equal to the goal and I haven't won. And otherwise, my probability of winning is the probability that I get by taking the best action. So for all the actions-- among all the actions I can do, look for the Q value of that action-- from the current state according to the utility function-- try to maximize that, and that's going to be my probability of winning. So that's saying I can make the best choice that I can. So we said that we had 3 choice points. Here, I'm making the best choice by maximizing. Here, the die gets to roll, and we're averaging-- we're summing them all up and dividing by 6, so that takes care of the averaging-- and what about the worst choice that the opponent makes? Well, that's just folded in because rather than explicitly worrying about me and my opponent, I just said, "Well, I can use That's the probability of the opponent winning.
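Written as a recurrence, the definition just described reads roughly as follows. Here s is a state with fields me, you, and pending, Q is the quality function from the earlier lesson, and the symbol names are shorthand chosen for this note. Treating a score that reaches the goal exactly as a finished game is an assumption.

```latex
P_{\text{win}}(s) =
\begin{cases}
1 & \text{if } \mathit{me} + \mathit{pending} \ge \mathit{goal} \\
0 & \text{if } \mathit{you} \ge \mathit{goal} \\
\max\limits_{a \,\in\, \mathit{actions}(s)} Q(s, a, P_{\text{win}}) & \text{otherwise}
\end{cases}
```

The first two cases are the end points with utility 1 and 0; the third is the "best choice" step, with the averaging over die rolls hidden inside Q.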
28. 19 p Maxwins
So now we're almost there. We've defined the problem. We've defined how the game works, and we're ready to write the optimal strategy-- the best possible strategy for Pig, and I'll let you finish it off. We'll call this function "max_wins"-- the strategy function that maximizes the number of wins, or at least the number of expected wins-- and go ahead and write your code there, and you can write it in terms of the functions we've defined above and in terms of a call to best_action.
29. 19 s Maxwins
And this is all it is. We just call best_action from the current state, using the Pig actions, using the quality function for Pig, and trying to maximize the probability of winning.
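To show how the pieces might fit together, here is a self-contained sketch. The names max_wins, best_action, and Pwin come from the lecture; the spellings pig_actions and Q_pig, the bodies of hold, roll, pig_actions, and Q_pig, the use of functools.lru_cache for memoization, and the small goal of 10 are all assumptions made so the example runs quickly on its own.

```python
from collections import namedtuple
from functools import lru_cache

State = namedtuple('State', 'p me you pending')
other = {0: 1, 1: 0}
goal = 10   # assumption: a small goal keeps the sketch fast and the recursion shallow

def hold(state):
    """Bank pending points and pass the turn."""
    return State(other[state.p], state.you, state.me + state.pending, 0)

def roll(state, d):
    """A 1 scores one point and passes the turn; anything else adds to pending."""
    if d == 1:
        return State(other[state.p], state.you, state.me + 1, 0)
    return State(state.p, state.me, state.you, state.pending + d)

def pig_actions(state):
    """Legal actions; holding with nothing pending is excluded (assumed rule)."""
    return ['roll', 'hold'] if state.pending else ['roll']

def Q_pig(state, action, U):
    """Expected utility of taking action in state, given utility function U."""
    if action == 'hold':
        # After a hold it is the opponent's turn, so my chance is 1 minus theirs.
        return 1 - U(hold(state))
    if action == 'roll':
        # Average over six faces; a 1 hands the turn to the opponent.
        return (1 - U(roll(state, 1))
                + sum(U(roll(state, d)) for d in (2, 3, 4, 5, 6))) / 6
    raise ValueError('unknown action: %r' % (action,))

@lru_cache(maxsize=None)
def Pwin(state):
    """Probability that the player to move wins, with both sides playing optimally."""
    if state.me + state.pending >= goal:
        return 1
    if state.you >= goal:
        return 0
    return max(Q_pig(state, action, Pwin) for action in pig_actions(state))

def best_action(state, actions, Q, U):
    """Return the legal action with the highest expected utility."""
    def EU(action):
        return Q(state, action, U)
    return max(actions(state), key=EU)

def max_wins(state):
    """The strategy that maximizes the probability of winning."""
    return best_action(state, pig_actions, Q_pig, Pwin)

print(max_wins(State(0, 0, 0, 0)))   # only rolling is legal with nothing pending
print(max_wins(State(0, 5, 0, 5)))   # holding here reaches the goal
```

The memoization matters: without it the same states would be re-evaluated many times over. With a much larger goal the recursion can get deep enough to need a different approach.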
41. 28 q Tuesday
Now let's move on to a slightly more complicated question: out of all the families with two kids-- with at least 1 boy, born on a Tuesday-- what's the probability of two boys? Now, you might think that the answer should be the same-- it should still be 1/3 because why does Tuesday matter? After all, the kid's gotta be born sometime and if it happens to be Tuesday, why would that be any different than any other day? So is it 1/3? Well, as Gottfried Leibniz said, "Let us calculate." So we have the technology to model that. First, a random variable for day of the week-- and I had to fool around with the capitalization there, to make sure that we have 7 distinct letters: Sunday, Monday, Tuesday, Wednesday, Thursday, Friday, Saturday-- plus a sample space of two_kids_bday, one kid with their day of birth; the second kid, with their day of birth. What does that look like? Well, it's this huge thing of (2 X 7 X 2 X 7) entries. The first one: boy born on Sunday, boy born on Sunday, all the way through to the last one: girl born on Saturday, girl born on Saturday. Then a boy born on Tuesday is all the elements of this where "BT" appears in the string. So either "BT" will be the first 2 characters or the last 2 characters. And now we're finally at the point where we can say: given at least 1 boy_tuesday, what's the probability of two_boys? And before I show the results, I'm going to ask you what you think it is. You could follow along, either with pencil and paper or do the computation or just think it out in your head. So enter it as a fraction. If you think it's 1/3, put a 1 here and a 3 here--or whatever.
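If you want to do the computation yourself, the model described above might be set up along these lines. The names two_kids_bday, boy_tuesday, and two_boys follow the lecture; the helper names product and condP, the variable names sex and day, and the particular seven letters are assumptions made for this sketch.

```python
import itertools
from fractions import Fraction

sex = 'BG'
day = 'SMTWtFs'   # seven distinct letters; lowercase 't' and 's' are Thursday and Saturday

def product(*variables):
    """Every combination of one value per variable, each joined into a string."""
    return [''.join(combo) for combo in itertools.product(*variables)]

def condP(predicate, event):
    """P(predicate | event): the fraction of outcomes in event that satisfy predicate."""
    matching = [s for s in event if predicate(s)]
    return Fraction(len(matching), len(event))

def two_boys(s):
    """True if both children in outcome s are boys."""
    return s.count('B') == 2

# Each outcome is a 4-character string: sex, day, sex, day.
two_kids_bday = product(sex, day, sex, day)

# At least one boy born on Tuesday: 'BT' can only start at position 0 or 2,
# because no day letter is 'B'.
boy_tuesday = [s for s in two_kids_bday if 'BT' in s]

print(len(two_kids_bday), two_kids_bday[0], two_kids_bday[-1])
answer = condP(two_boys, boy_tuesday)   # print(answer) after making your own guess
```

Because every outcome in the sample space is equally likely, counting matching outcomes and dividing gives an exact fraction rather than an estimate.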
42. 28 s Tuesday
If I go ahead and execute this and print the result, it comes out: 13/27. Wow! Where did that come from? So that's surprising--first of all, it's not 1/3, which you might have thought should be the answer if you believe the argument that Tuesday doesn't matter. And secondly, not only is it not 1/3, but it's much closer to 1/2 than it is to 1/3. So just having the birthday there really changed things a lot. How did that happen? Well, I wrote up a little function here to report my findings, and here are its arguments. You can give it a bunch of cases that you care about-- the predicate that you care about and whether you want the results to be verbose or not. And it just prints out some information-- and, by the way, as part of this, I also looked at the question of what's the probability of two_boys, given that there's one boy born in December, so I threw that in as well. And here's the output I get: 2 boys, given 1 boy is 1/3; and born on Tuesday is 13/27, and born in December is 23/47. Now, I can turn on the verbose option. In that case, here's what I see: The probability of 2 boys, given at least 1 boy-- born on Tuesday--is 13/27. And here's the reason--at least 1 boy, born on Tuesday, has 27 elements--and there they are-- and of these, 13 are 2 boys--and there they are. And so, you can't really argue with that. You can go through and you can make sure that that's correct, and you can look at the other elements of the sample space and say no, we didn't miss any-- so that's got to be the right answer. It's not quite intuitive yet, and I'd like to define my report function so that it gives me that intuition but right now, I don't have the right visualization. So I've got to do some of the work myself. And here's what I came up with: We still have the four possibilities that we showed before but now we're interested, not just in boys-- we're interested in boys born on Tuesday.
So there's going to be some others over here where there's, say, boy born on Wednesday, along with some other partner-- maybe a boy born on Saturday. But we're not even considering them; we're throwing all those out. We're just considering the ones that match here. And like before, we draw 2 circles: one for the right-hand side of the event-- of the conditional probability. And so how many of those are there? Well, there's 7 possibilities here because the boy has to be born on Tuesday-- there's only 1 way to do that--but there's 7 ways for the girl to be born. So there's 7 elements of the sample space there; likewise, 7 elements over here. Now how many elements over here? Well here, either one of the 2 can be a boy born on Tuesday. So really, we should draw this state as either a boy born on Tuesday, followed by another boy or a boy, followed by a boy born on Tuesday. And how many of those are there? Well, there's 7 of these by the same argument we used in the other case, and of these, there's also 7, but now I've double-counted because one of these 14 cases is a boy born on Tuesday, followed by a boy born on Tuesday, and it shows up in both groups. So I'll just count 6 here. And so now it should be clear: 7, 14, 21, 6, 27. There's 27 on the right-hand side, and then what's the probability of 2 boys, given this event of at least 1 boy born on Tuesday? Well, 2 boys--that's here, 7 plus 6--so it's 13 out of the 27. So that's the result. Seems hard to argue with. Both drawing it out with a pen and computing it worked out to the same answer. Now why is it that we have a strong intuition that knowing the boy was born on Tuesday shouldn't make any difference? I think the answer is because we're associating that fact with an individual boy. We're like taking that fact and nailing it on to him--and it's true. If we did that, that wouldn't make any difference. But, in this situation, that's not what we're doing. We're
43. 29 l Summary
So let's summarize what we did in this Unit. We learned that probability is a powerful tool for tackling problems with uncertainty. We learned that we can do Search with uncertainty, like we did Search in the previous Unit over exact, certain domains. Here, we can handle uncertainty in our Search. We learned that the notion of Utility gives us a powerful and beautiful general approach to solving the Search problems. It gives us the best_action function with which we can solve any problem that can be specified in the form that best_action expects, and that's a wide variety of problems. Now, some of them are so complex that they can't be computed in a feasible amount of time. And there are more advanced techniques for dealing with approximations to that. But it's incredibly powerful because it separates out the How versus the What. You only have to tell the computer what the situation is. You don't have to tell it how to find the best answer, and it automatically finds the best answer. And we learned you can deal with probability through simulation, making repeated random choices, and just counting up how many times one answer occurs, versus another. And we learned that if the total number of possibilities is small, you can just enumerate them. You can count them all, and you can get an exact answer, as an exact fraction rather than an approximation. And we learned some general strategies that don't have to do with probability. When we were trying to figure out how to add printing to our game, we looked at the notion of a wrapper function. That is, how we inject functionality into an existing function, by sneaking it in on top of one of the arguments. And this is an example of aspect-oriented programming, where we take the aspect of printing out what's happening and keep that separate from the main logic of the program. We learned that you can do exploratory data analysis.
When I was looking at the two strategies for playing Pig and where they differed, that was a completely different question than what I'd designed the Pig program for. Because I had put together the right pieces, it was easy to do the exploration and come to an understanding. And we learned--or, at least, I learned because I was the one who made the mistake-- that errors can pop up, particularly in the types of arguments and results that functions expect in return-- and that you have to be careful, in Python, to deal with that because Python doesn't give you the seatbelts that other languages have, to protect yourself from those types of errors. So you have to be vigilant, on your own. And finally, that was a lot to cram into one Unit. So if you followed along with all of that-- congratulations, for the work you've done. You've learned a lot. Have fun with the homework; we'll see you in the next Unit.
last edited 2012-05-21 03:04:24 by Ed Grochowski