Python Myths About Indentation
hawk
There are quite a few prejudices and myths about Python's indentation rules among people who don't really know Python. I'll try to address a few of these concerns on this page.
    >>> if 1 + 1 == 2:
    ...     print "foo"
    ...     print "bar"
    ...     x = 42

    >>> if 1 + 1 == 2:
    ...     print "foo"; print "bar"; x = 42

    >>> if 1 + 1 == 2: print "foo"; print "bar"; x = 42

Of course, most of the time you will want to write the blocks on separate lines (like the first version above), but sometimes you have a bunch of similar "if" statements which can be conveniently written on one line each.

If you decide to write the block on separate lines, then yes, Python forces you to obey its indentation rules, which simply means: the enclosed block (that's two "print" statements and one assignment in the above example) has to be indented more than the "if" statement itself. That's it. And frankly, would you really want to indent it in any other way? I don't think so.

So the conclusion is: Python forces you to use indentation that you would have used anyway, unless you wanted to obfuscate the structure of the program. In other words: Python does not allow you to obfuscate the structure of a program by using bogus indentation. In my opinion, that's a very good thing.

Have you ever seen code like this in C or C++?

    /* Warning: bogus C code! */
    if (some condition)
        if (another condition)
            do_something(fancy);
    else
        this_sucks(badluck);

Either the indentation is wrong, or the program is buggy, because an "else" always applies to the nearest "if", unless you use braces. This is an essential problem in C and C++. Of course, you could resort to always using braces, no matter what, but that's tiresome and bloats the source code, and it doesn't prevent you from accidentally obfuscating the code by still having the wrong indentation. (And that's just a very simple example. In practice, C code can be much more complex.)

In Python, the above problems can never occur, because indentation levels and logical block structure are always consistent. The program always does what you expect when you look at the indentation.

Quoting the famous book writer Bruce Eckel:

    Because blocks are denoted by indentation in Python, indentation is uniform in Python programs. And indentation is meaningful to us as readers. So because we have consistent code formatting, I can read somebody else's code and I'm not constantly tripping over, "Oh, I see. They're putting their curly braces here or there." I don't have to think about that.
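To make the point about "else" concrete, here is a minimal sketch in Python. The function names and return strings are hypothetical, chosen only for illustration, and the code uses Python 3 syntax rather than the Python 2 "print" statement seen in the interactive examples. The two functions differ only in how far the "else" is indented, and that alone decides which "if" it belongs to.

```python
def nested_else(a, b):
    """The `else` lines up with the inner `if`, so it belongs to the inner `if`."""
    if a:
        if b:
            return "both"
        else:
            return "a only"
    return "not a"


def outer_else(a, b):
    """The `else` lines up with the outer `if`, so it belongs to the outer `if`."""
    if a:
        if b:
            return "both"
    else:
        return "not a"
    return "a only"
```

In both cases the indentation a reader sees is exactly the structure the interpreter uses, so the mismatch shown in the C example cannot be written down.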
indentation level is pushed on the stack, and an "INDENT" token is inserted into the token stream which is passed to the parser. There can never be more than one "INDENT" token in a row.

When a line is encountered with a smaller indentation level, values are popped from the stack until a value is on top which is equal to the new indentation level (if none is found, a syntax error occurs). For each value popped, a "DEDENT" token is generated. Obviously, there can be multiple "DEDENT" tokens in a row.

At the end of the source code, "DEDENT" tokens are generated for each indentation level left on the stack, until just the 0 is left.

Look at the following piece of sample code:

    >>> if foo:
    ...     if bar:
    ...         x = 42
    ... else:
    ...   print foo
    ...

In the following table, you can see the tokens produced on the left, and the indentation stack on the right.

    <if> <foo> <:>                    [0]
    <INDENT> <if> <bar> <:>           [0, 4]
    <INDENT> <x> <=> <42>             [0, 4, 8]
    <DEDENT> <DEDENT> <else> <:>      [0]
    <INDENT> <print> <foo>            [0, 2]
    <DEDENT>                          [0]
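The stack handling described above can be sketched in a few lines of Python. This is only an illustration, not the code of the real lexer: the function name `indent_tokens` is hypothetical, each line is kept as a single string instead of being split into individual tokens, and the sketch assumes spaces only, with no tabs, comments, or continuation lines. It is written in Python 3, so the sample uses `print(foo)`.

```python
def indent_tokens(lines):
    """Return a flat list of "INDENT", "DEDENT" and stripped source lines.

    Assumptions of this sketch: indentation uses spaces only, there are
    no continuation lines, and blank lines are simply skipped.
    """
    stack = [0]  # the stack starts with the leftmost position, 0
    tokens = []
    for line in lines:
        stripped = line.lstrip(" ")
        if not stripped.strip():
            continue  # blank lines do not affect indentation
        level = len(line) - len(stripped)
        if level > stack[-1]:
            # Deeper than before: push the level, emit exactly one INDENT.
            stack.append(level)
            tokens.append("INDENT")
        else:
            # Shallower: pop until the level matches, one DEDENT per pop.
            while level < stack[-1]:
                stack.pop()
                tokens.append("DEDENT")
            if level != stack[-1]:
                raise IndentationError("dedent does not match any outer level")
        tokens.append(stripped.rstrip())
    # End of source: close every block that is still open.
    while len(stack) > 1:
        stack.pop()
        tokens.append("DEDENT")
    return tokens


sample = [
    "if foo:",
    "    if bar:",
    "        x = 42",
    "else:",
    "  print(foo)",
]
result = indent_tokens(sample)
```

For the sample, `result` contains the "INDENT" and "DEDENT" entries in the same positions as in the table: two "DEDENT" entries in a row before `else:`, and a final one at the end of the input.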
Note that after the lexical analysis (before parsing starts), there is no whitespace left in the list of tokens (except possibly within string literals, of course). In other words, the indentation is handled by the lexer, not by the parser. The parser then simply handles the "INDENT" and "DEDENT" tokens as block delimiters, exactly like curly braces are handled by a C compiler.

The above example is intentionally simple. There is more to it, such as continuation lines. These are well-defined, too, and if you're interested you can read about them in the Python Language Reference, which includes a complete formal grammar of the language.
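If you want to see the real token stream, the standard library's `tokenize` module exposes it. The following is a small sketch in Python 3 that feeds the sample code from above (with `print(foo)` instead of the Python 2 "print" statement) to `tokenize.generate_tokens` and keeps only the block delimiters. The helper name `block_delimiters` is made up for this example.

```python
import io
import tokenize

source = (
    "if foo:\n"
    "    if bar:\n"
    "        x = 42\n"
    "else:\n"
    "  print(foo)\n"
)


def block_delimiters(src):
    """Return the names of the INDENT/DEDENT tokens in `src`, in order."""
    names = []
    for tok in tokenize.generate_tokens(io.StringIO(src).readline):
        name = tokenize.tok_name[tok.type]
        if name in ("INDENT", "DEDENT"):
            names.append(name)
    return names


delimiters = block_delimiters(source)
print(delimiters)
# ['INDENT', 'INDENT', 'DEDENT', 'DEDENT', 'INDENT', 'DEDENT']
```

The tokenizer does not need `foo` or `bar` to be defined, because it only looks at the text. Every "INDENT" is matched by a "DEDENT", just as every opening brace in C is matched by a closing one.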