Myths about Indentation in Python

Posted by epicdev Archive : 2011. 12. 3. 15:25
Source: http://www.secnetix.de/olli/Python/block_indentation.hawk

Python: Myths about Indentation

Note: Lines beginning with ">>>" and "..." indicate input to Python (these are the default prompts of the interactive interpreter). Everything else is output from Python.

There are quite a few prejudices and myths about Python's indentation rules among people who don't really know Python. I'll try to address a few of these concerns on this page.


"Whitespace is significant in Python source code."

No, not in general. Only the indentation level of your statements is significant (i.e. the whitespace at the very left of your statements). Everywhere else, whitespace is not significant and can be used as you like, just like in any other language. You can also insert empty lines that contain nothing (or only arbitrary whitespace) anywhere.

Also, the exact amount of indentation doesn't matter at all, but only the relative indentation of nested blocks (relative to each other).

Furthermore, the indentation level is ignored when you use explicit or implicit continuation lines. For example, you can split a list across multiple lines, and the indentation is completely insignificant. So, if you want, you can do things like this:

>>> foo = [
...            'some string',
...         'another string',
...           'short string'
... ]
>>> print foo
['some string', 'another string', 'short string']

>>> bar = 'this is ' \
...       'one long string ' \
...           'that is split ' \
...     'across multiple lines'
>>> print bar
this is one long string that is split across multiple lines
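The two claims above (only relative indentation matters, and indentation inside continuation lines is ignored) can be checked mechanically. The following is a minimal sketch for Python 3, not part of the original article; the helper name same_structure and the sample sources are made up for illustration. It parses two spellings of the same code and compares the resulting syntax trees.

```python
import ast

# Same function twice: only the width of the indentation differs.
NARROW = "def sign(n):\n  if n < 0:\n    return -1\n  return 1\n"
WIDE = "def sign(n):\n        if n < 0:\n            return -1\n        return 1\n"

# Same list twice: once on ragged continuation lines, once on one line.
RAGGED = "foo = [\n           'some string',\n        'another string'\n]\n"
FLAT = "foo = ['some string', 'another string']\n"


def same_structure(src_a, src_b):
    """Return True if both sources parse to the same syntax tree."""
    # ast.dump() omits line and column positions by default,
    # so only the logical structure is compared.
    return ast.dump(ast.parse(src_a)) == ast.dump(ast.parse(src_b))


print(same_structure(NARROW, WIDE))
print(same_structure(RAGGED, FLAT))
```

Both comparisons should report that the trees are identical, since the parser only ever sees the block structure, never the column numbers.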

"Python forces me to use a certain indentation style."

Yes and no. First of all, you can write the inner block all on one line if you like, so you don't have to care about indentation at all. The following three versions of an "if" statement are all valid and do exactly the same thing (output omitted for brevity):

>>> if 1 + 1 == 2:
...     print "foo"
...     print "bar"
...     x = 42

>>> if 1 + 1 == 2:
...     print "foo"; print "bar"; x = 42

>>> if 1 + 1 == 2: print "foo"; print "bar"; x = 42

Of course, most of the time you will want to write the blocks in separate lines (like the first version above), but sometimes you have a bunch of similar "if" statements which can be conveniently written on one line each.

If you decide to write the block on separate lines, then yes, Python forces you to obey its indentation rules, which simply means: The enclosed block (that's two "print" statements and one assignment in the above example) has to be indented more than the "if" statement itself. That's it. And frankly, would you really want to indent it in any other way? I don't think so.

So the conclusion is: Python forces you to use indentation that you would have used anyway, unless you wanted to obfuscate the structure of the program. In other words: Python does not allow you to obfuscate the structure of a program with bogus indentation. In my opinion, that's a very good thing.

Have you ever seen code like this in C or C++?

/*  Warning:  bogus C code!  */

if (some condition)
        if (another condition)
                do_something(fancy);
else
        this_sucks(badluck);

Either the indentation is wrong, or the program is buggy, because an "else" always applies to the nearest "if", unless you use braces. This is an essential problem in C and C++. Of course, you could resort to always using braces, no matter what, but that's tiresome and bloats the source code, and it doesn't prevent you from accidentally obfuscating the code by still having the wrong indentation. (And that's just a very simple example. In practice, C code can be much more complex.)

In Python, the above problems can never occur, because indentation levels and logical block structure are always consistent. The program always does what you expect when you look at the indentation.
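As an illustration (a Python 3 sketch that is not part of the original article; the function names are hypothetical), here are the two possible readings of the bogus C snippet written in Python. The column at which "else" starts decides which "if" it belongs to, so what you see is what runs.

```python
def else_on_inner(a, b):
    if a:
        if b:
            return "a and b"
        else:
            return "a, not b"   # this else pairs with `if b`
    return "fell through"


def else_on_outer(a, b):
    if a:
        if b:
            return "a and b"
    else:
        return "not a"          # this else pairs with `if a`
    return "fell through"


# Same inputs, different results, exactly as the indentation suggests.
print(else_on_inner(True, False), "/", else_on_outer(True, False))
print(else_on_inner(False, False), "/", else_on_outer(False, False))
```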

Quoting the famous book writer Bruce Eckel:

Because blocks are denoted by indentation in Python, indentation is uniform in Python programs. And indentation is meaningful to us as readers. So because we have consistent code formatting, I can read somebody else's code and I'm not constantly tripping over, "Oh, I see. They're putting their curly braces here or there." I don't have to think about that.


"You cannot safely mix tabs and spaces in Python."

That's right, and you don't want that. To be exact, you cannot safely mix tabs and spaces in C either: While it doesn't make a difference to the compiler, it can make a big difference to humans looking at the code. If you move a piece of C source to an editor with different tabstops, it will all look wrong (and possibly behave differently than it looks at first sight). You can easily introduce well-hidden bugs in code that has been mangled that way. That's why mixing tabs and spaces in C isn't really "safe" either. Also see the "bogus C code" example above.

Therefore, it is generally a good idea not to mix tabs and spaces for indentation. If you use tabs only or spaces only, you're fine.

Furthermore, it can be a good idea to avoid tabs altogether, because the semantics of tabs are not very well-defined in the computer world, and they can be displayed completely differently on different types of systems and editors. Also, tabs often get destroyed or wrongly converted during copy-and-paste operations, or when a piece of source code is inserted into a web page or other kind of markup code.

Most good editors support transparent translation of tabs, automatic indent and dedent. That is, when you press the tab key, the editor will insert enough spaces (not actual tab characters!) to get you to the next position which is a multiple of eight (or four, or whatever you prefer), and some other key (usually Backspace) will get you back to the previous indentation level.

In other words, it's behaving like you would expect a tab key to do, but still maintaining portability by using spaces in the file only. This is convenient and safe.

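Having said that, if you know what you're doing, you can of course use tabs and spaces to your liking, and then use tools like "expand" (on UNIX machines, for example) before giving the source to others. If you use tab characters, Python assumes that tab stops are eight positions apart. (This describes Python 2. Python 3 still counts tabs in steps of eight, but it rejects indentation whose meaning depends on the tab width with a TabError.)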
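A small sketch of the mixing problem, assuming Python 3 (the prompts quoted in this article are Python 2, which behaves differently here). The helper name compiles and the sample source are made up. One block line is indented with a tab and the next with eight spaces; str.expandtabs plays the role of the "expand" tool.

```python
# One block line starts with a tab, the next with eight spaces.
MIXED = "if True:\n\tx = 1\n        y = 2\n"


def compiles(source):
    """Return True if Python accepts the source, False on a TabError."""
    try:
        compile(source, "<example>", "exec")
    except TabError:
        return False
    return True


print(compiles(MIXED))                # mixed tabs and spaces
print(compiles(MIXED.expandtabs(8)))  # spaces only after expansion
```

Note that expandtabs also rewrites tab characters inside string literals, so for real source files a dedicated tool or the editor's own conversion is the safer choice.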


"I just don't like it."

That's perfectly OK; you're free to dislike it (and you're probably not alone). Granted, using indentation to indicate the block structure might be regarded as uncommon and takes some getting used to, but it does have a lot of advantages, and you get used to it very quickly when you seriously start programming in Python.

Having said that, you can use keywords to indicate the end of a block (instead of indentation), such as "endif". These are not really Python keywords, but there is a tool that comes with Python which converts code using "end" keywords to correct indentation and removes those keywords. It can be used as a pre-processor to the Python compiler. However, no real Python programmer uses it, of course. 
[Update] It seems this tool has been removed from recent versions of Python. Probably because nobody really used it.


"How does the compiler parse the indentation?"

The parsing is well-defined and quite simple. Basically, changes to the indentation level are inserted as tokens into the token stream.

The lexical analyzer (tokenizer) uses a stack to store indentation levels. At the beginning, the stack contains just the value 0, which is the leftmost position. Whenever a nested block begins, the new indentation level is pushed on the stack, and an "INDENT" token is inserted into the token stream which is passed to the parser. There can never be more than one "INDENT" token in a row.

When a line is encountered with a smaller indentation level, values are popped from the stack until a value is on top which is equal to the new indentation level (if none is found, a syntax error occurs). For each value popped, a "DEDENT" token is generated. Obviously, there can be multiple "DEDENT" tokens in a row.

At the end of the source code, "DEDENT" tokens are generated for each indentation level left on the stack, until just the 0 is left.
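The stack procedure described above can be sketched in a few lines. This is a simplified illustration in Python 3, not CPython's actual tokenizer: it assumes indentation by spaces only, ignores continuation lines and comments, and the function name indent_tokens is made up.

```python
def indent_tokens(lines):
    """Return the INDENT/DEDENT tokens produced for the given source lines."""
    stack = [0]          # indentation levels seen so far; 0 is the left margin
    tokens = []
    for line in lines:
        if not line.strip():
            continue     # blank lines do not affect the indentation level
        width = len(line) - len(line.lstrip(" "))
        if width > stack[-1]:
            # A nested block begins: push the level, emit exactly one INDENT.
            stack.append(width)
            tokens.append("INDENT")
        else:
            # Pop until the level matches, emitting one DEDENT per pop.
            while width < stack[-1]:
                stack.pop()
                tokens.append("DEDENT")
            if width != stack[-1]:
                raise IndentationError(
                    "unindent does not match any outer indentation level")
    # End of input: close every block that is still open.
    while len(stack) > 1:
        stack.pop()
        tokens.append("DEDENT")
    return tokens


print(indent_tokens(["if a:", "    if b:", "        x = 1", "y = 2"]))
```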

Look at the following piece of sample code:

>>> if foo:
...     if bar:
...         x = 42
... else:
...   print foo
... 

In the following table, you can see the tokens produced on the left, and the indentation stack on the right.

<if> <foo> <:>                    [0]
<INDENT> <if> <bar> <:>           [0, 4]
<INDENT> <x> <=> <42>             [0, 4, 8]
<DEDENT> <DEDENT> <else> <:>      [0]
<INDENT> <print> <foo>            [0, 2]
<DEDENT>                          [0]

Note that after the lexical analysis (before parsing starts), there is no whitespace left in the list of tokens (except possibly within string literals, of course). In other words, the indentation is handled by the lexer, not by the parser.

The parser then simply handles the "INDENT" and "DEDENT" tokens as block delimiters -- exactly like curly braces are handled by a C compiler.
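You can watch the real lexer do this with the standard library's tokenize module. The sketch below assumes Python 3, so the sample from this section is written with print(foo) instead of the Python 2 print statement; the helper name block_tokens is made up.

```python
import io
import tokenize

SOURCE = (
    "if foo:\n"
    "    if bar:\n"
    "        x = 42\n"
    "else:\n"
    "  print(foo)\n"
)


def block_tokens(source):
    """Return the names of the INDENT and DEDENT tokens, in order."""
    readline = io.StringIO(source).readline
    return [
        tokenize.tok_name[tok.type]
        for tok in tokenize.generate_tokens(readline)
        if tok.type in (tokenize.INDENT, tokenize.DEDENT)
    ]


print(block_tokens(SOURCE))
```

The sequence matches the table: two INDENTs going in, two DEDENTs before the "else", then one INDENT and one final DEDENT at the end of the input.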

The above example is intentionally simple. There is more to it, such as continuation lines. These are well-defined, too, and if you're interested you can read about them in the Python Language Reference, which includes a complete formal grammar of the language.

  

Senior Programmer, Happy Programming (3)

Posted by epicdev Archive : 2011. 12. 3. 13:09
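Code review, in which code written by one programmer is examined together with another programmer, is by no means a formality. It is not there to catch bugs in advance, nor to enforce a coding style or a formatting style. When code review is practiced, everyone has to explain the code they wrote to someone else. That explanation first makes the author understand precisely what their own code does, and second makes the reviewer familiar with code written by someone else. That is the point. It may be hard to believe, but the world has more programmers than you would think who do not fully understand the code they wrote themselves. That goes without saying for programmers who rely on the dart-throwing move or the copying move, but even people who write code in a normal way forget what their code does in an instant if they do not write unit tests or do code reviews. Such people would benefit from the chance to explain their code in detail to someone through code review. Just as in Go (baduk), where replaying and re-examining the game you have just played greatly helps to improve your playing strength, code review helps a great deal in programming.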


From the book: 프로그래머 그 다음 이야기: 프로그래머의 길을 생각한다, by 임백준 (로드북, 2011). Category: Computers/IT > Computer Engineering.
 
  

Senior Programmer, Happy Programming (2)

Posted by epicdev Archive : 2011. 12. 3. 13:04
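In practice, however, it is not easy to meet programmers who habitually write unit test code. Most programmers spend almost no time writing unit tests. As soon as the outline of the requirements roughly emerges, they start coding right away. When they get stuck and cannot see a way forward, they do not go back to the beginning to re-examine the design itself or try to solve the underlying problem through refactoring. Instead they declare a global variable of Boolean type and then scatter if-else statements, twisted like a pretzel, as if throwing darts. Having improvised that way they walk on, and when the road is blocked again they clear the way by throwing more Boolean variables and more if-else darts. Then, when the way the code behaves finally looks roughly similar to what the requirements describe, they dust off their hands and stand up. The programming is done. Code written this way inevitably contains numerous logical errors and bugs. But there is no need to worry: when a bug is reported, it can be dealt with by throwing yet more if-else darts. So the code becomes more complex and more obscure each time a bug is reported. Eventually the program becomes a patchwork that even the programmer who wrote it cannot understand.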

 
From the book: 프로그래머 그 다음 이야기: 프로그래머의 길을 생각한다, by 임백준 (로드북, 2011). Category: Computers/IT > Computer Engineering.
  

Senior Programmer, Happy Programming (1)

Posted by epicdev Archive : 2011. 12. 3. 12:58
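Programming resembles Go (baduk) in many ways. The relationship between stronger and weaker players is one of them. When a strong 1-geup (1-kyu) player looks at the moves of a weaker 5-geup player, the intent is so transparent that it is laughable. Yet when the 5-geup player looks at the 1-geup player's moves, they are left breathless and can hardly guess what the opponent is up to. But when a professional player, a master of the highest order, looks at the moves played by the 1-geup player, there will be more than a few that are absurd enough to draw a sigh.

The same relativity of skill applies to the programming world. When a skilled programmer looks at code written by someone less skilled, they sigh in frustration and dismay. The urge to refactor arises on its own. When a less skilled person sees someone more skilled solve hard problems effortlessly with ingenious methods and write clean code, they are left breathless and gasp in admiration. Yet to someone far more skilled still, even the code written by that excellent programmer will contain many parts that are not to their liking.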

 
From the book: 프로그래머 그 다음 이야기: 프로그래머의 길을 생각한다, by 임백준 (로드북, 2011). Category: Computers/IT > Computer Engineering.
  
Source: http://stackoverflow.com/questions/285955/java-get-the-newest-file-in-a-directory


  

String Destroyer Using Reflection

Posted by epicdev Archive : 2011. 11. 16. 21:09
Source: http://snippets.dzone.com/posts/show/7920


Output
Java World!

Note:
The original string ("Hello World") must be shorter than or equal in length to the new string ("Java World!"); otherwise you will get an ArrayIndexOutOfBoundsException.

If the original string is shorter than the new string, the new string is truncated. In the above code, if I use 'value.set("Hello World", "Java World$Rocks!".toCharArray());' then the output is 'Java World$'.

If the original string is longer than the new string, you will get an ArrayIndexOutOfBoundsException.
Stack trace below:
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException
at java.lang.System.arraycopy(Native Method)
at java.lang.String.getChars(Unknown Source)
at java.io.BufferedWriter.write(Unknown Source)
at java.io.Writer.write(Unknown Source)
at java.io.PrintStream.write(Unknown Source)
at java.io.PrintStream.print(Unknown Source)
at java.io.PrintStream.println(Unknown Source)
at com.test.reflection.StringDestroyer.main(StringDestroyer.java:16)

I dug into the String.java source code and found an int field named count that is declared final. Once assigned, the value of count cannot be changed. The System.arraycopy call (in the above exception) uses this count field when copying the char array to another one.
  
Output
  

How to View the Current Stack in Java

Posted by epicdev Archive : 2011. 11. 16. 20:17
Output
  
Source: http://stackoverflow.com/questions/160970/how-do-i-invoke-a-java-method-when-given-the-method-name-as-a-string

Use the Java reflection API.

Coding from the hip, it would be something like:

The parameters identify the very specific method you need, which matters if several overloads are available. If the method has no arguments, give only methodName.

Then you invoke that method by calling

Again, leave out the arguments in .invoke if you don't have any. For the details, read about Java Reflection.
  
To do: read this and write a new article: http://www.ibm.com/developerworks/java/library/j-jtp01255/index.html

Source: http://stackoverflow.com/questions/2927391/whats-the-reason-i-cant-create-generic-array-types-in-java

Arrays of generic types are not allowed because they're not sound. The problem is due to the interaction of Java arrays, which are not statically sound but are dynamically checked, with generics, which are statically sound and not dynamically checked. Here is how you could exploit the loophole:


Source: http://download.oracle.com/javase/tutorial/java/generics/erasure.html

Type Erasure

When a generic type is instantiated, the compiler translates those types by a technique called type erasure — a process where the compiler removes all information related to type parameters and type arguments within a class or method. Type erasure enables Java applications that use generics to maintain binary compatibility with Java libraries and applications that were created before generics.

For instance, Box<String> is translated to type Box, which is called the raw type — a raw type is a generic class or interface name without any type arguments. This means that you can't find out what type of Object a generic class is using at runtime. The following operations are not possible:

public class MyClass<E> {
    public static void myMethod(Object item) {
        if (item instanceof E) {  //Compiler error
            ...
        }
        E item2 = new E();       //Compiler error
        E[] iArray = new E[10];  //Compiler error
        E obj = (E)new Object(); //Unchecked cast warning
    }
}

The operations flagged as errors or warnings in the comments above are meaningless at runtime because the compiler removes all information about the actual type argument (represented by the type parameter E) at compile time.

Type erasure exists so that new code may continue to interface with legacy code. Using a raw type for any other reason is considered bad programming practice and should be avoided whenever possible.

When mixing legacy code with generic code, you may encounter warning messages similar to the following:

Note: WarningDemo.java uses unchecked or unsafe operations.
Note: Recompile with -Xlint:unchecked for details.

This can happen when using an older API that operates on raw types, as shown in the following WarningDemo program:

public class WarningDemo {
    public static void main(String[] args){
        Box<Integer> bi;
        bi = createBox();
    }

    static Box createBox(){
        return new Box();
    }
}

Recompiling with -Xlint:unchecked reveals the following additional information:

WarningDemo.java:4: warning: [unchecked] unchecked conversion
found   : Box
required: Box<java.lang.Integer>
        bi = createBox();
                      ^ 
1 warning 
