your programing

[]이 list ()보다 빠른 이유는 무엇입니까?

lovepro 2020. 9. 30. 11:14
반응형

[]이 list ()보다 빠른 이유는 무엇입니까?


최근의 처리 속도 비교 []list()그 발견 놀랐다 []실행 빠르게 세 번 것보다 더 이상 list(). 나는과 동일한 테스트를 실행 {}하고 dict(): 결과는 실질적으로 동일했다 []{}동안 모두 약 0.128sec / 만회했다 list()dict()약 0.428sec / 만회 각했다.

왜 이런거야? 수행 []하고 {}(아마 ()''도) 자신의 명시 적 이름의 대응이 (동안 즉시 일부 빈 재고 리터럴의 복사본을 다시 전달 list(), dict(), tuple(), str())가 완전히 실제로 요소가 있는지 여부, 개체를 만드는 방법에 대해 이동?

이 두 가지 방법이 어떻게 다른지 모르겠지만 알고 싶습니다. 문서 또는 SO에서 답을 찾을 수 없었고 빈 괄호를 검색하는 것이 예상보다 더 문제가되는 것으로 나타났습니다.

나는 호출하여 내 타이밍 결과를 가지고 timeit.timeit("[]")timeit.timeit("list()"), 및 timeit.timeit("{}")timeit.timeit("dict()")각각 목록 및 사전을 비교. 저는 Python 2.7.9를 실행하고 있습니다.

나는 최근에 " 왜 True가 1보다 느린가? "라는 것을 발견 했다. 이것은 if Trueto 의 성능을 비교하고 if 1유사한 리터럴 대 글로벌 시나리오를 다루는 것처럼 보인다. 아마도 고려할 가치가 있습니다.


때문에 []하고 {}있습니다 리터럴 구문 . 파이썬은 목록이나 사전 객체를 만들기 위해 바이트 코드를 만들 수 있습니다 :

>>> import dis
>>> dis.dis(compile('[]', '', 'eval'))
  1           0 BUILD_LIST               0
              3 RETURN_VALUE        
>>> dis.dis(compile('{}', '', 'eval'))
  1           0 BUILD_MAP                0
              3 RETURN_VALUE        

list()그리고 dict()별도의 개체입니다. 이름을 확인해야하고, 인수를 푸시하기 위해 스택을 포함해야하며, 나중에 검색하기 위해 프레임을 저장해야하며, 호출을해야합니다. 모두 시간이 더 걸립니다.

비어있는 경우에, 그 방법 당신은에있는 아주 최소한 LOAD_NAME(전역 네임 스페이스뿐만 아니라 통해 검색해야하는 __builtin__모듈 a로 다음) CALL_FUNCTION현재 프레임을 유지해야한다 :

>>> dis.dis(compile('list()', '', 'eval'))
  1           0 LOAD_NAME                0 (list)
              3 CALL_FUNCTION            0
              6 RETURN_VALUE        
>>> dis.dis(compile('dict()', '', 'eval'))
  1           0 LOAD_NAME                0 (dict)
              3 CALL_FUNCTION            0
              6 RETURN_VALUE        

다음을 사용하여 이름 조회 시간을 별도로 지정할 수 있습니다 timeit.

>>> import timeit
>>> timeit.timeit('list', number=10**7)
0.30749011039733887
>>> timeit.timeit('dict', number=10**7)
0.4215109348297119

시간 불일치로 인해 사전 해시 충돌이있을 수 있습니다. 해당 객체를 호출 한 시간에서 해당 시간을 빼고 결과를 리터럴 사용 시간과 비교합니다.

>>> timeit.timeit('[]', number=10**7)
0.30478692054748535
>>> timeit.timeit('{}', number=10**7)
0.31482696533203125
>>> timeit.timeit('list()', number=10**7)
0.9991960525512695
>>> timeit.timeit('dict()', number=10**7)
1.0200958251953125

따라서 개체를 호출하는 데는 1.00 - 0.31 - 0.30 == 0.391,000 만 호출 당 추가로 몇 초가 걸립니다 .

전역 이름을 로컬로 별칭을 지정하여 전역 조회 비용을 피할 수 있습니다 ( timeit설정을 사용 하면 이름에 바인딩하는 모든 것이 로컬 임).

>>> timeit.timeit('_list', '_list = list', number=10**7)
0.1866450309753418
>>> timeit.timeit('_dict', '_dict = dict', number=10**7)
0.19016098976135254
>>> timeit.timeit('_list()', '_list = list', number=10**7)
0.841480016708374
>>> timeit.timeit('_dict()', '_dict = dict', number=10**7)
0.7233691215515137

하지만 그 CALL_FUNCTION비용 을 결코 극복 할 수 없습니다 .


list()전역 조회 및 함수 호출이 필요하지만 []단일 명령어로 컴파일됩니다. 보다:

Python 2.7.3
>>> import dis
>>> print dis.dis(lambda: list())
  1           0 LOAD_GLOBAL              0 (list)
              3 CALL_FUNCTION            0
              6 RETURN_VALUE        
None
>>> print dis.dis(lambda: [])
  1           0 BUILD_LIST               0
              3 RETURN_VALUE        
None

왜냐하면 문자열을 목록 객체로 변환 list하는 함수 이기 때문에 []박쥐에서 목록을 만드는 데 사용됩니다. 이것을 시도하십시오 (당신에게 더 이해가 될 것입니다) :

x = "wham bam"
a = list(x)
>>> a
["w", "h", "a", "m", ...]

동안

y = ["wham bam"]
>>> y
["wham bam"]

입력 한 내용이 포함 된 실제 목록을 제공합니다.


The answers here are great, to the point and fully cover this question. I'll drop a further step down from byte-code for those interested. I'm using the most recent repo of CPython; older versions behave similar in this regard but slight changes might be in place.

Here's a break down of the execution for each of these, BUILD_LIST for [] and CALL_FUNCTION for list().


The BUILD_LIST instruction:

You should just view the horror:

PyObject *list =  PyList_New(oparg);
if (list == NULL)
    goto error;
while (--oparg >= 0) {
    PyObject *item = POP();
    PyList_SET_ITEM(list, oparg, item);
}
PUSH(list);
DISPATCH();

Terribly convoluted, I know. This is how simple it is:

  • Create a new list with PyList_New (this mainly allocates the memory for a new list object), oparg signalling the number of arguments on the stack. Straight to the point.
  • Check that nothing went wrong with if (list==NULL).
  • Add any arguments (in our case this isn't executed) located on the stack with PyList_SET_ITEM (a macro).

No wonder it is fast! It's custom-made for creating new lists, nothing else :-)

The CALL_FUNCTION instruction:

Here's the first thing you see when you peek at the code handling CALL_FUNCTION:

PyObject **sp, *res;
sp = stack_pointer;
res = call_function(&sp, oparg, NULL);
stack_pointer = sp;
PUSH(res);
if (res == NULL) {
    goto error;
}
DISPATCH();

Looks pretty harmless, right? Well, no, unfortunately not, call_function is not a straightforward guy that will call the function immediately, it can't. Instead, it grabs the object from the stack, grabs all arguments of the stack and then switches based on the type of the object; is it a:

We're calling the list type, the argument passed in to call_function is PyList_Type. CPython now has to call a generic function to handle any callable objects named _PyObject_FastCallKeywords, yay more function calls.

This function again makes some checks for certain function types (which I cannot understand why) and then, after creating a dict for kwargs if required, goes on to call _PyObject_FastCallDict.

_PyObject_FastCallDict finally gets us somewhere! After performing even more checks it grabs the tp_call slot from the type of the type we've passed in, that is, it grabs type.tp_call. It then proceeds to create a tuple out of of the arguments passed in with _PyStack_AsTuple and, finally, a call can finally be made!

tp_call, which matches type.__call__ takes over and finally creates the list object. It calls the lists __new__ which corresponds to PyType_GenericNew and allocates memory for it with PyType_GenericAlloc: This is actually the part where it catches up with PyList_New, finally. All the previous are necessary to handle objects in a generic fashion.

In the end, type_call calls list.__init__ and initializes the list with any available arguments, then we go on a returning back the way we came. :-)

Finally, remmeber the LOAD_NAME, that's another guy that contributes here.


It's easy to see that, when dealing with our input, Python generally has to jump through hoops in order to actually find out the appropriate C function to do the job. It doesn't have the curtesy of immediately calling it because it's dynamic, someone might mask list (and boy do many people do) and another path must be taken.

This is where list() loses much: The exploring Python needs to do to find out what the heck it should do.

Literal syntax, on the other hand, means exactly one thing; it cannot be changed and always behaves in a pre-determined way.

Footnote: All function names are subject to change from one release to the other. The point still stands and most likely will stand in any future versions, it's the dynamic look-up that slows things down.


Why is [] faster than list()?

The biggest reason is that Python treats list() just like a user-defined function, which means you can intercept it by aliasing something else to list and do something different (like use your own subclassed list or perhaps a deque).

It immediately creates a new instance of a builtin list with [].

My explanation seeks to give you the intuition for this.

Explanation

[] is commonly known as literal syntax.

In the grammar, this is referred to as a "list display". From the docs:

A list display is a possibly empty series of expressions enclosed in square brackets:

list_display ::=  "[" [starred_list | comprehension] "]"

A list display yields a new list object, the contents being specified by either a list of expressions or a comprehension. When a comma-separated list of expressions is supplied, its elements are evaluated from left to right and placed into the list object in that order. When a comprehension is supplied, the list is constructed from the elements resulting from the comprehension.

In short, this means that a builtin object of type list is created.

There is no circumventing this - which means Python can do it as quickly as it may.

On the other hand, list() can be intercepted from creating a builtin list using the builtin list constructor.

For example, say we want our lists to be created noisily:

class List(list):
    def __init__(self, iterable=None):
        if iterable is None:
            super().__init__()
        else:
            super().__init__(iterable)
        print('List initialized.')

We could then intercept the name list on the module level global scope, and then when we create a list, we actually create our subtyped list:

>>> list = List
>>> a_list = list()
List initialized.
>>> type(a_list)
<class '__main__.List'>

Similarly we could remove it from the global namespace

del list

and put it in the builtin namespace:

import builtins
builtins.list = List

And now:

>>> list_0 = list()
List initialized.
>>> type(list_0)
<class '__main__.List'>

And note that the list display creates a list unconditionally:

>>> list_1 = []
>>> type(list_1)
<class 'list'>

We probably only do this temporarily, so lets undo our changes - first remove the new List object from the builtins:

>>> del builtins.list
>>> builtins.list
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: module 'builtins' has no attribute 'list'
>>> list()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'list' is not defined

Oh, no, we lost track of the original.

Not to worry, we can still get list - it's the type of a list literal:

>>> builtins.list = type([])
>>> list()
[]

So...

Why is [] faster than list()?

As we've seen - we can overwrite list - but we can't intercept the creation of the literal type. When we use list we have to do the lookups to see if anything is there.

Then we have to call whatever callable we have looked up. From the grammar:

A call calls a callable object (e.g., a function) with a possibly empty series of arguments:

call                 ::=  primary "(" [argument_list [","] | comprehension] ")"

We can see that it does the same thing for any name, not just list:

>>> import dis
>>> dis.dis('list()')
  1           0 LOAD_NAME                0 (list)
              2 CALL_FUNCTION            0
              4 RETURN_VALUE
>>> dis.dis('doesnotexist()')
  1           0 LOAD_NAME                0 (doesnotexist)
              2 CALL_FUNCTION            0
              4 RETURN_VALUE

For [] there is no function call at the Python bytecode level:

>>> dis.dis('[]')
  1           0 BUILD_LIST               0
              2 RETURN_VALUE

It simply goes straight to building the list without any lookups or calls at the bytecode level.

Conclusion

We have demonstrated that list can be intercepted with user code using the scoping rules, and that list() looks for a callable and then calls it.

Whereas [] is a list display, or a literal, and thus avoids the name lookup and function call.

참고URL : https://stackoverflow.com/questions/30216000/why-is-faster-than-list

반응형