
Convert two lists into a dictionary

lovepro 2020. 9. 27. 13:29



Imagine that you have:

keys = ['name', 'age', 'food']
values = ['Monty', 42, 'spam']

What is the simplest way to produce the following dictionary?

a_dict = {'name' : 'Monty', 'age' : 42, 'food' : 'spam'}

Like this:

>>> keys = ['a', 'b', 'c']
>>> values = [1, 2, 3]
>>> dictionary = dict(zip(keys, values))
>>> print(dictionary)
{'a': 1, 'b': 2, 'c': 3}

The dict constructor paired with the zip function is very useful: https://docs.python.org/3/library/functions.html#func-dict


Try this:

>>> import itertools
>>> keys = ('name', 'age', 'food')
>>> values = ('Monty', 42, 'spam')
>>> adict = dict(itertools.izip(keys,values))
>>> adict
{'food': 'spam', 'age': 42, 'name': 'Monty'}

In Python 2, izip is more economical in memory consumption than zip.


Imagine that you have:

keys = ('name', 'age', 'food')
values = ('Monty', 42, 'spam')

What is the simplest way to produce the following dictionary?

a_dict = {'name': 'Monty', 'age': 42, 'food': 'spam'}

Best performance in Python 2.7 and 3: the dict comprehension

A possible improvement over using the dict constructor is to use the native syntax of a dict comprehension (not a list comprehension, as others have mistakenly written):

new_dict = {k: v for k, v in zip(keys, values)}

In Python 2, zip returns a list. To avoid creating an unnecessary list, use izip instead (aliasing it to zip can reduce code changes when you move to Python 3):

from itertools import izip as zip

So that is still:

new_dict = {k: v for k, v in zip(keys, values)}
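
A common way to make that aliasing work under both Python 2.7 and Python 3 in the same file is a guarded import. This is a minimal sketch of that pattern (the try/except wrapper is my addition, not part of the original answer):

try:
    # Python 2: use the lazy izip and call it zip
    from itertools import izip as zip
except ImportError:
    # Python 3: the built-in zip is already lazy
    pass

keys = ('name', 'age', 'food')
values = ('Monty', 42, 'spam')
new_dict = {k: v for k, v in zip(keys, values)}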

Ideal for Python 2, <= 2.6

izip from itertools becomes zip in Python 3. izip is better than zip in Python 2 (because it avoids creating an unnecessary list), and it is ideal for 2.6 or below:

from itertools import izip
new_dict = dict(izip(keys, values))

Python 3

In Python 3, zip becomes the same function that was in itertools, so this is simply:

new_dict = dict(zip(keys, values))

A dict comprehension would perform better, though (see the performance review at the end of this answer).

Result in all cases:

>>> new_dict
{'age': 42, 'name': 'Monty', 'food': 'spam'}

Explanation:

Looking at the help for dict, we see that it takes a variety of forms of arguments:


>>> help(dict)

class dict(object)
 |  dict() -> new empty dictionary
 |  dict(mapping) -> new dictionary initialized from a mapping object's
 |      (key, value) pairs
 |  dict(iterable) -> new dictionary initialized as if via:
 |      d = {}
 |      for k, v in iterable:
 |          d[k] = v
 |  dict(**kwargs) -> new dictionary initialized with the name=value pairs
 |      in the keyword argument list.  For example:  dict(one=1, two=2)
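
To make those forms concrete, here is a small illustration of each constructor signature listed above (the values are my own, chosen only for the example):

>>> dict({'name': 'Monty'})                   # from a mapping
{'name': 'Monty'}
>>> dict([('age', 42), ('food', 'spam')])     # from an iterable of (key, value) pairs
{'age': 42, 'food': 'spam'}
>>> dict(name='Monty', age=42)                # from keyword arguments
{'name': 'Monty', 'age': 42}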

The optimal approach is to use an iterable while avoiding the creation of unnecessary data structures. In Python 2, zip creates an unnecessary list:

>>> zip(keys, values)
[('name', 'Monty'), ('age', 42), ('food', 'spam')]

In Python 3, the equivalent would be:

>>> list(zip(keys, values))
[('name', 'Monty'), ('age', 42), ('food', 'spam')]

Python 3's zip merely creates an iterable object:

>>> zip(keys, values)
<zip object at 0x7f0e2ad029c8>
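
One consequence worth noting (my addition, not from the original answer): a Python 3 zip object is a one-shot iterator, so it can only be consumed once. The output below is shown as on Python 3.7+, where dicts keep insertion order:

>>> pairs = zip(keys, values)
>>> dict(pairs)
{'name': 'Monty', 'age': 42, 'food': 'spam'}
>>> dict(pairs)   # the iterator is already exhausted
{}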

Since we want to avoid creating unnecessary data structures, we usually want to avoid Python 2's zip (since it creates an unnecessary list).

Less performant alternatives:

This is a generator expression being passed to the dict constructor:

generator_expression = ((k, v) for k, v in zip(keys, values))
dict(generator_expression)

Or, equivalently:

dict((k, v) for k, v in zip(keys, values))

And this is a list comprehension being passed to the dict constructor:

dict([(k, v) for k, v in zip(keys, values)])

In the first two cases, an extra layer of non-operative (thus unnecessary) computation is placed over the zip iterable, and in the case of the list comprehension, an extra list is unnecessarily created. I would expect all of them to be less performant, and certainly not more so.

Performance review:

In 64 bit Python 3.7.3, on Ubuntu 18.04, ordered from fastest to slowest:

>>> min(timeit.repeat(lambda: dict(zip(keys, values))))
0.4772876740025822
>>> min(timeit.repeat(lambda: {k: v for k, v in zip(keys, values)}))
0.5217149950040039
>>> min(timeit.repeat(lambda: {keys[i]: values[i] for i in range(len(keys))}))
0.6797661719901953
>>> min(timeit.repeat(lambda: dict([(k, v) for k, v in zip(keys, values)])))
0.7864680950006004
>>> min(timeit.repeat(lambda: dict((k, v) for k, v in zip(keys, values))))
0.8561034000013024
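
For anyone who wants to reproduce these numbers, here is a minimal, self-contained sketch of the setup; the three-element keys/values tuples are my assumption about the test data, and the absolute timings will vary by machine:

import timeit

keys = ('name', 'age', 'food')
values = ('Monty', 42, 'spam')

candidates = {
    'dict(zip(...))': lambda: dict(zip(keys, values)),
    'dict comprehension': lambda: {k: v for k, v in zip(keys, values)},
    'index comprehension': lambda: {keys[i]: values[i] for i in range(len(keys))},
    'dict(list comprehension)': lambda: dict([(k, v) for k, v in zip(keys, values)]),
    'dict(generator expression)': lambda: dict((k, v) for k, v in zip(keys, values)),
}

# timeit.repeat runs each candidate several times; min() keeps the best run
for name, fn in candidates.items():
    print(name, min(timeit.repeat(fn)))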

A commenter said:

min seems like a bad way to compare performance. Surely mean and/or max would be much more useful indicators for real usage.

We use min because these algorithms are deterministic. We want to know the performance of the algorithms under the best conditions possible.

If the operating system hangs for any reason, it has nothing to do with what we're trying to compare, so we need to exclude those kinds of results from our analysis.

If we used mean, those kinds of events would skew our results greatly, and if we used max we will only get the most extreme result - the one most likely affected by such an event.

A commenter also says:

In python 3.6.8, using mean values, the dict comprehension is indeed still faster, by about 30% for these small lists. For larger lists (10k random numbers), the dict call is about 10% faster.

I presume we mean dict(zip(... with 10k random numbers. That does sound like a fairly unusual use case. It does make sense that the most direct calls would dominate in large datasets, and I wouldn't be surprised if OS hangs dominate given how long it would take to run that test, further skewing your numbers. And if you use mean or max, I would consider your results meaningless.

Let's use a more realistic size on our top examples:

import numpy
import timeit
l1 = list(numpy.random.random(100))
l2 = list(numpy.random.random(100))

And we see here that dict(zip(... does indeed run faster for larger datasets by about 20%.

>>> min(timeit.repeat(lambda: {k: v for k, v in zip(l1, l2)}))
9.698965263989521
>>> min(timeit.repeat(lambda: dict(zip(l1, l2))))
7.9965161079890095

>>> keys = ('name', 'age', 'food')
>>> values = ('Monty', 42, 'spam')
>>> dict(zip(keys, values))
{'food': 'spam', 'age': 42, 'name': 'Monty'}

You can also use dictionary comprehensions in Python ≥ 2.7:

>>> keys = ('name', 'age', 'food')
>>> values = ('Monty', 42, 'spam')
>>> {k: v for k, v in zip(keys, values)}
{'food': 'spam', 'age': 42, 'name': 'Monty'}

A more natural way is to use a dictionary comprehension:

keys = ('name', 'age', 'food')
values = ('Monty', 42, 'spam')    
a_dict = {keys[i]: values[i] for i in range(len(keys))}

If you need to transform keys or values before creating a dictionary then a generator expression could be used. Example:

>>> adict = dict((str(k), v) for k, v in zip(['a', 1, 'b'], [2, 'c', 3])) 
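
As a point of comparison (my addition), the same key transformation can be written as a dict comprehension, which reads a little more directly:

>>> {str(k): v for k, v in zip(['a', 1, 'b'], [2, 'c', 3])}
{'a': 2, '1': 'c', 'b': 3}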

Take a look at Code Like a Pythonista: Idiomatic Python.


With Python 3.x, go for dict comprehensions:

keys = ('name', 'age', 'food')
values = ('Monty', 42, 'spam')

dic = {k:v for k,v in zip(keys, values)}

print(dic)

More on dict comprehensions here; an example:

>>> print({i: chr(65+i) for i in range(4)})
{0: 'A', 1: 'B', 2: 'C', 3: 'D'}

For those who need simple code and aren’t familiar with zip:

List1 = ['This', 'is', 'a', 'list']
List2 = ['Put', 'this', 'into', 'dictionary']

This can be done by one line of code:

d = {List1[n]: List2[n] for n in range(len(List1))}


The best solution is still:

In [92]: keys = ('name', 'age', 'food')
...: values = ('Monty', 42, 'spam')
...: 

In [93]: dt = dict(zip(keys, values))
In [94]: dt
Out[94]: {'age': 42, 'food': 'spam', 'name': 'Monty'}

Transpose it:

    lst = [('name', 'Monty'), ('age', 42), ('food', 'spam')]
    keys, values = zip(*lst)
    In [101]: keys
    Out[101]: ('name', 'age', 'food')
    In [102]: values
    Out[102]: ('Monty', 42, 'spam')
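
The same unpacking trick works in the other direction too; this is a small sketch of my own, relying on the insertion-ordered dicts of Python 3.7+:

dt = {'name': 'Monty', 'age': 42, 'food': 'spam'}
keys, values = zip(*dt.items())
# keys   -> ('name', 'age', 'food')
# values -> ('Monty', 42, 'spam')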

I ran into this while trying to solve a graph-related problem: I needed to define an empty adjacency list and wanted to initialize every node with an empty list. I wondered whether using a zip operation would be worth it compared with simply assigning key-value pairs one at a time, since runtime often matters, so I timed both approaches with timeit.

import timeit
def dictionary_creation(n_nodes):
    dummy_dict = dict()
    for node in range(n_nodes):
        dummy_dict[node] = []
    return dummy_dict


def dictionary_creation_1(n_nodes):
    keys = list(range(n_nodes))
    values = [[] for i in range(n_nodes)]
    graph = dict(zip(keys, values))
    return graph


def wrapper(func, *args, **kwargs):
    def wrapped():
        return func(*args, **kwargs)
    return wrapped

n_nodes = 10_000_000

iteration = wrapper(dictionary_creation, n_nodes)
shorthand = wrapper(dictionary_creation_1, n_nodes)

for trials in range(1, 8):
    print(f'Iteration: {timeit.timeit(iteration, number=trials)}\nShorthand: {timeit.timeit(shorthand, number=trials)}')

For n_nodes = 10,000,000 I get,

Iteration: 2.825081646999024 Shorthand: 3.535717916001886

Iteration: 5.051560923002398 Shorthand: 6.255070794999483

Iteration: 6.52859034499852 Shorthand: 8.221581164998497

Iteration: 8.683652416999394 Shorthand: 12.599181543999293

Iteration: 11.587241565001023 Shorthand: 15.27298851100204

Iteration: 14.816342867001367 Shorthand: 17.162912737003353

Iteration: 16.645022411001264 Shorthand: 19.976680120998935

You can clearly see that, after a certain point, the time taken by the iteration approach at the n-th step overtakes the time taken by the shorthand approach at the (n-1)-th step.


You can use the code below:

dict(zip(['name', 'age', 'food'], ['Monty', 42, 'spam']))

But make sure that the lists are the same length; if they are not, zip truncates the longer one.
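
If you would rather keep every key instead of silently dropping the extras, itertools.zip_longest can pad the shorter list with a fill value. A minimal sketch (the extra 'color' key and the None fill value are my own choices):

from itertools import zip_longest  # izip_longest in Python 2

keys = ['name', 'age', 'food', 'color']
values = ['Monty', 42, 'spam']
print(dict(zip_longest(keys, values, fillvalue=None)))
# {'name': 'Monty', 'age': 42, 'food': 'spam', 'color': None}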


Here is also an example of adding list values to your dictionary:

list1 = ["Name", "Surname", "Age"]
list2 = [["Cyd", "JEDD", "JESS"], ["DEY", "AUDIJE", "PONGARON"], [21, 32, 47]]
dic = dict(zip(list1, list2))
print(dic)

Always make sure that your keys (list1) are passed as the first parameter:

{'Name': ['Cyd', 'JEDD', 'JESS'], 'Surname': ['DEY', 'AUDIJE', 'PONGARON'], 'Age': [21, 32, 47]}
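
If you later want to turn such a column-oriented dictionary into one record per person, zip can be applied a second time. This is only an illustrative sketch of my own, and it relies on the insertion-ordered dicts of Python 3.7+:

records = [dict(zip(dic, row)) for row in zip(*dic.values())]
# [{'Name': 'Cyd', 'Surname': 'DEY', 'Age': 21},
#  {'Name': 'JEDD', 'Surname': 'AUDIJE', 'Age': 32},
#  {'Name': 'JESS', 'Surname': 'PONGARON', 'Age': 47}]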

A method without the zip function (note that it consumes l2 as it goes):

l1 = [1,2,3,4,5]
l2 = ['a','b','c','d','e']
d1 = {}
for l1_ in l1:
    for l2_ in l2:
        d1[l1_] = l2_
        l2.remove(l2_)
        break  

print (d1)


{1: 'a', 2: 'b', 3: 'c', 4: 'd', 5: 'e'}
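
A more direct way to avoid zip (my own sketch) is to index both lists in a single loop, which does not mutate the inputs:

l1 = [1, 2, 3, 4, 5]
l2 = ['a', 'b', 'c', 'd', 'e']
d1 = {}
for i in range(len(l1)):
    d1[l1[i]] = l2[i]

print(d1)   # {1: 'a', 2: 'b', 3: 'c', 4: 'd', 5: 'e'}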

Reference URL: https://stackoverflow.com/questions/209840/convert-two-lists-into-a-dictionary
