Python list with initial size

Python - Create a list with initial capacity

Code like this comes up often:

l = []
while foo:
    # baz
    l.append(bar)
    # qux

This is really slow if you're about to append thousands of elements to your list, as the list will have to be constantly resized to fit the new elements.

In Java, you can create an ArrayList with an initial capacity. If you have some idea how big your list will be, this will be a lot more efficient.

I understand that code like this can often be refactored into a list comprehension. If the for/while loop is very complicated, though, that's infeasible. Is there any equivalent for us Python programmers?

python list dictionary initialization
bhargav
3 Years ago

Answers 9

  • Sunil Patel
    3 Years ago

    For some applications, a dictionary may be what you are looking for. For example, in the find_totients function below, I found it more convenient to use a dictionary since I didn't have a zero index.

    import math

    def totient(n):
        totient = 0
        if n == 1:
            totient = 1
        else:
            # Count the integers in [1, n) that are coprime to n.
            for i in range(1, n):
                if math.gcd(i, n) == 1:
                    totient += 1
        return totient

    def find_totients(max):
        totients = dict()
        for i in range(1, max + 1):
            totients[i] = totient(i)
        print('Totients:')
        for i in range(1, max + 1):
            print(i, totients[i])

    This problem could also be solved with a preallocated list:

    def find_totients(max):
        totients = [None] * (max + 1)
        for i in range(1, max + 1):
            totients[i] = totient(i)
        print('Totients:')
        for i in range(1, max + 1):
            print(i, totients[i])

    I feel this is less elegant and more prone to bugs, because I'm storing None values, which could throw an exception if I accidentally use one the wrong way, and because I need to think about edge cases that the dictionary lets me avoid.

    It's true the dictionary won't be as efficient, but as others have commented, small differences in speed are not always worth significant maintenance hazards.
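
    To make the hazard concrete, here is a small illustrative sketch (the variable names are made up for this example): an unfilled, preallocated slot silently yields None, while a missing dictionary key fails immediately, at the point of misuse.

    totients_list = [None] * 10      # slot 3 is never filled
    value = totients_list[3]         # no error here; the None leaks onward
    # value + 1                      # would fail later, far from the real bug

    totients_dict = {}
    # totients_dict[3]               # KeyError right away, where the bug is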

  • Rakesh
    3 Years ago

    From what I understand, Python lists are already quite similar to ArrayLists. But if you want to tweak those parameters, I found this post on the net that may be interesting (basically, just create your own ScalableList extension):

    https://mail.python.org/pipermail/python-list/2000-May/035082.html
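
    For illustration, here is a hypothetical sketch of what such an extension might look like; the class name, growth rule, and push method are assumptions for this example, not taken from the linked post.

    class ScalableList(list):
        """A list that keeps slack space at the end so growth rarely resizes."""

        def __init__(self, capacity=0):
            super().__init__([None] * capacity)   # pre-allocate None slots
            self._used = 0

        def push(self, item):
            if self._used == len(self):
                # Grow in large chunks so reallocation happens rarely.
                self.extend([None] * max(16, len(self)))
            self[self._used] = item
            self._used += 1

    In practice, CPython already over-allocates internally (see the answer on list internals below), so a wrapper like this mostly pays off when you know the final size up front.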

  • sahil Kothiya
    3 Years ago

    Pre-allocation concerns really arise in Python if you're working with numpy, which has more C-like arrays. With numpy, pre-allocation is about choosing the shape of the data and the default fill value.

    Consider numpy if you're doing numerical computation on massive lists and want performance.
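
    As a minimal sketch (the size, dtype, and fill values here are illustrative), pre-allocation with numpy means fixing the shape and default value up front and then filling in place:

    import numpy as np

    size = 10000
    values = np.zeros(size, dtype=np.float64)  # shape and default value fixed up front
    for i in range(size):
        values[i] = i * 0.5                    # assign in place; no resizing happens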

  • sahil Kothiya
    3 Years ago

    I ran @S. Lott's code and saw the same 10% performance increase from pre-allocating. I also tried @Jeremy's idea of using a generator, and the generator performed better than doAllocate. For my project the 10% improvement matters, so thanks to everyone; this helps a bunch.

    import time
    from functools import wraps

    # print_timing is not shown in the original post; this is a minimal
    # reconstruction so that the benchmark runs standalone.
    def print_timing(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            start = time.time()
            result = func(*args, **kwargs)
            print('%s took %0.3fms' % (func.__name__, (time.time() - start) * 1000.0))
            return result
        return wrapper

    def doAppend(size=10000):
        result = []
        for i in range(size):
            message = "some unique object %d" % (i,)
            result.append(message)
        return result

    def doAllocate(size=10000):
        result = size * [None]
        for i in range(size):
            message = "some unique object %d" % (i,)
            result[i] = message
        return result

    def doGen(size=10000):
        # range replaces Python 2's xrange from the original post
        return list("some unique object %d" % (i,) for i in range(size))

    size = 1000

    @print_timing
    def testAppend():
        for i in range(size):
            doAppend()

    @print_timing
    def testAlloc():
        for i in range(size):
            doAllocate()

    @print_timing
    def testGen():
        for i in range(size):
            doGen()

    testAppend()
    testAlloc()
    testGen()

    Output:

    testAppend took 14440.000ms
    testAlloc took 13580.000ms
    testGen took 13430.000ms
  • sahil Kothiya
    3 Years ago

    As others have mentioned, the simplest way to pre-size a list is to seed it with NoneType objects.
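
    For example, a minimal sketch (the size here is arbitrary):

    l = [None] * 1000   # len(l) == 1000; assign by index instead of appending
    l[42] = "value"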

    That being said, you should understand the way Python lists actually work before deciding this is necessary. In the CPython implementation of a list, the underlying array is always created with overhead room, in progressively larger sizes (4, 8, 16, 25, 35, 46, 58, 72, 88, 106, 126, 148, 173, 201, 233, 269, 309, 354, 405, 462, 526, 598, 679, 771, 874, 990, 1120, and so on), so that resizing the list does not happen nearly so often.

    Because of this behavior, most list.append() operations are O(1), with increased cost only when crossing one of these boundaries, at which point the append is O(n). This behavior is what leads to the minimal increase in execution time in S. Lott's answer.

    Source: https://www.laurentluce.com/posts/python-list-implementation/
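
    You can observe this over-allocation directly with sys.getsizeof; a small sketch (the loop bound is arbitrary):

    import sys

    l = []
    last = sys.getsizeof(l)
    for i in range(64):
        l.append(i)
        size = sys.getsizeof(l)
        if size != last:
            # The reported size only jumps at certain lengths,
            # reflecting the over-allocation described above.
            print('len %2d -> %4d bytes' % (len(l), size))
            last = size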
