
Using a profiler
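
Measurements like the ones discussed below can be made with Python's standard-library tools. As a minimal sketch (script.py is a placeholder name, not the course code):

    # Time: per-function call counts and timings, sorted by cumulative time:
    #     python -m cProfile -s cumulative script.py

    # Memory: peak allocation while building a list of random numbers.
    import random
    import tracemalloc

    tracemalloc.start()
    data = [random.random() for i in range(1000)]
    current, peak = tracemalloc.get_traced_memory()
    print(peak / 1024 / 1024, "MiB peak")
    tracemalloc.stop()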


Here's my best guess. The overall program doesn't really seem to use much memory, and there's little increase between size == 100 and size == 1000. Not creating the random numbers seems to help. However, the range numbers don't seem to be garbage collected very well. If 1000 numbers = 0.01171875MiB (above), 10,000 loops of 10,000 numbers will be 1171.875MiB, or ~1.14GiB. This is getting dangerously close to the amount of memory Windows is likely to allocate to a program on an old 32-bit/x86 machine like mine (details), and it might be worth running it on a newer machine or, better still, a Linux machine, which has less restrictive memory limits. In actual fact, this is a worst-case memory prediction, as we'd expect some variables to be garbage collected as the system sees memory being used up, so I'd probably risk it.

Looking at the speed: most of the elements that take any time (forming the random numbers, for example) seem likely to scale at best linearly with size (we don't, for example, know whether append needs more time as the amount of material added increases, but let's assume it scales linearly too). We might, therefore, think that, under the worst case, the same program, which runs in 0.049s for 100 loops of 100 values, might take 10,000 times as long (100 times the loops × 100 times the values): 490s, or ~8 minutes, for 10,000 loops of 10,000 values.
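
As a quick sanity check, the scaling arithmetic can be run in a couple of lines (a sketch; the 0.01171875MiB figure is the profiled value quoted above):

    per_thousand_mib = 0.01171875   # measured memory for 1000 numbers (MiB)
    numbers = 10000 * 10000         # 10,000 loops of 10,000 numbers
    total_mib = per_thousand_mib * numbers / 1000
    print(total_mib)                # 1171.875 MiB
    print(total_mib / 1024)         # ~1.14 GiB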

However, this "worst case" is actually only the pessimistic end of the most optimistic prediction. As it happens, the first 1000 loops took about 10 minutes on my machine, at which point I killed it off, so the estimate is ~one order of magnitude out. This highlights an important point: while memory estimates are likely to be approximately right (indeed, slightly pessimistic in this case, as the system will probably do some garbage collection of dead variables), timing needs more than one known value to estimate, as it can scale very badly. In general you need several tests to estimate timing limits even at low memory uses, and when, as here, the memory requirements approach the limits allocated to Python by the operating system, the time requirements are likely to become unpredictable. The best you can do with timing is use small runs to optimise the code, and then slowly scale up until you have an idea of what is impractical.
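
For example, a scaling check might look something like this (a minimal sketch: run_model is a stand-in workload, not the course code):

    import random
    import timeit

    def run_model(size):
        # Stand-in workload: "size" loops, each building "size" random numbers.
        data = []
        for i in range(size):
            data.append([random.random() for j in range(size)])
        return data

    # Time a few small sizes; if doubling size more than doubles the time,
    # the scaling is worse than linear and big runs will overshoot any
    # linear estimate.
    for size in (100, 200, 400, 800):
        print(size, timeit.timeit(lambda: run_model(size), number=1))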


In general, all code goes through one or more cycles of "refactoring", where you take code that runs, optimise it, and clean up stylistic elements so it is readable and well documented. The kinds of things we might do to speed up code during refactoring include:

  • Removing extraneous variables, for example, turning:
    data_row = fill_row()
    data.append(data_row)

    into:
    data.append(fill_row())
  • Removing function calls, where this can be done without making the code more confusing and long-winded. We'd especially remove them from loops: each call carries overhead, and we don't want to pay it thousands of times (see the sketch after this list).
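
For instance, inlining a function called inside a loop might look like this (a sketch: the body of fill_row is assumed here, as the original isn't shown):

    import random

    size = 100

    def fill_row(n):
        return [random.random() for j in range(n)]

    # With the call: one function call per loop pass.
    data = []
    for i in range(size):
        data.append(fill_row(size))

    # Inlined: the same work without the per-pass call overhead.
    data = []
    for i in range(size):
        data.append([random.random() for j in range(size)])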

So, truth be told, our code would be better as: test3.py. This runs on the same machine as the previous code in 0.012s rather than 0.049s for size == 100, which gives some idea of the overhead of function calls. We'll point out other ways of being more efficient, both with time and memory, during the course.