The right way to loop in Python (code included)
What is the fastest and most memory-efficient way to loop in Python? We found that numpy is the fastest and the Python built-ins are the most memory efficient.
Introduction
Since Python by itself is slow, it becomes important to know the nitty-gritty of the different components of our code in order to code efficiently. In this post, we will look at the most common ways to loop in Python using a simple summing example. We will also compute the memory profile of each approach to see which one is the most memory efficient for analyzing huge datasets.
The while loop
import timeit
import numpy as np

nval = 1000000

# usual while loop
def while_loop(n=nval):
    i, sumval = 0, 0
    while i < n:
        sumval += 1
        i += 1
    return sumval

if __name__ == "__main__":
    print(
        f"while_loop: {timeit.timeit(while_loop, number=10):.6f}s")
This returns while_loop: 0.727578s. We can also profile the memory usage of this function.
import timeit
import numpy as np
from memory_profiler import profile

nval = 1000000

# usual while loop
@profile(precision=4)
def while_loop(n=nval):
    i, sumval = 0, 0
    while i < n:
        sumval += 1
        i += 1
    return sumval

if __name__ == "__main__":
    while_loop()
This returns:
Line #    Mem usage    Increment  Occurences   Line Contents
============================================================
    10  25.8984 MiB  25.8984 MiB           1   @profile(precision=4)
    11                                         def while_loop(n=nval):
    12  25.8984 MiB   0.0000 MiB           1       i, sumval = 0, 0
    13  25.9727 MiB   0.0000 MiB     1000001       while i < n:
    14  25.9727 MiB   0.0625 MiB     1000000           sumval += 1
    15  25.9727 MiB   0.0117 MiB     1000000           i += 1
    16
    17  25.9727 MiB   0.0000 MiB           1       return sumval
In total, the while loop used 0.0743 MiB of memory for the above task.
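Note that the while loop above adds 1 on each pass, while the for loop and the other examples in this post add the loop index. Below is a minimal sketch of a while loop that sums the index instead, assuming the intent is to compute the same sum of 0 to n-1 in every section. The name while_loop_sum is hypothetical and not part of the original benchmark, so the timings reported above do not apply to it.

```python
def while_loop_sum(n=1_000_000):
    """Sum the integers 0..n-1 with a while loop (hypothetical variant)."""
    i, sumval = 0, 0
    while i < n:
        sumval += i  # add the index, so the result matches sum(range(n))
        i += 1
    return sumval


if __name__ == "__main__":
    print(while_loop_sum(10))  # adds 0 + 1 + ... + 9
```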
The for loop
import timeit
import numpy as np

nval = 1000000

# usual for loop
def for_loop(n=nval):
    sumval = 0
    for i in range(n):
        sumval += i
    return sumval

if __name__ == "__main__":
    print(
        f"for_loop: {timeit.timeit(for_loop, number=10):.6f}s")
This returns for_loop: 0.490051s. Now, we profile the memory usage of this function.
Line #    Mem usage    Increment  Occurences   Line Contents
============================================================
    22  25.9922 MiB  25.9922 MiB           1   @profile(precision=4)
    23                                         def for_loop(n=nval):
    24  25.9922 MiB   0.0000 MiB           1       sumval = 0
    25  26.0273 MiB   0.0117 MiB     1000001       for i in range(n):
    26  26.0273 MiB   0.0234 MiB     1000000           sumval += i
    27  26.0273 MiB   0.0000 MiB           1       return sumval
In total, the for loop used 0.0351 MiB of memory for the above task.
The built-in Python function
import timeit
import numpy as np

nval = 1000000

# using built in sum
def builtinsum(n=nval):
    return sum(range(n))

if __name__ == "__main__":
    print(
        f"builtinsum: {timeit.timeit(builtinsum, number=10):.6f}s")
This returns builtinsum: 0.175238s.
Line #    Mem usage    Increment  Occurences   Line Contents
============================================================
    46  25.8867 MiB  25.8867 MiB           1   @profile(precision=4)
    47                                         def builtinsum(n=nval):
    48  25.8906 MiB   0.0039 MiB           1       return sum(range(n))
In total, the function based on the built-in sum used 0.0039 MiB of memory for the above task.
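The small footprint comes from range being lazy: it produces values on demand rather than storing them. Here is a rough sketch of the difference using sys.getsizeof, assuming CPython. The helper name container_sizes is hypothetical, and getsizeof reports only the container itself, not the integer objects a list refers to.

```python
import sys


def container_sizes(n):
    """Return (bytes used by range(n), bytes used by list(range(n))), containers only."""
    lazy = range(n)            # stores only start, stop and step
    materialized = list(lazy)  # stores one reference per element
    return sys.getsizeof(lazy), sys.getsizeof(materialized)


if __name__ == "__main__":
    for n in (10, 1_000_000):
        range_size, list_size = container_sizes(n)
        print(f"n={n}: range={range_size} B, list={list_size} B")
```

The range object should report the same size for both values of n, while the list grows with n.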
The numpy function
import timeit
import numpy as np

nval = 1000000

# using numpy sum
def numpysum(n=nval):
    return np.sum(np.arange(n))

if __name__ == "__main__":
    print(
        f"numpysum: {timeit.timeit(numpysum, number=10):.6f}s")
This returns numpysum: 0.017640s.
Line #    Mem usage    Increment  Occurences   Line Contents
============================================================
    53  25.9766 MiB  25.9766 MiB           1   @profile(precision=4)
    54                                         def numpysum(n=nval):
    55  33.6172 MiB   7.6406 MiB           1       return np.sum(np.arange(n))
In total, the numpy based function used 7.6406 MiB of memory for the above task.
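Most of that increment is the temporary array created by np.arange. Here is a back-of-the-envelope check, assuming 64-bit integers. The dtype is set explicitly because the default integer size can differ between platforms, and the helper name arange_mib is hypothetical.

```python
import numpy as np


def arange_mib(n, dtype=np.int64):
    """Size in MiB of the array that np.arange(n, dtype=dtype) allocates."""
    arr = np.arange(n, dtype=dtype)
    return arr.nbytes / 2**20  # convert bytes to MiB


if __name__ == "__main__":
    print(f"{arange_mib(1_000_000):.4f} MiB")
```

For one million 8-byte integers this comes to about 7.63 MiB, close to the increment reported by the profiler.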
Conclusions
Please note that the run times and memory usage will differ from system to system, but the relative ranking of the methods should stay similar.
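One way to make timings less sensitive to background load is to repeat the measurement and keep the fastest trial. The sketch below uses timeit.repeat from the standard library. The helper name best_time and the repeat counts are illustrative choices, not the setup used for the numbers in this post.

```python
import timeit


def best_time(func, number=10, repeat=5):
    """Call `func` `number` times per trial for `repeat` trials; return the fastest trial in seconds."""
    return min(timeit.repeat(func, number=number, repeat=repeat))


def builtinsum(n=100_000):
    return sum(range(n))


if __name__ == "__main__":
    print(f"builtinsum: {best_time(builtinsum):.6f}s")
```

Taking the minimum rather than the mean tends to filter out trials that were slowed down by other processes.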
We found that numpy is the fastest (0.017640 s) and the while loop is the slowest (0.727578 s). The while loop is slow because every step (the comparison, the addition and the counter increment) is executed by the Python interpreter. Since numpy is written in C, its loop runs in compiled code and is much faster.
In terms of memory usage, numpy is the worst: it used about 7.6 MiB, because np.arange builds the full array of one million integers before summing it. In contrast, the function based on the built-in sum is the most memory efficient, as range does not store all the values in memory but generates them one at a time.
Comparing the while and for loops, the for loop is faster and also more memory efficient. Hence, the for loop should be our first choice (and usually is) unless we do not know the number of iterations in advance.
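A while loop remains the natural choice when the number of iterations is not known beforehand. As an illustration, this sketch keeps adding terms of a geometric series until the next term falls below a tolerance. The function and its stopping rule are hypothetical and not part of the benchmark above.

```python
def geometric_sum(ratio=0.5, tol=1e-9):
    """Sum 1 + ratio + ratio**2 + ... until the next term drops below `tol`.

    Assumes 0 <= ratio < 1 so that the loop terminates.
    """
    total, term, steps = 0.0, 1.0, 0
    while term >= tol:  # the iteration count depends on ratio and tol
        total += term
        term *= ratio
        steps += 1
    return total, steps


if __name__ == "__main__":
    total, steps = geometric_sum()
    print(f"sum={total:.6f} after {steps} terms")
```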