Python list efficiency

  • Products
    • Retrace APM, logs, errors, and monitoring for QA and Prod
    • Prefix Free Lightweight web request profiler for workstations
    • Pricing
  • Company
    • Blog
    • Support
    • About
    • Contact
    • Case Studies
    • Careers
    • GDPR
  • Login
  • Free Retrace Trial
Type your search query and hit enter:

Stackify

Type your search query and hit enter:
Categories: Developer Tips, Tricks & Resources

Python Performance Tuning: 20 Simple Tips

Python is a powerful and versatile higher-order programming language. Whether youre developing a web application or working with machine learning, this language has you covered. Python does well at optimizing developer productivity. You can quickly create a program that solves a business problem or fills a practical need. However, the solutions you reach when developing quickly arent always optimized for python performance.

When youre trying to shave secondsor even minutesfrom execution time, its good to get a reminder of strategies that might help. The strategies on this list can help you make your applications as fast as possible.

Some of the things on this list might be obvious to you, but others may be less so. Some will have a big impact on execution, and others will have smaller, more subtle effects. Check out this list, and consider bookmarking this page for future reference.

Python list efficiency
Python list efficiency

1. Use list comprehensions.

When youre working in Python, loops are common. Youve probably come across list comprehensions before. Theyre a concise and speedy way to create new lists.

For example, lets say you wanted to find the cubes of all the odd numbers in a given range. Using a for loop, that task might look like this:

cube_numbers = [] for n in range(0,10): if n % 2 == 1: cube_numbers.append(n**3)

In contrast, a list comprehension approach would just be one line:

cube_numbers = [n**3 for n in range(1,10) if n%2 == 1]

The list comprehension approach is shorter and more concise, of course. More important, its notably faster when running in code. As with all these tips, in small code bases that have small ranges, using this approach may not make much of a difference. But in other situations, it may make all the difference when youre trying to save some time.

2. Remember the built-In functions.

Python comes with a lot of batteries included. You can writehigh-quality, efficient code, but its hard to beat the underlying libraries. These have been optimized and are tested rigorously (like your code, no doubt). Read thelistof the built-ins, and check if youre duplicating any of this functionality in your code.

3. Use xrange() instead of range().

Python 2 used the functions range() and xrange() to iterate over loops. The first of these functions stored all the numbers in the range in memory and got linearly large as the range did. The second, xrange(), returned the generator object. When looping with this object, the numbers are in memory only on demand.

import sys numbers = range(1, 1000000) print(sys.getsizeof(numbers))

This returns 8000064, whereas the same range of numbers with xrange returns 40. If your application is in Python 2, then swapping these functions can have a big impact on memory usage. The good news is that Python 3 implements the xrange() functionality by default. So, while theres no xrange() function, the range() function already acts like this.

4. Consider writing your own generator.

The previous tip hints at a general pattern for optimizationnamely, that its better to use generators where possible. These allow you to return an item at a time rather than all the items at once. As mentioned, the xrange() function is a generator in Python 2, as is the range() function in Python 3.

If youre working with lists, consider writing your own generator to take advantage of this lazy loading and memory efficiency. Generators are particularly useful when reading a large number of large files. Its possible to process single chunks without worrying about the size of the files.

Heres an example you might use when web scraping and crawling recursively.

import requests import re def get_pages(link): pages_to_visit = [] pages_to_visit.append(link) pattern = re.compile('https?') while pages_to_visit: current_page = pages_to_visit.pop(0) page = requests.get(current_page) for url in re.findall('<a href="([^"]+)">', str(page.content)): if url[0] == '/': url = current_page + url[1:] if pattern.match(url): pages_to_visit.append(url) yield current_page webpage = get_pages('http://www.example.com') for result in webpage: print(result)

This example simply returns a page at a time and performs an action of some sort. In this case, youre printing the link. Without a generator, youd need to fetch and process at the same time or gather all the links before you started processing. This code is cleaner, faster, and easier to test.

5. Use in if possible.

To check if membership of a list, its generally faster to use the in keyword.

for name in member_list: print('{} is a member'.format(name))

6. Be lazy with your module importing.

When you startedlearning Python, you probably got advice to import all the modules youre using at the start of your program. Maybe you still sort these alphabetically.

This approach makes it easier to keep track of what dependencies your program has. However, the disadvantage is that all your imports load at startup.

Why not try a different approach? You can load the modules only when you need them. This technique helps distribute the loading time for modules more evenly, which may reduce peaks of memory usage.

7. Use sets and unions.

Ive mentioned loops a few times in this list already. Most experts agree that too much looping puts unnecessary strain on your server. Its rarely the most efficient approach.

Say you wanted to get the overlapping values in two lists. You could do this using nested for loops, like this:

a = [1,2,3,4,5] b = [2,3,4,5,6] overlaps = [] for x in a: for y in b: if x==y: overlaps.append(x) print(overlaps)

This will print the list [2, 3, 4, 5]. The number of comparisons here will get very large, very quickly.

Another approach would be:

a = [1,2,3,4,5] b = [2,3,4,5,6] overlaps = set(a) & set(b) print(overlaps)

This will print the dictionary {2, 3, 4, 5}. Youre leaning on the built-in functions and getting a big speed and memory bump as a result.

8. Remember to use multiple assignment.

Python has an elegant way to assign the values of multiple variables.

first_name, last_name, city = "Kevin", "Cunningham", "Brighton"

You can use this method to swap the values of variables.

x, y = y, x

This approach is much quicker and cleaner than:

temp = x x = y y = temp

9. Avoid global variables.

Using few global variables is an effective design pattern because it helps you keep track of scope and unnecessary memory usage. Also, Python is faster retrieving a local variable than a global one. So, avoid that global keyword as much as you can.

10. Use join() to concatenate strings.

In Python, you can concatenate strings using +. However, strings in Python are immutable, and the + operation involves creating a new string and copying the old content at each step. A more efficient approach would be to use the array module to modify the individual characters and then use the join() function to re-create your final string.

new = "This" + "is" + "going" + "to" + "require" + "a" + "new" + "string" + "for" + "every" + "word" print(new)

That code will print:

Thisisgoingtorequireanewstringforeveryword

On the other hand, this code:

new = " ".join(["This", "will", "only", "create", "one", "string", "and", "we", "can", "add", "spaces."]) print(new)

will print:

This will only create one string and we can add spaces.

This is cleaner, more elegant, and faster.

11. Keep up-to-date on the latest Python releases.

The Python maintainers are passionate about continually making the language faster and more robust. In general, each new release of the language has improved python performance and security. Just be sure that the libraries you want to use are compatible with the newest version before you make the leap.

12. Use while 1 for an infinite loop.

If youre listening on a socket, then youll probably want to use an infinite loop. The normal route to achieve this is to use while True. This works, but you can achieve the same effect slightly faster by using while 1. This is a single jump operation, as it is a numerical comparison.

13. Try another way.

Once youve used a coding approach in your application, it can be easy to rely on that method again and again. However, experimenting can allow you to see which techniques are better. Not only will this keep you learning and thinking about the code you write, but it can also encourage you to be more innovative. Think about how you can creatively apply new coding techniques to get faster results in your application.

14. Exit early.

Try to leave a function as soon as you know it can do no more meaningful work. Doing this reduces the indentation of your program and makes it more readable. It also allows you to avoid nested if statements.

if positive_case: if particular_example: do_something else: raise exception

You can test the input in a few ways before carrying out your actions. Another approach is to raise the exception early and to carry out the main action in the else part of the loop.

if not positive_case: raise exception if not particular_example: raise exception do_something

Now you can see what this block of code is trying to achieve at first glance. You dont need to follow the chain of logic in the conditionals. Also, you can clearly see when this function would raise an exception.

15. Learn itertools.

Its been called agem.If you havent heard of it, then youre missing out on a great part of the Python standard library. You can use the functions in itertools to create code thats fast, memory efficient, and elegant.

Dive into the documentation, and look fortutorialsto get the most out of this library. One example is the permutations function. Lets say you wanted to generate all the permutations of [Alice, Bob, Carol].

import itertools iter = itertools.permutations(["Alice", "Bob", "Carol"]) list(iter)

This function will return all possible permutations:

[('Alice', 'Bob', 'Carol'), ('Alice', 'Carol', 'Bob'), ('Bob', 'Alice', 'Carol'), ('Bob', 'Carol', 'Alice'), ('Carol', 'Alice', 'Bob'), ('Carol', 'Bob', 'Alice')]

Its really useful and blazingly fast!

16. Try decorator caching.

Memoization is a specific type of caching that optimizes software running speeds. Basically, a cache stores the results of an operation for later use. The results could be rendered web pages or the results of complex calculations.

You can try this yourself with calculating the 100th Fibonacci number. If you havent come across these numbers, each one is the sum of the previous two numbers. Fibonacci was an Italian mathematician who discovered that these numbers cropped up in lots of places. From the number of petals on a flower to legs on insects or branches on a tree, these numbers are common in nature. The first few are 1, 1, 2, 3, 5.

One algorithm to calculate these is:

def fibonacci(n): if n == 0: # There is no 0'th number return 0 elif n == 1: # We define the first number as 1 return 1 return fibonacci(n - 1) + fibonacci(n-2)

When I used this algorithm to find the 36th Fibonacci number, fibonacci(36), my computer sounded like it was going to take off! The calculation took five seconds, and (in case youre curious) the answer was 14,930,352.

When you introduce caching from the standard library, however, things change. It takes only a few lines of code.

import functools @functools.lru_cache(maxsize=128) def fibonacci(n): if n == 0: return 0 elif n == 1: return 1 return fibonacci(n - 1) + fibonacci(n-2)

In Python, a decorator function takes another function and extends its functionality. We denote these functions with the @ symbol. In the example above, Ive used the decoratorfunctools.lru_cachefunction provided by the functools module. Ive passed the maximum number of items to store in my cache at the same time as an argument. There are other forms of decorator caching, including writing your own, but this is quick and built-in. How quick? Well, this time the calculation took 0.7 seconds, and reassuringly, the answer was the same.

17. Use keys for sorts.

If you search for some examples of sorting, a lot of the code examples you find will work but could be outdated. Often these examples create a custom sort and cost time in the setup and in performing the sort. The best way to sort items is to use keys and the default sort() method whenever possible. Ive mentioned already that the built-in functions are generally faster, and this is one of those times.

import operator my_list = [("Josh", "Grobin", "Singer"), ("Marco", "Polo", "General"), ("Ada", "Lovelace", "Scientist")] my_list.sort(key=operator.itemgetter(0)) my_list

This will sort the list by the first keys:

[('Ada', 'Lovelace', 'Scientist'), ('Josh', 'Grobin', 'Singer'), ('Marco', 'Polo', 'General')]

You can easily sort by the second key, like so:

my_list.sort(key=operator.itemgetter(1)) my_list

This will return the list below. You can see its sorted by the second names.

[('Josh', 'Grobin', 'Singer'), ('Ada', 'Lovelace', 'Scientist'), ('Marco', 'Polo', 'General')]

In each case, the list is sorted according to the index you select as part of the key argument. This approach works with numbers and strings, and its readable and fast.

18. Dont construct a set for a conditional.

Sometimes you might find yourself wanting to optimize your code with something like this:

if animal in set(animals):

This idea seems to make sense. There might be a lot of animals, and de-duplicating them feels like it might be faster.

if animal in animals:

Even though there may be significantly more animals in the list to check, the interpreter is optimized so much that applying the set function is likely to slow things down. Checking in a long list is almost always a faster operation without using the set function.

19. Use linked lists.

The Python list datatype implements as an array. That means adding an element to the start of the list is a costly operation, as every item has to be moved forward. A linked list is a datatype that may come in handy. It differs from arrays, as each item has a link to the next item in the listhence the name!

An array needs the memory for the list allocated up front. That allocation can be expensive and wasteful, especially if you dont know the size of the array in advance.

A linked list lets you allocate the memory when you need it. Each item can be stored in different parts of memory, and the links join the items.

The gotcha here is that lookup times are slower. Youll need to do some thorough profiling to work out whether this is a better method for you.

20. Use a cloud-based python performance tool.

When youre working locally, you can useprofiling toolsthat will give you insight into the bottlenecks in your application. If your application will be deployed to the web, however, things are different.Stackifywill allow you to see how well your application performs under production load. It also provides code profiling, error tracking, and server metrics. Jump over to thePython sectionto find out how this could work with your application.

Conclusion

Hopefully, some of these tips will help your code run faster and allow you to get better python performance from your application.

Any list of tips is not going to do your thinking for you. However, this list points out some common pitfalls and poses questions for you to ask of your code. It also encourages you to ask questions about architecture and design that will make your applications run faster and more efficiently.

Kevin Cunningham

This post was written by Kevin Cunningham. Kevin is a full-stack developer working with PHP and Python on the backend and Vanilla JS and Vue on the front end. He loves taking ideas and making something tangible from them that adds value to businesses and lives. He spent 12 years teaching high school mathematics and still loves doing puzzles when hes not hanging out with his wife and sons on the beach in Brighton, England.

Next A Guide to Streams in PHP: In-Depth Tutorial With Examples »
Previous « Reducing MTTR in the Field: 10 Simple Steps Using Retrace
Share
Tags: DeveloperPrefixprofilingPython

    Related Post

  • How To Ensure You Are Following The Best Coding Practices

    From email automation to website functions to apps on our phones, the digital world runs

  • A Full Guide on Using Application Monitoring for Your Business

    Wondering whether you can optimize your business using application performance monitoring? The answer is Surely!

  • Test Coverage in Software Testing: Its Relevance & Important Techniques to Take Note

    Software is wonderfully useful for so many situations but it can be a challenge to

Recent Posts

  • Developer Tips, Tricks & Resources

How To Ensure You Are Following The Best Coding Practices

From email automation to website functions to apps on our phones, the digital world runs

2 weeks ago
  • Developer Tips, Tricks & Resources

OOP Concepts for Beginners: What is Polymorphism

Object-Oriented Programming has different concepts allowing developers to build logical code. One of these concepts

2 weeks ago
  • Developer Tips, Tricks & Resources

A Full Guide on Using Application Monitoring for Your Business

Wondering whether you can optimize your business using application performance monitoring? The answer is Surely!

3 weeks ago
  • Developer Tips, Tricks & Resources

What are Linux Logs? How to View Them, Most Important Directories & More

In software, its essential to monitor logs of system activities. Today well unravel what Linux

3 weeks ago
  • Developer Tips, Tricks & Resources

Test Coverage in Software Testing: Its Relevance & Important Techniques to Take Note

Software is wonderfully useful for so many situations but it can be a challenge to

4 weeks ago
  • Developer Tips, Tricks & Resources

How to Log to Console in PHP

As a programming language, PHP is a developer favorite. An essential PHP programming best practice

1 month ago
All Rights ReservedView Non-AMP Version

This website uses cookies.

Accept