Python itertools module


Introduction

itertools is a collection of “fast, memory efficient tools”. If you have not used the itertools module, you have most likely written code that was unnecessary. The name comes from its contents: functions that serve as iterator building blocks. In CPython these functions are implemented in C, but the official documentation describes most of them with roughly equivalent generator functions that use the yield keyword. Remember: every generator is an iterator, but not every iterator is a generator.


Split

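islice
islice slices any iterable the way [start:stop] slices a list, and a step can optionally be passed as well. If only one argument is passed after the iterable, it is the value for stop, and start = 0 by default. To specify start, you must also specify stop. To indicate stop = 'till end', use None as the stop value. Unlike list slicing, negative values for start, stop, or step are not supported.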
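
A minimal sketch of the three calling styles described above. The sample list and the variable names are illustrative assumptions, not part of the original post.

```python
import itertools

letters = ['a', 'b', 'c', 'd', 'e', 'f']

# One argument after the iterable: it is the stop value, start defaults to 0.
first_three = list(itertools.islice(letters, 3))

# start and stop together, comparable to letters[2:5].
middle = list(itertools.islice(letters, 2, 5))

# stop=None means "till end"; step=2 takes every second item from index 1.
every_other = list(itertools.islice(letters, 1, None, 2))

print(first_three, middle, every_other)
```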

Combinatoric

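accumulate
accumulate returns running results. By default it adds the items, which gives a running sum for numbers and a running concatenation for strings. The default function is operator.add. A custom two-argument function can be passed instead, e.g. operator.mul.
Example: append each new item and keep only the last two items of the previous result. So each item appears in up to 3 consecutive results.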

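import itertools

def f(a, b):
    # a is the accumulated string so far, b is the next character
    print(a, b)
    return a[-2:] + b

def fn_accumulate():
    # prints ['a', 'ab', 'abc', 'bcd', 'cde', 'def', 'efg']
    print(list(itertools.accumulate('abcdefg', f)))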

This can be written in a better way, using a second-order recurrence relation, as described in https://realpython.com/python-itertools/
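
For comparison, here is a small sketch of accumulate with its default function and with operator.mul. The input values and variable names are made up for illustration.

```python
import itertools
import operator

# Default function is addition: a running sum for numbers.
running_sum = list(itertools.accumulate([1, 2, 3, 4]))

# Addition on strings is concatenation, so we get growing prefixes.
prefixes = list(itertools.accumulate('abc'))

# A custom two-argument function: a running product.
running_product = list(itertools.accumulate([1, 2, 3, 4], operator.mul))

print(running_sum, prefixes, running_product)
```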

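permutations = nPr
combinations = nCr
combinations_with_replacement = nCr + each item paired with itself

Here r is the length of each result tuple. For permutations, r is optional and defaults to the length of the input. For combinations and combinations_with_replacement, r is required.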
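
A short sketch that puts the three functions side by side on the same input. The input string 'ABC' and r = 2 are arbitrary choices for illustration.

```python
import itertools

items = 'ABC'

# nPr: order matters, no item is reused.
perms = list(itertools.permutations(items, 2))

# nCr: order does not matter, no item is reused.
combs = list(itertools.combinations(items, 2))

# nCr plus each item paired with itself.
combs_wr = list(itertools.combinations_with_replacement(items, 2))

# permutations without r uses the full length of the input.
full_perms = list(itertools.permutations(items))

print(len(perms), len(combs), len(combs_wr), len(full_perms))
```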
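product generates the Cartesian product of its input iterables. With the repeat argument it can be used to generate binary numbers. In Python 3, map is lazy, so wrap it in list() to see the values.
map(list, itertools.product("01", repeat=5))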
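
A runnable sketch of the same idea, shortened to 3 bits so the result is easy to read. Joining the bits into strings is an illustrative choice, not something the original post does.

```python
import itertools

# All 3-bit binary numbers, as strings.
binary = [''.join(bits) for bits in itertools.product('01', repeat=3)]

# Cartesian product of two different iterables.
pairs = list(itertools.product([1, 2], 'ab'))

print(binary)
print(pairs)
```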

Merge

chain
chain merges two or more iterables into one sequence without building an intermediate list. It can be used to add anything as a prefix or as a suffix to a list.
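
A minimal sketch of merging, prefixing, and suffixing with chain. The values 'start' and 'end' are placeholders chosen for illustration.

```python
from itertools import chain

# Merge two lists.
merged = list(chain([1, 2], [3, 4]))

# Add a prefix or a suffix by chaining a one-item list.
with_prefix = list(chain(['start'], [1, 2]))
with_suffix = list(chain([1, 2], ['end']))

print(merged, with_prefix, with_suffix)
```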

chain.from_iterable
Here all the iterables are passed as a single argument: one iterable of iterables, e.g. a list of lists. That outer iterable is evaluated lazily, so it can also be a generator.
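
A small sketch of flattening with chain.from_iterable, once from a list of lists and once from a generator. The sample data is invented for illustration.

```python
from itertools import chain

# A list of lists is flattened by one level.
nested = [[1, 2], [3], [4, 5]]
flat = list(chain.from_iterable(nested))

# The outer iterable can be a generator; strings are iterated per character.
chars = list(chain.from_iterable(word.upper() for word in ['ab', 'cd']))

print(flat, chars)
```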

zip_longest
Similar to zip, but it continues until the longest input is exhausted. Unpaired items are filled with None. We can also pass fillvalue to replace None with something else.
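
A brief sketch comparing zip and zip_longest on inputs of unequal length. The fill value 0 is an arbitrary choice for illustration.

```python
from itertools import zip_longest

letters = 'abc'
numbers = [1, 2]

# Built-in zip stops at the shortest input.
short = list(zip(letters, numbers))

# zip_longest pads the missing value with None ...
padded = list(zip_longest(letters, numbers))

# ... or with a chosen fillvalue.
padded_zero = list(zip_longest(letters, numbers, fillvalue=0))

print(short, padded, padded_zero)
```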

We can use zip with count to add an index number to every item.

from itertools import count

for i in zip(count(1), ['a', 'b', 'c']):
    print(i)

Infinite iterators

count is an infinite iterator.
To stop it, (1) use an if condition with break OR (2) use islice.
Optionally, a step can be passed to count. This step can be a fractions.Fraction too.
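
The two ways of stopping count, plus a Fraction step, might look like the sketch below. The start values, steps, and limits are arbitrary choices for illustration.

```python
from fractions import Fraction
from itertools import count, islice

# (1) Stop with an if condition and break.
evens = []
for n in count(0, 2):
    if n > 8:
        break
    evens.append(n)

# (2) Stop with islice: take only the first four values.
by_fives = list(islice(count(10, 5), 4))

# The step can be a Fraction, which keeps the arithmetic exact.
thirds = list(islice(count(Fraction(0), Fraction(1, 3)), 4))

print(evens, by_fives, thirds)
```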

cycle is also an infinite iterator. It cycles through a series of values endlessly.
To stop it, use an if condition with break.
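
A minimal sketch of stopping cycle with an if condition. The second half shows an alternative that the text above does not mention: pairing the cycle with a finite iterable in zip. The colour names and limits are illustrative.

```python
from itertools import cycle

# Stop with an if condition: collect colours until we have seven.
picked = []
for colour in cycle(['red', 'green', 'blue']):
    if len(picked) == 7:
        break
    picked.append(colour)

# zip stops at the shortest input, so it also bounds a cycle.
labelled = list(zip(range(5), cycle('ab')))

print(picked)
print(labelled)
```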

repeat takes an object as input, instead of an iterable.
It is infinite by default. To stop it, pass the optional argument 'times'.
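
A short sketch of repeat with and without times. Using it as a constant argument for map is one possible application, and the values here are illustrative.

```python
from itertools import repeat

# With times, repeat is finite.
three_x = list(repeat('x', 3))

# Without times, repeat is infinite; map stops when range(5) is exhausted.
# Here repeat(2) supplies the constant exponent for pow.
squares = list(map(pow, range(5), repeat(2)))

print(three_x, squares)
```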

Filter

Most functions in the filter category take a predicate function as an argument. compress is the exception: it takes an iterable of selectors instead.

compress
Here the second argument is an iterable of Booleans (selectors). Each item of the first argument is kept or dropped according to the matching selector. It is like masking with the Boolean & (AND) operator. The selectors can be the output of cycle to create a repeating pattern. Note: any falsy value in the selectors counts as False, e.g. 0, False, None, ''.
For example:

import itertools

def fn_compress():
    # the repeating mask keeps every third item: 0, 3, 6, ..., 18
    mask = [1, 0, 0]
    res = itertools.compress(range(20), itertools.cycle(mask))
    for i in res:
        print(i)


dropwhile
Instead of a list of Booleans, we pass a predicate as the first argument here. While the predicate returns true, items are dropped. The first time the predicate returns false, that item is kept and the predicate is disabled: every remaining item is kept without being tested.

takewhile
It is the opposite of dropwhile. While the predicate returns true, items are taken. The first time the predicate returns false, that item is not taken and iteration stops, so the predicate is never called again.
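
A small sketch showing dropwhile and takewhile on the same data. The predicate is_odd and the sample list are hypothetical, chosen so that odd values also appear after the first even one.

```python
from itertools import dropwhile, takewhile

def is_odd(n):
    return n % 2 == 1

data = [1, 3, 5, 2, 7, 4]

# Drops 1, 3, 5; from 2 onward the predicate is no longer consulted,
# so the later odd value 7 is kept.
dropped = list(dropwhile(is_odd, data))

# Takes 1, 3, 5 and stops at 2; the later 7 is never reached.
taken = list(takewhile(is_odd, data))

print(dropped, taken)
```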

filterfalse
Unlike dropwhile, the predicate is never disabled here: it is applied to every item.
It returns all values for which the predicate function returns false. It takes a predicate function and an iterable as input. This is the opposite of the built-in function filter.
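
A minimal sketch contrasting filterfalse with the built-in filter. The predicate is_odd and the data are illustrative assumptions.

```python
from itertools import filterfalse

def is_odd(n):
    return n % 2 == 1

data = [1, 3, 5, 2, 7, 4]

# Every item is tested; those where the predicate is false are kept.
evens = list(filterfalse(is_odd, data))

# Built-in filter keeps the items where the predicate is true.
odds = list(filter(is_odd, data))

print(evens, odds)
```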

We can use the built-in function filter to list all the factors of a given integer, and from that also find out whether it is prime.
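
One possible way to do this is sketched below. The helper names factors and is_prime are hypothetical, and trial division over the whole range is chosen for clarity, not for speed.

```python
def factors(n):
    """Return all positive divisors of n, found with the built-in filter."""
    return list(filter(lambda d: n % d == 0, range(1, n + 1)))

def is_prime(n):
    """A number greater than 1 is prime if its only factors are 1 and itself."""
    return n > 1 and factors(n) == [1, n]

print(factors(12))
print(is_prime(13), is_prime(12))
```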

Grouping

groupby
When a list of tuples is sorted, it is sorted by the first element by default. If groupby is then called on this sorted list with the first element as the key, it returns an iterator of "key, iterator" pairs. Here key is the first member of the tuple, which repeats in several tuples, and the iterator yields all the tuples that share that particular key.

This can be used to remove consecutive duplicate characters in a string
>>> foo = "SSYYNNOOPPSSIISS"
>>> import itertools
>>> ''.join(ch for ch, _ in itertools.groupby(foo))
'SYNOPSIS'

Note:
groupby does not sort. If it is called on unsorted input, it only groups consecutive items, so the same key can appear in several groups. To get one group per key, sort the input first, passing the same key function to sorted and to groupby.

The sorted function compares tuples by their first value by default. This can be changed by passing
1. key = lambda x: x[1]
2. key = operator.itemgetter(1)
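
The effect of sorting before groupby can be sketched as follows. The list of (category, name) tuples is invented for illustration.

```python
from itertools import groupby
from operator import itemgetter

pairs = [('fruit', 'apple'), ('veg', 'carrot'),
         ('fruit', 'banana'), ('veg', 'pea')]

# Without sorting, groupby only groups consecutive items,
# so each key shows up more than once.
unsorted_keys = [key for key, _ in groupby(pairs, key=itemgetter(0))]

# Sort with the same key first to get one group per key.
ordered = sorted(pairs, key=itemgetter(0))
grouped = {key: [name for _, name in group]
           for key, group in groupby(ordered, key=itemgetter(0))}

print(unsorted_keys)
print(grouped)
```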

Miscellaneous 

starmap
It is similar to map, but it takes a single iterable of argument tuples and unpacks each tuple into the function's arguments. We can use it to convert between co-ordinate systems, Cartesian to polar and vice versa. With map we can pass two lists directly; for starmap those two lists should be zipped and passed as one iterable of tuples. With map, if we want to pass a constant instead of one of the iterables, we can use repeat from itertools. map can also be used to create a list of objects, by passing the constructor as the function.
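
A minimal sketch of the Cartesian-to-polar conversion, done once with map and once with starmap. The helper to_polar and the sample co-ordinates are hypothetical.

```python
import math
from itertools import starmap

def to_polar(x, y):
    """Convert Cartesian (x, y) to polar (r, theta in radians)."""
    return math.hypot(x, y), math.atan2(y, x)

xs = [1.0, 0.0, -3.0]
ys = [0.0, 2.0, 0.0]

# map takes the two lists as separate arguments.
via_map = list(map(to_polar, xs, ys))

# starmap takes one iterable of (x, y) tuples, so the lists are zipped first.
via_starmap = list(starmap(to_polar, zip(xs, ys)))

print(via_map == via_starmap)
```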

tee
It generates multiple independent iterators from a single iterable. It is like a copy function for iterators. For example, it can create multiple identical iterators over the characters of a string. Default = 2 copies. When you call tee() to create n independent iterators, each iterator is essentially working with its own FIFO queue. Once tee() has been called, the original iterator should not be used anywhere else.
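
A short sketch of tee with the default two copies and with an explicit count. The inputs are arbitrary examples.

```python
from itertools import tee

# Default: two independent iterators over the same string.
first, second = tee('abc')
letters_a = list(first)
letters_b = list(second)   # unaffected by having consumed `first`

# Ask for three copies explicitly.
copies = tee(range(3), 3)
as_lists = [list(c) for c in copies]

print(letters_a, letters_b, as_lists)
```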

Possible software using itertools

1. Veda Chanting pattern. GHAN - PAATH etc.
2. Convolution coding.
3. Turbo coding
4. Gray code generator. 
5. Fibonacci series. 


Reference

Blog

https://medium.com/discovering-data-science-a-chronicle/itertools-to-the-rescue-427abdecc412
https://www.blog.pythonlibrary.org/2016/04/20/python-201-an-intro-to-itertools/
https://pymotw.com/3/itertools/index.html
https://realpython.com/python-itertools/

Github : similar tools

https://github.com/erikrose/more-itertools
https://github.com/topics/itertools

Jupyter Notebook

https://github.com/ericchan24/itertools/blob/master/itertools.ipynb
https://github.com/dineshsonachalam/Python_Practice/blob/master/6.Itertools.ipynb

Official Documentation

https://docs.python.org/3/library/itertools.html
