You've got the right idea extracting the first item from each tuple. You can make your code more concise using a list/generator comprehension, as I show you below.
From that point on, the most idiomatic way to find frequency counts of elements is to use a collections.Counter object.
- Extract the first elements from your list of tuples (using a comprehension)
- Pass this to Counter
- Query count of example
from collections import Counter
counts = Counter(x[0] for x in b_data)
print(counts['example'])
Sure, you can use list.count if it's only one item you want to find frequency counts for, but in the general case, a Counter is the way to go.
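To make the comparison concrete, here is a minimal sketch of both approaches side by side. The contents of b_data are made up for illustration, since the original list isn't shown.

```python
from collections import Counter

# Hypothetical sample data: a list of tuples whose first item is the label.
b_data = [("example", 1), ("foo", 2), ("example", 3), ("bar", 4)]

first_items = [x[0] for x in b_data]

# list.count scans the whole list once per call.
single = first_items.count("example")

# Counter tallies every label in a single pass.
counts = Counter(first_items)

print(single)             # 2
print(counts["example"])  # 2
```

Every further lookup on counts is a dictionary access, whereas every further list.count call is another full scan.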
The advantage of a Counter is that it performs frequency counts of all elements (not just example) in linear (O(N)) time. Say you also wanted to query the count of another element, say foo. That would be done with:
print(counts['foo'])
If 'foo' doesn't exist in the list, 0 is returned.
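A small sketch of that behaviour, using made-up labels: looking up a missing key returns 0 rather than raising KeyError, and the lookup does not add the key to the Counter.

```python
from collections import Counter

# Hypothetical labels for illustration only.
counts = Counter(["example", "foo", "example"])

missing = counts["baz"]              # no KeyError; a missing key counts as 0
still_absent = "baz" not in counts   # the lookup did not insert the key

print(missing)       # 0
print(still_absent)  # True
```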
If you want to find the most common elements, call counts.most_common:
print(counts.most_common(n))
where n is the number of elements you want to display. If you want to see everything, don't pass n.
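As a quick illustration with a throwaway string (not the asker's data), this is what most_common returns with and without n:

```python
from collections import Counter

counts = Counter("abracadabra")

top_two = counts.most_common(2)   # only the two highest counts
everything = counts.most_common() # every element, highest count first

print(top_two)     # [('a', 5), ('b', 2)]
print(everything)  # [('a', 5), ('b', 2), ('r', 2), ('c', 1), ('d', 1)]
```

Elements with equal counts come out in the order they were first encountered.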
To retrieve only the elements that occur more than once, one efficient approach is to call most_common and then take entries for as long as their count is over 1, using itertools.takewhile:
from itertools import takewhile
l = [1, 1, 2, 2, 3, 3, 1, 1, 5, 4, 6, 7, 7, 8, 3, 3, 2, 1]
c = Counter(l)
list(takewhile(lambda x: x[-1] > 1, c.most_common()))
[(1, 5), (3, 4), (2, 3), (7, 2)]
(OP edit) Alternatively, use a list comprehension to get a list of items having count > 1 -
[item[0] for item in counts.most_common() if item[-1] > 1]
Keep in mind that this isn't as efficient as the itertools.takewhile solution. For example, if you have one item with count > 1, and a million items with count equal to 1, the comprehension checks all million and one entries when it doesn't have to (because most_common returns frequency counts in descending order). With takewhile that isn't the case, because you stop iterating as soon as the condition of count > 1 becomes false. The saving applies to the filtering step only, since most_common() with no argument still sorts every entry.
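The difference can be observed by counting how often the condition is evaluated. This is a rough sketch on made-up data (one repeated value and 1,000 unique ones instead of a million, to keep it quick); the repeated helper and the checks dictionary are hypothetical names introduced only for the measurement.

```python
from collections import Counter
from itertools import takewhile

# Hypothetical data: one repeated value plus 1,000 values that occur once.
data = [0, 0] + list(range(1, 1001))
ranked = Counter(data).most_common()

checks = {"takewhile": 0, "comprehension": 0}

def repeated(pair, label):
    """Return True if the count is above 1, recording that a check ran."""
    checks[label] += 1
    return pair[1] > 1

# Stops at the first entry whose count is not above 1.
early = list(takewhile(lambda p: repeated(p, "takewhile"), ranked))

# Tests every entry, even after the counts have dropped to 1.
full = [p for p in ranked if repeated(p, "comprehension")]

print(early == full)  # True
print(checks)         # {'takewhile': 2, 'comprehension': 1001}
```

Both approaches give the same result here; takewhile just stops after the first failing entry.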