I have Pandas Data Frame containing product and it's state along with other information. An example data frame can be created as follows
data = {'Product':['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B', 'B', 'B', 'C', 'C'],
'Date' : ['2020-01-01', '2020-01-02', '2020-01-03', '2020-01-04','2020-01-01', '2020-01-02', '2020-01-03', '2020-01-04', '2020-01-05', '2020-01-06', '2020-01-01', '2020-01-02'],
'Price':[10, 20, 30, 40, 15, 25, 35, 45, 55, 65, 101, 102],
'state':[1, 0, 1, 0, 1, 0, 0, 1, 0, 1, 0, 1] }
test = pd.DataFrame(data)
I want count how many times the product state changes from 0 to 1. I've used the following code to check if product state goes from 0 to 1 and named it as change
test['change'] = np.where(test.state < test.state.shift(-1), 1, 0)
The problem is the code above does not takes into account product therefore I need to group by product and then check change in state.
Output:
+---------+----------+-------+-------+--------+
| Product | Date | Price | state | change |
+---------+----------+-------+-------+--------+
| A | 1/1/2020 | 10 | 1 | 0 |
| A | 1/2/2020 | 20 | 0 | 1 |
| A | 1/3/2020 | 30 | 1 | 0 |
| A | 1/4/2020 | 40 | 0 | 1 |
| B | 1/1/2020 | 15 | 1 | 0 |
| B | 1/2/2020 | 25 | 0 | 0 |
| B | 1/3/2020 | 35 | 0 | 1 |
| B | 1/4/2020 | 45 | 1 | 0 |
| B | 1/5/2020 | 55 | 0 | 1 |
| B | 1/6/2020 | 65 | 1 | 0 |
| C | 1/1/2020 | 101 | 0 | 1 |
| C | 1/2/2020 | 102 | 1 | 0 |
+---------+----------+-------+-------+--------+
As seen from the output above for product A on 4th date change
is 1 because on next date state
is 1 but that is for different product.
Desired Output:
+---------+----------+-------+-------+--------+
| Product | Date | Price | state | change |
+---------+----------+-------+-------+--------+
| A | 1/1/2020 | 10 | 1 | 0 |
| A | 1/2/2020 | 20 | 0 | 1 |
| A | 1/3/2020 | 30 | 1 | 0 |
| A | 1/4/2020 | 40 | 0 | 0 |
| B | 1/1/2020 | 15 | 1 | 0 |
| B | 1/2/2020 | 25 | 0 | 0 |
| B | 1/3/2020 | 35 | 0 | 1 |
| B | 1/4/2020 | 45 | 1 | 0 |
| B | 1/5/2020 | 55 | 0 | 1 |
| B | 1/6/2020 | 65 | 1 | 0 |
| C | 1/1/2020 | 101 | 0 | 1 |
| C | 1/2/2020 | 102 | 1 | 0 |
+---------+----------+-------+-------+--------+
+---------+------------+
| Product |count_change|
+---------+------------+
| A | 1 |
| B | 2 |
| C | 1 |
+---------+------------+
How can I tweak the code so change
is computed after grouping based on product and I can get Product wise count of how many times state changed from 0 to 1.