Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
4.0k views
in Technique[技术] by (71.8m points)

python - Count Change in State For Each Group in Pandas DataFrame

I have Pandas Data Frame containing product and it's state along with other information. An example data frame can be created as follows

data = {'Product':['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B', 'B', 'B', 'C', 'C'],
        'Date' : ['2020-01-01', '2020-01-02', '2020-01-03', '2020-01-04','2020-01-01', '2020-01-02', '2020-01-03', '2020-01-04', '2020-01-05', '2020-01-06', '2020-01-01', '2020-01-02'],
        'Price':[10, 20, 30, 40, 15, 25, 35, 45, 55, 65, 101, 102],
         'state':[1, 0, 1, 0, 1, 0, 0, 1, 0, 1, 0, 1]   }
test = pd.DataFrame(data)

I want count how many times the product state changes from 0 to 1. I've used the following code to check if product state goes from 0 to 1 and named it as change

test['change'] = np.where(test.state < test.state.shift(-1), 1, 0)

The problem is the code above does not takes into account product therefore I need to group by product and then check change in state.

Output:

+---------+----------+-------+-------+--------+
| Product |   Date   | Price | state | change |
+---------+----------+-------+-------+--------+
| A       | 1/1/2020 |    10 |     1 |      0 |
| A       | 1/2/2020 |    20 |     0 |      1 |
| A       | 1/3/2020 |    30 |     1 |      0 |
| A       | 1/4/2020 |    40 |     0 |      1 |
| B       | 1/1/2020 |    15 |     1 |      0 |
| B       | 1/2/2020 |    25 |     0 |      0 |
| B       | 1/3/2020 |    35 |     0 |      1 |
| B       | 1/4/2020 |    45 |     1 |      0 |
| B       | 1/5/2020 |    55 |     0 |      1 |
| B       | 1/6/2020 |    65 |     1 |      0 |
| C       | 1/1/2020 |   101 |     0 |      1 |
| C       | 1/2/2020 |   102 |     1 |      0 |
+---------+----------+-------+-------+--------+

As seen from the output above for product A on 4th date change is 1 because on next date state is 1 but that is for different product.

Desired Output:

+---------+----------+-------+-------+--------+
| Product |   Date   | Price | state | change |
+---------+----------+-------+-------+--------+
| A       | 1/1/2020 |    10 |     1 |      0 |
| A       | 1/2/2020 |    20 |     0 |      1 |
| A       | 1/3/2020 |    30 |     1 |      0 |
| A       | 1/4/2020 |    40 |     0 |      0 |
| B       | 1/1/2020 |    15 |     1 |      0 |
| B       | 1/2/2020 |    25 |     0 |      0 |
| B       | 1/3/2020 |    35 |     0 |      1 |
| B       | 1/4/2020 |    45 |     1 |      0 |
| B       | 1/5/2020 |    55 |     0 |      1 |
| B       | 1/6/2020 |    65 |     1 |      0 |
| C       | 1/1/2020 |   101 |     0 |      1 |
| C       | 1/2/2020 |   102 |     1 |      0 |
+---------+----------+-------+-------+--------+



+---------+------------+
| Product |count_change|
+---------+------------+
| A       |          1 |
| B       |          2 |
| C       |          1 |
+---------+------------+

How can I tweak the code so change is computed after grouping based on product and I can get Product wise count of how many times state changed from 0 to 1.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

Try groupby:

g = test.groupby('Product')
test['change'] = (g['state'].diff(-1)<0).astype(int)
g['change'].sum()

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...