Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

0 votes
818 views
in Technique[技术] by (71.8m points)

Compare two values in one column using a single CSV file in Python

I am trying to compare two people's sales data in one CSV file, and I cannot use pandas. I want the total units each of the two people sold across all years, a comparison of who sold more based on those totals, and the year in which each of them sold the least.

For example, my .csv is setup like this:
John Smith, 343, 2020
John Smith, 522, 2019
John Smith, 248, 2018
Sherwin Cooper, 412, 2020
Sherwin Cooper, 367, 2019
Sherwin Cooper, 97, 2018
Dorothy Lee, 612, 2020
Dorothy Lee, 687, 2019
Dorothy Lee, 591, 2018

I want to compare John's and Dorothy's units sold and see who sold more. So the output should be:
Dorothy Lee sold more units than John Smith. A total of 1890 to 1113.
Dorothy Lee sold less in 2018, for only 591.
John Smith sold less in 2018, for only 248.
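
As a sanity check on those numbers, the totals and lowest years can be worked out directly from the sample rows. This is only a sketch over hard-coded data: the `rows` list and the `summarize` helper are hypothetical names used for illustration, not part of the code in the question.

```python
# Sample rows copied from the CSV above: (name, units, year)
rows = [
    ("John Smith", 343, 2020),
    ("John Smith", 522, 2019),
    ("John Smith", 248, 2018),
    ("Sherwin Cooper", 412, 2020),
    ("Sherwin Cooper", 367, 2019),
    ("Sherwin Cooper", 97, 2018),
    ("Dorothy Lee", 612, 2020),
    ("Dorothy Lee", 687, 2019),
    ("Dorothy Lee", 591, 2018),
]


def summarize(rows, name):
    """Return (total units, (lowest units, year of lowest)) for one person."""
    sales = [(units, year) for who, units, year in rows if who == name]
    total = sum(units for units, _ in sales)
    lowest = min(sales)  # tuples compare by units first, so this is the weakest year
    return total, lowest


for name in ("John Smith", "Dorothy Lee"):
    total, (low_units, low_year) = summarize(rows, name)
    print(f"{name}: total {total}, lowest {low_units} in {low_year}")
```

This agrees with the expected output: 1113 for John Smith and 1890 for Dorothy Lee, with 2018 as the lowest year for both.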

My code so far is:

import csv

def compare1(employee1):

    with open("employeedata.csv") as file:
        rows = list(csv.DictReader(file, fieldnames=['c1', 'c2', 'c3']))

    res = {}

    # Add up the units (column c2) for every row that matches the name
    for row in rows:
        if row['c1'] == employee1:
            res[employee1] = res.get(employee1, 0) + int(row['c2'])

    print(res)

def compare2(employee2):

    # Same file as compare1: both people are in one CSV
    with open("employeedata.csv") as file:
        rows = list(csv.DictReader(file, fieldnames=['c1', 'c2', 'c3']))

    res = {}

    for row in rows:
        if row['c1'] == employee2:
            res[employee2] = res.get(employee2, 0) + int(row['c2'])

    print(res)

employee1 = input("Enter the first name: ")
employee2 = input("Enter the second name: ")


compare1(employee1)
compare2(employee2)

I don't know how to do the rest, and I am stuck. I am a beginner and I can't use pandas. The output I need should look like this:

Dorothy Lee sold more units than John Smith. A total of 1890 to 1113.
Dorothy Lee sold less in 2018, for only 591.
John Smith sold less in 2018, for only 248.
Right now I get this output:
{'John Smith': 1113}
{'Dorothy Lee' : 1890}

question from:https://stackoverflow.com/questions/65951520/compare-two-data-in-a-columns-using-one-csv-file-in-python

Battle with dragons too long and you become a dragon yourself; gaze into the abyss too long and the abyss gazes back…

1 Reply

0 votes
by (71.8m points)

Suppose my.csv has columns name, sales, year:

import pandas as pd

emp_df = pd.read_csv("my.csv")

# Total sales per name, as a Series indexed by name
emp_gp = emp_df.groupby("name")["sales"].sum()


def compare(saler1, saler2):
    if saler1 in emp_gp.index and saler2 in emp_gp.index:
        saler1_tol = emp_gp[saler1]
        saler2_tol = emp_gp[saler2]
        if saler1_tol > saler2_tol:
            print(f"{saler1} sold more units than {saler2}. A total of {saler1_tol} to {saler2_tol}.")
        else:
            print(f"{saler2} sold more units than {saler1}. A total of {saler2_tol} to {saler1_tol}.")
        # For each name, keep the whole row with the lowest sales so the year is available
        emp_min = emp_df.loc[emp_df.groupby("name")["sales"].idxmin()].set_index("name")
        for saler in (saler1, saler2):
            print(f"{saler} sold less in {emp_min.loc[saler, 'year']}, for only {emp_min.loc[saler, 'sales']}.")
    else:
        print("One or both names are not in the table")
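
The question rules out pandas, so here is a rough sketch of the same comparison using only the standard `csv` module. It assumes the file has no header row and looks like the sample in the question (name, units, year, with a space after each comma). The names `load_sales` and `compare` are my own choices for illustration, and a tie in totals is not treated specially.

```python
import csv


def load_sales(path):
    """Read headerless rows of name, units, year into {name: [(units, year), ...]}."""
    sales = {}
    with open(path, newline="") as file:
        # skipinitialspace drops the blank that follows each comma in the sample file
        for row in csv.reader(file, skipinitialspace=True):
            if not row:
                continue  # ignore blank lines
            name, units, year = row
            sales.setdefault(name, []).append((int(units), int(year)))
    return sales


def compare(sales, first, second):
    """Return the report lines for two people; raises KeyError for an unknown name."""
    totals = {name: sum(units for units, _ in sales[name]) for name in (first, second)}
    # Higher total goes first; on a tie the original order is kept
    winner, loser = sorted((first, second), key=totals.get, reverse=True)
    lines = [
        f"{winner} sold more units than {loser}. "
        f"A total of {totals[winner]} to {totals[loser]}."
    ]
    for name in (winner, loser):
        low_units, low_year = min(sales[name])  # smallest units, with its year
        lines.append(f"{name} sold less in {low_year}, for only {low_units}.")
    return lines
```

With the sample data saved as employeedata.csv, `print("\n".join(compare(load_sales("employeedata.csv"), "John Smith", "Dorothy Lee")))` should print the three lines asked for in the question. Returning the lines instead of printing inside the function keeps the logic easy to check, and the names can still come from `input()` as in the original code.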
