I have a CSV file with three columns: computer_name, software_code, and software_update_date. The file includes computers that I don't need in my final report; I only need the data for computers whose names start with 40-, 46-, or 98-. Here is a sample of the file:
computer_name software_code software_update_date
07-0708 436 2019-02-07 0:00
30-0207 35170 2021-01-18 0:00
40-0049 41 2017-06-21 23:00
46-0001 11 2013-11-23 0:00
So I would like to delete the rows for 07-0708 and 30-0207. I tried with pandas, but the generated file is exactly the same as the input, with no error message. I am quite new to Python and still grasping the concepts. I wrote the code below:
import csv
import pandas as pd
fname = 'RAWfile.csv'
df=pd.read_csv(fname,encoding='ISO-8859-1')
#Renaming columns from the report
df.rename(columns = {'computer_name':'PC_NO', 'software_code':'SOFT_CODE', 'software_update_date':'UPDATE_DATE'}, inplace=True)
computers = ['40-','46-','98-']
searchstr = '|'.join(computers)
df[df['PC_NO'].str.contains(searchstr)]
df.to_csv('updatedfile.csv',index=False,quoting=csv.QUOTE_ALL,line_terminator='\n')
UPDATE: There are almost 70,000 rows in the CSV file. I corrected the values in the computers list to match the question.
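One likely reason the output is unchanged: the line `df[df['PC_NO'].str.contains(searchstr)]` builds a filtered DataFrame but never assigns it to anything, so `df` still holds every row when it is written out. In addition, `str.contains` matches the pattern anywhere in the name, not only at the start. Below is a minimal sketch of one way to address both points. It assumes the real file is comma-separated with the header shown above; the in-memory `raw` buffer, the `out` buffer, and the names `pattern`, `mask`, and `filtered` are illustrative stand-ins, not part of the original code.

```python
import csv
import io
import re

import pandas as pd

# Stand-in for RAWfile.csv, built from the sample rows in the question
# (assumption: the real file is comma-separated with this header).
raw = io.StringIO(
    "computer_name,software_code,software_update_date\n"
    "07-0708,436,2019-02-07 0:00\n"
    "30-0207,35170,2021-01-18 0:00\n"
    "40-0049,41,2017-06-21 23:00\n"
    "46-0001,11,2013-11-23 0:00\n"
)

df = pd.read_csv(raw)
df = df.rename(columns={
    'computer_name': 'PC_NO',
    'software_code': 'SOFT_CODE',
    'software_update_date': 'UPDATE_DATE',
})

computers = ['40-', '46-', '98-']
# Escape each prefix and group the alternatives into one regex.
pattern = '(?:' + '|'.join(re.escape(p) for p in computers) + ')'

# str.match only matches at the beginning of each string, so this keeps
# names that START with a prefix. na=False treats missing names as non-matches.
mask = df['PC_NO'].str.match(pattern, na=False)

# The key step: keep the filtered result instead of discarding it.
filtered = df[mask]

# Written to an in-memory buffer here; pass a file path to write to disk.
out = io.StringIO()
filtered.to_csv(out, index=False, quoting=csv.QUOTE_ALL)
```

With the real data, replacing `raw` with `'RAWfile.csv'` (plus the `encoding` argument) and `out` with `'updatedfile.csv'` should give the intended file. The boolean mask is computed in one vectorized pass, so 70,000 rows should not be a problem. Note also that pandas 1.5 renamed the `line_terminator` argument of `to_csv` to `lineterminator`, and the old spelling was removed in pandas 2.0.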
question from:
https://stackoverflow.com/questions/65850595/remove-rows-in-a-csv-file-based-on-the-format-of-column-value