Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

0 votes
495 views
in Technique[技术] by (71.8m points)

How to count word frequencies within a file in Python

I have a .txt file with the following format:

C
V
EH
A
IRQ
C
C
H
IRG
V

Although it's obviously a lot bigger than that, this is essentially it. Basically, I'm trying to count how many times each individual string appears in the file (each letter/string is on a separate line, so technically the file is C\nV\nEH\n etc.). However, when I try to convert the file into a list and then use the count function on it, it separates out the letters, so that strings such as 'IRQ' become ['I', 'R', 'Q', '\n']. When I count, I then get the frequencies of each individual letter and not of the strings.

Here is the code that I have written so far:

def countf():
    fh = open("C:/x.txt","r")
    fh2 = open("C:/y.txt","w")
    s = []
    for line in fh:
        s += line
    for x in s:
        fh2.write("{:<s} - {:<d}".format(x,s.count(x)))

What I want to end up with is an output file that looks something like this:

C  10
V  32
EH 7
A  1
IRQ  9
H 8

1 Reply

0 votes
by (71.8m points)

Use Counter(), and use strip() to remove the trailing newline ('\n') from each line:

from collections import Counter

with open('x.txt') as f1, open('y.txt', 'w') as f2:
    # strip() drops the trailing newline so each line is counted as one whole string
    c = Counter(x.strip() for x in f1)
    for x in c:
        print(x, c[x])   # do f2.write() here if you want to write them to f2

Output for the sample input above (the order of the lines may differ depending on the Python version):

A 1
C 3
EH 1
IRQ 1
V 2
H 1
IRG 1
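
To tie this back to the question, here is a rough sketch of why the original loop ended up with single letters, and how the counts could be written to the output file. The function name count_lines, its parameters, the column width in the format string and the choice to skip blank lines are my own assumptions for illustration, not something from the question.

```python
from collections import Counter

# Why the original loop produced single letters: `list += str` extends the
# list with every character of the string, including the newline.
chars = []
chars += "IRQ\n"
# chars is now ['I', 'R', 'Q', '\n']


def count_lines(in_path, out_path):
    """Count each stripped, non-empty line of in_path and write the totals to out_path.

    Hypothetical helper: the name, the parameters and the output layout are
    assumptions made for this sketch.
    """
    with open(in_path) as src:
        # strip() removes the trailing newline; the filter skips blank lines
        counts = Counter(line.strip() for line in src if line.strip())
    with open(out_path, "w") as dst:
        # most_common() yields (string, count) pairs, highest count first
        for token, n in counts.most_common():
            dst.write("{:<4} {:d}\n".format(token, n))
    return counts
```

Calling count_lines("x.txt", "y.txt") on the sample file from the question should give C a count of 3, V a count of 2 and every other string a count of 1. If you would rather keep the original list approach, s.append(line.strip()) in place of s += line keeps each line as a single list element.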
