Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share

command line - PowerShell: how to count number of rows in csv file?

How can I count the number of rows in a CSV file using PowerShell? I tried something like

Get-Content -length "C:\Directory\file.csv"

or

(Get-Content).length "C:\Directory\file.csv"

but these result in an error.


1 Reply


Get-Content and Measure-Object are fine for small files, but both use memory very inefficiently, and I had real problems with large files.
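
For small files, the two approaches mentioned above might look like the following. This is only a sketch: the sample file name and its contents are made up so the snippet runs as-is. Note that .Count (or .Length) is a property of the array that Get-Content returns, so it has to be applied after the command has been given its path, which is why the attempts in the question fail.

```powershell
# Build a small sample CSV in the temp folder (hypothetical data).
$path = Join-Path ([IO.Path]::GetTempPath()) 'sample-rows.csv'
Set-Content -Path $path -Value 'id,name', '1,alpha', '2,beta'

# Option 1: read every line into an array, then take the count.
# @(...) keeps the result an array even for a one-line file.
$countFromArray = @(Get-Content -Path $path).Count

# Option 2: stream the lines through the pipeline and count the objects.
$countFromPipeline = (Get-Content -Path $path | Measure-Object).Count

# Both counts include the header line; subtract 1 for data rows only.
$dataRows = $countFromArray - 1
```

Both options count physical lines, so the header row is included unless you subtract it.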

When I counted rows in a 1 GB file using either method, PowerShell consumed all available memory on the server (8 GB) and then started paging to disk. I left it for over an hour, but it was still paging, so I killed it.

The best method I found for large files is to use IO.StreamReader to read the file from disk one line at a time and count each row in a variable. This keeps memory usage down to a very reasonable 25 MB and is much, much quicker, taking around 30 seconds to count rows in a 1 GB file or a couple of minutes for a 6 GB file. It never eats up unreasonable amounts of RAM, no matter how large your file is:

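[long]$LinesInFile = 0  # [long] rather than [int] so very large files cannot overflow the counter
$reader = New-Object IO.StreamReader 'c:\filename.csv'
try { while ($null -ne $reader.ReadLine()) { $LinesInFile++ } }  # ReadLine returns $null at end of file
finally { $reader.Dispose() }  # always release the file handle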

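The above snippet can be used wherever you would otherwise use Get-Content or Measure-Object; simply refer to the $LinesInFile variable to get the row count of the file. Pass a full path to StreamReader, because .NET resolves relative paths against the process working directory, which is not necessarily the current PowerShell location.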
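
If you need the count in more than one place, the same StreamReader loop could be wrapped in a small function. The sketch below is one possible shape, not a built-in cmdlet: the name Get-LineCount, the sample file, and its contents are all hypothetical.

```powershell
# Get-LineCount is a hypothetical helper name, not a built-in cmdlet.
function Get-LineCount {
    param(
        [Parameter(Mandatory)]
        [string]$Path
    )

    # Resolve to a full path first: .NET resolves relative paths against the
    # process working directory, which can differ from the PowerShell location.
    $fullPath = (Resolve-Path -LiteralPath $Path).ProviderPath

    [long]$count = 0
    $reader = New-Object System.IO.StreamReader -ArgumentList $fullPath
    try {
        # ReadLine returns $null at end of file; an empty line is '' and still counts.
        while ($null -ne $reader.ReadLine()) { $count++ }
    }
    finally {
        # Always release the file handle, even if reading fails.
        $reader.Dispose()
    }
    return $count
}

# Usage with a small temporary file (hypothetical sample data).
$sample = Join-Path ([IO.Path]::GetTempPath()) 'linecount-sample.csv'
Set-Content -Path $sample -Value 'id,name', '1,alpha', '2,beta'
$rows = Get-LineCount -Path $sample
```

This counts physical lines, header included. A CSV whose quoted fields contain line breaks would report more lines than it has records.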

