
optimization - Java NIO FileChannel versus FileOutputStream performance / usefulness

I am trying to figure out whether there is any difference in performance (or any advantage) in using NIO FileChannel versus plain FileInputStream/FileOutputStream to read and write files on the filesystem. I have observed that on my machine both perform at about the same level, and the FileChannel approach is often slower. Can anyone share more details comparing these two methods? Here is the code I used; the file I am testing with is around 350 MB. Is it a good option to use the NIO-based classes for file I/O if I am not looking for random access or other such advanced features?

package trialjavaprograms;

import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

public class JavaNIOTest {
    public static void main(String[] args) throws Exception {
        useNormalIO();
        useFileChannel();
    }

    private static void useNormalIO() throws Exception {
        File file = new File("/home/developer/test.iso");
        File oFile = new File("/home/developer/test2");

        long time1 = System.currentTimeMillis();
        InputStream is = new FileInputStream(file);
        FileOutputStream fos = new FileOutputStream(oFile);
        byte[] buf = new byte[64 * 1024];
        int len = 0;
        while((len = is.read(buf)) != -1) {
            fos.write(buf, 0, len);
        }
        fos.flush();
        fos.close();
        is.close();
        long time2 = System.currentTimeMillis();
        System.out.println("Time taken: "+(time2-time1)+" ms");
    }

    private static void useFileChannel() throws Exception {
        File file = new File("/home/developer/test.iso");
        File oFile = new File("/home/developer/test2");

        long time1 = System.currentTimeMillis();
        FileInputStream is = new FileInputStream(file);
        FileOutputStream fos = new FileOutputStream(oFile);
        FileChannel f = is.getChannel();
        FileChannel f2 = fos.getChannel();

        ByteBuffer buf = ByteBuffer.allocateDirect(64 * 1024);
        int len;
        while((len = f.read(buf)) != -1) {
            buf.flip();
            f2.write(buf);
            buf.clear();
        }

        f2.close();
        f.close();

        long time2 = System.currentTimeMillis();
        System.out.println("Time taken: "+(time2-time1)+" ms");
    }
}


1 Reply


My experience with larger file sizes has been that java.nio is faster than java.io. Solidly faster, in the >250% range. That said, I am eliminating obvious bottlenecks, which I suspect your micro-benchmark suffers from. Potential areas to investigate:

The buffer size. The algorithm you basically have is

  • copy from disk to buffer
  • copy from buffer to disk

My own experience has been that this buffer size is ripe for tuning. I've settled on 4KB for one part of my application and 256KB for another. I suspect your code is suffering from such a large buffer. Run some benchmarks with buffers of 1KB, 2KB, 4KB, 8KB, 16KB, 32KB and 64KB to prove it to yourself.
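
A rough sketch of such a sweep might look like the following. It reuses the paths from your code; the class name and the simple currentTimeMillis timing are purely illustrative.

import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.InputStream;
import java.io.OutputStream;

public class BufferSizeSweep {
    public static void main(String[] args) throws Exception {
        // Candidate buffer sizes: 1KB up to 64KB
        int[] sizes = {1024, 2 * 1024, 4 * 1024, 8 * 1024, 16 * 1024, 32 * 1024, 64 * 1024};
        for (int size : sizes) {
            long start = System.currentTimeMillis();
            try (InputStream in = new FileInputStream("/home/developer/test.iso");
                 OutputStream out = new FileOutputStream("/home/developer/test2")) {
                byte[] buf = new byte[size];
                int len;
                while ((len = in.read(buf)) != -1) {
                    out.write(buf, 0, len);
                }
            }
            System.out.println(size / 1024 + " KB buffer: "
                    + (System.currentTimeMillis() - start) + " ms");
        }
    }
}

Run the sweep a couple of times and ignore the first pass, so the OS page cache affects every buffer size roughly equally.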

Don't run Java benchmarks that read from and write to the same disk.

If you do, then you are really benchmarking the disk, not Java. I would also suggest that if your CPU is not busy, you are probably hitting some other bottleneck.

Don't use a buffer if you don't need to.

Why copy to memory if your target is another disk or a NIC? With larger files, the latency incurred is non-trivial.

Like others have said, use FileChannel.transferTo() or FileChannel.transferFrom(). The key advantage here is that the JVM uses the OS's access to DMA (Direct Memory Access), if present. (This is implementation dependent, but modern Sun and IBM JVMs on general-purpose CPUs are good to go.) What happens is the data goes straight to/from disk, to the bus, and then to the destination... bypassing any circuit through RAM or the CPU.
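
A minimal sketch of a transferTo-based copy, again using the paths from your code (the loop matters because transferTo may move fewer bytes than you ask for in a single call; the class name is just illustrative):

import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.nio.channels.FileChannel;

public class ChannelTransferCopy {
    public static void main(String[] args) throws Exception {
        try (FileChannel source = new FileInputStream("/home/developer/test.iso").getChannel();
             FileChannel target = new FileOutputStream("/home/developer/test2").getChannel()) {
            long position = 0;
            long size = source.size();
            // transferTo may transfer less than requested, so keep going until the whole file is done
            while (position < size) {
                position += source.transferTo(position, size - position, target);
            }
        }
    }
}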

The web app I spend my days and nights working on is very I/O heavy. I've done micro-benchmarks and real-world benchmarks too, and the results are up on my blog.

Use production data and environments

Micro-benchmarks are prone to distortion. If you can, make the effort to gather data from exactly what you plan to do, with the load you expect, on the hardware you expect.

My benchmarks are solid and reliable because they took place on a production system, a beefy system, a system under load, gathered from logs. Not my notebook's 7200 RPM 2.5" SATA drive while I watched intently as the JVM worked my hard disk.

What are you running on? It matters.

