Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
226 views
in Technique[技术] by (71.8m points)

Passing command line unicode argument to Java code

I have to pass command line argument which is Japanese to Java main method. If I type Unicode characters on command-line window, it displays '?????' which is OK, but the value passed to java program is also '?????'. How do I get the correct value of argument passed by the command window? Below is sample program which writes to a file the value supplied by command line argument.

public static void main(String[] args) {
        String input = args[0];
        try {
            String filePath = "C:/Temp/abc.txt";
            File file = new File(filePath);
            OutputStream out = new FileOutputStream(file);
            byte buf[] = new byte[1024];
            int len;
            InputStream is = new ByteArrayInputStream(input.getBytes());
            while ((len = is.read(buf)) > 0) {
                out.write(buf, 0, len);
            }
            out.close();
            is.close();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
Question&Answers:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

Unfortunately you cannot reliably use non-ASCII characters with command-line apps that use the Windows C runtime's stdlib, like Java (and pretty much all non-Windows-specific scripting languages really).

This is because they read their input and output using a locale-specific code page by default, which is never a UTF, unlike every other modern OS which uses UTF-8.

Whilst you can change the code page of a terminal to something else using the chcp command, the support for the UTF-8 encoding under chcp 65001 is broken in a few ways that are likely to trip apps up fatally.

If you only need Japanese you could switch to code page 932 (similar to Shift-JIS) by setting your locale (‘language for non-Unicode applications’ in the Regional settings) to Japan. This will still fail for characters that aren't in that code page though.

If you need to get non-ASCII characters through the command line reliably on Windows, you need to call the Win32 API function GetCommandLineW directly to avoid the encode-to-system-code-page layer. Probably you'd want to do that using JNA.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...