Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

C# - Reading Unicode from the console

I am trying to read a Unicode string from the console in C#. For the sake of example, let's use this one:

c:\SVN\D³ebugger\src\виталик\Program.cs

At first I simply called Console.ReadLine(), which returned c:\SVN\D³ebugger\src\???????\Program.cs

I then set Console.InputEncoding to UTF-8 with Console.InputEncoding = Encoding.UTF8, but that also returned c:\SVN\D³ebugger\src\???????\Program.cs, mucking up the Cyrillic part of the string.

Stumbling around at random, I then tried Console.InputEncoding = Encoding.GetEncoding(1251); which returned c:\SVN\D?ebugger\src\виталик\Program.cs, this time corrupting the ³ character.

At this point it seems that by switching encodings for the InputStream I can only get a single language at a time.
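
The "one language at a time" effect can be reproduced without a console. The sketch below is only an illustration, not the asker's code: it round-trips a shortened sample through a single-byte code page and through UTF-16. ISO-8859-1 (code page 28591) is used as a stand-in for a Western console code page, which is an assumption; the question does not say which code page the console was using. The names EncodingDemo, RoundTrip and Latin1WithQuestionMarks are hypothetical.

```csharp
using System;
using System.Text;

static class EncodingDemo
{
    // Encode the text with the given encoding, then decode the bytes again.
    public static string RoundTrip(Encoding encoding, string text)
    {
        return encoding.GetString(encoding.GetBytes(text));
    }

    // ISO-8859-1 as a stand-in single-byte code page (assumption),
    // configured so that characters it cannot represent become '?'.
    public static Encoding Latin1WithQuestionMarks()
    {
        return Encoding.GetEncoding(28591,
            new EncoderReplacementFallback("?"),
            new DecoderReplacementFallback("?"));
    }

    static void Main()
    {
        // "D³ виталик", written with escapes so the source file encoding does not matter.
        string sample = "D\u00B3 \u0432\u0438\u0442\u0430\u043B\u0438\u043A";

        // The single-byte code page keeps the superscript three but replaces
        // each of the seven Cyrillic letters with '?'.
        string viaLatin1 = RoundTrip(Latin1WithQuestionMarks(), sample);
        Console.WriteLine(viaLatin1 == "D\u00B3 ???????");

        // UTF-16 can represent every character, so the string survives unchanged.
        string viaUtf16 = RoundTrip(Encoding.Unicode, sample);
        Console.WriteLine(viaUtf16 == sample);
    }
}
```

Code page 1251 behaves the other way around for this sample: it contains the Cyrillic letters but has no ³, which matches the third result above.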

I've also tried going native and doing something like this:

// Code
public static string ReadLine()
{
    const uint nNumberOfCharsToRead = 1024;
    StringBuilder buffer = new StringBuilder();

    uint charsRead = 0;
    bool result = ReadConsoleW(GetStdHandle(STD_INPUT_HANDLE), buffer, nNumberOfCharsToRead, out charsRead, IntPtr.Zero);

    // Return the input minus the newline character
    if (result && charsRead > 1) return buffer.ToString(0, (int)charsRead - 1);
    return string.Empty;
}

// Extern definitions (require System.Runtime.InteropServices and System.Text)

// Documented Win32 value for the standard input handle
const int STD_INPUT_HANDLE = -10;

[DllImport("Kernel32.DLL", ExactSpelling = true)]
internal static extern IntPtr GetStdHandle(int nStdHandle);

[DllImport("kernel32.dll", CharSet = CharSet.Unicode, ExactSpelling = true)]
static extern bool ReadConsoleW(IntPtr hConsoleInput, [Out] StringBuilder lpBuffer,
    uint nNumberOfCharsToRead, out uint lpNumberOfCharsRead, IntPtr lpReserved);

That worked fine for non-Unicode strings; however, when I tried to make it read my sample string, the application crashed. I told Visual Studio to break on ALL exceptions (including native ones), yet the application still crashed.

I also found this open bug on Microsoft Connect, which seems to say that it is currently impossible to read Unicode from the console's input stream.

It is worth noting, even though not strictly related to my question, that Console.WriteLine is able to print this string just fine, if Console.OutputEncoding is set to UTF8.

Thank you!

Update 1

I am looking for a solution for .NET 3.5

Update 2

Updated with the full native code I've used.

1 Reply

This seems to work fine when targeting the .NET 4 Client Profile, but unfortunately not when targeting the .NET 3.5 Client Profile. Make sure you change the console font to Lucida Console, since the default raster font cannot display these characters.
As @jcl pointed out, this works even though I targeted .NET 4 only because I have .NET 4.5 installed (.NET 4.5 is an in-place update that replaces the .NET 4 runtime).

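using System;
using System.Diagnostics;
using System.Text;

class Program
{
    private static void Main(string[] args)
    {
        // Switch both console streams to UTF-16 so non-ASCII text survives.
        Console.InputEncoding = Encoding.Unicode;
        Console.OutputEncoding = Encoding.Unicode;

        string s;
        // ReadLine returns null at end of input, which ends the loop.
        while ((s = Console.ReadLine()) != null)
        {
            if (s.Length > 0)
            {
                // Echo to the debugger output and back to the console.
                Debug.WriteLine(s);
                Console.WriteLine(s);
            }
        }
    }
}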
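
For .NET 3.5, where the approach above reportedly does not help, the native ReadConsoleW route from the question may still be worth another look. The following is a sketch, not a tested fix: it assumes the crash came from the StringBuilder being created with its default capacity (16 characters) while the call allowed up to 1024 characters to be written, so any input longer than 16 characters could overrun the buffer regardless of script. The question does not confirm that diagnosis. Here the buffer is a char array sized to match the requested count, and the trailing carriage return and line feed are both stripped. NativeConsole and StripLineEnding are hypothetical names.

```csharp
using System;
using System.Runtime.InteropServices;

internal static class NativeConsole
{
    // Documented Win32 value for the standard input handle.
    private const int STD_INPUT_HANDLE = -10;

    [DllImport("kernel32.dll", SetLastError = true)]
    private static extern IntPtr GetStdHandle(int nStdHandle);

    [DllImport("kernel32.dll", CharSet = CharSet.Unicode, ExactSpelling = true, SetLastError = true)]
    [return: MarshalAs(UnmanagedType.Bool)]
    private static extern bool ReadConsoleW(IntPtr hConsoleInput, [Out] char[] lpBuffer,
        uint nNumberOfCharsToRead, out uint lpNumberOfCharsRead, IntPtr lpReserved);

    // Reads one chunk of console input as UTF-16. Windows only; the call
    // fails (and this returns an empty string) when input is redirected.
    // Lines longer than the buffer are returned in pieces by ReadConsoleW.
    public static string ReadLine()
    {
        char[] buffer = new char[1024];
        uint charsRead;

        // Never ask for more characters than the buffer can actually hold.
        bool ok = ReadConsoleW(GetStdHandle(STD_INPUT_HANDLE), buffer,
            (uint)buffer.Length, out charsRead, IntPtr.Zero);

        if (!ok) return string.Empty;
        return StripLineEnding(new string(buffer, 0, (int)charsRead));
    }

    // Console input ends with "\r\n"; remove both characters.
    public static string StripLineEnding(string raw)
    {
        return raw.TrimEnd('\r', '\n');
    }
}
```

Calling NativeConsole.ReadLine() in place of Console.ReadLine() is the intended usage; Console.OutputEncoding still has to be set for the result to print correctly.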
