Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


c++ - utfcpp and Win32 wide API

Is it good/safe/possible to use the tiny utfcpp library for converting everything I get back from the wide Windows API (FindFirstFileW and such) to a valid UTF8 representation using utf16to8?

I would like to use UTF-8 internally, but I am having trouble getting the correct output (via wcout after another conversion, or plain cout). Plain ASCII characters work of course, but non-ASCII characters get messed up.

Or is there an easier alternative?

Thanks!

UPDATE: Thanks to Hans (below), I now have an easy UTF-8 <-> UTF-16 conversion through the Windows API. Conversion works both ways, but the UTF-8 string produced from UTF-16 has some extra characters that might cause me trouble later on. I'll share it here out of pure friendliness :)

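#include <windows.h>
#include <stdexcept>
#include <string>

// UTF-16 -> UTF-8 conversion
std::string toUTF8( const std::wstring &input )
{
    // The API takes int lengths, so narrow size() explicitly.
    const int inputLength = static_cast<int>( input.size() );

    // First call: ask how many bytes the UTF-8 result needs.
    // dwFlags is a DWORD, so pass 0 rather than NULL.
    int length = WideCharToMultiByte( CP_UTF8, 0,
                                      input.c_str(), inputLength,
                                      NULL, 0,
                                      NULL, NULL );
    if( length <= 0 )
        return std::string();   // empty input or failed size query

    std::string result;
    result.resize( length );

    // Second call: convert into the buffer sized above.
    if( WideCharToMultiByte( CP_UTF8, 0,
                             input.c_str(), inputLength,
                             &result[0], length,
                             NULL, NULL ) > 0 )
        return result;

    throw std::runtime_error( "Failure to execute toUTF8: conversion failed." );
}

// UTF-8 -> UTF-16 conversion
std::wstring toUTF16( const std::string &input )
{
    const int inputLength = static_cast<int>( input.size() );

    // First call: ask how many UTF-16 code units the result needs.
    int length = MultiByteToWideChar( CP_UTF8, 0,
                                      input.c_str(), inputLength,
                                      NULL, 0 );
    if( length <= 0 )
        return std::wstring();  // empty input or failed size query

    std::wstring result;
    result.resize( length );

    // Second call: convert into the buffer sized above.
    if( MultiByteToWideChar( CP_UTF8, 0,
                             input.c_str(), inputLength,
                             &result[0], length ) > 0 )
        return result;

    throw std::runtime_error( "Failure to execute toUTF16: conversion failed." );
}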

1 Reply


The Win32 API already has a function for this: WideCharToMultiByte() with CodePage = CP_UTF8. That saves you from having to rely on another library.

You cannot normally use the result with wcout. Its output goes to the console, which uses an 8-bit OEM encoding for legacy reasons. You can change the console's output code page with SetConsoleOutputCP() (SetConsoleCP() only changes the input code page); 65001 is the code page for UTF-8 (CP_UTF8).
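
As a rough sketch of the code page switch: the helper names enableUtf8ConsoleOutput and printUtf8Line are made up for illustration, and the non-Windows branch simply assumes the terminal already expects UTF-8. It uses SetConsoleOutputCP(), the Win32 call that sets the console's output code page, and writes the UTF-8 bytes unchanged through stdio instead of going through wcout. Whether the characters then display correctly still depends on the console font.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>
#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper: switch the console's output code page to UTF-8.
// On non-Windows builds this is a no-op that reports success, on the
// assumption that the terminal already expects UTF-8.
bool enableUtf8ConsoleOutput()
{
#ifdef _WIN32
    return SetConsoleOutputCP( CP_UTF8 ) != 0;
#else
    return true;
#endif
}

// Hypothetical helper: write the raw bytes of a UTF-8 string plus a
// newline to stdout. Returns the number of bytes written.
std::size_t printUtf8Line( const std::string &utf8 )
{
    std::size_t written = std::fwrite( utf8.data(), 1, utf8.size(), stdout );
    written += std::fwrite( "\n", 1, 1, stdout );
    std::fflush( stdout );
    return written;
}
```

A typical call sequence would be enableUtf8ConsoleOutput() once at startup, then printUtf8Line( toUTF8( name ) ) for each string obtained from the wide API.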

Your next stumbling block would be the font that's used for the console. You'll have to change it, but finding a font that's fixed-pitch and has a full set of glyphs to cover Unicode is going to be difficult. You'll know you have a font problem when you get empty rectangles in the output; question marks indicate encoding problems.
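
One way to take the console out of the picture while debugging is to look at the converted bytes directly. The sketch below is only an illustration (hexDump is a made-up helper name): it renders a byte string as hex so the result can be compared against the expected UTF-8 encoding. For example, U+00F1 (ñ) should come out as the two bytes c3 b1. If the bytes are right and the console still shows question marks or rectangles, the problem is on the output side, not in the conversion.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: render each byte of a string as two lowercase
// hex digits, separated by single spaces.
std::string hexDump( const std::string &bytes )
{
    std::string out;
    char buf[4];
    for( std::size_t i = 0; i < bytes.size(); ++i )
    {
        // Go through unsigned char so bytes >= 0x80 are not sign-extended.
        const unsigned value = static_cast<unsigned char>( bytes[i] );
        std::snprintf( buf, sizeof buf, "%02x", value );
        if( i != 0 )
            out += ' ';
        out += buf;
    }
    return out;
}
```

Passing the output of a conversion routine such as the question's toUTF8() to hexDump() also makes stray extra bytes, like an embedded terminating zero, easy to spot.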

