I am trying to convert a double-byte character sequence (DBCS) in CP936 to wchar_t
using a C++ locale. This is the code:
#include <cerrno>    // errno
#include <cstring>   // std::memset
#include <iostream>
#include <locale>
#include <codecvt>

// 国 in CP936
char const src[] = "\xB9\xFA";

int main()
{
    std::locale loc(".936");
    typedef std::codecvt<wchar_t, char, std::mbstate_t> codecvt_type;
    codecvt_type const & cvt = std::use_facet<codecvt_type>(loc);
    std::mbstate_t state;
    std::memset(&state, 0, sizeof(state));
    char const * src_mid = src;
    wchar_t buf[10];
    wchar_t * buf_mid = buf;
    // Convert the two source bytes into at most 10 wide characters.
    std::codecvt_base::result res = cvt.in(state,
        src, src + 2, src_mid,
        buf, buf + 10, buf_mid);
    int eno = errno;
    std::cout << "res: " << +res << "\n"
              << "errno: " << eno << "\n";
    return 0;
}
Now, the conversion always ends with the result error and errno set to 42, which is EILSEQ. I have debugged the code and I think I can see what goes wrong, but I do not understand why.
What goes wrong is that the code that ultimately leads to the call to MultiByteToWideChar() has a conditional like this:
if ( ploc->_Isleadbyte[ch >> 3] & (1 << (ch & 7)) )
This branch is never taken, even though the source string, AFAIK, contains a correct lead byte and trail byte. I have checked the _Isleadbyte array in the debugger and it is all zeroes. So the branch that sets the input length to 2 is never taken; the one that sets the length to 1 is taken instead, and MultiByteToWideChar() then fails because a lead byte has to be accompanied by a trail byte.
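To make the bit test in that conditional concrete, here is a minimal sketch of a 256-bit lead-byte table, one bit per byte value. The names lead_byte_table, make_cp936_table and char_length are hypothetical and are not the CRT's own; the lead-byte range 0x81..0xFE for CP936 is my assumption about the code page, not something taken from the Visual Studio sources.

```cpp
// Hypothetical stand-in for the CRT's _Isleadbyte bitmap:
// 32 bytes * 8 bits = one flag per possible byte value.
struct lead_byte_table
{
    unsigned char bits[32] = {};

    // Set the flag for byte value ch.
    void mark(unsigned char ch)
    {
        bits[ch >> 3] |= static_cast<unsigned char>(1 << (ch & 7));
    }

    // Same test as in the conditional: pick the byte, then the bit.
    bool is_lead(unsigned char ch) const
    {
        return (bits[ch >> 3] & (1 << (ch & 7))) != 0;
    }
};

// Assumption: CP936 lead bytes span 0x81..0xFE.
lead_byte_table make_cp936_table()
{
    lead_byte_table t;
    for (unsigned ch = 0x81; ch <= 0xFE; ++ch)
        t.mark(static_cast<unsigned char>(ch));
    return t;
}

// Number of input bytes the next character is assumed to occupy.
int char_length(lead_byte_table const & t, unsigned char ch)
{
    return t.is_lead(ch) ? 2 : 1;
}
```

With a default-constructed (all-zero) table, char_length(table, 0xB9) yields 1, which matches the behaviour I see in the debugger; with the table from make_cp936_table() it yields 2.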
I have even checked that C_936.NLS is present in C:\Windows\System32, so that should not be the problem.
So, I guess the question is: Is this an issue on my end (the test code, the Windows OS setup, missing components)? Or is this an issue in the Visual Studio 2015 code?
UPDATE
So I have incidentally stumbled upon this question: Shift-JIS decoding fails using wifstrem in Visual C++ 2013
The OP's own answer shows a workaround:
const int oldMbcp = _getmbcp();
_setmbcp(932);
const std::locale locale("Japanese_Japan.932");
_setmbcp(oldMbcp);
The same workaround seems to work for the CP936 that I am trying to use.
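For reference, this is roughly how the workaround might be wrapped for an arbitrary code page. It is only a sketch: make_locale_with_mbcp is a hypothetical helper name, the ".936" locale name is the one from my test code above, and I am assuming _getmbcp() and _setmbcp() come from <mbctype.h> in the Microsoft CRT. On other platforms the helper just constructs the locale.

```cpp
#include <locale>
#ifdef _WIN32
#include <mbctype.h>   // _getmbcp, _setmbcp (Microsoft CRT only)
#endif

// Hypothetical helper: construct a locale while the CRT multibyte
// code page is temporarily switched to `codepage`.
std::locale make_locale_with_mbcp(char const * name, int codepage)
{
#ifdef _WIN32
    // Restores the previous multibyte code page on scope exit,
    // including when std::locale's constructor throws.
    struct mbcp_guard
    {
        int old_mbcp;
        explicit mbcp_guard(int cp) : old_mbcp(_getmbcp()) { _setmbcp(cp); }
        ~mbcp_guard() { _setmbcp(old_mbcp); }
    } guard(codepage);
    return std::locale(name);
#else
    (void)codepage;   // no multibyte code page setting outside the Microsoft CRT
    return std::locale(name);
#endif
}
```

In the test program above, the locale would then be obtained with std::locale loc = make_locale_with_mbcp(".936", 936); instead of constructing it directly.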
UPDATE 2
I filed a bug report with Microsoft.