i18n fails

Visual Studio (MSVC) Text Encoding

Microsoft can never escape the historical burden of code pages.

Vendor: Microsoft
From: English (United States)
To: English (United States)
Visual Studio C++ source code editor:

Source code:

#include <stdio.h>
int main()
{
	printf("Hello world!🙂\n");
	return 0;
}

Text encoding: UTF-8

Execution result:

Hello world!🙂

ConsoleApplication1.exe (process 72744) exited with code 0 (0x0).
Press any key to close this window . . .

Visual Studio C++ source code editor:

#include <stdio.h>
int main()
{
  printf("これはテストプログラムです。\n");
}

Text encoding: UTF-8

Compiler warning:
FileName.cpp(1,1): warning C4819: ファイルは、現在のコードページ (932) で表示できない文字を含んでいます。データの損失を防ぐために、ファイルを Unicode 形式で保存してください。

English translation: warning C4819: The file contains a character that cannot be represented in the current code page (932). Save the file in Unicode format to prevent data loss.

Execution result: 縺薙l縺ッ繝�繧ケ繝医�励Ο繧ー繝ゥ繝�縺ァ縺吶��

In Visual Studio 2026, newly created C++ source files are saved as UTF-8 without a BOM by default. However, the default compiler (MSVC) configuration in a new project reads a source file either in the current OS code page or, when the file starts with a BOM, as UTF-8. Because of this mismatch between the two defaults, new source files are saved as UTF-8 but processed by the compiler in the current OS code page.
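
One way to surface this mismatch at build time is a static assertion on a `u8` string literal. The following is a minimal sketch, assuming a compiler with C++11 `u8` literals; the helper name `source_was_decoded_as_utf8` is hypothetical and not part of any library. A `u8` literal is always encoded as UTF-8 in the binary, so if the compiler misreads the file's two UTF-8 bytes for "é" as two separate code-page characters, the literal grows and the assertion fails. Building with the MSVC `/utf-8` option, or saving the file as UTF-8 with a BOM, should make it pass.

```cpp
// Compile-time guard: the build fails if the compiler did not decode this
// source file as UTF-8. The literal must stay a raw non-ASCII character
// (not a \u escape), otherwise the check proves nothing.
//
// Decoded as UTF-8:        "é" is 1 character  -> 2 bytes + NUL = 3
// Decoded as Windows-1252: "é" is 2 characters -> 4 bytes + NUL = 5
static_assert(sizeof(u8"é") == 3,
              "source file was not decoded as UTF-8; "
              "build with /utf-8 or save the file with a BOM");

// Hypothetical helper exposing the same fact to other code.
constexpr bool source_was_decoded_as_utf8() {
    return sizeof(u8"é") == 3;
}
```

Placing the assertion in a header that every translation unit includes turns a silent misinterpretation into a build error, independent of the OS code page of the build machine.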

In the first example, the compiler parses the UTF-8 source as Windows-1252. Windows-1252 is a single-byte encoding in which almost every byte value maps to a character, so most UTF-8 byte sequences also decode as Windows-1252 text and pass compiler validation silently. In the second example, the compiler parses the source as CP 932 (Shift-JIS), a multi-byte encoding with stricter rules for valid byte sequences, so some of the UTF-8 bytes are invalid and trigger the compiler warning.
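
The difference between the two code pages can be sketched at the byte level. The code below is an illustration under simplifying assumptions, not the validation MSVC actually performs: the Shift-JIS check covers only the lead/trail byte structure plus the assumption that lead bytes 0x85 and 0x86 have no assigned characters, and it is not the full CP 932 table. All function and constant names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Windows-1252 leaves only five byte values undefined; every other byte
// maps to exactly one character.
constexpr bool cp1252_defines(unsigned char b) {
    return b != 0x81 && b != 0x8D && b != 0x8F && b != 0x90 && b != 0x9D;
}

// Count the bytes that Windows-1252 leaves undefined.
inline std::size_t count_cp1252_undefined(const unsigned char* s, std::size_t n) {
    assert(s != nullptr || n == 0);
    std::size_t count = 0;
    for (std::size_t i = 0; i < n; ++i)
        if (!cp1252_defines(s[i])) ++count;
    return count;
}

// Simplified Shift-JIS byte classes (an approximation of CP 932).
constexpr bool sjis_is_single(unsigned char b) {   // ASCII or half-width katakana
    return b <= 0x7F || (b >= 0xA1 && b <= 0xDF);
}
constexpr bool sjis_is_lead(unsigned char b) {
    return (b >= 0x81 && b <= 0x9F) || (b >= 0xE0 && b <= 0xFC);
}
constexpr bool sjis_is_trail(unsigned char b) {
    return (b >= 0x40 && b <= 0x7E) || (b >= 0x80 && b <= 0xFC);
}
// Assumption: no characters are assigned under lead bytes 0x85 and 0x86.
constexpr bool sjis_lead_unassigned(unsigned char b) {
    return b == 0x85 || b == 0x86;
}

// Offset of the first byte that breaks the simplified rules, or -1 if none.
inline long first_sjis_problem(const unsigned char* s, std::size_t n) {
    assert(s != nullptr || n == 0);
    std::size_t i = 0;
    while (i < n) {
        if (sjis_is_single(s[i])) { i += 1; continue; }
        if (sjis_is_lead(s[i]) && !sjis_lead_unassigned(s[i]) &&
            i + 1 < n && sjis_is_trail(s[i + 1])) { i += 2; continue; }
        return static_cast<long>(i);
    }
    return -1;
}

// UTF-8 bytes of U+1F642 (the emoji in the first example) and of the two
// katakana U+30C6 U+30B9 taken from the second example.
const unsigned char kEmojiUtf8[] = {0xF0, 0x9F, 0x99, 0x82};
const unsigned char kTeSuUtf8[]  = {0xE3, 0x83, 0x86, 0xE3, 0x82, 0xB9};
```

Under these rules the emoji bytes contain no value that Windows-1252 leaves undefined, while the katakana bytes break at offset 2: 0xE3 0x83 pairs up as one double-byte character, leaving 0x86 as a lead byte with nothing assigned to it. That position matches the first replacement character in the garbled execution result.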