Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


java - How can I make Notepad save text in UTF-8 without the BOM?

I have a CSV file containing accented characters, and I save it in Notepad by selecting the UTF-8 encoding. When I read the file in Java, the BOM bytes at the start are read as well.

So I want Notepad to save this file as UTF-8 without prepending a BOM.

Otherwise, is there a built-in class in Java that strips the BOM at the beginning of a file when reading its contents?



1 Reply

  1. Use Notepad++. It is free and much better than Notepad, and it can save text without a BOM via Encoding -> Encode in UTF-8 without BOM:

    Notepad++ v6 and older: [Screenshot: Notepad++ menu bar -> Encoding -> Encode in UTF-8 without BOM, shown in v6.7.9.2]

    Notepad++ v7+: [Screenshot: Notepad++ menu bar -> Encoding -> Encode in UTF-8 without BOM, shown in v7+]

  2. When I encountered this problem in Java, I didn't find any library to parse these first three bytes (BOM). So my advice:

    • Wrap the input in PushbackInputStream(in, 3).
    • Read the first three bytes.
    • If they are not the BOM (EF BB BF), push them back.
    • Process the rest of the stream as UTF-8.
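
The steps above could be sketched roughly as follows. This is only an illustration, not a definitive implementation: the class name BomSkipper and the methods skipUtf8Bom and readAll are hypothetical names made up for this example, and readAll assumes Java 9 or later because it relies on InputStream.readAllBytes().

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper for illustration; not part of the JDK.
final class BomSkipper {
    // The UTF-8 byte order mark: EF BB BF.
    private static final byte[] UTF8_BOM = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF};

    // Returns a stream positioned after the UTF-8 BOM if one is present,
    // or at the very start of the data if it is not.
    static InputStream skipUtf8Bom(InputStream in) throws IOException {
        PushbackInputStream pushback = new PushbackInputStream(in, UTF8_BOM.length);
        byte[] head = new byte[UTF8_BOM.length];

        // read() may return fewer bytes than requested, so loop until
        // three bytes have arrived or the stream ends.
        int total = 0;
        while (total < head.length) {
            int n = pushback.read(head, total, head.length - total);
            if (n < 0) {
                break;
            }
            total += n;
        }

        boolean isBom = total == UTF8_BOM.length && Arrays.equals(head, UTF8_BOM);
        if (!isBom && total > 0) {
            // Not a BOM: push back exactly the bytes that were read.
            pushback.unread(head, 0, total);
        }
        return pushback;
    }

    // Convenience method (assumes Java 9+): reads everything as UTF-8 text.
    static String readAll(InputStream in) throws IOException {
        return new String(skipUtf8Bom(in).readAllBytes(), StandardCharsets.UTF_8);
    }
}
```

For a CSV file, the returned stream can be wrapped in new InputStreamReader(stream, StandardCharsets.UTF_8) and handed to whatever parser is already in use. Files shorter than three bytes are handled too, since only the bytes actually read are pushed back.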
