Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
472 views
in Technique[技术] by (71.8m points)

node.js - Module request how to properly retrieve accented characters? ? ? ?

I'm using: Module: Request -- Simplified HTTP request method to scrape a webpage with accented characters á é ó ú ê ? etc.

I've already tried encoding: utf-8 with no success. I'm still getting this ??? characters in the result.

request.get({
    uri: url,
    encoding: 'utf-8'
    // ...

Is there any configuration to fix it?

I don't know if it is an issue, but I filled one for this module. No answers yet. :/

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

Since binary is deprecated it seems like a better idea to use iconv and correctly handle the decoding:

var request = require("request"), iconv  = require('iconv-lite');
var requestOptions  = { encoding: null, method: "GET", uri: "http://something.com"};

request(requestOptions, function(error, response, body) {
    var utf8String = iconv.decode(new Buffer(body), "ISO-8859-1");
    console.log(utf8String);
});

The important part is to set the encoding on the HTTP request to be null encoding: null.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

1.4m articles

1.4m replys

5 comments

57.0k users

...