Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
1.5k views
in Technique[技术] by (71.8m points)

json - UnicodeDecodeError: 'utf8' codec can't decode byte 0x80 in position 3131: invalid start byte

I am trying to read twitter data from json file using python 2.7.12.

Code I used is such:

    import json
    import sys
    reload(sys)
    sys.setdefaultencoding('utf-8')

    def get_tweets_from_file(file_name):
        tweets = []
        with open(file_name, 'rw') as twitter_file:
            for line in twitter_file:
                if line != '
':
                    line = line.encode('ascii', 'ignore')
                    tweet = json.loads(line)
                    if u'info' not in tweet.keys():
                        tweets.append(tweet)
    return tweets

Result I got:

    Traceback (most recent call last):
      File "twitter_project.py", line 100, in <module>
        main()                  
      File "twitter_project.py", line 95, in main
        tweets = get_tweets_from_dir(src_dir, dest_dir)
      File "twitter_project.py", line 59, in get_tweets_from_dir
        new_tweets = get_tweets_from_file(file_name)
      File "twitter_project.py", line 71, in get_tweets_from_file
        line = line.encode('ascii', 'ignore')
    UnicodeDecodeError: 'utf8' codec can't decode byte 0x80 in position 3131: invalid start byte

I went through all the answers from similar issues and came up with this code and it worked last time. I have no clue why it isn't working now...I would appreciate any help!

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

In my case(mac os), there was .DS_store file in my data folder which was a hidden and auto generated file and it caused the issue. I was able to fix the problem after removing it.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...