Issue
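- Use json.dump (or json.dumps) to serialize the output before writing it to the file.
- Building the string with Python string formatting, i.e. '{"%s": "%s"}' % (review_id, movies_data[review_id]), causes the problems you described: the nested dictionary is inserted as its Python repr rather than as JSON.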
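A minimal sketch of one way the formatting approach goes wrong. The sample record below is a hypothetical, abbreviated stand-in for one entry of movies_data; the exact symptom you saw may differ.

```python
import json

# Hypothetical, abbreviated record shaped like one review from the input file
review_id = "review_4"
review = {"tokens": ["Bad", "movie"]}

# String formatting inserts str(review), i.e. the Python repr with single
# quotes, wrapped in double quotes. The nested data becomes one long string.
formatted = '{"%s": "%s"}' % (review_id, review)

# json.dumps serializes the same data as a proper nested JSON object
dumped = json.dumps({review_id: review})

from_formatted = json.loads(formatted)[review_id]
from_dumped = json.loads(dumped)[review_id]

print(type(from_formatted).__name__)  # the value comes back as a string
print(type(from_dumped).__name__)     # the value comes back as a dict
```

If a token ever contains a double quote, the formatted string stops being parseable at all, whereas json.dumps escapes it.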
Code
import json

# movies_data and test_IDS are assumed to be defined as in the question
train, test = {}, {}  # Dictionaries for storing training and test data
for review_id, review in movies_data.items():
    if review_id in test_IDS:
        test[review_id] = review
    else:
        train[review_id] = review

# Output test data
with open("test_movies.json", "w") as outfile_test:
    json.dump(test, outfile_test)

# Output training data
with open("train_movies.json", "w") as outfile_train:
    json.dump(train, outfile_train)
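By default json.dump writes everything on a single line. If you would rather have the files laid out over several lines, the indent parameter (accepted by both json.dump and json.dumps) is one option. A small sketch, using a hypothetical abbreviated record:

```python
import json

# Hypothetical, abbreviated stand-in for the test dictionary built above
test = {"review_4": {"tokens": ["Bad", "movie"]}}

# indent=2 pretty-prints with two spaces per nesting level;
# json.dump(test, outfile_test, indent=2) would write the same text to a file
pretty = json.dumps(test, indent=2)
print(pretty)
```

The data is identical either way; only the whitespace in the file changes.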
Results
Input: File Contents of test.json
{ "review_1": {"tokens": ["Best", "show", "ever", "!"],
"movie_user_4": {"aspects": ["O", "B_A", "O", "O"], "sentiments": ["B_S", "O", "O", "O"]},
"movie_user_6": {"aspects": ["O", "B_A", "O", "O"], "sentiments": ["B_S", "O", "O", "O"]}},
"review_2": {"tokens": ["Its", "a", "great", "show"],
"movie_user_1": {"aspects": ["O", "O", "O", "B_A"], "sentiments": ["O", "O", "B_S", "O"]},
"movie_user_6": {"aspects": ["O", "O", "O", "B_A"], "sentiments": ["O", "O", "B_S", "O"]}},
"review_3": {"tokens": ["I", "love", "this", "actor", "!"],
"movie_user_17": {"aspects": ["O", "O", "O", "B_A", "O"], "sentiments": ["O", "B_S", "O", "O", "O"]},
"movie_user_23": {"aspects": ["O", "O", "O", "B_A", "O"], "sentiments": ["O", "B_S", "O", "O", "O"]}},
"review_4": {"tokens": ["Bad", "movie"],
"movie_user_1": {"aspects": ["O", "B_A"], "sentiments": ["B_S", "O"]},
"movie_user_6": {"aspects": ["O", "B_A"], "sentiments": ["B_S", "O"]}}
}
Output: File Contents of test_movies.json
{"review_2": {"tokens": ["Its", "a", "great", "show"], "movie_user_1": {"aspects": ["O", "O", "O", "B_A"], "sentiments": ["O", "O", "B_S", "O"]}, "movie_user_6": {"aspects": ["O", "O", "O", "B_A"], "sentiments": ["O", "O", "B_S", "O"]}}, "review_4": {"tokens": ["Bad", "movie"], "movie_user_1": {"aspects": ["O", "B_A"], "sentiments": ["B_S", "O"]}, "movie_user_6": {"aspects": ["O", "B_A"], "sentiments": ["B_S", "O"]}}}
Output: File Contents of train_movies.json
{"review_1": {"tokens": ["Best", "show", "ever", "!"], "movie_user_4": {"aspects": ["O", "B_A", "O", "O"], "sentiments": ["B_S", "O", "O", "O"]}, "movie_user_6": {"aspects": ["O", "B_A", "O", "O"], "sentiments": ["B_S", "O", "O", "O"]}}, "review_3": {"tokens": ["I", "love", "this", "actor", "!"], "movie_user_17": {"aspects": ["O", "O", "O", "B_A", "O"], "sentiments": ["O", "B_S", "O", "O", "O"]}, "movie_user_23": {"aspects": ["O", "O", "O", "B_A", "O"], "sentiments": ["O", "B_S", "O", "O", "O"]}}}
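To check a split like the one above, the files can be read back with json.load and compared against the original data. The sketch below is only an illustration: it uses a hypothetical, abbreviated copy of the input (tokens only), assumes the test IDs are review_2 and review_4 as the output suggests, and writes to a temporary directory. The helper names write_split and load are made up for this example.

```python
import json
import os
import tempfile

# Abbreviated, hypothetical copy of the input (only the "tokens" field is kept)
movies_data = {
    "review_1": {"tokens": ["Best", "show", "ever", "!"]},
    "review_2": {"tokens": ["Its", "a", "great", "show"]},
    "review_3": {"tokens": ["I", "love", "this", "actor", "!"]},
    "review_4": {"tokens": ["Bad", "movie"]},
}
test_IDS = {"review_2", "review_4"}  # assumed from the output shown above


def write_split(data, test_ids, out_dir):
    """Split data by ID, write both parts as JSON, and return the two paths."""
    test = {rid: rev for rid, rev in data.items() if rid in test_ids}
    train = {rid: rev for rid, rev in data.items() if rid not in test_ids}
    test_path = os.path.join(out_dir, "test_movies.json")
    train_path = os.path.join(out_dir, "train_movies.json")
    with open(test_path, "w") as f:
        json.dump(test, f)
    with open(train_path, "w") as f:
        json.dump(train, f)
    return test_path, train_path


def load(path):
    """Read one JSON file back into Python objects."""
    with open(path) as f:
        return json.load(f)


with tempfile.TemporaryDirectory() as tmp:
    test_path, train_path = write_split(movies_data, test_IDS, tmp)
    reloaded_test = load(test_path)
    reloaded_train = load(train_path)

# The two files together should contain every review exactly once
assert set(reloaded_test) == test_IDS
assert {**reloaded_test, **reloaded_train} == movies_data
```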