When retrieving posts from an RSS feed and caching or saving them in a database, how can I determine that:
- an item is the same post as one already stored (for example, when typos are fixed in the feed, or the title or date changes)
- items from different feeds cover the same topic (for example, the same story from different sources)
Are there any best practices for these things?
Thanks a lot.
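
To make the two cases concrete, here is a rough Python sketch of one possible approach, not an established best practice. It assumes each entry is a plain dict with optional "id" (the feed's guid), "link" and "title" keys; the helper names post_key, shingles and jaccard are made up for illustration. The first case is handled by a stable key that ignores the date and, when a guid or link exists, the title. The second is approximated by Jaccard similarity over word shingles, where any cut-off for "same story" would have to be tuned on real data.

```python
import hashlib
import re


def post_key(entry: dict) -> str:
    """Return a stable identity for a feed entry.

    Prefers the feed-supplied id (guid), then the link, and only falls
    back to a hash of the normalized title when both are missing.
    """
    for field in ("id", "link"):
        value = (entry.get(field) or "").strip()
        if value:
            return f"{field}:{value}"
    title = (entry.get("title") or "").strip().lower()
    return "title-sha1:" + hashlib.sha1(title.encode("utf-8")).hexdigest()


def shingles(text: str, size: int = 3) -> set:
    """Split text into a set of overlapping word n-grams (shingles)."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < size:
        # Text shorter than one shingle: treat the whole text as one unit.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard(a: str, b: str, size: int = 3) -> float:
    """Jaccard similarity of the shingle sets of two texts, in [0, 1]."""
    sa, sb = shingles(a, size), shingles(b, size)
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)
```

With this, an edited post keeps the same post_key as long as its guid or link is unchanged, so it can be updated in place rather than stored twice. For the "same story from different sources" case, jaccard could be computed between a new item's text and recently stored items, treating scores above a chosen threshold as candidates for the same topic.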