I've been thinking about something for a project I want to do. I'm not an advanced user and I'm just learning, so I don't know if this is possible:
Suppose we have 100 html documents containing many tables and text inside them.
Question one: is it possible to analyze all this text, find the repeated words, and count them?
Yes, it's possible to do with some functions, but here's the problem: what if we do not know in advance which words we are going to find? That is, we would have to tell the code what counts as a word.
Suppose, for example, that a word is a sequence of seven characters. The idea would be to find every sequence that matches this pattern and report how often it appears. What would be the best way to do this?
Thank you very much in advance.
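For the first part (getting plain text out of the HTML documents, tables included), here is a rough sketch that uses only Python's standard library `html.parser` module. The names `TextExtractor` and `html_to_text` are made up for illustration, and skipping `<script>` and `<style>` content is an assumption about what should not count as text.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects the text between tags, ignoring script and style content."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.chunks.append(data)


def html_to_text(html):
    """Return the visible text of an HTML string with whitespace collapsed."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    # Join the pieces with spaces so text from adjacent table cells stays separated.
    return " ".join(" ".join(parser.chunks).split())


sample = (
    "<table><tr><td>It takes an ocean</td><td>not to break</td></tr></table>"
    "<script>var hidden = 1;</script>"
)
text = html_to_text(sample)
print(text)
```

For 100 documents, the same function could be called once per file (for example on the result of reading each file into a string), and the resulting texts passed on to whatever does the counting.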
Example:
Search: five-character patterns in the following phrases:
Text one:
"It takes an ocean not to break"
Text two:
"An ocean is a body of saline water"
Result:
takes 1
break 1
water 1
ocean 2
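The example above can be reproduced with a short sketch, assuming that "a word" means a run of exactly N letters with word boundaries on both sides, and that matching is case-insensitive. The function name `count_fixed_length_words` is hypothetical; `re` and `collections.Counter` are from the standard library.

```python
import re
from collections import Counter


def count_fixed_length_words(texts, length):
    """Count words that have exactly `length` letters, ignoring case."""
    # [^\W\d_] matches one letter; the \b anchors exclude parts of longer words.
    pattern = re.compile(r"\b[^\W\d_]{%d}\b" % length)
    counts = Counter()
    for text in texts:
        counts.update(pattern.findall(text.lower()))
    return counts


texts = [
    "It takes an ocean not to break",
    "An ocean is a body of saline water",
]
counts = count_fixed_length_words(texts, 5)
for word, n in counts.most_common():
    print(word, n)
```

Because the pattern is built from the `length` argument, the code never needs to know the words beforehand. Changing 5 to 7 would look for seven-letter words instead.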
Thanks in advance for your help.