Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
1.3k views
in Technique[技术] by (71.8m points)

algorithm - String manipulation: calculate the "similarity of a string with its suffixes"

For two strings A and B, we define the similarity of the strings to be the length of the longest prefix common to both strings. For example, the similarity of strings "abc" and "abd" is 2, while the similarity of strings "aaa" and "aaab" is 3.

The problem is to give an algorithm to calculate the sum of similarities of a string S with each of it's suffixes. For example, let the string be : ababaa. Then, the suffixes of the string are ababaa, babaa, abaa, baa, aa and a. The similarities of each of these strings with the string ababaa are 6,0,3,0,1,1, respectively. Thus the answer is 6 + 0 + 3 + 0 + 1 + 1 = 11.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

You want to consider suffix arrays. The suffix array of a word is the array of the indices of suffixes sorted in lexicographical order. In the linked wikipedia article, the algorithms compute the LCP (longest common prefix) as they compute the suffix array. You can compute this in O(n) using similarities with suffix trees, as shown in this paper.

EXAMPLE: Your string is ababaa, so the suffix array looks like this:

5 | a
4 | aa
2 | abaa
0 | ababaa
3 | baa
1 | babaa

where the number on the left is the index at which the suffix begins. Now it's pretty each to compute prefixes since everything is stored lexicographically.

As a side note, this is closely related to the longest common substring problem. To practice for your next interview, think about ways to solve that efficiently.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

1.4m articles

1.4m replys

5 comments

57.0k users

...