It splits a word into two parts: `stem` and `end`. There are three cases:
- The word ends with `ss` (or even more `s` characters): `stem <- word` and `end <- ""`
- The word ends with a single `s`: `stem <- word without "s"` and `end <- "s"`
- The word does not end with `s`: `stem <- word` and `end <- ""`
This is done by a regular expression that matches the full word (because of `^....$`). The first part (i.e. `stem`) consists either of as much as possible ending in `ss` (`.*ss`) or, if that is not possible, of as little as possible (`.*?`). An optional trailing `s` is then taken as the `end` part.
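Note that in the first case (as much as possible ending in `ss`) there can never be an additional `s` left over for the `end` part: if the word had one more trailing `s`, the greedy `.*` would already have absorbed it into `stem`.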
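
To make the three cases concrete, here is a minimal Python sketch. The full pattern is not spelled out above, so `^(.*ss|.*?)(s?)$` is an assumed reconstruction from the fragments `.*ss`, `.*?` and the `^...$` anchors; the function name `split_word` is hypothetical as well.

```python
import re

# Assumed reconstruction of the pattern described above:
#   group 1 (stem): either as much as possible ending in "ss",
#                   or, failing that, as little as possible
#   group 2 (end):  an optional trailing "s"
PATTERN = re.compile(r"^(.*ss|.*?)(s?)$")


def split_word(word):
    """Return (stem, end) for a single word (hypothetical helper)."""
    match = PATTERN.match(word)
    if match is None:
        # Only reachable for input the pattern cannot span,
        # e.g. text containing embedded newlines.
        raise ValueError(f"cannot split {word!r}")
    return match.group(1), match.group(2)


if __name__ == "__main__":
    for word in ["glass", "cats", "dog", "classes"]:
        print(word, split_word(word))
```

With this pattern, `glass` keeps its double `s` in the stem, `cats` splits into `cat` and `s`, and `dog` is returned unchanged with an empty end. A word such as `classes` contains `ss` but does not end with it, so the first alternative fails and the lazy alternative yields `classe` and `s`.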