In the first string, the set of trigrams is
- {"  w"," wo","ord","wor","rd "}.
+ {"  w"," wo","wor","ord","rd "}.
In the second string, the ordered set of trigrams is
- {"  t"," tw",two,"wo ","  w"," wo","wor","ord","rds", ds "}.
+ {"  t"," tw","two","wo ","  w"," wo","wor","ord","rds","ds "}.
The most similar extent of an ordered set of trigrams in the second string
is {"  w"," wo","wor","ord"}, and the similarity is
0.8.
At the same time, strict_word_similarity(text, text)
has to select an extent that matches word boundaries. In the example above,
strict_word_similarity(text, text) would select the
- extent {"  w"," wo","wor","ord","rds", ds "}, which
+ extent {"  w"," wo","wor","ord","rds","ds "}, which
corresponds to the whole word 'words'.
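
The arithmetic in the passage above can be checked with a small Python sketch. It is not the pg_trgm implementation: the helper names (trigrams, extent_similarity, best_extent) are hypothetical, words are split on whitespace only, and the best extent is found by brute force. It assumes each word is padded with two leading spaces and one trailing space, and that similarity is the number of shared trigrams divided by the number of distinct trigrams in both sets, which reproduces the 0.8 quoted above.

```python
def trigrams(text):
    """Ordered trigrams of each word, padded with two leading and one trailing space."""
    result = []
    for word in text.lower().split():  # assumption: whitespace-only word splitting
        padded = " " * 2 + word + " "
        result.extend(padded[i:i + 3] for i in range(len(padded) - 2))
    return result


def extent_similarity(query, extent):
    """Shared trigrams divided by distinct trigrams across both sets."""
    q, e = set(query), set(extent)
    return len(q & e) / len(q | e)


def best_extent(query, target):
    """Brute-force search for the contiguous run of target trigrams most similar to query."""
    best_score, best_run = 0.0, []
    for start in range(len(target)):
        for end in range(start + 1, len(target) + 1):
            score = extent_similarity(query, target[start:end])
            if score > best_score:
                best_score, best_run = score, target[start:end]
    return best_score, best_run


query = trigrams("word")
score, extent = best_extent(query, trigrams("two words"))
print(score, extent)

# Restricting the extent to the whole word 'words', as the strict variant does here:
print(round(extent_similarity(query, trigrams("words")), 6))
```

Under these assumptions the unrestricted search picks the four-trigram extent with similarity 4/5, while the whole-word extent has six trigrams, four of them shared, giving 4/7.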