I recommend using NSLinguisticTagger. We can use it to search the string "Here is my string. His isn't a mississippi isthmus. It is?" word by word:
    // Create a tagger that classifies tokens by type only (word, punctuation, etc.).
    NSLinguisticTagger *linguisticTagger =
        [[NSLinguisticTagger alloc] initWithTagSchemes:@[NSLinguisticTagSchemeTokenType]
                                               options:NSLinguisticTaggerOmitPunctuation |
                                                       NSLinguisticTaggerOmitWhitespace |
                                                       NSLinguisticTaggerOmitOther];
    [linguisticTagger setString:@"Here is my string. His isn't a mississippi isthmus. It is?"];

    // Walk over every word token, skipping punctuation and whitespace.
    [linguisticTagger enumerateTagsInRange:NSMakeRange(0, [[linguisticTagger string] length])
                                    scheme:NSLinguisticTagSchemeTokenType
                                   options:NSLinguisticTaggerOmitPunctuation |
                                           NSLinguisticTaggerOmitWhitespace |
                                           NSLinguisticTaggerOmitOther |
                                           NSLinguisticTaggerJoinNames
                                usingBlock:^(NSString *tag, NSRange tokenRange, NSRange sentenceRange, BOOL *stop) {
        // Log the tag, both ranges, and the token text itself.
        NSLog(@"tag: %@, tokenRange: %@, sentenceRange: %@, token: %@",
              tag,
              NSStringFromRange(tokenRange),
              NSStringFromRange(sentenceRange),
              [[linguisticTagger string] substringWithRange:tokenRange]);
    }];
This prints:
    tag: Word, tokenRange: {0, 4}, sentenceRange: {0, 19}, token: Here
    tag: Word, tokenRange: {5, 2}, sentenceRange: {0, 19}, token: is
    tag: Word, tokenRange: {8, 2}, sentenceRange: {0, 19}, token: my
    tag: Word, tokenRange: {11, 6}, sentenceRange: {0, 19}, token: string
    tag: Word, tokenRange: {19, 3}, sentenceRange: {19, 33}, token: His
    tag: Word, tokenRange: {23, 2}, sentenceRange: {19, 33}, token: is
    tag: Word, tokenRange: {25, 3}, sentenceRange: {19, 33}, token: n't
    tag: Word, tokenRange: {29, 1}, sentenceRange: {19, 33}, token: a
    tag: Word, tokenRange: {31, 11}, sentenceRange: {19, 33}, token: mississippi
    tag: Word, tokenRange: {43, 7}, sentenceRange: {19, 33}, token: isthmus
    tag: Word, tokenRange: {52, 2}, sentenceRange: {52, 6}, token: It
    tag: Word, tokenRange: {55, 2}, sentenceRange: {52, 6}, token: is
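Because the tagger works on whole-word tokens, the "is" inside "His", "mississippi", and "isthmus" is never reported as a separate word, and the tagger even identifies the "is" inside "isn't".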
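
To turn the token listing into an actual search, one option is to compare each token with the target word and collect the ranges that match. The following is only a minimal sketch: RangesOfWord is a hypothetical helper name (not part of Foundation), and the case-insensitive comparison is an assumption you may want to change.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not part of Foundation): returns the range of every
// whole-word token in `text` that equals `word`, wrapped in NSValue objects.
static NSArray<NSValue *> *RangesOfWord(NSString *text, NSString *word) {
    NSLinguisticTaggerOptions options = NSLinguisticTaggerOmitPunctuation |
                                        NSLinguisticTaggerOmitWhitespace |
                                        NSLinguisticTaggerOmitOther;
    NSLinguisticTagger *tagger =
        [[NSLinguisticTagger alloc] initWithTagSchemes:@[NSLinguisticTagSchemeTokenType]
                                               options:options];
    tagger.string = text;

    NSMutableArray<NSValue *> *matches = [NSMutableArray array];
    [tagger enumerateTagsInRange:NSMakeRange(0, text.length)
                          scheme:NSLinguisticTagSchemeTokenType
                         options:options
                      usingBlock:^(NSString *tag, NSRange tokenRange, NSRange sentenceRange, BOOL *stop) {
        NSString *token = [text substringWithRange:tokenRange];
        // Assumption: ignore case. Use isEqualToString: for an exact match instead.
        if ([token caseInsensitiveCompare:word] == NSOrderedSame) {
            [matches addObject:[NSValue valueWithRange:tokenRange]];
        }
    }];
    return matches;
}

int main(void) {
    @autoreleasepool {
        NSString *text = @"Here is my string. His isn't a mississippi isthmus. It is?";
        for (NSValue *value in RangesOfWord(text, @"is")) {
            NSLog(@"match at %@", NSStringFromRange(value.rangeValue));
        }
    }
    return 0;
}
```

If the tagger tokenizes the string the same way as in the output above, this should report the three tokens at {5, 2}, {23, 2}, and {55, 2}. Tokenization can differ between OS versions, so treat those ranges as expected rather than guaranteed.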