To remove every word tagged 'NNP' from the following text (taken from the documentation), you can do the following:
from textblob import TextBlob
# Sample text
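text = '''
The titular threat of The Blob has always struck me as the ultimate movie
monster: an insatiably hungry, amoeba-like mass able to penetrate
virtually any safeguard, capable of--as a doomed doctor chillingly
describes it--"assimilating flesh on contact.
Snide comparisons to gelatin be damned, it's a concept with the most
devastating of potential consequences, not unlike the grey goo scenario
proposed by technological theorists fearful of
artificial intelligence run rampant.
'''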
blob = TextBlob(text)
# Collect the words that are tagged 'NNP' (proper noun, singular)
# In this case it will only be 'Blob'
words_to_remove = [word for word, tag in blob.tags if tag == 'NNP']
# Remove those words from the text, using words_to_remove
edited_sentence = ' '.join([word for word in text.split(' ') if word not in words_to_remove])
# Show the result
print(edited_sentence)
Output:
# Notice the lack of the word 'Blob'
'\nThe titular threat of The has always struck me as the ultimate
movie\nmonster: an insatiably hungry, amoeba-like mass able to
penetrate\nvirtually any safeguard, capable of--as a doomed doctor
chillingly\ndescribes it--"assimilating flesh on contact.\nSnide
comparisons to gelatin be damned, it\'s a concept with the
most\ndevastating of potential consequences, not unlike the grey goo
scenario\nproposed by technological theorists fearful of\nartificial
intelligence run rampant.\n'
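One limitation of splitting on a single space is that a token such as 'Blob,' or 'movie\nmonster:' never equals a bare word, so a tagged word that sits next to punctuation or a line break would be left in place. A minimal sketch of a regex-based alternative follows. The helper name remove_words_by_tag is hypothetical, and the (word, tag) pairs are written by hand in the same shape as TextBlob(text).tags so the snippet runs without the tagger.

```python
import re


def remove_words_by_tag(text, tagged_words, tag="NNP"):
    """Return `text` with whole-word occurrences of words carrying `tag` removed.

    `tagged_words` is an iterable of (word, tag) pairs, the shape that
    TextBlob(text).tags produces. Assumes each tagged word starts and
    ends with a letter or digit, so the \\b anchors apply.
    """
    words_to_remove = {word for word, word_tag in tagged_words if word_tag == tag}
    if not words_to_remove:
        return text
    alternatives = "|".join(re.escape(word) for word in sorted(words_to_remove))
    # Drop the word plus any spaces or tabs directly before it;
    # \b keeps 'Blob' from matching inside 'Blobs'.
    pattern = re.compile(r"[ \t]*\b(?:" + alternatives + r")\b")
    return pattern.sub("", text)


# Hand-written tags (an assumption for illustration, not real tagger output)
sample = "The titular threat of The Blob has always struck me.\nIs the Blob, really, a monster?"
tagged = [("The", "DT"), ("titular", "JJ"), ("threat", "NN"), ("Blob", "NNP")]

cleaned = remove_words_by_tag(sample, tagged)
print(cleaned)
```

With real data you would pass TextBlob(sample).tags as the second argument instead of the hand-written list.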
Comments on your sample
from textblob import TextBlob

strings = []  # This variable is not used anywhere
for col in result:
    for i in range(result.shape[0]):
        text = result[col][i]
        txt_blob = TextBlob(text)
        # txt_blob.noun_phrases returns a list of noun phrases.
        # To get the position of each phrase in that list, use the built-in 'enumerate', like this:
        for pos, word in enumerate(txt_blob.noun_phrases):
            # Now you can print the position and the phrase
            print(pos, word)
            # This will give you something like the following:
            # 0 titular threat
            # 1 blob
            # 2 ultimate movie monster
            # The following line does not make sense, because tag has not been assigned yet
            # and you are not iterating over the tagged words
            if tag != 'NNP'
                # You are not assigning anything to edited_sentence, so this would not work either.
                print(' '.join(edited_sentence))
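As a quick illustration of the enumerate point, here is a small sketch on a plain list. The phrases are hard-coded stand-ins for what txt_blob.noun_phrases might return, so nothing here depends on TextBlob. Note that enumerate yields the position first and the item second.

```python
# Hard-coded stand-in for txt_blob.noun_phrases (an assumption for illustration)
noun_phrases = ["titular threat", "blob", "ultimate movie monster"]

# enumerate yields (position, item) pairs, position first
for position, phrase in enumerate(noun_phrases):
    print(position, phrase)

# The same pairs collected into a list
indexed = list(enumerate(noun_phrases))
```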
Your sample with the new code
from textblob import TextBlob

# 'result' is the table from your question (assumed here to be a pandas DataFrame)
for col in result:
    for i in range(result.shape[0]):
        text = result[col][i]
        txt_blob = TextBlob(text)
        # Collect the words that are tagged 'NNP' (proper noun, singular)
        # For the sample text above this is only 'Blob'
        words_to_remove = [word for word, tag in txt_blob.tags if tag == 'NNP']
        # Remove those words from the sentence, using words_to_remove
        edited_sentence = ' '.join([word for word in text.split(' ') if word not in words_to_remove])
        # Show the result
        print(edited_sentence)
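If you want to keep the cleaned text rather than only print it, one option is to collect the results per column. The sketch below is an assumption-heavy illustration: strip_tag_from_cells and toy_tagger are hypothetical names, the table is a plain dict of lists standing in for your DataFrame, and toy_tagger fakes the tagging step so the snippet runs without TextBlob.

```python
def strip_tag_from_cells(table, tagger, tag="NNP"):
    """Return a new {column: [cell, ...]} mapping with words carrying `tag` removed.

    `tagger` is any callable that maps a string to (word, tag) pairs,
    for example `lambda s: TextBlob(s).tags`.
    """
    cleaned = {}
    for column, cells in table.items():
        cleaned_cells = []
        for cell in cells:
            words_to_remove = {word for word, word_tag in tagger(cell) if word_tag == tag}
            kept = [word for word in cell.split(" ") if word not in words_to_remove]
            cleaned_cells.append(" ".join(kept))
        cleaned[column] = cleaned_cells
    return cleaned


# Toy stand-in for a real tagger: only these words count as proper nouns
PROPER_NOUNS = {"Blob", "Paris"}


def toy_tagger(text):
    return [(word, "NNP" if word in PROPER_NOUNS else "NN") for word in text.split(" ")]


table = {"review": ["The threat of The Blob is real", "I saw it in Paris"]}
cleaned_table = strip_tag_from_cells(table, toy_tagger)
print(cleaned_table)
```

The input table is left unchanged, so you can compare the original and cleaned text side by side.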