Find words in a sentence and create an indicator variable
0 votes
/ 07 December 2018

I have a variable containing various sentences:

Cats are good pets, for they are clean and are not noisy.
Abstraction is often one floor above you.
She wrote a long letter to Charlie, but he didn't read it.
Where do random thoughts come from?
Mary plays the piano.
I want more detailed information.
I'd rather be a bird than a fish.
When I was little I had a car door slammed shut on my hand. I still remember it quite vividly.
Malls are great places to shop; John can find everything he needs under one roof.
My Mum tries to be cool by saying that she likes all the same things that I do.

How can I create a variable name == 1 if a name is found?

I would also like to have name == 2 if any word in the sentence matches a word of my choice (for example, letter).

I tried the following:

gen name = regexm(sentence, "letter* & (Charlie | Mary | John)*")

However, this does not work. I get only name == 0 in all observations.

Answers [ 2 ]

0 votes
/ 07 December 2018

Regular expressions are great, but the Catch-22 is that you have to work very hard at learning the language; if and when you become fluent, you see the benefits.

I will leave it to other answers to give clever regular expression solutions. The aim here is to underline that other string functions can be serviceable. Here I use the fact that strpos() returns a positive result, equivalent to true, if it finds one string inside another. In addition, Stata will parse a string into words with word(), so even (for example) looking for a string if and only if it occurs as a word is not that difficult from first principles.

clear 
input strL whatever 
"Cats are good pets, for they are clean and are not noisy."
"Abstraction is often one floor above you."
"She wrote a long letter to Charlie, but he didn't read it."
"Where do random thoughts come from?"
"Mary plays the piano."
"I want more detailed information."
"I'd rather be a bird than a fish."
"When I was little I had a car door slammed shut on my hand. I still remember it quite vividly."
"Malls are great places to shop; John can find everything he needs under one roof."
"My Mum tries to be cool by saying that she likes all the same things that I do."
end 

gen wanted1 = strpos(whatever, "Charlie") | strpos(whatever, "Mary") | strpos(whatever, "John") 

* cat or cats as a word 
gen wanted2 = 0 
gen wordcount = wordcount(whatever) 
su wordcount, meanonly 
local J = r(max) 
quietly foreach w in cat cats { 
    forval j = 1/`J' { 
        replace wanted2 = 1 if word(lower(whatever), `j') == "`w'" 
    }
} 

gen what = substr(whatever, 1, 40) 
list wanted? what, sep(0) 

     +--------------------------------------------------------------+
     | wanted1   wanted2                                       what |
     |--------------------------------------------------------------|
  1. |       0         1   Cats are good pets, for they are clean a |
  2. |       0         0   Abstraction is often one floor above you |
  3. |       1         0   She wrote a long letter to Charlie, but  |
  4. |       0         0        Where do random thoughts come from? |
  5. |       1         0                      Mary plays the piano. |
  6. |       0         0          I want more detailed information. |
  7. |       0         0          I'd rather be a bird than a fish. |
  8. |       0         0   When I was little I had a car door slamm |
  9. |       1         0   Malls are great places to shop; John can |
 10. |       0         0   My Mum tries to be cool by saying that s |
     +--------------------------------------------------------------+
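
The two facts used above can be checked directly with display. This is only a small sketch, not part of the original answer; the string literals are copied from the example data. It also shows one thing to watch for: word() splits on spaces, so punctuation stays attached to a word, which matters if you look for a name such as Charlie as a whole word.

```stata
* strpos() returns the position of the first match, or 0 if there is none
* expected: 1
display strpos("Mary plays the piano.", "Mary")
* expected: 0
display strpos("Mary plays the piano.", "John")

* word() returns the j-th space-delimited word
* expected: letter
display word("She wrote a long letter to Charlie, but he didn't read it.", 5)
* expected: Charlie, (with the trailing comma still attached)
display word("She wrote a long letter to Charlie, but he didn't read it.", 7)
```

Because any nonzero value counts as true in Stata, combining several strpos() calls with | yields a 0/1 indicator, as in wanted1.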
0 votes
/ 07 December 2018

Consider the sentences in your example:

clear

input strL sentence
"Cats are good pets, for they are clean and are not noisy."
"Abstraction is often one floor above you."
"She wrote a long letter to Charlie, but he didn't read it."
"Where do random thoughts come from?"
"Mary plays the piano."
"I want more detailed information."
"I'd rather be a bird than a fish."
"When I was little I had a car door slammed shut on my hand. I still remember it quite vividly."
"Malls are great places to shop; John can find everything he needs under one roof."
"My Mum tries to be cool by saying that she likes all the same things that I do."           
end
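
As an aside, it may help to see why the pattern in the question matches nothing. In a regular expression, & is not an AND operator and spaces are matched literally, so the pattern asks for the text "lette" plus optional r's, followed by the literal characters " & ". None of the example sentences contains that. The lines below are a hedged sketch for checking this on a single string; the test string is taken from observation 3.

```stata
* The original pattern: the question reports 0 for every observation
display regexm("She wrote a long letter to Charlie", "letter* & (Charlie | Mary | John)*")

* Testing each condition separately, without spaces inside the alternation
* expected: 1
display regexm("She wrote a long letter to Charlie", "letter")
* expected: 1
display regexm("She wrote a long letter to Charlie", "Charlie|Mary|John")
```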

Combining the functions strmatch() and ustrregexm():

generate name = strmatch(sentence, "*letter*") + ustrregexm(sentence, "(Charlie|Mary|John)")

This gives the desired result for the example data:

list name, separator(0)

     +------+
     | name |
     |------|
  1. |    0 |
  2. |    0 |
  3. |    2 |
  4. |    0 |
  5. |    1 |
  6. |    0 |
  7. |    0 |
  8. |    0 |
  9. |    1 |
 10. |    0 |
     +------+
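
Note that the sum of the two functions equals 2 only when a sentence contains both letter and one of the names, which happens to be the case in observation 3. If the intended coding is 2 whenever the chosen word appears and 1 when only a name appears, a variant along the following lines may be closer. It is a sketch under assumptions: the question does not say which code should win when both match, so here the word takes precedence, and the variable names has_name, has_word and name_code are made up for illustration. The \b word boundary keeps letter from matching inside a longer word such as letters. ustrregexm() requires Stata 14 or later.

```stata
* Assumes the string variable sentence created by the input block above
generate byte has_name = ustrregexm(sentence, "\b(Charlie|Mary|John)\b")
generate byte has_word = ustrregexm(sentence, "\bletter\b")

* Assumption: the chosen word takes precedence over a name
generate byte name_code = cond(has_word, 2, cond(has_name, 1, 0))

list name_code, separator(0)
```

On the ten example sentences this should reproduce the listing above: 2 in observation 3, 1 in observations 5 and 9, and 0 elsewhere.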
...