I asked this same question before, but I am still having trouble with this topic.
Suppose I have a dataset A like this:
**Name**
Liver cell carcinoma
Stomach, unspecified
Malignant neoplasm of rectum
Lumbar and other intervertebral disc disorders with radiculopathy
Bronchus or lung, unspecified
Cerebral infarction, unspecified
Pneumonia, unspecified
Headache
Spinal stenosis, lumbar region
Other specified intervertebral disc displacement
Sigmoid colon
Calculus of ureter
Colon, unspecified
Concussion, without open intracranial wound
Malignant neoplasm of thyroid gland
Breast, unspecified
Other and unspecified cirrhosis of liver
Chronic viral hepatitis B without delta- agent
Dizziness and giddiness
Tension-type headache
Malignant neoplasm of stomach, unspecified, unspecified
Cervical disc disorder with radiculopathy
Malignant neoplasm of bronchus or lung, unspecified, unspecified side
Chest pain, unspecified
Gastroenteritis and colitis of unspecified origin
Bronchiectasis
Concussion
Body of stomach
Acute tubulo-interstitial nephritis
Traumatic subdural haemorrhage, without open intracranial wound
Abnormal findings on diagnostic imaging of lung
Angina pectoris, unspecified
Other disorders of lung
Ascending colon
Essential(primary) hypertension
Pyloric antrum
Intrahepatic bile duct carcinoma
Cervix uteri, unspecified
Gastro-oesophageal reflux disease with oesophagitis
Liver
Fracture of nasal bone, closed
Malignant neoplasm of rectosigmoid junction
Open wound of scalp
Other cerebral infarction
Cerebral aneurysm, nonruptured
Malignant neoplasm of kidney, except renal pelvis
Malignant neoplasm of prostate
Unspecified abdominal pain
And dataset B looks like this:
Part Key
Abdominal abdomen
Abdominal abdominal
Other acute myeloblastic leukaemia
Abdominal adrenal
Head allergic rhinitis
Head Alzheimer's
Abdominal ampulla
Abdominal aneurysm
Chest angina
Abdominal antrum
Chest aorta
Abdominal appendicitis
Head arteries
Abdominal ascites
Chest asthma
Abdominal back
other b-cell lymphoma
Abdominal bile duct
Abdominal biliary tract
Abdominal bladder
Head brain
Chest breast
Chest Bronchiectasis
Chest bronchitis
Chest bronchopneumonia
Chest bronchus
Abdominal C64
Abdominal caecum
Abdominal cardia
Head cavity
Head cerebral
Chest cerebrovascular
Head cerebrovascular
Abdominal cervical
Abdominal cervix
Other chemotherapy session for neoplasm
Chest chest
Abdominal cholangitis
Abdominal cholecystitis
Chest circulatorycomplications
Abdominal colon
Head concussion
other connective and soft tissue, unspecified
Head convulsions
Chest Cough
Lung cough
I ran the following code:
library(dplyr)

# Extract the first matching key from the lowercased name, then join on it
result <- A %>%
  mutate(key = gsub(paste0(".*(", paste(B$key, collapse = "|"), ").*"), "\\1", tolower(NAME))) %>%
  left_join(B)
and the result contained several duplicated rows.
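My guess is that the duplicates come from keys that appear more than once in B (for example, "cerebrovascular" is listed under both Chest and Head), since a left join returns one output row per matching row in B. Here is a minimal sketch of that check in base R, using a hypothetical five-row excerpt of B and assuming the columns are named `Part` and `Key`:

```r
# Hypothetical excerpt of B; the column names Part and Key are assumed
B <- data.frame(
  Part = c("Chest", "Head", "Chest", "Lung", "Abdominal"),
  Key  = c("cerebrovascular", "cerebrovascular", "Cough", "cough", "colon"),
  stringsAsFactors = FALSE
)

# Lowercase the keys so that "Cough" and "cough" count as the same key
B$key <- tolower(B$Key)

# Keys that occur more than once in B
dup_keys <- unique(B$key[duplicated(B$key)])
print(dup_keys)

# A single row of A that carries a repeated key ...
A_small <- data.frame(key = "cerebrovascular", stringsAsFactors = FALSE)

# ... turns into one row per matching row of B after a left join
joined <- merge(A_small, B, by = "key", all.x = TRUE)
print(nrow(joined))
```

If this is the cause, B would need exactly one row per key before joining, or the lookup would have to return only the first match.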
What would be the best code to create the dataset I want? I expect my result table to look like:
Name Key Part
Liver cell carcinoma liver Abdominal
Stomach, unspecified stomach Abdominal
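To make the goal concrete, here is a rough sketch of one approach I am considering, not necessarily the best one. It uses base R's `regexpr()`/`regmatches()` to pull the first whole-word key out of each name, and `match()` in place of `left_join()`, because `match()` returns only the first hit and so cannot add rows. The data below is a hypothetical miniature; the column names `Name`, `Part` and `Key`, the helper `first_key`, and the presence of "liver" and "stomach" in the full B are all assumptions. It also assumes the keys contain no regex metacharacters.

```r
# Hypothetical miniature data; column names are assumed
A <- data.frame(
  Name = c("Liver cell carcinoma", "Stomach, unspecified", "Headache"),
  stringsAsFactors = FALSE
)
B <- data.frame(
  Part = c("Abdominal", "Abdominal", "Chest", "Head"),
  Key  = c("liver", "stomach", "cerebrovascular", "cerebrovascular"),
  stringsAsFactors = FALSE
)

# Build one alternation pattern from the distinct lowercase keys,
# anchored on word boundaries so that partial words do not match
keys <- unique(tolower(B$Key))
pattern <- paste0("\\b(", paste(keys, collapse = "|"), ")\\b")

# Hypothetical helper: first key found in a string, or NA if none is found
first_key <- function(x) {
  m <- regmatches(x, regexpr(pattern, x, perl = TRUE))
  if (length(m) == 0) NA_character_ else m
}

result <- A
result$Key <- vapply(tolower(A$Name), first_key, character(1), USE.NAMES = FALSE)

# match() picks the first row of B for each key, so no rows are duplicated
result$Part <- B$Part[match(result$Key, tolower(B$Key))]
print(result)
```

Names with no matching key keep their row and get `NA` for both `Key` and `Part`, so the result always has as many rows as A.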