How do I use .replace to remove part of a string (if that part exists in an array)? - PullRequest
0 votes
/ September 29, 2019

I have a number of words that I would like to remove from a string (this will run inside a for loop):

Most of the words I need to remove are listed here (this is the regular expression I tried):

\b([[:<:]][0-9a-zA-z][[:>:]]|^'|about|after|all|also|[an]|and|another|any|are|[as]|at|[be]|because|been|before|being|\bbetween|both|but|by|came|can|come|could|did|do|each|for|from|get|got|had|[has]|have|he|her|here|him|himself|his|how|if|in|into|is|it|like|make|many|me|might|more|most|much|must|my|never|now|of|on|only|or|other|our|out|over|said|same|see|should|since|some|still|such|take|than|that|the|their|them|then|there|these|they|this|those|through|to|too|under|up|very|was|way|we|well|were|what|where|which|while|who|with|would|you|your)

As you can see, I need to remove single characters (a-z, A-Z, 0-9) and a number of words.

As an example, I have this phrase:

"This is Stackoverflow's data and its on many sites "

My expected result would be:

" This is Stackoverflow's data and its many sites "

What I tried is this:

   let wordsHidden=["[about]","[after]","[all]","[also]","[an]","[and]","[another]","[any]","[are]","[as]","[at]","[be]","[because]","[been]","[before]","[being]","[between]","[both]","[but]","[by]","[came]","[can]","[come]","[could]","[did]","[do]","[each]","[for]","[from]","[get]","[got]","[had]","[has]","[have]","[he]","[her]","[here]","[him]","[himself]","[his]","[how]","[if]","[in]","[into]","[is]","[it]","[like]","[make]","[many]","[me]","[might]","[more]","[most]","[much]","[must]","[my]","[never]","[now]","[of]","[on]","[only]","[or]","[other]","[our]","[out]","[over]","[said]","[same]","[see]","[should]","[since]","[some]","[still]","[such]","[take]","[than]","[that]","[the]","[their]","[them]","[then]","[there]","[these]","[they]","[this]","[those]","[through]","[to]","[too]","[under]","[up]","[very]","[was]","[way]","[we]","[well]","[were]","[what]","[where]","[which]","[while]","[who]","[with]","[would]","[you]","[your]"];

   let test = wordsHidden.join("|");

  let regexorg = "/\b([[:<:]][0-9a-zA-z][[:>:]]|^'|"+test+")";
  var regex = new RegExp("/"+wordsHidden.join("|")+"/", 'g');

  let string = "DLs between data";
  console.log(string.replace(regex,''));

Here is the regex in action: [image]

Is there a way to treat each element of the array as a whole word and get back the whole processed string?

Answers [ 2 ]

1 vote
/ September 29, 2019

I'm not sure what you are trying to do with the beginning of your regex, but I found a way to remove specific strings (each wrapped by a non-word character) from a string.

If you JUST match the exact strings, you will be left with extra spaces, so my approach is to match a non-word character on either side of each word, and to match every consecutive listed word in one go. If we did not chain the words like this, adjacent words would be missed: each one would try to match the non-word characters around it, those matches would collide, and the second of two neighbouring words would be skipped.

const wordsHidden = ["about","after","all","also","an","and","another","any","are","as","at","be","because","been","before","being","between","both","but","by","came","can","come","could","did","do","each","for","from","get","got","had","has","have","he","her","here","him","himself","his","how","if","in","into","is","it","like","make","many","me","might","more","most","much","must","my","never","now","of","on","only","or","other","our","out","over","said","same","see","should","since","some","still","such","take","than","that","the","their","them","then","there","these","they","this","those","through","to","too","under","up","very","was","way","we","well","were","what","where","which","while","who","with","would","you","your"];
// Builds \W((about\W)|(after\W)|...)+ : a non-word character followed by
// one or more listed words, each followed by a non-word character.
const rexString = "\\W((" + wordsHidden.join("\\W)|(") + "\\W))+";
console.log(rexString);
const regex = new RegExp(rexString, 'g');

const string = "This is the Stackoverflow's Data and its into many your your you your about you sites";
// Collect every match together with the index just past its end.
const matches = [];
let match = regex.exec(string);
while (match != null) {
  match.lastIndex = regex.lastIndex;
  matches.push(match);
  match = regex.exec(string);
}

let cutString = string;
// iterate through matches backwards from end of string to start,
// so we don't shift our indexes as we delete parts of the string
for (let i = matches.length - 1; i >= 0; i--) {
  match = matches[i];
  const beforeMatch = cutString.slice(0, match.lastIndex - match[0].length);
  const afterMatch = cutString.slice(match.lastIndex - 1); // leave the trailing "space", might be some other character
  console.log(beforeMatch); console.log(match[0]); console.log(afterMatch);
  cutString = beforeMatch + afterMatch;
}
console.log(cutString);

This goes from
"This is the Stackoverflow's Data and its into many your your you your about you sites" to
"This Stackoverflow's Data its sites"
with all the matching words stripped (is, the, and, into, many, your, you, about)
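
The same chained pattern can probably be applied with a single call to replace instead of the exec loop. What follows is a minimal sketch under that assumption, not the answerer's code: stripWords is a hypothetical helper name, the word list is shortened for readability, and the sketch keeps the leading non-word character (captured as $1) rather than the trailing one.

```javascript
// Hypothetical helper, not part of the original answer.
function stripWords(text, words) {
  // Escape regex metacharacters so every entry is matched literally.
  const escaped = words.map((w) => w.replace(/[.*+?^${}()|[\]\\]/g, "\\$&"));
  // A captured non-word character, then one or more listed words,
  // each followed by a non-word character.
  const pattern = new RegExp("(\\W)(?:(?:" + escaped.join("|") + ")\\W)+", "g");
  // Put the captured leading character back so the neighbours stay separated.
  return text.replace(pattern, "$1");
}

const sample = "This is the Stackoverflow's Data and its into many your your you your about you sites";
const shortList = ["is", "the", "and", "into", "many", "your", "you", "about"];
console.log(stripWords(sample, shortList));
// prints "This Stackoverflow's Data its sites"
```

Like the loop version, this pattern needs a non-word character on both sides of a run of words, so a listed word at the very start or very end of the string is left in place.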

0 votes
/ September 29, 2019

You need to rewrite wordsHidden so that it does not include [] around each word; otherwise the pattern will just match any single character that occurs in one of the words in the array. Then you need to check for any of the words (or a single digit/letter) within word boundaries, noting that we don't want to remove a single character if it comes right after a ':

let wordsHidden=["about","after","all","also","an","and","another","any","are","as","at","be","because","been","before","being","between","both","but","by","came","can","come","could","did","do","each","for","from","get","got","had","has","have","he","her","here","him","himself","his","how","if","in","into","is","it","like","make","many","me","might","more","most","much","must","my","never","now","of","on","only","or","other","our","out","over","said","same","see","should","since","some","still","such","take","than","that","the","their","them","then","there","these","they","this","those","through","to","too","under","up","very","was","way","we","well","were","what","where","which","while","who","with","would","you","your"];
let regex = new RegExp("\\b([^'][0-9a-z]|" + wordsHidden.join('|') + ')\\b', 'gi');

let string = "This is the Stackoverflow's Data and its many sites";
console.log(string.replace(regex, ''));
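
To make the points above concrete, here is a small sketch of the three pitfalls in the question's attempt, followed by one possible variant of the pattern. The helper name removeStopWords, the whitespace cleanup and the sample sentence are assumptions added for illustration. The variant uses a lookbehind, which requires an ES2018-capable engine, because `[^'][0-9a-z]` between word boundaries also matches two-character words such as "ok".

```javascript
// 1. Square brackets create a character class, not a word:
//    "[an]" matches a single "a" or "n" anywhere in the text.
console.log("banana".replace(/[an]/g, "")); // prints "b"

// 2. Slashes delimit regex literals only. Inside a RegExp constructor
//    string they are ordinary characters that must appear in the text.
console.log(new RegExp("/an/").test("an")); // prints false
console.log(new RegExp("an").test("an"));   // prints true

// 3. In a string literal "\b" is a backspace character, so the
//    backslash has to be doubled to reach the regex engine as \b.
console.log(new RegExp("\ban\b").test("an"));   // prints false
console.log(new RegExp("\\ban\\b").test("an")); // prints true

// Hypothetical helper; assumes the words contain only letters and digits.
function removeStopWords(text, words) {
  // (?<!')[0-9a-z] : exactly one letter or digit not preceded by an apostrophe.
  const pattern = new RegExp(
    "\\b(?:(?<!')[0-9a-z]|" + words.join("|") + ")\\b",
    "gi"
  );
  // Remove the matches, then collapse the gaps they leave behind.
  return text.replace(pattern, "").replace(/\s+/g, " ").trim();
}

console.log(removeStopWords("Stackoverflow's Data is in 7 or 8 ok sites", ["is", "in", "or"]));
// prints "Stackoverflow's Data ok sites"
```

The "s" after the apostrophe and the unlisted word "ok" both survive, while the listed words and the lone digits are dropped.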