I am using the query below to identify junk documents in my collection:
db.mycoll.find(
{ $and: [
{"Text": { $regex:'Error' }},
{"Language": { $eq: "English" }},
] })
This returns every document whose Text contains the string "Error", including some clean ones that are not HTTP / database / PHP errors. I searched for these patterns manually and built a list of strings such as:
var junk =["MySQL Query Error",
"Fatal error",
"Uncaught Error",
"Forgot your password",
"Failed to connect to localhost",
"RuntimeException",
"Error message",
"Error Sql",
"Error 404",.....]
Now I am trying to write a query that looks for these junk strings in my collection, so that any document whose Text value contains one of the junk strings above (as a substring, not as an exact match) is excluded.
For example,
Documents:
{
"_id" : NumberInt(441868),
"Text" : "404 Error",
"newLanguage" : "English"
}
{
"_id" : NumberInt(5860039),
"Text" : "France orders the withdrawal of infant milk due to risk of salmonellosis\nHEALTH 1 week ago Elpais 33\nThe authorities publish a list of lots of Lactalis that would affect more than a dozen countries, including Colombia, Peru or the United Kingdom\nComments\nLoading ...\nPlease, Insert your name Please, Enter your E-Mail Please write your comments Error Happened\nRelated Posts",
"newLanguage" : "English"
}
....
The "Text" field has values like the following:
Junk documents:
"500 Error",
"400 Error",
"==============================================================================\nError Sql : UPDATE mynews_art SET view_cnt = view_cnt + 1 WHERE site_id = 12265 AND art_no =\nError Msg : You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '' at line 1\n==============================================================================\n==============================================================================\nError Sql : SELECT chgdate, pubdate FROM mynews_art WHERE site_id = 12265 AND art_no =\nError Msg : You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '' at line 1\n==============================================================================",
"*\nError 404, Page not found\n??? ??? ?????? ??? ?????? ???????? ???? ????? www.alayam.com ?????? ??????? ???? ?????? ?????? ?????? ????? ?? ??????? ??? ?????? ????????\n? ???? ?? ??? ?????",
"Server Error in '/' Application.\nRuntime Error\nDescription: An application error occurred on the server. The current custom error settings for this application prevent the details of the application error from being viewed remotely (for security reasons). It could, however, be viewed by browsers running on the local server machine.\nDetails: To enable the details of this specific error message to be viewable on remote machines, please create a <customErrors> tag within a \"web.config\" configuration file located in the root directory of the current web application. This <customErrors> tag should then have its \"mode\" attribute set to \"Off\".\n<!-- Web.Config Configuration File --> <configuration> <system.web> <customErrors mode=\"Off\"/> </system.web> </configuration>\nNotes: The current error page you are seeing can be replaced by a custom error page by modifying the \"defaultRedirect\" attribute of the application's <customErrors> configuration tag to point to a custom error page URL.\n<!-- Web.Config Configuration File --> <configuration> <system.web> <customErrors mode=\"RemoteOnly\" defaultRedirect=\"mycustompage.htm\"/> </system.web> </configuration>",
"Error 301 Moved permanently HTTP\nMoved permanently HTTPS",
....
Good documents:
"Quebec whooping cough: 5 things to know\nDeaths and explosion of cases in Mauricie prompt warning from health officials\nCBC News\nPosted: Nov 30, 2015 10:02 PM ET Last Updated: Dec 01, 2015 9:31 AM ET\nQuebec health officials say it's crucial for babies to follow the vaccination schedule for the whooping cough.Typo or Error Send Feedback\nTo encourage thoughtful and respectful conversations, first and last names will appear with each submission to CBC/Radio-Canada's online communities"
"Refugee crisis: Croatia lifts border blockade with Serbia\nThomson Reuters\nPosted: Sep 25, 2015 12:02 PM ET Last Updated: Sep 25, 2015 12:07 PM ET\nMigrants wait to cross into Croatia through the Serbian border on Friday in Bapska, Croatia. More than 40,000 migrants have crossed into Croatia from Serbia since Tuesday last week, and the Croatian government has said it can't cope with the flow.
Typo or Error Send Feedback\nTo encourage thoughtful and respectful conversations."
.....
The $in operator looks for an exact match, and the $regex operator does not accept a list as an argument. Any suggestions on which operator would work in this scenario, and how?
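One direction that might fit, written as a minimal sketch rather than a confirmed answer: escape each junk string, join the results with | into a single regular expression, and negate that expression with $not. The names escapeRegex, junkPattern and keepFilter are hypothetical and only for illustration, the junk list is shortened because the full list is elided above, and the field names are taken from my query as written.

```javascript
// Junk substrings from the list above (shortened; the full list is elided in the post).
const junk = [
  "MySQL Query Error",
  "Fatal error",
  "Uncaught Error",
  "Error Sql",
  "Error 404",
];

// Hypothetical helper: escape regex metacharacters so each entry is matched literally.
function escapeRegex(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// One alternation pattern that matches when ANY junk string occurs as a substring.
const junkPattern = new RegExp(junk.map(escapeRegex).join("|"));

// Keep documents that contain "Error" but none of the junk substrings.
const keepFilter = {
  $and: [
    { Text: { $regex: "Error" } },
    { Text: { $not: junkPattern } },
    { Language: { $eq: "English" } },
  ],
};

// In the mongo shell the filter would be passed as: db.mycoll.find(keepFilter)
```

The match is case-sensitive and literal, so a text such as "404 Error" would not be caught by the entry "Error 404" and would need its own entry in the list. As far as I know, $in also accepts JavaScript regular expression objects such as /Fatal error/ in its array (but not $regex expressions), which might be an alternative to building one large pattern.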