Какие настройки мне нужно искать, чтобы снизить нагрузку на ЦП на сервере, который постоянно запрашивает много текстовых файлов. На сервере есть миллионы поддоменов, отсюда миллионы файлов robots.txt и ads.txt, которые запрашиваются Google. Проблема в том, что всякий раз, когда Google забивает сервер, запрашивающий их (что бывает часто), загрузка ЦП возрастает с 5-10% до 40-80%. В системе 12 (общих) процессоров и 48 ГБ оперативной памяти. На самом деле это излишество (на сайте обычно 40-100 активных пользователей), но он мне нужен для обработки текстовых файлов при доступе, иначе сайт выйдет из строя с ошибками 5xx!
Сервер работает под управлением Ubuntu.
Моя текущая nginx конфигурация (/etc/nginx/nginx.conf) выглядит так:
user forge;
worker_processes auto;
pid /run/nginx.pid;
include /etc/nginx/modules-enabled/*.conf;
events {
worker_connections 12288;
multi_accept on;
}
http {
limit_req_zone $binary_remote_addr zone=one:10m rate=2r/s;
##
# Basic Settings
##
sendfile on;
tcp_nopush on;
tcp_nodelay on;
keepalive_timeout 65;
types_hash_max_size 2048;
# server_tokens off;
server_names_hash_bucket_size 128;
# server_name_in_redirect off;
include /etc/nginx/mime.types;
default_type application/octet-stream;
##
# SSL Settings
##
ssl_protocols TLSv1 TLSv1.1 TLSv1.2; # Dropping SSLv3, ref: POODLE
ssl_prefer_server_ciphers on;
##
# Logging Settings
##
access_log /var/log/nginx/access.log;
error_log /var/log/nginx/error.log;
##
# Gzip Settings
##
gzip on;
# gzip_vary on;
# gzip_proxied any;
# gzip_comp_level 6;
# gzip_buffers 16 8k;
# gzip_http_version 1.1;
# gzip_types text/plain text/css application/json application/javascript text/xml application/xml application/xml+rss text/javascript;
##
# Virtual Host Configs
##
include /etc/nginx/conf.d/*.conf;
include /etc/nginx/sites-enabled/*;
}
И site-specifici c config, который загружается через конфигурацию выше:
server {
listen 80;
listen [::]:80;
listen 443 ssl http2;
listen [::]:443 ssl http2;
server_name www.example.com;
return 301 $scheme://example.com$request_uri;
}
# Get the request_uri without the query string
map $http_host$request_uri $pageCache {
default "nonexistent";
"~^(?<subdomain1>.{4})(?<subdomain2>.*)\.example\.com(?<folder1>.*?)\/?(\?.*)?$" /storage/page-cache/subdomains/$subdomain1/$subdomain1$subdomain2$folder1/1.html;
"~^example\.com(?<folder1>.*?)\/?(\?.*)?$" /storage/page-cache$folder1/1.html;
}
server {
listen 80;
listen [::]:80;
listen 443 ssl http2;
listen [::]:443 ssl http2;
server_name .example.com;
root /home/forge/example.com/public;
# Block Bad Bots (more intense)
if ($http_user_agent ~* (bingbot|360Spider|80legs.com|Abonti|AcoonBot|Acunetix|adbeat_bot|AddThis.com|adidxbot|ADmantX|AhrefsBot|AngloINFO|Antelope|Applebot|BaiduSpider|BeetleBot|billigerbot|binlar|bitlybot|BlackWidow|BLP_bbot|BoardReader|Bolt\ 0|BOT\ for\ JCE|Bot\ mailto\:craftbot@yahoo\.com|casper|CazoodleBot|CCBot|checkprivacy|ChinaClaw|chromeframe|Clerkbot|Cliqzbot|clshttp|CommonCrawler|comodo|CPython|crawler4j|Crawlera|CRAZYWEBCRAWLER|Curious|Curl|Custo|CWS_proxy|Default\ Browser\ 0|diavol|DigExt|Digincore|DIIbot|discobot|DISCo|DoCoMo|DotBot|Download\ Demon|DTS.Agent|EasouSpider|eCatch|ecxi|EirGrabber|Elmer|EmailCollector|EmailSiphon|EmailWolf|Exabot|ExaleadCloudView|ExpertSearchSpider|ExpertSearch|Express\ WebPictures|ExtractorPro|extract|EyeNetIE|Ezooms|F2S|FastSeek|feedfinder|FeedlyBot|FHscan|finbot|Flamingo_SearchEngine|FlappyBot|FlashGet|flicky|Flipboard|g00g1e|Genieo|genieo|GetRight|GetWeb\!|GigablastOpenSource|GozaikBot|Go\!Zilla|Go\-Ahead\-Got\-It|GrabNet|grab|Grafula|GrapeshotCrawler|GTB5|GT\:\:WWW|Guzzle|harvest|heritrix|HMView|HomePageBot|HTTP\:\:Lite|HTTrack|HubSpot|ia_archiver|icarus6|IDBot|id\-search|IlseBot|Image\ Stripper|Image\ Sucker|Indigonet|Indy\ Library|integromedb|InterGET|InternetSeer\.com|Internet\ Ninja|IRLbot|ISC\ Systems\ iRc\ Search\ 2\.1|jakarta|Java|JetCar|JobdiggerSpider|JOC\ Web\ Spider|Jooblebot|kanagawa|KINGSpider|kmccrew|larbin|LeechFTP|libwww|Lingewoud|LinkChecker|linkdexbot|LinksCrawler|LinksManager\.com_bot|linkwalker|LinqiaRSSBot|LivelapBot|ltx71|LubbersBot|lwp\-trivial|Mail.RU_Bot|masscan|Mass\ Downloader|maverick|Maxthon$|Mediatoolkitbot|MegaIndex|MegaIndex|megaindex|MFC_Tear_Sample|Microsoft\ URL\ Control|microsoft\.url|MIDown\ tool|miner|Missigua\ Locator|Mister\ PiX|mj12bot|Mozilla.*Indy|Mozilla.*NEWT|MSFrontPage|Navroad|NearSite|NetAnts|netEstate|NetSpider|NetZIP|Net\ Vampire|NextGenSearchBot|nutch|Octopus|Offline\ Explorer|Offline\ Navigator|OpenindexSpider|OpenWebSpider|OrangeBot|Owlin|PageGrabber|PagesInventory|panopta|panscient\.com|Papa\ Foto|pavuk|pcBrowser|PECL\:\:HTTP|PeoplePal|Photon|PHPCrawl|planetwork|PleaseCrawl|PNAMAIN.EXE|PodcastPartyBot|prijsbest|proximic|psbot|purebot|pycurl|QuerySeekerSpider|R6_CommentReader|R6_FeedFetcher|RealDownload|ReGet|Riddler|Rippers\ 0|rogerbot|RSSingBot|rv\:1.9.1|RyzeCrawler|SafeSearch|SBIder|Scrapy|Scrapy|Screaming|SeaMonkey$|search.goo.ne.jp|SearchmetricsBot|search_robot|SemrushBot|Semrush|SentiBot|SEOkicks|SeznamBot|ShowyouBot|SightupBot|SISTRIX|sitecheck\.internetseer\.com|siteexplorer.info|SiteSnagger|skygrid|Slackbot|Slurp|SmartDownload|Snoopy|Sogou|Sosospider|spaumbot|Steeler|sucker|SuperBot|Superfeedr|SuperHTTP|SurdotlyBot|Surfbot|tAkeOut|Teleport\ Pro|TinEye-bot|TinEye|Toata\ dragostea\ mea\ pentru\ diavola|Toplistbot|trendictionbot|TurnitinBot|turnit|Twitterbot|URI\:\:Fetch|urllib|Vagabondo|Vagabondo|vikspider|VoidEYE|VoilaBot|WBSearchBot|webalta|WebAuto|WebBandit|WebCollage|WebCopier|WebFetch|WebGo\ IS|WebLeacher|WebReaper|WebSauger|subdomain\ eXtractor|subdomain\ Quester|WebStripper|WebWhacker|WebZIP|Web\ Image\ Collector|Web\ Sucker|Wells\ Search\ II|WEP\ Search|WeSEE|Wget|Widow|WinInet|woobot|woopingbot|worldwebheritage.org|Wotbox|WPScan|WWWOFFLE|WWW\-Mechanize|Xaldon\ WebSpider|XoviBot|yacybot|YandexBot|Yandex|YisouSpider|zermelo|Zeus|zh-CN|ZmEu|ZumBot|ZyBorg) ) {
return 444;
}
# FORGE SSL (DO NOT REMOVE!)
ssl_certificate /etc/nginx/ssl/example.com/123/server.crt;
ssl_certificate_key /etc/nginx/ssl/example.com/123/server.key;
ssl_protocols TLSv1.2;
ssl_ciphers BLAHBLAH;
ssl_prefer_server_ciphers on;
ssl_dhparam /etc/nginx/dhparams.pem;
add_header X-Frame-Options "SAMEORIGIN";
add_header X-XSS-Protection "1; mode=block";
add_header X-Content-Type-Options "nosniff";
index index.html index.htm index.php;
charset utf-8;
# FORGE CONFIG (DO NOT REMOVE!)
include forge-conf/example.com/server/*;
location / {
limit_req zone=one burst=10 nodelay;
if ($http_user_agent ~* "^.*wkhtmltoimage.*$"){
return 403;
}
try_files $pageCache $uri $uri/ /index.php?$query_string;
}
location = /favicon.ico {
access_log off;
log_not_found off;
root /home/forge/example.com/public;
}
location = /robots.txt {
access_log off;
log_not_found off;
#root /home/forge/example.com/public;
add_header Content-Type text/plain;
return 200 "#robots txt code here";
}
location = /ads.txt {
access_log off;
log_not_found off;
#root /home/forge/example.com/public;
add_header Content-Type text/plain;
return 200 "#adcodehere";
}
location ~* \.(js)$ {
expires 3d;
}
access_log off;
error_log /var/log/nginx/example.com-error.log error;
error_page 404 /index.php;
location ~ \.php$ {
fastcgi_split_path_info ^(.+\.php)(/.+)$;
fastcgi_pass unix:/var/run/php/php7.3-fpm.sock;
fastcgi_index index.php;
include fastcgi_params;
# Tweaks
fastcgi_buffers 8 16k; # increase the buffer size for PHP-FTP
fastcgi_buffer_size 32k; # increase the buffer size for PHP-FTP
fastcgi_connect_timeout 60;
fastcgi_send_timeout 300;
fastcgi_read_timeout 300;
}
location ~ /\.(?!well-known).* {
deny all;
}
}
# FORGE CONFIG (DO NOT REMOVE!)
include forge-conf/example.com/after/*;