I have an AWS Elasticsearch cluster with the following settings:
curl -s 'https://***..es.amazonaws.com/_cluster/settings' | jq
{
  "persistent": {
    "cluster": {
      "routing": {
        "allocation": {
          "cluster_concurrent_rebalance": "2",
          "node_concurrent_recoveries": "2",
          "disk": {
            "watermark": {
              "low": "15.0gb",
              "flood_stage": "5.0gb",
              "high": "10.0gb"
            }
          },
          "node_initial_primaries_recoveries": "4"
        }
      },
      "blocks": {
        "create_index": "false"
      }
    },
    "indices": {
      "recovery": {
        "max_bytes_per_sec": "40mb"
      }
    }
  },
  "transient": {
    "cluster": {
      "routing": {
        "allocation": {
          "cluster_concurrent_rebalance": "2",
          "node_concurrent_recoveries": "2",
          "disk": {
            "watermark": {
              "low": "15.0gb",
              "flood_stage": "5.0gb",
              "high": "10.0gb"
            }
          },
          "exclude": {
            "_ip": "10.212.32.62,10.212.31.186"
          },
          "node_initial_primaries_recoveries": "4",
          "awareness": {}
        }
      }
    },
    "indices": {
      "recovery": {
        "max_bytes_per_sec": "40mb"
      }
    }
  }
}
The cluster health endpoint returns:
curl -s 'https://***..es.amazonaws.com/_cluster/health?pretty' | jq
{
  "cluster_name": "***",
  "status": "red",
  "timed_out": false,
  "number_of_nodes": 13,
  "number_of_data_nodes": 10,
  "active_primary_shards": 3116,
  "active_shards": 3562,
  "relocating_shards": 0,
  "initializing_shards": 16,
  "unassigned_shards": 9214,
  "delayed_unassigned_shards": 0,
  "number_of_pending_tasks": 83,
  "number_of_in_flight_fetch": 1974,
  "task_max_waiting_in_queue_millis": 49831498,
  "active_shards_percent_as_number": 27.84552845528455
}
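As a sanity check on these numbers, here is a small sketch that recomputes the active-shard percentage and the average shard count per data node. It assumes the percentage is active shards divided by active + initializing + unassigned shards, which matches the value reported above; `shard_summary` is a hypothetical helper name, not part of any Elasticsearch tooling.

```shell
#!/bin/sh
# Hypothetical helper (not an Elasticsearch tool).
# Args: active_shards initializing_shards unassigned_shards number_of_data_nodes
# Assumption: total shards = active + initializing + unassigned.
shard_summary() {
  awk -v a="$1" -v i="$2" -v u="$3" -v n="$4" 'BEGIN {
    total = a + i + u
    printf "total_shards=%d active_pct=%.2f shards_per_data_node=%.1f\n",
           total, 100 * a / total, total / n
  }'
}

# Values copied from the _cluster/health output above.
shard_summary 3562 16 9214 10
```

With the values above this prints `total_shards=12792 active_pct=27.85 shards_per_data_node=1279.2`, so the cluster would be carrying roughly 1,279 shards per data node if everything were assigned.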
Digging further, I get:
curl -s 'https://***..es.amazonaws.com/_nodes/stats' | jq
{
  "_nodes": {
    "total": 13,
    "successful": 12,
    "failed": 1,
    "failures": [
      {
        "type": "failed_node_exception",
        "reason": "Failed node [o3Fb21UVQx2rwwm2ZiVu7w]",
        "caused_by": {
          "type": "exception",
          "reason": "failed to refresh store stats",
          "caused_by": {
            "type": "file_system_exception",
            "reason": "/hdd1/mnt/env/root/apollo/env/swift-eu-west-1-prod-ES_6_3AMI-ES2-p001/var/es/data/nodes/0/indices/wvMTt2eiSfSDFVQbuGDEeQ/1/index: Too many open files"
          }
        }
      }
    ]
  },
And the open file descriptors per node (fdc = current, fdm = max):
curl -s -XGET 'https://***..es.amazonaws.com/_cat/nodes?v&h=ip,fdc,fdm'
ip fdc fdm
x.x.x.x 70014 128000
x.x.x.x 950 128000
x.x.x.x 915 128000
x.x.x.x 949 128000
x.x.x.x 950 128000
x.x.x.x 954 128000
x.x.x.x 9124 128000
x.x.x.x
x.x.x.x 36916 128000
x.x.x.x 951 128000
x.x.x.x 919 128000
x.x.x.x 948 128000
x.x.x.x 950 128000
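To make the table easier to read, here is a minimal sketch that turns the `fdc`/`fdm` columns into a usage percentage per node. `fd_usage` is a hypothetical helper name; it assumes the three-column layout shown above and marks rows with missing columns (like the node that returned no values) as `no-data`.

```shell
#!/bin/sh
# Hypothetical helper: reads `_cat/nodes?v&h=ip,fdc,fdm` output on stdin and
# prints "<ip> <fdc>/<fdm> <percent used>" for each node.
fd_usage() {
  awk '
    NR == 1 { next }                       # skip the header row
    NF == 0 { next }                       # skip blank lines
    NF < 3  { print $1, "no-data"; next }  # node reported no fd stats
    { printf "%s %d/%d %.1f%%\n", $1, $2, $3, 100 * $2 / $3 }
  '
}

# Sample rows copied from the table above; in practice, pipe the curl output in.
printf 'ip fdc fdm\nx.x.x.x 70014 128000\nx.x.x.x\nx.x.x.x 950 128000\n' | fd_usage
```

On the figures above, the busiest node that reports data sits at 70014/128000 (about 54.7%), and one node reports nothing at all, presumably the one that failed in `_nodes/stats`.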
Any advice on how to resolve this would be highly appreciated.