Агрегация ElasticSearch - группировка и фильтрация максимального количества этих документов и группировка снова - PullRequest
0 голосов
/ 26 февраля 2020

Я пытаюсь отправить запрос вasticsearch (ES), чтобы выполнить для меня вычисление с условием, следующим за данными примера,

{user_id: 1, score: 10, status: passed, date: 2020-02-26},
{user_id: 1, score: 10, status: passed, date: 2020-02-25},
{user_id: 1, score: 8, status: passed, date: 2020-02-25},
{user_id: 1, score: 4, status: failed, date: 2020-02-27},
{user_id: 2, score: 5, status: passed, date: 2020-02-26},
{user_id: 2, score: 6, status: passed, date: 2020-02-25},
{user_id: 3, score: 1, status: failed, date: 2020-02-25},
{user_id: 3, score: 1, status: failed, date: 2020-02-25},
{user_id: 4, score: 7, status: passed, date: 2020-02-25}
{user_id: 4, score: 2, status: failed, date: 2020-02-26}
{user_id: 5, score: 0, status: failed, date: 2020-02-25}

Я хотел бы получить максимальную оценку (не оценку es ) каждого user_id для подсчета количества статусов в каждой продолжительности (снова группировать по статусам) или шаг: сгруппировать по user_id получить рекордную оценку и сгруппировать по этим статусам

пример вывода:

{
"passed":3
"failed":2
"date": 2020-02-25
},
{
"passed":2
"failed":1
"date": 2020-02-26
},
{
"passed":0
"failed":1
"date": 2020-02-27
}

даже, я играю с SQL все еще не легко решить, потому что я не могу получить row_number записи, что результат может быть неправильным (пожалуйста, исправьте меня, если я ошибаюсь)

с SQL Это как

// this nested query give a wrong answer, just to be an example
// I still playing with it 
select user_id, status from table t1 inner join (
select user_id, max(score) 
from table
where date = '2020-02-25'
group by user_id, max(score)
) table t2 on t1.user_id = t2.user_id and t1.score = t2.score 
group by t1.status

причина, я хочу сделать это, потому что моя работа хранит данные в es, и я думаю, что было бы хорошо, если бы es мог сделать это также чтобы снизить нагрузку на производительность серверной части.

Заранее спасибо и поэтому открыты для любых предложений

1 Ответ

0 голосов
/ 27 февраля 2020

Я использовал термин агрегация , чтобы получить количество документов по дате и статусу. Дайте мне знать, если это работает для вас.

Пример отображения

PUT testindex13
{
  "mappings": {
    "properties": {
      "userid": {
        "type": "keyword"
      },
      "score": {
        "type": "integer"
      },
      "status": {
        "type": "text",
        "fields": {
          "keyword": {
            "type": "keyword"
          }
        }
      },
      "date":{
        "type": "date",
        "format": "yyyy-MM-dd"
      }
    }
  }
}

Данные:

"hits" : [
      {
        "_index" : "testindex13",
        "_type" : "_doc",
        "_id" : "9XtAhXABqQAB2FLojE2i",
        "_score" : 1.0,
        "_source" : {
          "user_id" : 1,
          "score" : 10,
          "status" : "passed",
          "date" : "2020-02-26"
        }
      },
      {
        "_index" : "testindex13",
        "_type" : "_doc",
        "_id" : "9ntAhXABqQAB2FLojU0l",
        "_score" : 1.0,
        "_source" : {
          "user_id" : 1,
          "score" : 10,
          "status" : "passed",
          "date" : "2020-02-25"
        }
      },
      {
        "_index" : "testindex13",
        "_type" : "_doc",
        "_id" : "93tAhXABqQAB2FLojk2P",
        "_score" : 1.0,
        "_source" : {
          "user_id" : 1,
          "score" : 8,
          "status" : "passed",
          "date" : "2020-02-25"
        }
      },
      {
        "_index" : "testindex13",
        "_type" : "_doc",
        "_id" : "-HtAhXABqQAB2FLoj02H",
        "_score" : 1.0,
        "_source" : {
          "user_id" : 1,
          "score" : 4,
          "status" : "failed",
          "date" : "2020-02-27"
        }
      },
      {
        "_index" : "testindex13",
        "_type" : "_doc",
        "_id" : "-XtAhXABqQAB2FLokE0m",
        "_score" : 1.0,
        "_source" : {
          "user_id" : 2,
          "score" : 5,
          "status" : "passed",
          "date" : "2020-02-26"
        }
      },
      {
        "_index" : "testindex13",
        "_type" : "_doc",
        "_id" : "-ntAhXABqQAB2FLokU2x",
        "_score" : 1.0,
        "_source" : {
          "user_id" : 2,
          "score" : 6,
          "status" : "passed",
          "date" : "2020-02-25"
        }
      },
      {
        "_index" : "testindex13",
        "_type" : "_doc",
        "_id" : "-3tEhXABqQAB2FLoS00O",
        "_score" : 1.0,
        "_source" : {
          "user_id" : 1,
          "score" : 10,
          "status" : "passed",
          "date" : "2020-02-26"
        }
      },
      {
        "_index" : "testindex13",
        "_type" : "_doc",
        "_id" : "_HtEhXABqQAB2FLoTE0W",
        "_score" : 1.0,
        "_source" : {
          "user_id" : 3,
          "score" : 1,
          "status" : "failed",
          "date" : "2020-02-25"
        }
      },
      {
        "_index" : "testindex13",
        "_type" : "_doc",
        "_id" : "_XtEhXABqQAB2FLoTE2L",
        "_score" : 1.0,
        "_source" : {
          "user_id" : 3,
          "score" : 1,
          "status" : "failed",
          "date" : "2020-02-25"
        }
      },
      {
        "_index" : "testindex13",
        "_type" : "_doc",
        "_id" : "_ntEhXABqQAB2FLoTU2P",
        "_score" : 1.0,
        "_source" : {
          "user_id" : 4,
          "score" : 7,
          "status" : "passed",
          "date" : "2020-02-25"
        }
      }
    ]

Запрос:

GET testindex13/_search
{
  "size": 0, 
  "aggs": {
    "Date": {
      "terms": {
        "field": "date"
      },
      "aggs": {
        "Status": {
          "terms": {
            "field": "status.keyword",
            "size": 10
          },
          "aggs": {
            "Count": {
              "value_count": {
                "field": "user_id"
              }
            }
          }
        }
      }
    }
  }
}

Результат:

  "hits" : {
    "total" : {
      "value" : 12,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "Date" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : 1582588800000,
          "key_as_string" : "2020-02-25",
          "doc_count" : 7,
          "Status" : {
            "doc_count_error_upper_bound" : 0,
            "sum_other_doc_count" : 0,
            "buckets" : [
              {
                "key" : "passed",
                "doc_count" : 4,
                "Count" : {
                  "value" : 4
                }
              },
              {
                "key" : "failed",
                "doc_count" : 3,
                "Count" : {
                  "value" : 3
                }
              }
            ]
          }
        },
        {
          "key" : 1582675200000,
          "key_as_string" : "2020-02-26",
          "doc_count" : 4,
          "Status" : {
            "doc_count_error_upper_bound" : 0,
            "sum_other_doc_count" : 0,
            "buckets" : [
              {
                "key" : "passed",
                "doc_count" : 3,
                "Count" : {
                  "value" : 3
                }
              },
              {
                "key" : "failed",
                "doc_count" : 1,
                "Count" : {
                  "value" : 1
                }
              }
            ]
          }
        },
        {
          "key" : 1582761600000,
          "key_as_string" : "2020-02-27",
          "doc_count" : 1,
          "Status" : {
            "doc_count_error_upper_bound" : 0,
            "sum_other_doc_count" : 0,
            "buckets" : [
              {
                "key" : "failed",
                "doc_count" : 1,
                "Count" : {
                  "value" : 1
                }
              }
            ]
          }
        }
      ]
    }
  }
}

РЕДАКТИРОВАТЬ 2. используя сортировку, чтобы вернуть наибольшее количество пользователей по количеству баллов

GET testindex13/_search
{
  "size": 0,
  "aggs": {
    "Date": {
      "terms": {
        "field": "date"
      },
      "aggs": {
        "Status": {
          "terms": {
            "field": "status.keyword",
            "size": 10
          },
          "aggs": {
            "Users":{
              "terms": {
                "field": "user_id",
                "size": 10
              },
              "aggs": {
                "TopDocuments": {
                  "top_hits": {
                    "size": 1,
                    "sort": {"score":{"order" : "desc"}}
                  }
                }
              }
            },
            "Count": {
              "value_count": {
                "field": "user_id"
              }
            }
          }
        }
      }
    }
  }
}

Результат:

 "aggregations" : {
    "Date" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : 1582588800000,
          "key_as_string" : "2020-02-25",
          "doc_count" : 7,
          "Status" : {
            "doc_count_error_upper_bound" : 0,
            "sum_other_doc_count" : 0,
            "buckets" : [
              {
                "key" : "passed",
                "doc_count" : 4,
                "Count" : {
                  "value" : 4
                },
                "Users" : {
                  "doc_count_error_upper_bound" : 0,
                  "sum_other_doc_count" : 0,
                  "buckets" : [
                    {
                      "key" : 1,
                      "doc_count" : 2,
                      "TopDocuments" : {
                        "hits" : {
                          "total" : {
                            "value" : 2,
                            "relation" : "eq"
                          },
                          "max_score" : null,
                          "hits" : [
                            {
                              "_index" : "testindex13",
                              "_type" : "_doc",
                              "_id" : "9ntAhXABqQAB2FLojU0l",
                              "_score" : null,
                              "_source" : {
                                "user_id" : 1,
                                "score" : 10,
                                "status" : "passed",
                                "date" : "2020-02-25"
                              },
                              "sort" : [
                                10
                              ]
                            }
                          ]
                        }
                      }
                    },
                    {
                      "key" : 2,
                      "doc_count" : 1,
                      "TopDocuments" : {
                        "hits" : {
                          "total" : {
                            "value" : 1,
                            "relation" : "eq"
                          },
                          "max_score" : null,
                          "hits" : [
                            {
                              "_index" : "testindex13",
                              "_type" : "_doc",
                              "_id" : "-ntAhXABqQAB2FLokU2x",
                              "_score" : null,
                              "_source" : {
                                "user_id" : 2,
                                "score" : 6,
                                "status" : "passed",
                                "date" : "2020-02-25"
                              },
                              "sort" : [
                                6
                              ]
                            }
                          ]
                        }
                      }
                    },
                    {
                      "key" : 4,
                      "doc_count" : 1,
                      "TopDocuments" : {
                        "hits" : {
                          "total" : {
                            "value" : 1,
                            "relation" : "eq"
                          },
                          "max_score" : null,
                          "hits" : [
                            {
                              "_index" : "testindex13",
                              "_type" : "_doc",
                              "_id" : "_ntEhXABqQAB2FLoTU2P",
                              "_score" : null,
                              "_source" : {
                                "user_id" : 4,
                                "score" : 7,
                                "status" : "passed",
                                "date" : "2020-02-25"
                              },
                              "sort" : [
                                7
                              ]
                            }
                          ]
                        }
                      }
                    }
                  ]
                }
              },
              {
                "key" : "failed",
                "doc_count" : 3,
                "Count" : {
                  "value" : 3
                },
                "Users" : {
                  "doc_count_error_upper_bound" : 0,
                  "sum_other_doc_count" : 0,
                  "buckets" : [
                    {
                      "key" : 3,
                      "doc_count" : 2,
                      "TopDocuments" : {
                        "hits" : {
                          "total" : {
                            "value" : 2,
                            "relation" : "eq"
                          },
                          "max_score" : null,
                          "hits" : [
                            {
                              "_index" : "testindex13",
                              "_type" : "_doc",
                              "_id" : "_HtEhXABqQAB2FLoTE0W",
                              "_score" : null,
                              "_source" : {
                                "user_id" : 3,
                                "score" : 1,
                                "status" : "failed",
                                "date" : "2020-02-25"
                              },
                              "sort" : [
                                1
                              ]
                            }
                          ]
                        }
                      }
                    },
                    {
                      "key" : 5,
                      "doc_count" : 1,
                      "TopDocuments" : {
                        "hits" : {
                          "total" : {
                            "value" : 1,
                            "relation" : "eq"
                          },
                          "max_score" : null,
                          "hits" : [
                            {
                              "_index" : "testindex13",
                              "_type" : "_doc",
                              "_id" : "AHtEhXABqQAB2FLoT04D",
                              "_score" : null,
                              "_source" : {
                                "user_id" : 5,
                                "score" : 0,
                                "status" : "failed",
                                "date" : "2020-02-25"
                              },
                              "sort" : [
                                0
                              ]
                            }
                          ]
                        }
                      }
                    }
                  ]
                }
              }
            ]
          }
        },
        {
          "key" : 1582675200000,
          "key_as_string" : "2020-02-26",
          "doc_count" : 4,
          "Status" : {
            "doc_count_error_upper_bound" : 0,
            "sum_other_doc_count" : 0,
            "buckets" : [
              {
                "key" : "passed",
                "doc_count" : 3,
                "Count" : {
                  "value" : 3
                },
                "Users" : {
                  "doc_count_error_upper_bound" : 0,
                  "sum_other_doc_count" : 0,
                  "buckets" : [
                    {
                      "key" : 1,
                      "doc_count" : 2,
                      "TopDocuments" : {
                        "hits" : {
                          "total" : {
                            "value" : 2,
                            "relation" : "eq"
                          },
                          "max_score" : null,
                          "hits" : [
                            {
                              "_index" : "testindex13",
                              "_type" : "_doc",
                              "_id" : "9XtAhXABqQAB2FLojE2i",
                              "_score" : null,
                              "_source" : {
                                "user_id" : 1,
                                "score" : 10,
                                "status" : "passed",
                                "date" : "2020-02-26"
                              },
                              "sort" : [
                                10
                              ]
                            }
                          ]
                        }
                      }
                    },
                    {
                      "key" : 2,
                      "doc_count" : 1,
                      "TopDocuments" : {
                        "hits" : {
                          "total" : {
                            "value" : 1,
                            "relation" : "eq"
                          },
                          "max_score" : null,
                          "hits" : [
                            {
                              "_index" : "testindex13",
                              "_type" : "_doc",
                              "_id" : "-XtAhXABqQAB2FLokE0m",
                              "_score" : null,
                              "_source" : {
                                "user_id" : 2,
                                "score" : 5,
                                "status" : "passed",
                                "date" : "2020-02-26"
                              },
                              "sort" : [
                                5
                              ]
                            }
                          ]
                        }
                      }
                    }
                  ]
                }
              },
              {
                "key" : "failed",
                "doc_count" : 1,
                "Count" : {
                  "value" : 1
                },
                "Users" : {
                  "doc_count_error_upper_bound" : 0,
                  "sum_other_doc_count" : 0,
                  "buckets" : [
                    {
                      "key" : 4,
                      "doc_count" : 1,
                      "TopDocuments" : {
                        "hits" : {
                          "total" : {
                            "value" : 1,
                            "relation" : "eq"
                          },
                          "max_score" : null,
                          "hits" : [
                            {
                              "_index" : "testindex13",
                              "_type" : "_doc",
                              "_id" : "_3tEhXABqQAB2FLoTU3v",
                              "_score" : null,
                              "_source" : {
                                "user_id" : 4,
                                "score" : 2,
                                "status" : "failed",
                                "date" : "2020-02-26"
                              },
                              "sort" : [
                                2
                              ]
                            }
                          ]
                        }
                      }
                    }
                  ]
                }
              }
            ]
          }
        },
        {
          "key" : 1582761600000,
          "key_as_string" : "2020-02-27",
          "doc_count" : 1,
          "Status" : {
            "doc_count_error_upper_bound" : 0,
            "sum_other_doc_count" : 0,
            "buckets" : [
              {
                "key" : "failed",
                "doc_count" : 1,
                "Count" : {
                  "value" : 1
                },
                "Users" : {
                  "doc_count_error_upper_bound" : 0,
                  "sum_other_doc_count" : 0,
                  "buckets" : [
                    {
                      "key" : 1,
                      "doc_count" : 1,
                      "TopDocuments" : {
                        "hits" : {
                          "total" : {
                            "value" : 1,
                            "relation" : "eq"
                          },
                          "max_score" : null,
                          "hits" : [
                            {
                              "_index" : "testindex13",
                              "_type" : "_doc",
                              "_id" : "-HtAhXABqQAB2FLoj02H",
                              "_score" : null,
                              "_source" : {
                                "user_id" : 1,
                                "score" : 4,
                                "status" : "failed",
                                "date" : "2020-02-27"
                              },
                              "sort" : [
                                4
                              ]
                            }
                          ]
                        }
                      }
                    }
                  ]
                }
              }
            ]
          }
        }
      ]
    }
  }
...