Extract multiple fields using scalars in jq
2 votes
/ March 21, 2019

I am trying to select only certain fields, along with their full paths, from a JSON file (the results come from Elasticsearch).

My JSON file:

{
  "_index": "ships",
  "_type": "doc",
  "_id": "c36806c10a96a3968c07c6a222cfc818",
  "_score": 0.057158414,
  "_source": {
    "user_email": "admin@example.com",
    "current_send_date": 1552557382,
    "next_send_date": 1570798063,
    "data_name": "atari",
    "statistics": {
      "game_mode": "engineer",
      "opened_game": 0,
      "user_score": 0,
      "space_1": {
        "ship_send_priority": 10,
        "ssl_required": "true",
        "ship_send_delay": 15,
        "user_score": 0,
        "template1": {
          "current_ship_status": "sent",
          "current_ship_date": "4324242",
          "checked_link_before_clicked": 0
        },
        "template2": {
          "current_ship_status": "sent",
          "current_ship_date": "4324242",
          "checked_payload": 0
        }
      }
    }
  }
}

I flatten the keys with a one-liner:

<file jq -c 'paths(scalars) as $p | [$p, getpath($p)]'
[["_index"],"ships"]
[["_type"],"doc"]
[["_id"],"c36806c10a96a3968c07c6a222cfc818"]
[["_score"],0.057158414]
[["_source","user_email"],"admin@example.com"]
[["_source","current_send_date"],1552557382]
[["_source","next_send_date"],1570798063]
[["_source","data_name"],"atari"]
[["_source","statistics","game_mode"],"engineer"]
[["_source","statistics","opened_game"],0]
[["_source","statistics","user_score"],0]
[["_source","statistics","space_1","ship_send_priority"],10]
[["_source","statistics","space_1","ssl_required"],"true"]
[["_source","statistics","space_1","ship_send_delay"],15]
[["_source","statistics","space_1","user_score"],0]
[["_source","statistics","space_1","template1","current_ship_status"],"sent"]
[["_source","statistics","space_1","template1","current_ship_date"],"4324242"]
[["_source","statistics","space_1","template1","checked_link_before_clicked"],0]
[["_source","statistics","space_1","template2","current_ship_status"],"sent"]
[["_source","statistics","space_1","template2","current_ship_date"],"4324242"]
[["_source","statistics","space_1","template2","checked_payload"],0]

Then I pipe the output to grep to extract all the fields I want:

<file jq -c 'paths(scalars) as $p | [$p, getpath($p)]'  | grep -e '"_index"\|current_send_date\|current_send_date\|ship_send_delay\|ship_send_priority\|current_ship_status'
[["_index"],"ships"]
[["_source","current_send_date"],1552557382]
[["_source","statistics","space_1","ship_send_priority"],10]
[["_source","statistics","space_1","ship_send_delay"],15]
[["_source","statistics","space_1","template1","current_ship_status"],"sent"]
[["_source","statistics","space_1","template2","current_ship_status"],"sent"]

Finally, I pipe the grep output to sed to strip the characters I do not need, which gives the result I want:

<file jq -c 'paths(scalars) as $p | [$p, getpath($p)]'  | grep -e '"_index"\|current_send_date\|current_send_date\|ship_send_delay\|ship_send_priority\|current_ship_status' | sed -e 's/\[\["//g' -e 's/","/./g' -e 's/"],"/=/g' -e 's/"],/=/g' -e 's/]$//g' -e 's/"$//g'

_index=ships
_source.current_send_date=1552557382
_source.statistics.space_1.ship_send_priority=10
_source.statistics.space_1.ship_send_delay=15
_source.statistics.space_1.template1.current_ship_status=sent
_source.statistics.space_1.template2.current_ship_status=sent

I am looking for a better way to at least extract the fields inside jq, without using grep. I can live with post-processing the content with sed, but I feel there must be a better way to pick the fields I want without grep. I assume there is some kind of select(.mykey | .mykey1 | .mykey2) that can do this.

1 Answer

2 votes
/ March 21, 2019

Use join and string interpolation (\(...)):

$ jq -r 'paths(scalars) as $p | "\($p|join("."))=\(getpath($p))"' file
_index=ships
_type=doc
_id=c36806c10a96a3968c07c6a222cfc818
_score=0.057158414
_source.user_email=admin@example.com
_source.current_send_date=1552557382
_source.next_send_date=1570798063
_source.data_name=atari
_source.statistics.game_mode=engineer
_source.statistics.opened_game=0
_source.statistics.user_score=0
_source.statistics.space_1.ship_send_priority=10
_source.statistics.space_1.ssl_required=true
_source.statistics.space_1.ship_send_delay=15
_source.statistics.space_1.user_score=0
_source.statistics.space_1.template1.current_ship_status=sent
_source.statistics.space_1.template1.current_ship_date=4324242
_source.statistics.space_1.template1.checked_link_before_clicked=0
_source.statistics.space_1.template2.current_ship_status=sent
_source.statistics.space_1.template2.current_ship_date=4324242
_source.statistics.space_1.template2.checked_payload=0
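
To make the two pieces easier to see, here is a minimal sketch on a tiny inline document. The document in `doc` and the function name `flatten_json` are made up for illustration and are not part of the question. `paths(scalars)` emits each leaf path as an array, `join(".")` turns that array into a dotted key, and `\(...)` interpolates both the key and the value into one output string.

```shell
# Hypothetical three-leaf document, used only for this illustration.
doc='{"a":{"b":1,"c":"x"},"d":"y"}'

flatten_json() {
  # paths(scalars) yields ["a","b"], ["a","c"], ["d"];
  # join(".") builds "a.b", and getpath($p) looks up the value at that path.
  jq -r 'paths(scalars) as $p | "\($p | join("."))=\(getpath($p))"'
}

printf '%s\n' "$doc" | flatten_json
# a.b=1
# a.c=x
# d=y
```

The -r flag matters here: without it jq would print each line as a quoted JSON string.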

In fact, you do not even need grep if you have a recent version of jq (IN was added in jq 1.6); try this:

(paths(scalars) | select(IN(.[];
    "_index",
    "current_send_date",
    "ship_send_delay",
    "ship_send_priority",
    "current_ship_status"
))) as $p | "\($p|join("."))=\(getpath($p))"
...
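
As a complete command, the same idea might look like the sketch below. It assumes jq 1.6 or later (for IN), runs on a trimmed-down copy of the question's document rather than the real file, and passes the key names in through --argjson. The names `doc`, `wanted` and `select_fields` are only illustrative.

```shell
# Hypothetical sample: a shortened version of the document from the question.
doc='{"_index":"ships","_source":{"current_send_date":1552557382,"next_send_date":1570798063,"statistics":{"space_1":{"ship_send_delay":15,"template1":{"current_ship_status":"sent","current_ship_date":"4324242"}}}}}'

# Key names to keep, as a JSON array (illustrative variable name).
wanted='["_index","current_send_date","ship_send_delay","current_ship_status"]'

select_fields() {
  # Keep a leaf path if any of its components equals one of the wanted names,
  # then print it as dotted.path=value.
  jq -r --argjson wanted "$wanted" '
    (paths(scalars) | select(IN(.[]; $wanted[]))) as $p
    | "\($p | join("."))=\(getpath($p))"
  '
}

printf '%s\n' "$doc" | select_fields
# _index=ships
# _source.current_send_date=1552557382
# _source.statistics.space_1.ship_send_delay=15
# _source.statistics.space_1.template1.current_ship_status=sent
```

Note that IN(.[]; ...) tests every component of the path, not only the last one, so listing an intermediate key such as "space_1" would keep every leaf below it. Unlike the grep version, the match is on whole key names, so a name like next_send_date is not picked up by accident.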