Replace multiple strings in a file, tagging them - PullRequest
0 votes
/ September 27, 2018

I would like to replace several strings in a file, for example IP addresses, and tag them so that every repeated occurrence gets the same name.

For example, if this is my file:

2018-09-13 19:00:00,317 INFO  -util.SSHUtil: Waiting for channel close
2018-09-13 19:00:01,317 INFO  -util.SSHUtil: Waiting for channel close
2018-09-13 19:00:01,891 INFO  -filters.BasicAuthFilter: Client IP:192.168.100.98
2018-09-13 19:00:01,891 INFO  -filters.BasicAuthFilter: Validating token ... 
2018-09-13 19:00:01,892 INFO  -authentication.Tokenization: Token:192.168.100.98:20180913_183401is present in map
2018-09-13 19:00:01,892 INFO  -configure.ConfigStatusCollector: status.
2018-09-13 19:00:01,909 INFO  -filters.BasicAuthFilter: Client IP:192.168.100.98
2018-09-13 19:00:01,909 INFO  -filters.BasicAuthFilter: Validating token ... 
2018-09-13 19:00:01,910 INFO  -authentication.Tokenization: Token:192.168.100.98:20180913_183401is present in map
2018-09-13 19:00:01,910 INFO  -restadapter.ConfigStatusService: configuration status.
2018-09-13 19:00:01,910 INFO  -configure.Collector: Getting configuration status.
2018-09-13 19:00:02,318 INFO  -util.SSHUtil: Processing the ssh command execution results standard output.
2018-09-13 19:00:02,318 INFO  -util.SSHUtil: Processing the ssh command execution standard error.
2018-09-13 19:00:02,318 INFO  -util.SSHUtil: Remote command using SSH execution status: Host     : [10.2.251.129]   User     : [root]   Password : [***********]    Command  : [shell ntpdate -u 132.132.0.88]  STATUS   : [0]
2018-09-13 19:00:02,318 INFO  -util.SSHUtil:    STDOUT   : [Shell access is granted to root
            14 Sep 01:00:01 ntpdate[16063]: adjust time server 132.132.0.88 offset 0.353427 sec
]
2018-09-13 19:00:02,318 INFO  -util.SSHUtil:    STDERR   : []
2018-09-13 19:00:02,318 INFO  -util.SSHUtil: Successfully executed remote command using SSH.
2018-09-13 19:00:02,318 INFO  Successfully executed the command on VCenter :10.2.251.129

It should become:

2018-09-13 19:00:00,317 INFO  -util.SSHUtil: Waiting for channel close
2018-09-13 19:00:01,317 INFO  -util.SSHUtil: Waiting for channel close
2018-09-13 19:00:01,891 INFO  -filters.BasicAuthFilter: Client IP:IP_1
2018-09-13 19:00:01,891 INFO  -filters.BasicAuthFilter: Validating token ... 
2018-09-13 19:00:01,892 INFO  -authentication.Tokenization: Token:IP_1:20180913_183401is present in map
2018-09-13 19:00:01,892 INFO  -configure.ConfigStatusCollector: status.
2018-09-13 19:00:01,909 INFO  -filters.BasicAuthFilter: Client IP:IP_1
2018-09-13 19:00:01,909 INFO  -filters.BasicAuthFilter: Validating token ... 
2018-09-13 19:00:01,910 INFO  -authentication.Tokenization: Token:IP_1:20180913_183401is present in map
2018-09-13 19:00:01,910 INFO  -restadapter.ConfigStatusService: configuration status.
2018-09-13 19:00:01,910 INFO  -configure.Collector: Getting configuration status.
2018-09-13 19:00:02,318 INFO  -util.SSHUtil: Processing the ssh command execution results standard output.
2018-09-13 19:00:02,318 INFO  -util.SSHUtil: Processing the ssh command execution standard error.
2018-09-13 19:00:02,318 INFO  -util.SSHUtil: Remote command using SSH execution status: Host     : [IP_2]   User     : [root]   Password : [***********]    Command  : [shell ntpdate -u IP_3]  STATUS   : [0]
2018-09-13 19:00:02,318 INFO  -util.SSHUtil:    STDOUT   : [Shell access is granted to root
        14 Sep 01:00:01 ntpdate[16063]: adjust time server IP_3 offset 0.353427 sec]
2018-09-13 19:00:02,318 INFO  -util.SSHUtil:    STDERR   : []
2018-09-13 19:00:02,318 INFO  -util.SSHUtil: Successfully executed remote command using SSH.
2018-09-13 19:00:02,318 INFO  Successfully executed the command on VCenter :IP_2

The script below actually does what I want, but it is specific to one file:

import typing, re
def change_ips(ips:typing.List[str]) -> typing.Generator[str, None, None]:
   val = {}
   count = 1
   for i in ips:
     if i not in val:
       yield f'IP_{count}'
       val[i] = count
       count += 1
     else:
       yield f'IP_{val[i]}'


with open(r'server.log') as f:
  content = f.read()
  with open(r'logfile2.txt', 'w') as f1:

    f1.write(re.sub(r'\d+\.\d+\.\d+\.\d+', '{}', content).format(*change_ips(re.findall(r'\d+\.\d+\.\d+\.\d+', content))))

This works, but it is specific to that one file and does not work with other log files. I would like it to handle any file that has an IP address on any line, not just one particular log file.

An example where it does not work:

2018-09-15 15:58:20,083 INFO  [Timer-0]-util.SSHUtil:   STDERR   : []
2018-09-15 15:58:20,083 INFO  [Timer-0]-util.SSHUtil: Successfully executed remote command using SSH.
2018-09-15 15:58:20,083 INFO  [Timer-0]-dashboard.KBDash: getProcessSummary -->  processing output line

2018-09-15 15:58:20,083 INFO  [Timer-0]-dashboard.KBDash: getProcessSummary -->  processing output line
---------------------------------------------------------------------
2018-09-15 15:58:20,083 INFO  [Timer-0]-dashboard.KBDash: getProcessSummary -->  processing output line
Validate [33mKBDash2121 Node[0m installation BEGIN:
2018-09-15 15:58:20,083 INFO  [Timer-0]-dashboard.KBDash: getProcessSummary -->  processing output line
Show KBDash2121 system configuration:  [33m1.1.2.371[0m
2018-09-15 15:58:20,083 INFO  [Timer-0]-dashboard.KBDash: getProcessSummary -->  processing output line
*****************************************************************
2018-09-15 15:58:20,090 INFO  [Timer-0]-util.SSHUtil: Connecting to host [10.60.9.44] using provided credentials.
2018-09-15 15:58:20,083 INFO  [Timer-0]-dashboard.KBDash: getProcessSummary -->  processing output line
    "cis_url"               : "https://localhost:441/cis/v1.1",
2018-09-15 15:58:20,083 INFO  [Timer-0]-dashboard.KBDash: getProcessSummary -->  processing output line
    "app_name"              : "KBDash2121",
2018-09-15 15:58:20,083 INFO  [Timer-0]-dashboard.KBDash: getProcessSummary -->  processing output line
    "node_name"             : "idpa-1-dps",
2018-09-15 15:59:40,093 ERROR [Timer-0]-dashboard.DPSDashboard: Unable to validate ssh credential.Host 10.60.9.44 is not reachable.
2018-09-15 15:59:40,093 ERROR [Timer-0]-dashboard.DPSDashboard: loadDataNodeStatus --> unable to find data node process statuscom.common.exception.ApplianceException: Host 10.60.9.44 is not reachable.
2018-09-15 15:58:20,083 INFO  [Timer-0]-dashboard.KBDash: getProcessSummary -->  processing output line
    "system_index_name"     : "system",
2018-09-15 15:58:20,083 INFO  [Timer-0]-dashboard.KBDash: getProcessSummary -->  processing output line
    "worker_id"             : "aWRwYS0xLWRwc3wwMDo1MDo1Njo5RDoyRDo4RSA=",
2018-09-15 15:58:20,083 INFO  [Timer-0]-dashboard.KBDash: getProcessSummary -->  processing output line
    "work_base_folder": "/mnt/KBDash2121_work",
2018-09-15 15:58:20,083 INFO  [Timer-0]-dashboard.KBDash: getProcessSummary -->  processing output line
    "service_work_folder"                          : "tmp/dpworker",
2018-09-15 15:58:20,084 INFO  [Timer-0]-dashboard.KBDash: getProcessSummary -->  processing output line
    "web_download_folder"   : "tmp/dpweb",
2018-09-15 15:58:20,084 INFO  [Timer-0]-dashboard.KBDash: getProcessSummary -->  processing output line
    "admin_api_url"         : "https://localhost:448/admin_api/v1",
2018-09-15 15:58:20,084 INFO  [Timer-0]-dashboard.KBDash: getProcessSummary -->  processing output line
    "search_api_url"        : "https://localhost:449/search_api/v1",
2018-09-15 15:58:20,084 INFO  [Timer-0]-dashboard.KBDash: getProcessSummary -->  processing output line
*****************************************************************
2018-09-15 15:58:20,084 INFO  [Timer-0]-dashboard.KBDash: getProcessSummary -->  processing output line
[32mDirectory: /usr/local/KBDash2121 has been created [0m
2018-09-15 15:58:20,084 INFO  [Timer-0]-dashboard.KBDash: getProcessSummary -->  processing output line
[32mFile: /usr/local/KBDash2121/etc/system.conf has been created [0m
2018-09-15 15:58:20,084 INFO  [Timer-0]-dashboard.KBDash: getProcessSummary -->  processing output line
[32mService: dpworker is on[0m
2018-09-15 15:58:20,084 INFO  [Timer-0]-dashboard.KBDash: getProcessSummary -->  processing output line
[32mService: nginx is on[0m
2018-09-15 15:58:20,084 INFO  [Timer-0]-dashboard.KBDash: getProcessSummary -->  processing output line
[32mProccess: WorkerService is running[0m
2018-09-15 15:58:20,084 INFO  [Timer-0]-dashboard.KBDash: getProcessSummary -->  processing output line
[32mProccess: nginx is running[0m
2018-09-15 15:58:20,084 INFO  [Timer-0]-dashboard.KBDash: getProcessSummary -->  processing output line
[33mchecking admin api url:https://localhost:448......
2018-09-15 15:58:20,084 INFO  [Timer-0]-dashboard.KBDash: getProcessSummary -->  processing output line
[32mOk: {"status":200,"name":"myspace","version":"1.1.2.371","cis":"online","tagline":"none"}[0m
2018-09-15 15:59:40,106 INFO  [Timer-0]-util.SSHUtil: Connecting to host [10.60.9.59] using provided credentials.
2018-09-15 15:59:40,209 INFO  [Timer-0]-util.SSHUtil: Connected to host [10.60.9.59] using provided credentials.
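
A likely cause, assuming the failure shows up as an exception: the script turns every match into a `{}` placeholder and then calls `str.format`, which also tries to interpret braces that were already in the log, such as the JSON fragment near the end of the sample above. The minimal sketch below reproduces this on a hypothetical one-line sample (the `sample` string and the `IP_PATTERN` name are made up for illustration).

```python
import re

IP_PATTERN = r'\d+\.\d+\.\d+\.\d+'

# Hypothetical one-line sample modeled on the log above: one IP-like
# string plus literal braces from a JSON fragment.
sample = 'host [10.60.9.44] Ok: {"status":200}'

# Same first step as the script: every match becomes a '{}' placeholder.
template = re.sub(IP_PATTERN, '{}', sample)

result = None
error = None
try:
    # str.format also parses the braces that were already in the text,
    # so the JSON fragment is treated as a replacement field.
    result = template.format('IP_1')
except (KeyError, IndexError, ValueError) as exc:
    error = exc

print('formatted:', result)
print('error type:', type(error).__name__)
```

Passing a function to `re.sub`, as the answer below does, avoids `str.format` entirely, so braces in the log are left untouched.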

1 Answer

0 votes
/ September 27, 2018

You can keep an array of the unique IP addresses and use each address's index in that array as the substitution value.

In the code below, \1 in replace_func refers to the first capture group of the regular expression. We look it up in the array (adding it if necessary), format it properly, and return it to be used as the substitution value by re.sub below.

Something like this:

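import fileinput
import re

ips = []  # unique IP addresses, in order of first appearance

def replace_func(match):
    # \1 expands to the text captured by the first group, i.e. the IP address
    ip = match.expand(r'\1')
    if ip not in ips:
        ips.append(ip)
    # list indexes start at 0, so add 1 to label the first address IP_1
    return 'IP_%s' % (ips.index(ip) + 1)

# inplace=True redirects print() into the file; the original is kept as server.log.bak
with fileinput.input('server.log', inplace=True, backup='.bak') as file:
    for line in file:
        print(re.sub(r'(\d+\.\d+\.\d+\.\d+)', replace_func, line), end='')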
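
As a variation on the code above, here is a minimal sketch that works on a string instead of editing the file in place. It is not the code from the answer: the names `anonymize_ips`, `IP_CANDIDATE` and `labels` are made up for illustration, and a dict replaces the list lookup. It also assumes that version-like strings such as 1.1.2.371 should be left alone, which it checks with the standard `ipaddress` module.

```python
import ipaddress
import re

# Four dot-separated runs of digits; this also matches version strings,
# so each candidate is validated before it is replaced.
IP_CANDIDATE = re.compile(r'\d+\.\d+\.\d+\.\d+')

def anonymize_ips(text):
    """Return text with each distinct valid IPv4 address replaced by IP_<n>."""
    labels = {}  # address -> label, in order of first appearance

    def replace(match):
        candidate = match.group(0)
        try:
            ipaddress.IPv4Address(candidate)
        except ValueError:
            # Not a valid IPv4 address (e.g. an octet above 255): keep as-is.
            return candidate
        if candidate not in labels:
            labels[candidate] = 'IP_%d' % (len(labels) + 1)
        return labels[candidate]

    return IP_CANDIDATE.sub(replace, text)

# Hypothetical sample modeled on the second log in the question.
sample = (
    'Connecting to host [10.60.9.44]\n'
    'Ok: {"status":200,"version":"1.1.2.371"}\n'
    'Host 10.60.9.44 is not reachable.\n'
    'Connected to host [10.60.9.59]\n'
)
out = anonymize_ips(sample)
print(out)
```

Because the labels live inside the function, each call starts again at IP_1. To keep the numbering consistent across several files, the dict would have to be shared between calls.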