Searching for a pattern in an Excel file and getting the row number in Python
0 votes
/ May 26, 2018

I'm new to Python and pandas. I'm trying to read a CSV file with pandas, but I get a CSV parsing error. This is the top of my CSV file (the image cannot be displayed here, so please follow the link).

https://i.stack.imgur.com/HIBoj.jpg

--------------------------------------------------------------------------------------------------------------------------------------------------,,,,,,,
                                                     Data Records,,,,,,,
--------------------------------------------------------------------------------------------------------------------------------------------------,,,,,,,
ABC : - xxxxxxxxxxx,,,,,,,
Type :- xxxxxxxxxxx,,,,,,,
Date :- xxxxxxxxxx,,,,,,,
Till Date :- xxxxxxxxxx,,,,,,,
Report Index :- xxxxxxxxxx,,,,,,,
Report Date :- 01-Jul-2017 11:18:41 AM,,,,,,,
--------------------------------------------------------------------------------------------------------------------------------------------------,,,,,,,
A PARTY, B PARTY, DATE, TIME, DURATION, ID, ID_A, TYPE
--------------------------------------------------------------------------------------------------------------------------------------------------,,,,,,,
XXXXXXXX,XXXXXXXX, 26-JAN-2017, 11:51:54,1,123456788889999, -, ZXC
XXXXXXXX,XXXXXXXX, 26-JAN-2017, 11:52:06,1,123456788889999, -, QWE
XXXXXXXX,XXXXXXXX, 26-JAN-2017, 11:52:11,1,123456788889999, -, RRR
XXXXXXXX,XXXXXXXX, 26-JAN-2017, 11:52:12,1,123456788889999, -, BGF
XXXXXXXX,XXXXXXXX, 26-JAN-2017, 11:52:25,1,123456788889999, -, OOO
XXXXXXXX,XXXXXXXX, 26-JAN-2017, 11:53:23,1,123456788889999, -, BGF
XXXXXXXX,XXXXXXXX, 26-JAN-2017, 11:54:00,1,123456788889999, -, NBG
XXXXXXXX,XXXXXXXX, 26-JAN-2017, 11:54:38,1,123456788889999, -, BGFD
XXXXXXXX,XXXXXXXX, 26-JAN-2017, 11:54:39,1,123456788889999, -, OIU
XXXXXXXX,XXXXXXXX, 26-JAN-2017, 12:03:14,1,123456788889999, -, BGF
XXXXXXXX,XXXXXXXX, 26-JAN-2017, 12:07:43,1,123456788889999, -, GGG
XXXXXXXX,XXXXXXXX, 26-JAN-2017, 12:11:53,1,123456788889555, -, VVVV
XXXXXXXX,XXXXXXXX, 26-JAN-2017, 12:13:12,1,123456788889555, -, VVVV
XXXXXXXX,XXXXXXXX, 26-JAN-2017, 12:13:12,1,123456788889555, -, VVVV
XXXXXXXX,XXXXXXXX, 26-JAN-2017, 12:13:44,1,123456788889555, -, VVVV
XXXXXXXX,XXXXXXXX, 26-JAN-2017, 12:13:44,1,123456788889555, -, VVVV
,,,,,,,
,,,,,,,
,,,,,,,
Note :- This is a System generated Report.,,,,,,,

The text above has been redacted; my original file has more than 1000 rows.

And the error is:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/xxxxx/.local/lib/python2.7/site-packages/pandas/io/parsers.py", line 678, in parser_f
    return _read(filepath_or_buffer, kwds)
  File "/home/xxxxx/.local/lib/python2.7/site-packages/pandas/io/parsers.py", line 446, in _read
    data = parser.read(nrows)
  File "/home/xxxxx/.local/lib/python2.7/site-packages/pandas/io/parsers.py", line 1036, in read
    ret = self._engine.read(nrows)
  File "/home/xxxxx/.local/lib/python2.7/site-packages/pandas/io/parsers.py", line 1848, in read
    data = self._reader.read(nrows)
  File "pandas/_libs/parsers.pyx", line 876, in pandas._libs.parsers.TextReader.read
  File "pandas/_libs/parsers.pyx", line 891, in pandas._libs.parsers.TextReader._read_low_memory
  File "pandas/_libs/parsers.pyx", line 945, in pandas._libs.parsers.TextReader._read_rows
  File "pandas/_libs/parsers.pyx", line 932, in pandas._libs.parsers.TextReader._tokenize_rows
  File "pandas/_libs/parsers.pyx", line 2112, in pandas._libs.parsers.raise_parser_error
pandas.errors.ParserError: Error tokenizing data. C error: Expected 1 fields in line 11, saw 13

When I skip 12 rows, everything works fine. So before reading this file, I want to find the '---' pattern and get its row number, so that I can skip those rows when reading with the 'read_csv' function.

import pandas as pd 
ff = pd.read_csv("test.csv")
ff

That is my code.

Thanks in advance.

1 Answer

0 votes
/ May 26, 2018

I didn't get an error:

python.exe temp.py
   -------------------------------------------------------------------------------------------------------------------------------------------------- Unnamed: 1    Unnamed: 2 Unnamed: 3 Unnamed: 4       Unnamed: 5 Unnamed: 6 Unnamed: 7
0                                                 ...                                                                                                        NaN           NaN        NaN        NaN              NaN        NaN        NaN
1   ----------------------------------------------...                                                                                                        NaN           NaN        NaN        NaN              NaN        NaN        NaN
2                                 ABC : - xxxxxxxxxxx                                                                                                        NaN           NaN        NaN        NaN              NaN        NaN        NaN
3                                 Type :- xxxxxxxxxxx                                                                                                        NaN           NaN        NaN        NaN              NaN        NaN        NaN
4                                  Date :- xxxxxxxxxx                                                                                                        NaN           NaN        NaN        NaN              NaN        NaN        NaN
5                             Till Date :- xxxxxxxxxx                                                                                                        NaN           NaN        NaN        NaN              NaN        NaN        NaN
6                          Report Index :- xxxxxxxxxx                                                                                                        NaN           NaN        NaN        NaN              NaN        NaN        NaN
7              Report Date :- 01-Jul-2017 11:18:41 AM                                                                                                        NaN           NaN        NaN        NaN              NaN        NaN        NaN
8   ----------------------------------------------...                                                                                                        NaN           NaN        NaN        NaN              NaN        NaN        NaN
9                                             A PARTY                                                                                                    B PARTY          DATE       TIME   DURATION               ID       ID_A       TYPE
10  ----------------------------------------------...                                                                                                        NaN           NaN        NaN        NaN              NaN        NaN        NaN
11                                           XXXXXXXX                                                                                                   XXXXXXXX   26-JAN-2017   11:51:54          1  123456788889999          -        ZXC
12                                           XXXXXXXX                                                                                                   XXXXXXXX   26-JAN-2017   11:52:06          1  123456788889999          -        QWE
13                                           XXXXXXXX                                                                                                   XXXXXXXX   26-JAN-2017   11:52:11          1  123456788889999          -        RRR
14                                           XXXXXXXX                                                                                                   XXXXXXXX   26-JAN-2017   11:52:12          1  123456788889999          -        BGF
15                                           XXXXXXXX                                                                                                   XXXXXXXX   26-JAN-2017   11:52:25          1  123456788889999          -        OOO
16                                           XXXXXXXX                                                                                                   XXXXXXXX   26-JAN-2017   11:53:23          1  123456788889999          -        BGF
17                                           XXXXXXXX                                                                                                   XXXXXXXX   26-JAN-2017   11:54:00          1  123456788889999          -        NBG
18                                           XXXXXXXX                                                                                                   XXXXXXXX   26-JAN-2017   11:54:38          1  123456788889999          -       BGFD
19                                           XXXXXXXX                                                                                                   XXXXXXXX   26-JAN-2017   11:54:39          1  123456788889999          -        OIU
20                                           XXXXXXXX                                                                                                   XXXXXXXX   26-JAN-2017   12:03:14          1  123456788889999          -        BGF
21                                           XXXXXXXX                                                                                                   XXXXXXXX   26-JAN-2017   12:07:43          1  123456788889999          -        GGG
22                                           XXXXXXXX                                                                                                   XXXXXXXX   26-JAN-2017   12:11:53          1  123456788889555          -       VVVV
23                                           XXXXXXXX                                                                                                   XXXXXXXX   26-JAN-2017   12:13:12          1  123456788889555          -       VVVV
24                                           XXXXXXXX                                                                                                   XXXXXXXX   26-JAN-2017   12:13:12          1  123456788889555          -       VVVV
25                                           XXXXXXXX                                                                                                   XXXXXXXX   26-JAN-2017   12:13:44          1  123456788889555          -       VVVV
26                                           XXXXXXXX                                                                                                   XXXXXXXX   26-JAN-2017   12:13:44          1  123456788889555          -       VVVV
27                                                NaN                                                                                                        NaN           NaN        NaN        NaN              NaN        NaN        NaN
28                                                NaN                                                                                                        NaN           NaN        NaN        NaN              NaN        NaN        NaN
29                                                NaN                                                                                                        NaN           NaN        NaN        NaN              NaN        NaN        NaN
30         Note :- This is a System generated Report.                                                                                                        NaN           NaN        NaN        NaN              NaN        NaN        NaN

Process finished with exit code 0
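
The posted sample probably parses cleanly because every line in it, banner lines included, ends with the same run of commas and therefore has eight fields. The error in the question usually shows up when a line has more fields than the line before it, which would be the case if the banner lines in the real file had no trailing commas (an assumption, since the real file is not shown). A minimal sketch that reproduces the same kind of error with made-up data:

```python
import io

import pandas as pd

# Made-up data: two one-field lines followed by a three-field line.
text = "Data Records\nReport Date :- 01-Jul-2017\na,b,c\n"

try:
    pd.read_csv(io.StringIO(text))
    message = None
except pd.errors.ParserError as exc:
    # The default C parser rejects a line that has more fields
    # than the preceding data line.
    message = str(exc)

print(message)
```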

My setup is Python 3.6.5 and

Modules and their version
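
If the real file does raise the error, one way to do what the question asks is to locate the dashed lines first and pass their positions to `read_csv`. The sketch below assumes the layout of the posted sample: the column header sits directly above the last dashed line, and the table ends at the first line that contains only commas. The function name `read_report` and the shortened sample data are illustrative, not taken from the question.

```python
import io

import pandas as pd


def read_report(text):
    """Load only the data table from a report laid out like the posted sample."""
    lines = text.splitlines()
    # Zero-based positions of the dashed separator lines.
    seps = [i for i, line in enumerate(lines) if line.startswith("---")]
    last_sep = seps[-1]
    header_row = last_sep - 1  # assumed: header is just above the last separator

    # Count data rows up to the first blank or comma-only line (the footer).
    n_data = 0
    for line in lines[last_sep + 1:]:
        if not line.strip(", \t"):
            break
        n_data += 1

    # Skip everything above the header, plus the separator right below it.
    skip = list(range(header_row)) + [last_sep]
    return pd.read_csv(
        io.StringIO(text),
        skiprows=skip,
        nrows=n_data,
        skipinitialspace=True,  # the sample has a space after most commas
    )


sample = "\n".join([
    "-----,,,",
    "  Data Records,,,",
    "-----,,,",
    "Report Date :- 01-Jul-2017 11:18:41 AM,,,",
    "-----,,,",
    "A PARTY, B PARTY, DATE, ID",
    "-----,,,",
    "XXXX,XXXX, 26-JAN-2017,123456788889999",
    "XXXX,XXXX, 26-JAN-2017,123456788889555",
    ",,,",
    "Note :- This is a System generated Report.,,,",
])

df = read_report(sample)
print(df)
```

For a file on disk, the text can be read with `open("test.csv").read()` and passed to `read_report` in the same way.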
