How can I automate flattening a stacked Excel file with Python?
February 21, 2020

I am trying to flatten an Excel spreadsheet: the source file contains several records per entity, each with different dates, and I want a file with a single row per unique entity, where the dates are spread across multiple columns.

For example, I would start with the following (please run the snippet to see the table):

<style type="text/css">
  table.tableizer-table {
    font-size: 12px;
    border: 1px solid #CCC;
    font-family: Arial, Helvetica, sans-serif;
  }
  
  .tableizer-table td {
    padding: 4px;
    margin: 3px;
    border: 1px solid #CCC;
  }
  
  .tableizer-table th {
    background-color: #104E8B;
    color: #FFF;
    font-weight: bold;
  }
</style>
<table class="tableizer-table">
  <thead>
    <tr class="tableizer-firstrow">
      <th>ID</th>
      <th>Name</th>
      <th>Creation Date</th>
      <th>Breakdown Date</th>
      <th>Breakdown Cause</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>145450</td>
      <td>Elena</td>
      <td>12/3/2019</td>
      <td>1/31/2011</td>
      <td>Program Error</td>
    </tr>
    <tr>
      <td>145450</td>
      <td>Elena</td>
      <td>2/12/2015</td>
      <td>12/10/2013</td>
      <td>Weather</td>
    </tr>
    <tr>
      <td>215961</td>
      <td>LiJin</td>
      <td>7/24/2019</td>
      <td>5/29/2015</td>
      <td>Weather</td>
    </tr>
    <tr>
      <td>160058</td>
      <td>Devin</td>
      <td>12/24/2018</td>
      <td>1/1/2016</td>
      <td>Program Error</td>
    </tr>
    <tr>
      <td>160058</td>
      <td>Devin</td>
      <td>8/8/2018</td>
      <td>12/31/2016</td>
      <td>Cheap Material</td>
    </tr>
    <tr>
      <td>160058</td>
      <td>Devin</td>
      <td>4/22/2019</td>
      <td>12/13/2017</td>
      <td>Cheap Material</td>
    </tr>
    <tr>
      <td>160058</td>
      <td>Devin</td>
      <td>1/9/2016</td>
      <td>8/4/2018</td>
      <td>Program Error</td>
    </tr>
    <tr>
      <td>145450</td>
      <td>Elena</td>
      <td>1/28/2010</td>
      <td>6/30/2019</td>
      <td>Weather</td>
    </tr>
    <tr>
      <td>145450</td>
      <td>Elena</td>
      <td>6/6/2016</td>
      <td>12/17/2019</td>
      <td>Weather</td>
    </tr>
    <tr>
      <td>189066</td>
      <td>Bobby</td>
      <td>1/14/2020</td>
      <td>1/1/2020</td>
      <td>Gained Sentience</td>
    </tr>
    <tr>
      <td>160058</td>
      <td>Devin</td>
      <td>11/11/2011</td>
      <td>2/10/2020</td>
      <td>Program Error</td>
    </tr>
    <tr>
      <td>181946</td>
      <td>Maya</td>
      <td>2/15/2020</td>
      <td>2/19/2020</td>
      <td>Program Error</td>
    </tr>
    <tr>
      <td>160058</td>
      <td>Devin</td>
      <td>10/1/2010</td>
      <td>&nbsp;</td>
      <td>&nbsp;</td>
    </tr>
    <tr>
      <td>189066</td>
      <td>Bobby</td>
      <td>9/4/2011</td>
      <td>&nbsp;</td>
      <td>&nbsp;</td>
    </tr>
    <tr>
      <td>203759</td>
      <td>Surya</td>
      <td>6/24/2012</td>
      <td>&nbsp;</td>
      <td>&nbsp;</td>
    </tr>
    <tr>
      <td>181946</td>
      <td>Maya</td>
      <td>3/30/2014</td>
      <td>&nbsp;</td>
      <td>&nbsp;</td>
    </tr>
    <tr>
      <td>215961</td>
      <td>LiJin</td>
      <td>5/5/2015</td>
      <td>&nbsp;</td>
      <td>&nbsp;</td>
    </tr>
    <tr>
      <td>203759</td>
      <td>Surya</td>
      <td>11/30/2016</td>
      <td>&nbsp;</td>
      <td>&nbsp;</td>
    </tr>
    <tr>
      <td>181946</td>
      <td>Maya</td>
      <td>3/21/2017</td>
      <td>&nbsp;</td>
      <td>&nbsp;</td>
    </tr>
    <tr>
      <td>181946</td>
      <td>Maya</td>
      <td>5/20/2017</td>
      <td>&nbsp;</td>
      <td>&nbsp;</td>
    </tr>
    <tr>
      <td>203759</td>
      <td>Surya</td>
      <td>6/11/2017</td>
      <td>&nbsp;</td>
      <td>&nbsp;</td>
    </tr>
    <tr>
      <td>145450</td>
      <td>Elena</td>
      <td>9/10/2017</td>
      <td>&nbsp;</td>
      <td>&nbsp;</td>
    </tr>
    <tr>
      <td>181946</td>
      <td>Maya</td>
      <td>3/13/2019</td>
      <td>&nbsp;</td>
      <td>&nbsp;</td>
    </tr>
    <tr>
      <td>160058</td>
      <td>Devin</td>
      <td>2/1/2020</td>
      <td>&nbsp;</td>
      <td>&nbsp;</td>
    </tr>
  </tbody>
</table>

I managed to use Excel formulas to produce the following flattened version by hand:

<table class="tableizer-table">
  <thead>
    <tr class="tableizer-firstrow">
      <th>ID</th>
      <th>Name</th>
      <th>Creation Date 1</th>
      <th>Creation Date 2</th>
      <th>Creation Date 3</th>
      <th>Creation Date 4</th>
      <th>Creation Date 5</th>
      <th>Creation Date 6</th>
      <th>Creation Date 7</th>
      <th>Breakdown Date 1</th>
      <th>Breakdown Date 2</th>
      <th>Breakdown Date 3</th>
      <th>Breakdown Date 4</th>
      <th>Breakdown Date 5</th>
      <th>Breakdown Cause 1</th>
      <th>Breakdown Cause 2</th>
      <th>Breakdown Cause 3</th>
      <th>Breakdown Cause 4</th>
      <th>Breakdown Cause 5</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>145450</td>
      <td>Elena</td>
      <td>1/28/2010</td>
      <td>2/12/2015</td>
      <td>6/6/2016</td>
      <td>9/10/2017</td>
      <td>12/3/2019</td>
      <td>NA</td>
      <td>NA</td>
      <td>1/31/2011</td>
      <td>12/10/2013</td>
      <td>6/30/2019</td>
      <td>12/17/2019</td>
      <td>NA</td>
      <td>Program Error</td>
      <td>Weather</td>
      <td>Weather</td>
      <td>Weather</td>
      <td>NA</td>
    </tr>
    <tr>
      <td>160058</td>
      <td>Devin</td>
      <td>10/1/2010</td>
      <td>11/11/2011</td>
      <td>1/9/2016</td>
      <td>8/8/2018</td>
      <td>12/24/2018</td>
      <td>4/22/2019</td>
      <td>2/1/2020</td>
      <td>1/1/2016</td>
      <td>12/31/2016</td>
      <td>12/13/2017</td>
      <td>8/4/2018</td>
      <td>2/10/2020</td>
      <td>Program Error</td>
      <td>Cheap Material</td>
      <td>Cheap Material</td>
      <td>Program Error</td>
      <td>Program Error</td>
    </tr>
    <tr>
      <td>181946</td>
      <td>Maya</td>
      <td>3/30/2014</td>
      <td>3/21/2017</td>
      <td>5/20/2017</td>
      <td>3/13/2019</td>
      <td>2/15/2020</td>
      <td>NA</td>
      <td>NA</td>
      <td>2/19/2020</td>
      <td>NA</td>
      <td>NA</td>
      <td>NA</td>
      <td>NA</td>
      <td>Program Error</td>
      <td>NA</td>
      <td>NA</td>
      <td>NA</td>
      <td>NA</td>
    </tr>
    <tr>
      <td>189066</td>
      <td>Bobby</td>
      <td>9/4/2011</td>
      <td>1/14/2020</td>
      <td>NA</td>
      <td>NA</td>
      <td>NA</td>
      <td>NA</td>
      <td>NA</td>
      <td>1/1/2020</td>
      <td>NA</td>
      <td>NA</td>
      <td>NA</td>
      <td>NA</td>
      <td>Gained Sentience</td>
      <td>NA</td>
      <td>NA</td>
      <td>NA</td>
      <td>NA</td>
    </tr>
    <tr>
      <td>203759</td>
      <td>Surya</td>
      <td>6/24/2012</td>
      <td>11/30/2016</td>
      <td>6/11/2017</td>
      <td>NA</td>
      <td>NA</td>
      <td>NA</td>
      <td>NA</td>
      <td>NA</td>
      <td>NA</td>
      <td>NA</td>
      <td>NA</td>
      <td>NA</td>
      <td>NA</td>
      <td>NA</td>
      <td>NA</td>
      <td>NA</td>
      <td>NA</td>
    </tr>
    <tr>
      <td>215961</td>
      <td>LiJin</td>
      <td>5/5/2015</td>
      <td>7/24/2019</td>
      <td>NA</td>
      <td>NA</td>
      <td>NA</td>
      <td>NA</td>
      <td>NA</td>
      <td>5/29/2015</td>
      <td>NA</td>
      <td>NA</td>
      <td>NA</td>
      <td>NA</td>
      <td>Weather</td>
      <td>NA</td>
      <td>NA</td>
      <td>NA</td>
      <td>NA</td>
    </tr>
  </tbody>
</table>

Please take a look at the images below to get a better idea of what I mean.

Here is what I would start with:

[image: the stacked input table]

Here is what I would like to end up with:

[image: the flattened output table]
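One possible way to approach this with pandas, offered as a rough sketch rather than a definitive answer. It assumes the column names from the table above and dates stored as month/day/year text. The helper names `flatten` and `_widen` are made up for illustration, and the sample is only a subset of the rows shown in the question. Creation dates are sorted and numbered on their own, while each breakdown cause stays paired with its breakdown date, which is how the hand-made table appears to be built.

```python
import pandas as pd


def _widen(df, keys, sort_col, value_cols):
    """Number the non-empty entries of each key by sort_col and spread them into columns."""
    part = df.dropna(subset=[sort_col]).sort_values(keys + [sort_col])
    # 1-based position of each record within its (ID, Name) group
    part = part.assign(n=part.groupby(keys).cumcount() + 1)
    wide_parts = []
    for col in value_cols:
        wide = part.set_index(keys + ["n"])[col].unstack("n")
        wide.columns = [f"{col} {n}" for n in wide.columns]
        wide_parts.append(wide)
    return pd.concat(wide_parts, axis=1)


def flatten(df, keys=("ID", "Name")):
    """Return one row per unique key with numbered date and cause columns."""
    keys = list(keys)
    df = df.copy()
    for col in ("Creation Date", "Breakdown Date"):
        # Assumption: dates are month/day/year text, as in the question's table
        df[col] = pd.to_datetime(df[col], format="%m/%d/%Y")
    creation = _widen(df, keys, "Creation Date", ["Creation Date"])
    breakdown = _widen(df, keys, "Breakdown Date", ["Breakdown Date", "Breakdown Cause"])
    # Outer join keeps entities that have no breakdown at all (e.g. Surya)
    return creation.join(breakdown, how="outer").reset_index()


# A subset of the rows from the question, in their original stacked order
sample = pd.DataFrame(
    [
        (145450, "Elena", "12/3/2019", "1/31/2011", "Program Error"),
        (145450, "Elena", "2/12/2015", "12/10/2013", "Weather"),
        (215961, "LiJin", "7/24/2019", "5/29/2015", "Weather"),
        (145450, "Elena", "1/28/2010", "6/30/2019", "Weather"),
        (145450, "Elena", "6/6/2016", "12/17/2019", "Weather"),
        (189066, "Bobby", "1/14/2020", "1/1/2020", "Gained Sentience"),
        (189066, "Bobby", "9/4/2011", None, None),
        (203759, "Surya", "6/24/2012", None, None),
        (215961, "LiJin", "5/5/2015", None, None),
        (203759, "Surya", "11/30/2016", None, None),
        (203759, "Surya", "6/11/2017", None, None),
        (145450, "Elena", "9/10/2017", None, None),
    ],
    columns=["ID", "Name", "Creation Date", "Breakdown Date", "Breakdown Cause"],
)

flat = flatten(sample)
print(flat.to_string(index=False))
```

For a real workbook, the `sample` frame could presumably be replaced by the result of `pd.read_excel(...)` and the output saved with `flat.to_excel(..., index=False)`. Empty slots come out as missing values (`NaT`/`NaN`) rather than the literal text `NA`.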

...