Как обернуть первые 10 запятых двойными кавычками? - PullRequest
0 голосов
/ 28 марта 2019

Вот мой файл input.csv

dealerid,address,city,state,zip,vin,stocknumber,type,color,year,make,model,trim,bodystyle,fueltype,mileage,transmission,interiorcolor,interiorfabric,price,titlestatus,warranty,options_text,cylinders,engine,engineaspiration,enginetext,drivetrain,transmissiontext,mpgcity,mpghighway,features_text,vdc_url,images
TS06095298,999 wanna Road,Windsor,CT,06095,22HDT13S922218113,298,Used,Red,2002,OLDSMOBILE,BRAVADA,,,,136000,AUTOMATIC,,,2200,Clear,Available,"This vehicle is offered for sale by a verified private seller and features: FREE vehicle history & title report. Original window sticker available. Seller`s identity, email and phone verified. Secure cashless transactions. No cash needed. Pay securely by debit card or ACH. Bill of Sale and receipt issued for completed transactions. Vehicle financing options may be available.",,,,,,,,,,https://www.example.com/listings/298,"https://www.example.com/rails/00008.jpg,https://www.example.com/rails/AM00010.jpg"
TS06095298,999 wanna Road,Windsor,CT,06095,22HDT13S922123453,307,Used,Brown,2008,HONDA,599,,,,217538,AUTOMATIC,,,3500,Clear,Available,"This vehicle is offered for sale by a verified private seller and features: FREE vehicle history & title report. Original window sticker available. Seller`s identity, email and phone verified. Secure cashless transactions. No cash needed. Pay securely by debit card or ACH. Bill of Sale and receipt issued for completed transactions. Vehicle financing options may be available.",,,,,,,,,,https://www.example.com/listings/211,"https://www.example.com/rails/00008.jpg,https://www.example.com/rails/AM00010.jpg"  

Мне нужно обернуть все столбцы двойными кавычками, чтобы я получил файл примерно так:

"dealerid","address","city","state","zip","vin","stocknumber","type","color","year","make","model","trim","bodystyle","fueltype","mileage","transmission","interiorcolor","interiorfabric","price","titlestatus","warranty","options_text","cylinders","engine","engineaspiration","enginetext","drivetrain","transmissiontext","mpgcity","mpghighway","features_text","vdc_url","images"
"TS06095298","999 wanna Road,Windsor","CT","06095","22HDT13S922218113","298","Used","Red","2002","OLDSMOBILE","BRAVADA","","","","136000,AUTOMATIC","","","2200","Clear","Available","This vehicle is offered for sale by a verified private seller and features: FREE vehicle history & title report. Original window sticker available. Seller`s identity, email and phone verified. Secure cashless transactions. No cash needed. Pay securely by debit card or ACH. Bill of Sale and receipt issued for completed transactions. Vehicle financing options may be available.","","","","","","","","","","https://www.example.com/listings/298","https://www.example.com/rails/00008.jpg,https://www.example.com/rails/AM00010.jpg"
"TS06095298","999 wanna Road,Windsor","CT","06095","22HDT13S922123453","307","Used","Brown","2008","HONDA","599","","","","217538","AUTOMATIC","","","3500","Clear","Available","This vehicle is offered for sale by a verified private seller and features: FREE vehicle history & title report. Original window sticker available. Seller`s identity, email and phone verified. Secure cashless transactions. No cash needed. Pay securely by debit card or ACH. Bill of Sale and receipt issued for completed transactions. Vehicle financing options may be available.","","","","","","","","","","https://www.example.com/listings/211","https://www.example.com/rails/00008.jpg,https://www.example.com/rails/AM00010.jpg"

Файлдовольно постоянный с теми же пропущенными данными из определенных столбцов.

Столбец изображений и текстовый столбец функций уже обернуты.

Видя, что ту же информацию не хватало повсюду, я решил добавить двойные кавычкиначало каждой строки и начинал заменять запятые на двойные кавычки, но начинал сталкиваться с некоторыми проблемами.

Вот что у меня так далеко.Я знаю, что код не очень эффективен, но это начало.

#!/bin/bash

#- Temp Directories
tmp_dir="$(mktemp -d -t 'csv.XXXXX' || mktemp -d 2>/dev/null)"
tmp_input1="${tmp_dir}/temp_input1.csv"
tmp_input2="${tmp_dir}/temp_input2.csv"
tmp_input3="${tmp_dir}/temp_input3.csv"

#- Variables
client="00000"
wDir="$(pwd)"
ftpDir="${wDir}/.clientftp"
clientDir="${ftpDir}/${client}"
csvFile="${clientDir}/final.csv"
inputCsv="${wDir}/input.csv"

#  Lets Begin
cd "$wDir" || exit

      cp "$inputCsv" "$tmp_input1"
      dos2unix "$tmp_input1"

      #  place first line to a temp file , surrounding commas with double quotes , adding double quotes to the front and end of line
      head -1 "$tmp_input1" | sed -e 's/,/","/g;s/.*/"&"/' > "$tmp_input2"

      #  place remainding lines to a temp file
      sed 1,1d "$tmp_input1" | sed "s/^/\"/" > "$tmp_input3"
      sed -i 's/",,,,,,,,,,https/","","","","","","","","","","https/g' "$tmp_input3"
      sed -i 's/,Clear,Available,"/","Clear","Available","/g' "$tmp_input3"
      sed -i 's/,,,,/","","","","/g' "$tmp_input3"
      sed -i 's/,,,/","","","/g' "$tmp_input3"

      #  Create final file
      cat "$tmp_input2" > "$csvFile"
      cat "$tmp_input3" >> "$csvFile"

      rm -rf "$tmp_dir"

      { clear; echo ""; echo "";  echo "nano $csvFile"; echo ""; }

nano "$csvFile"

Этот скрипт выдает:

"dealerid","address","city","state","zip","vin","stocknumber","type","color","year","make","model","trim","bodystyle","fueltype","mileage","transmission","interiorcolor","interiorfabric","price","titlestatus","warranty","options_text","cylinders","engine","engineaspiration","enginetext","drivetrain","transmissiontext","mpgcity","mpghighway","features_text","vdc_url","images"
"TS06095298,999 wanna Road,Windsor,CT,06095,22HDT13S922218113,298,Used,Red,2002,OLDSMOBILE,BRAVADA","","","","136000,AUTOMATIC","","","2200","Clear","Available","This vehicle is offered for sale by a verified private seller and features: FREE vehicle history & title report. Original window sticker available. Seller`s identity, email and phone verified. Secure cashless transactions. No cash needed. Pay securely by debit card or ACH. Bill of Sale and receipt issued for completed transactions. Vehicle financing options may be available.","","","","","","","","","","https://www.example.com/listings/298,"https://www.example.com/rails/00008.jpg,https://www.example.com/rails/AM00010.jpg"
"TS06095298,999 wanna Road,Windsor,CT,06095,22HDT13S922123453,307,Used,Brown,2008,HONDA,599","","","","217538,AUTOMATIC","","","3500","Clear","Available","This vehicle is offered for sale by a verified private seller and features: FREE vehicle history & title report. Original window sticker available. Seller`s identity, email and phone verified. Secure cashless transactions. No cash needed. Pay securely by debit card or ACH. Bill of Sale and receipt issued for completed transactions. Vehicle financing options may be available.","","","","","","","","","","https://www.example.com/listings/211,"https://www.example.com/rails/00008.jpg,https://www.example.com/rails/AM00010.jpg"

Так что теперь у меня есть несколько проблем:
1- vdc_urlв столбце нет закрывающей двойной кавычки
2 - первые 10 запятых необходимо заключить в двойные кавычки

Последний столбец может содержать более 3 изображений

Любая помощь приветствуется.

Ответы [ 2 ]

1 голос
/ 28 марта 2019

Мне нравится ruby ​​для быстрых конверсий CVS:

ruby -rcsv -e '
    out = CSV.instance($stdout, {force_quotes: true})
    CSV.foreach(ARGV.shift) {|row| out << row}
' input.csv

убедитесь, что у вас нет пробелов в конце строки.

csvkit также будет хорошим решением.

0 голосов
/ 28 марта 2019

С GNU awk для FPAT:

$ awk -v FPAT='[^,]*|"[^"]*"' -v OFS=',' '
    { for (i=1;i<=NF;i++) {gsub(/^"|"$/,"",$i); $i="\"" $i "\""} }
1' file
"dealerid","address","city","state","zip","vin","stocknumber","type","color","year","make","model","trim","bodystyle","fueltype","mileage","transmission","interiorcolor","interiorfabric","price","titlestatus","warranty","options_text","cylinders","engine","engineaspiration","enginetext","drivetrain","transmissiontext","mpgcity","mpghighway","features_text","vdc_url","images"
"TS06095298","999 wanna Road","Windsor","CT","06095","22HDT13S922218113","298","Used","Red","2002","OLDSMOBILE","BRAVADA","","","","136000","AUTOMATIC","","","2200","Clear","Available","This vehicle is offered for sale by a verified private seller and features: FREE vehicle history & title report. Original window sticker available. Seller`s identity, email and phone verified. Secure cashless transactions. No cash needed. Pay securely by debit card or ACH. Bill of Sale and receipt issued for completed transactions. Vehicle financing options may be available.","","","","","","","","","","https://www.example.com/listings/298","https://www.example.com/rails/00008.jpg,https://www.example.com/rails/AM00010.jpg"
"TS06095298","999 wanna Road","Windsor","CT","06095","22HDT13S922123453","307","Used","Brown","2008","HONDA","599","","","","217538","AUTOMATIC","","","3500","Clear","Available","This vehicle is offered for sale by a verified private seller and features: FREE vehicle history & title report. Original window sticker available. Seller`s identity, email and phone verified. Secure cashless transactions. No cash needed. Pay securely by debit card or ACH. Bill of Sale and receipt issued for completed transactions. Vehicle financing options may be available.","","","","","","","","","","https://www.example.com/listings/211","https://www.example.com/rails/00008.jpg,https://www.example.com/rails/AM00010.jpg"
Добро пожаловать на сайт PullRequest, где вы можете задавать вопросы и получать ответы от других членов сообщества.
...