NB: This assumes that the result is simply a ZIP file cut into pieces, without any additional headers or anything else.
If you check the docs, ZipFile can be passed a file-like object to use for I/O. We should therefore be able to give it our own object that implements the necessary subset of the protocol and splits the output across multiple files.
As it turns out, we only need to implement 3 functions:
- tell() - just return the number of bytes written so far
- write(str) - write to the current file up to its maximum capacity; once it is full, open a new file and repeat until all the data has been written
- flush() - flush the currently open file
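To see this protocol in isolation, here is a minimal Python 3 sketch of a sink that offers only those three methods and collects everything in memory. The class name RecordingSink and its calls counter are made up for illustration, and the exact number of calls depends on the zipfile version in use.

```python
import io
import zipfile


class RecordingSink:
    """Hypothetical write-only sink exposing just tell(), write() and flush()."""

    def __init__(self):
        self.buffer = bytearray()
        self.calls = {"tell": 0, "write": 0, "flush": 0}

    def tell(self):
        self.calls["tell"] += 1
        return len(self.buffer)

    def write(self, data):
        self.calls["write"] += 1
        self.buffer += data
        return len(data)

    def flush(self):
        self.calls["flush"] += 1


sink = RecordingSink()
with zipfile.ZipFile(sink, mode="w", compression=zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("hello.txt", b"hello, split zip")

# The collected bytes should form an ordinary archive that reads back.
with zipfile.ZipFile(io.BytesIO(bytes(sink.buffer))) as check:
    restored = check.read("hello.txt")

print(sink.calls)
print(restored)
```

Because the sink has no seek(), Python 3 treats it as an unseekable stream, which is why nothing beyond these three methods is needed here.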
Prototype script
# Python 2 code (print statements, str used as byte data).
import random
import zipfile


def get_random_data(length):
    # Random, effectively incompressible payload.
    return "".join([chr(random.randrange(256)) for i in range(length)])


class MultiFile(object):
    """Write-only file-like object that spreads its output over numbered files."""

    def __init__(self, file_name, max_file_size):
        self.current_position = 0  # total bytes written across all files
        self.file_name = file_name
        self.max_file_size = max_file_size
        self.current_file = None
        self.open_next_file()

    @property
    def current_file_no(self):
        # Floor division, so the index stays an integer.
        return self.current_position // self.max_file_size

    @property
    def current_file_size(self):
        return self.current_position % self.max_file_size

    @property
    def current_file_capacity(self):
        return self.max_file_size - self.current_file_size

    def open_next_file(self):
        file_name = "%s.%03d" % (self.file_name, self.current_file_no + 1)
        print "* Opening file '%s'..." % file_name
        if self.current_file is not None:
            self.current_file.close()
        self.current_file = open(file_name, 'wb')

    def tell(self):
        print "MultiFile::Tell -> %d" % self.current_position
        return self.current_position

    def write(self, data):
        start, end = 0, len(data)
        print "MultiFile::Write (%d bytes)" % len(data)
        while start < end:
            current_block_size = min(end - start, self.current_file_capacity)
            self.current_file.write(data[start:start+current_block_size])
            print "* Wrote %d bytes." % current_block_size
            start += current_block_size
            self.current_position += current_block_size
            # Capacity wraps back to the maximum once the current file is full.
            if self.current_file_capacity == self.max_file_size:
                self.open_next_file()
            print "* Capacity = %d" % self.current_file_capacity

    def flush(self):
        print "MultiFile::Flush"
        self.current_file.flush()


mfo = MultiFile('splitzip.zip', 2**18)
zf = zipfile.ZipFile(mfo, mode='w', compression=zipfile.ZIP_DEFLATED)
for i in range(4):
    filename = 'test%04d.txt' % i
    print "Adding file '%s'..." % filename
    zf.writestr(filename, get_random_data(2**17))
# Close explicitly so the central directory is written and the last part is closed.
zf.close()
mfo.current_file.close()
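The prototype above is Python 2. A rough Python 3 equivalent might look like the sketch below. The class name SplitWriter, its part_names list and its close() method are not part of the original, and os.urandom stands in for get_random_data. Since Python 3 treats an object without seek() as an unseekable stream, the byte counts may differ from the original trace.

```python
import os
import tempfile
import zipfile


class SplitWriter:
    """Python 3 sketch of the prototype: spreads writes over numbered files."""

    def __init__(self, base_name, max_file_size):
        self.base_name = base_name
        self.max_file_size = max_file_size
        self.position = 0  # total bytes written across all parts
        self.part_names = []
        self.current_file = None
        self._open_next_part()

    def _open_next_part(self):
        if self.current_file is not None:
            self.current_file.close()
        name = "%s.%03d" % (self.base_name, len(self.part_names) + 1)
        self.part_names.append(name)
        self.current_file = open(name, "wb")

    def tell(self):
        return self.position

    def write(self, data):
        view = memoryview(data)
        while len(view) > 0:
            # Room left in the current part.
            room = self.max_file_size - self.position % self.max_file_size
            chunk = view[:room]
            self.current_file.write(chunk)
            self.position += len(chunk)
            view = view[len(chunk):]
            # Like the prototype, open the next part as soon as this one is full.
            if self.position % self.max_file_size == 0:
                self._open_next_part()
        return len(data)

    def flush(self):
        self.current_file.flush()

    def close(self):
        self.current_file.close()


with tempfile.TemporaryDirectory() as tmp:
    writer = SplitWriter(os.path.join(tmp, "splitzip.zip"), 2**18)
    with zipfile.ZipFile(writer, mode="w", compression=zipfile.ZIP_DEFLATED) as zf:
        for i in range(4):
            zf.writestr("test%04d.txt" % i, os.urandom(2**17))
    writer.close()
    part_sizes = [os.path.getsize(name) for name in writer.part_names]

print(part_sizes)
```

Every part except the last should come out at exactly max_file_size bytes.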
Trace output
* Opening file 'splitzip.zip.001'...
Adding file 'test0000.txt'...
MultiFile::Tell -> 0
MultiFile::Write (42 bytes)
* Wrote 42 bytes.
* Capacity = 262102
MultiFile::Write (131112 bytes)
* Wrote 131112 bytes.
* Capacity = 130990
MultiFile::Flush
Adding file 'test0001.txt'...
MultiFile::Tell -> 131154
MultiFile::Write (42 bytes)
* Wrote 42 bytes.
* Capacity = 130948
MultiFile::Write (131112 bytes)
* Wrote 130948 bytes.
* Opening file 'splitzip.zip.002'...
* Capacity = 262144
* Wrote 164 bytes.
* Capacity = 261980
MultiFile::Flush
Adding file 'test0002.txt'...
MultiFile::Tell -> 262308
MultiFile::Write (42 bytes)
* Wrote 42 bytes.
* Capacity = 261938
MultiFile::Write (131112 bytes)
* Wrote 131112 bytes.
* Capacity = 130826
MultiFile::Flush
Adding file 'test0003.txt'...
MultiFile::Tell -> 393462
MultiFile::Write (42 bytes)
* Wrote 42 bytes.
* Capacity = 130784
MultiFile::Write (131112 bytes)
* Wrote 130784 bytes.
* Opening file 'splitzip.zip.003'...
* Capacity = 262144
* Wrote 328 bytes.
* Capacity = 261816
MultiFile::Flush
MultiFile::Tell -> 524616
MultiFile::Write (46 bytes)
* Wrote 46 bytes.
* Capacity = 261770
MultiFile::Write (12 bytes)
* Wrote 12 bytes.
* Capacity = 261758
MultiFile::Write (0 bytes)
MultiFile::Write (0 bytes)
MultiFile::Write (46 bytes)
* Wrote 46 bytes.
* Capacity = 261712
MultiFile::Write (12 bytes)
* Wrote 12 bytes.
* Capacity = 261700
MultiFile::Write (0 bytes)
MultiFile::Write (0 bytes)
MultiFile::Write (46 bytes)
* Wrote 46 bytes.
* Capacity = 261654
MultiFile::Write (12 bytes)
* Wrote 12 bytes.
* Capacity = 261642
MultiFile::Write (0 bytes)
MultiFile::Write (0 bytes)
MultiFile::Write (46 bytes)
* Wrote 46 bytes.
* Capacity = 261596
MultiFile::Write (12 bytes)
* Wrote 12 bytes.
* Capacity = 261584
MultiFile::Write (0 bytes)
MultiFile::Write (0 bytes)
MultiFile::Tell -> 524848
MultiFile::Write (22 bytes)
* Wrote 22 bytes.
* Capacity = 261562
MultiFile::Write (0 bytes)
MultiFile::Flush
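The numbers in the trace can be cross-checked with a little arithmetic. The sketch below assumes the fixed record sizes of the ZIP format (30-byte local file header, 46-byte central directory entry, 22-byte end record) and takes the 131112-byte compressed size from the trace itself.

```python
# Fixed record sizes from the ZIP format (each excludes the file name).
LOCAL_HEADER_FIXED = 30      # local file header
CENTRAL_HEADER_FIXED = 46    # central directory entry
END_RECORD = 22              # end-of-central-directory record, no comment

name_len = len("test0000.txt")   # 12 characters
compressed_size = 131112         # second Write per member in the trace
members = 4

local_header = LOCAL_HEADER_FIXED + name_len               # first Write per member
data_end = members * (local_header + compressed_size)      # Tell before the central directory
central_dir = members * (CENTRAL_HEADER_FIXED + name_len)  # four Write(46) + Write(12) pairs
total = data_end + central_dir + END_RECORD

print(local_header, data_end, data_end + central_dir, total)
# 42 524616 524848 524870
```

These match the 42-byte writes and the Tell values 524616 and 524848 seen in the trace.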
Directory listing
-rw-r--r-- 1 2228 Feb 21 23:44 splitzip.py
-rw-r--r-- 1 262144 Feb 22 00:07 splitzip.zip.001
-rw-r--r-- 1 262144 Feb 22 00:07 splitzip.zip.002
-rw-r--r-- 1 582 Feb 22 00:07 splitzip.zip.003
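The part sizes in the listing follow directly from the total archive size and the chosen maximum. A small sketch, assuming the 524870-byte total implied by the trace:

```python
total_size = 524870      # archive size implied by the trace
max_file_size = 2**18    # 262144, as passed to MultiFile

full_parts, remainder = divmod(total_size, max_file_size)
# Note: if remainder were 0, the prototype would still leave an empty trailing file.
part_sizes = [max_file_size] * full_parts + ([remainder] if remainder else [])

print(part_sizes)
# [262144, 262144, 582]
```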
Verification
>7z l splitzip.zip.001
7-Zip [64] 9.20 Copyright (c) 1999-2010 Igor Pavlov 2010-11-18
Listing archive: splitzip.zip.001
--
Path = splitzip.zip.001
Type = Split
Volumes = 3
----
Path = splitzip.zip
Size = 524870
--
Path = splitzip.zip
Type = zip
Physical Size = 524870
   Date      Time    Attr         Size   Compressed Name
------------------- ----- ------------ ------------ ------------------------
2019-02-22 00:07:34 .....       131072       131112 test0000.txt
2019-02-22 00:07:34 .....       131072       131112 test0001.txt
2019-02-22 00:07:36 .....       131072       131112 test0002.txt
2019-02-22 00:07:36 .....       131072       131112 test0003.txt
------------------- ----- ------------ ------------ ------------------------
                                524288       524448 4 files, 0 folders
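If 7-Zip is not at hand, a similar check can be sketched in Python 3 by concatenating the volumes and opening the result with zipfile. The helper name open_split_archive and the small stand-in archive are made up for illustration. Like the rest of this answer, it assumes the volumes are plain slices of one ZIP file.

```python
import glob
import io
import os
import tempfile
import zipfile


def open_split_archive(base_name):
    """Concatenate base_name.001, .002, ... and open the result as a ZIP."""
    joined = bytearray()
    pattern = glob.escape(base_name) + ".[0-9][0-9][0-9]"
    for part_name in sorted(glob.glob(pattern)):
        with open(part_name, "rb") as part:
            joined += part.read()
    return zipfile.ZipFile(io.BytesIO(bytes(joined)))


# Build a small stand-in archive and cut it into 100-byte volumes.
source = io.BytesIO()
with zipfile.ZipFile(source, mode="w") as zf:
    zf.writestr("a.txt", b"A" * 150)
    zf.writestr("b.txt", b"B" * 150)
raw = source.getvalue()

with tempfile.TemporaryDirectory() as tmp:
    base = os.path.join(tmp, "demo.zip")
    for number, start in enumerate(range(0, len(raw), 100), start=1):
        with open("%s.%03d" % (base, number), "wb") as part:
            part.write(raw[start:start + 100])

    with open_split_archive(base) as archive:
        listing = archive.namelist()
        first_bad = archive.testzip()  # None when every CRC checks out

print(listing, first_bad)
```

Reading everything into memory is fine for a quick check, but for large archives a streaming concatenation to a temporary file would be the safer choice.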