According to the documentation, TextLine should give the byte offset of each line, but it does not. Instead, it gives the line number. The output is pasted below.
TextLine(input).write(Tsv(output))
0 This is the 100th Etext file presented by Project Gutenberg, and
1 is presented in cooperation with World Library, Inc., from their
2 Library of the Future and Shakespeare CDROMS. Project Gutenberg
3 often releases Etexts that are NOT placed in the Public Domain!!
4
5 Shakespeare
6
7 *This Etext has certain copyright implications you should read!*
The tutorial example makes it clear that the value emitted is the line number, yet the documentation still says it is the byte offset. Is there a ready-made class in Scalding that reads the byte offset?
/**
Scalding tutorial part 1.
In part 0, we made a copy of hello.txt, but it wasn't a perfect copy:
it was annotated with line numbers.
That's because the data stream coming out of a TextLine source actually
has two fields: one, called "line", has the actual line of text. The other,
called "num", has the line number in the file. When you write these
tuples to a TextLine, it naively outputs them both on each line.
We can ask scalding to select just the "line" field from the pipe, using the
project() method. When we refer to a data stream's fields, we use Scala symbols,
like this: 'line.
To run this job:
scripts/scald.rb --local tutorial/Tutorial1.scala
Check the output:
cat tutorial/data/output1.txt
**/
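To make the difference between the two values concrete, here is a minimal sketch in plain Scala (no Scalding involved) that derives the starting byte offset of each line from the line text. `LineOffsets` and `withByteOffsets` are hypothetical names made up for this illustration, not part of Scalding or Cascading, and the sketch assumes UTF-8 text in which every line ends with a single '\n'.

```scala
import java.nio.charset.StandardCharsets

// Hypothetical helper for illustration only; not a Scalding or Cascading API.
object LineOffsets {
  // Pair each line with the byte offset at which it starts.
  // Assumes UTF-8 encoding and exactly one '\n' after every line.
  def withByteOffsets(lines: Seq[String]): Seq[(Long, String)] = {
    // Bytes each line occupies in the file, including its '\n' terminator.
    val sizes = lines.map(_.getBytes(StandardCharsets.UTF_8).length.toLong + 1L)
    // Running total: line i starts at the sum of the sizes of the lines before it.
    val starts = sizes.scanLeft(0L)(_ + _)
    // scanLeft yields one extra element (the total size); zip drops it.
    starts.zip(lines)
  }
}
```

For the lines "ab", "" and "cde", the line numbers would be 0, 1, 2, while the byte offsets computed this way are 0, 3, 4. Files with "\r\n" line endings or a different encoding would need the per-line size adjusted accordingly.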