Could not find dsdgen at /home/troberts/spark-sql-perf/tpcds-kit/tools/dsdgen or //home/troberts/spark-sql-perf/tpcds-kit/tools/dsdgen
You need to install TPCDS first.
The spark-sql-perf docs for the tool you used say:
Before running any query, a dataset needs to be setup by creating a Benchmark object.
Generating the TPCDS data requires dsdgen built and available on the machines.
We have a fork of dsdgen that you will need.
The fork includes changes to generate TPCDS data to stdout, so that this library can pipe them directly to Spark, without intermediate files.
Therefore, this library will not work with the vanilla TPCDS kit.
TPCDS kit needs to be installed on all cluster executor nodes under the same path!
Please set up the TPCDS toolkit from Databricks.
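Since the error comes down to dsdgen not being found at the expected path, a quick sanity check on each node can help before rerunning the benchmark. The following is only a minimal sketch: `find_dsdgen` is a hypothetical helper, not part of spark-sql-perf, and it only checks the local machine, so it would have to be run on every executor node.

```python
import os


def find_dsdgen(tools_dir):
    """Return the full path to dsdgen in tools_dir, or None.

    Hypothetical helper for illustration; not part of spark-sql-perf.
    The binary must exist as a regular file and be executable.
    """
    candidate = os.path.join(tools_dir, "dsdgen")
    if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
        return candidate
    return None
```

For example, `find_dsdgen("/home/troberts/spark-sql-perf/tpcds-kit/tools")` should return the path once the kit has been built there; if it returns `None`, the benchmark will most likely fail with the same "Could not find dsdgen" error on that node.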