Spark-Vertica access problem (can connect, print the schema, and count, but cannot fetch records)
1 vote
/ September 29, 2019

I can connect to Vertica from EMR (Spark), count rows, and print the schema, but I cannot read the data. I am able to telnet to the Vertica host from EMR.

My Vertica versions

Vertica Database
09.02.0104
vertica.jar
09.01.0100

Spark shell command

spark-shell --jars /home/hadoop/vertica-9.0.1_spark2.1_scala2.11.jar,/home/hadoop/vertica-jdbc-9.0.1-7.jar

Query to read the Vertica table

val source = spark.read.format("com.vertica.spark.datasource.DefaultSource").
  option("host", "verticaHost").option("port", "port").
  option("db", "DataBase?ssl=true").option("dbschema", "Schemna").
  option("user", "user").option("password", "password").
  option("table", "table").option("numPartitions", "1").
  option("queryTimeout", "500").
  load()

I also tried adding .option("query", "select * from schema.table limit 100") and running count, but it returned the full table count instead of 100.

Output

source.printSchema() --> works, gets the schema information
source.count() --> works, gets the count.
source.show() --> Throws below error. 

Error

java.sql.SQLNonTransientConnectionException: [Vertica][VJDBC](100176) Failed to connect to host xxxxxx on port xxxx. Reason: Failed to establish a connection to the primary server or any backup address.
at com.vertica.io.ProtocolStream.<init>(Unknown Source)
at com.vertica.core.VConnection.tryConnect(Unknown Source)
at com.vertica.core.VConnection.connect(Unknown Source)
at com.vertica.jdbc.common.BaseConnectionFactory.doConnect(Unknown Source)
at com.vertica.jdbc.common.AbstractDriver.connect(Unknown Source)
at java.sql.DriverManager.getConnection(DriverManager.java:664)
at java.sql.DriverManager.getConnection(DriverManager.java:208)
at com.vertica.spark.datasource.VerticaDataSourceRDD$.getConnector(VerticaRDD.scala:176)
at com.vertica.spark.datasource.VerticaDataSourceRDD$$anonfun$scanTable$1.apply(VerticaRDD.scala:205)
at com.vertica.spark.datasource.VerticaDataSourceRDD$$anonfun$scanTable$1.apply(VerticaRDD.scala:205)
at com.vertica.spark.datasource.VerticaDataSourceRDD$$anon$1.<init>(VerticaRDD.scala:338)
at com.vertica.spark.datasource.VerticaDataSourceRDD.compute(VerticaRDD.scala:330)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
at org.apache.spark.scheduler.Task.run(Task.scala:121)
at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:402)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:408)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: com.vertica.support.exceptions.NonTransientConnectionException: [Vertica][VJDBC](100176) Failed to connect to host xxxxxx on port xxxx. Reason: Failed to establish a connection to the primary server or any backup address.
... 31 more
Caused by: java.io.IOException: Failed to establish a connection to the primary server or any backup address.
at com.vertica.io.VStream.establishConnection(Unknown Source)
at com.vertica.io.VStream.<init>(Unknown Source)
... 31 more
Caused by: java.net.ConnectException: Connection timed out (Connection timed out)
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:589)
at java.net.Socket.connect(Socket.java:538)
... 33 more
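
The trace above shows the timed-out connection being opened inside an executor task (VerticaDataSourceRDD.compute under Executor$TaskRunner). That would be consistent with printSchema() and count() being answered from the driver while show() connects from the worker nodes, although this is an assumption about this connector version and not something confirmed here. A minimal sketch for testing plain TCP reachability follows; canConnect is a hypothetical helper name, not part of Spark or the Vertica connector.

```scala
import java.io.IOException
import java.net.{InetSocketAddress, Socket}

// Hypothetical helper: returns true if a TCP connection to host:port
// can be opened within timeoutMs, false on refusal, timeout, or DNS failure.
def canConnect(host: String, port: Int, timeoutMs: Int = 5000): Boolean = {
  val socket = new Socket()
  try {
    socket.connect(new InetSocketAddress(host, port), timeoutMs)
    true
  } catch {
    case _: IOException => false
  } finally {
    socket.close()
  }
}
```

In spark-shell, calling canConnect("verticaHost", 5433) checks the driver, and something like spark.sparkContext.parallelize(1 to 20, 20).map(_ => canConnect("verticaHost", 5433)).collect() would run the same check inside executor tasks. Here "verticaHost" and 5433 are placeholders for the real host and port. If the driver returns true and the executors return false, the worker nodes probably lack a network route to Vertica (for example, a security group or firewall rule).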

How is it possible to count rows and print the schema, but not read the data? Thanks in advance!

...