Сбой задания при экспорте данных из HDFS в mysql с использованием sqoop в Cloudera - PullRequest
0 голосов
/ 10 июня 2018

Я экспортирую данные HDFS файла департаментов_экспорта, представленных в каталоге HDFS в каталоге / user / training / sqoop_import /partments_export.Ниже приведены записи, присутствующие в файле.

2,Fitness
3,Footwear
4,Apparel
5,Golf
6,Outdoors
7,Fan Shop
8,Development
1000,Admin
1001,Books

Я хочу экспортировать данные в таблицу mysql с именем департаментов_экспорта (департамент_идентификатора, отдел_имя varchar).Эта таблица уже содержит следующие данные

mysql> select * from departments_export;
+---------------+-----------------+
| department_id | department_name |
+---------------+-----------------+
|             2 | Fitness         |
|             3 | Footwear        |
|             4 | Apparel         |
|             5 | Golf            |
|             6 | Outdoors        |
|             7 | Fan Shop        |
|             8 | Development     |
|          1000 | Admin           |
+---------------+-----------------+

Я выполняю команду ниже для экспорта sqoop

sqoop export \
--connect "jdbc:mysql://quickstart.cloudera:3306/retail_db" \
--username retail_dba \
--password cloudera \
--table departments_export \
--export-dir /user/training/sqoop_import/departments_export \
--batch \
-m 1 \
--update-key department_id \
--update-mode allowinsert \

Я получаю журналы ниже в командной строке.

Warning: /usr/lib/sqoop/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
18/06/10 10:42:39 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6-cdh5.13.0
18/06/10 10:42:40 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
18/06/10 10:42:41 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
18/06/10 10:42:43 INFO tool.CodeGenTool: Beginning code generation
18/06/10 10:42:43 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `departments_export` AS t LIMIT 1
18/06/10 10:42:44 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `departments_export` AS t LIMIT 1
18/06/10 10:42:44 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/lib/hadoop-mapreduce
Note: /tmp/sqoop-cloudera/compile/995860db956fe955c309e42de79ab4f9/departments_export.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
18/06/10 10:42:51 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-cloudera/compile/995860db956fe955c309e42de79ab4f9/departments_export.jar
18/06/10 10:42:51 INFO mapreduce.ExportJobBase: Beginning export of departments_export
18/06/10 10:42:51 INFO Configuration.deprecation: mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
18/06/10 10:42:53 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
18/06/10 10:42:57 INFO Configuration.deprecation: mapred.reduce.tasks.speculative.execution is deprecated. Instead, use mapreduce.reduce.speculative
18/06/10 10:42:57 INFO Configuration.deprecation: mapred.map.tasks.speculative.execution is deprecated. Instead, use mapreduce.map.speculative
18/06/10 10:42:57 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
18/06/10 10:42:57 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
18/06/10 10:43:00 WARN hdfs.DFSClient: Caught exception 
java.lang.InterruptedException
    at java.lang.Object.wait(Native Method)
    at java.lang.Thread.join(Thread.java:1281)
    at java.lang.Thread.join(Thread.java:1355)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.closeResponder(DFSOutputStream.java:967)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.endBlock(DFSOutputStream.java:705)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:894)
18/06/10 10:43:00 WARN hdfs.DFSClient: Caught exception 
java.lang.InterruptedException
    at java.lang.Object.wait(Native Method)
    at java.lang.Thread.join(Thread.java:1281)
    at java.lang.Thread.join(Thread.java:1355)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.closeResponder(DFSOutputStream.java:967)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.endBlock(DFSOutputStream.java:705)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:894)
18/06/10 10:43:00 WARN hdfs.DFSClient: Caught exception 
java.lang.InterruptedException
    at java.lang.Object.wait(Native Method)
    at java.lang.Thread.join(Thread.java:1281)
    at java.lang.Thread.join(Thread.java:1355)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.closeResponder(DFSOutputStream.java:967)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.endBlock(DFSOutputStream.java:705)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:894)
18/06/10 10:43:01 WARN hdfs.DFSClient: Caught exception 
java.lang.InterruptedException
    at java.lang.Object.wait(Native Method)
    at java.lang.Thread.join(Thread.java:1281)
    at java.lang.Thread.join(Thread.java:1355)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.closeResponder(DFSOutputStream.java:967)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.endBlock(DFSOutputStream.java:705)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:894)
18/06/10 10:43:01 WARN hdfs.DFSClient: Caught exception 
java.lang.InterruptedException
    at java.lang.Object.wait(Native Method)
    at java.lang.Thread.join(Thread.java:1281)
    at java.lang.Thread.join(Thread.java:1355)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.closeResponder(DFSOutputStream.java:967)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.endBlock(DFSOutputStream.java:705)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:894)
18/06/10 10:43:01 WARN hdfs.DFSClient: Caught exception 
java.lang.InterruptedException
    at java.lang.Object.wait(Native Method)
    at java.lang.Thread.join(Thread.java:1281)
    at java.lang.Thread.join(Thread.java:1355)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.closeResponder(DFSOutputStream.java:967)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.endBlock(DFSOutputStream.java:705)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:894)
18/06/10 10:43:02 WARN hdfs.DFSClient: Caught exception 
java.lang.InterruptedException
    at java.lang.Object.wait(Native Method)
    at java.lang.Thread.join(Thread.java:1281)
    at java.lang.Thread.join(Thread.java:1355)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.closeResponder(DFSOutputStream.java:967)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.endBlock(DFSOutputStream.java:705)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:894)
18/06/10 10:43:02 WARN hdfs.DFSClient: Caught exception 
java.lang.InterruptedException
    at java.lang.Object.wait(Native Method)
    at java.lang.Thread.join(Thread.java:1281)
    at java.lang.Thread.join(Thread.java:1355)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.closeResponder(DFSOutputStream.java:967)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.endBlock(DFSOutputStream.java:705)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:894)
18/06/10 10:43:02 WARN hdfs.DFSClient: Caught exception 
java.lang.InterruptedException
    at java.lang.Object.wait(Native Method)
    at java.lang.Thread.join(Thread.java:1281)
    at java.lang.Thread.join(Thread.java:1355)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.closeResponder(DFSOutputStream.java:967)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.endBlock(DFSOutputStream.java:705)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:894)
18/06/10 10:43:02 WARN hdfs.DFSClient: Caught exception 
java.lang.InterruptedException
    at java.lang.Object.wait(Native Method)
    at java.lang.Thread.join(Thread.java:1281)
    at java.lang.Thread.join(Thread.java:1355)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.closeResponder(DFSOutputStream.java:967)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.endBlock(DFSOutputStream.java:705)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:894)
18/06/10 10:43:02 WARN hdfs.DFSClient: Caught exception 
java.lang.InterruptedException
    at java.lang.Object.wait(Native Method)
    at java.lang.Thread.join(Thread.java:1281)
    at java.lang.Thread.join(Thread.java:1355)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.closeResponder(DFSOutputStream.java:967)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.endBlock(DFSOutputStream.java:705)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:894)
18/06/10 10:43:03 WARN hdfs.DFSClient: Caught exception 
java.lang.InterruptedException
    at java.lang.Object.wait(Native Method)
    at java.lang.Thread.join(Thread.java:1281)
    at java.lang.Thread.join(Thread.java:1355)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.closeResponder(DFSOutputStream.java:967)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.endBlock(DFSOutputStream.java:705)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:894)
18/06/10 10:43:04 WARN hdfs.DFSClient: Caught exception 
java.lang.InterruptedException
    at java.lang.Object.wait(Native Method)
    at java.lang.Thread.join(Thread.java:1281)
    at java.lang.Thread.join(Thread.java:1355)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.closeResponder(DFSOutputStream.java:967)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.endBlock(DFSOutputStream.java:705)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:894)
18/06/10 10:43:04 INFO input.FileInputFormat: Total input paths to process : 1
18/06/10 10:43:04 INFO input.FileInputFormat: Total input paths to process : 1
18/06/10 10:43:04 WARN hdfs.DFSClient: Caught exception 
java.lang.InterruptedException
    at java.lang.Object.wait(Native Method)
    at java.lang.Thread.join(Thread.java:1281)
    at java.lang.Thread.join(Thread.java:1355)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.closeResponder(DFSOutputStream.java:967)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.endBlock(DFSOutputStream.java:705)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:894)
18/06/10 10:43:04 WARN hdfs.DFSClient: Caught exception 
java.lang.InterruptedException
    at java.lang.Object.wait(Native Method)
    at java.lang.Thread.join(Thread.java:1281)
    at java.lang.Thread.join(Thread.java:1355)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.closeResponder(DFSOutputStream.java:967)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.endBlock(DFSOutputStream.java:705)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:894)
18/06/10 10:43:04 INFO mapreduce.JobSubmitter: number of splits:1
18/06/10 10:43:04 INFO Configuration.deprecation: mapred.map.tasks.speculative.execution is deprecated. Instead, use mapreduce.map.speculative
18/06/10 10:43:05 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1527924460497_0039
18/06/10 10:43:06 INFO impl.YarnClientImpl: Submitted application application_1527924460497_0039
18/06/10 10:43:06 INFO mapreduce.Job: The url to track the job: http://quickstart.cloudera:8088/proxy/application_1527924460497_0039/
18/06/10 10:43:06 INFO mapreduce.Job: Running job: job_1527924460497_0039
18/06/10 10:43:33 INFO mapreduce.Job: Job job_1527924460497_0039 running in uber mode : false
18/06/10 10:43:33 INFO mapreduce.Job:  map 0% reduce 0%
18/06/10 10:43:59 INFO mapreduce.Job:  map 100% reduce 0%
18/06/10 10:43:59 INFO mapreduce.Job: Job job_1527924460497_0039 failed with state FAILED due to: Task failed task_1527924460497_0039_m_000000
Job failed as tasks failed. failedMaps:1 failedReduces:0

18/06/10 10:43:59 INFO mapreduce.Job: Counters: 8
    Job Counters 
        Failed map tasks=1
        Launched map tasks=1
        Data-local map tasks=1
        Total time spent by all maps in occupied slots (ms)=22735
        Total time spent by all reduces in occupied slots (ms)=0
        Total time spent by all map tasks (ms)=22735
        Total vcore-milliseconds taken by all map tasks=22735
        Total megabyte-milliseconds taken by all map tasks=23280640
18/06/10 10:43:59 WARN mapreduce.Counters: Group FileSystemCounters is deprecated. Use org.apache.hadoop.mapreduce.FileSystemCounter instead
18/06/10 10:43:59 INFO mapreduce.ExportJobBase: Transferred 0 bytes in 62.3015 seconds (0 bytes/sec)
18/06/10 10:43:59 WARN mapreduce.Counters: Group org.apache.hadoop.mapred.Task$Counter is deprecated. Use org.apache.hadoop.mapreduce.TaskCounter instead
18/06/10 10:43:59 INFO mapreduce.ExportJobBase: Exported 0 records.
18/06/10 10:43:59 ERROR tool.ExportTool: Error during export: 
Export job failed!
    at org.apache.sqoop.mapreduce.ExportJobBase.runExport(ExportJobBase.java:439)
    at org.apache.sqoop.manager.SqlManager.updateTable(SqlManager.java:965)
    at org.apache.sqoop.tool.ExportTool.exportTable(ExportTool.java:70)
    at org.apache.sqoop.tool.ExportTool.run(ExportTool.java:99)
    at org.apache.sqoop.Sqoop.run(Sqoop.java:147)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:183)
    at org.apache.sqoop.Sqoop.runTool(Sqoop.java:234)
    at org.apache.sqoop.Sqoop.runTool(Sqoop.java:243)
    at org.apache.sqoop.Sqoop.main(Sqoop.java:252)

1 Ответ

0 голосов
/ 11 июня 2018

Есть ли у вас первичный ключ в таблице департаментов_экспорта, если нет, создайте первичный ключ для отдела_ид.

Попробуйте это.

sqoop export \ --connect "jdbc: mysql: // quickstart.cloudera: 3306 / retail_db "\ --пользовательское имя retail_dba \ - пароль парольpartments.txt \ --fields-terminated-by = ',' -m 1 \

Я пробовал, и он работает нормально.

...