How to reduce unnecessary threads in MapReduce jobs
07 November 2019

I wrote a very simple MapReduce job to check how many threads are running in parallel. The code is below:

    System.out.println("--42-- thread num in main is : " + Thread.activeCount());
    JobConf jconf2 = new JobConf(TestPermission.class);
    System.out.println("--44-- thread num in main is : " + Thread.activeCount());

    jconf2.setMapOutputKeyClass(IntWritable.class);
    jconf2.setMapOutputValueClass(NullWritable.class);
    System.out.println("--48-- thread num in main is : " + Thread.activeCount());

    jconf2.setMapperClass(Map1.class);
    System.out.println("--55-- thread num in main is : " + Thread.activeCount());

    jconf2.setInputFormat(TextInputFormat.class);
    System.out.println("--58-- thread num in main is : " + Thread.activeCount());
    jconf2.setOutputFormat(NothingOutputFormat.class);
    System.out.println("--60-- thread num in main is : " + Thread.activeCount());

    jconf2.set("mapred.map.tasks", String.valueOf(1));
    System.out.println("--63-- thread num in main is : " + Thread.activeCount());
    jconf2.set("mapred.reduce.tasks", String.valueOf(0));
    System.out.println("--65-- thread num in main is : " + Thread.activeCount());

    jconf2.set("mapreduce.reduce.shuffle.parallelcopies", "0");
    System.out.println("--68-- thread num in main is : " + Thread.activeCount());
    jconf2.set("mapreduce.tasktracker.http.threads", "0");
    System.out.println("--70-- thread num in main is : " + Thread.activeCount());
    jconf2.set("mapreduce.jobtracker.jobinit.threads", "0");
    System.out.println("--72-- thread num in main is : " + Thread.activeCount());


    TextInputFormat.setInputPaths(jconf2, new Path(args[0]));
    System.out.println("--76-- thread num in main is : " + Thread.activeCount());
    NothingOutputFormat.setOutputPath(jconf2, new Path(args[1]));
    System.out.println("--78-- thread num in main is : " + Thread.activeCount());

    Path output_path = new Path(args[1]);
    FileSystem fs = output_path.getFileSystem(jconf2);
    System.out.println("--82-- thread num in main is : " + Thread.activeCount());
    FsPermission fper = new FsPermission(
            FsAction.ALL, //user action
            FsAction.ALL, //group action
            FsAction.ALL);

    giveAllAthority(fs, output_path, fper);

    System.out.println("---90---thread num in main is : " + Thread.activeCount());

    JobClient.runJob(jconf2);
    System.out.println("--93-- thread num in main is : " + Thread.activeCount());

When I run the job, the output is as follows:

--42-- thread num in main is : 1
--44-- thread num in main is : 1
--48-- thread num in main is : 1
--55-- thread num in main is : 1
--58-- thread num in main is : 1
--60-- thread num in main is : 1
--63-- thread num in main is : 1
--65-- thread num in main is : 1
--68-- thread num in main is : 1
--70-- thread num in main is : 1
--72-- thread num in main is : 1
--76-- thread num in main is : 3
--78-- thread num in main is : 3
--82-- thread num in main is : 3
---90---thread num in main is : 6

    ............

--93-- thread num in main is : 13

I'm confused: why does the number of active threads increase at lines 76, 90, and 93? My job does not need a shuffle, so there is no reduce phase. If these threads are not needed, I would like to know how to shut them down.
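To find out which threads these are, the counters alone are not enough: `Thread.activeCount()` only returns an estimate of the number of live threads in the current thread group and its subgroups, without saying what they are. Below is a minimal sketch, not part of the original job, that lists every live thread in the JVM by name. The class name `ThreadDump` and the helper `describeThreads` are hypothetical names chosen for illustration; only standard `java.lang.Thread` calls are used.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical diagnostic helper; not part of the original job code.
public class ThreadDump {

    /** Returns one line per live thread: name, thread group, daemon flag, state. */
    static List<String> describeThreads() {
        List<String> lines = new ArrayList<>();
        // getAllStackTraces() covers all live threads in the JVM,
        // not only those in the caller's thread group.
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            ThreadGroup group = t.getThreadGroup(); // null once the thread has terminated
            lines.add(String.format("%s group=%s daemon=%b state=%s",
                    t.getName(),
                    group == null ? "none" : group.getName(),
                    t.isDaemon(),
                    t.getState()));
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : describeThreads()) {
            System.out.println(line);
        }
    }
}
```

Calling `ThreadDump.describeThreads()` next to the prints at lines 76, 90, and 93 and printing the result should show the names of the threads that appeared at each step. The daemon flag is worth checking, since daemon threads do not keep the JVM alive after `main` returns.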

...