Нет удобного способа сделать это в Pig, но JOIN может помочь вам:
Pig:
ch = LOAD 'ch.txt';
ch_all = GROUP ch ALL;
ch_count = FOREACH ch_all GENERATE 'same' AS key, (DOUBLE) COUNT(ch) AS ct;
ca = LOAD 'ca.txt';
ca_all = GROUP ca ALL;
ca_count = FOREACH ca_all GENERATE 'same' AS key, (DOUBLE) COUNT(ca) AS ct;
ca_ch = JOIN ch_count BY key, ca_count BY key;
ca_ch_div = FOREACH ca_ch GENERATE ch_count::ct / ca_count::ct;
DUMP ca_ch_div;
Вывод:
(0.6666666666666666)
Ввод:
cat ch.txt
1
2
cat ca.txt
1
2
3