BigQuery: similar queries, different output
6 November 2018

I have two standard SQL queries in BigQuery:

Query1:

select sfcase.case_id
, sfuser.user_id
, sfcase_create_date
, sfcase_status
, sfcase_origin
, sfcategory_category1
, sfcategory_category2
, sfcase_priority
, sftime_elapsedmin
, sftime_targetmin
, sfcase_sla_closemin
, if(count(sfcomment.parentid)=0,"0"
,if(count(sfcomment.parentid)=1,"1"
,if(count(sfcomment.parentid)=2,"2"
,"3"))) as comment_response
from(
  select id as case_id
  , timestamp_add(createddate, interval 7 hour) as sfcase_create_date
  , status as sfcase_status
  , origin as sfcase_origin
  , priority as sfcase_priority
  , case when status = 'Closed' then timestamp_diff(timestamp_add(closeddate, interval 7 hour),timestamp_add(createddate, interval 7 hour),minute)
      end as sfcase_sla_closemin
  , case_category__c
  from `some_of_my_dataset.cs_case` 
) sfcase

left join(
  select upper(x1st_category__c) as sfcategory_category1
  , upper(x2nd_category__c) as sfcategory_category2
  , id
  from `some_of_my_dataset.cs_case_category` 
) sfcategory
on sfcategory.id = sfcase.case_category__c

left join(
  select parentid as parentid
  from `some_of_my_dataset.cs_case_comment` 
) sfcomment
on sfcase.case_id = sfcomment.parentid

left join(
  select ELAPSEDTIMEINMINS as sftime_elapsedmin
  , TARGETRESPONSEINMINS as sftime_targetmin
  , caseid
  from `some_of_my_dataset.cs_case_milestone` 
)sftime
on sfcase.case_id = sftime.caseid

left join(
  select id as user_id 
  , createddate
  from `some_of_my_dataset.cs_user` 
)sfuser
on date(sfuser.createddate) = date(sfcase.sfcase_create_date)
group by 1
, 2
, 3
, 4
, 5
, 6
, 7 
, 8 
, 9 
, 10
, 11

Query2:

select sfcase.id as case_id
, sfuser.id as user_id
, timestamp_add(sfcase.createddate, interval 7 hour) as sf_create_date
, sfcase.status as sf_status
, sfcase.origin as sf_origin
, upper(sfcategory.x1st_category__c) as sf_category1
, sfcategory.x2nd_category__c as sf_category2
, sfcase.priority as sf_priority
, sftime.ELAPSEDTIMEINMINS as sf_elapsedresponsemin
, sftime.TARGETRESPONSEINMINS as sf_targetresponsemin
, case when sfcase.status = 'Closed' then timestamp_diff(timestamp_add(sfcase.closeddate, interval 7 hour),timestamp_add(sfcase.createddate, interval 7 hour),minute)
    end as sla_closemin
, if(count(sfcomment.parentid)=0,"0"
,if(count(sfcomment.parentid)=1,"1"
,if(count(sfcomment.parentid)=2,"2"
,"3"))) as comment_response

from `some_of_my_dataset.cs_case` as sfcase
left join `some_of_my_dataset.cs_case_category` as sfcategory
  on sfcategory.id = sfcase.case_category__c
left join `some_of_my_dataset.cs_case_comment`  as sfcomment
  on sfcase.id = sfcomment.parentid
left join `some_of_my_dataset.cs_case_milestone` as sftime
  on sfcase.id = sftime.caseid
left join `some_of_my_dataset.cs_user` as sfuser
  on date(sfuser.createddate) = date(sfcase.createddate)
group by 1
, 2
, 3
, 4
, 5
, 6
, 7 
, 8 
, 9 
, 10
, 11

I ran both of them. Query1 finishes faster and returns fewer rows, whereas Query2 takes longer and returns more rows. Both Query1 and Query2 return 12 columns.
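
One visible difference between the two is the join to cs_user: Query1 compares date(sfuser.createddate) with the case timestamp after the 7-hour shift (sfcase_create_date), while Query2 compares it with the unshifted sfcase.createddate. Below is a minimal sketch for checking how many cases this affects. It assumes createddate is a TIMESTAMP and that date() uses its default UTC time zone; the output column names are made up for illustration.

```sql
-- Count cases whose calendar date changes once 7 hours are added.
-- For these cases Query1 and Query2 are matched against different cs_user rows.
select
  countif(date(createddate) != date(timestamp_add(createddate, interval 7 hour))) as cases_with_shifted_date
, count(*) as total_cases
from `some_of_my_dataset.cs_case`
```

If cases_with_shifted_date is above zero, the two joins can match different users, which by itself may change the row count. Query2 also leaves x2nd_category__c without upper(), so its sf_category2 values can differ in letter case from sfcategory_category2 in Query1.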

Why do they return different results? Which query should I use?
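
To see where the outputs diverge, one option is to compare row counts per case. The sketch below assumes each query's result has first been saved to a table; the names query1_result and query2_result are hypothetical and not part of the original dataset.

```sql
-- Hypothetical tables: the output of Query1 and Query2, saved beforehand.
with q1 as (
  select case_id, count(*) as rows_q1
  from `some_of_my_dataset.query1_result`
  group by case_id
),
q2 as (
  select case_id, count(*) as rows_q2
  from `some_of_my_dataset.query2_result`
  group by case_id
)
-- List the cases where the two queries produce a different number of rows.
select case_id
, ifnull(rows_q1, 0) as rows_q1
, ifnull(rows_q2, 0) as rows_q2
from q1
full outer join q2 using (case_id)
where ifnull(rows_q1, 0) != ifnull(rows_q2, 0)
order by abs(ifnull(rows_q1, 0) - ifnull(rows_q2, 0)) desc
limit 100
```

Looking up a few of the listed case_id values in cs_user, cs_case_comment and cs_case_milestone should show which join produces the extra rows.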

Update: renamed my dataset.
