hadoop - RANK OVER function in Hive -


I'm trying to run a query in Hive that returns the top 10 URLs that appear most often in the adimpression table.

    SELECT
            ranked_mytable.url,
            ranked_mytable.cnt
    FROM
            ( SELECT iq.url, iq.cnt, rank() OVER (PARTITION BY iq.url ORDER BY iq.cnt DESC) rnk
            FROM
                    ( SELECT url, count(*) cnt
                    FROM store.adimpression ai
                            INNER JOIN zuppa.adgroupcreativesubscription agcs
                                    ON agcs.id = ai.adgroupcreativesubscriptionid
                            INNER JOIN zuppa.adgroup ag
                                    ON ag.id = agcs.adgroupid
                    WHERE ai.datehour >= '2014-05-15 00:00:00'
                            AND ag.siteid = 1240
                    GROUP BY url
                    ) iq
            ) ranked_mytable
    WHERE
            ranked_mytable.rnk <= 10
    ORDER BY
            ranked_mytable.url,
            ranked_mytable.rnk DESC
    ;

Unfortunately, I get an error message stating:

    FAILED: SemanticException [Error 10002]: Line 26:23 Invalid column reference 'rnk'

I've tried to debug it, and everything up to the ranked_mytable sub-query runs smoothly. I've also tried commenting out the WHERE ranked_mytable.rnk <= 10 clause, but the error message keeps appearing.

Hive is unable to order by a column that is not in the "output" of the SELECT statement. To fix it, include that column in the selected columns:

    SELECT
            ranked_mytable.url,
            ranked_mytable.cnt,
            ranked_mytable.rnk
    FROM
            ( SELECT iq.url, iq.cnt, rank() OVER (PARTITION BY iq.url ORDER BY iq.cnt DESC) rnk
            FROM
                    ( SELECT url, count(*) cnt
                    FROM store.adimpression ai
                            INNER JOIN zuppa.adgroupcreativesubscription agcs
                                    ON agcs.id = ai.adgroupcreativesubscriptionid
                            INNER JOIN zuppa.adgroup ag
                                    ON ag.id = agcs.adgroupid
                    WHERE ai.datehour >= '2014-05-15 00:00:00'
                            AND ag.siteid = 1240
                    GROUP BY url
                    ) iq
            ) ranked_mytable
    WHERE
            ranked_mytable.rnk <= 10
    ORDER BY
            ranked_mytable.url,
            ranked_mytable.rnk DESC
    ;

If you don't want the 'rnk' column in the final output, I expect you could wrap the whole thing in an inner query and select out only the 'url' and 'cnt' fields.
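A rough sketch of that wrapping, untested and based only on the table and column names in the question. The outer alias `t` is a made-up name for illustration. Note the assumption that the row order produced by the inner ORDER BY carries through the outer SELECT; Hive may not guarantee this, so check the result on your version.

```sql
-- Outer query: expose only url and cnt, hiding the helper column rnk.
-- "t" is a hypothetical alias, not part of the original query.
SELECT
        t.url,
        t.cnt
FROM
        ( SELECT
                ranked_mytable.url,
                ranked_mytable.cnt,
                ranked_mytable.rnk   -- must be selected here so ORDER BY can see it
        FROM
                ( SELECT iq.url, iq.cnt,
                         rank() OVER (PARTITION BY iq.url ORDER BY iq.cnt DESC) rnk
                FROM
                        ( SELECT url, count(*) cnt
                        FROM store.adimpression ai
                                INNER JOIN zuppa.adgroupcreativesubscription agcs
                                        ON agcs.id = ai.adgroupcreativesubscriptionid
                                INNER JOIN zuppa.adgroup ag
                                        ON ag.id = agcs.adgroupid
                        WHERE ai.datehour >= '2014-05-15 00:00:00'
                                AND ag.siteid = 1240
                        GROUP BY url
                        ) iq
                ) ranked_mytable
        WHERE
                ranked_mytable.rnk <= 10
        -- Assumption: this ordering survives the outer SELECT; verify before relying on it.
        ORDER BY
                ranked_mytable.url,
                ranked_mytable.rnk DESC
        ) t
;
```

If the final ordering matters more than hiding 'rnk', it is probably simpler to keep 'rnk' in the output and drop it on the client side.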

