묻고 답해요
141만명의 커뮤니티!! 함께 토론해봐요.
인프런 TOP Writers
-
미해결BigQuery(SQL) 활용편(퍼널 분석, 리텐션 분석)
[빠짝스터디 1주차 과제] ARRAY, STRUCT / PIVOT / 퍼널 연습 문제
1. ARRAY, STRUCT-- 1. array_exercises 테이블에서 각 영화(title)별로 장르(genres)를 UNNEST해서 보여주세요. -- genres 꺼내기 SELECT title , genre FROM `advanced.array_exercises` CROSS JOIN UNNEST(genres) AS genre -- 2. array_exercises 테이블에서 각 영화(title)별로 배우(actor)와 배역(character)을 보여주세요. -- arrary 안의 struct 영화/배우 꺼내기 SELECT title , ac.actor , ac.character FROM `advanced.array_exercises` CROSS JOIN UNNEST(actors) AS ac -- actor.actor도 가능 -- 3. array_exercises 테이블에서 각 영화(title)별로 배우(actor)와 배역(character), 장르(genre) 출력 -- actors, genres 각각 꺼내기 (2번 조인) SELECT -- title, actor, character, genre title , ac.actor , ac.character , genre FROM `advanced.array_exercises` CROSS JOIN UNNEST(actors) AS ac CROSS JOIN UNNEST(genres) AS genre -- 4. 앱로그 데이터(app_logs)의 배열을 풀어주세요. -- event_params 꺼내기 SELECT user_id , event_date , event_name , user_pseudo_id , event_param.key , event_param.value.string_value , event_param.value.int_value FROM `advanced.app_logs` CROSS JOIN UNNEST(event_params) AS event_param -- WHERE event_date = '2022-08-01' -- test 2. PIVOT-- 1. orders 테이블에서 유저(user_id)별로 주문 금액(amount)의 합계를 PIOVT해주세요. -- 날짜(order_date)를 행, user_id를 열, sum(amount) SELECT order_date , SUM(IF(user_id = 1, amount, 0)) AS user_1 , SUM(IF(user_id = 2, amount, 0)) AS user_2 , SUM(IF(user_id = 3, amount, 0)) AS user_3 FROM `advanced.orders` GROUP BY 1 ORDER BY 1 -- 2. orders 테이블에서 날짜(order_date)별로 유저들의 주문 금액(amount) 합계를 PIVOT -- user_id 행, order_date 열, sum(주문 금액), '-' 포함 날짜 별칭은 backtick(``) 활용 SELECT user_id , SUM(IF(order_date = '2023-05-01', amount, 0)) AS `2023-05-01` , SUM(IF(order_date = '2023-05-02', amount, 0)) AS `2023-05-02` , SUM(IF(order_date = '2023-05-03', amount, 0)) AS `2023-05-03` , SUM(IF(order_date = '2023-05-04', amount, 0)) AS `2023-05-04` , SUM(IF(order_date = '2023-05-05', amount, 0)) AS `2023-05-05` FROM `advanced.orders` GROUP BY user_id ORDER BY user_id -- 3. orders 테이블에서 사용자별 날짜별로 주문이 있다면 1, 없으면 0으로 PIOVT -- user_id 행, order_date 열, if(날짜, 1, 0) SELECT user_id , MAX(IF(order_date = '2023-05-01', 1, 0)) AS `2023-05-01` , MAX(IF(order_date = '2023-05-02', 1, 0)) AS `2023-05-02` , MAX(IF(order_date = '2023-05-03', 1, 0)) AS `2023-05-03` , MAX(IF(order_date = '2023-05-04', 1, 0)) AS `2023-05-04` , MAX(IF(order_date = '2023-05-05', 1, 0)) AS `2023-05-05` FROM `advanced.orders` GROUP BY user_id ORDER BY user_id 3. 퍼널3-1. 퍼널별 유저 수-- 1. 퍼널별 유저 수 : 2022-08-01 ~ 2022-08-18, 오픈 퍼널, COUNT(DISTINCT user_pseudo_id) WITH funnels AS ( -- 조건/컬럼 필터링, event_params UNNEST, event name + screen 문자열 컬럼 병합 SELECT CONCAT(event_name, '-', event_param.value.string_value) AS event_name_with_screen , COUNT(DISTINCT user_pseudo_id) AS cnt FROM `advanced.app_logs` CROSS JOIN UNNEST(event_params) AS event_param WHERE event_date BETWEEN '2022-08-01' AND '2022-08-18' AND event_name IN ('screen_view', 'click_payment') AND event_param.key = 'firebase_screen' AND event_param.value.string_value NOT IN ('food_detail', 'search', 'search_result') GROUP BY ALL ) SELECT event_name_with_screen -- step_number 생성 , (CASE WHEN event_name_with_screen = 'screen_view-welcome' THEN 1 WHEN event_name_with_screen = 'screen_view-home' THEN 2 WHEN event_name_with_screen = 'screen_view-food_category' THEN 3 WHEN event_name_with_screen = 'screen_view-restaurant' THEN 4 WHEN event_name_with_screen = 'screen_view-cart' THEN 5 WHEN event_name_with_screen = 'click_payment-cart' THEN 6 END ) AS step_number , cnt FROM funnels ORDER BY step_number3-2. 퍼널별 유저 수(일자별)-- 2. 퍼널별 유저 수(일자별) : 2022-08-01 ~ 2022-08-18, 오픈 퍼널, COUNT(DISTINCT user_pseudo_id) WITH funnels AS ( -- 조건/컬럼 필터링, event_params UNNEST, event name + screen 문자열 컬럼 병합 SELECT -- 일자별 event_date , CONCAT(event_name, '-', event_param.value.string_value) AS event_name_with_screen , COUNT(DISTINCT user_pseudo_id) AS cnt FROM `advanced.app_logs` CROSS JOIN UNNEST(event_params) AS event_param WHERE event_date BETWEEN '2022-08-01' AND '2022-08-18' AND event_name IN ('screen_view', 'click_payment') AND event_param.key = 'firebase_screen' AND event_param.value.string_value NOT IN ('food_detail', 'search', 'search_result') GROUP BY ALL ) SELECT event_date , event_name_with_screen -- step_number 생성 , (CASE WHEN event_name_with_screen = 'screen_view-welcome' THEN 1 WHEN event_name_with_screen = 'screen_view-home' THEN 2 WHEN event_name_with_screen = 'screen_view-food_category' THEN 3 WHEN event_name_with_screen = 'screen_view-restaurant' THEN 4 WHEN event_name_with_screen = 'screen_view-cart' THEN 5 WHEN event_name_with_screen = 'click_payment-cart' THEN 6 END ) AS step_number , cnt FROM funnels ORDER BY event_date, step_number -- 정렬 변경 3-3. 퍼널별 유저 수(일자별 PIVOT)-- 3. 퍼널별 유저 수(일자별) 집계 PIVOT WITH funnels AS ( -- 조건/컬럼 필터링, event_params UNNEST, event name + screen 문자열 컬럼 병합 SELECT -- 일자별 event_date , CONCAT(event_name, '-', event_param.value.string_value) AS event_name_with_screen , COUNT(DISTINCT user_pseudo_id) AS cnt FROM `advanced.app_logs` CROSS JOIN UNNEST(event_params) AS event_param WHERE event_date BETWEEN '2022-08-01' AND '2022-08-18' AND event_name IN ('screen_view', 'click_payment') AND event_param.key = 'firebase_screen' AND event_param.value.string_value NOT IN ('food_detail', 'search', 'search_result') GROUP BY ALL ) SELECT event_date -- event PIVOT , MAX(IF(event_name_with_screen = 'screen_view-welcome', cnt, NULL)) AS `screen_view-welcome` , MAX(IF(event_name_with_screen = 'screen_view-home', cnt, NULL)) AS `screen_view-home` , MAX(IF(event_name_with_screen = 'screen_view-food_category', cnt, NULL)) AS `screen_view-food_category` , MAX(IF(event_name_with_screen = 'screen_view-restaurant', cnt, NULL)) AS `screen_view-restaurant` , MAX(IF(event_name_with_screen = 'screen_view-cart', cnt, NULL)) AS `screen_view-cart` , MAX(IF(event_name_with_screen = 'click_payment-cart', cnt, NULL)) AS `click_payment-cart` FROM funnels GROUP BY ALL ORDER BY event_date
-
미해결BigQuery(SQL) 활용편(퍼널 분석, 리텐션 분석)
[빠짝스터디 1주차 과제] ARRAY, STRUCT / PIVOT / 퍼널 쿼리 연습 문제
ARRAY, STRUCT 연습문제 1) SELECT title, genre FROM advanced.array_exercises CROSS JOIN UNNEST(genres) AS genre 2) SELECT title, actors, actors[SAFE_OFFSET(0)].actor AS frist_actor, actors[SAFE_OFFSET(0)].character AS first_character, actors[SAFE_OFFSET(1)].actor AS second_actor, actors[SAFE_OFFSET(1)].character AS second_character FROM advanced.array_exercises AS ae 3) SELECT title, actor.actor, actor.character FROM advanced.array_exercises AS ae CROSS JOIN UNNEST(actors) AS actor 4) SELECT event_Date, event_timestamp, event_name, event_params, user_id, event_param.key AS key, event_param.value AS value, event_param.value.string_value AS string_value, event_param.value.int_value AS int_value FROM advanced.app_logs_temp CROSS JOIN UNNEST(event_params) AS event_param WHERE event_date = '2022-08-01'PIVOT 연습문제1) SELECT order_date, SUM(IF(user_id = 1, sum_of_amount, 0)) AS user_1, SUM(IF(user_id = 2, sum_of_amount, 0)) AS user_2, SUM(IF(user_id = 3, sum_of_amount, 0)) AS user_3, FROM ( SELECT order_date, user_id, SUM(amount) AS sum_of_amount FROM advanced.orders GROUP BY order_date, user_id ORDER BY order_date) GROUP BY order_date2) SELECT user_id, SUM(IF(order_date = '2023-05-01', sum_of_amount, 0)) AS `2023-05-01`, SUM(IF(order_date = '2023-05-02', sum_of_amount, 0)) AS `2023-05-02`, SUM(IF(order_date = '2023-05-03', sum_of_amount, 0)) AS `2023-05-03`, SUM(IF(order_date = '2023-05-04', sum_of_amount, 0)) AS `2023-05-04`, SUM(IF(order_date = '2023-05-05', sum_of_amount, 0)) AS `2023-05-05` FROM ( SELECT user_id, order_date, SUM(amount) AS sum_of_amount FROM advanced.orders GROUP BY user_id,order_date) GROUP BY user_id 3) SELECT user_id, MAX(IF(order_date = '2023-05-01' AND user_id IS NOT NULL, 1, 0)) AS `2023-05-01`, MAX(IF(order_date = '2023-05-02' AND user_id IS NOT NULL, 1, 0)) AS `2023-05-02`, MAX(IF(order_date = '2023-05-03' AND user_id IS NOT NULL, 1, 0)) AS `2023-05-03`, MAX(IF(order_date = '2023-05-04' AND user_id IS NOT NULL, 1, 0)) AS `2023-05-04`, MAX(IF(order_date = '2023-05-05' AND user_id IS NOT NULL, 1, 0)) AS `2023-05-05` FROM advanced.orders GROUP BY user_id 4) SELECT event_date, event_timestamp, event_name, user_id, user_pseudo_id, MAX(IF(param.key = 'firebase_screen', param.value.string_value, NULL)) AS firebase_screen, MAX(IF(param.key = 'food_id', param.value.string_value, NULL)) AS food_id, MAX(IF(param.key = 'food_id', param.value.int_value, NULL)) AS food_id2, MAX(IF(param.key = 'session_id', param.value.string_value, NULL)) AS session_id FROM advanced.app_logs_temp CROSS JOIN UNNEST(event_params) AS param WHERE event_date = '2022-08-01' GROUP BY ALL 퍼널분석WITH Funnel AS (SELECT event_date, event_timestamp, event_name, user_id, user_pseudo_id, MAX(IF(param.key = 'firebase_screen', param.value.string_value, NULL)) AS firebase_screen, MAX(IF(param.key = 'food_id', param.value.string_value, NULL)) AS food_id, MAX(IF(param.key = 'food_id', param.value.int_value, NULL)) AS food_id2, MAX(IF(param.key = 'session_id', param.value.string_value, NULL)) AS session_id FROM advanced.app_logs_temp CROSS JOIN UNNEST(event_params) AS param WHERE event_date = '2022-08-01' GROUP BY ALL ) SELECT event_date, COUNT(user_id) AS user_cnt FROM Funnel WHERE event_name = 'click_cart' GROUP BY event_date;
-
미해결BigQuery(SQL) 활용편(퍼널 분석, 리텐션 분석)
[빠짝스터디 1주차 과제] ARRAY, STRUCT / PIVOT / 퍼널 쿼리 연습 문제
ARRAY, STRUCT, UNNEST 연습 문제array_exercises 테이블에서 각 영화(title)별로 장르(genres)를 UNNEST해서 보여주세요SELECT title, genre FROM `plucky-catfish-394207.advanced.array_exercises` , unnest(genres) as genre ORDER BY 1array_exercises 테이블에서 각 영화(title)별로 배우(actor)와 배역(character)을 보여주세요. 배우와 배역은 별도의 컬럼으로 나와야 합니다SELECT title, actor.actor, actor.character, FROM `plucky-catfish-394207.advanced.array_exercises` , unnest(genres) as genre ORDER BY 1array_exercises 테이블에서 각 영화(title)별로 배우(actor), 배역(character), 장르 (genre)를 출력하세요. 한 Row에 배우, 배역, 장르가 모두 표시되어야 합니다 SELECT title, actor.actor, genre, FROM `plucky-catfish-394207.advanced.array_exercises` , unnest(genres) as genre , unnest(actors) as actors ORDER BY 1앱 로그 데이터(app_logs)의 배열을 풀어주세요SELECT user_id, event_date, event_name, user_pseudo_id, ep.key, ep.value.string_value, ep.value.int_value FROM `plucky-catfish-394207.advanced.array_exercises` , unnest(event_params) as ep ORDER BY 2 PIVOTorders 테이블에서 유저(user_id)별로 주문 금액(amount)의 합계를 PIVOT해주세요. 날짜(order_date)를 행(Row)으로, user_id를 열(Column)으로 만들어야 합니다SELECT order_date, SUM(IF(user_id = 1, amount, 0) as user_1, SUM(IF(user_id = 2, amount, 0) as user_2, SUM(IF(user_id = 3, amount, 0) as user_3, FROM `plucky-catfish-394207.advanced.array_exercises` , unnest(event_params) as ep GROUP BY 1 ORDER BY 1orders 테이블에서 날짜(order_date)별로 유저들의 주문 금액(amount)의 합계를 PIVOT 해주세요. user_id를 행(Row)으로, order_date를 열(Column)으로 만들어야 합니다 SELECT user_id, SUM(IF(order_date = '2023-05-01', amount, 0) as `2023-05-01`, SUM(IF(order_date = '2023-05-02', amount, 0) as `2023-05-02`, SUM(IF(order_date = '2023-05-03', amount, 0) as `2023-05-03`, SUM(IF(order_date = '2023-05-04', amount, 0) as `2023-05-04`, SUM(IF(order_date = '2023-05-05', amount, 0) as `2023-05-05`, FROM `plucky-catfish-394207.advanced.array_exercises` , unnest(event_params) as ep GROUP BY 1 ORDER BY 1orders 테이블에서 사용자(user_id)별, 날짜(order_date)별로 주문이 있다면 1, 없다면 0으로 PIVOT 해주세요. user_id를 행(Row)으로, order_date를 열(Column)로 만들고 주문을 많이 해도 1로 처리합니다 SELECT user_id, SUM(IF(order_date = '2023-05-01', amount > 0, 1, 0) as `2023-05-01`, SUM(IF(order_date = '2023-05-02', amount > 0, 1, 0) as `2023-05-02`, SUM(IF(order_date = '2023-05-03', amount > 0, 1, 0) as `2023-05-03`, SUM(IF(order_date = '2023-05-04', amount > 0, 1, 0) as `2023-05-04`, SUM(IF(order_date = '2023-05-05', amount > 0, 1, 0) as `2023-05-05`, FROM `plucky-catfish-394207.advanced.array_exercises` , unnest(event_params) as ep GROUP BY 1 ORDER BY 1user_id = 32888이 카트 추가하기(click_cart)를 누를때 어떤 음식(food_id)을 담았나요? SELECT event_date, event_name, event_timestamp, user_id, user_pseudo_id, MAX(IF(ep.key = 'firebase_screen', ep.value.string_value, NULL)) as firebase_screen, MAX(IF(ep.key = 'food_id', ep.value.int_value, NULL)) as food_id, MAX(IF(ep.key = 'session_id', ep.value.int_value, NULL)) as session_id, FROM `plucky-catfish-394207.advanced.array_exercises` , unnest(event_params) as ep WHERE event_date = '2022-08-01' AND user_id = 32888 GROUP BY 1, 2, 3, 4, 5 퍼널 분석WITH base AS ( SELECT event_date event_timestamp, event_name, user_id, user_pseudo_id, platform, MAX(IF(event_param.key = "firebase_screen", event_param.value.string_value, NULL)) AS firebase_screen, MAX(IF(event_param.key = "session_id", event_param.value.string_value, NULL)) AS session_id FROM advanced.app_logs CROSS JOIN UNNEST(event_params) AS event_param WHERE event_date BETWEEN "2022-08-01" AND "2022-08-18" GROUP BY ALL ), filter_event_and_concat_event_and_screen AS ( SELECT * EXCEPT(event_name, firebase_screen, event_timestamp), CONCAT(event_name, "-", firebase_screen) AS event_name_with_screen, DATETIME(TIMESTAMP_MICROS(event_timestamp), 'Asia/Seoul') AS event_datetime FROM base WHERE event_name IN ("screen_view", "click_payment") ), add_step_number AS ( SELECT event_date, event_name_with_screen, CASE WHEN event_name_with_screen = "screen_view-welcome" THEN 1 WHEN event_name_with_screen = "screen_view-home" THEN 2 WHEN event_name_with_screen = "screen_view-food_category" THEN 3 WHEN event_name_with_screen = "screen_view-restaurant" THEN 4 WHEN event_name_with_screen = "screen_view-cart" THEN 5 WHEN event_name_with_screen = "click_payment-cart" THEN 6 ELSE NULL END AS step_number, COUNT(DISTINCT user_pseudo_id) AS cnt FROM filter_event_and_concat_event_and_screen GROUP BY ALL HAVING step_number IS NOT NULL
-
미해결BigQuery(SQL) 활용편(퍼널 분석, 리텐션 분석)
[ 인프런 빅쿼리 빠짝스터디 1주차 ] ARRAY, STRUCT 연습 문제 / PIVOT 연습 문제 / 퍼널 쿼리 연습 문제
ARRAY, STRUCT-- array_exercises 테이블에서 각 영화(title)별로 장르(genres)를 UNNEST해서 보여주기. SELECT title, genre FROM advanced.array_exercises CROSS JOIN UNNEST(genres) as genre-- array_exercises 테이블에서 각 영화(title)별로 배우(actor)와 배역(character)을 보여주기. 단, 배우와 배역은 별도의 컬럼으로 나와야 함. SELECT title, actor.actor, actor.character FROM `advanced.array_exercises` CROSS JOIN UNNEST(actors) as actor-- array_exercises 테이블에서 각 영화(title)별로 배우(actor), 배역(character), 장르(genre) 출력하기 한 행에 배우, 배역, 장르가 모두 표시되어야 된다. SELECT title, actor.actor, actor.character, genre FROM `advanced.array_exercises` CROSS JOIN UNNEST(actors) as actor CROSS JOIN UNNEST(genres) as genre -- 앱 로그 데이터 (app_logs)의 배열 풀기 SELECT user_id, event_date, event_name, user_pseudo_id, event_param.key, event_param.value.string_value, event_param.value.int_value FROM advanced.app_logs CROSS JOIN UNNEST(event_params) as event_param limit 500 데이터 PIVOT-- orders 테이블에서 유저(user_id)별로 주문 금액(amount)의 합계를 PIVOT하기. 날짜(order_date)를 행(Row)으로, user_id를 열(Column)으로 만들어야됨. SELECT order_date, SUM(IF(user_id = 1,amount,0)) AS user_1, SUM(IF(user_id = 2,amount,0)) AS user_2, SUM(IF(user_id = 3,amount,0)) AS user_3 FROM advanced.orders GROUP BY order_date ORDER BY order_date asc;-- orders 테이블에서 날짜(order_date)별로 유저들의 주문 금액(amount)의 합계를 PIVOT하기. user_id를 행으로, order_date를 열로 만들어야 됨. SELECT user_id, sum(if(order_date = '2023-05-01',amount, 0)) AS `2023-05-01`, sum(if(order_date = '2023-05-02',amount, 0)) AS `2023-05-02`, sum(if(order_date = '2023-05-03',amount, 0)) AS `2023-05-03`, sum(if(order_date = '2023-05-04',amount, 0)) AS `2023-05-04`, sum(if(order_date = '2023-05-05',amount, 0)) AS `2023-05-05` FROM advanced.orders GROUP BY user_id ORDER BY user_id ASC-- orders 테이블에서 사용자(user_id)별, 날짜(order_date)별로 주문이 있다면 1, 없다면 0으로 피벗하기. user_id를 행으로, order_date를 열로 만들고 주문이 많아도 1로 처리. SELECT user_id, sum(if(order_date = '2023-05-01',1, 0)) AS `2023-05-01`, sum(if(order_date = '2023-05-02',1, 0)) AS `2023-05-02`, sum(if(order_date = '2023-05-03',1, 0)) AS `2023-05-03`, sum(if(order_date = '2023-05-04',1, 0)) AS `2023-05-04`, sum(if(order_date = '2023-05-05',1, 0)) AS `2023-05-05` FROM advanced.orders GROUP BY user_id ORDER BY user_id ASC-- 앱 로그 데이터 배열 PIVOT 하기 ( user_id = 32888이 카트 추가하기 (click_cart)를 누를때 어떤 음식(food_id)을 담았나?) WITH app_pivot AS ( SELECT user_id, event_date, event_name, user_pseudo_id, event_param.key AS key, event_param.value.string_value AS string_value, event_param.value.int_value AS int_value FROM advanced.app_logs CROSS JOIN UNNEST(event_params) as event_param ) SELECT user_id, event_date, event_name, user_pseudo_id, MAX(IF(key = 'firebase_screen', string_value,NULL)) AS `firebase_screen`, MAX(IF(key = 'food_id',int_value,NULL)) AS `food_id`, MAX(IF(key = 'session_id',string_value,NULL)) AS `session_id` FROM app_pivot WHERE user_id = 32888 AND event_name = 'click_cart' GROUP BY user_id,event_date, event_name, user_pseudo_id ORDER BY event_date ASC;퍼널 분석-- 일자별 이벤트 별 집계형태를 PIVOT 형태로 전환하기 WITH param_pivot AS ( SELECT event_date, event_timestamp, event_name, user_id, user_pseudo_id, MAX(IF(key = 'firebase_screen', event_param.value.string_value,NULL)) AS `firebase_screen`, MAX(IF(key = 'food_id',event_param.value.int_value,NULL)) AS `food_id`, MAX(IF(key = 'session_id',event_param.value.string_value,NULL)) AS `session_id` FROM advanced.app_logs CROSS JOIN UNNEST(event_params) AS event_param WHERE event_date BETWEEN '2022-08-01' AND '2022-08-18' GROUP BY ALL), event_table AS (SELECT * EXCEPT(event_name,firebase_screen), CONCAT(event_name,'_', firebase_screen) AS event_name_with_screen, FROM param_pivot WHERE event_name IN ('screen_view','click_payment') GROUP BY all ORDER BY event_date ASC ) , final AS (SELECT event_date, event_name_with_screen, (CASE WHEN event_name_with_screen = 'screen_view_welcome' THEN 1 WHEN event_name_with_screen = 'screen_view_home' THEN 2 WHEN event_name_with_screen = 'screen_view_food_category' THEN 3 WHEN event_name_with_screen = 'screen_view_restaurant' THEN 4 WHEN event_name_with_screen = 'screen_view_cart' THEN 5 WHEN event_name_with_screen = 'clik_payment_cart' THEN 6 ELSE NULL END ) AS step_number, COUNT( DISTINCT user_pseudo_id) AS cnt FROM event_table GROUP BY event_date, event_name_with_screen HAVING step_number IS NOT NULL ORDER BY event_date) SELECT event_date, SUM(IF(event_name_with_screen = 'screen_view_welcome', cnt, 0)) AS `screen_view_welcom`, SUM(IF(event_name_with_screen = 'screen_view_home', cnt, 0)) AS `screen_view_home`, SUM(IF(event_name_with_screen = 'screen_view_food_category', cnt, 0)) AS `screen_view_food_category`, SUM(IF(event_name_with_screen = 'screen_view_restaurant', cnt, 0)) AS `screen_view_restaurant`, SUM(IF(event_name_with_screen = 'screen_view_cart', cnt, 0)) AS `screen_view_cart`, SUM(IF(event_name_with_screen = 'click_payment_cart', cnt, 0)) AS `click_payment_cart` FROM final GROUP BY event_date ORDER BY event_date ASC;
-
미해결BigQuery(SQL) 활용편(퍼널 분석, 리텐션 분석)
[바짝스터디 1주차 과제] ARRAY, STRUCT / PIVOT / 퍼널 분석 연습문제
1. ARRAY, STRUCT 연습문제(1) array_exercises 테이블에서 각 영화(title)별로 장르(genres)를 UNNEST해서 보여주세요.SELECT title, genre FROM advanced.array_exercises AS ae CROSS JOIN UNNEST(genres) AS genre*UNNEST(ARRAY_Column) = UNNEST(배열)(2) array_exercises 테이블에서 각 영화(title)별로 배우(actor)와 배역(character)을 보여주세요.배우와 배역은 별도의 컬럼으로 나와야 합니다.SELECT title, actor.actor, actor.character FROM advanced.array_exercises AS ae CROSS JOIN UNNEST(actors) AS actor(3) array_exercises 테이블에서 각 영화(title)별로 배우(actor)와 배역(character), 장르(genre)를 출력하세요. 한 Row에 배우, 배역, 장르가 모두 표시되어야 합니다.SELECT title, actor.actor, actor.character, genre FROM advanced.array_exercises AS ae , UNNEST(actors) AS actor , UNNEST(genres) AS genre*연속해서 CROSS JOIN UNNEST 사용 가능(4) 앱 로그 데이터(app_logs)의 배열을 풀어주세요.SELECT user_id, event_date, event_name, user_pseudo_id, event_param.key AS key, event_param.value.string_value AS string_value, event_param.value.int_value AS int_value FROM advanced.app_logs , UNNEST(event_params) AS event_param -- WHERE -- event_date = "2022-08-01"*실습 시, 파티션 사용 필요2. PIVOT 연습문제(1) orders 테이블에서 유저(user_id)별로 주문 금액(amount)의 합계를 PIVOT 해주세요.날짜(order_date)를 행(row)으로, user_id를 열(column)으로 만들어야 합니다.SELECT order_date, MAX(IF(user_id = 1, sum_of_amount, 0)) AS user_1, MAX(IF(user_id = 2, sum_of_amount, 0)) AS user_2, MAX(IF(user_id = 3, sum_of_amount, 0)) AS user_3 FROM ( SELECT order_date, user_id, SUM(amount) AS sum_of_amount FROM advanced.orders GROUP BY order_date, user_id ORDER BY order_date ) GROUP BY order_date ORDER BY order_date*첫번째 풀이 내 집계 함수 사용 시, GROUP BY 잊지않기*ctrl+d 사용(2) orders 테이블에서 날짜(order_date)별로 유저들의 주문 금액(amount)의 합계를 PIVOT 해주세요.user_id를 행(row)으로, order_date를 열(column)으로 만들어야 합니다.SELECT user_id, MAX(IF(order_date = '2023-05-01', amount, 0)) AS `2023-05-01`, MAX(IF(order_date = '2023-05-02', amount, 0)) AS `2023-05-02`, MAX(IF(order_date = '2023-05-03', amount, 0)) AS `2023-05-03`, MAX(IF(order_date = '2023-05-04', amount, 0)) AS `2023-05-04`, MAX(IF(order_date = '2023-05-05', amount, 0)) AS `2023-05-05`, FROM advanced.orders GROUP BY user_id ORDER BY user_id(3) orders 테이블에서 사용자(user_id)별, 날짜(order_date)별로 주문이 있다면 1, 없다면 0으로 해주세요. user_id를 행(row)으로, order_date를 열(column)로 만들고 주문을 많이 해도 1로 처리합니다.SELECT user_id, MAX(IF(order_date = '2023-05-01', 1, 0)) AS `2023-05-01`, MAX(IF(order_date = '2023-05-02', 1, 0)) AS `2023-05-02`, MAX(IF(order_date = '2023-05-03', 1, 0)) AS `2023-05-03`, MAX(IF(order_date = '2023-05-04', 1, 0)) AS `2023-05-04`, MAX(IF(order_date = '2023-05-05', 1, 0)) AS `2023-05-05`, FROM advanced.orders GROUP BY user_id ORDER BY user_id(4) user_id = 32888이 카트 추가하기(click_cart)를 누를 때 어떤 음식(food_id)을 담았나요?WITH base AS ( SELECT event_date, event_timestamp, event_name, user_id, user_pseudo_id, MAX(IF(event_param.key = 'firebase screen', event_param.value.string_value, NULL)) AS firebase_screen, MAX(IF(event_param.key = 'food_id', event_param.value.int_value, null)) AS food_id, MAX(IF(event_param.key = 'session_id', event_param.value.string_value, null)) AS session_id FROM advanced.app_logs , UNNEST(event_params) AS event_param GROUP BY ALL ) SELECT food_id FROM base WHERE user_id = 32888 AND event_name = 'click_cart' GROUP BY food_id3. 퍼널 분석 연습문제(1) 각 퍼널의 유저 수를 집계 데이터 : 2022-08-01 ~ 2022-08-18WITH base AS( SELECT event_date, event_timestamp, event_name, user_id, user_pseudo_id, platform, MAX(IF(event_param.key = "firebase_screen", event_param.value.string_value, NULL)) AS firebase_screen, MAX(IF(event_param.key = "food_id", event_param.value.int_value, NULL)) AS food_id, MAX(IF(event_param.key = "session_id", event_param.value.int_value, NULL)) AS session_id FROM `advanced.app_logs` CROSS JOIN UNNEST(event_params) AS event_param GROUP BY ALL ), filter_event_and_concat_event_and_screen AS( SELECT * EXCEPT(event_name, firebase_screen), CONCAT(event_name, "-", firebase_screen) AS event_name_with_screen, DATETIME(TIMESTAMP_MICROS(event_timestamp), "Asia/Seoul") AS event_datetime FROM base WHERE event_date BETWEEN "2022-08-01" AND "2022-08-18" AND event_name IN ("screen_view", "click_payment") AND firebase_screen IN ("welcome", "home", "food_category", "restaurant", "cart") ) SELECT event_date, event_name_with_screen, CASE WHEN event_name_with_screen = 'screen_view-welcome' THEN 1 WHEN event_name_with_screen = 'screen_view-home' THEN 2 WHEN event_name_with_screen = 'screen_view-food_category' THEN 3 WHEN event_name_with_screen = 'screen_view-restaurant' THEN 4 WHEN event_name_with_screen = 'screen_view-cart' THEN 5 WHEN event_name_with_screen = 'click_payment-cart' THEN 6 ELSE NULL END AS step_number, COUNT(DISTINCT user_pseudo_id) AS cnt FROM filter_event_and_concat_event_and_screen GROUP BY ALL HAVING step_number IS NOT NULL ORDER BY event_date, step_number
-
미해결BigQuery(SQL) 활용편(퍼널 분석, 리텐션 분석)
[빠짝스터디 1주차 과제] ARRAY, STRUCT 연습문제 / PIVOT 연습문제 / 퍼널별 전환율을 쉽게 구할 수 있도록 PIVOT해보기
연습문제CREATE OR REPLACE TABLE advanced.array_excercised AS #DDL SELECT movie_id, title, actors, genres FROM ( SELECT 1 AS movie_id, 'Avengers: Endgame' AS title, ARRAY<STRUCT<actor STRING, character STRING>>[ STRUCT('Robert Downey Jr.', 'Tony Stark'), STRUCT('Chris Evans', 'Steve Rogers') ] AS actors, ARRAY<STRING>['Action', 'Adventure', 'Drama'] AS genres UNION ALL SELECT 2, 'Inception', ARRAY<STRUCT<actor STRING,character STRING>>[ STRUCT('leonardo DiCaprio', 'Cobb'), STRUCT('Joseph Gordon-Levitt', 'Arthur') ], ARRAY<STRING>['Action', 'Adventure', 'Sci-Fi'] UNION ALL SELECT 3, 'The Dark Knight', ARRAY<STRUCT<actor STRING, character STRING>>[ STRUCT('Christian Bale', 'Bruce Wayne'), STRUCT('Heath Ledger', 'Joker') ], ARRAY<STRING>['Action', 'Crime', 'Drama'] ) → actors라는 STRUCT 구조체를 만들고 그 안에 2개의 필드 actor와 character를 지정하고, STRUCT 구조체 2개를 list처럼 ARRAY에 넣은 것.위의 테이블을 가지고 연습문제 1~4번 진행array_excercises 테이블에서 각 영화(title)별로 장르(genres)를 UNNEST해서 보여주세요.SELECT title, genre FROM advanced.array_excercised CROSS JOIN UNNEST(genres) AS genre array_excercised 테이블에서 각 영화(title)별로 배우(actor)와 배역(character)을 보여주세요. 배우와 배역은 별도의 컬럼으로 나와야 합니다.SELECT title, actor.actor, actor.character FROM advanced.array_excercised CROSS JOIN UNNEST(actors) AS actor actors ARRAY안에 2개의 STRUCT 구조체가 있는 구조이므로 ARRAY 데이터에 접근하는 방법으로 데이터에는 접근 가능actors[SAFE_OFFSET(0)].actor AS first_actorarray_excercises 테이블에서 각 영화(title)별로 배우(actor), 배역(character), 장르(genre)를 출력하세요. 한 Row에 배우, 배역, 장르가 모두 표시되어야 합니다.SELECT title, actor.actor, actor.character, genre FROM advanced.array_excercised CROSS JOIN UNNEST(actors) AS actor CROSS JOIN UNNEST(genres) AS genre *UNNEST를 2번 연속 사용할 수 있다.*쿼리문의 실행순서는 FROM → JOIN → SELECT 이다. UNNEST를 통해 만들어진 actor는 현재 actor.actor가 아니라 actor라는 STRUCT 구조체이므로 구조체에 바로 접근할 수 없다는 에러가 뜰 수 있다.앱 로그 데이터(app_logs)의 배열을 풀어주세요. SELECT event_date, event_timestamp, event_name, event_param.key AS key, event_param.value AS value, user_id, user_pseudo_id, platform FROM advanced.app_logs CROSS JOIN UNNEST(event_params) AS event_paramevent_params는 RECORD라고 되어있는데 STRUCT이다. 중첩된 구조라는 의미.PIVOT 연습문제orders 테이블에서 유저(user_id)별로 주문 금액(amount)의 합계를 PIVOT해주세요. 날짜(order_date)를 행(Row)으로, user_id를 열(column)으로 만들어야 합니다.SELECT order_date, SUM(IF(user_id = 1, amount, 0)) AS user_1, SUM(IF(user_id = 2, amount, 0)) AS user_2, SUM(IF(user_id = 3, amount, 0)) AS user_3 FROM advanced.orders GROUP BY order_date ORDER BY order_date 2. orders 테이블에서 날짜(order_date)별로 유저들의 주문 금액(amount)의 합계를 PIVOT해주세요. user_id를 행(Row)으로, order_date를 열(Column)으로 만들어야 합니다.SELECT user_id, SUM(IF(order_date = "2023-05-01", amount, 0)) AS `2023-05-01`, SUM(IF(order_date = "2023-05-02", amount, 0)) AS `2023-05-02`, SUM(IF(order_date = "2023-05-03", amount, 0)) AS `2023-05-03`, SUM(IF(order_date = "2023-05-04", amount, 0)) AS `2023-05-04`, SUM(IF(order_date = "2023-05-05", amount, 0)) AS `2023-05-05` FROM advanced.orders GROUP BY user_id orders 테이블에서 사용자(user_id)별, 날짜(order_date)별로 주문이 있다면 1, 없다면 0으로 PIVOT해주세요. user_id를 행(Row)으로, order_date를 열(Column)로 만들고 주문을 많이 해도 1로 처리합니다.SELECT user_id, MAX(IF(order_date = "2023-05-01", 1, 0)) AS `2023-05-01`, MAX(IF(order_date = "2023-05-02", 1, 0)) AS `2023-05-02`, MAX(IF(order_date = "2023-05-03", 1, 0)) AS `2023-05-03`, MAX(IF(order_date = "2023-05-04", 1, 0)) AS `2023-05-04`, MAX(IF(order_date = "2023-05-05", 1, 0)) AS `2023-05-05` FROM advanced.orders GROUP BY user_id 4. user_id = 32888이 카트 추가하기(click_cart)를 누를 때 어떤 음식(food_id)을 담았나요?WITH base AS( SELECT #* EXCEPT(event_params), event_date, event_timestamp, event_name, event_param.key AS key, event_param.value.string_value AS string_value, event_param.value.int_value AS int_value, user_id, user_pseudo_id, platform FROM advanced.app_logs CROSS JOIN UNNEST(event_params) AS event_param ) SELECT user_id, event_date, event_name, MAX(IF(key = 'firebase_screen', string_value, NULL)) AS firebase_screen, MAX(IF(key = 'food_id', int_value, NULL)) AS food_id, MAX(IF(key = 'session_id', string_value, NULL)) AS session_id FROM base WHERE user_id = 32888 AND event_name = 'click_cart' GROUP BY user_id, event_date, event_name퍼널별 전환율을 쉽게 구할 수 있도록 PIVOT 해보기 WITH base AS ( SELECT event_date, event_timestamp, event_name, user_id, user_pseudo_id, platform, #event_param, MAX(IF(event_param.key = 'firebase_screen', event_param.value.string_value, NULL)) AS firebase_screen, #MAX(IF(event_param.key = 'food_id', event_param.value.int_value, NULL)) AS food_id, MAX(IF(event_param.key = 'session_id', event_param.value.string_value, NULL)) AS session_id FROM advanced.app_logs CROSS JOIN UNNEST(event_params) AS event_param WHERE #event_date = "2022-08-01" #적은 데이터로 쿼리를 작성하기 위해 만들어둔 조건 event_date BETWEEN "2022-08-01" AND "2022-08-18" GROUP BY ALL ), filter_event_and_concat_event_and_screen AS ( SELECT * EXCEPT(event_name, firebase_screen, event_timestamp), CONCAT(event_name, "-", firebase_screen) AS event_name_with_screen, DATETIME(TIMESTAMP_MICROS(event_timestamp), 'Asia/Seoul') AS event_datetime FROM base WHERE event_name IN ("screen_view", "click_payment") ), funnel_analysis AS ( SELECT event_date, #일자별로 퍼널별 유저수 쿼리 event_name_with_screen, CASE WHEN event_name_with_screen = "screen_view-welcome" THEN 1 WHEN event_name_with_screen = "screen_view-home" THEN 2 WHEN event_name_with_screen = "screen_view-food_category" THEN 3 WHEN event_name_with_screen = "screen_view-restaurant" THEN 4 WHEN event_name_with_screen = "screen_view-cart" THEN 5 WHEN event_name_with_screen = "click_payment-cart" THEN 6 ELSE NULL END AS step_number, COUNT(DISTINCT user_pseudo_id) AS cnt FROM filter_event_and_concat_event_and_screen GROUP BY ALL HAVING step_number IS NOT NULL ORDER BY event_date ) SELECT event_date, MAX(IF (event_name_with_screen = "screen_view-welcome", cnt, NULL)) AS `screen_view-welcome`, MAX(IF (event_name_with_screen = "screen_view-home", cnt, NULL)) AS `screen_view-home`, MAX(IF (event_name_with_screen = "screen_view-food_category", cnt, NULL)) AS `screen_view-food_category`, MAX(IF (event_name_with_screen = "screen_view-restaurant", cnt, NULL)) AS `screen_view-restaurant`, MAX(IF (event_name_with_screen = "screen_view-cart", cnt, NULL)) AS `screen_view-cart`, MAX(IF (event_name_with_screen = "click_payment-cart", cnt, NULL)) AS `click_payment-cart` FROM funnel_analysis GROUP BY ALL ORDER BY event_date
-
미해결BigQuery(SQL) 활용편(퍼널 분석, 리텐션 분석)
[빠짝스터디 1주차 과제] Array, Struct, Pivot, Funnel
1. ARRAY, STRUCT1) array_exercises 테이블에서 각 영화(title)별로 장르(genres)를 UNNEST해서 보여주세요.SELECT title , genre FROM advanced.array_exercises CROSS JOIN UNNEST(genres) AS genre ;2) array_exercises 테이블에서 각 영화(title)별로 배우(actor)와 배역(character)을 보여주세요. (배우와 배역은 별도의 컬럼으로 나와야 합니다)-- 동일한 단어에 대해 선택할 수 있는 함수 : cmd+d SELECT title , actor.actor , actor.character FROM advanced.array_exercises CROSS JOIN UNNEST(actors) AS actor ;3) array_exercises 테이블에서 각 영화(title)별로 배우(actor), 배역(character), 장르(genre)를 출력하세요. 한 Row에 배우, 배역, 장르가 모두 표시되어야 합니다.SELECT title , actor.actor , actor.character , genre FROM advanced.array_exercises CROSS JOIN UNNEST(actors) AS actor CROSS JOIN UNNEST(genres) AS genre ;4) 앱 로그 데이터(app_logs)의 배열을 풀어주세요.SELECT user_id , event_date , event_name , user_pseudo_id , event_param.key , event_param.value.string_value , event_param.value.int_value , platform FROM advanced.app_logs CROSS JOIN UNNEST(event_params) AS event_param ;2. PIVOT1) orders 테이블에서 유저(user_id)별로 주문 금액(amount)의 합계를 PIVOT해주세요. 날짜(order_date)를 행(Row)으로, user_id를 열(Column)으로 만들어야 합니다.SELECT order_date , SUM(IF(user_id = 1, amount, NULL)) AS `user_id_1` , SUM(IF(user_id = 2, amount, NULL)) AS `user_id_2` , SUM(IF(user_id = 3, amount, NULL)) AS `user_id_3` FROM advanced.orders GROUP BY 1 ORDER BY 1 ;2) orders 테이블에서 날짜(order_date)별로 유저들의 주문 금액(amount)의 합계를 PIVOT 해주세요. user_id를 행(Row)으로, order_date를 열(Column)으로 만들어야 합니다.SELECT user_id , SUM(IF(order_date = '2023-05-01', amount, 0)) AS `2023-05-01` , SUM(IF(order_date = '2023-05-02', amount, 0)) AS `2023-05-02` , SUM(IF(order_date = '2023-05-03', amount, 0)) AS `2023-05-03` , SUM(IF(order_date = '2023-05-04', amount, 0)) AS `2023-05-04` , SUM(IF(order_date = '2023-05-05', amount, 0)) AS `2023-05-05` FROM advanced.orders GROUP BY 1 ;3) orders 테이블에서 사용자(user_id)별, 날짜(order_date)별로 주문이 있다면 1, 없다면 0으로 PIVOT 해주세요. user_id를 행(Row)으로, order_date를 열(Column)로 만들고 주문을 많이 해도 1로 처리합니다SELECT user_id , MAX(IF(order_date = '2023-05-01', 1, 0)) AS `2023-05-01` , MAX(IF(order_date = '2023-05-02', 1, 0)) AS `2023-05-02` , MAX(IF(order_date = '2023-05-03', 1, 0)) AS `2023-05-03` , MAX(IF(order_date = '2023-05-04', 1, 0)) AS `2023-05-04` , MAX(IF(order_date = '2023-05-05', 1, 0)) AS `2023-05-05` FROM advanced.orders GROUP BY 1 ;4) user_id = 32888이 카트 추가하기(click_cart)를 누를 때 어떤 음식(food_id)을 담았나요?WITH app_logs_info AS ( SELECT user_id , event_name , MAX(IF(event_param.key = 'firebase_screen', event_param.value.string_value, NULL)) AS firebase_screen , MAX(IF(event_param.key = 'food_id', event_param.value.int_value, NULL)) AS food_id FROM advanced.app_logs CROSS JOIN UNNEST(event_params) AS event_param GROUP BY 1, 2 ) SELECT food_id FROM app_logs_info WHERE user_id = 32888 AND event_name = 'click_cart' ; 3. Funnel1) 일자별, 이벤트별 집계WITH app_logs_info AS ( SELECT user_id , event_date , event_timestamp , event_name , user_pseudo_id , event_param.key , MAX(IF(event_param.key = 'firebase_screen', event_param.value.string_value, NULL)) AS firebase_screen , MAX(IF(event_param.key = 'food_id', event_param.value.int_value, NULL)) AS food_id FROM advanced.app_logs CROSS JOIN UNNEST(event_params) AS event_param WHERE event_date BETWEEN '2022-08-01' AND '2022-08-18' AND event_name IN ('screen_view', 'click_payment') GROUP BY ALL ) , add_step_number AS ( SELECT event_date , DATETIME(TIMESTAMP_MICROS(event_timestamp), 'Asia/Seoul') AS event_date_time , user_id , user_pseudo_id , CONCAT(event_name, '-', firebase_screen) AS event_name_with_screen , CASE CONCAT(event_name, '-', firebase_screen) WHEN 'screen_view-welcome' THEN 1 WHEN 'screen_view-home' THEN 2 WHEN 'screen_view-food_category' THEN 3 WHEN 'screen_view-restaurant' THEN 4 WHEN 'screen_view-cart' THEN 5 WHEN 'click_payment-cart' THEN 6 ELSE NULL END AS step_number FROM app_logs_info ) SELECT event_date , step_number , event_name_with_screen , COUNT(DISTINCT user_pseudo_id) AS user_cnt FROM add_step_number WHERE step_number IS NOT NULL GROUP BY 1, 2, 3 ORDER BY 1, 2 ; 2) 집계 데이터 PIVOTWITH app_logs_info AS ( SELECT user_id , event_date , event_timestamp , event_name , user_pseudo_id , event_param.key , MAX(IF(event_param.key = 'firebase_screen', event_param.value.string_value, NULL)) AS firebase_screen , MAX(IF(event_param.key = 'food_id', event_param.value.int_value, NULL)) AS food_id FROM advanced.app_logs CROSS JOIN UNNEST(event_params) AS event_param WHERE event_date BETWEEN '2022-08-01' AND '2022-08-18' GROUP BY ALL ) , add_step_number AS ( SELECT event_date , DATETIME(TIMESTAMP_MICROS(event_timestamp), 'Asia/Seoul') AS event_date_time , user_id , user_pseudo_id , CONCAT(event_name, '-', firebase_screen) AS event_name_with_screen , CASE CONCAT(event_name, '-', firebase_screen) WHEN 'screen_view-welcome' THEN 1 WHEN 'screen_view-home' THEN 2 WHEN 'screen_view-food_category' THEN 3 WHEN 'screen_view-restaurant' THEN 4 WHEN 'screen_view-cart' THEN 5 WHEN 'click_payment-cart' THEN 6 ELSE NULL END AS step_number FROM app_logs_info WHERE event_name IN ('screen_view', 'click_payment') ) , agg_user_cnt AS ( SELECT event_date , step_number , event_name_with_screen , COUNT(DISTINCT user_pseudo_id) AS user_cnt FROM add_step_number WHERE step_number IS NOT NULL GROUP BY 1, 2, 3 ORDER BY 1, 2 ) SELECT event_date , MAX(IF(step_number = 1, user_cnt, NULL)) AS `screen_view-welcome` , MAX(IF(step_number = 2, user_cnt, NULL)) AS `screen_view-home` , MAX(IF(step_number = 3, user_cnt, NULL)) AS `screen_view-food_category` , MAX(IF(step_number = 4, user_cnt, NULL)) AS `screen_view-restaurant` , MAX(IF(step_number = 5, user_cnt, NULL)) AS `screen_view-cart` , MAX(IF(step_number = 6, user_cnt, NULL)) AS `click_payment-cart` FROM agg_user_cnt GROUP BY 1 ORDER BY 1 ;
-
미해결BigQuery(SQL) 활용편(퍼널 분석, 리텐션 분석)
[바짝스터디 1주차 과제] ARRAY, STRUCT 연습문제 / PIVOT 연습문제 / 퍼널 쿼리 연습문제
1. ARRAY, STRUCT 연습문제연습문제 1SELECT title , genre FROM `advanced.array_exercises` CROSS JOIN UNNEST(genres) AS genre 연습문제 2SELECT title , actor.actor , actor.character FROM `advanced.array_exercises` CROSS JOIN UNNEST(actors) AS actor 연습문제 3SELECT title , actor.actor AS actor , actor.character AS character , genre FROM `advanced.array_exercises` CROSS JOIN UNNEST(actors) AS actor CROSS JOIN UNNEST(genres) AS genre[메모]여기서 CROSS JOIN 다음 라인에 WHERE actor = 'Chris Evans' 이렇게 조건을 걸어줄 경우 오류가 발생한다. 오류는 실행 순서와 관련이 있다. 실행 순서: FROM -> JOIN -> SELECT따라서 SELECT 에서 알리아스로 이름 붙인 actor 가 아닌, CROSS JOIN 결과를 사용하여 조건을 만들어야 한다. => WHERE actor.actor = 'Chris Evans' 연습문제 4SELECT user_id , event_date , event_name , user_pseudo_id , parameter.key AS key , parameter.value.string_value AS string_value , parameter.value.int_value AS int_value FROM `inflearn-bigquery-437203.advanced.app_logs` CROSS JOIN UNNEST (event_params) AS parameter WHERE event_date = "2022-08-01" 2. PIVOT 연습문제연습문제 1SELECT order_date , SUM(IF(user_id = 1, amount, 0)) AS user_1 , SUM(IF(user_id = 2, amount, 0)) AS user_2 , SUM(IF(user_id = 3, amount, 0)) AS user_3 FROM `inflearn-bigquery-437203.advanced.orders` GROUP BY order_date ORDER BY order_date 연습문제 2SELECT user_id , SUM(IF(order_date= '2023-05-01', amount, 0)) AS `2023-05-01` , SUM(IF(order_date= '2023-05-02', amount, 0)) AS `2023-05-02` , SUM(IF(order_date= '2023-05-03', amount, 0)) AS `2023-05-03` , SUM(IF(order_date= '2023-05-04', amount, 0)) AS `2023-05-04` , SUM(IF(order_date= '2023-05-05', amount, 0)) AS `2023-05-05` FROM `inflearn-bigquery-437203.advanced.orders` GROUP BY user_id ORDER BY user_id[메모]알리아스로 컬럼명 지정할 때, 영어 제외하고 다른 문자열가 포함될 경우 → backtick (`) 으로 감싸준다. 연습문제 3SELECT user_id , MAX(IF(order_date= '2023-05-01', 1, 0)) AS `2023-05-01` , MAX(IF(order_date= '2023-05-02', 1, 0)) AS `2023-05-02` , MAX(IF(order_date= '2023-05-03', 1, 0)) AS `2023-05-03` , MAX(IF(order_date= '2023-05-04', 1, 0)) AS `2023-05-04` , MAX(IF(order_date= '2023-05-05', 1, 0)) AS `2023-05-05` FROM `inflearn-bigquery-437203.advanced.orders` GROUP BY user_id ORDER BY user_id 연습문제 4SELECT event_date , event_timestamp , event_name , user_id , user_pseudo_id , MAX(IF(params.key = 'firebase_screen', params.value.string_value, NULL)) AS firebase_screen , MAX(IF(params.key = 'food_id', params.value.int_value, NULL)) AS food_id , MAX(IF(params.key = 'session_id', params.value.string_value, NULL)) AS session_id FROM `inflearn-bigquery-437203.advanced.app_logs` CROSS JOIN UNNEST(event_params) AS params WHERE event_date = '2022-08-01' GROUP BY ALL 3. 퍼널 쿼리 연습문제연습문제 1: 각 퍼널별 유저 수 집계-- 퍼널 단계: 6 -- screen_view(welcome) -- screen_view(home) -- screen_view(food_category) -- screen_view(restaurant) -- screen_view(cart) -- click_payment(cart) WITH funnel AS ( SELECT CONCAT(event_name, '-', param.value.string_value) AS event_name_with_screen , CASE WHEN event_name = 'screen_view' AND param.value.string_value = 'welcome' THEN 1 WHEN event_name = 'screen_view' AND param.value.string_value = 'home' THEN 2 WHEN event_name = 'screen_view' AND param.value.string_value = 'food_category' THEN 3 WHEN event_name = 'screen_view' AND param.value.string_value = 'restaurant' THEN 4 WHEN event_name = 'screen_view' AND param.value.string_value = 'cart' THEN 5 WHEN event_name = 'click_payment' AND param.value.string_value = 'cart' THEN 6 END AS step_number , user_pseudo_id FROM `inflearn-bigquery-437203.advanced.app_logs` CROSS JOIN UNNEST(event_params) AS param WHERE event_date BETWEEN '2022-08-01' AND '2022-08-18' AND event_name IN ('screen_view', 'click_payment') AND param.key = 'firebase_screen' AND param.value.string_value IN ('welcome', 'home', 'food_category', 'restaurant', 'cart') ) SELECT event_name_with_screen , MAX(step_number) AS step_number , COUNT(DISTINCT user_pseudo_id) AS cnt FROM funnel GROUP BY event_name_with_screen 연습문제 2: 일자별 각 퍼널의 유저 수 집계-- 퍼널 단계: 6 -- screen_view(welcome) -- screen_view(home) -- screen_view(food_category) -- screen_view(restaurant) -- screen_view(cart) -- click_payment(cart) WITH funnel AS ( SELECT event_date -- 날짜 컬럼 추가 , CONCAT(event_name, '-', param.value.string_value) AS event_name_with_screen , CASE WHEN event_name = 'screen_view' AND param.value.string_value = 'welcome' THEN 1 WHEN event_name = 'screen_view' AND param.value.string_value = 'home' THEN 2 WHEN event_name = 'screen_view' AND param.value.string_value = 'food_category' THEN 3 WHEN event_name = 'screen_view' AND param.value.string_value = 'restaurant' THEN 4 WHEN event_name = 'screen_view' AND param.value.string_value = 'cart' THEN 5 WHEN event_name = 'click_payment' AND param.value.string_value = 'cart' THEN 6 END AS step_number , user_pseudo_id FROM `inflearn-bigquery-437203.advanced.app_logs` CROSS JOIN UNNEST(event_params) AS param WHERE event_date BETWEEN '2022-08-01' AND '2022-08-18' AND event_name IN ('screen_view', 'click_payment') AND param.key = 'firebase_screen' AND param.value.string_value IN ('welcome', 'home', 'food_category', 'restaurant', 'cart') ) SELECT event_date -- 날짜 컬럼 추가 , event_name_with_screen , MAX(step_number) AS step_number , COUNT(DISTINCT user_pseudo_id) AS cnt FROM funnel GROUP BY ALL ORDER BY event_date, step_number 연습문제 3: 일자별 각 퍼널의 유저 수 집계한 결과 → PIVOT 하기-- 퍼널 단계: 6 -- screen_view(welcome) -- screen_view(home) -- screen_view(food_category) -- screen_view(restaurant) -- screen_view(cart) -- click_payment(cart) WITH funnel AS ( SELECT event_date , CONCAT(event_name, '-', param.value.string_value) AS event_name_with_screen , CASE WHEN event_name = 'screen_view' AND param.value.string_value = 'welcome' THEN 1 WHEN event_name = 'screen_view' AND param.value.string_value = 'home' THEN 2 WHEN event_name = 'screen_view' AND param.value.string_value = 'food_category' THEN 3 WHEN event_name = 'screen_view' AND param.value.string_value = 'restaurant' THEN 4 WHEN event_name = 'screen_view' AND param.value.string_value = 'cart' THEN 5 WHEN event_name = 'click_payment' AND param.value.string_value = 'cart' THEN 6 END AS step_number , user_pseudo_id FROM `inflearn-bigquery-437203.advanced.app_logs` CROSS JOIN UNNEST(event_params) AS param WHERE event_date BETWEEN '2022-08-01' AND '2022-08-18' AND event_name IN ('screen_view', 'click_payment') AND param.key = 'firebase_screen' AND param.value.string_value IN ('welcome', 'home', 'food_category', 'restaurant', 'cart') ), -- 일자별 각 퍼널의 유저수 집계 funnel_daily AS( SELECT event_date , event_name_with_screen , MAX(step_number) AS step_number , COUNT(DISTINCT user_pseudo_id) AS cnt FROM funnel GROUP BY ALL ORDER BY event_date, step_number ) -- 일자별 각 퍼널의 유저수 집계 -> 피벗하기 SELECT event_date , SUM(IF(event_name_with_screen = 'screen_view-welcome', cnt, 0)) AS `screen_view-welcome` , SUM(IF(event_name_with_screen = 'screen_view-home', cnt, 0)) AS `scree_view-home` , SUM(IF(event_name_with_screen = 'screen_view-food_category', cnt, 0)) AS `screen_view-food_category` , SUM(IF(event_name_with_screen = 'screen_view-restaurant', cnt, 0)) AS `screen_view-restaurant` , SUM(IF(event_name_with_screen = 'screen_view-cart', cnt, 0)) AS `screen_view-cart` , SUM(IF(event_name_with_screen = 'click_payment-cart', cnt, 0)) AS `click_payment-cart` FROM funnel_daily GROUP BY event_date ORDER BY event_date
-
미해결BigQuery(SQL) 활용편(퍼널 분석, 리텐션 분석)
[빠짝스터디 1주차 과제] ARRAY, STRUCT 연습 문제/ PIVOT 연습문제/ 퍼널 분석 연습 문제
1-4. Array, Struct 연습문제 (1~4번)연습문제1문제array_exercises 테이블에서 각 영화(title)별로 장르(genres)를 UNNEST해서 보여주세요 쿼리select title, genres2 from advanced.array_exercises cross join unnest(genres) as genres2 ;결과 연습문제2문제array_exercises 테이블에서 각 영화(title)별로 배우(actor)와 배역(character)을 보여주세요. 배우와 배역은 별도의 컬럼으로 나와야 합니다쿼리select title, actor.actor, actor.character from advanced.array_exercises cross join unnest(actors) as actor order by movie_id ;결과 연습문제3문제array_exercises 테이블에서 각 영화(title)별로 배우(actor), 배역(character), 장르 (genre)를 출력하세요. 한 Row에 배우, 배역, 장르가 모두 표시되어야 합니다쿼리select title, actor.actor, actor.character genre2 # array<string> from advanced.array_exercises cross join unnest (actors) as actor cross join unnest (genres) as genre order by movie_id ;결과 연습문제4문제앱 로그 데이터(app_logs)의 배열을 풀어주세요쿼리select user_id, event_date, event_name, user_pseudo_id, event_param.key as key , event_param.value.string_value as string_value , event_param.value.int_value as int_value from advanced.app_logs cross join unnest (event_params) as event_param where event_date = "2022-08-01" limit 10 ;결과 1-9. 퍼널 SQL 쿼리 작성하기연습문제1문제orders 테이블에서 유저(user_id)별로 주문 금액(amount)의 합계를 PIVOT해주세요. 날짜(order_date)를 행(Row)으로, user_id를 열(Column)으로 만들어야 합니다쿼리select order_date, sum(if(user_id = 1, amount, 0)) as user_1, sum(if(user_id = 2, amount, 0)) as user_2, sum(if(user_id = 3, amount, 0)) as user_3 from advanced.orders group by order_date order by 1 ;결과연습문제2문제orders 테이블에서 날짜(order_date)별로 유저들의 주문 금액(amount)의 합계를 PIVOT 해주세요. user_id를 행(Row)으로, order_date를 열(Column)으로 만들어야 합니다쿼리select user_id, # 컬럼의 이름을 지정할때, 영어를 제외하고 backtick(`)사용 sum(if(order_date = "2023-05-01", amount, 0)) as `2023-05-01`, sum(if(order_date = "2023-05-02", amount, 0)) as `2023-05-02`, sum(if(order_date = "2023-05-03", amount, 0)) as `2023-05-03`, sum(if(order_date = "2023-05-04", amount, 0)) as `2023-05-04`, sum(if(order_date = "2023-05-05", amount, 0)) as `2023-05-05` from advanced.orders group by user_id order by 1 ;결과연습문제3문제orders 테이블에서 사용자(user_id)별, 날짜(order_date)별로 주문이 있다면 1, 없다면 0으로 PIVOT 해주세요. user_id를 행(Row)으로, order_date를 열(Column)로 만들고 주문을 많이 해도 1로 처리합니다쿼리select user_id, max(if(order_date = "2023-05-01", 1, 0)) as `2023-05-01`, max(if(order_date = "2023-05-02", 1, 0)) as `2023-05-02`, max(if(order_date = "2023-05-03", 1, 0)) as `2023-05-03`, max(if(order_date = "2023-05-04", 1, 0)) as `2023-05-04`, max(if(order_date = "2023-05-05", 1, 0)) as `2023-05-05` from advanced.orders group by user_id order by 1 ;결과연습문제4문제앱 로그 데이터 배열 PIVOT하기 - user_id = 32888이 카트 추가하기(click_cart)를 누를때 어떤 음식(food_id)을 담았나요?쿼리select event_date, event_timestamp, event_name, user_id, user_pseudo_id, max(if(param.key = 'firebase_screen', param.value.string_value, null)) as firebase_screen, -- max(if(param.key = 'food_id', param.value.string_value, null)) as food_id, max(if(param.key = 'food_id', param.value.int_value, null)) as food_id, max(if(param.key = 'session_id', param.value.string_value, null)) as session_id from advanced.app_logs cross join unnest(event_params) as param where event_date = "2022-08-01" and user_id = 32888 and event_name = "click_cart" group by all limit 100 ; 결과퍼널 분석문제step_number별 count, 일자별 퍼널별 유저 수 쿼리쿼리with base as (select event_date, event_timestamp, event_name, user_id, user_pseudo_id, max(if(event_param.key = 'firebase_screen', event_param.value.string_value, null)) as firebase_screen, -- max(if(event_param.key = 'food_id', event_param.value.int_value, null)) as food_id, max(if(event_param.key = 'session_id', event_param.value.string_value, null)) as session_id from advanced.app_logs cross join unnest(event_params) as event_param where 1=1 and event_date between "2022-08-01" and "2022-08-18" group by all ), filter_event_and_concat_event_and_acreen AS ( -- event_name + screen select * except(event_name, firebase_screen, event_timestamp), concat(event_name, "-", firebase_screen) as event_name_with_screen, datetime(timestamp_micros(event_timestamp), "Asia/Seoul") as event_datetime from base where 1=1 and event_name in ("screen_view", "click_payment")) # 일자별로 퍼널별 유저 수 select -- distinct(event_name_with_screen) event_date, event_name_with_screen, case when event_name_with_screen = 'screen_view-welcome' then 1 when event_name_with_screen = 'screen_view-home' then 2 when event_name_with_screen = 'screen_view-food_category' then 3 when event_name_with_screen = 'screen_view-restaurant' then 4 when event_name_with_screen = 'screen_view-cart' then 5 when event_name_with_screen = 'click_payment-cart' then 6 else null end as step_number, count(distinct user_pseudo_id) as cnt from filter_event_and_concat_event_and_acreen group by all having step_number is not null order by event_date ;결과
-
미해결BigQuery(SQL) 활용편(퍼널 분석, 리텐션 분석)
[바짝스터디 1주차 과제] ARRAY, STRUCT 연습 문제 / PIVOT 연습 문제 / 퍼널 쿼리 연습 문제
1. ARRAY, STRUCT 연습문제 연습 문제 1번# 1)array_exercises테이블에서 각 영화(title)별로 장르(genres)를 UNNEST해서 보여주세요 SELECT title, genre FROM `advanced.array_exercises` AS exercise CROSS JOIN UNNEST(genres) AS genre SELECT title, # 기존에 array_exercises에 저장되어 있던 컬럼 genre FROM `advanced.array_exercises` AS ae, UNNEST(genres) AS genre # ARRAY : 같은 타입의 여러 데이터를 저장하고 싶을 때 # ARRAY를 Flatten(평면화) => UNNEST # UNNEST를 할 때는 CROSS JOIN + UNNEST(ARRAY_COLUMN) # UNNEST(ARRAY_COLUMN) AS 새로운 이름 # SELECT 절에서 새로운 이름으로 시작한다. 기존의 ARRAY_COLUMN은 사용하지 않는다! 연습 문제 2번# 2) array_exercises 테이블에서 각 영화(title)별로 배우(actor)와 배역(character)을 보여주세요. 배우와 배역은 별도의 컬럼으로 나와야 합니다 SELECT title, actor.actor, actor.character -- FROM `advanced.array_exercises` AS ae, UNNEST(actors) AS actor FROM `advanced.array_exercises` AS ae CROSS JOIN UNNEST(actors) AS actor # actors = [STRUCT(STRING, STRING)] 연습 문제 3번# 3) array_exercises 테이블에서 각 영화(title)별로 배우(actor), 배역(character), 장르(genre)를 출력하세요. 한 Row에 배우, 배역, 장르가 모두 표시 되어야 합니다 SELECT title, actor.actor, actor.character, genre FROM `advanced.array_exercises` AS ae ,UNNEST(actors) AS actor, UNNEST(genres) as genre -- FROM `advanced.array_exercises` AS ae -- CROSS JOIN UNNEST(actors) AS actor -- CROSS JOIN UNNEST(genres) as genre # 이 문제의 의도 : UNNEST를 2번 연속 사용할 수 있다. # CROSS JOIN => JOIN 연속 2번과 맥락은 동일한데, UNNEST라는 것이 어색할 수 있었다 # 데이터의 중복이 어느정도 생기는데, 그것은 어쩔 수 없는 이슈(CROSS JOIN) -- FROM `advanced.array_exercises` AS ae -- CROSS JOIN UNNEST(actors) AS actor, UNNEST(genres) as genre SELECT title, actor.actor, actor.character, genre FROM `advanced.array_exercises` AS ae ,UNNEST(actors) AS actor, UNNEST(genres) as genre -- FROM `advanced.array_exercises` AS ae -- CROSS JOIN UNNEST(actors) AS actor -- CROSS JOIN UNNEST(genres) as genre # 이 문제의 의도 : UNNEST를 2번 연속 사용할 수 있다. # CROSS JOIN => JOIN 연속 2번과 맥락은 동일한데, UNNEST라는 것이 어색할 수 있었다 # 데이터의 중복이 어느정도 생기는데, 그것은 어쩔 수 없는 이슈(CROSS JOIN) -- FROM `advanced.array_exercises` AS ae -- CROSS JOIN UNNEST(actors) AS actor, UNNEST(genres) as genre WHERE actor.actor = 'Chris Evans' AND genre = 'Action' -- WHERE actor = 'Chris Evans' (X) # 실행 순서 : FROM -> JOIN -> SELECT # actors : ARRAY<STRUCT> => UNNEST => STRUCT # genres : ARRAY<STRING> => STRING 연습 문제 4번# 4) 앱 로그 데이터(app_logs)의 배열을 풀어주세요 SELECT event_date, event_timestamp, event_name, event_param.key as key, event_param.value as value, event_param.value.string_value as string_value, event_param.value.int_value as int_value, -- event_params, user_id, event_param FROM `advanced.app_logs` CROSS JOIN UNNEST(event_params) AS event_param WHERE event_date ='2022-08-01' 2. PIVOTPIVOT 연습 문제 1## SubQuery 방식 SELECT order_date, SUM(IF(user_id = 1, sum_of_amount, NULL)) AS user_1, SUM(IF(user_id = 2, sum_of_amount, NULL)) AS user_2, SUM(IF(user_id = 3, sum_of_amount, NULL)) AS user_3 -- MAX를 써도 동일한 결과 값이 나옴 -- 그룹화 할때 값이 하나밖에 없음 FROM ( SELECT order_date, user_id, # Amount의 합 SUM(amount) AS sum_of_amount FROM advanced.orders GROUP BY order_date, user_id -- ORDER BY order_date ) GROUP BY order_date ORDER BY order_date PIVOT 연습 문제 2# ANY_VALUE 활용 -- ANY_VALUE : 그룹화 할 대상 중에 임의의 값을 선택한다 (NULL)을 제외하고 -- ANY_VALUE에선 나머지 값들이 NULL 이거나 확정적으로 이 값이 나올 것이다 기대할 때 사용한다 SELECT user_id, ANY_VALUE(IF(order_date = PARSE_DATE('%Y-%m-%d', '2023-05-01'), amount, NULL)) AS `2023-05-01`, ANY_VALUE(IF(order_date = date('2023-05-02'), amount, NULL)) AS `2023-05-02`, ANY_VALUE(IF(order_date = date('2023-05-03'), amount, NULL)) AS `2023-05-03`, ANY_VALUE(IF(order_date = date('2023-05-04'), amount, NULL)) AS `2023-05-04`, ANY_VALUE(IF(order_date = date('2023-05-05'), amount, NULL)) AS `2023-05-05` FROM `advanced.orders` GROUP BY user_id ORDER BY user_id PIVOT 연습 문제 3# 3) orders 테이블에서 사용자(user_id)별, 날짜(order_date)별로 주문이 있다면 1, 없다면 0으로 PIVOT 해주세요.user_id를 행(Row)으로, order_date를 열(Column)로 만들고 주문을 많이 해도 1로 처리합니다 SELECT user_id, # amount 대신 1 이라고 표시. IF 문 안에 TRUE 일 때의 값이 항상 특정 컬럼이 아니라 1이라고 할 수도 있음(유무에 따라서) MAX(IF(order_date = PARSE_DATE('%Y-%m-%d', '2023-05-01'), 1, 0)) AS `2023-05-01`, MAX(IF(order_date = date('2023-05-02'), 1, 0)) AS `2023-05-02`, MAX(IF(order_date = date('2023-05-03'), 1, 0)) AS `2023-05-03`, MAX(IF(order_date = date('2023-05-04'), 1, 0)) AS `2023-05-04`, MAX(IF(order_date = date('2023-05-05'), 1, 0)) AS `2023-05-05` FROM `advanced.orders` GROUP BY user_id PIVOT 연습 문제 4 앱 로그 데이터 배열 PIVOT 하기WITH base AS ( SELECT event_date, event_name, user_id, user_pseudo_id, event_timestamp, MAX(IF(param.key = 'firebase_screen', param.value.string_value, NULL)) AS firebase_screen, MAX(IF(param.key = 'food_id', param.value.int_value, NULL)) AS food_id, MAX(IF(param.key = 'session_id', param.value.string_value, NULL)) AS session_id FROM `advanced.app_logs` CROSS JOIN UNNEST(event_params) AS param GROUP BY ALL -- WHERE event_name ='click_cart' ORDER BY user_pseudo_id LIMIT 100 ) SELECT event_date, COUNT(user_id) AS user_cnt FROM base WHERE event_name ='click_cart' GROUP BY event_date ORDER BY event_date 3. 퍼널 분석 -- event_name + screen (필요한 이벤트만 WHERE 조건에 걸어서 사용) -- step_number + COUNT WITH base AS ( SELECT event_date, event_timestamp, event_name, user_id, user_pseudo_id, platform, -- param MAX(IF(param.key = 'firebase_screen', param.value.string_value, NULL)) AS firebase_screen, MAX(IF(param.key = 'food_id', param.value.int_value, NULL)) AS food_id, MAX(IF(param.key = 'session_id', param.value.string_value, NULL)) AS session_id, FROM advanced.app_logs CROSS JOIN UNNEST(event_params) AS param WHERE event_date = '2022-08-01' GROUP BY ALL -- LIMIT 100 ), filter_event_and_concat_event_and_screen AS ( SELECT * EXCEPT(event_name, firebase_screen), CONCAT(event_name, "-", firebase_screen) AS event_name_with_screen FROM base WHERE event_name IN ("screen_view", "click_payment") ) SELECT * FROM filter_event_and_concat_event_and_screen 최종 RESULT# 일자 상관 없이 퍼널의 유저 수를 집계한 쿼리 => 일자별로 하기 위해 event_date 추가WITH base AS ( SELECT event_date, event_timestamp, event_name, user_id, user_pseudo_id, platform, -- param MAX(IF(param.key = 'firebase_screen', param.value.string_value, NULL)) AS firebase_screen, MAX(IF(param.key = 'session_id', param.value.string_value, NULL)) AS session_id, FROM advanced.app_logs CROSS JOIN UNNEST(event_params) AS param WHERE -- event_date = '2022-08-01' # 적은 데이터로 쿼리를 작성하기 위해 만들어둔 조건 event_date BETWEEN '2022-08-01' AND '2022-08-18' GROUP BY ALL ), filter_event_and_concat_event_and_screen AS ( SELECT * EXCEPT(event_name, firebase_screen), CONCAT(event_name, "-", firebase_screen) AS event_name_with_screen, DATETIME(TIMESTAMP_MICROS(event_timestamp), 'Asia/Seoul') AS event_datetime FROM base WHERE event_name IN ("screen_view", "click_payment") ) -- step_number + COUNT -- step_number : CASE WHEN 을 사용해 숫자 지정, 조건문을 여러 개 하고싶을 때 사용하는 함수 # 일자 상관 없이 퍼널의 유저 수를 집계한 쿼리 => 일자별로 하기 위해 event_date 추가 SELECT event_date, # 일자별로 퍼널별 유저 수 쿼리 event_name_with_screen, -- event_name_with_screen, -- event_datetime, user_pseudo_id, CASE WHEN event_name_with_screen = 'screen_view-welcome' THEN 1 WHEN event_name_with_screen = 'screen_view-home' THEN 2 WHEN event_name_with_screen = 'screen_view-food_category' THEN 3 WHEN event_name_with_screen = 'screen_view-restaurant' THEN 4 WHEN event_name_with_screen = 'screen_view-cart' THEN 5 WHEN event_name_with_screen = 'click_payment-cart' THEN 6 ELSE NULL END AS step_number, COUNT(DISTINCT user_pseudo_id) AS CNT FROM filter_event_and_concat_event_and_screen GROUP BY ALL HAVING step_number IS NOT NULL ORDER BY event_date
-
미해결BigQuery(SQL) 활용편(퍼널 분석, 리텐션 분석)
[ 인프런 빅쿼리 빠짝스터디 1주차 ] ARRAY, STRUCT 연습 문제 / PIVOT 연습 문제 / 퍼널 쿼리 연습 문제
1. ARRAY, STRUCT1) array_exercises 테이블에서 각 영화(title)별로 장르(genres)를 UNNEST해서 보여주세요SELECT title, genre FROM advanced.array_exercises CROSS JOIN UNNEST(genres) AS genre2) array_exercises 테이블에서 각 영화(title)별로 배우(actor)와 배역(character)을 보여주세요. (배우와 배역은 별도의 컬럼으로 나와야 합니다)SELECT title, actor.actor, actor.character FROM advanced.array_exercises CROSS JOIN UNNEST(actors) AS actor3) array_exercises 테이블에서 각 영화(title)별로 배우(actor), 배역(character), 장르(genre)를 출력하세요. 한 Row에 배우, 배역, 장르가 모두 표시되어야 합니다SELECT title, actor.actor, actor.character, genre FROM advanced.array_exercises CROSS JOIN UNNEST(actors) AS actor CROSS JOIN UNNEST(genres) AS genre4) 앱 로그 데이터(app_logs)의 배열을 풀어주세요SELECT user_id, event_date, event_name, user_pseudo_id, event_param.key AS key, event_param.value.string_value AS string_value, event_param.value.int_value AS int_value FROM advanced.app_logs CROSS JOIN UNNEST(event_params) AS event_param WHERE event_date = '2022-08-01'2. PIVOT1) orders 테이블에서 유저(user_id)별로 주문 금액(amount)의 합계를 PIVOT해주세요. 날짜(order_date)를 행(Row)으로, user_id를 열(Column)으로 만들어야 합니다SELECT order_date, SUM(IF(user_id = 1, amount, 0)) AS user_1, SUM(IF(user_id = 2, amount, 0)) AS user_2, SUM(IF(user_id = 3, amount, 0)) AS user_3, FROM advanced.orders GROUP BY order_date ORDER BY order_date 2) orders 테이블에서 날짜(order_date)별로 유저들의 주문 금액(amount)의 합계를 PIVOT 해주세요. user_id를 행(Row)으로, order_date를 열(Column)으로 만들어야 합니다-- 2) orders 테이블에서 날짜(order_date)별로 유저들의 주문 금액(amount)의 합계를 PIVOT 해주세요. user_id를 행(Row)으로, order_date를 열(Column)으로 만들어야 합니다 SELECT user_id, SUM(IF(order_date = '2023-05-01', amount, 0)) AS `2023-05-01`, SUM(IF(order_date = '2023-05-02', amount, 0)) AS `2023-05-02`, SUM(IF(order_date = '2023-05-03', amount, 0)) AS `2023-05-03`, SUM(IF(order_date = '2023-05-04', amount, 0)) AS `2023-05-04`, SUM(IF(order_date = '2023-05-05', amount, 0)) AS `2023-05-05`, FROM advanced.orders GROUP BY user_id ORDER BY user_id3) orders 테이블에서 사용자(user_id)별, 날짜(order_date)별로 주문이 있다면 1, 없다면 0으로 PIVOT 해주세요. user_id를 행(Row)으로, order_date를 열(Column)로 만들고 주문을 많이 해도 1로 처리합니다SELECT user_id, MAX(IF(order_date = '2023-05-01', 1, 0)) AS `2023-05-01`, MAX(IF(order_date = '2023-05-02', 1, 0)) AS `2023-05-02`, MAX(IF(order_date = '2023-05-03', 1, 0)) AS `2023-05-03`, MAX(IF(order_date = '2023-05-04', 1, 0)) AS `2023-05-04`, MAX(IF(order_date = '2023-05-05', 1, 0)) AS `2023-05-05`, FROM advanced.orders GROUP BY user_id ORDER BY user_id4) 앱 로그 데이터 배열 PIVOT하기# 앱 로그 PIVOT # 쿼리를 작성하는 목표, 확인할 지표 : user_id = 32888이 카트 추가하기(click_cart)를 누를때 어떤 음식(food_id)을 담았나요? # 쿼리 계산 방법 : UNNEST -> PIVOT # 데이터의 기간 : X # 사용할 테이블 : app_logs # Join KEY : X # 데이터 특징: -- event_params ARRAY, STRUCT / event_params.value ARRAY, STRUCT WITH base AS ( SELECT user_id, event_date, event_name, user_pseudo_id, event_param.key AS key, event_param.value.string_value AS string_value, event_param.value.int_value AS int_value FROM advanced.app_logs CROSS JOIN UNNEST(event_params) AS event_param ) SELECT user_id, event_date, event_name, user_pseudo_id, MAX(IF(key = 'firebase_screen', string_value, NULL)) AS firebase_screen, MAX(IF(key = 'food_id', int_value, NULL)) AS food_id, MAX(IF(key = 'session_id', string_value, NULL)) AS session_id FROM base WHERE user_id = 32888 AND event_name = 'click_cart' GROUP BY ALL3. 퍼널WITH base AS ( SELECT event_date, event_timestamp, event_name, user_id, user_pseudo_id, platform, MAX(IF(event_param.key = 'firebase_screen', event_param.value.string_value, NULL)) AS firebase_screen, MAX(IF(event_param.key = 'session_id', event_param.value.string_value, NULL)) AS session_id FROM advanced.app_logs CROSS JOIN UNNEST(event_params) AS event_param WHERE event_date BETWEEN '2022-08-01' AND '2022-08-18' GROUP BY ALL ), filter_event_and_concat_event_and_screen AS ( SELECT * EXCEPT(event_name, firebase_screen, event_timestamp), CONCAT(event_name, "-", firebase_screen) AS event_name_with_screen, DATETIME(TIMESTAMP_MICROS(event_timestamp), 'Asia/Seoul') AS event_datetime FROM base WHERE event_name IN ('screen_view', 'click_payment') ), daily_event_summary AS ( SELECT event_date, # 일자별로 퍼널별 유저 수 쿼리 event_name_with_screen, CASE WHEN event_name_with_screen = 'screen_view-welcome' THEN 1 WHEN event_name_with_screen = 'screen_view-home' THEN 2 WHEN event_name_with_screen = 'screen_view-food_category' THEN 3 WHEN event_name_with_screen = 'screen_view-restaurant' THEN 4 WHEN event_name_with_screen = 'screen_view-cart' THEN 5 WHEN event_name_with_screen = 'click_payment-cart' THEN 6 ELSE NULL END AS step_number, COUNT(DISTINCT user_pseudo_id) AS cnt FROM filter_event_and_concat_event_and_screen GROUP BY ALL HAVING step_number IS NOT NULL ) -- 집계한 데이터 PIVOT SELECT event_date, MAX(IF(event_name_with_screen = 'screen_view-welcome', cnt, 0)) AS screen_view_welcome, MAX(IF(event_name_with_screen = 'screen_view-home', cnt, 0)) AS screen_view_home, MAX(IF(event_name_with_screen = 'screen_view-food_category', cnt, 0)) AS screen_view_food_category, MAX(IF(event_name_with_screen = 'screen_view-restaurant', cnt, 0)) AS screen_view_restaurant, MAX(IF(event_name_with_screen = 'screen_view-cart', cnt, 0)) AS screen_view_cart, MAX(IF(event_name_with_screen = 'click_payment-cart', cnt, 0)) AS click_payment_cart FROM daily_event_summary GROUP BY ALL ORDER BY event_date
-
미해결BigQuery(SQL) 활용편(퍼널 분석, 리텐션 분석)
[ 인프런 빅쿼리 빠짝스터디 1주차 ] ARRAY, STRUCT 연습 문제 / PIVOT 연습 문제 / 퍼널 쿼리 연습 문제
1. ARRAY, STRUCT1) array_exercises 테이블에서 각 영화(title)별로 장르(genres)를 UNNEST해서 보여주세요SELECT title , genre FROM advanced.array_exercises CROSS JOIN UNNEST(genres) AS genre2) array_exercises 테이블에서 각 영화(title)별로 배우(actor)와 배역(character)을 보여주세요. (배우와 배역은 별도의 컬럼으로 나와야 합니다)SELECT title, , actor.actor AS actor , actor.character AS character FROM advanced.array_exercises CROSS JOIN UNNEST(actors) AS actor3) array_exercises 테이블에서 각 영화(title)별로 배우(actor), 배역(character), 장르(genre)를 출력하세요. 한 Row에 배우, 배역, 장르가 모두 표시되어야 합니다SELECT title , actor.actor AS actor , actor.character AS character , genre FROM advanced.array_exercises CROSS JOIN UNNEST(actors) AS actor CROSS JOIN UNNEST(genres) AS genre 4) 앱 로그 데이터(app_logs)의 배열을 풀어주세요SELECT user_id , event_date , event_name , user_pseudo_id , params.key AS key , params.value.string_value AS str_value , params.value.int_value AS int_value FROM advanced.app_logs CROSS JOIN UNNEST(event_params) AS params WHERE event_date = '2022-08-01' 2. PIVOT1) orders 테이블에서 유저(user_id)별로 주문 금액(amount)의 합계를 PIVOT해주세요. 날짜(order_date)를 행(Row)으로, user_id를 열(Column)으로 만들어야 합니다-- 1) orders 테이블에서 유저(user_id)별로 주문 금액(amount)의 합계를 PIVOT해주세요. 날짜(order_date)를 행(Row)으로, user_id를 열(Column)으로 만들어야 합니다 WITH step1 AS ( SELECT order_date , user_id , sum(amount) AS sum_of_amount FROM advanced.orders GROUP BY ALL ) SELECT order_date , MAX(IF(user_id = 1, sum_of_amount, 0)) AS user_1 , MAX(IF(user_id = 2, sum_of_amount, 0)) AS user_2 , MAX(IF(user_id = 3, sum_of_amount, 0)) AS user_3 FROM step1 GROUP BY order_date ORDER BY order_date2) orders 테이블에서 날짜(order_date)별로 유저들의 주문 금액(amount)의 합계를 PIVOT 해주세요. user_id를 행(Row)으로, order_date를 열(Column)으로 만들어야 합니다SELECT user_id , SUM(IF(order_date = '2023-05-01', amount, 0)) AS `2023-05-01` , SUM(IF(order_date = '2023-05-02', amount, 0)) AS `2023-05-02` , SUM(IF(order_date = '2023-05-03', amount, 0)) AS `2023-05-03` , SUM(IF(order_date = '2023-05-04', amount, 0)) AS `2023-05-04` , SUM(IF(order_date = '2023-05-05', amount, 0)) AS `2023-05-05` FROM advanced.orders GROUP BY user_id ORDER BY user_id3) orders 테이블에서 사용자(user_id)별, 날짜(order_date)별로 주문이 있다면 1, 없다면 0으로 PIVOT 해주세요. user_id를 행(Row)으로, order_date를 열(Column)로 만들고 주문을 많이 해도 1로 처리합니다SELECT user_id , MAX(IF(order_date = '2023-05-01', 1, 0)) AS `2023-05-01` , MAX(IF(order_date = '2023-05-02', 1, 0)) AS `2023-05-02` , MAX(IF(order_date = '2023-05-03', 1, 0)) AS `2023-05-03` , MAX(IF(order_date = '2023-05-04', 1, 0)) AS `2023-05-04` , MAX(IF(order_date = '2023-05-05', 1, 0)) AS `2023-05-05` FROM advanced.orders GROUP BY user_id ORDER BY user_id 4) 앱 로그 데이터 배열 PIVOT하기SELECT user_id , event_date , event_name , user_pseudo_id , MAX(IF(params.key = 'firebase_screen', params.value.string_value, NULL)) AS firebase_screen , MAX(IF(params.key = 'food_id', params.value.int_value, NULL)) AS food_id , MAX(IF(params.key = 'session_id', params.value.string_value, NULL)) AS session_id FROM advanced.app_logs CROSS JOIN UNNEST(event_params) AS params WHERE event_date = '2022-08-01' GROUP BY ALL 3. 퍼널 분석 WITH step1 AS ( SELECT event_date , event_timestamp , event_name , user_id , user_pseudo_id , MAX(IF(params.key = 'firebase_screen' , params.value.string_value , NULL)) AS firebase_screen , MAX(IF(params.key = 'session_id' , params.value.string_value , NULL)) AS session_id , platform FROM advanced.app_logs CROSS JOIN UNNEST(event_params) AS params WHERE event_date BETWEEN '2022-08-01' AND '2022-08-18' GROUP BY ALL ), step2 AS ( SELECT * EXCEPT(event_timestamp) , CONCAT(event_name, '-', firebase_screen) AS event_name_with_screen , DATETIME(TIMESTAMP_MICROS(event_timestamp), 'Asia/Seoul') AS event_datetime FROM step1 ), step3 AS ( SELECT * , CASE WHEN event_name_with_screen = 'screen_view-welcome' THEN 1 WHEN event_name_with_screen = 'screen_view-home' THEN 2 WHEN event_name_with_screen = 'screen_view-food_category' THEN 3 WHEN event_name_with_screen = 'screen_view-restaurant' THEN 4 WHEN event_name_with_screen = 'screen_view-cart' THEN 5 WHEN event_name_with_screen = 'click_payment-cart' THEN 6 END AS step_number FROM step2 -- 1) 각 퍼널별 유저 수 집계 ), step3_1 AS ( SELECT event_name_with_screen , step_number , COUNT(DISTINCT user_pseudo_id) AS cnt FROM step3 GROUP BY ALL HAVING step_number IS NOT NULL ORDER BY step_number ) -- 2) 일자별 각 퍼널별 유저 수 집계 , step3_2 AS ( SELECT event_date , event_name_with_screen , step_number , COUNT(DISTINCT user_pseudo_id) AS cnt FROM step3 GROUP BY ALL HAVING step_number IS NOT NULL ORDER BY event_date , step_number ) 3) 2) 데이터를 PIVOTSELECT event_date , MAX(IF(event_name_with_screen = 'screen_view-welcome', cnt, NULL)) AS `screen_view-welcome` , MAX(IF(event_name_with_screen = 'screen_view-home', cnt, NULL)) AS `screen_view-home` , MAX(IF(event_name_with_screen = 'screen_view-food_category', cnt, NULL)) AS `screen_view-food_category` , MAX(IF(event_name_with_screen = 'screen_view-restaurant', cnt, NULL)) AS `screen_view-restaurant` , MAX(IF(event_name_with_screen = 'screen_view-cart', cnt, NULL)) AS `screen_view-cart` , MAX(IF(event_name_with_screen = 'click_payment-cart', cnt, NULL)) AS `click_payment-cart` FROM step3_2 GROUP BY event_date ORDER BY event_date
-
미해결BigQuery(SQL) 활용편(퍼널 분석, 리텐션 분석)
[빠짝스터디 1주차 과제] ARRAY, STRUCT 연습 문제/ PIVOT 연습문제/ 퍼널 쿼리 연습 문제
[PART 1] ARRAY, STRUCT 연습문제 Q1. array_exercises 테이블에서 각 영화(title)별로 장르(genres)를 UNNEST해서 보여주세요.-- CROSS JOIN UNNEST(ARRARY_COKUMN) AS 새로운 이름 -> 이후 SELECT 절에서 새로운 이름만 포함하여 쿼리를 실행하여 평면화 가능 SELECT title, -- genres, genre FROM advanced.array_exercises AS ae CROSS JOIN UNNEST(genres) AS genre; Q2. array_exercise 테이블에서 각 영화(title)별로 배우(actors)와 배역(character)을 보여주세요 배우와 배역은 별도의 컬럼으로 나와야 합니다.SELECT title, aa.actor, aa.character FROM advanced.array_exercises AS ae CROSS JOIN UNNEST(actors) AS aa; Q3. array_exercises 테이블에서 각 영화(title)별로 배우(actor), 배역(character), 장르(genre)를 출력하세요. 한 Row에 배우, 배역, 장르가 모두 표시되어야 합니다.SELECT title, aa.actor, aa.character, gg AS genre FROM advanced.array_exercises AS ae CROSS JOIN UNNEST(actors) AS aa CROSS JOIN UNNEST(genres) AS gg; Q4. 앱 로그 데이터(app_logs)의 배열을 풀어주세요. SELECT event_date, event_timestamp, event_name, ep.key, ep.value.string_value, ep.value.int_value, user_id, user_pseudo_id, platform FROM advanced.app_logs CROSS JOIN UNNEST(event_params) AS ep WHERE event_date = "2022-08-01"; [PART 2] PIVOT 연습문제 Q1. orders 테이블에서 유저(user_id)별로 주문 금액(amount)의 합계를 PIVOT해주세요. 날짜(order_date)를 행(Row)으로, user_id를 열(Column)으로 만들어야 합니다.SELECT order_date, SUM(IF(user_id = 1, amount, 0)) AS user_1, SUM(IF(user_id = 2, amount, 0)) AS user_2, SUM(IF(user_id = 3, amount, 0)) AS user_3 FROM advanced.orders GROUP BY order_date; Q2. orders 테이블에서 날짜(order_date)별로 유저들의 주문 금액(amount)의 합계를 PIVOT 해주세요. user_id 를 행(Row)으로, order_date를 열(Column)으로 만들어야 합니다.-- 처음 시도에 짠 쿼리 SELECT user_id, SUM(IF(order_date = "2023-05-01", amount, 0)) AS 2023-05-01, SUM(IF(order_date = "2023-05-02", amount, 0)) AS 2023-05-02, SUM(IF(order_date = "2023-05-03", amount, 0)) AS 2023-05-03, SUM(IF(order_date = "2023-05-04", amount, 0)) AS 2023-05-04, SUM(IF(order_date = "2023-05-05", amount, 0)) AS 2023-05-05, FROM advanced.orders GROUP BY user_id;이렇게 쿼리를 짜고 실행해보니 다음과 같은 오류 문구가 나옴Syntax error: Unexpected integer literal "2023" at [3:52]SELECT 문에서 AS 다음에 정의한 새로운 컬럼 명칭에 오류가 있는 것 같은데, “ “ 로 감싸도 오류가 나오고 + 아예 AS 를 빼고 실행했더니 f0/f1/f2 와 같은 임의의 컬럼명이 지정됨-- 강의를 듣고 고친 쿼리 SELECT user_id, SUM(IF(order_date = "2023-05-01", amount, 0)) AS `2023-05-01`, SUM(IF(order_date = "2023-05-02", amount, 0)) AS `2023-05-02`, SUM(IF(order_date = "2023-05-03", amount, 0)) AS `2023-05-03`, SUM(IF(order_date = "2023-05-04", amount, 0)) AS `2023-05-04`, SUM(IF(order_date = "2023-05-05", amount, 0)) AS `2023-05-05`, FROM advanced.orders GROUP BY user_id;alias 로 영어가 아닌 컬럼의 이름을 새로 지정할 때에는, 반드시 backtick(`) 으로 감싸줘야 한다는 점!(c.f.) MAC 에서 backtick 은 영문인 상태로 ₩ 단축키를 누르면 나온다강의에서는 SUM 함수가 아닌 MAX 함수 or ANY_VALUE 함수로 감싸서 쿼리를 작성해주신 점 확인강의 진행 시에는 IF 문으로 먼저 데이터를 확인하고 user_id x order_date 별로 1개의 데이터만 있다는 점을 중간확인 했기 때문단, 실무 상으로는 user_id 와 order_date 가 무수히 많거나 or 데이터 양이 너무 많아 중간 조회를 할 수 없는 상황도 있기 때문에 → 이런 경우에는 MAX 로 가져가는 것이 안전하지 않을까? 하는 생각도 들었음데이터를 보고자 하는 목적이 “각 user_id x order_date 별로 주문금액의 합산” 을 보기 위함이었기 때문! Q3. orders 테이블에서 사용자(user_id)별, 날짜(order_date)별로 주문이 있다면 1, 없다면 0으로 PIVOT 해주세요. user_id를 행(Row)으로, order_date를 열(Column)로 만들고 주문을 많이 해도 1로 처리합니다.-- 처음 시도에 짠 쿼리 SELECT user_id, IF((SUM(IF(order_date = "2023-05-01", amount, NULL))) IS NOT NULL, 1, 0) AS `2023-05-01`, IF((SUM(IF(order_date = "2023-05-02", amount, NULL))) IS NOT NULL, 1, 0) AS `2023-05-02`, IF((SUM(IF(order_date = "2023-05-03", amount, NULL))) IS NOT NULL, 1, 0) AS `2023-05-03`, IF((SUM(IF(order_date = "2023-05-04", amount, NULL))) IS NOT NULL, 1, 0) AS `2023-05-04`, IF((SUM(IF(order_date = "2023-05-05", amount, NULL))) IS NOT NULL, 1, 0) AS `2023-05-05`, FROM advanced.orders GROUP BY user_id;SUM 과 IF 문으로 먼저 쿼리를 짜서 user_id 별 합산값을 확인하고 → 합산값이 NULL 이 아닌 경우에만 1을 다시 값으로 지정할 수 있도록 IF 문으로 감싸둠이렇게 쿼리를 짜도 결과값은 동이하게 나오지만, SELECT 문이 길어진 것 같아서 좀 더 효율적으로 쿼리를 짤 수 있는 방법은 없을지 고민되었음 ㅠ-- 강의를 듣고 고친 쿼리 SELECT user_id, MAX(IF(order_date = "2023-05-01", 1, 0)) AS `2023-05-01`, MAX(IF(order_date = "2023-05-02", 1, 0)) AS `2023-05-02`, MAX(IF(order_date = "2023-05-03", 1, 0)) AS `2023-05-03`, MAX(IF(order_date = "2023-05-04", 1, 0)) AS `2023-05-04`, MAX(IF(order_date = "2023-05-05", 1, 0)) AS `2023-05-05`, FROM advanced.orders GROUP BY user_id;IF 문 안에 들어있는 TRUE 조건이었던 amount 를 1로 바꿔주면 간단히 해결되었을 문제!IF 문 안에 TRUE 일 때 값이 항상 특정 컬럼이 아니라 1이라고도 할 수 있다는 것을 보여주기 위한 문제다만, PIVOT 2번 문제에서 언급했던 바와 같이 만약 user_id x order_date 별로 1개의 값만 존재하는 것이 아니라 여러 값이 존재했더라면 → 3번 쿼리에서 MAX를 썼을 때와 SUM 을 썼을 때의 결과값은 달라졌을 것이라 생각함만약 user_id x order_date 별로 N개의 값이 존재했더라면 → SUM 함수로 감쌌을 때 1의 값이 x N개 합산되어서 나왔을 것이기 때문따라서, 3번 문제의 경우 1 혹은 0 2개의 값으로만 표현해야 했기 때문에, SUM 이 아닌 MAX 함수로 쿼리를 작성하는 것이 필요하다고 생각함단, 만약 “횟수” 를 알고 싶었다면 기존대로 SUM 함수를 사용하면 됨! Q4. 앱로그 데이터 배열 PIVOT 하기 → user_id = 32888 이 카트 추가하기(click_cart)를 누를 때 어떤 음식 (food_id)을 담았는지 구해주세요. key 를 Column 으로 두고, string_value 나 int_value를 Column의 값으로 설정해서 풀어주세요.SELECT event_date, event_timestamp, event_name, user_id, user_pseudo_id, MAX(IF(ep.key = "firebase_screen", ep.value.string_value, NULL)) AS firebase_screen, MAX(IF(ep.key = "food_id", ep.value.int_value, NULL)) AS food_id, MAX(IF(ep.key = "session_id", ep.value.string_value, NULL)) AS session_id FROM advanced.app_logs CROSS JOIN UNNEST(event_params) AS ep GROUP BY ALL; [PART 3] 퍼널 연습문제WITH base AS (SELECT event_date, event_timestamp, event_name, user_id, user_pseudo_id, platform, -- event_param, MAX(IF(event_param.key = "firebase_screen", event_param.value.string_value, NULL)) AS firebase_screen, MAX(IF(event_param.key = "food_id", event_param.value.int_value, NULL)) AS food_id, MAX(IF(event_param.key = "session_id", event_param.value.string_value, NULL)) AS session_id, FROM advanced.app_logs CROSS JOIN UNNEST(event_params) AS event_param WHERE -- event_date = "2022-08-01" event_date BETWEEN "2022-08-01" AND "2022-08-18" GROUP BY ALL ), filter_event_and_concat_event_and_screen AS ( -- event_name + screen (필요한 이벤트만 WHERE 조건에 걸어서 사용) SELECT * EXCEPT(event_name, firebase_screen, event_timestamp), CONCAT(event_name, "-", firebase_screen) AS event_name_with_screen, DATETIME(TIMESTAMP_MICROS(event_timestamp), 'Asia/Seoul') AS event_datetime FROM base WHERE event_name IN ("screen_view", "click_payment") ) -- step_number + COUNT -- step_number : CASE WHEN을 사용해 숫자 지정 -- 일자별로 퍼널별 유저 수 쿼리 SELECT event_date, event_name_with_screen, CASE WHEN event_name_with_screen = "screen_view-welcome" THEN 1 WHEN event_name_with_screen = "screen_view-home" THEN 2 WHEN event_name_with_screen = "screen_view-food_category" THEN 3 WHEN event_name_with_screen = "screen_view-restaurant" THEN 4 WHEN event_name_with_screen = "screen_view-cart" THEN 5 WHEN event_name_with_screen = "click_payment-cart" THEN 6 ELSE NULL END AS step_number COUNT(DISTINCT user_pseudo_id) AS cnt FROM filter_event_and_concat_event_and_screen GROUP BY ALL HAVING step_number IS NOT NULL ORDER BY event_date
-
미해결BigQuery(SQL) 활용편(퍼널 분석, 리텐션 분석)
[바짝스터디 1주차 과제] ARRAY,STRUCT,PIVOT,FUNNEL
1. ARRAY, STRUCT중요 문법CROSS JOIN UNNEST ( ) 연습문제 1 ) ARRAY 데이터의 기본 추출 SELECT title, genre FROM `advanced.array_exercises` CROSS JOIN UNNEST(genres) AS genre 연습문제 2 ) STRUCT 데이터의 기본 추출 SELECT title, actor FROM `advanced.array_exercises` CROSS JOIN UNNEST(actors) AS actor --위와 같이 추출할시 actor 컬럼과 character 컬럼이명이 명확하게 나오지 않음으로 SELECT title, actor.actor, actor.character FROM `advanced.array_exercises` CROSS JOIN UNNEST(actors) AS actor --위와 같이 명확한 명칭을 적어주면 데이터의 컬럼을 확인하기 좋다 연습문제 3 ) CROSS JOIN 2번 사용 SELECT title, actor.actor, actor.character, genre FROM `advanced.array_exercises` CROSS JOIN UNNEST(actors) AS actor CROSS JOIN UNNEST(genres) AS genre -- OR SELECT title, actor.actor, actor.character, genre FROM `advanced.array_exercises`, UNNEST(actors) AS actor, UNNEST(genres) AS genre 연습문제 4 ) 로그 데이터 풀어보기 SELECT event_date, event_timestamp, event_name, event_param.key AS key, event_param.value.string_value AS string_value, event_param.value.int_value AS int_value, user_id, user_pseudo_id, platform FROM `advanced.app_logs` CROSS JOIN UNNEST (event_params) AS event_param2. PIVOT중요 문법IF(조건 = , TRUE , FALSE ) 컬럼명을 숫자로 설정하고 싶을시 ` 을 사용하여 감싸주어야한다 연습 문제 1 ) PIVOT 기본 SELECT order_date, SUM(IF(user_id = 1, total_amount, 0)) AS user_id_1, SUM(IF(user_id = 2, total_amount, 0)) AS user_id_2, SUM(IF(user_id = 3, total_amount, 0)) AS user_id_3 FROM( SELECT order_date, user_id, SUM(amount) AS total_amount FROM `advanced.orders` GROUP BY order_date , user_id ) GROUP BY order_date ORDER BY order_date 연습문제 2 ) PIVOT 기본 2SELECT user_id, SUM(IF(order_date = '2023-05-01', amount, 0)) AS `2023-05-01`, SUM(IF(order_date = '2023-05-02', amount, 0)) AS `2023-05-02`, SUM(IF(order_date = '2023-05-03', amount, 0)) AS `2023-05-03`, SUM(IF(order_date = '2023-05-04', amount, 0)) AS `2023-05-04`, SUM(IF(order_date = '2023-05-05', amount, 0)) AS `2023-05-05` FROM `advanced.orders` GROUP BY user_id ORDER BY user_id 연습문제 3) TRUE 값의 변화SELECT user_id, MAX(IF(order_date = '2023-05-01', 1, 0)) AS `2023-05-01`, MAX(IF(order_date = '2023-05-02', 1, 0)) AS `2023-05-02`, MAX(IF(order_date = '2023-05-03', 1, 0)) AS `2023-05-03`, MAX(IF(order_date = '2023-05-04', 1, 0)) AS `2023-05-04`, MAX(IF(order_date = '2023-05-05', 1, 0)) AS `2023-05-05` FROM `advanced.orders` GROUP BY user_id ORDER BY user_id 연습문제 4 ) 로그 데이터를 이용한 컬럼 정리SELECT event_date, event_timestamp, event_name, user_id, user_pseudo_id, MAX(IF(param.key = 'firebase_screen', param.value.string_value, NULL)) AS firebase_screen, MAX(IF(param.key = 'food_id', param.value.int_value, NULL)) AS food_id, MAX(IF(param.key = 'session_id', param.value.string_value, NULL)) AS session_id FROM `advanced.app_logs` CROSS JOIN UNNEST (event_params) AS param GROUP BY ALLGROUP BY ALL에 대한 이해,UNNEST 이후 컬럼을 어떤식으로 정리할것인가 3. 퍼널분석퍼널분석 쿼리 WITH logs AS ( SELECT event_date, event_timestamp, event_name, user_id, user_pseudo_id, MAX(IF(param.key = 'firebase_screen', param.value.string_value, NULL)) AS firebase_screen, MAX(IF(param.key = 'food_id', param.value.int_value, NULL)) AS food_id, MAX(IF(param.key = 'session_id', param.value.string_value, NULL)) AS session_id FROM `advanced.app_logs` CROSS JOIN UNNEST (event_params) AS param WHERE event_date BETWEEN '2022-12-01' AND '2022-12-31' GROUP BY ALL ), filter_logs AS ( SELECT * EXCEPT(event_name, firebase_screen, event_timestamp), CONCAT(event_name, '-', firebase_screen) AS event_name_screen, DATETIME(TIMESTAMP_MICROS(event_timestamp)) AS event_time FROM logs WHERE event_name IN ('screen_view', 'click_payment') ) funnel AS ( SELECT event_date, event_name_screen, CASE WHEN event_name_screen = 'screen_view-welcome' THEN 1 WHEN event_name_screen = 'screen_view-home' THEN 2 WHEN event_name_screen = 'screen_view-food_category' THEN 3 WHEN event_name_screen = 'screen_view-restaurant' THEN 4 WHEN event_name_screen = 'screen_view-cart' THEN 5 WHEN event_name_screen = 'click_payment-cart' THEN 6 ELSE NULL END AS flow, COUNT(DISTINCT user_pseudo_id) AS cnt FROM filter_logs GROUP BY ALL HAVING flow IS NOT NULL ORDER BY 1,3 ) PIVOTSELECT event_date, MAX(IF(flow = 1, cnt, 0)) AS screen_view_welcome, MAX(IF(flow = 2, cnt, 0)) AS screen_view_home, MAX(IF(flow = 3, cnt, 0)) AS screen_food_category, MAX(IF(flow = 4, cnt, 0)) AS screen_restaurant, MAX(IF(flow = 5, cnt, 0)) AS screen_cart, MAX(IF(flow = 6, cnt, 0)) AS click_payment_cart, FROM funnel GROUP BY event_date ORDER BY 1
-
미해결BigQuery(SQL) 활용편(퍼널 분석, 리텐션 분석)
[바짝스터디 1주차 과제] ARRAY, STRUCT 연습 문제 / PIVOT 연습 문제
1. ARRAY, STRUCT 연습문제 1)array_exercises테이블에서각영화(title)별로장르(genres)를UNNEST해서 보여주세요 SELECT title, genre FROM advanced.array_exercises CROSS JOIN UNNEST(genres) AS genre; 2)array_exercises테이블에서각영화(title)별로배우(actor)와배역(character)을 보여주세요. 배우와배역은별도의컬럼으로나와야합니다 SELECT title, actors.actor, actors.character FROM advanced.array_exercises CROSS JOIN UNNEST(actors) AS ac; 3)array_exercises테이블에서각영화(title)별로배우(actor),배역(character),장르 (genre)를출력하세요.한Row에배우,배역,장르가모두표시되어야합니다 SELECT title, actor.actor, actor.character, genre FROM `advanced.array_exercises` CROSS JOIN UNNEST(actors) as actor CROSS JOIN UNNEST(genres) as genre; 4)앱로그데이터(app_logs)의배열을풀어주세요 SELECT user_id, event_date, event_name, user_pseudo_id, event_param.key, event_param.value.string_value, event_param.value.int_value FROM `advanced.app_logs` CROSS JOIN UNNEST(event_params) AS event_param 2. PIVOT 연습 문제 1)orders테이블에서유저(user_id)별로주문금액(amount)의합계를PIVOT해주세요. 날짜(order_date)를행(Row)으로,user_id를열(Column)으로만들어야합니다 SELECT order_date, COALESCE(SUM(IF(user_id = 1, amount, null)),0) AS user_1, COALESCE(SUM(IF(user_id = 2, amount, null)),0) AS user_2, COALESCE(SUM(IF(user_id = 3, amount, null)),0) AS user_3 FROM advanced.orders GROUP BY order_date ORDER BY order_date 2)orders테이블에서날짜(order_date)별로유저들의주문금액(amount)의합계를 PIVOT해주세요.user_id를행(Row)으로,order_date를열(Column)으로만들어야 합니다 SELECT user_id, COALESCE(SUM(IF(order_date = '2023-05-01', amount, null)),0) AS `2023-05-01`, COALESCE(SUM(IF(order_date = '2023-05-02', amount, null)),0) AS `2023-05-02`, COALESCE(SUM(IF(order_date = '2023-05-03', amount, null)),0) AS `2023-05-03`, COALESCE(SUM(IF(order_date = '2023-05-04', amount, null)),0) AS `2023-05-04`, COALESCE(SUM(IF(order_date = '2023-05-05', amount, null)),0) AS `2023-05-05`, FROM advanced.orders GROUP BY user_id ORDER BY user_id; 3)orders테이블에서사용자(user_id)별,날짜(order_date)별로주문이있다면1, 없다면0으로PIVOT해주세요.user_id를행(Row)으로,order_date를열(Column)로 만들고주문을많이해도1로처리합니다 SELECT user_id, MAX(IF(order_date = '2023-05-01' AND order_id is not null, 1, 0)) AS `2023-05-01`, MAX(IF(order_date = '2023-05-02' AND order_id is not null, 1, 0)) AS `2023-05-02`, MAX(IF(order_date = '2023-05-03' AND order_id is not null, 1, 0)) AS `2023-05-03`, MAX(IF(order_date = '2023-05-04' AND order_id is not null, 1, 0)) AS `2023-05-04`, MAX(IF(order_date = '2023-05-05' AND order_id is not null, 1, 0)) AS `2023-05-05`, FROM advanced.orders GROUP BY user_id ORDER BY user_id; *) user_id=32888이카트추가하기(click_cart)를누를때어떤음식(food_id)을담았나요? WITH app_order_raw AS ( SELECT user_id, event_date, event_name, user_pseudo_id, pr.key, pr.value.string_value, pr.value.int_value FROM advanced.app_logs, UNNEST(event_params) AS pr WHERE event_date = '2022-08-01' ) SELECT user_id, event_date, event_name, user_pseudo_id, MAX(IF(key = 'firebase_screen', string_value, null)) AS firebase_screen, MAX(IF(key = 'food_id', int_value, null)) AS food_id, MAX(IF(key = 'session_id', string_value, null)) AS session_id, FROM app_order_raw GROUP BY user_id, event_date, event_name, user_pseudo_id; 3. 느낀점 퍼널이 무엇언지에 대해서 자세히 배울 수 있었음(학과 수업에선 퍼널에 대해서 딱히 알려주는 수업이 없어서 이렇게 재대로 배울 기회가 없었다.) 시작하기 앞서 빅쿼리에 대해 아는 것이 적어 잘할 수 있을지 고민이 많이 되었지만 강의 내용이 이해가 잘되어 열심히 노력하면 따라갈 수 있겠다는 생각이 듬 2주차도 잘부탁드립니다~
-
미해결BigQuery(SQL) 활용편(퍼널 분석, 리텐션 분석)
[바짝스터디 1주차 과제] ARRAY, STRUCT 연습 문제 / PIVOT 연습 문제 / 퍼널 쿼리 연습 문제
SELECT movie_id, title, genre FROM `advanced.array_exercises`, CROSS JOIN UNNEST(genres) AS genre SELECT title, actors.actor, actors.character FROM `advanced.array_exercises` CROSS JOIN UNNEST(actors) AS actors SELECT title, actor.actor, actor.character, genre FROM `advanced.array_exercises` CROSS JOIN UNNEST(actors) AS actor CROSS JOIN UNNEST(genres) AS genre SELECT event_date, event_timestamp, event_name, event_param.key AS key, event_param.value.string_value AS string_value, event_param.value.int_value AS int_value, user_id, user_pseudo_id, platform FROM `advanced.app_logs` CROSS JOIN UNNEST(event_params) AS event_paramSELECT order_date, SUM(IF(user_id = 1, amount, 0)) AS user_1, SUM(IF(user_id = 2, amount, 0)) AS user_2, SUM(IF(user_id = 3, amount, 0)) AS user_3 FROM advanced.orders GROUP BY ALL ORDER BY order_date -- 2. orders 테이블에서 날짜(order_date) 별로 유저들의 주문 금액(amount)의 합계를 PIVOT 해주세요. user_id를 행(Row)으로, order_Date를 열(Column)으로 만들어야 합니다. SELECT user_id, SUM(IF(order_date = "2023-05-01", amount, 0)) AS `2023-05-01`, SUM(IF(order_date = "2023-05-02", amount, 0)) AS `2023-05-02`, SUM(IF(order_date = "2023-05-03", amount, 0)) AS `2023-05-03`, SUM(IF(order_date = "2023-05-04", amount, 0)) AS `2023-05-04`, SUM(IF(order_date = "2023-05-05", amount, 0)) AS `2023-05-05`, FROM advanced.orders GROUP BY ALL ORDER BY user_id -- 3. orders 테이블에서 사용자(user_id)별, 날짜(order_date)별로 주문이 있다면 1, 없다면 0으로 PIVOT 해주세요. user_id를 행(Row)으로, order_date를 열(Column)로 만들고 주문을 많이 해도 1로 처리합니다. SELECT user_id, MAX(IF(order_date = "2023-05-01", 1, 0)) AS `2023-05-01`, MAX(IF(order_date = "2023-05-02", 1, 0)) AS `2023-05-02`, MAX(IF(order_date = "2023-05-03", 1, 0)) AS `2023-05-03`, MAX(IF(order_date = "2023-05-04", 1, 0)) AS `2023-05-04`, MAX(IF(order_date = "2023-05-05", 1, 0)) AS `2023-05-05`, FROM advanced.orders GROUP BY ALL ORDER BY user_id -- 4.앱 로그 PIVOT WITH base AS ( SELECT event_date, event_timestamp, event_name, user_id, user_pseudo_id, MAX(IF(param.key = "firebase_screen", param.value.string_value, NULL)) AS firebase_screen, MAX(IF(param.key = "food_id", param.value.int_value, NULL)) AS food_id, MAX(IF(param.key = "session_id", param.value.string_value, NULL)) AS session_id, FROM advanced.app_logs CROSS JOIN UNNEST(event_params) AS param GROUP BY ALL ) SELECT event_date, COUNT(user_id) AS user_cnt FROM base WHERE event_name = "click_cart" GROUP BY ALL FUNNEL with base as ( select event_date, event_timestamp, event_name, user_id, user_pseudo_id, platform, -- event_param max(if(event_param.key = 'firebase_screen', event_param.value.string_value, null)) as firebase_screen, -- max(if(event_param.key = 'food_id', event_param.value.int_value, null)) as food_id, max(if(event_param.key = 'session_id', event_param.value.string_value, null)) as session_id from advanced.app_logs cross join unnest(event_params) as event_param where event_date between '2022-08-01' and '2022-08-18' group by all ), filter_event_and_concat_event_and_screen as ( select * except(event_name, firebase_screen, event_timestamp), concat(event_name, '-', firebase_screen) as event_name_with_screen, datetime(timestamp_micros(event_timestamp), 'Asia/Seoul') as event_datetime from base where event_name in ('screen_view', 'click_payment' ) ) select event_date, event_name_with_screen, case when event_name_with_screen = 'screen_view-welcome' then 1 when event_name_with_screen = 'screen_view-home' then 2 when event_name_with_screen = 'screen_view-food_category' then 3 when event_name_with_screen = 'screen_view-restaurant' then 4 when event_name_with_screen = 'screen_view-cart' then 5 when event_name_with_screen = 'click_payment-cart' then 6 else null end as step_number, count(distinct user_pseudo_id) as cnt from filter_event_and_concat_event_and_screen group by all having step_number is not null order by 1, 3 -- 3번 퍼널 문제 with base as ( select event_date, event_timestamp, event_name, user_id, user_pseudo_id, platform, -- event_param max(if(event_param.key = 'firebase_screen', event_param.value.string_value, null)) as firebase_screen, -- max(if(event_param.key = 'food_id', event_param.value.int_value, null)) as food_id, max(if(event_param.key = 'session_id', event_param.value.string_value, null)) as session_id from advanced.app_logs cross join unnest(event_params) as event_param where event_date between '2022-08-01' and '2022-08-18' group by all ), filter_event_and_concat_event_and_screen as ( select * except(event_name, firebase_screen, event_timestamp), concat(event_name, '-', firebase_screen) as event_name_with_screen, datetime(timestamp_micros(event_timestamp), 'Asia/Seoul') as event_datetime from base where event_name in ('screen_view', 'click_payment' ) ) select event_date, event_name_with_screen, case when event_name_with_screen = 'screen_view-welcome' then 1 when event_name_with_screen = 'screen_view-home' then 2 when event_name_with_screen = 'screen_view-food_category' then 3 when event_name_with_screen = 'screen_view-restaurant' then 4 when event_name_with_screen = 'screen_view-cart' then 5 when event_name_with_screen = 'click_payment-cart' then 6 else null end as step_number, count(distinct user_pseudo_id) as cnt from filter_event_and_concat_event_and_screen group by all having step_number is not null order by 1, 3 SELECT event_date, MAX(IF(event_name_with_screen ="screen_view-welcome", cnt, NULL)) AS `screen_view-welcome`, MAX(IF(event_name_with_screen ="screen_view-home", cnt, NULL)) AS `screen_view-home`, MAX(IF(event_name_with_screen ="screen_view-food_category", cnt, NULL)) AS `screen_view-food_category`, MAX(IF(event_name_with_screen ="screen_view-restaurant", cnt, NULL)) AS `screen_view-restaurant`, MAX(IF(event_name_with_screen ="screen_view-cart", cnt, NULL)) AS `screen_view-cart`, MAX(IF(event_name_with_screen ="click_payment-cart", cnt, NULL)) AS `click_payment-cart` FROM main GROUP BY ALL ORDER BY all
-
미해결
[빠짝스터디 1주차 과제] ARRAY STRUCT, PIVOT, 퍼널 쿼리 연습문제
1주차.ARRAY(배열), STRUCT(구조체) 연습문제ARRAY = 같은 타입의(숫자, 문자 등) 여러 값을 하나의 컬럼에 저장할 수 있는 자료음식 메뉴판, 음악 플레이리스트-즐겨찾기한 음악 등STRUCT = 서로 다른 타입의 여러 값을 하나의 컬럼에 저장할 수 있는 자료주소록, 영화 정보 등✅ ARRAY, STRUCT 연습문제 1# 연습문제 1번. # array_exercises 테이블에서 각 영화(title) 별로 장르(genres)를 UNNEST해서 보여주세요. SELECT title, genre FROM advanced.array_exercises CROSS JOIN UNNEST(genres) as genre✅ ARRAY, STRUCT 연습문제 2# 연습문제 2번. # array_exercises 테이블에서 각 영화(title) 별로 배우(actor)와 배역(character)을 보여주세요. # 배우와 배역은 별도의 컬럼으로 나와야합니다. SELECT title, actor.actor, actor.character FROM advanced.array_exercises CROSS JOIN UNNEST(actors) as actor✅ ARRAY, STRUCT 연습문제 3# 연습문제 3번. # array_exercises 테이블에서 각 영화(title) 별로 배우(actor), 배역(character), 장르(genres)를 출력. # 한 Row에 배우, 배역, 장르가 모두 표시되어야합니다. SELECT title, actor.actor as actor, actor.character as character, genre FROM advanced.array_exercises CROSS JOIN UNNEST(genres) as genre CROSS JOIN UNNEST(actors) as actor ORDER BY 1, 2 desc✅ ARRAY, STRUCT 연습문제 4# 연습문제 4번. # app_logs 데이터의 배열을 풀어라. # 데이터 탐색, group by 활용, 하루 사용자 집계, 어떤 이벤트가 있는지 등. WITH base AS ( SELECT event_date, event_name, evnent_parm.key AS key, evnent_parm.value.string_value AS string_value, evnent_parm.value.int_value AS int_value, user_id FROM advanced.app_logs, UNNEST(event_params) AS evnent_parm WHERE event_date = '2022-08-01') SELECT event_date, event_name, count(distinct user_id) as cnt FROM base GROUP BY ALL order by cnt desc click_banner, cart, food, food_category, login, payment, recommend_extra_food, recommend_food, restaurant, restaruant_nearby, search / request_search, screen_view, view_recommend_extra_food → 총 14가지의 event_name.기간 내 전체 사용자(중복 제거) → 총 49,678명, click_payment 를 실행한 이용자 총 11,467명.2022-08-01 하루 사용자 129명. 8월 한달 사용자 6,424명. 8월 한달 click_payment 810명.→ event_name을 screen_view로 설정했지만, 기간 내 전체 사용자, 0801 하루 사용자, 8월 한달 사용자의 수는 같았다(user_id 사용). screen_view는 앱에 들어오면 바로 체크가 되는 기본 단계임을 알 수 있음. 앱에 들어오고 바로 다음이 중요하다고 생각.8월 한달 간의 ‘screen_view’ 이벤트의 분포가 월초 → 월말로 증가하는 형태가 나타남. screen_view 이벤트가 월말이 될 수록 커짐. 다른 달도 확인해봐야함. → 스프레드 시트로 월별로 확인해보면 좋을듯??✅ Pivot 연습문제 1# 데이터 PIVOT 연습문제 1번. # orders 테이블에서 유저(user_id)별로 주문 금액(amount)의 합계를 PIVOT 해주세요. # 날짜(order_rate)를 행으로, user_id를 열으로 만들어야 합니다. # 기대하는 OUTPUT : order_date | user_1 | user_2 | user_3 SELECT order_date, SUM(IF(user_id = 1, amount, 0 )) AS user_1, SUM(IF(user_id = 2, amount, 0 )) AS user_2, SUM(IF(user_id = 3, amount, 0 )) AS user_3 FROM advanced.orders GROUP BY order_date ORDER BY order_date✅ Pivot 연습문제 2# 데이터 PIVOT 연습문제 2번. # orders 테이블에서 날짜별로 유저들의 주문 금액의 합계를 PIVOT 해주세요. # user_id 를 행으로, order_date를 열으로 만들어야합니다. # 기대하는 OUTPUT : user_id | order_date | -- `(백틱, 숫자 1 왼쪽)을 쓰면 영어 이외에도 한글도 쓸 수 있다는 점!! SELECT user_id, SUM(IF(order_date = '2023-05-01', amount, 0 )) AS `2023-05-01`, SUM(IF(order_date = '2023-05-02', amount, 0 )) AS `2023-05-02`, SUM(IF(order_date = '2023-05-03', amount, 0 )) AS `2023-05-03`, SUM(IF(order_date = '2023-05-04', amount, 0 )) AS `2023-05-04`, SUM(IF(order_date = '2023-05-05', amount, 0 )) AS `2023-05-05` FROM advanced.orders GROUP BY user_id ORDER BY user_id✅ Pivot 연습문제 3# 데이터 PIVOT 연습문제 3번. # orders 테이블에서 사용자별, 날짜별로 주문이 있다면 1, 없다면 0으로 PIVOT 해주세요. # user_id를 행으로, order_date를 열로 만들고 주문을 많이 해도 1로 처리 합니다. -- AMOUNT 대신에 1로 표시!! SELECT user_id, MAX(IF(order_date = '2023-05-01',1,0)) AS `2023-05-01`, MAX(IF(order_date = '2023-05-02',1,0)) AS `2023-05-02`, MAX(IF(order_date = '2023-05-03',1,0)) AS `2023-05-03`, MAX(IF(order_date = '2023-05-04',1,0)) AS `2023-05-04`, MAX(IF(order_date = '2023-05-05',1,0)) AS `2023-05-05` FROM advanced.orders GROUP BY user_id ORDER BY user_id✅ Pivot 연습문제 4# 앱 로그 데이터 PIVOT 하기 # user_id = 32888 카트 추가하기(click_cart)를 누를 때 어떤 음식(food_id)을 담았나요? WITH base AS ( SELECT event_date, event_timestamp, event_name, user_id, user_pseudo_id, MAX(IF(event_param.key = 'firebase_screen',event_param.value.string_value, NULL)) AS firebase_screen, MAX(IF(event_param.key = 'food_id',event_param.value.int_value, NULL)) AS food_id, MAX(IF(event_param.key = 'session_id',event_param.value.string_value, NULL)) AS session_id, FROM advanced.app_logs CROSS JOIN UNNEST(event_params) AS event_param GROUP BY ALL ) SELECT user_id, event_date, COUNT(user_id) AS user_cnt, food_id FROM base WHERE user_id = 32888 and event_name = 'click_cart' GROUP BY ALL✅ 퍼널 분석 (Funnel) 연습문제# screen_view : welcome, home, food_category, cart, click_payment + step_number # 데이터의 기간 : 2022-08-01 ~ 2022-08-18 # 사용할 테이블 : 앱 로그 데이터, GA/firebase 데이터 -> UNNEST -> PIVOT # 기본이 되는 데이터프레임 만듬(base), 피벗으로 ARRAY 데이터 풀어줌. WITH base AS ( SELECT event_date, event_timestamp, event_name, user_id, user_pseudo_id, platform, event_param, MAX(IF(event_param.key = 'firebase_screen', event_param.value.string_value, NULL)) AS firebase_screen, # food_id는 null 모두 null 값으로 필요없어서 주석처리함. -- MAX(IF(event_param.key = 'food_id', event_param.value.int_value, NULL)) AS food_id, MAX(IF(event_param.key = 'session_id', event_param.value.string_value, NULL)) AS session_id FROM advanced.app_logs CROSS JOIN UNNEST(e vent_params) AS event_param WHERE event_date between '2022-08-01'and '2022-08-18' GROUP BY ALL ), fliter_event_and_concat_event_and_screen AS( SELECT * EXCEPT(event_name, firebase_screen, event_timestamp), CONCAT(event_name,'-',firebase_screen) AS event_name_with_screen, DATETIME(TIMESTAMP_MICROS(event_timestamp), 'Asia/Seoul') AS event_datetime FROM base WHERE event_name IN ('screen_view', 'click_payment') ) # step_number 만들어주는 case when 구문 select event_date, event_name_with_screen, CASE WHEN event_name_with_screen = 'screen_view-welcome' THEN 1 WHEN event_name_with_screen = 'screen_view-home' THEN 2 WHEN event_name_with_screen = 'screen_view-food_category' THEN 3 WHEN event_name_with_screen = 'screen_view-restaurant' THEN 4 WHEN event_name_with_screen = 'screen_view-cart' THEN 5 WHEN event_name_with_screen = 'click_payment-cart' THEN 6 ELSE NULL END as step_number, COUNT(distinct user_pseudo_id) AS cnt FROM fliter_event_and_concat_event_and_screen GROUP BY ALL HAVING step_number IS NOT NULL # having으로 6가지를 제외한 나머지 이벤트는 제외 ORDER BY 1 # 바로 위의 커리를 with문으로 감싸서 'CTE'라는 테이블을 만듬. # 집계한 데이터를 PIVOT. 일자별 각 이벤트 네임의 횟수 확인 가능. SELECT event_date, MAX(IF(event_name_with_screen ='screen_view-welcome',cnt,NULL )) as screen_view_welcome, MAX(IF(event_name_with_screen ='screen_view-home',cnt,NULL )) as screen_view_home, MAX(IF(event_name_with_screen ='screen_view-food_category',cnt,NULL )) as screen_view_food_category, MAX(IF(event_name_with_screen ='screen_view-restaurant',cnt,NULL )) as screen_view_restaurant, MAX(IF(event_name_with_screen ='screen_view-cart',cnt,NULL )) as screen_view_cart, MAX(IF(event_name_with_screen ='click_payment-cart',cnt,NULL )) as click_payment_cart, FROM CTE GROUP BY ALL ORDER BY 1
-
미해결BigQuery(SQL) 활용편(퍼널 분석, 리텐션 분석)
[바짝스터디 1주차 과제] ARRAY, STRUCT 연습 문제 / PIVOT 연습 문제 / 퍼널 쿼리 연습 문제
# ARRAY, STRUCT 연습 문제 1. array_exercises 테이블에서 각 영화(title) 별로 장르(genres)를 UNNEST 해서 보여주세요 SELECT title, genre FROM advanced.array_exercises CROSS JOIN UNNEST(genres) AS genre# ARRAY, STRUCT 연습 문제 2. array_exercises 테이블에서 각 영화(title)별로 배우(actor)와 배역(character)을 보여주세요. 배우와 배역은 별도의 컬럼으로 나와야 합니다. SELECT title, actor.actor, actor.character FROM advanced.array_exercises CROSS JOIN UNNEST(actors) AS actor# ARRAY, STRUCT 연습 문제 3. array_exercises 테이블에서 각 영화(title)별로 배우(actor), 배역(character), 장르(genre)를 출력하세요. 한 Row에 배우, 배역, 장르가 모두 표시되어야 합니다 SELECT title, actor.actor, actor.character, genre FROM advanced.array_exercises CROSS JOIN UNNEST(actors) AS actor, UNNEST(genres) AS genre# ARRAY, STRUCT 연습 문제 4. 앱 로그 데이터(app_logs)의 배열을 풀어주세요 SELECT user_id, event_date, event_name, user_pseudo_id, event_param.key, event_param.value.string_value, event_param.value.int_value FROM advanced.app_logs CROSS JOIN UNNEST(event_params) AS event_param# PIVOT 연습 문제 1. orders 테이블에서 유저(user_id)별로 주문 금액(amount)의 합계를 PIVOT 해주세요. 날짜(order_date)를 행(Row)으로, user_id를 열(Column)으로 만들어아 합니다 -- SELECT -- order_date, -- MAX(IF(user_id = 1, amount, 0)) AS user_1, -- MAX(IF(user_id = 2, amount, 0)) AS user_2, -- MAX(IF(user_id = 3, amount, 0)) AS user_3 -- FROM advanced.orders -- GROUP BY -- order_date -- ORDER BY -- order_date SELECT order_date, MAX(IF(user_id = 1, sum_of_amount, 0)) AS user_1, MAX(IF(user_id = 2, sum_of_amount, 0)) AS user_2, MAX(IF(user_id = 3, sum_of_amount, 0)) AS user_3 FROM ( SELECT order_date, user_id, MAX(amount) AS sum_of_amount FROM advanced.orders GROUP BY order_date, user_id ) GROUP BY order_date ORDER BY order_date# PIVOT 연습 문제 2. orders 테이블에서 날짜(order_date)별로 유저들의 주문 금액(amount)의 합계를 PIVOT 해주세요. user_id를 행(Row)으로, order_date를 열(Column)으로 만들어야 합니다 SELECT user_id, SUM(IF(order_date = '2023-05-01', amount, 0)) AS `2023-05-01`, SUM(IF(order_date = '2023-05-02', amount, 0)) AS `2023-05-02`, SUM(IF(order_date = '2023-05-03', amount, 0)) AS `2023-05-03`, SUM(IF(order_date = '2023-05-04', amount, 0)) AS `2023-05-04`, SUM(IF(order_date = '2023-05-05', amount, 0)) AS `2023-05-05` FROM advanced.orders GROUP BY user_id ORDER BY user_id# PIVOT 연습 문제 3) orders 테이블에서 사용자(user_id)별, 날짜(order_date)별로 주문이 있다면 1, 없다면 0으로 PIVOT 해주세요. user_id를 행(Row)으로, order_date를 열(Column)로 만들고 주문을 많이 해도 1로 처리합니다 SELECT user_id, MAX(IF(order_date = '2023-05-01', amount, 0)) AS `2023-05-01`, MAX(IF(order_date = '2023-05-02', amount, 0)) AS `2023-05-02`, MAX(IF(order_date = '2023-05-03', amount, 0)) AS `2023-05-03`, MAX(IF(order_date = '2023-05-04', amount, 0)) AS `2023-05-04`, MAX(IF(order_date = '2023-05-05', amount, 0)) AS `2023-05-05` FROM ( SELECT user_id, order_date, CASE WHEN amount IS NULL THEN 0 ELSE 1 END AS amount FROM advanced.orders ) GROUP BY user_id ORDER BY user_id# PIVOT 연습 문제 version 1 -- WITH sub AS ( -- SELECT -- user_id, -- event_date, -- event_name, -- event_timestamp, -- user_pseudo_id, -- event_param.key AS key, -- event_param.value.string_value AS string_value, -- event_param.value.int_value AS int_value -- FROM ( -- SELECT -- * -- FROM advanced.app_logs -- WHERE -- event_date = '2022-08-01') -- CROSS JOIN UNNEST(event_params) AS event_param -- ) -- SELECT -- event_date, -- COUNT(user_id) AS user_cnt -- FROM ( -- SELECT -- user_id, -- event_date, -- event_name, -- event_timestamp, -- user_pseudo_id, -- MAX(IF(sub.key = 'firebase_screen', sub.string_value, NULL)) AS firebase_screen, -- MAX(IF(sub.key = 'food_id', sub.int_value, NULL)) AS food_id, -- MAX(IF(sub.key = 'session_id', sub.string_value, NULL)) AS session_id -- FROM sub -- GROUP BY -- user_id, -- event_date, -- event_name, -- event_timestamp, -- user_pseudo_id -- ) -- WHERE -- event_name = 'click_cart' # AND food_id = 1544 -- GROUP BY -- event_date # PIVOT 연습 문제 version 2 WITH base AS ( SELECT -- * EXCEPT(event_params), # * EXCEPT(컬럼) : 컬럼을 제외하고 모두 다 보여줘 event_date, event_timestamp, event_name, user_id, user_pseudo_id, MAX(IF(param.key = 'firebase_screen', param.value.string_value, NULL)) AS firebase_screen, MAX(IF(param.key = 'food_id', param.value.int_value, NULL)) AS food_id, MAX(IF(param.key = 'session_id', param.value.string_value, NULL)) AS session_id, FROM advanced.app_logs CROSS JOIN UNNEST(event_params) AS param WHERE event_date = "2022-08-01" GROUP BY ALL ) SELECT event_date, COUNT(user_id) AS user_cnt FROM base WHERE event_name = 'click_cart' # AND food_id = 1544 GROUP BY event_date# 퍼널 쿼리 연습 문제 퍼널 별 전환율 WITH base AS ( SELECT event_date, event_timestamp, event_name, user_id, user_pseudo_id, platform, MAX(IF(event_param.key = 'firebase_screen', event_param.value.string_value, NULL)) AS firebase_screen, MAX(IF(event_param.key = 'session_id', event_param.value.string_value, NULL)) AS session_id FROM advanced.app_logs CROSS JOIN UNNEST(event_params) AS event_param WHERE event_date BETWEEN '2022-08-01' AND '2022-08-18' GROUP BY ALL ), filter_event_and_concat_event_and_screen AS ( SELECT * EXCEPT(event_name, firebase_screen, event_timestamp), CONCAT(event_name,'-',firebase_screen) AS event_name_with_screen, DATETIME(TIMESTAMP_MICROS(event_timestamp), 'Asia/Seoul') AS event_datetime FROM base WHERE event_name IN ('screen_view', 'click_payment') ), before_pivot AS ( SELECT event_date, event_name_with_screen, CASE WHEN event_name_with_screen = 'screen_view-welcome' THEN 1 WHEN event_name_with_screen = 'screen_view-home' THEN 2 WHEN event_name_with_screen = 'screen_view-food_category' THEN 3 WHEN event_name_with_screen = 'screen_view-restaurant' THEN 4 WHEN event_name_with_screen = 'screen_view-cart' THEN 5 WHEN event_name_with_screen = 'click_payment-cart' THEN 6 ELSE NULL END AS step_number, COUNT(DISTINCT user_pseudo_id) AS cnt FROM filter_event_and_concat_event_and_screen GROUP BY ALL HAVING step_number IS NOT NULL ORDER BY event_date ) SELECT event_date, MAX(IF(step_number = 1, cnt, NULL)) AS `screen_view-welcome`, MAX(IF(step_number = 2, cnt, NULL)) AS `screen_view-home`, MAX(IF(step_number = 3, cnt, NULL)) AS `screen_view-food_category`, MAX(IF(step_number = 4, cnt, NULL)) AS `screen_view-restaurant`, MAX(IF(step_number = 5, cnt, NULL)) AS `screen_view-cart`, MAX(IF(step_number = 6, cnt, NULL)) AS `click_payment-cart`, CAST(MAX(IF(step_number = 2, cnt, NULL)) AS FLOAT64) / CAST(MAX(IF(step_number = 1, cnt, NULL)) AS FLOAT64) AS CVR1, CAST(MAX(IF(step_number = 3, cnt, NULL)) AS FLOAT64) / CAST(MAX(IF(step_number = 2, cnt, NULL)) AS FLOAT64) AS CVR2, CAST(MAX(IF(step_number = 4, cnt, NULL)) AS FLOAT64) / CAST(MAX(IF(step_number = 3, cnt, NULL)) AS FLOAT64) AS CVR3, CAST(MAX(IF(step_number = 5, cnt, NULL)) AS FLOAT64) / CAST(MAX(IF(step_number = 4, cnt, NULL)) AS FLOAT64) AS CVR4, CAST(MAX(IF(step_number = 6, cnt, NULL)) AS FLOAT64) / CAST(MAX(IF(step_number = 5, cnt, NULL)) AS FLOAT64) AS CVR5 FROM before_pivot GROUP BY event_date ORDER BY event_date
-
미해결BigQuery(SQL) 활용편(퍼널 분석, 리텐션 분석)
[바짝스터디 1주차 과제] ARRAY, STRUCT 연습 문제 / PIVOT 연습 문제 / 퍼널 쿼리 연습 문제
ARRAY, STRUCT, UNNEST 연습 문제 array_exercises 테이블에서 각 영화(title)별로 장르(genres)를 UNNEST해서 보여주세요SELECT title, genres -- 오답: genres 배열을 그대로 선택하려 함. FROM `advanced.array_exercises` AS ae JOIN UNNEST(genres) AS genre; -- 오답: UNNEST와 잘못된 JOIN 방식 사용 SELECT title, -- 각 영화의 제목을 선택합니다. 이를 통해 각 장르가 어떤 영화에 속하는지 확인할 수 있습니다. genre -- UNNEST를 통해 평면화한 개별 장르를 선택합니다. FROM `advanced.array_exercises` AS ae -- advanced.array_exercises 테이블에서 데이터를 가져옵니다. CROSS JOIN UNNEST(genres) AS genre; -- genres 배열을 평면화하여 각 장르를 개별 행으로 만듭니다.genres라는 배열을 그대로 선택하려 했습니다.genres는 ARRAY 자료형이기 때문에 이 상태로는 각 장르를 개별적으로 볼 수 없습니다.UNNEST를 사용할 때 단순 JOIN을 사용했습니다. UNNEST를 통해 배열을 개별적인 행으로 변환할 때는 CROSS JOIN을 사용해야 합니다. 2. array_exercises 테이블에서 각 영화(title)별로 배우(actor)와 배역(character)을 보여주세요. 배우와 배역은 별도의 컬럼으로 나와야 합니다.SELECT title, actors.actor, -- 잘못된 접근 방식: 'actors'는 배열 형태이므로 직접 필드에 접근할 수 없음. actors.character -- 잘못된 접근 방식: 'actors'의 필드에 바로 접근하려고 하였음. FROM `advanced.array_exercises` AS ae CROSS JOIN UNNEST(actors) AS actor; -- 배열을 평면화하기 위해 UNNEST 사용, 하지만 이후 필드 접근 방식에서 실수가 있었음.SELECT title, -- 각 영화의 제목을 선택합니다. actor.actor, -- 배우의 이름을 선택합니다. UNNEST를 통해 평면화한 결과입니다. actor.character -- 배우가 맡은 배역을 선택합니다. UNNEST를 통해 평면화한 결과입니다. FROM `advanced.array_exercises` AS ae -- advanced.array_exercises 테이블에서 데이터를 가져옵니다. CROSS JOIN UNNEST(actors) AS actor; -- actors 배열을 평면화하여 각 배우와 배역 정보를 개별 행으로 만듭니다.actors 배열의 각 필드 (actor, character)에 접근하기 위해 단순히 배열을 호출했습니다. 그러나 배열은 UNNEST를 통해 각 요소를 개별 행으로 확장하지 않으면 필드에 접근할 수 없습니다.actors는 구조체(STRUCT)로 배열 안에 존재하기 때문에, 각 필드에 접근하려면 UNNEST 이후 명확한 필드명을 사용해야 했습니다.actor.actor: UNNEST된 배열의 actor 필드에 접근하여 배우의 이름을 가져옵니다.actor.character: UNNEST된 배열의 character 필드에 접근하여 각 배우의 배역 정보를 가져옵니다. 3. array_exercises 테이블에서 각 영화(title)별로 배우(actor), 배역(character), 장르 (genre)를 출력하세요. 한 Row에 배우, 배역, 장르가 모두 표시되어야 합니다.SELECT title, -- 각 영화의 제목을 선택합니다. 영화별 배우와 배역, 그리고 장르 정보를 확인할 수 있습니다. actor.actor AS actor, -- 배우의 이름을 선택합니다. UNNEST를 통해 평면화한 결과로 각 영화마다 배우의 이름을 가져옵니다. actor.character AS character, -- 배우가 맡은 배역을 선택합니다. UNNEST를 통해 평면화한 결과로 각 영화마다 배역 정보를 제공합니다. genre -- 영화의 장르를 선택합니다. UNNEST를 통해 평면화한 결과로 각 영화마다 해당하는 장르를 표시합니다. FROM `advanced.array_exercises` AS ae -- advanced.array_exercises 테이블에서 데이터를 가져옵니다. CROSS JOIN UNNEST(actors) AS actor -- actors 배열을 평면화하여 각 배우와 배역 정보를 개별 행으로 만듭니다. CROSS JOIN UNNEST(genres) AS genre -- genres 배열을 평면화하여 각 장르를 개별 행으로 만듭니다. WHERE actor.actor = "Chris Evans" -- 특정 배우의 이름을 기준으로 필터링합니다. 이 조건을 통해 Chris Evans가 출연한 영화만 선택합니다. AND genre = "Action" -- 특정 장르를 기준으로 필터링합니다. Action 장르의 영화만 선택합니다.UNNEST(actors) 및 UNNEST(genres): 각각의 배열을 평면화하여 필요한 정보에 접근하기 쉽게 만들어 줍니다. 평면화된 배열의 데이터를 각 필드별로 나누어 분석하고 이해하기가 쉬워집니다.AS actor, AS genre: 평면화된 결과에 별칭을 부여하여 이후 쿼리에서 이를 명확하게 사용할 수 있도록 합니다. 별칭을 사용함으로써 코드의 가독성이 향상되고, 각 필드를 참조할 때 혼동을 줄일 수 있습니다.앱 로그 데이터(app_logs)의 배열을 풀어주세요.SELECT user_id, event_date, event_name, user_pseudo_id, ae.key, ae.value.string_value, ae.value.int_value FROM `advanced.app_logs` CROSS JOIN UNNEST(event_params) AS aeWITH base AS ( SELECT event_date, event_timestamp, event_name, event_param.key AS key, -- event_params 배열의 key 값을 가져옵니다. 각 이벤트의 파라미터 키를 명확히 확인할 수 있도록 합니다. event_param.value AS value, -- event_params 배열의 value 구조체를 가져옵니다. value에는 여러 필드가 포함될 수 있으므로 전체 구조체를 가져옵니다. event_param.value.string_value AS string_value, -- value 구조체 내의 string_value 필드에 접근합니다. 문자열 형태의 파라미터 값을 직접 추출합니다. event_param.value.int_value AS int_value, -- value 구조체 내의 int_value 필드에 접근합니다. 정수 형태의 값을 명확히 추출하여 분석합니다. user_id -- 이벤트를 발생시킨 사용자 ID입니다. 사용자별 이벤트 참여를 분석하기 위해 필요합니다. FROM `advanced.app_logs` CROSS JOIN UNNEST(event_params) AS event_param -- event_params 배열을 평면화하여 각 파라미터를 독립된 행으로 만듭니다. WHERE event_date = "2022-08-01" -- 특정 날짜의 데이터를 선택합니다. 이 날짜에 발생한 이벤트만 필터링하여 분석합니다. ) SELECT event_date, -- 이벤트가 발생한 날짜를 선택합니다. 이를 통해 특정 날짜에 발생한 이벤트 정보를 확인할 수 있습니다. event_name, -- 이벤트의 이름을 선택합니다. 이벤트 종류별로 데이터를 구분하여 분석할 수 있도록 합니다. COUNT(DISTINCT user_id) AS cnt -- 중복되지 않는 사용자 수를 집계하여 이벤트별 사용자 수를 계산합니다. 각 이벤트에 참여한 고유 사용자의 수를 계산함으로써 이벤트의 인기도를 파악할 수 있습니다. FROM base GROUP BY event_date, event_name -- 날짜와 이벤트 이름으로 그룹화하여 집계합니다. 이를 통해 각 날짜의 이벤트별 참여도를 명확히 알 수 있습니다. ORDER BY cnt DESC; -- 사용자 수를 기준으로 내림차순 정렬하여 가장 많은 사용자가 참여한 이벤트를 먼저 보여줍니다. 이를 통해 인기 이벤트를 한눈에 파악할 수 있습니다. PIVOT 연습 문제 orders 테이블에서 유저(user_id)별로 주문 금액(amount)의 합계를 PIVOT해주세요. 날짜(order_date)를 행(Row)으로, user_id를 열(Column)으로 만들어야 합니다.SELECT order_date, SUM(IF(user_id = 1, amount, 0)) AS user_1, -- user_id가 1인 유저의 금액 합계를 해당 열로 표현합니다. SUM(IF(user_id = 2, amount, 0)) AS user_2, -- user_id가 2인 유저의 금액 합계를 해당 열로 표현합니다. SUM(IF(user_id = 3, amount, 0)) AS user_3 -- user_id가 3인 유저의 금액 합계를 해당 열로 표현합니다. FROM advanced.orders GROUP BY order_date -- 각 날짜(order_date)별로 그룹화하여 유저별 합계를 구합니다. ORDER BY order_date; -- 결과를 날짜 순으로 정렬합니다. orders 테이블에서 날짜(order_date)별로 유저들의 주문 금액(amount)의 합계를 PIVOT 해주세요. user_id를 행(Row)으로, order_date를 열(Column)으로 만들어야 합니다.SELECT user_id, SUM(IF(order_date = DATE '2023-05-01', amount, 0)) AS `2023-05-01`, -- 2023-05-01에 해당하는 주문 금액 합계 SUM(IF(order_date = DATE '2023-05-02', amount, 0)) AS `2023-05-02`, -- 2023-05-02에 해당하는 주문 금액 합계 SUM(IF(order_date = DATE '2023-05-03', amount, 0)) AS `2023-05-03`, -- 2023-05-03에 해당하는 주문 금액 합계 SUM(IF(order_date = DATE '2023-05-04', amount, 0)) AS `2023-05-04`, -- 2023-05-04에 해당하는 주문 금액 합계 SUM(IF(order_date = DATE '2023-05-05', amount, 0)) AS `2023-05-05` -- 2023-05-05에 해당하는 주문 금액 합계 FROM advanced.orders GROUP BY user_id -- 각 유저별로 그룹화하여 날짜별 주문 금액을 피벗 형태로 만듭니다. ORDER BY user_id; -- 결과를 user_id 순으로 정렬합니다.서브쿼리를 사용하는 대신, orders 테이블에서 직접 필요한 데이터를 집계하는 방식으로 간소화합니다. 3. orders 테이블에서 사용자(user_id)별, 날짜(order_date)별로 주문이 있다면 1, 없다면 0으로 PIVOT 해주세요. user_id를 행(Row)으로, order_date를 열(Column)로 만들고 주문을 많이 해도 1로 처리합니다.SELECT user_id, -- 각 사용자 ID를 행으로 표시합니다. MAX(IF(order_date = DATE '2023-05-01', 1, 0)) AS `2023-05-01`, -- 2023-05-01에 주문이 있으면 1, 없으면 0을 반환합니다. MAX(IF(order_date = DATE '2023-05-02', 1, 0)) AS `2023-05-02`, -- 2023-05-02에 주문이 있으면 1, 없으면 0을 반환합니다. MAX(IF(order_date = DATE '2023-05-03', 1, 0)) AS `2023-05-03`, -- 2023-05-03에 주문이 있으면 1, 없으면 0을 반환합니다. MAX(IF(order_date = DATE '2023-05-04', 1, 0)) AS `2023-05-04`, -- 2023-05-04에 주문이 있으면 1, 없으면 0을 반환합니다. MAX(IF(order_date = DATE '2023-05-05', 1, 0)) AS `2023-05-05` -- 2023-05-05에 주문이 있으면 1, 없으면 0을 반환합니다. FROM `advanced.orders` GROUP BY user_id -- 각 사용자별로 그룹화합니다. ORDER BY user_id; -- 사용자 ID 순으로 결과를 정렬하여 표시합니다.amount를 이용하여 MAX()를 사용한 부분에서 금액의 크기를 반환하게 됩니다. 하지만 이 문제에서는 금액이 아닌 주문 여부를 판별해야 합니다. 따라서 amount 대신 주문 여부를 확인하는 1이나 0을 사용해야 합니다.반환 값으로 amount를 사용해 최대값을 찾는 것은 주문이 여러 번 발생했을 때 의미가 없습니다. 우리는 단순히 주문이 있었는지 여부만을 확인하고자 하므로, 1 또는 0을 반환하는 것이 필요합니다. 주문이 한 번이라도 있었다면 1을, 없었다면 0을 반환하는 형태가 되어야 합니다.퍼널쿼리 연습문제각 퍼널의 유저 수를 집계데이터 : 2022-08-01 ~ 2022-08-18 -- Step 1: 초기 데이터 추출 - 기본 테이블(base)을 생성합니다. WITH base AS ( SELECT event_timestamp, event_date, event_name, user_id, user_pseudo_id, platform, MAX(IF(event_param.key = "firebase_screen", event_param.value.string_value, NULL)) AS firebase_screen, MAX(IF(event_param.key = "food_id", event_param.value.int_value, NULL)) AS food_id, MAX(IF(event_param.key = "session_id", event_param.value.string_value, NULL)) AS session_id FROM advanced.app_logs CROSS JOIN UNNEST(event_params) AS event_param WHERE event_date BETWEEN "2022-01-01" AND "2022-12-31" GROUP BY event_timestamp, event_date, event_name, user_id, user_pseudo_id, platform -- 필요한 필드만 명시적으로 그룹화합니다. ), -- Step 2: 이벤트 이름과 화면 이름을 결합하여 새로운 형태로 가공합니다. filter_event_and_concat_event_and_screen AS ( SELECT * EXCEPT(event_name, firebase_screen, event_timestamp), -- 특정 필드를 제외하고 나머지를 선택 CONCAT(event_name, "-", firebase_screen) AS event_name_with_screen, -- 이벤트와 화면 이름을 결합하여 새로운 열 생성 DATETIME(TIMESTAMP_MICROS(event_timestamp), 'Asia/Seoul') AS event_datetime -- 타임스탬프를 서울 시간대로 변환 FROM base WHERE event_name IN ("screen_view", "click_payment") -- 필요한 이벤트만 명확하게 필터링합니다. 필터링할 이벤트를 명확하게 정의해야 합니다. ) -- Step 3: 퍼널 단계 정의 및 각 단계별 유저 수를 집계합니다. SELECT event_name_with_screen, event_date, CASE WHEN event_name_with_screen = "screen_view-welcome" THEN 1 WHEN event_name_with_screen = "screen_view-home" THEN 2 WHEN event_name_with_screen = "screen_view-food_category" THEN 3 WHEN event_name_with_screen = "screen_view-restaurant" THEN 4 WHEN event_name_with_screen = "screen_view-cart" THEN 5 WHEN event_name_with_screen = "click_payment-cart" THEN 6 ELSE NULL END AS step_number, COUNT(DISTINCT user_pseudo_id) AS cnt -- 중복 유저를 제거하기 위해 DISTINCT를 사용합니다. FROM filter_event_and_concat_event_and_screen GROUP BY event_name_with_screen, event_date, step_number -- 필요한 열만 명시적으로 그룹화합니다. HAVING step_number IS NOT NULL ORDER BY event_date;
-
미해결BigQuery(SQL) 활용편(퍼널 분석, 리텐션 분석)
빠짝스터디 1주차 과제
(사진)