Q&A
A community of 1.43 million members!! Join the discussion.
-
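Unresolved | BigQuery (SQL) Applied Course (Funnel Analysis, Retention Analysis)
[빠짝스터디 Week 1 Assignment] ARRAY, STRUCT Practice Problems / PIVOT Practice Problems / Funnel Query Practice Problems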
1. ARRAY, STRUCT practice problems

# 1. From the array_exercises table, UNNEST the genres for each movie (title).
# Goal of the query, metrics to check: title, genre
# Calculation method: UNNEST
# Data period: X
# Table to use: advanced.array_exercises
# JOIN KEY: X
# Data characteristics: genres is an array
SELECT
  title,
  genre
FROM advanced.array_exercises
CROSS JOIN UNNEST(genres) AS genre
ORDER BY title

# 2. From the array_exercises table, show the actor and the character for each movie (title). The actor and the character must appear in separate columns.
# Goal of the query, metrics to check: title, actors.actor, actors.character
# Calculation method: UNNEST
# Data period: X
# Table to use: advanced.array_exercises
# JOIN KEY: X
# Data characteristics: actors is an array of structs with the fields actor and character
SELECT
  title,
  a.actor,
  a.character
FROM advanced.array_exercises
CROSS JOIN UNNEST(actors) AS a
ORDER BY title

# 3. From the array_exercises table, output the actor, character, and genre for each movie (title). The actor, character, and genre must all appear in a single row.
# Goal of the query, metrics to check: title, actors.actor, actors.character, genres
# Calculation method: UNNEST
# Data period: X
# Table to use: advanced.array_exercises
# JOIN KEY: X
# Data characteristics:
-- actors is an array of structs with the fields actor and character
-- genres is an array
SELECT
  title,
  a.actor,
  a.character,
  genre
FROM advanced.array_exercises
CROSS JOIN UNNEST(actors) AS a
CROSS JOIN UNNEST(genres) AS genre
ORDER BY title

# 4. Flatten the arrays in the app log data (app_logs).
# Goal of the query, metrics to check: flattened app_logs data
# Calculation method: UNNEST
# Data period: X
# Table to use: advanced.app_logs
# JOIN KEY: X
# Data characteristics:
-- event_params is an array of structs with the fields key and value
-- value is a struct with the fields string_value and int_value
SELECT
  user_id,
  event_date,
  event_name,
  user_pseudo_id,
  ep.key,
  ep.value.string_value,
  ep.value.int_value
FROM advanced.app_logs
CROSS JOIN UNNEST(event_params) AS ep

2. PIVOT practice problems

# 1. From the orders table, pivot the sum of order amounts (amount) per user (user_id). The date (order_date) must be the rows and user_id the columns.
# Goal of the query, metrics to check: amount pivoted by user_id
# Calculation method: PIVOT
# Data period: X
# Table to use: advanced.orders
# JOIN KEY: X
# Data characteristics: the source table stores order_date, user_id, and amount as columns
SELECT
  order_date,
  SUM(IF(user_id = 1, amount, 0)) AS user_1,
  SUM(IF(user_id = 2, amount, 0)) AS user_2,
  SUM(IF(user_id = 3, amount, 0)) AS user_3
FROM advanced.orders
GROUP BY order_date
ORDER BY order_date

# 2. From the orders table, pivot the sum of the users' order amounts (amount) by date (order_date). user_id must be the rows and order_date the columns.
# Goal of the query, metrics to check: SUM(amount) by order_date
# Calculation method: PIVOT
# Data period: X
# Table to use: advanced.orders
# JOIN KEY: X
# Data characteristics: the source table stores order_date, user_id, and amount as columns
SELECT
  user_id,
  SUM(IF(order_date = '2023-05-01', amount, 0)) AS `2023-05-01`,
  SUM(IF(order_date = '2023-05-02', amount, 0)) AS `2023-05-02`,
  SUM(IF(order_date = '2023-05-03', amount, 0)) AS `2023-05-03`,
  SUM(IF(order_date = '2023-05-04', amount, 0)) AS `2023-05-04`,
  SUM(IF(order_date = '2023-05-05', amount, 0)) AS `2023-05-05`
FROM advanced.orders
GROUP BY user_id
ORDER BY user_id

# 3. From the orders table, pivot to 1 if an order exists and 0 if not, per user (user_id) and per date (order_date). user_id must be the rows and order_date the columns, and multiple orders still count as 1.
# Goal of the query, metrics to check: whether each user_id ordered on each order_date
# Calculation method: PIVOT
# Data period: X
# Table to use: advanced.orders
# JOIN KEY: X
# Data characteristics: the source table stores order_date and user_id as columns
SELECT
  user_id,
  MAX(IF(order_date = '2023-05-01', 1, 0)) AS `2023-05-01`,
  MAX(IF(order_date = '2023-05-02', 1, 0)) AS `2023-05-02`,
  MAX(IF(order_date = '2023-05-03', 1, 0)) AS `2023-05-03`,
  MAX(IF(order_date = '2023-05-04', 1, 0)) AS `2023-05-04`,
  MAX(IF(order_date = '2023-05-05', 1, 0)) AS `2023-05-05`
FROM advanced.orders
GROUP BY user_id
ORDER BY user_id

# 4. Which food (food_id) did user_id = 32888 put in the cart when clicking add to cart (click_cart)?
# Goal of the query, metrics to check: the food_id at each click_cart event of user_id = 32888
# Calculation method: PIVOT
# Data period: X
# Table to use: advanced.app_logs
# JOIN KEY: X
# Data characteristics:
-- event_params -> key/value -> string_value/int_value
-- key: firebase_screen/food_id/session_id
SELECT
  user_id,
  event_date,
  event_name,
  event_timestamp,
  user_pseudo_id,
  MAX(IF(key = 'firebase_screen', string_value, NULL)) AS firebase_screen,
  MAX(IF(key = 'food_id', int_value, NULL)) AS food_id,
  MAX(IF(key = 'session_id', string_value, NULL)) AS session_id
FROM (
  SELECT
    user_id,
    event_date,
    event_name,
    event_timestamp,
    user_pseudo_id,
    event_param.key AS key,
    event_param.value.string_value AS string_value,
    event_param.value.int_value AS int_value
  FROM advanced.app_logs
  CROSS JOIN UNNEST(event_params) AS event_param
  WHERE user_id = 32888
    AND event_name = 'click_cart'
)
GROUP BY ALL
ORDER BY event_date

3. Funnel query practice problems

# 1. Count users per funnel step
# Goal of the query, metrics to check: number of users per funnel step
# Calculation method: UNNEST -> filter on event_name/string_value and event_date -> COUNT of DISTINCT user_pseudo_id
# Data period: 2022-08-01 ~ 2022-08-18
# Table to use: advanced.app_logs
# JOIN KEY: X
# Data characteristics:
-- the screen name is the event_params.value.string_value whose event_params.key = 'firebase_screen'
SELECT
  CONCAT(event_name, '-', string_value) AS event_name_with_screen,
  CASE
    WHEN event_name = 'screen_view' AND string_value = 'welcome' THEN 1
    WHEN event_name = 'screen_view' AND string_value = 'home' THEN 2
    WHEN event_name = 'screen_view' AND string_value = 'food_category' THEN 3
    WHEN event_name = 'screen_view' AND string_value = 'restaurant' THEN 4
    WHEN event_name = 'screen_view' AND string_value = 'cart' THEN 5
    WHEN event_name = 'click_payment' AND string_value = 'cart' THEN 6
  END AS step_number,
  COUNT(DISTINCT user_pseudo_id) AS cnt
FROM (
  SELECT
    event_date,
    event_name,
    event_param.value.string_value AS string_value,
    user_pseudo_id
  FROM advanced.app_logs
  CROSS JOIN UNNEST(event_params) AS event_param
  WHERE event_date BETWEEN '2022-08-01' AND '2022-08-18'
    AND event_param.key = 'firebase_screen'
    AND (
      (event_name = 'screen_view' AND event_param.value.string_value IN ('welcome', 'home', 'food_category', 'restaurant', 'cart'))
      OR (event_name = 'click_payment' AND event_param.value.string_value = 'cart')
    )
)
GROUP BY ALL
ORDER BY step_number

# 2. Count users per funnel step, by date
# Goal of the query, metrics to check: number of users per funnel step per date
# Calculation method: UNNEST -> filter on event_name/string_value and event_date -> COUNT of DISTINCT user_pseudo_id
# Data period: 2022-08-01 ~ 2022-08-18
# Table to use: advanced.app_logs
# JOIN KEY: X
# Data characteristics:
-- the screen name is the event_params.value.string_value whose event_params.key = 'firebase_screen'
SELECT
  event_date,
  CONCAT(event_name, '-', string_value) AS event_name_with_screen,
  CASE
    WHEN event_name = 'screen_view' AND string_value = 'welcome' THEN 1
    WHEN event_name = 'screen_view' AND string_value = 'home' THEN 2
    WHEN event_name = 'screen_view' AND string_value = 'food_category' THEN 3
    WHEN event_name = 'screen_view' AND string_value = 'restaurant' THEN 4
    WHEN event_name = 'screen_view' AND string_value = 'cart' THEN 5
    WHEN event_name = 'click_payment' AND string_value = 'cart' THEN 6
  END AS step_number,
  COUNT(DISTINCT user_pseudo_id) AS cnt
FROM (
  SELECT
    event_date,
    event_name,
    event_param.value.string_value AS string_value,
    user_pseudo_id
  FROM advanced.app_logs
  CROSS JOIN UNNEST(event_params) AS event_param
  WHERE event_date BETWEEN '2022-08-01' AND '2022-08-18'
    AND event_param.key = 'firebase_screen'
    AND (
      (event_name = 'screen_view' AND event_param.value.string_value IN ('welcome', 'home', 'food_category', 'restaurant', 'cart'))
      OR (event_name = 'click_payment' AND event_param.value.string_value = 'cart')
    )
)
GROUP BY ALL
ORDER BY event_date

# 2-1. Pivot the per-date, per-step user counts
SELECT
  event_date,
  MAX(IF(event_name_with_screen = 'screen_view-welcome', cnt, NULL)) AS `screen_view-welcome`,
  MAX(IF(event_name_with_screen = 'screen_view-home', cnt, NULL)) AS `screen_view-home`,
  MAX(IF(event_name_with_screen = 'screen_view-food_category', cnt, NULL)) AS `screen_view-food_category`,
  MAX(IF(event_name_with_screen = 'screen_view-restaurant', cnt, NULL)) AS `screen_view-restaurant`,
  MAX(IF(event_name_with_screen = 'screen_view-cart', cnt, NULL)) AS `screen_view-cart`,
  MAX(IF(event_name_with_screen = 'click_payment-cart', cnt, NULL)) AS `click_payment-cart`
FROM (
  -- Inner query: distinct users per date and per event/screen pair
  SELECT
    event_date,
    CONCAT(event_name, '-', event_param.value.string_value) AS event_name_with_screen,
    COUNT(DISTINCT user_pseudo_id) AS cnt
  FROM advanced.app_logs
  CROSS JOIN UNNEST(event_params) AS event_param
  WHERE event_date BETWEEN '2022-08-01' AND '2022-08-18'
    AND event_param.key = 'firebase_screen'
    AND event_name IN ('screen_view', 'click_payment')
  GROUP BY ALL
)
GROUP BY ALL
ORDER BY event_date
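The funnel queries above map each event/screen pair to a step number and count distinct users per step. The following is a minimal Python sketch of that logic, not the course's solution. The `events` list and the user ids in it are made-up sample data, and `funnel_user_counts` is a hypothetical helper name.

```python
from collections import defaultdict

# Hypothetical, already-flattened events: (user_pseudo_id, event_name, firebase_screen).
events = [
    ("u1", "screen_view", "welcome"),
    ("u1", "screen_view", "home"),
    ("u1", "screen_view", "home"),     # repeat visit, must not be double counted
    ("u2", "screen_view", "welcome"),
    ("u2", "screen_view", "search"),   # not a funnel step, ignored
    ("u3", "screen_view", "home"),     # open funnel: u3 never saw "welcome"
    ("u1", "click_payment", "cart"),
]

# Same step order as the CASE WHEN expression in the queries.
STEP_NUMBER = {
    "screen_view-welcome": 1,
    "screen_view-home": 2,
    "screen_view-food_category": 3,
    "screen_view-restaurant": 4,
    "screen_view-cart": 5,
    "click_payment-cart": 6,
}


def funnel_user_counts(rows):
    """Return {step_number: distinct user count}, like COUNT(DISTINCT user_pseudo_id)."""
    users_per_step = defaultdict(set)
    for user, event_name, screen in rows:
        # CONCAT(event_name, '-', screen), then the CASE WHEN lookup.
        step = STEP_NUMBER.get(f"{event_name}-{screen}")
        if step is not None:  # rows without a step number are dropped
            users_per_step[step].add(user)
    return {step: len(users) for step, users in sorted(users_per_step.items())}


counts = funnel_user_counts(events)
print(counts)
```

Because each step is counted on its own, a user such as `u3` appears in step 2 without having passed step 1. That is the open-funnel behaviour of the SQL as well.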
-
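Unresolved | BigQuery (SQL) Applied Course (Funnel Analysis, Retention Analysis)
[빠짝스터디 Week 1 Assignment] ARRAY, STRUCT Practice Problems / PIVOT Practice Problems / Funnel Query Practice Problems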
I managed my time poorly, so I don't think I studied this as thoroughly as I should have. I will review everything carefully after the submission deadline!

ARRAY, STRUCT

CREATE OR REPLACE TABLE advanced.array_exercises AS
SELECT movie_id, title, actors, genres
FROM (
  SELECT
    1 AS movie_id,
    'Avengers: Endgame' AS title,
    ARRAY<STRUCT<actor STRING, character STRING>>[
      STRUCT('Robert Downey Jr.', 'Tony Stark'),
      STRUCT('Chris Evans', 'Steve Rogers')
    ] AS actors,
    ARRAY<STRING>['Action', 'Adventure', 'Drama'] AS genres
  UNION ALL
  SELECT
    2,
    'Inception',
    ARRAY<STRUCT<actor STRING, character STRING>>[
      STRUCT('Leonardo DiCaprio', 'Cobb'),
      STRUCT('Joseph Gordon-Levitt', 'Arthur')
    ],
    ARRAY<STRING>['Action', 'Adventure', 'Sci-Fi']
  UNION ALL
  SELECT
    3,
    'The Dark Knight',
    ARRAY<STRUCT<actor STRING, character STRING>>[
      STRUCT('Christian Bale', 'Bruce Wayne'),
      STRUCT('Heath Ledger', 'Joker')
    ],
    ARRAY<STRING>['Action', 'Crime', 'Drama']
)

From the array_exercises table, UNNEST the genres for each movie (title).

SELECT
  title,
  genre
FROM `advanced.array_exercises`
CROSS JOIN UNNEST(genres) AS genre

From the array_exercises table, show the actor and the character for each movie (title). The actor and the character must appear in separate columns.

SELECT
  title,
  actor.actor,
  actor.character
FROM `advanced.array_exercises`, UNNEST(actors) AS actor

From the array_exercises table, output the actor, character, and genre for each movie (title). The actor, character, and genre must all appear in a single row.

SELECT
  title,
  actor.actor AS actor,
  actor.character AS character,
  genre
FROM advanced.array_exercises
CROSS JOIN UNNEST(actors) AS actor
CROSS JOIN UNNEST(genres) AS genre

Flatten the arrays in the app log data (app_logs).

SELECT
  event_timestamp,
  event_name,
  event_param.key AS key,
  event_param.value.string_value AS string_value,
  event_param.value.int_value AS int_value,
  user_id
FROM `advanced.app_logs`
CROSS JOIN UNNEST(event_params) AS event_param

PIVOT

From the orders table, pivot the sum of order amounts (amount) per user (user_id). The date (order_date) must be the rows and user_id the columns.

SELECT
  order_date,
  SUM(IF(user_id = 1, amount, 0)) AS user_1,
  SUM(IF(user_id = 2, amount, 0)) AS user_2,
  SUM(IF(user_id = 3, amount, 0)) AS user_3
FROM `advanced.orders`
GROUP BY order_date
ORDER BY order_date

From the orders table, pivot the sum of the users' order amounts (amount) by date (order_date). user_id must be the rows and order_date the columns.

SELECT
  user_id,
  SUM(IF(order_date = "2023-05-01", amount, 0)) AS `2023-05-01`,
  SUM(IF(order_date = "2023-05-02", amount, 0)) AS `2023-05-02`,
  SUM(IF(order_date = "2023-05-03", amount, 0)) AS `2023-05-03`,
  SUM(IF(order_date = "2023-05-04", amount, 0)) AS `2023-05-04`,
  SUM(IF(order_date = "2023-05-05", amount, 0)) AS `2023-05-05`
FROM `advanced.orders`
GROUP BY user_id
ORDER BY user_id

From the orders table, pivot to 1 if an order exists and 0 if not, per user (user_id) and per date (order_date). user_id must be the rows and order_date the columns, and multiple orders still count as 1.

SELECT
  user_id,
  MAX(IF(order_date = "2023-05-01", 1, 0)) AS `2023-05-01`,
  MAX(IF(order_date = "2023-05-02", 1, 0)) AS `2023-05-02`,
  MAX(IF(order_date = "2023-05-03", 1, 0)) AS `2023-05-03`,
  MAX(IF(order_date = "2023-05-04", 1, 0)) AS `2023-05-04`,
  MAX(IF(order_date = "2023-05-05", 1, 0)) AS `2023-05-05`
FROM `advanced.orders`
GROUP BY user_id

App log PIVOT

WITH base AS (
  SELECT
    event_date,
    event_timestamp,
    event_name,
    user_id,
    user_pseudo_id,
    MAX(IF(param.key = "firebase_screen", param.value.string_value, NULL)) AS firebase_screen,
    MAX(IF(param.key = "session_id", param.value.string_value, NULL)) AS session_id
  FROM `advanced.app_logs`
  CROSS JOIN UNNEST(event_params) AS param
  WHERE event_date = "2022-08-01"
  GROUP BY ALL
)
SELECT
  event_date,
  COUNT(user_id) AS user_cnt
FROM base
WHERE event_name = "click_cart"
GROUP BY event_date

Funnel

WITH base AS (
  SELECT
    event_date,
    event_timestamp,
    event_name,
    user_id,
    user_pseudo_id,
    platform,
    MAX(IF(event_param.key = "firebase_screen", event_param.value.string_value, NULL)) AS firebase_screen,
    MAX(IF(event_param.key = "session_id", event_param.value.string_value, NULL)) AS session_id
  FROM advanced.app_logs
  CROSS JOIN UNNEST(event_params) AS event_param
  WHERE event_date BETWEEN "2022-08-01" AND "2022-08-18"
  GROUP BY ALL
), filter_event_and_concat_event_and_screen AS (
  SELECT
    * EXCEPT(event_name, firebase_screen, event_timestamp),
    CONCAT(event_name, "-", firebase_screen) AS event_name_with_screen,
    DATETIME(TIMESTAMP_MICROS(event_timestamp), 'Asia/Seoul') AS event_datetime
  FROM base
  WHERE event_name IN ("screen_view", "click_payment")
)
SELECT
  event_date,
  event_name_with_screen,
  CASE
    WHEN event_name_with_screen = "screen_view-welcome" THEN 1
    WHEN event_name_with_screen = "screen_view-home" THEN 2
    WHEN event_name_with_screen = "screen_view-food_category" THEN 3
    WHEN event_name_with_screen = "screen_view-restaurant" THEN 4
    WHEN event_name_with_screen = "screen_view-cart" THEN 5
    WHEN event_name_with_screen = "click_payment-cart" THEN 6
    ELSE NULL
  END AS step_number,
  COUNT(DISTINCT user_pseudo_id) AS cnt
FROM filter_event_and_concat_event_and_screen
GROUP BY ALL
HAVING step_number IS NOT NULL
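The two date pivots in this post differ only in the aggregate: the amount pivot needs SUM, while the 0/1 order flag needs MAX. The sketch below shows why on a tiny table. It runs on SQLite through Python's `sqlite3` module, so `CASE WHEN ... END` stands in for BigQuery's `IF(...)`, and the four order rows are made-up sample data, not the course's orders table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_date TEXT, user_id INTEGER, amount INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        ("2023-05-01", 1, 100),
        ("2023-05-01", 1, 50),  # second order by the same user on the same day
        ("2023-05-02", 1, 70),
        ("2023-05-02", 2, 30),
    ],
)

# CASE WHEN ... END plays the role of BigQuery's IF(condition, a, b).
query = """
SELECT user_id,
       SUM(CASE WHEN order_date = '2023-05-01' THEN amount ELSE 0 END) AS sum_0501,
       MAX(CASE WHEN order_date = '2023-05-01' THEN amount ELSE 0 END) AS max_0501,
       MAX(CASE WHEN order_date = '2023-05-01' THEN 1 ELSE 0 END)      AS ordered_0501
FROM orders
GROUP BY user_id
ORDER BY user_id
"""
rows = conn.execute(query).fetchall()
print(rows)
```

For user 1 the SUM column adds both orders of the day, the MAX-of-amount column only keeps the larger one, and the MAX-of-flag column stays at 1 however many orders there were.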
-
Unresolved | BigQuery (SQL) Applied Course (Funnel Analysis, Retention Analysis)
[빠짝스터디 Week 1 Assignment] ARRAY, STRUCT / PIVOT / Funnel Practice Problems
1. ARRAY, STRUCT practice problems

-- 1) From the array_exercises table, UNNEST the genres for each movie (title)
SELECT
  title
  , genre
FROM advanced.array_exercises
CROSS JOIN UNNEST(genres) AS genre

-- 2) From the array_exercises table, show the actor and the character for each movie (title). The actor and the character must appear in separate columns
SELECT
  title
  , actor.actor AS actor
  , actor.character AS character
FROM advanced.array_exercises
CROSS JOIN UNNEST(actors) AS actor

-- 3) From the array_exercises table, output the actor, character, and genre for each movie (title). The actor, character, and genre must all appear in a single row
SELECT
  title
  , actor.actor AS actor
  , actor.character AS character
  , genre
FROM advanced.array_exercises
CROSS JOIN UNNEST(actors) AS actor
CROSS JOIN UNNEST(genres) AS genre

-- 4) Flatten the arrays in the app log data (app_logs)
SELECT
  user_id
  , event_date
  , event_name
  , user_pseudo_id
  , params.key AS key
  , params.value.string_value AS str_value
  , params.value.int_value AS int_value
FROM advanced.app_logs
CROSS JOIN UNNEST(event_params) AS params
WHERE event_date = '2022-08-01'

2. PIVOT practice problems

-- 1) From the orders table, pivot the sum of order amounts (amount) per user (user_id). The date (order_date) must be the rows and user_id the columns
WITH step1 AS (
  SELECT
    order_date
    , user_id
    , SUM(amount) AS sum_of_amount
  FROM advanced.orders
  GROUP BY ALL
)
SELECT
  order_date
  , MAX(IF(user_id = 1, sum_of_amount, 0)) AS user_1
  , MAX(IF(user_id = 2, sum_of_amount, 0)) AS user_2
  , MAX(IF(user_id = 3, sum_of_amount, 0)) AS user_3
FROM step1
GROUP BY order_date
ORDER BY order_date

-- 2) From the orders table, pivot the sum of the users' order amounts (amount) by date (order_date).
-- user_id must be the rows and order_date the columns
SELECT
  user_id
  , SUM(IF(order_date = '2023-05-01', amount, 0)) AS `2023-05-01`
  , SUM(IF(order_date = '2023-05-02', amount, 0)) AS `2023-05-02`
  , SUM(IF(order_date = '2023-05-03', amount, 0)) AS `2023-05-03`
  , SUM(IF(order_date = '2023-05-04', amount, 0)) AS `2023-05-04`
  , SUM(IF(order_date = '2023-05-05', amount, 0)) AS `2023-05-05`
FROM advanced.orders
GROUP BY user_id
ORDER BY user_id

-- 3) From the orders table, pivot to 1 if an order exists and 0 if not, per user (user_id) and per date (order_date). user_id must be the rows and order_date the columns, and multiple orders still count as 1
SELECT
  user_id
  , MAX(IF(order_date = '2023-05-01', 1, 0)) AS `2023-05-01`
  , MAX(IF(order_date = '2023-05-02', 1, 0)) AS `2023-05-02`
  , MAX(IF(order_date = '2023-05-03', 1, 0)) AS `2023-05-03`
  , MAX(IF(order_date = '2023-05-04', 1, 0)) AS `2023-05-04`
  , MAX(IF(order_date = '2023-05-05', 1, 0)) AS `2023-05-05`
FROM advanced.orders
GROUP BY user_id
ORDER BY user_id

-- 4) Pivot the arrays in the app log data
SELECT
  user_id
  , event_date
  , event_name
  , user_pseudo_id
  , MAX(IF(params.key = 'firebase_screen', params.value.string_value, NULL)) AS firebase_screen
  , MAX(IF(params.key = 'food_id', params.value.int_value, NULL)) AS food_id
  , MAX(IF(params.key = 'session_id', params.value.string_value, NULL)) AS session_id
FROM advanced.app_logs
CROSS JOIN UNNEST(event_params) AS params
WHERE event_date = '2022-08-01'
GROUP BY ALL

3. Funnel analysis

WITH step1 AS (
  SELECT
    event_date
    , event_timestamp
    , event_name
    , user_id
    , user_pseudo_id
    , MAX(IF(params.key = 'firebase_screen', params.value.string_value, NULL)) AS firebase_screen
    , MAX(IF(params.key = 'session_id', params.value.string_value, NULL)) AS session_id
    , platform
  FROM advanced.app_logs
  CROSS JOIN UNNEST(event_params) AS params
  WHERE event_date BETWEEN '2022-08-01' AND '2022-08-18'
  GROUP BY ALL
), step2 AS (
  SELECT
    * EXCEPT(event_timestamp)
    , CONCAT(event_name, '-', firebase_screen) AS event_name_with_screen
    , DATETIME(TIMESTAMP_MICROS(event_timestamp), 'Asia/Seoul') AS event_datetime
  FROM step1
), step3 AS (
  SELECT
    *
    , CASE
        WHEN event_name_with_screen = 'screen_view-welcome' THEN 1
        WHEN event_name_with_screen = 'screen_view-home' THEN 2
        WHEN event_name_with_screen = 'screen_view-food_category' THEN 3
        WHEN event_name_with_screen = 'screen_view-restaurant' THEN 4
        WHEN event_name_with_screen = 'screen_view-cart' THEN 5
        WHEN event_name_with_screen = 'click_payment-cart' THEN 6
      END AS step_number
  FROM step2
), step3_1 AS (
  -- 1) Count users per funnel step
  SELECT
    event_name_with_screen
    , step_number
    , COUNT(DISTINCT user_pseudo_id) AS cnt
  FROM step3
  GROUP BY ALL
  HAVING step_number IS NOT NULL
  ORDER BY step_number
), step3_2 AS (
  -- 2) Count users per funnel step, by date
  SELECT
    event_date
    , event_name_with_screen
    , step_number
    , COUNT(DISTINCT user_pseudo_id) AS cnt
  FROM step3
  GROUP BY ALL
  HAVING step_number IS NOT NULL
  ORDER BY event_date, step_number
)
-- 3) Pivot the data from 2)
SELECT
  event_date
  , MAX(IF(event_name_with_screen = 'screen_view-welcome', cnt, NULL)) AS `screen_view-welcome`
  , MAX(IF(event_name_with_screen = 'screen_view-home', cnt, NULL)) AS `screen_view-home`
  , MAX(IF(event_name_with_screen = 'screen_view-food_category', cnt, NULL)) AS `screen_view-food_category`
  , MAX(IF(event_name_with_screen = 'screen_view-restaurant', cnt, NULL)) AS `screen_view-restaurant`
  , MAX(IF(event_name_with_screen = 'screen_view-cart', cnt, NULL)) AS `screen_view-cart`
  , MAX(IF(event_name_with_screen = 'click_payment-cart', cnt, NULL)) AS `click_payment-cart`
FROM step3_2
GROUP BY event_date
ORDER BY event_date
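The final SELECT of this post turns the long per-date, per-step counts into one row per date. As a rough Python model of that `MAX(IF(step = ..., cnt, NULL)) ... GROUP BY event_date` pattern, under the assumption that the input is already aggregated like `step3_2`, the sketch below uses made-up counts. `pivot_by_date` is a hypothetical helper name.

```python
# Hypothetical output of the "per-date count per funnel step" query (long format).
long_rows = [
    ("2022-08-01", "screen_view-welcome", 120),
    ("2022-08-01", "screen_view-home", 90),
    ("2022-08-02", "screen_view-welcome", 100),
    ("2022-08-02", "click_payment-cart", 7),
]

# Output columns, in funnel order.
STEPS = [
    "screen_view-welcome",
    "screen_view-home",
    "screen_view-food_category",
    "screen_view-restaurant",
    "screen_view-cart",
    "click_payment-cart",
]


def pivot_by_date(rows):
    """Return {event_date: {step: cnt or None}}, one entry per date."""
    wide = {}
    for event_date, step, cnt in rows:
        # GROUP BY event_date: every date gets a row, all columns start as NULL (None).
        row = wide.setdefault(event_date, dict.fromkeys(STEPS))
        if step in row:
            # MAX(IF(step = ..., cnt, NULL)): keep the largest non-NULL value seen.
            row[step] = cnt if row[step] is None else max(row[step], cnt)
    return dict(sorted(wide.items()))  # ORDER BY event_date


wide = pivot_by_date(long_rows)
for event_date, columns in wide.items():
    print(event_date, columns)
```

Steps with no rows on a given date stay `None`, which mirrors the NULL cells the SQL pivot produces.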
-
Unresolved | BigQuery (SQL) Applied Course (Funnel Analysis, Retention Analysis)
[빠짝스터디 Week 1 Assignment] ARRAY, STRUCT, PIVOT, FUNNEL Practice Problems
1. ARRAY, STRUCT practice problems

From the array_exercises table, UNNEST the genres for each movie (title).

SELECT
  title,
  genre
FROM advanced.array_exercises AS ae
CROSS JOIN UNNEST(genres) AS genre

From the array_exercises table, show the actor and the character for each movie (title). The actor and the character must appear in separate columns.

SELECT
  title,
  actor.actor,
  actor.character
FROM advanced.array_exercises AS ae
CROSS JOIN UNNEST(actors) AS actor

From the array_exercises table, output the actor, character, and genre for each movie (title). The actor, character, and genre must all appear in a single row.

SELECT
  title,
  actor.actor,
  actor.character,
  genre
FROM advanced.array_exercises AS ae
  , UNNEST(actors) AS actor
  , UNNEST(genres) AS genre

Flatten the arrays in the app log data (app_logs).

SELECT
  user_id,
  event_date,
  event_name,
  user_pseudo_id,
  event_param.key AS key,
  event_param.value.string_value AS string_value,
  event_param.value.int_value AS int_value
FROM advanced.app_logs
  , UNNEST(event_params) AS event_param
-- WHERE
--   event_date = "2022-08-01"

2. PIVOT practice problems

From the orders table, pivot the sum of order amounts (amount) per user (user_id). The date (order_date) must be the rows and user_id the columns.

SELECT
  order_date,
  MAX(IF(user_id = 1, sum_of_amount, 0)) AS user_1,
  MAX(IF(user_id = 2, sum_of_amount, 0)) AS user_2,
  MAX(IF(user_id = 3, sum_of_amount, 0)) AS user_3
FROM (
  SELECT
    order_date,
    user_id,
    SUM(amount) AS sum_of_amount
  FROM advanced.orders
  GROUP BY order_date, user_id
  ORDER BY order_date
)
GROUP BY order_date
ORDER BY order_date

From the orders table, pivot the sum of the users' order amounts (amount) by date (order_date). user_id must be the rows and order_date the columns.

SELECT
  user_id,
  SUM(IF(order_date = '2023-05-01', amount, 0)) AS `2023-05-01`,
  SUM(IF(order_date = '2023-05-02', amount, 0)) AS `2023-05-02`,
  SUM(IF(order_date = '2023-05-03', amount, 0)) AS `2023-05-03`,
  SUM(IF(order_date = '2023-05-04', amount, 0)) AS `2023-05-04`,
  SUM(IF(order_date = '2023-05-05', amount, 0)) AS `2023-05-05`
FROM advanced.orders
GROUP BY user_id
ORDER BY user_id

From the orders table, output 1 if an order exists and 0 if not, per user (user_id) and per date (order_date). user_id must be the rows and order_date the columns, and multiple orders still count as 1.

SELECT
  user_id,
  MAX(IF(order_date = '2023-05-01', 1, 0)) AS `2023-05-01`,
  MAX(IF(order_date = '2023-05-02', 1, 0)) AS `2023-05-02`,
  MAX(IF(order_date = '2023-05-03', 1, 0)) AS `2023-05-03`,
  MAX(IF(order_date = '2023-05-04', 1, 0)) AS `2023-05-04`,
  MAX(IF(order_date = '2023-05-05', 1, 0)) AS `2023-05-05`
FROM advanced.orders
GROUP BY user_id
ORDER BY user_id

Which food (food_id) did user_id = 32888 put in the cart when clicking add to cart (click_cart)?

WITH base AS (
  SELECT
    event_date,
    event_timestamp,
    event_name,
    user_id,
    user_pseudo_id,
    MAX(IF(event_param.key = 'firebase_screen', event_param.value.string_value, NULL)) AS firebase_screen,
    MAX(IF(event_param.key = 'food_id', event_param.value.int_value, NULL)) AS food_id,
    MAX(IF(event_param.key = 'session_id', event_param.value.string_value, NULL)) AS session_id
  FROM advanced.app_logs
    , UNNEST(event_params) AS event_param
  GROUP BY ALL
)
SELECT
  food_id
FROM base
WHERE user_id = 32888
  AND event_name = 'click_cart'
GROUP BY food_id

3. Funnel analysis practice problems

Count the users at each funnel step

WITH base AS (
  SELECT
    event_date,
    event_timestamp,
    event_name,
    user_id,
    user_pseudo_id,
    platform,
    MAX(IF(event_param.key = "firebase_screen", event_param.value.string_value, NULL)) AS firebase_screen,
    MAX(IF(event_param.key = "food_id", event_param.value.int_value, NULL)) AS food_id,
    MAX(IF(event_param.key = "session_id", event_param.value.string_value, NULL)) AS session_id
  FROM `advanced.app_logs`
  CROSS JOIN UNNEST(event_params) AS event_param
  GROUP BY ALL
), filter_event_and_concat_event_and_screen AS (
  SELECT
    * EXCEPT(event_name, firebase_screen),
    CONCAT(event_name, "-", firebase_screen) AS event_name_with_screen,
    DATETIME(TIMESTAMP_MICROS(event_timestamp), "Asia/Seoul") AS event_datetime
  FROM base
  WHERE event_date BETWEEN "2022-08-01" AND "2022-08-18"
    AND event_name IN ("screen_view", "click_payment")
    AND firebase_screen IN ("welcome", "home", "food_category", "restaurant", "cart")
)
SELECT
  event_date,
  event_name_with_screen,
  CASE
    WHEN event_name_with_screen = 'screen_view-welcome' THEN 1
    WHEN event_name_with_screen = 'screen_view-home' THEN 2
    WHEN event_name_with_screen = 'screen_view-food_category' THEN 3
    WHEN event_name_with_screen = 'screen_view-restaurant' THEN 4
    WHEN event_name_with_screen = 'screen_view-cart' THEN 5
    WHEN event_name_with_screen = 'click_payment-cart' THEN 6
    ELSE NULL
  END AS step_number,
  COUNT(DISTINCT user_pseudo_id) AS cnt
FROM filter_event_and_concat_event_and_screen
GROUP BY ALL
HAVING step_number IS NOT NULL
ORDER BY event_date, step_number
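The `base` CTE in this post unnests `event_params` and then folds the key/value pairs back into one column per key with `MAX(IF(key = ..., value, NULL))`. Below is a small Python sketch of that round trip. It assumes each input row is a single event, so the `GROUP BY ALL` step collapses to "one output row per input row". The sample event, including the `food_id` value and the session string, is invented for illustration, and `pivot_event_params` is a hypothetical helper name.

```python
# Hypothetical app_logs rows; event_params mimics
# ARRAY<STRUCT<key, value STRUCT<string_value, int_value>>>.
app_logs = [
    {
        "user_id": 32888,
        "event_name": "click_cart",
        "event_timestamp": 1,
        "event_params": [
            {"key": "firebase_screen", "value": {"string_value": "food_detail", "int_value": None}},
            {"key": "food_id", "value": {"string_value": None, "int_value": 1544}},
            {"key": "session_id", "value": {"string_value": "abc", "int_value": None}},
        ],
    },
]

# Which field of the value struct holds each key's data.
VALUE_FIELD = {
    "firebase_screen": "string_value",
    "food_id": "int_value",
    "session_id": "string_value",
}


def pivot_event_params(rows):
    """Flatten event_params into firebase_screen / food_id / session_id columns."""
    out = []
    for row in rows:
        flat = {name: value for name, value in row.items() if name != "event_params"}
        flat.update(dict.fromkeys(VALUE_FIELD))  # columns default to NULL (None)
        for param in row["event_params"]:        # CROSS JOIN UNNEST(event_params)
            field = VALUE_FIELD.get(param["key"])
            if field is not None:                # IF(key = ..., value.<field>, NULL)
                flat[param["key"]] = param["value"][field]
        out.append(flat)
    return out


pivoted = pivot_event_params(app_logs)
print(pivoted[0])
```

Reading a key from the wrong field of the value struct (for example `food_id` from `string_value`) would leave the column at `None`, which is the usual symptom of that mistake in the SQL too.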
-
Unresolved | BigQuery (SQL) Applied Course (Funnel Analysis, Retention Analysis)
[빠짝스터디 Week 1 Assignment] ARRAY, STRUCT / PIVOT / Funnel Practice Problems
1. ARRAY, STRUCT

-- 1. From the array_exercises table, UNNEST the genres for each movie (title).
-- Pull out genres
SELECT
  title
  , genre
FROM `advanced.array_exercises`
CROSS JOIN UNNEST(genres) AS genre

-- 2. From the array_exercises table, show the actor and the character for each movie (title).
-- Pull the movie's actors out of the structs inside the array
SELECT
  title
  , ac.actor
  , ac.character
FROM `advanced.array_exercises`
CROSS JOIN UNNEST(actors) AS ac -- actor.actor also works

-- 3. From the array_exercises table, output the actor, character, and genre for each movie (title)
-- Pull out actors and genres separately (two joins)
SELECT -- title, actor, character, genre
  title
  , ac.actor
  , ac.character
  , genre
FROM `advanced.array_exercises`
CROSS JOIN UNNEST(actors) AS ac
CROSS JOIN UNNEST(genres) AS genre

-- 4. Flatten the arrays in the app log data (app_logs).
-- Pull out event_params
SELECT
  user_id
  , event_date
  , event_name
  , user_pseudo_id
  , event_param.key
  , event_param.value.string_value
  , event_param.value.int_value
FROM `advanced.app_logs`
CROSS JOIN UNNEST(event_params) AS event_param
-- WHERE event_date = '2022-08-01' -- test

2. PIVOT

-- 1. From the orders table, pivot the sum of order amounts (amount) per user (user_id).
-- order_date as rows, user_id as columns, sum(amount)
SELECT
  order_date
  , SUM(IF(user_id = 1, amount, 0)) AS user_1
  , SUM(IF(user_id = 2, amount, 0)) AS user_2
  , SUM(IF(user_id = 3, amount, 0)) AS user_3
FROM `advanced.orders`
GROUP BY 1
ORDER BY 1

-- 2. From the orders table, pivot the sum of the users' order amounts (amount) by date (order_date)
-- user_id as rows, order_date as columns, sum(amount); date aliases containing '-' need backticks (``)
SELECT
  user_id
  , SUM(IF(order_date = '2023-05-01', amount, 0)) AS `2023-05-01`
  , SUM(IF(order_date = '2023-05-02', amount, 0)) AS `2023-05-02`
  , SUM(IF(order_date = '2023-05-03', amount, 0)) AS `2023-05-03`
  , SUM(IF(order_date = '2023-05-04', amount, 0)) AS `2023-05-04`
  , SUM(IF(order_date = '2023-05-05', amount, 0)) AS `2023-05-05`
FROM `advanced.orders`
GROUP BY user_id
ORDER BY user_id

-- 3. From the orders table, pivot to 1 if an order exists and 0 if not, per user and per date
-- user_id as rows, order_date as columns, if(date, 1, 0)
SELECT
  user_id
  , MAX(IF(order_date = '2023-05-01', 1, 0)) AS `2023-05-01`
  , MAX(IF(order_date = '2023-05-02', 1, 0)) AS `2023-05-02`
  , MAX(IF(order_date = '2023-05-03', 1, 0)) AS `2023-05-03`
  , MAX(IF(order_date = '2023-05-04', 1, 0)) AS `2023-05-04`
  , MAX(IF(order_date = '2023-05-05', 1, 0)) AS `2023-05-05`
FROM `advanced.orders`
GROUP BY user_id
ORDER BY user_id

3. Funnel

3-1. Users per funnel step

-- 1. Users per funnel step: 2022-08-01 ~ 2022-08-18, open funnel, COUNT(DISTINCT user_pseudo_id)
WITH funnels AS (
  -- Filter rows/columns, UNNEST event_params, merge event name + screen into one string column
  SELECT
    CONCAT(event_name, '-', event_param.value.string_value) AS event_name_with_screen
    , COUNT(DISTINCT user_pseudo_id) AS cnt
  FROM `advanced.app_logs`
  CROSS JOIN UNNEST(event_params) AS event_param
  WHERE event_date BETWEEN '2022-08-01' AND '2022-08-18'
    AND event_name IN ('screen_view', 'click_payment')
    AND event_param.key = 'firebase_screen'
    AND event_param.value.string_value NOT IN ('food_detail', 'search', 'search_result')
  GROUP BY ALL
)
SELECT
  event_name_with_screen
  -- Build step_number
  , (CASE
      WHEN event_name_with_screen = 'screen_view-welcome' THEN 1
      WHEN event_name_with_screen = 'screen_view-home' THEN 2
      WHEN event_name_with_screen = 'screen_view-food_category' THEN 3
      WHEN event_name_with_screen = 'screen_view-restaurant' THEN 4
      WHEN event_name_with_screen = 'screen_view-cart' THEN 5
      WHEN event_name_with_screen = 'click_payment-cart' THEN 6
    END) AS step_number
  , cnt
FROM funnels
ORDER BY step_number

3-2. Users per funnel step (by date)

-- 2. Users per funnel step (by date): 2022-08-01 ~ 2022-08-18, open funnel, COUNT(DISTINCT user_pseudo_id)
WITH funnels AS (
  -- Filter rows/columns, UNNEST event_params, merge event name + screen into one string column
  SELECT
    -- By date
    event_date
    , CONCAT(event_name, '-', event_param.value.string_value) AS event_name_with_screen
    , COUNT(DISTINCT user_pseudo_id) AS cnt
  FROM `advanced.app_logs`
  CROSS JOIN UNNEST(event_params) AS event_param
  WHERE event_date BETWEEN '2022-08-01' AND '2022-08-18'
    AND event_name IN ('screen_view', 'click_payment')
    AND event_param.key = 'firebase_screen'
    AND event_param.value.string_value NOT IN ('food_detail', 'search', 'search_result')
  GROUP BY ALL
)
SELECT
  event_date
  , event_name_with_screen
  -- Build step_number
  , (CASE
      WHEN event_name_with_screen = 'screen_view-welcome' THEN 1
      WHEN event_name_with_screen = 'screen_view-home' THEN 2
      WHEN event_name_with_screen = 'screen_view-food_category' THEN 3
      WHEN event_name_with_screen = 'screen_view-restaurant' THEN 4
      WHEN event_name_with_screen = 'screen_view-cart' THEN 5
      WHEN event_name_with_screen = 'click_payment-cart' THEN 6
    END) AS step_number
  , cnt
FROM funnels
ORDER BY event_date, step_number -- changed sort order

3-3. Users per funnel step (by date, PIVOT)

-- 3. Pivot of the users per funnel step (by date)
WITH funnels AS (
  -- Filter rows/columns, UNNEST event_params, merge event name + screen into one string column
  SELECT
    -- By date
    event_date
    , CONCAT(event_name, '-', event_param.value.string_value) AS event_name_with_screen
    , COUNT(DISTINCT user_pseudo_id) AS cnt
  FROM `advanced.app_logs`
  CROSS JOIN UNNEST(event_params) AS event_param
  WHERE event_date BETWEEN '2022-08-01' AND '2022-08-18'
    AND event_name IN ('screen_view', 'click_payment')
    AND event_param.key = 'firebase_screen'
    AND event_param.value.string_value NOT IN ('food_detail', 'search', 'search_result')
  GROUP BY ALL
)
SELECT
  event_date
  -- Pivot the events
  , MAX(IF(event_name_with_screen = 'screen_view-welcome', cnt, NULL)) AS `screen_view-welcome`
  , MAX(IF(event_name_with_screen = 'screen_view-home', cnt, NULL)) AS `screen_view-home`
  , MAX(IF(event_name_with_screen = 'screen_view-food_category', cnt, NULL)) AS `screen_view-food_category`
  , MAX(IF(event_name_with_screen = 'screen_view-restaurant', cnt, NULL)) AS `screen_view-restaurant`
  , MAX(IF(event_name_with_screen = 'screen_view-cart', cnt, NULL)) AS `screen_view-cart`
  , MAX(IF(event_name_with_screen = 'click_payment-cart', cnt, NULL)) AS `click_payment-cart`
FROM funnels
GROUP BY ALL
ORDER BY event_date
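Problem 3 of the ARRAY, STRUCT section in this post joins the same row against two unnested arrays. A minimal Python sketch of what that double `CROSS JOIN UNNEST` produces is shown here, using the 'Inception' row from the array_exercises sample data. `unnest_actors_and_genres` is a hypothetical helper name, and the nested loop is only a model of the join, not how BigQuery executes it.

```python
from itertools import product

# One row of array_exercises, written as plain Python data.
movies = [
    {
        "title": "Inception",
        "actors": [
            {"actor": "Leonardo DiCaprio", "character": "Cobb"},
            {"actor": "Joseph Gordon-Levitt", "character": "Arthur"},
        ],
        "genres": ["Action", "Adventure", "Sci-Fi"],
    },
]


def unnest_actors_and_genres(rows):
    """Model of CROSS JOIN UNNEST(actors) CROSS JOIN UNNEST(genres)."""
    out = []
    for row in rows:
        # Every actor is paired with every genre of the same movie.
        for actor, genre in product(row["actors"], row["genres"]):
            out.append((row["title"], actor["actor"], actor["character"], genre))
    return out


flattened = unnest_actors_and_genres(movies)
for line in flattened:
    print(line)
```

Two actors and three genres give six rows for the movie. If either array were empty, the product would be empty and the movie would not appear at all, which matches the behaviour of a CROSS JOIN against an empty array.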
-
Unresolved | BigQuery (SQL) Applied Course (Funnel Analysis, Retention Analysis)
[빠짝스터디 Week 1 Assignment] ARRAY, STRUCT / PIVOT / Funnel Query Practice Problems
ARRAY, STRUCT practice problems

1)
SELECT
  title,
  genre
FROM advanced.array_exercises
CROSS JOIN UNNEST(genres) AS genre

2)
SELECT
  title,
  actors,
  actors[SAFE_OFFSET(0)].actor AS first_actor,
  actors[SAFE_OFFSET(0)].character AS first_character,
  actors[SAFE_OFFSET(1)].actor AS second_actor,
  actors[SAFE_OFFSET(1)].character AS second_character
FROM advanced.array_exercises AS ae

3)
SELECT
  title,
  actor.actor,
  actor.character
FROM advanced.array_exercises AS ae
CROSS JOIN UNNEST(actors) AS actor

4)
SELECT
  event_date,
  event_timestamp,
  event_name,
  event_params,
  user_id,
  event_param.key AS key,
  event_param.value AS value,
  event_param.value.string_value AS string_value,
  event_param.value.int_value AS int_value
FROM advanced.app_logs_temp
CROSS JOIN UNNEST(event_params) AS event_param
WHERE event_date = '2022-08-01'

PIVOT practice problems

1)
SELECT
  order_date,
  SUM(IF(user_id = 1, sum_of_amount, 0)) AS user_1,
  SUM(IF(user_id = 2, sum_of_amount, 0)) AS user_2,
  SUM(IF(user_id = 3, sum_of_amount, 0)) AS user_3
FROM (
  SELECT
    order_date,
    user_id,
    SUM(amount) AS sum_of_amount
  FROM advanced.orders
  GROUP BY order_date, user_id
  ORDER BY order_date
)
GROUP BY order_date

2)
SELECT
  user_id,
  SUM(IF(order_date = '2023-05-01', sum_of_amount, 0)) AS `2023-05-01`,
  SUM(IF(order_date = '2023-05-02', sum_of_amount, 0)) AS `2023-05-02`,
  SUM(IF(order_date = '2023-05-03', sum_of_amount, 0)) AS `2023-05-03`,
  SUM(IF(order_date = '2023-05-04', sum_of_amount, 0)) AS `2023-05-04`,
  SUM(IF(order_date = '2023-05-05', sum_of_amount, 0)) AS `2023-05-05`
FROM (
  SELECT
    user_id,
    order_date,
    SUM(amount) AS sum_of_amount
  FROM advanced.orders
  GROUP BY user_id, order_date
)
GROUP BY user_id

3)
SELECT
  user_id,
  MAX(IF(order_date = '2023-05-01' AND user_id IS NOT NULL, 1, 0)) AS `2023-05-01`,
  MAX(IF(order_date = '2023-05-02' AND user_id IS NOT NULL, 1, 0)) AS `2023-05-02`,
  MAX(IF(order_date = '2023-05-03' AND user_id IS NOT NULL, 1, 0)) AS `2023-05-03`,
  MAX(IF(order_date = '2023-05-04' AND user_id IS NOT NULL, 1, 0)) AS `2023-05-04`,
  MAX(IF(order_date = '2023-05-05' AND user_id IS NOT NULL, 1, 0)) AS `2023-05-05`
FROM advanced.orders
GROUP BY user_id

4)
SELECT
  event_date,
  event_timestamp,
  event_name,
  user_id,
  user_pseudo_id,
  MAX(IF(param.key = 'firebase_screen', param.value.string_value, NULL)) AS firebase_screen,
  MAX(IF(param.key = 'food_id', param.value.string_value, NULL)) AS food_id,
  MAX(IF(param.key = 'food_id', param.value.int_value, NULL)) AS food_id2,
  MAX(IF(param.key = 'session_id', param.value.string_value, NULL)) AS session_id
FROM advanced.app_logs_temp
CROSS JOIN UNNEST(event_params) AS param
WHERE event_date = '2022-08-01'
GROUP BY ALL

Funnel analysis

WITH Funnel AS (
  SELECT
    event_date,
    event_timestamp,
    event_name,
    user_id,
    user_pseudo_id,
    MAX(IF(param.key = 'firebase_screen', param.value.string_value, NULL)) AS firebase_screen,
    MAX(IF(param.key = 'food_id', param.value.string_value, NULL)) AS food_id,
    MAX(IF(param.key = 'food_id', param.value.int_value, NULL)) AS food_id2,
    MAX(IF(param.key = 'session_id', param.value.string_value, NULL)) AS session_id
  FROM advanced.app_logs_temp
  CROSS JOIN UNNEST(event_params) AS param
  WHERE event_date = '2022-08-01'
  GROUP BY ALL
)
SELECT
  event_date,
  COUNT(user_id) AS user_cnt
FROM Funnel
WHERE event_name = 'click_cart'
GROUP BY event_date;
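Query 2) in the ARRAY, STRUCT part of this post reads actors by position with `SAFE_OFFSET` instead of unnesting them. The Python sketch below contrasts the two shapes of output. `safe_offset`, `by_position`, and `by_unnest` are hypothetical helper names, and the single-actor movie "Solo Film" is invented to show the out-of-range case.

```python
def safe_offset(array, index):
    """Return array[index] (0-based), or None when the index is out of range."""
    return array[index] if 0 <= index < len(array) else None


def field(struct, name):
    """Field access on a possibly missing struct: a NULL struct gives a NULL field."""
    return None if struct is None else struct[name]


movies = [
    {"title": "Inception",
     "actors": [{"actor": "Leonardo DiCaprio", "character": "Cobb"},
                {"actor": "Joseph Gordon-Levitt", "character": "Arthur"}]},
    {"title": "Solo Film",  # made-up movie with a single actor
     "actors": [{"actor": "Only Actor", "character": "Lead"}]},
]


def by_position(rows):
    """One row per movie, fixed columns: like actors[SAFE_OFFSET(0)] and actors[SAFE_OFFSET(1)]."""
    return [
        (row["title"],
         field(safe_offset(row["actors"], 0), "actor"),
         field(safe_offset(row["actors"], 1), "actor"))
        for row in rows
    ]


def by_unnest(rows):
    """One row per (movie, actor): like CROSS JOIN UNNEST(actors)."""
    return [(row["title"], a["actor"], a["character"]) for row in rows for a in row["actors"]]


print(by_position(movies))
print(by_unnest(movies))
```

The positional form keeps one row per movie but needs one pair of columns per expected actor and leaves gaps as `None`. The unnest form adapts to any number of actors at the cost of repeating the title.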
-
Unresolved · BigQuery (SQL) Applied Course (Funnel Analysis, Retention Analysis)
[빠짝스터디 Week 1 Assignment] ARRAY, STRUCT / PIVOT / Funnel Query Exercises
ARRAY, STRUCT, UNNEST exercises

From the array_exercises table, UNNEST the genres (genres) for each movie (title).
SELECT title, genre
FROM `plucky-catfish-394207.advanced.array_exercises`
, unnest(genres) as genre
ORDER BY 1

From the array_exercises table, show the actor (actor) and character (character) for each movie (title). The actor and the character must be in separate columns.
SELECT title, actor.actor, actor.character
FROM `plucky-catfish-394207.advanced.array_exercises`
, unnest(actors) as actor
ORDER BY 1

From the array_exercises table, output the actor (actor), character (character), and genre (genre) for each movie (title). Each row must show the actor, character, and genre together.
SELECT title, actor.actor, actor.character, genre
FROM `plucky-catfish-394207.advanced.array_exercises`
, unnest(genres) as genre
, unnest(actors) as actor
ORDER BY 1

Flatten the arrays in the app log data (app_logs).
SELECT
  user_id, event_date, event_name, user_pseudo_id,
  ep.key, ep.value.string_value, ep.value.int_value
FROM `plucky-catfish-394207.advanced.app_logs`
, unnest(event_params) as ep
ORDER BY 2

PIVOT

From the orders table, PIVOT the total order amount (amount) per user (user_id). Make order_date the rows and user_id the columns.
SELECT
  order_date,
  SUM(IF(user_id = 1, amount, 0)) as user_1,
  SUM(IF(user_id = 2, amount, 0)) as user_2,
  SUM(IF(user_id = 3, amount, 0)) as user_3
FROM `plucky-catfish-394207.advanced.orders`
GROUP BY 1
ORDER BY 1

From the orders table, PIVOT the total order amount (amount) of users per date (order_date). Make user_id the rows and order_date the columns.
SELECT
  user_id,
  SUM(IF(order_date = '2023-05-01', amount, 0)) as `2023-05-01`,
  SUM(IF(order_date = '2023-05-02', amount, 0)) as `2023-05-02`,
  SUM(IF(order_date = '2023-05-03', amount, 0)) as `2023-05-03`,
  SUM(IF(order_date = '2023-05-04', amount, 0)) as `2023-05-04`,
  SUM(IF(order_date = '2023-05-05', amount, 0)) as `2023-05-05`
FROM `plucky-catfish-394207.advanced.orders`
GROUP BY 1
ORDER BY 1

From the orders table, PIVOT to 1 if an order exists for each user (user_id) and date (order_date), and 0 otherwise. Make user_id the rows and order_date the columns; multiple orders still count as 1.
SELECT
  user_id,
  MAX(IF(order_date = '2023-05-01' AND amount > 0, 1, 0)) as `2023-05-01`,
  MAX(IF(order_date = '2023-05-02' AND amount > 0, 1, 0)) as `2023-05-02`,
  MAX(IF(order_date = '2023-05-03' AND amount > 0, 1, 0)) as `2023-05-03`,
  MAX(IF(order_date = '2023-05-04' AND amount > 0, 1, 0)) as `2023-05-04`,
  MAX(IF(order_date = '2023-05-05' AND amount > 0, 1, 0)) as `2023-05-05`
FROM `plucky-catfish-394207.advanced.orders`
GROUP BY 1
ORDER BY 1

When user_id = 32888 clicked add-to-cart (click_cart), which food (food_id) did they add?
SELECT
  event_date, event_name, event_timestamp, user_id, user_pseudo_id,
  MAX(IF(ep.key = 'firebase_screen', ep.value.string_value, NULL)) as firebase_screen,
  MAX(IF(ep.key = 'food_id', ep.value.int_value, NULL)) as food_id,
  MAX(IF(ep.key = 'session_id', ep.value.string_value, NULL)) as session_id
FROM `plucky-catfish-394207.advanced.app_logs`
, unnest(event_params) as ep
WHERE event_date = '2022-08-01'
  AND user_id = 32888
  AND event_name = 'click_cart'
GROUP BY 1, 2, 3, 4, 5

Funnel analysis

WITH base AS (
  SELECT
    event_date, event_timestamp, event_name, user_id, user_pseudo_id, platform,
    MAX(IF(event_param.key = "firebase_screen", event_param.value.string_value, NULL)) AS firebase_screen,
    MAX(IF(event_param.key = "session_id", event_param.value.string_value, NULL)) AS session_id
  FROM advanced.app_logs
  CROSS JOIN UNNEST(event_params) AS event_param
  WHERE event_date BETWEEN "2022-08-01" AND "2022-08-18"
  GROUP BY ALL
), filter_event_and_concat_event_and_screen AS (
  SELECT
    * EXCEPT(event_name, firebase_screen, event_timestamp),
    CONCAT(event_name, "-", firebase_screen) AS event_name_with_screen,
    DATETIME(TIMESTAMP_MICROS(event_timestamp), 'Asia/Seoul') AS event_datetime
  FROM base
  WHERE event_name IN ("screen_view", "click_payment")
), add_step_number AS (
  SELECT
    event_date,
    event_name_with_screen,
    CASE
      WHEN event_name_with_screen = "screen_view-welcome" THEN 1
      WHEN event_name_with_screen = "screen_view-home" THEN 2
      WHEN event_name_with_screen = "screen_view-food_category" THEN 3
      WHEN event_name_with_screen = "screen_view-restaurant" THEN 4
      WHEN event_name_with_screen = "screen_view-cart" THEN 5
      WHEN event_name_with_screen = "click_payment-cart" THEN 6
      ELSE NULL
    END AS step_number,
    COUNT(DISTINCT user_pseudo_id) AS cnt
  FROM filter_event_and_concat_event_and_screen
  GROUP BY ALL
  HAVING step_number IS NOT NULL
)
-- Assumed final step (the post is cut off inside the CTE above): return the per-day step counts
SELECT *
FROM add_step_number
ORDER BY event_date, step_number
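Chaining two UNNESTs on the same row behaves like a cross product of the two arrays: every actor is paired with every genre. The sketch below illustrates that row multiplication in plain Python rather than BigQuery; the MOVIES sample and the function name are hypothetical, chosen only for illustration.

```python
from itertools import product

# Hypothetical sample shaped like array_exercises:
# actors is a list of {actor, character} records, genres is a list of strings.
MOVIES = [
    {
        "title": "Inception",
        "actors": [
            {"actor": "Leonardo DiCaprio", "character": "Cobb"},
            {"actor": "Joseph Gordon-Levitt", "character": "Arthur"},
        ],
        "genres": ["Action", "Adventure", "Sci-Fi"],
    },
]


def unnest_actors_and_genres(movies):
    """Emulate CROSS JOIN UNNEST(actors) followed by CROSS JOIN UNNEST(genres)."""
    rows = []
    for movie in movies:
        # Every actor is combined with every genre of the same movie.
        for actor, genre in product(movie["actors"], movie["genres"]):
            rows.append((movie["title"], actor["actor"], actor["character"], genre))
    return rows


for row in unnest_actors_and_genres(MOVIES):
    print(row)
```

Two actors and three genres give six rows for the movie. A movie whose array is empty produces no rows at all, which matches how a cross join against an empty UNNEST drops the parent row.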
-
Unresolved · BigQuery (SQL) Applied Course (Funnel Analysis, Retention Analysis)
[Inflearn BigQuery 빠짝스터디 Week 1] ARRAY, STRUCT Exercises / PIVOT Exercises / Funnel Query Exercises
ARRAY, STRUCT

-- From the array_exercises table, UNNEST the genres (genres) for each movie (title).
SELECT title, genre
FROM advanced.array_exercises
CROSS JOIN UNNEST(genres) as genre

-- From the array_exercises table, show the actor (actor) and character (character) for each movie (title). The actor and the character must be in separate columns.
SELECT title, actor.actor, actor.character
FROM `advanced.array_exercises`
CROSS JOIN UNNEST(actors) as actor

-- From the array_exercises table, output the actor (actor), character (character), and genre (genre) for each movie (title). Each row must show the actor, character, and genre together.
SELECT title, actor.actor, actor.character, genre
FROM `advanced.array_exercises`
CROSS JOIN UNNEST(actors) as actor
CROSS JOIN UNNEST(genres) as genre

-- Flatten the arrays in the app log data (app_logs)
SELECT
  user_id, event_date, event_name, user_pseudo_id,
  event_param.key,
  event_param.value.string_value,
  event_param.value.int_value
FROM advanced.app_logs
CROSS JOIN UNNEST(event_params) as event_param
limit 500

Data PIVOT

-- From the orders table, PIVOT the total order amount (amount) per user (user_id). order_date becomes the rows and user_id the columns.
SELECT
  order_date,
  SUM(IF(user_id = 1,amount,0)) AS user_1,
  SUM(IF(user_id = 2,amount,0)) AS user_2,
  SUM(IF(user_id = 3,amount,0)) AS user_3
FROM advanced.orders
GROUP BY order_date
ORDER BY order_date asc;

-- From the orders table, PIVOT the total order amount (amount) of users per date (order_date). user_id becomes the rows and order_date the columns.
SELECT
  user_id,
  sum(if(order_date = '2023-05-01',amount, 0)) AS `2023-05-01`,
  sum(if(order_date = '2023-05-02',amount, 0)) AS `2023-05-02`,
  sum(if(order_date = '2023-05-03',amount, 0)) AS `2023-05-03`,
  sum(if(order_date = '2023-05-04',amount, 0)) AS `2023-05-04`,
  sum(if(order_date = '2023-05-05',amount, 0)) AS `2023-05-05`
FROM advanced.orders
GROUP BY user_id
ORDER BY user_id ASC

-- From the orders table, pivot to 1 if an order exists for each user (user_id) and date (order_date), and 0 otherwise. user_id becomes the rows and order_date the columns; multiple orders still count as 1.
-- MAX (not SUM) keeps the value at 1 even when a user ordered several times on the same day.
SELECT
  user_id,
  max(if(order_date = '2023-05-01',1, 0)) AS `2023-05-01`,
  max(if(order_date = '2023-05-02',1, 0)) AS `2023-05-02`,
  max(if(order_date = '2023-05-03',1, 0)) AS `2023-05-03`,
  max(if(order_date = '2023-05-04',1, 0)) AS `2023-05-04`,
  max(if(order_date = '2023-05-05',1, 0)) AS `2023-05-05`
FROM advanced.orders
GROUP BY user_id
ORDER BY user_id ASC

-- PIVOT the app log array (when user_id = 32888 clicked add-to-cart (click_cart), which food (food_id) did they add?)
WITH app_pivot AS (
  SELECT
    user_id, event_date, event_name, user_pseudo_id,
    event_param.key AS key,
    event_param.value.string_value AS string_value,
    event_param.value.int_value AS int_value
  FROM advanced.app_logs
  CROSS JOIN UNNEST(event_params) as event_param
)
SELECT
  user_id, event_date, event_name, user_pseudo_id,
  MAX(IF(key = 'firebase_screen', string_value,NULL)) AS `firebase_screen`,
  MAX(IF(key = 'food_id',int_value,NULL)) AS `food_id`,
  MAX(IF(key = 'session_id',string_value,NULL)) AS `session_id`
FROM app_pivot
WHERE user_id = 32888 AND event_name = 'click_cart'
GROUP BY user_id, event_date, event_name, user_pseudo_id
ORDER BY event_date ASC;

Funnel analysis

-- Convert the per-day, per-event aggregate into PIVOT form
WITH param_pivot AS (
  SELECT
    event_date, event_timestamp, event_name, user_id, user_pseudo_id,
    MAX(IF(event_param.key = 'firebase_screen', event_param.value.string_value,NULL)) AS `firebase_screen`,
    MAX(IF(event_param.key = 'food_id',event_param.value.int_value,NULL)) AS `food_id`,
    MAX(IF(event_param.key = 'session_id',event_param.value.string_value,NULL)) AS `session_id`
  FROM advanced.app_logs
  CROSS JOIN UNNEST(event_params) AS event_param
  WHERE event_date BETWEEN '2022-08-01' AND '2022-08-18'
  GROUP BY ALL
), event_table AS (
  SELECT
    * EXCEPT(event_name,firebase_screen),
    CONCAT(event_name,'_', firebase_screen) AS event_name_with_screen
  FROM param_pivot
  WHERE event_name IN ('screen_view','click_payment')
  GROUP BY all
  ORDER BY event_date ASC
), final AS (
  SELECT
    event_date,
    event_name_with_screen,
    (CASE
      WHEN event_name_with_screen = 'screen_view_welcome' THEN 1
      WHEN event_name_with_screen = 'screen_view_home' THEN 2
      WHEN event_name_with_screen = 'screen_view_food_category' THEN 3
      WHEN event_name_with_screen = 'screen_view_restaurant' THEN 4
      WHEN event_name_with_screen = 'screen_view_cart' THEN 5
      WHEN event_name_with_screen = 'click_payment_cart' THEN 6
      ELSE NULL
    END) AS step_number,
    COUNT(DISTINCT user_pseudo_id) AS cnt
  FROM event_table
  GROUP BY event_date, event_name_with_screen
  HAVING step_number IS NOT NULL
  ORDER BY event_date
)
SELECT
  event_date,
  SUM(IF(event_name_with_screen = 'screen_view_welcome', cnt, 0)) AS `screen_view_welcome`,
  SUM(IF(event_name_with_screen = 'screen_view_home', cnt, 0)) AS `screen_view_home`,
  SUM(IF(event_name_with_screen = 'screen_view_food_category', cnt, 0)) AS `screen_view_food_category`,
  SUM(IF(event_name_with_screen = 'screen_view_restaurant', cnt, 0)) AS `screen_view_restaurant`,
  SUM(IF(event_name_with_screen = 'screen_view_cart', cnt, 0)) AS `screen_view_cart`,
  SUM(IF(event_name_with_screen = 'click_payment_cart', cnt, 0)) AS `click_payment_cart`
FROM final
GROUP BY event_date
ORDER BY event_date ASC;
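The funnel query above repeats each step label three times: in the CASE expression, in the pivot condition, and in the column alias. One way to keep them in sync is to define the labels once and generate the SQL fragments from that list. This is only a sketch of the idea in Python; FUNNEL_STEPS and both helper functions are hypothetical names, and the generated text follows the underscore-joined labels used in this post.

```python
# Funnel step labels, in order. Defined once so every generated fragment agrees.
FUNNEL_STEPS = [
    "screen_view_welcome",
    "screen_view_home",
    "screen_view_food_category",
    "screen_view_restaurant",
    "screen_view_cart",
    "click_payment_cart",
]


def step_case_expression(steps, column="event_name_with_screen"):
    """Build the CASE expression that maps each label to its step number."""
    lines = ["CASE"]
    for number, label in enumerate(steps, start=1):
        lines.append(f"  WHEN {column} = '{label}' THEN {number}")
    lines.append("  ELSE NULL")
    lines.append("END AS step_number")
    return "\n".join(lines)


def pivot_columns(steps, column="event_name_with_screen", value="cnt"):
    """Build one SUM(IF(...)) pivot column per label, aliased with the same label."""
    return ",\n".join(
        f"SUM(IF({column} = '{label}', {value}, 0)) AS `{label}`" for label in steps
    )


print(step_case_expression(FUNNEL_STEPS))
print(pivot_columns(FUNNEL_STEPS))
```

The printed fragments can be pasted into the query by hand; the point is only that a label is typed once, so the CASE branch and the pivot column cannot drift apart.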
-
Unresolved · BigQuery (SQL) Applied Course (Funnel Analysis, Retention Analysis)
[바짝스터디 Week 1 Assignment] ARRAY, STRUCT / PIVOT / Funnel Analysis Exercises
1. ARRAY, STRUCT exercises

(1) From the array_exercises table, UNNEST the genres (genres) for each movie (title).
SELECT title, genre
FROM advanced.array_exercises AS ae
CROSS JOIN UNNEST(genres) AS genre
* UNNEST(ARRAY_Column) = UNNEST(array)

(2) From the array_exercises table, show the actor (actor) and character (character) for each movie (title). The actor and the character must be in separate columns.
SELECT title, actor.actor, actor.character
FROM advanced.array_exercises AS ae
CROSS JOIN UNNEST(actors) AS actor

(3) From the array_exercises table, output the actor (actor), character (character), and genre (genre) for each movie (title). Each row must show the actor, character, and genre together.
SELECT title, actor.actor, actor.character, genre
FROM advanced.array_exercises AS ae
, UNNEST(actors) AS actor
, UNNEST(genres) AS genre
* CROSS JOIN UNNEST can be used several times in a row.

(4) Flatten the arrays in the app log data (app_logs).
SELECT
  user_id, event_date, event_name, user_pseudo_id,
  event_param.key AS key,
  event_param.value.string_value AS string_value,
  event_param.value.int_value AS int_value
FROM advanced.app_logs
, UNNEST(event_params) AS event_param
-- WHERE
--   event_date = "2022-08-01"
* When practicing, filter on the partition column.

2. PIVOT exercises

(1) From the orders table, PIVOT the total order amount (amount) per user (user_id). Make order_date the rows and user_id the columns.
SELECT
  order_date,
  MAX(IF(user_id = 1, sum_of_amount, 0)) AS user_1,
  MAX(IF(user_id = 2, sum_of_amount, 0)) AS user_2,
  MAX(IF(user_id = 3, sum_of_amount, 0)) AS user_3
FROM (
  SELECT order_date, user_id, SUM(amount) AS sum_of_amount
  FROM advanced.orders
  GROUP BY order_date, user_id
  ORDER BY order_date
)
GROUP BY order_date
ORDER BY order_date
* When an aggregate function is used, as in this first solution, do not forget the GROUP BY.
* Use Ctrl+D.

(2) From the orders table, PIVOT the total order amount (amount) of users per date (order_date). Make user_id the rows and order_date the columns.
-- SUM (not MAX), because this query reads the raw orders and a user can have several orders on one date
SELECT
  user_id,
  SUM(IF(order_date = '2023-05-01', amount, 0)) AS `2023-05-01`,
  SUM(IF(order_date = '2023-05-02', amount, 0)) AS `2023-05-02`,
  SUM(IF(order_date = '2023-05-03', amount, 0)) AS `2023-05-03`,
  SUM(IF(order_date = '2023-05-04', amount, 0)) AS `2023-05-04`,
  SUM(IF(order_date = '2023-05-05', amount, 0)) AS `2023-05-05`
FROM advanced.orders
GROUP BY user_id
ORDER BY user_id

(3) From the orders table, return 1 if an order exists for each user (user_id) and date (order_date), and 0 otherwise. Make user_id the rows and order_date the columns; multiple orders still count as 1.
SELECT
  user_id,
  MAX(IF(order_date = '2023-05-01', 1, 0)) AS `2023-05-01`,
  MAX(IF(order_date = '2023-05-02', 1, 0)) AS `2023-05-02`,
  MAX(IF(order_date = '2023-05-03', 1, 0)) AS `2023-05-03`,
  MAX(IF(order_date = '2023-05-04', 1, 0)) AS `2023-05-04`,
  MAX(IF(order_date = '2023-05-05', 1, 0)) AS `2023-05-05`
FROM advanced.orders
GROUP BY user_id
ORDER BY user_id

(4) When user_id = 32888 clicked add-to-cart (click_cart), which food (food_id) did they add?
WITH base AS (
  SELECT
    event_date, event_timestamp, event_name, user_id, user_pseudo_id,
    MAX(IF(event_param.key = 'firebase_screen', event_param.value.string_value, NULL)) AS firebase_screen,
    MAX(IF(event_param.key = 'food_id', event_param.value.int_value, NULL)) AS food_id,
    MAX(IF(event_param.key = 'session_id', event_param.value.string_value, NULL)) AS session_id
  FROM advanced.app_logs
  , UNNEST(event_params) AS event_param
  GROUP BY ALL
)
SELECT food_id
FROM base
WHERE user_id = 32888 AND event_name = 'click_cart'
GROUP BY food_id

3. Funnel analysis exercises

(1) Count the users at each funnel step. Data: 2022-08-01 ~ 2022-08-18
WITH base AS (
  SELECT
    event_date, event_timestamp, event_name, user_id, user_pseudo_id, platform,
    MAX(IF(event_param.key = "firebase_screen", event_param.value.string_value, NULL)) AS firebase_screen,
    MAX(IF(event_param.key = "food_id", event_param.value.int_value, NULL)) AS food_id,
    MAX(IF(event_param.key = "session_id", event_param.value.string_value, NULL)) AS session_id
  FROM `advanced.app_logs`
  CROSS JOIN UNNEST(event_params) AS event_param
  GROUP BY ALL
), filter_event_and_concat_event_and_screen AS (
  SELECT
    * EXCEPT(event_name, firebase_screen),
    CONCAT(event_name, "-", firebase_screen) AS event_name_with_screen,
    DATETIME(TIMESTAMP_MICROS(event_timestamp), "Asia/Seoul") AS event_datetime
  FROM base
  WHERE event_date BETWEEN "2022-08-01" AND "2022-08-18"
    AND event_name IN ("screen_view", "click_payment")
    AND firebase_screen IN ("welcome", "home", "food_category", "restaurant", "cart")
)
SELECT
  event_date,
  event_name_with_screen,
  CASE
    WHEN event_name_with_screen = 'screen_view-welcome' THEN 1
    WHEN event_name_with_screen = 'screen_view-home' THEN 2
    WHEN event_name_with_screen = 'screen_view-food_category' THEN 3
    WHEN event_name_with_screen = 'screen_view-restaurant' THEN 4
    WHEN event_name_with_screen = 'screen_view-cart' THEN 5
    WHEN event_name_with_screen = 'click_payment-cart' THEN 6
    ELSE NULL
  END AS step_number,
  COUNT(DISTINCT user_pseudo_id) AS cnt
FROM filter_event_and_concat_event_and_screen
GROUP BY ALL
HAVING step_number IS NOT NULL
ORDER BY event_date, step_number
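Which aggregate to wrap around the IF matters when the pivot reads raw order rows: SUM adds up every order of the day, MAX returns only the largest single order, and MAX over a 1/0 flag answers whether any order exists. The sketch below compares the three on hypothetical sample rows, using Python's built-in sqlite3 with CASE WHEN in place of BigQuery's IF.

```python
import sqlite3

# Hypothetical sample standing in for advanced.orders: (order_date, user_id, amount).
# User 1 has two orders on the same day.
ORDERS = [
    ("2023-05-01", 1, 100),
    ("2023-05-01", 1, 50),
    ("2023-05-02", 2, 30),
]


def compare_aggregates(orders):
    """Return (user_id, total_amount, largest_order, has_order) for 2023-05-01."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (order_date TEXT, user_id INTEGER, amount INTEGER)")
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", orders)
    rows = conn.execute(
        """
        SELECT
          user_id,
          SUM(CASE WHEN order_date = '2023-05-01' THEN amount ELSE 0 END) AS total_amount,
          MAX(CASE WHEN order_date = '2023-05-01' THEN amount ELSE 0 END) AS largest_order,
          MAX(CASE WHEN order_date = '2023-05-01' THEN 1 ELSE 0 END) AS has_order
        FROM orders
        GROUP BY user_id
        ORDER BY user_id
        """
    ).fetchall()
    conn.close()
    return rows


print(compare_aggregates(ORDERS))
```

For user 1 the total is 150 while the largest single order is 100, so MAX is only a safe substitute for SUM when the rows were already aggregated to one row per user and date.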
-
Unresolved · BigQuery (SQL) Applied Course (Funnel Analysis, Retention Analysis)
[빠짝스터디 Week 1 Assignment] ARRAY, STRUCT Exercises / PIVOT Exercises / PIVOT to Make Per-Funnel Conversion Rates Easy to Compute
Exercises

CREATE OR REPLACE TABLE advanced.array_excercised AS #DDL
SELECT movie_id, title, actors, genres
FROM (
  SELECT
    1 AS movie_id,
    'Avengers: Endgame' AS title,
    ARRAY<STRUCT<actor STRING, character STRING>>[
      STRUCT('Robert Downey Jr.', 'Tony Stark'),
      STRUCT('Chris Evans', 'Steve Rogers')
    ] AS actors,
    ARRAY<STRING>['Action', 'Adventure', 'Drama'] AS genres
  UNION ALL
  SELECT
    2,
    'Inception',
    ARRAY<STRUCT<actor STRING, character STRING>>[
      STRUCT('Leonardo DiCaprio', 'Cobb'),
      STRUCT('Joseph Gordon-Levitt', 'Arthur')
    ],
    ARRAY<STRING>['Action', 'Adventure', 'Sci-Fi']
  UNION ALL
  SELECT
    3,
    'The Dark Knight',
    ARRAY<STRUCT<actor STRING, character STRING>>[
      STRUCT('Christian Bale', 'Bruce Wayne'),
      STRUCT('Heath Ledger', 'Joker')
    ],
    ARRAY<STRING>['Action', 'Crime', 'Drama']
)
→ actors is built from STRUCTs with two fields, actor and character, and two such STRUCTs are put into an ARRAY like a list.

Exercises 1 to 4 use the table above.

1. From the array_excercised table, UNNEST the genres (genres) for each movie (title).
SELECT title, genre
FROM advanced.array_excercised
CROSS JOIN UNNEST(genres) AS genre

2. From the array_excercised table, show the actor (actor) and character (character) for each movie (title). The actor and the character must be in separate columns.
SELECT title, actor.actor, actor.character
FROM advanced.array_excercised
CROSS JOIN UNNEST(actors) AS actor
Because actors is an ARRAY holding two STRUCTs, the data can also be reached with ordinary ARRAY indexing:
actors[SAFE_OFFSET(0)].actor AS first_actor

3. From the array_excercised table, output the actor (actor), character (character), and genre (genre) for each movie (title). Each row must show the actor, character, and genre together.
SELECT title, actor.actor, actor.character, genre
FROM advanced.array_excercised
CROSS JOIN UNNEST(actors) AS actor
CROSS JOIN UNNEST(genres) AS genre
* UNNEST can be used twice in a row.
* A query is evaluated in the order FROM → JOIN → SELECT. The actor produced by UNNEST is the whole STRUCT named actor, not the field actor.actor, so using it as if it were the field can raise an error saying the STRUCT cannot be accessed directly.

4. Flatten the arrays in the app log data (app_logs).
SELECT
  event_date, event_timestamp, event_name,
  event_param.key AS key,
  event_param.value AS value,
  user_id, user_pseudo_id, platform
FROM advanced.app_logs
CROSS JOIN UNNEST(event_params) AS event_param
event_params is shown as RECORD in the schema, which means STRUCT, i.e. a nested structure.

PIVOT exercises

1. From the orders table, PIVOT the total order amount (amount) per user (user_id). Make order_date the rows and user_id the columns.
SELECT
  order_date,
  SUM(IF(user_id = 1, amount, 0)) AS user_1,
  SUM(IF(user_id = 2, amount, 0)) AS user_2,
  SUM(IF(user_id = 3, amount, 0)) AS user_3
FROM advanced.orders
GROUP BY order_date
ORDER BY order_date

2. From the orders table, PIVOT the total order amount (amount) of users per date (order_date). Make user_id the rows and order_date the columns.
SELECT
  user_id,
  SUM(IF(order_date = "2023-05-01", amount, 0)) AS `2023-05-01`,
  SUM(IF(order_date = "2023-05-02", amount, 0)) AS `2023-05-02`,
  SUM(IF(order_date = "2023-05-03", amount, 0)) AS `2023-05-03`,
  SUM(IF(order_date = "2023-05-04", amount, 0)) AS `2023-05-04`,
  SUM(IF(order_date = "2023-05-05", amount, 0)) AS `2023-05-05`
FROM advanced.orders
GROUP BY user_id

3. From the orders table, PIVOT to 1 if an order exists for each user (user_id) and date (order_date), and 0 otherwise. Make user_id the rows and order_date the columns; multiple orders still count as 1.
SELECT
  user_id,
  MAX(IF(order_date = "2023-05-01", 1, 0)) AS `2023-05-01`,
  MAX(IF(order_date = "2023-05-02", 1, 0)) AS `2023-05-02`,
  MAX(IF(order_date = "2023-05-03", 1, 0)) AS `2023-05-03`,
  MAX(IF(order_date = "2023-05-04", 1, 0)) AS `2023-05-04`,
  MAX(IF(order_date = "2023-05-05", 1, 0)) AS `2023-05-05`
FROM advanced.orders
GROUP BY user_id

4. When user_id = 32888 clicked add-to-cart (click_cart), which food (food_id) did they add?
WITH base AS (
  SELECT
    #* EXCEPT(event_params),
    event_date, event_timestamp, event_name,
    event_param.key AS key,
    event_param.value.string_value AS string_value,
    event_param.value.int_value AS int_value,
    user_id, user_pseudo_id, platform
  FROM advanced.app_logs
  CROSS JOIN UNNEST(event_params) AS event_param
)
SELECT
  user_id, event_date, event_name,
  MAX(IF(key = 'firebase_screen', string_value, NULL)) AS firebase_screen,
  MAX(IF(key = 'food_id', int_value, NULL)) AS food_id,
  MAX(IF(key = 'session_id', string_value, NULL)) AS session_id
FROM base
WHERE user_id = 32888 AND event_name = 'click_cart'
GROUP BY user_id, event_date, event_name

PIVOT to make per-funnel conversion rates easy to compute

WITH base AS (
  SELECT
    event_date, event_timestamp, event_name, user_id, user_pseudo_id, platform,
    #event_param,
    MAX(IF(event_param.key = 'firebase_screen', event_param.value.string_value, NULL)) AS firebase_screen,
    #MAX(IF(event_param.key = 'food_id', event_param.value.int_value, NULL)) AS food_id,
    MAX(IF(event_param.key = 'session_id', event_param.value.string_value, NULL)) AS session_id
  FROM advanced.app_logs
  CROSS JOIN UNNEST(event_params) AS event_param
  WHERE
    #event_date = "2022-08-01" # condition kept for writing the query against a small amount of data
    event_date BETWEEN "2022-08-01" AND "2022-08-18"
  GROUP BY ALL
), filter_event_and_concat_event_and_screen AS (
  SELECT
    * EXCEPT(event_name, firebase_screen, event_timestamp),
    CONCAT(event_name, "-", firebase_screen) AS event_name_with_screen,
    DATETIME(TIMESTAMP_MICROS(event_timestamp), 'Asia/Seoul') AS event_datetime
  FROM base
  WHERE event_name IN ("screen_view", "click_payment")
), funnel_analysis AS (
  SELECT
    event_date, # users per funnel step, per day
    event_name_with_screen,
    CASE
      WHEN event_name_with_screen = "screen_view-welcome" THEN 1
      WHEN event_name_with_screen = "screen_view-home" THEN 2
      WHEN event_name_with_screen = "screen_view-food_category" THEN 3
      WHEN event_name_with_screen = "screen_view-restaurant" THEN 4
      WHEN event_name_with_screen = "screen_view-cart" THEN 5
      WHEN event_name_with_screen = "click_payment-cart" THEN 6
      ELSE NULL
    END AS step_number,
    COUNT(DISTINCT user_pseudo_id) AS cnt
  FROM filter_event_and_concat_event_and_screen
  GROUP BY ALL
  HAVING step_number IS NOT NULL
  ORDER BY event_date
)
SELECT
  event_date,
  MAX(IF (event_name_with_screen = "screen_view-welcome", cnt, NULL)) AS `screen_view-welcome`,
  MAX(IF (event_name_with_screen = "screen_view-home", cnt, NULL)) AS `screen_view-home`,
  MAX(IF (event_name_with_screen = "screen_view-food_category", cnt, NULL)) AS `screen_view-food_category`,
  MAX(IF (event_name_with_screen = "screen_view-restaurant", cnt, NULL)) AS `screen_view-restaurant`,
  MAX(IF (event_name_with_screen = "screen_view-cart", cnt, NULL)) AS `screen_view-cart`,
  MAX(IF (event_name_with_screen = "click_payment-cart", cnt, NULL)) AS `click_payment-cart`
FROM funnel_analysis
GROUP BY ALL
ORDER BY event_date
-
Unresolved · BigQuery (SQL) Applied Course (Funnel Analysis, Retention Analysis)
[빠짝스터디 Week 1 Assignment] Array, Struct, Pivot, Funnel
1. ARRAY, STRUCT

1) From the array_exercises table, UNNEST the genres (genres) for each movie (title).
SELECT
  title
  , genre
FROM advanced.array_exercises
CROSS JOIN UNNEST(genres) AS genre
;

2) From the array_exercises table, show the actor (actor) and character (character) for each movie (title). (The actor and the character must be in separate columns.)
-- Shortcut for selecting the next occurrence of the same word: cmd+d
SELECT
  title
  , actor.actor
  , actor.character
FROM advanced.array_exercises
CROSS JOIN UNNEST(actors) AS actor
;

3) From the array_exercises table, output the actor (actor), character (character), and genre (genre) for each movie (title). Each row must show the actor, character, and genre together.
SELECT
  title
  , actor.actor
  , actor.character
  , genre
FROM advanced.array_exercises
CROSS JOIN UNNEST(actors) AS actor
CROSS JOIN UNNEST(genres) AS genre
;

4) Flatten the arrays in the app log data (app_logs).
SELECT
  user_id
  , event_date
  , event_name
  , user_pseudo_id
  , event_param.key
  , event_param.value.string_value
  , event_param.value.int_value
  , platform
FROM advanced.app_logs
CROSS JOIN UNNEST(event_params) AS event_param
;

2. PIVOT

1) From the orders table, PIVOT the total order amount (amount) per user (user_id). Make order_date the rows and user_id the columns.
SELECT
  order_date
  , SUM(IF(user_id = 1, amount, NULL)) AS `user_id_1`
  , SUM(IF(user_id = 2, amount, NULL)) AS `user_id_2`
  , SUM(IF(user_id = 3, amount, NULL)) AS `user_id_3`
FROM advanced.orders
GROUP BY 1
ORDER BY 1
;

2) From the orders table, PIVOT the total order amount (amount) of users per date (order_date). Make user_id the rows and order_date the columns.
SELECT
  user_id
  , SUM(IF(order_date = '2023-05-01', amount, 0)) AS `2023-05-01`
  , SUM(IF(order_date = '2023-05-02', amount, 0)) AS `2023-05-02`
  , SUM(IF(order_date = '2023-05-03', amount, 0)) AS `2023-05-03`
  , SUM(IF(order_date = '2023-05-04', amount, 0)) AS `2023-05-04`
  , SUM(IF(order_date = '2023-05-05', amount, 0)) AS `2023-05-05`
FROM advanced.orders
GROUP BY 1
;

3) From the orders table, PIVOT to 1 if an order exists for each user (user_id) and date (order_date), and 0 otherwise. Make user_id the rows and order_date the columns; multiple orders still count as 1.
SELECT
  user_id
  , MAX(IF(order_date = '2023-05-01', 1, 0)) AS `2023-05-01`
  , MAX(IF(order_date = '2023-05-02', 1, 0)) AS `2023-05-02`
  , MAX(IF(order_date = '2023-05-03', 1, 0)) AS `2023-05-03`
  , MAX(IF(order_date = '2023-05-04', 1, 0)) AS `2023-05-04`
  , MAX(IF(order_date = '2023-05-05', 1, 0)) AS `2023-05-05`
FROM advanced.orders
GROUP BY 1
;

4) When user_id = 32888 clicked add-to-cart (click_cart), which food (food_id) did they add?
-- Grouped per event (date and timestamp included) so that several click_cart events are not collapsed into one MAX(food_id)
WITH app_logs_info AS (
  SELECT
    user_id
    , event_date
    , event_timestamp
    , event_name
    , MAX(IF(event_param.key = 'firebase_screen', event_param.value.string_value, NULL)) AS firebase_screen
    , MAX(IF(event_param.key = 'food_id', event_param.value.int_value, NULL)) AS food_id
  FROM advanced.app_logs
  CROSS JOIN UNNEST(event_params) AS event_param
  GROUP BY 1, 2, 3, 4
)
SELECT
  food_id
FROM app_logs_info
WHERE user_id = 32888
  AND event_name = 'click_cart'
;

3. Funnel

1) Aggregate per day and per event
WITH app_logs_info AS (
  SELECT
    user_id
    , event_date
    , event_timestamp
    , event_name
    , user_pseudo_id
    , MAX(IF(event_param.key = 'firebase_screen', event_param.value.string_value, NULL)) AS firebase_screen
    , MAX(IF(event_param.key = 'food_id', event_param.value.int_value, NULL)) AS food_id
  FROM advanced.app_logs
  CROSS JOIN UNNEST(event_params) AS event_param
  WHERE event_date BETWEEN '2022-08-01' AND '2022-08-18'
    AND event_name IN ('screen_view', 'click_payment')
  GROUP BY ALL
)
, add_step_number AS (
  SELECT
    event_date
    , DATETIME(TIMESTAMP_MICROS(event_timestamp), 'Asia/Seoul') AS event_date_time
    , user_id
    , user_pseudo_id
    , CONCAT(event_name, '-', firebase_screen) AS event_name_with_screen
    , CASE CONCAT(event_name, '-', firebase_screen)
        WHEN 'screen_view-welcome' THEN 1
        WHEN 'screen_view-home' THEN 2
        WHEN 'screen_view-food_category' THEN 3
        WHEN 'screen_view-restaurant' THEN 4
        WHEN 'screen_view-cart' THEN 5
        WHEN 'click_payment-cart' THEN 6
        ELSE NULL
      END AS step_number
  FROM app_logs_info
)
SELECT
  event_date
  , step_number
  , event_name_with_screen
  , COUNT(DISTINCT user_pseudo_id) AS user_cnt
FROM add_step_number
WHERE step_number IS NOT NULL
GROUP BY 1, 2, 3
ORDER BY 1, 2
;

2) PIVOT the aggregated data
WITH app_logs_info AS (
  SELECT
    user_id
    , event_date
    , event_timestamp
    , event_name
    , user_pseudo_id
    , MAX(IF(event_param.key = 'firebase_screen', event_param.value.string_value, NULL)) AS firebase_screen
    , MAX(IF(event_param.key = 'food_id', event_param.value.int_value, NULL)) AS food_id
  FROM advanced.app_logs
  CROSS JOIN UNNEST(event_params) AS event_param
  WHERE event_date BETWEEN '2022-08-01' AND '2022-08-18'
  GROUP BY ALL
)
, add_step_number AS (
  SELECT
    event_date
    , DATETIME(TIMESTAMP_MICROS(event_timestamp), 'Asia/Seoul') AS event_date_time
    , user_id
    , user_pseudo_id
    , CONCAT(event_name, '-', firebase_screen) AS event_name_with_screen
    , CASE CONCAT(event_name, '-', firebase_screen)
        WHEN 'screen_view-welcome' THEN 1
        WHEN 'screen_view-home' THEN 2
        WHEN 'screen_view-food_category' THEN 3
        WHEN 'screen_view-restaurant' THEN 4
        WHEN 'screen_view-cart' THEN 5
        WHEN 'click_payment-cart' THEN 6
        ELSE NULL
      END AS step_number
  FROM app_logs_info
  WHERE event_name IN ('screen_view', 'click_payment')
)
, agg_user_cnt AS (
  SELECT
    event_date
    , step_number
    , event_name_with_screen
    , COUNT(DISTINCT user_pseudo_id) AS user_cnt
  FROM add_step_number
  WHERE step_number IS NOT NULL
  GROUP BY 1, 2, 3
  ORDER BY 1, 2
)
SELECT
  event_date
  , MAX(IF(step_number = 1, user_cnt, NULL)) AS `screen_view-welcome`
  , MAX(IF(step_number = 2, user_cnt, NULL)) AS `screen_view-home`
  , MAX(IF(step_number = 3, user_cnt, NULL)) AS `screen_view-food_category`
  , MAX(IF(step_number = 4, user_cnt, NULL)) AS `screen_view-restaurant`
  , MAX(IF(step_number = 5, user_cnt, NULL)) AS `screen_view-cart`
  , MAX(IF(step_number = 6, user_cnt, NULL)) AS `click_payment-cart`
FROM agg_user_cnt
GROUP BY 1
ORDER BY 1
;
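When key/value parameters are pivoted back into columns with MAX(IF(...)), the GROUP BY columns decide what one output row means. If they do not identify a single event, several events collapse into one row and MAX keeps only the largest value. The sketch below shows this with Python's built-in sqlite3; the params table, its rows, and the food_id values are hypothetical.

```python
import sqlite3

# Hypothetical unnested parameter rows:
# (user_id, event_timestamp, event_name, key, int_value).
# The same user clicks click_cart twice, with food_id 10 and then 25.
PARAM_ROWS = [
    (32888, 1000, "click_cart", "food_id", 10),
    (32888, 1000, "click_cart", "session_id", None),
    (32888, 2000, "click_cart", "food_id", 25),
    (32888, 2000, "click_cart", "session_id", None),
]


def food_ids(rows, group_columns):
    """Pivot food_id with MAX(CASE ...) using the given GROUP BY column list."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE params (user_id INTEGER, event_timestamp INTEGER, "
        "event_name TEXT, key TEXT, int_value INTEGER)"
    )
    conn.executemany("INSERT INTO params VALUES (?, ?, ?, ?, ?)", rows)
    # group_columns is a trusted constant in this sketch, so plain formatting is fine.
    query = f"""
        SELECT MAX(CASE WHEN key = 'food_id' THEN int_value END) AS food_id
        FROM params
        GROUP BY {group_columns}
        ORDER BY food_id
    """
    result = [row[0] for row in conn.execute(query)]
    conn.close()
    return result


coarse = food_ids(PARAM_ROWS, "user_id, event_name")
per_event = food_ids(PARAM_ROWS, "user_id, event_timestamp, event_name")
print(coarse, per_event)
```

Grouping by user and event name alone returns a single food_id, while adding the timestamp keeps one row per click and returns both.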
-
Unresolved · BigQuery (SQL) Applied Course (Funnel Analysis, Retention Analysis)
[바짝스터디 Week 1 Assignment] ARRAY, STRUCT Exercises / PIVOT Exercises / Funnel Query Exercises
1. ARRAY, STRUCT 연습문제연습문제 1SELECT title , genre FROM `advanced.array_exercises` CROSS JOIN UNNEST(genres) AS genre 연습문제 2SELECT title , actor.actor , actor.character FROM `advanced.array_exercises` CROSS JOIN UNNEST(actors) AS actor 연습문제 3SELECT title , actor.actor AS actor , actor.character AS character , genre FROM `advanced.array_exercises` CROSS JOIN UNNEST(actors) AS actor CROSS JOIN UNNEST(genres) AS genre[메모]여기서 CROSS JOIN 다음 라인에 WHERE actor = 'Chris Evans' 이렇게 조건을 걸어줄 경우 오류가 발생한다. 오류는 실행 순서와 관련이 있다. 실행 순서: FROM -> JOIN -> SELECT따라서 SELECT 에서 알리아스로 이름 붙인 actor 가 아닌, CROSS JOIN 결과를 사용하여 조건을 만들어야 한다. => WHERE actor.actor = 'Chris Evans' 연습문제 4SELECT user_id , event_date , event_name , user_pseudo_id , parameter.key AS key , parameter.value.string_value AS string_value , parameter.value.int_value AS int_value FROM `inflearn-bigquery-437203.advanced.app_logs` CROSS JOIN UNNEST (event_params) AS parameter WHERE event_date = "2022-08-01" 2. PIVOT 연습문제연습문제 1SELECT order_date , SUM(IF(user_id = 1, amount, 0)) AS user_1 , SUM(IF(user_id = 2, amount, 0)) AS user_2 , SUM(IF(user_id = 3, amount, 0)) AS user_3 FROM `inflearn-bigquery-437203.advanced.orders` GROUP BY order_date ORDER BY order_date 연습문제 2SELECT user_id , SUM(IF(order_date= '2023-05-01', amount, 0)) AS `2023-05-01` , SUM(IF(order_date= '2023-05-02', amount, 0)) AS `2023-05-02` , SUM(IF(order_date= '2023-05-03', amount, 0)) AS `2023-05-03` , SUM(IF(order_date= '2023-05-04', amount, 0)) AS `2023-05-04` , SUM(IF(order_date= '2023-05-05', amount, 0)) AS `2023-05-05` FROM `inflearn-bigquery-437203.advanced.orders` GROUP BY user_id ORDER BY user_id[메모]알리아스로 컬럼명 지정할 때, 영어 제외하고 다른 문자열가 포함될 경우 → backtick (`) 으로 감싸준다. 
Exercise 3
SELECT
  user_id
  , MAX(IF(order_date = '2023-05-01', 1, 0)) AS `2023-05-01`
  , MAX(IF(order_date = '2023-05-02', 1, 0)) AS `2023-05-02`
  , MAX(IF(order_date = '2023-05-03', 1, 0)) AS `2023-05-03`
  , MAX(IF(order_date = '2023-05-04', 1, 0)) AS `2023-05-04`
  , MAX(IF(order_date = '2023-05-05', 1, 0)) AS `2023-05-05`
FROM `inflearn-bigquery-437203.advanced.orders`
GROUP BY user_id
ORDER BY user_id

Exercise 4
SELECT
  event_date
  , event_timestamp
  , event_name
  , user_id
  , user_pseudo_id
  , MAX(IF(params.key = 'firebase_screen', params.value.string_value, NULL)) AS firebase_screen
  , MAX(IF(params.key = 'food_id', params.value.int_value, NULL)) AS food_id
  , MAX(IF(params.key = 'session_id', params.value.string_value, NULL)) AS session_id
FROM `inflearn-bigquery-437203.advanced.app_logs`
CROSS JOIN UNNEST(event_params) AS params
WHERE event_date = '2022-08-01'
GROUP BY ALL

3. Funnel query exercises

Exercise 1: Count users at each funnel step
-- Funnel steps: 6
-- screen_view(welcome)
-- screen_view(home)
-- screen_view(food_category)
-- screen_view(restaurant)
-- screen_view(cart)
-- click_payment(cart)
WITH funnel AS (
  SELECT
    CONCAT(event_name, '-', param.value.string_value) AS event_name_with_screen
    , CASE
        WHEN event_name = 'screen_view' AND param.value.string_value = 'welcome' THEN 1
        WHEN event_name = 'screen_view' AND param.value.string_value = 'home' THEN 2
        WHEN event_name = 'screen_view' AND param.value.string_value = 'food_category' THEN 3
        WHEN event_name = 'screen_view' AND param.value.string_value = 'restaurant' THEN 4
        WHEN event_name = 'screen_view' AND param.value.string_value = 'cart' THEN 5
        WHEN event_name = 'click_payment' AND param.value.string_value = 'cart' THEN 6
      END AS step_number
    , user_pseudo_id
  FROM `inflearn-bigquery-437203.advanced.app_logs`
  CROSS JOIN UNNEST(event_params) AS param
  WHERE event_date BETWEEN '2022-08-01' AND '2022-08-18'
    AND event_name IN ('screen_view', 'click_payment')
    AND param.key = 'firebase_screen'
    AND param.value.string_value IN ('welcome', 'home', 'food_category', 'restaurant', 'cart')
)
SELECT
  event_name_with_screen
  , MAX(step_number) AS step_number
  , COUNT(DISTINCT user_pseudo_id) AS cnt
FROM funnel
GROUP BY event_name_with_screen

Exercise 2: Count users at each funnel step per day
-- Funnel steps: same 6 steps as in Exercise 1
WITH funnel AS (
  SELECT
    event_date -- date column added
    , CONCAT(event_name, '-', param.value.string_value) AS event_name_with_screen
    , CASE
        WHEN event_name = 'screen_view' AND param.value.string_value = 'welcome' THEN 1
        WHEN event_name = 'screen_view' AND param.value.string_value = 'home' THEN 2
        WHEN event_name = 'screen_view' AND param.value.string_value = 'food_category' THEN 3
        WHEN event_name = 'screen_view' AND param.value.string_value = 'restaurant' THEN 4
        WHEN event_name = 'screen_view' AND param.value.string_value = 'cart' THEN 5
        WHEN event_name = 'click_payment' AND param.value.string_value = 'cart' THEN 6
      END AS step_number
    , user_pseudo_id
  FROM `inflearn-bigquery-437203.advanced.app_logs`
  CROSS JOIN UNNEST(event_params) AS param
  WHERE event_date BETWEEN '2022-08-01' AND '2022-08-18'
    AND event_name IN ('screen_view', 'click_payment')
    AND param.key = 'firebase_screen'
    AND param.value.string_value IN ('welcome', 'home', 'food_category', 'restaurant', 'cart')
)
SELECT
  event_date -- date column added
  , event_name_with_screen
  , MAX(step_number) AS step_number
  , COUNT(DISTINCT user_pseudo_id) AS cnt
FROM funnel
GROUP BY ALL
ORDER BY event_date, step_number

Exercise 3: PIVOT the daily per-step user counts
-- Funnel steps: same 6 steps as in Exercise 1
WITH funnel AS (
  SELECT
    event_date
    , CONCAT(event_name, '-', param.value.string_value) AS event_name_with_screen
    , CASE
        WHEN event_name = 'screen_view' AND param.value.string_value = 'welcome' THEN 1
        WHEN event_name = 'screen_view' AND param.value.string_value = 'home' THEN 2
        WHEN event_name = 'screen_view' AND param.value.string_value = 'food_category' THEN 3
        WHEN event_name = 'screen_view' AND param.value.string_value = 'restaurant' THEN 4
        WHEN event_name = 'screen_view' AND param.value.string_value = 'cart' THEN 5
        WHEN event_name = 'click_payment' AND param.value.string_value = 'cart' THEN 6
      END AS step_number
    , user_pseudo_id
  FROM `inflearn-bigquery-437203.advanced.app_logs`
  CROSS JOIN UNNEST(event_params) AS param
  WHERE event_date BETWEEN '2022-08-01' AND '2022-08-18'
    AND event_name IN ('screen_view', 'click_payment')
    AND param.key = 'firebase_screen'
    AND param.value.string_value IN ('welcome', 'home', 'food_category', 'restaurant', 'cart')
),
-- Count users at each funnel step per day
funnel_daily AS (
  SELECT
    event_date
    , event_name_with_screen
    , MAX(step_number) AS step_number
    , COUNT(DISTINCT user_pseudo_id) AS cnt
  FROM funnel
  GROUP BY ALL
  ORDER BY event_date, step_number
)
-- Daily per-step user counts -> pivot
SELECT
  event_date
  , SUM(IF(event_name_with_screen = 'screen_view-welcome', cnt, 0)) AS `screen_view-welcome`
  , SUM(IF(event_name_with_screen = 'screen_view-home', cnt, 0)) AS `screen_view-home`
  , SUM(IF(event_name_with_screen = 'screen_view-food_category', cnt, 0)) AS `screen_view-food_category`
  , SUM(IF(event_name_with_screen = 'screen_view-restaurant', cnt, 0)) AS `screen_view-restaurant`
  , SUM(IF(event_name_with_screen = 'screen_view-cart', cnt, 0)) AS `screen_view-cart`
  , SUM(IF(event_name_with_screen = 'click_payment-cart', cnt, 0)) AS `click_payment-cart`
FROM funnel_daily
GROUP BY event_date
ORDER BY event_date
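As a side note to the manual SUM(IF(...)) pivot above, BigQuery also provides a PIVOT operator that can express the same reshaping. The following is only a sketch: the inline `funnel_daily` rows are made-up sample values (not real results from `app_logs`), and only two of the six steps are listed to keep it short.

```sql
-- Hypothetical sample rows standing in for the funnel_daily CTE above.
WITH funnel_daily AS (
  SELECT DATE '2022-08-01' AS event_date, 'screen_view-welcome' AS event_name_with_screen, 100 AS cnt UNION ALL
  SELECT DATE '2022-08-01', 'screen_view-home', 80 UNION ALL
  SELECT DATE '2022-08-02', 'screen_view-welcome', 120 UNION ALL
  SELECT DATE '2022-08-02', 'screen_view-home', 90
)
SELECT *
FROM funnel_daily
PIVOT (
  -- One output column per listed value; the alias becomes the column name.
  SUM(cnt) FOR event_name_with_screen IN (
    'screen_view-welcome' AS screen_view_welcome,
    'screen_view-home' AS screen_view_home
  )
)
ORDER BY event_date;
```

The list of pivot values still has to be written out by hand, so this mainly saves the repeated IF expressions. Unlike SUM(IF(..., cnt, 0)), a date with no rows for a step comes back as NULL rather than 0.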
-
Unresolved · BigQuery (SQL) Applied Course (Funnel Analysis, Retention Analysis)
[빠짝스터디 Week 1 Assignment] ARRAY, STRUCT exercises / PIVOT exercises / Funnel analysis exercises
1-4. Array, Struct exercises (1 to 4)

Exercise 1
Problem: From the array_exercises table, UNNEST the genres (genres) for each movie (title).
Query:
select
  title,
  genres2
from advanced.array_exercises
cross join unnest(genres) as genres2
;

Exercise 2
Problem: From the array_exercises table, show the actor (actor) and role (character) for each movie (title). Actor and role must appear in separate columns.
Query:
select
  title,
  actor.actor,
  actor.character
from advanced.array_exercises
cross join unnest(actors) as actor
order by movie_id
;

Exercise 3
Problem: From the array_exercises table, output the actor (actor), role (character), and genre (genre) for each movie (title). Actor, role, and genre must all appear in a single row.
Query:
select
  title,
  actor.actor,
  actor.character,
  genre  # element of genres, an array<string>
from advanced.array_exercises
cross join unnest(actors) as actor
cross join unnest(genres) as genre
order by movie_id
;

Exercise 4
Problem: Flatten the arrays in the app log data (app_logs).
Query:
select
  user_id,
  event_date,
  event_name,
  user_pseudo_id,
  event_param.key as key,
  event_param.value.string_value as string_value,
  event_param.value.int_value as int_value
from advanced.app_logs
cross join unnest(event_params) as event_param
where event_date = "2022-08-01"
limit 10
;

1-9. Writing funnel SQL queries

Exercise 1
Problem: From the orders table, PIVOT the total order amount (amount) per user (user_id). Dates (order_date) go in rows and user_id in columns.
Query:
select
  order_date,
  sum(if(user_id = 1, amount, 0)) as user_1,
  sum(if(user_id = 2, amount, 0)) as user_2,
  sum(if(user_id = 3, amount, 0)) as user_3
from advanced.orders
group by order_date
order by 1
;

Exercise 2
Problem: From the orders table, PIVOT the total order amount (amount) of users per date (order_date). user_id goes in rows and order_date in columns.
Query:
select
  user_id,
  # Use backticks (`) for column names that are not plain English identifiers
  sum(if(order_date = "2023-05-01", amount, 0)) as `2023-05-01`,
  sum(if(order_date = "2023-05-02", amount, 0)) as `2023-05-02`,
  sum(if(order_date = "2023-05-03", amount, 0)) as `2023-05-03`,
  sum(if(order_date = "2023-05-04", amount, 0)) as `2023-05-04`,
  sum(if(order_date = "2023-05-05", amount, 0)) as `2023-05-05`
from advanced.orders
group by user_id
order by 1
;

Exercise 3
Problem: From the orders table, PIVOT to 1 if an order exists and 0 if not, per user (user_id) and date (order_date). user_id goes in rows and order_date in columns; multiple orders still count as 1.
Query:
select
  user_id,
  max(if(order_date = "2023-05-01", 1, 0)) as `2023-05-01`,
  max(if(order_date = "2023-05-02", 1, 0)) as `2023-05-02`,
  max(if(order_date = "2023-05-03", 1, 0)) as `2023-05-03`,
  max(if(order_date = "2023-05-04", 1, 0)) as `2023-05-04`,
  max(if(order_date = "2023-05-05", 1, 0)) as `2023-05-05`
from advanced.orders
group by user_id
order by 1
;

Exercise 4
Problem: PIVOT the app log arrays. Which food (food_id) did user_id = 32888 add when clicking add-to-cart (click_cart)?
Query:
select
  event_date,
  event_timestamp,
  event_name,
  user_id,
  user_pseudo_id,
  max(if(param.key = 'firebase_screen', param.value.string_value, null)) as firebase_screen,
  -- max(if(param.key = 'food_id', param.value.string_value, null)) as food_id,
  max(if(param.key = 'food_id', param.value.int_value, null)) as food_id,
  max(if(param.key = 'session_id', param.value.string_value, null)) as session_id
from advanced.app_logs
cross join unnest(event_params) as param
where event_date = "2022-08-01"
  and user_id = 32888
  and event_name = "click_cart"
group by all
limit 100
;

Funnel analysis
Problem: Count per step_number; number of users per funnel step per day.
Query:
with base as (
  select
    event_date,
    event_timestamp,
    event_name,
    user_id,
    user_pseudo_id,
    max(if(event_param.key = 'firebase_screen', event_param.value.string_value, null)) as firebase_screen,
    -- max(if(event_param.key = 'food_id', event_param.value.int_value, null)) as food_id,
    max(if(event_param.key = 'session_id', event_param.value.string_value, null)) as session_id
  from advanced.app_logs
  cross join unnest(event_params) as event_param
  where 1=1
    and event_date between "2022-08-01" and "2022-08-18"
  group by all
), filter_event_and_concat_event_and_screen AS (
  -- event_name + screen
  select
    * except(event_name, firebase_screen, event_timestamp),
    concat(event_name, "-", firebase_screen) as event_name_with_screen,
    datetime(timestamp_micros(event_timestamp), "Asia/Seoul") as event_datetime
  from base
  where 1=1
    and event_name in ("screen_view", "click_payment")
)
# Number of users per funnel step per day
select
  -- distinct(event_name_with_screen)
  event_date,
  event_name_with_screen,
  case
    when event_name_with_screen = 'screen_view-welcome' then 1
    when event_name_with_screen = 'screen_view-home' then 2
    when event_name_with_screen = 'screen_view-food_category' then 3
    when event_name_with_screen = 'screen_view-restaurant' then 4
    when event_name_with_screen = 'screen_view-cart' then 5
    when event_name_with_screen = 'click_payment-cart' then 6
    else null
  end as step_number,
  count(distinct user_pseudo_id) as cnt
from filter_event_and_concat_event_and_screen
group by all
having step_number is not null
order by event_date
;
-
Unresolved · BigQuery (SQL) Applied Course (Funnel Analysis, Retention Analysis)
[빠짝스터디 Week 1 Assignment] ARRAY, STRUCT exercises / PIVOT exercises / Funnel query exercises
1. ARRAY, STRUCT exercises

Exercise 1
# 1) From the array_exercises table, UNNEST the genres (genres) for each movie (title)
SELECT
  title,
  genre
FROM `advanced.array_exercises` AS exercise
CROSS JOIN UNNEST(genres) AS genre

SELECT
  title, # column already stored in array_exercises
  genre
FROM `advanced.array_exercises` AS ae, UNNEST(genres) AS genre
# ARRAY: use when you want to store multiple values of the same type
# Flattening an ARRAY => UNNEST
# To UNNEST, use CROSS JOIN + UNNEST(ARRAY_COLUMN)
# UNNEST(ARRAY_COLUMN) AS new_name
# In the SELECT clause, use the new name; do not use the original ARRAY_COLUMN!

Exercise 2
# 2) From the array_exercises table, show the actor (actor) and role (character) for each movie (title). Actor and role must appear in separate columns
SELECT
  title,
  actor.actor,
  actor.character
-- FROM `advanced.array_exercises` AS ae, UNNEST(actors) AS actor
FROM `advanced.array_exercises` AS ae
CROSS JOIN UNNEST(actors) AS actor
# actors = [STRUCT(STRING, STRING)]

Exercise 3
# 3) From the array_exercises table, output the actor (actor), role (character), and genre (genre) for each movie (title). Actor, role, and genre must all appear in a single row
SELECT
  title,
  actor.actor,
  actor.character,
  genre
FROM `advanced.array_exercises` AS ae, UNNEST(actors) AS actor, UNNEST(genres) AS genre
-- FROM `advanced.array_exercises` AS ae
-- CROSS JOIN UNNEST(actors) AS actor
-- CROSS JOIN UNNEST(genres) AS genre
# Intent of this problem: UNNEST can be applied twice in a row.
# CROSS JOIN: the same idea as chaining two JOINs, though doing it with UNNEST may feel unfamiliar
# Some duplication of data appears; that is unavoidable with CROSS JOIN
-- FROM `advanced.array_exercises` AS ae
-- CROSS JOIN UNNEST(actors) AS actor, UNNEST(genres) AS genre

# Same query with a filter added
SELECT
  title,
  actor.actor,
  actor.character,
  genre
FROM `advanced.array_exercises` AS ae, UNNEST(actors) AS actor, UNNEST(genres) AS genre
WHERE actor.actor = 'Chris Evans'
  AND genre = 'Action'
-- WHERE actor = 'Chris Evans' (X)
# Execution order: FROM -> JOIN -> SELECT
# actors: ARRAY<STRUCT> => UNNEST => STRUCT
# genres: ARRAY<STRING> => UNNEST => STRING

Exercise 4
# 4) Flatten the arrays in the app log data (app_logs)
SELECT
  event_date,
  event_timestamp,
  event_name,
  event_param.key AS key,
  event_param.value AS value,
  event_param.value.string_value AS string_value,
  event_param.value.int_value AS int_value,
  -- event_params,
  user_id,
  event_param
FROM `advanced.app_logs`
CROSS JOIN UNNEST(event_params) AS event_param
WHERE event_date = '2022-08-01'

2. PIVOT

PIVOT exercise 1
## Subquery approach
SELECT
  order_date,
  SUM(IF(user_id = 1, sum_of_amount, NULL)) AS user_1,
  SUM(IF(user_id = 2, sum_of_amount, NULL)) AS user_2,
  SUM(IF(user_id = 3, sum_of_amount, NULL)) AS user_3
  -- MAX gives the same result here,
  -- because each group contains only one value
FROM (
  SELECT
    order_date,
    user_id,
    # sum of amount
    SUM(amount) AS sum_of_amount
  FROM advanced.orders
  GROUP BY order_date, user_id
  -- ORDER BY order_date
)
GROUP BY order_date
ORDER BY order_date

PIVOT exercise 2
# Using ANY_VALUE
-- ANY_VALUE: picks an arbitrary value from the group, excluding NULL
-- Use ANY_VALUE when the other values are NULL or when you can be sure which value will come out
SELECT
  user_id,
  ANY_VALUE(IF(order_date = PARSE_DATE('%Y-%m-%d', '2023-05-01'), amount, NULL)) AS `2023-05-01`,
  ANY_VALUE(IF(order_date = DATE('2023-05-02'), amount, NULL)) AS `2023-05-02`,
  ANY_VALUE(IF(order_date = DATE('2023-05-03'), amount, NULL)) AS `2023-05-03`,
  ANY_VALUE(IF(order_date = DATE('2023-05-04'), amount, NULL)) AS `2023-05-04`,
  ANY_VALUE(IF(order_date = DATE('2023-05-05'), amount, NULL)) AS `2023-05-05`
FROM `advanced.orders`
GROUP BY user_id
ORDER BY user_id

PIVOT exercise 3
# 3) From the orders table, PIVOT to 1 if an order exists and 0 if not, per user (user_id) and date (order_date). user_id goes in rows and order_date in columns; multiple orders still count as 1
SELECT
  user_id,
  # Output 1 instead of amount. The TRUE value of IF does not have to be a column; it can be a constant such as 1 (presence or absence)
  MAX(IF(order_date = PARSE_DATE('%Y-%m-%d', '2023-05-01'), 1, 0)) AS `2023-05-01`,
  MAX(IF(order_date = DATE('2023-05-02'), 1, 0)) AS `2023-05-02`,
  MAX(IF(order_date = DATE('2023-05-03'), 1, 0)) AS `2023-05-03`,
  MAX(IF(order_date = DATE('2023-05-04'), 1, 0)) AS `2023-05-04`,
  MAX(IF(order_date = DATE('2023-05-05'), 1, 0)) AS `2023-05-05`
FROM `advanced.orders`
GROUP BY user_id

PIVOT exercise 4: PIVOT the app log arrays
WITH base AS (
  SELECT
    event_date,
    event_name,
    user_id,
    user_pseudo_id,
    event_timestamp,
    MAX(IF(param.key = 'firebase_screen', param.value.string_value, NULL)) AS firebase_screen,
    MAX(IF(param.key = 'food_id', param.value.int_value, NULL)) AS food_id,
    MAX(IF(param.key = 'session_id', param.value.string_value, NULL)) AS session_id
  FROM `advanced.app_logs`
  CROSS JOIN UNNEST(event_params) AS param
  GROUP BY ALL
  -- WHERE event_name = 'click_cart'
  -- ORDER BY user_pseudo_id LIMIT 100  (exploration only; a LIMIT here would truncate the rows counted below)
)
SELECT
  event_date,
  COUNT(user_id) AS user_cnt
FROM base
WHERE event_name = 'click_cart'
GROUP BY event_date
ORDER BY event_date

3. Funnel analysis
-- event_name + screen (filter to only the needed events in the WHERE clause)
-- step_number + COUNT
WITH base AS (
  SELECT
    event_date,
    event_timestamp,
    event_name,
    user_id,
    user_pseudo_id,
    platform,
    -- param
    MAX(IF(param.key = 'firebase_screen', param.value.string_value, NULL)) AS firebase_screen,
    MAX(IF(param.key = 'food_id', param.value.int_value, NULL)) AS food_id,
    MAX(IF(param.key = 'session_id', param.value.string_value, NULL)) AS session_id,
  FROM advanced.app_logs
  CROSS JOIN UNNEST(event_params) AS param
  WHERE event_date = '2022-08-01'
  GROUP BY ALL
  -- LIMIT 100
), filter_event_and_concat_event_and_screen AS (
  SELECT
    * EXCEPT(event_name, firebase_screen),
    CONCAT(event_name, "-", firebase_screen) AS event_name_with_screen
  FROM base
  WHERE event_name IN ("screen_view", "click_payment")
)
SELECT *
FROM filter_event_and_concat_event_and_screen

Final result
# Started from a query that counts funnel users regardless of date => added event_date to make it daily
WITH base AS (
  SELECT
    event_date,
    event_timestamp,
    event_name,
    user_id,
    user_pseudo_id,
    platform,
    -- param
    MAX(IF(param.key = 'firebase_screen', param.value.string_value, NULL)) AS firebase_screen,
    MAX(IF(param.key = 'session_id', param.value.string_value, NULL)) AS session_id,
  FROM advanced.app_logs
  CROSS JOIN UNNEST(event_params) AS param
  WHERE
    -- event_date = '2022-08-01' # condition used to develop the query on a small amount of data
    event_date BETWEEN '2022-08-01' AND '2022-08-18'
  GROUP BY ALL
), filter_event_and_concat_event_and_screen AS (
  SELECT
    * EXCEPT(event_name, firebase_screen),
    CONCAT(event_name, "-", firebase_screen) AS event_name_with_screen,
    DATETIME(TIMESTAMP_MICROS(event_timestamp), 'Asia/Seoul') AS event_datetime
  FROM base
  WHERE event_name IN ("screen_view", "click_payment")
)
-- step_number + COUNT
-- step_number: assign numbers with CASE WHEN, the expression to use when you need several conditions
SELECT
  event_date, # number of users per funnel step per day
  event_name_with_screen,
  -- event_datetime, user_pseudo_id,
  CASE
    WHEN event_name_with_screen = 'screen_view-welcome' THEN 1
    WHEN event_name_with_screen = 'screen_view-home' THEN 2
    WHEN event_name_with_screen = 'screen_view-food_category' THEN 3
    WHEN event_name_with_screen = 'screen_view-restaurant' THEN 4
    WHEN event_name_with_screen = 'screen_view-cart' THEN 5
    WHEN event_name_with_screen = 'click_payment-cart' THEN 6
    ELSE NULL
  END AS step_number,
  COUNT(DISTINCT user_pseudo_id) AS cnt
FROM filter_event_and_concat_event_and_screen
GROUP BY ALL
HAVING step_number IS NOT NULL
ORDER BY event_date
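The note in Exercise 3 above says that chaining two UNNESTs produces some duplication. A small sketch with inline data makes the row multiplication visible. The movie, actors, and genres below are invented placeholders, not rows from `advanced.array_exercises`.

```sql
-- Hypothetical one-row table shaped like array_exercises (title, actors, genres).
WITH movies AS (
  SELECT
    'Sample Movie' AS title,
    [STRUCT('Actor A' AS actor, 'Role A' AS character),
     STRUCT('Actor B' AS actor, 'Role B' AS character)] AS actors,
    ['Action', 'Drama', 'Comedy'] AS genres
)
SELECT
  title,
  a.actor,
  a.character,
  genre
FROM movies
CROSS JOIN UNNEST(actors) AS a      -- 2 rows per movie
CROSS JOIN UNNEST(genres) AS genre  -- x 3 genres
ORDER BY a.actor, genre;
-- One movie with 2 actors and 3 genres yields 2 x 3 = 6 rows.
```

Each actor is paired with every genre, so any count taken after this join (for example, actors per movie) needs DISTINCT or should be computed before the second UNNEST.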
-
Unresolved · BigQuery (SQL) Applied Course (Funnel Analysis, Retention Analysis)
[Inflearn BigQuery 빠짝스터디 Week 1] ARRAY, STRUCT exercises / PIVOT exercises / Funnel query exercises
1. ARRAY, STRUCT

1) From the array_exercises table, UNNEST the genres (genres) for each movie (title)
SELECT
  title,
  genre
FROM advanced.array_exercises
CROSS JOIN UNNEST(genres) AS genre

2) From the array_exercises table, show the actor (actor) and role (character) for each movie (title). (Actor and role must appear in separate columns)
SELECT
  title,
  actor.actor,
  actor.character
FROM advanced.array_exercises
CROSS JOIN UNNEST(actors) AS actor

3) From the array_exercises table, output the actor (actor), role (character), and genre (genre) for each movie (title). Actor, role, and genre must all appear in a single row
SELECT
  title,
  actor.actor,
  actor.character,
  genre
FROM advanced.array_exercises
CROSS JOIN UNNEST(actors) AS actor
CROSS JOIN UNNEST(genres) AS genre

4) Flatten the arrays in the app log data (app_logs)
SELECT
  user_id,
  event_date,
  event_name,
  user_pseudo_id,
  event_param.key AS key,
  event_param.value.string_value AS string_value,
  event_param.value.int_value AS int_value
FROM advanced.app_logs
CROSS JOIN UNNEST(event_params) AS event_param
WHERE event_date = '2022-08-01'

2. PIVOT

1) From the orders table, PIVOT the total order amount (amount) per user (user_id). Dates (order_date) go in rows and user_id in columns
SELECT
  order_date,
  SUM(IF(user_id = 1, amount, 0)) AS user_1,
  SUM(IF(user_id = 2, amount, 0)) AS user_2,
  SUM(IF(user_id = 3, amount, 0)) AS user_3,
FROM advanced.orders
GROUP BY order_date
ORDER BY order_date

2) From the orders table, PIVOT the total order amount (amount) of users per date (order_date). user_id goes in rows and order_date in columns
SELECT
  user_id,
  SUM(IF(order_date = '2023-05-01', amount, 0)) AS `2023-05-01`,
  SUM(IF(order_date = '2023-05-02', amount, 0)) AS `2023-05-02`,
  SUM(IF(order_date = '2023-05-03', amount, 0)) AS `2023-05-03`,
  SUM(IF(order_date = '2023-05-04', amount, 0)) AS `2023-05-04`,
  SUM(IF(order_date = '2023-05-05', amount, 0)) AS `2023-05-05`,
FROM advanced.orders
GROUP BY user_id
ORDER BY user_id

3) From the orders table, PIVOT to 1 if an order exists and 0 if not, per user (user_id) and date (order_date). user_id goes in rows and order_date in columns; multiple orders still count as 1
SELECT
  user_id,
  MAX(IF(order_date = '2023-05-01', 1, 0)) AS `2023-05-01`,
  MAX(IF(order_date = '2023-05-02', 1, 0)) AS `2023-05-02`,
  MAX(IF(order_date = '2023-05-03', 1, 0)) AS `2023-05-03`,
  MAX(IF(order_date = '2023-05-04', 1, 0)) AS `2023-05-04`,
  MAX(IF(order_date = '2023-05-05', 1, 0)) AS `2023-05-05`,
FROM advanced.orders
GROUP BY user_id
ORDER BY user_id

4) PIVOT the app log arrays
# App log PIVOT
# Goal of the query / metric to check: which food (food_id) did user_id = 32888 add when clicking add-to-cart (click_cart)?
# How to compute: UNNEST -> PIVOT
# Data period: X
# Table: app_logs
# Join key: X
# Data characteristics:
-- event_params is an ARRAY of STRUCTs; event_params.value is a STRUCT
WITH base AS (
  SELECT
    user_id,
    event_date,
    event_name,
    user_pseudo_id,
    event_param.key AS key,
    event_param.value.string_value AS string_value,
    event_param.value.int_value AS int_value
  FROM advanced.app_logs
  CROSS JOIN UNNEST(event_params) AS event_param
)
SELECT
  user_id,
  event_date,
  event_name,
  user_pseudo_id,
  MAX(IF(key = 'firebase_screen', string_value, NULL)) AS firebase_screen,
  MAX(IF(key = 'food_id', int_value, NULL)) AS food_id,
  MAX(IF(key = 'session_id', string_value, NULL)) AS session_id
FROM base
WHERE user_id = 32888
  AND event_name = 'click_cart'
GROUP BY ALL

3. Funnel
WITH base AS (
  SELECT
    event_date,
    event_timestamp,
    event_name,
    user_id,
    user_pseudo_id,
    platform,
    MAX(IF(event_param.key = 'firebase_screen', event_param.value.string_value, NULL)) AS firebase_screen,
    MAX(IF(event_param.key = 'session_id', event_param.value.string_value, NULL)) AS session_id
  FROM advanced.app_logs
  CROSS JOIN UNNEST(event_params) AS event_param
  WHERE event_date BETWEEN '2022-08-01' AND '2022-08-18'
  GROUP BY ALL
), filter_event_and_concat_event_and_screen AS (
  SELECT
    * EXCEPT(event_name, firebase_screen, event_timestamp),
    CONCAT(event_name, "-", firebase_screen) AS event_name_with_screen,
    DATETIME(TIMESTAMP_MICROS(event_timestamp), 'Asia/Seoul') AS event_datetime
  FROM base
  WHERE event_name IN ('screen_view', 'click_payment')
), daily_event_summary AS (
  SELECT
    event_date, # number of users per funnel step per day
    event_name_with_screen,
    CASE
      WHEN event_name_with_screen = 'screen_view-welcome' THEN 1
      WHEN event_name_with_screen = 'screen_view-home' THEN 2
      WHEN event_name_with_screen = 'screen_view-food_category' THEN 3
      WHEN event_name_with_screen = 'screen_view-restaurant' THEN 4
      WHEN event_name_with_screen = 'screen_view-cart' THEN 5
      WHEN event_name_with_screen = 'click_payment-cart' THEN 6
      ELSE NULL
    END AS step_number,
    COUNT(DISTINCT user_pseudo_id) AS cnt
  FROM filter_event_and_concat_event_and_screen
  GROUP BY ALL
  HAVING step_number IS NOT NULL
)
-- PIVOT the aggregated data
SELECT
  event_date,
  MAX(IF(event_name_with_screen = 'screen_view-welcome', cnt, 0)) AS screen_view_welcome,
  MAX(IF(event_name_with_screen = 'screen_view-home', cnt, 0)) AS screen_view_home,
  MAX(IF(event_name_with_screen = 'screen_view-food_category', cnt, 0)) AS screen_view_food_category,
  MAX(IF(event_name_with_screen = 'screen_view-restaurant', cnt, 0)) AS screen_view_restaurant,
  MAX(IF(event_name_with_screen = 'screen_view-cart', cnt, 0)) AS screen_view_cart,
  MAX(IF(event_name_with_screen = 'click_payment-cart', cnt, 0)) AS click_payment_cart
FROM daily_event_summary
GROUP BY ALL
ORDER BY event_date
-
Unresolved · BigQuery (SQL) Applied Course (Funnel Analysis, Retention Analysis)
[Inflearn BigQuery 빠짝스터디 Week 1] ARRAY, STRUCT exercises / PIVOT exercises / Funnel query exercises
1. ARRAY, STRUCT

1) From the array_exercises table, UNNEST the genres (genres) for each movie (title)
SELECT
  title
  , genre
FROM advanced.array_exercises
CROSS JOIN UNNEST(genres) AS genre

2) From the array_exercises table, show the actor (actor) and role (character) for each movie (title). (Actor and role must appear in separate columns)
SELECT
  title
  , actor.actor AS actor
  , actor.character AS character
FROM advanced.array_exercises
CROSS JOIN UNNEST(actors) AS actor

3) From the array_exercises table, output the actor (actor), role (character), and genre (genre) for each movie (title). Actor, role, and genre must all appear in a single row
SELECT
  title
  , actor.actor AS actor
  , actor.character AS character
  , genre
FROM advanced.array_exercises
CROSS JOIN UNNEST(actors) AS actor
CROSS JOIN UNNEST(genres) AS genre

4) Flatten the arrays in the app log data (app_logs)
SELECT
  user_id
  , event_date
  , event_name
  , user_pseudo_id
  , params.key AS key
  , params.value.string_value AS str_value
  , params.value.int_value AS int_value
FROM advanced.app_logs
CROSS JOIN UNNEST(event_params) AS params
WHERE event_date = '2022-08-01'

2. PIVOT

1) From the orders table, PIVOT the total order amount (amount) per user (user_id). Dates (order_date) go in rows and user_id in columns
WITH step1 AS (
  SELECT
    order_date
    , user_id
    , SUM(amount) AS sum_of_amount
  FROM advanced.orders
  GROUP BY ALL
)
SELECT
  order_date
  , MAX(IF(user_id = 1, sum_of_amount, 0)) AS user_1
  , MAX(IF(user_id = 2, sum_of_amount, 0)) AS user_2
  , MAX(IF(user_id = 3, sum_of_amount, 0)) AS user_3
FROM step1
GROUP BY order_date
ORDER BY order_date

2) From the orders table, PIVOT the total order amount (amount) of users per date (order_date). user_id goes in rows and order_date in columns
SELECT
  user_id
  , SUM(IF(order_date = '2023-05-01', amount, 0)) AS `2023-05-01`
  , SUM(IF(order_date = '2023-05-02', amount, 0)) AS `2023-05-02`
  , SUM(IF(order_date = '2023-05-03', amount, 0)) AS `2023-05-03`
  , SUM(IF(order_date = '2023-05-04', amount, 0)) AS `2023-05-04`
  , SUM(IF(order_date = '2023-05-05', amount, 0)) AS `2023-05-05`
FROM advanced.orders
GROUP BY user_id
ORDER BY user_id

3) From the orders table, PIVOT to 1 if an order exists and 0 if not, per user (user_id) and date (order_date). user_id goes in rows and order_date in columns; multiple orders still count as 1
SELECT
  user_id
  , MAX(IF(order_date = '2023-05-01', 1, 0)) AS `2023-05-01`
  , MAX(IF(order_date = '2023-05-02', 1, 0)) AS `2023-05-02`
  , MAX(IF(order_date = '2023-05-03', 1, 0)) AS `2023-05-03`
  , MAX(IF(order_date = '2023-05-04', 1, 0)) AS `2023-05-04`
  , MAX(IF(order_date = '2023-05-05', 1, 0)) AS `2023-05-05`
FROM advanced.orders
GROUP BY user_id
ORDER BY user_id

4) PIVOT the app log arrays
SELECT
  user_id
  , event_date
  , event_name
  , user_pseudo_id
  , MAX(IF(params.key = 'firebase_screen', params.value.string_value, NULL)) AS firebase_screen
  , MAX(IF(params.key = 'food_id', params.value.int_value, NULL)) AS food_id
  , MAX(IF(params.key = 'session_id', params.value.string_value, NULL)) AS session_id
FROM advanced.app_logs
CROSS JOIN UNNEST(event_params) AS params
WHERE event_date = '2022-08-01'
GROUP BY ALL

3. Funnel analysis
WITH step1 AS (
  SELECT
    event_date
    , event_timestamp
    , event_name
    , user_id
    , user_pseudo_id
    , MAX(IF(params.key = 'firebase_screen', params.value.string_value, NULL)) AS firebase_screen
    , MAX(IF(params.key = 'session_id', params.value.string_value, NULL)) AS session_id
    , platform
  FROM advanced.app_logs
  CROSS JOIN UNNEST(event_params) AS params
  WHERE event_date BETWEEN '2022-08-01' AND '2022-08-18'
  GROUP BY ALL
), step2 AS (
  SELECT
    * EXCEPT(event_timestamp)
    , CONCAT(event_name, '-', firebase_screen) AS event_name_with_screen
    , DATETIME(TIMESTAMP_MICROS(event_timestamp), 'Asia/Seoul') AS event_datetime
  FROM step1
), step3 AS (
  SELECT
    *
    , CASE
        WHEN event_name_with_screen = 'screen_view-welcome' THEN 1
        WHEN event_name_with_screen = 'screen_view-home' THEN 2
        WHEN event_name_with_screen = 'screen_view-food_category' THEN 3
        WHEN event_name_with_screen = 'screen_view-restaurant' THEN 4
        WHEN event_name_with_screen = 'screen_view-cart' THEN 5
        WHEN event_name_with_screen = 'click_payment-cart' THEN 6
      END AS step_number
  FROM step2
)
-- 1) Count users at each funnel step
, step3_1 AS (
  SELECT
    event_name_with_screen
    , step_number
    , COUNT(DISTINCT user_pseudo_id) AS cnt
  FROM step3
  GROUP BY ALL
  HAVING step_number IS NOT NULL
  ORDER BY step_number
)
-- 2) Count users at each funnel step per day
, step3_2 AS (
  SELECT
    event_date
    , event_name_with_screen
    , step_number
    , COUNT(DISTINCT user_pseudo_id) AS cnt
  FROM step3
  GROUP BY ALL
  HAVING step_number IS NOT NULL
  ORDER BY event_date, step_number
)
-- 3) PIVOT the data from 2)
SELECT
  event_date
  , MAX(IF(event_name_with_screen = 'screen_view-welcome', cnt, NULL)) AS `screen_view-welcome`
  , MAX(IF(event_name_with_screen = 'screen_view-home', cnt, NULL)) AS `screen_view-home`
  , MAX(IF(event_name_with_screen = 'screen_view-food_category', cnt, NULL)) AS `screen_view-food_category`
  , MAX(IF(event_name_with_screen = 'screen_view-restaurant', cnt, NULL)) AS `screen_view-restaurant`
  , MAX(IF(event_name_with_screen = 'screen_view-cart', cnt, NULL)) AS `screen_view-cart`
  , MAX(IF(event_name_with_screen = 'click_payment-cart', cnt, NULL)) AS `click_payment-cart`
FROM step3_2
GROUP BY event_date
ORDER BY event_date
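The pivot above fills missing steps with NULL, while other answers to the same assignment use 0. A small sketch shows how the two choices differ on a day where a step has no rows. The `daily` rows and the shortened column name `step` are made up for illustration.

```sql
-- Hypothetical daily counts; 2022-08-02 has no 'click_payment-cart' row.
WITH daily AS (
  SELECT DATE '2022-08-01' AS event_date, 'screen_view-welcome' AS step, 100 AS cnt UNION ALL
  SELECT DATE '2022-08-01', 'click_payment-cart', 10 UNION ALL
  SELECT DATE '2022-08-02', 'screen_view-welcome', 120
)
SELECT
  event_date,
  -- NULL fill: a missing step stays NULL ("no data")
  MAX(IF(step = 'click_payment-cart', cnt, NULL)) AS payment_null_fill,
  -- 0 fill: a missing step is reported as 0 users
  MAX(IF(step = 'click_payment-cart', cnt, 0)) AS payment_zero_fill,
  -- NULL fill converted to 0 afterwards; same result as the 0 fill
  IFNULL(MAX(IF(step = 'click_payment-cart', cnt, NULL)), 0) AS payment_ifnull
FROM daily
GROUP BY event_date
ORDER BY event_date;
-- 2022-08-01: 10, 10, 10
-- 2022-08-02: NULL, 0, 0
```

Which one to prefer depends on the downstream use: 0 is convenient for arithmetic such as conversion rates, while NULL keeps "no events recorded" distinguishable from a true zero.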
-
Unresolved · BigQuery (SQL) Applied Course (Funnel Analysis, Retention Analysis)
[빠짝스터디 Week 1 Assignment] ARRAY, STRUCT exercises / PIVOT exercises / Funnel query exercises
[PART 1] ARRAY, STRUCT exercises

Q1. From the array_exercises table, UNNEST the genres (genres) for each movie (title).
-- CROSS JOIN UNNEST(ARRAY_COLUMN) AS new_name -> afterwards, selecting only the new name in the SELECT clause flattens the array
SELECT
  title,
  -- genres,
  genre
FROM advanced.array_exercises AS ae
CROSS JOIN UNNEST(genres) AS genre;

Q2. From the array_exercises table, show the actor (actor) and role (character) for each movie (title). Actor and role must appear in separate columns.
SELECT
  title,
  aa.actor,
  aa.character
FROM advanced.array_exercises AS ae
CROSS JOIN UNNEST(actors) AS aa;

Q3. From the array_exercises table, output the actor (actor), role (character), and genre (genre) for each movie (title). Actor, role, and genre must all appear in a single row.
SELECT
  title,
  aa.actor,
  aa.character,
  gg AS genre
FROM advanced.array_exercises AS ae
CROSS JOIN UNNEST(actors) AS aa
CROSS JOIN UNNEST(genres) AS gg;

Q4. Flatten the arrays in the app log data (app_logs).
SELECT
  event_date,
  event_timestamp,
  event_name,
  ep.key,
  ep.value.string_value,
  ep.value.int_value,
  user_id,
  user_pseudo_id,
  platform
FROM advanced.app_logs
CROSS JOIN UNNEST(event_params) AS ep
WHERE event_date = "2022-08-01";

[PART 2] PIVOT exercises

Q1. From the orders table, PIVOT the total order amount (amount) per user (user_id). Dates (order_date) go in rows and user_id in columns.
SELECT
  order_date,
  SUM(IF(user_id = 1, amount, 0)) AS user_1,
  SUM(IF(user_id = 2, amount, 0)) AS user_2,
  SUM(IF(user_id = 3, amount, 0)) AS user_3
FROM advanced.orders
GROUP BY order_date;

Q2. From the orders table, PIVOT the total order amount (amount) of users per date (order_date). user_id goes in rows and order_date in columns.
-- First attempt (does not run; kept to show the error)
SELECT
  user_id,
  SUM(IF(order_date = "2023-05-01", amount, 0)) AS 2023-05-01,
  SUM(IF(order_date = "2023-05-02", amount, 0)) AS 2023-05-02,
  SUM(IF(order_date = "2023-05-03", amount, 0)) AS 2023-05-03,
  SUM(IF(order_date = "2023-05-04", amount, 0)) AS 2023-05-04,
  SUM(IF(order_date = "2023-05-05", amount, 0)) AS 2023-05-05,
FROM advanced.orders
GROUP BY user_id;

Running this query produced the following error:
Syntax error: Unexpected integer literal "2023" at [3:52]
The problem seemed to be the new column name defined after AS in the SELECT clause. Wrapping it in " " still raised an error, and removing AS altogether produced auto-generated column names such as f0/f1/f2.

-- Query fixed after watching the lecture
SELECT
  user_id,
  SUM(IF(order_date = "2023-05-01", amount, 0)) AS `2023-05-01`,
  SUM(IF(order_date = "2023-05-02", amount, 0)) AS `2023-05-02`,
  SUM(IF(order_date = "2023-05-03", amount, 0)) AS `2023-05-03`,
  SUM(IF(order_date = "2023-05-04", amount, 0)) AS `2023-05-04`,
  SUM(IF(order_date = "2023-05-05", amount, 0)) AS `2023-05-05`,
FROM advanced.orders
GROUP BY user_id;

Takeaway: when an alias is not a plain English identifier, it must be wrapped in backticks (`)!
(c.f.) On a Mac, the backtick is typed with the ₩ key while in English input mode.
I noted that the lecture wrapped the expression in MAX or ANY_VALUE rather than SUM. That works in the lecture because the data was first inspected with the IF expression and confirmed to contain only one row per user_id x order_date.
In practice, however, there may be countless user_id and order_date values, or too much data to inspect midway. In such cases SUM seems like the safer choice, because the goal of the query was "the total order amount per user_id x order_date".

Q3. From the orders table, PIVOT to 1 if an order exists and 0 if not, per user (user_id) and date (order_date). user_id goes in rows and order_date in columns; multiple orders still count as 1.
-- First attempt
SELECT
  user_id,
  IF((SUM(IF(order_date = "2023-05-01", amount, NULL))) IS NOT NULL, 1, 0) AS `2023-05-01`,
  IF((SUM(IF(order_date = "2023-05-02", amount, NULL))) IS NOT NULL, 1, 0) AS `2023-05-02`,
  IF((SUM(IF(order_date = "2023-05-03", amount, NULL))) IS NOT NULL, 1, 0) AS `2023-05-03`,
  IF((SUM(IF(order_date = "2023-05-04", amount, NULL))) IS NOT NULL, 1, 0) AS `2023-05-04`,
  IF((SUM(IF(order_date = "2023-05-05", amount, NULL))) IS NOT NULL, 1, 0) AS `2023-05-05`,
FROM advanced.orders
GROUP BY user_id;

I first built the query with SUM and IF to get the per-user total, then wrapped it in another IF so that 1 is assigned only when the total is not NULL. This produces the same result, but the SELECT clause got long, so I wondered whether there was a more efficient way to write it.

-- Query fixed after watching the lecture
SELECT
  user_id,
  MAX(IF(order_date = "2023-05-01", 1, 0)) AS `2023-05-01`,
  MAX(IF(order_date = "2023-05-02", 1, 0)) AS `2023-05-02`,
  MAX(IF(order_date = "2023-05-03", 1, 0)) AS `2023-05-03`,
  MAX(IF(order_date = "2023-05-04", 1, 0)) AS `2023-05-04`,
  MAX(IF(order_date = "2023-05-05", 1, 0)) AS `2023-05-05`,
FROM advanced.orders
GROUP BY user_id;

Simply changing amount, the TRUE value inside IF, to 1 would have solved it! The problem is meant to show that the TRUE value in IF does not always have to be a column; it can be a constant such as 1.
However, as mentioned in PIVOT Q2, if there had been several rows per user_id x order_date rather than one, MAX and SUM would have given different results in query 3: with N rows per user_id x order_date, SUM would add up 1 N times.
Since problem 3 must output only 1 or 0, MAX rather than SUM is required. If you wanted the number of orders instead, SUM would do!

Q4. PIVOT the app log arrays: find which food (food_id) user_id = 32888 added when clicking add-to-cart (click_cart). Use key as the column and string_value or int_value as the column's value.
SELECT
  event_date,
  event_timestamp,
  event_name,
  user_id,
  user_pseudo_id,
  MAX(IF(ep.key = "firebase_screen", ep.value.string_value, NULL)) AS firebase_screen,
  MAX(IF(ep.key = "food_id", ep.value.int_value, NULL)) AS food_id,
  MAX(IF(ep.key = "session_id", ep.value.string_value, NULL)) AS session_id
FROM advanced.app_logs
CROSS JOIN UNNEST(event_params) AS ep
GROUP BY ALL;

[PART 3] Funnel exercises
WITH base AS (
  SELECT
    event_date,
    event_timestamp,
    event_name,
    user_id,
    user_pseudo_id,
    platform,
    -- event_param,
    MAX(IF(event_param.key = "firebase_screen", event_param.value.string_value, NULL)) AS firebase_screen,
    MAX(IF(event_param.key = "food_id", event_param.value.int_value, NULL)) AS food_id,
    MAX(IF(event_param.key = "session_id", event_param.value.string_value, NULL)) AS session_id,
  FROM advanced.app_logs
  CROSS JOIN UNNEST(event_params) AS event_param
  WHERE
    -- event_date = "2022-08-01"
    event_date BETWEEN "2022-08-01" AND "2022-08-18"
  GROUP BY ALL
), filter_event_and_concat_event_and_screen AS (
  -- event_name + screen (filter to only the needed events in the WHERE clause)
  SELECT
    * EXCEPT(event_name, firebase_screen, event_timestamp),
    CONCAT(event_name, "-", firebase_screen) AS event_name_with_screen,
    DATETIME(TIMESTAMP_MICROS(event_timestamp), 'Asia/Seoul') AS event_datetime
  FROM base
  WHERE event_name IN ("screen_view", "click_payment")
)
-- step_number + COUNT
-- step_number: assign numbers with CASE WHEN
-- Number of users per funnel step per day
SELECT
  event_date,
  event_name_with_screen,
  CASE
    WHEN event_name_with_screen = "screen_view-welcome" THEN 1
    WHEN event_name_with_screen = "screen_view-home" THEN 2
    WHEN event_name_with_screen = "screen_view-food_category" THEN 3
    WHEN event_name_with_screen = "screen_view-restaurant" THEN 4
    WHEN event_name_with_screen = "screen_view-cart" THEN 5
    WHEN event_name_with_screen = "click_payment-cart" THEN 6
    ELSE NULL
  END AS step_number,
  COUNT(DISTINCT user_pseudo_id) AS cnt
FROM filter_event_and_concat_event_and_screen
GROUP BY ALL
HAVING step_number IS NOT NULL
ORDER BY event_date
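The MAX versus SUM discussion in PIVOT Q2 and Q3 above can be checked with a few inline rows. This is only a sketch: the `orders` rows below are invented so that user 1 has two orders on the same day, which is exactly the case where the aggregates diverge.

```sql
-- Hypothetical orders: user 1 ordered twice on 2023-05-01.
WITH orders AS (
  SELECT 1 AS user_id, DATE '2023-05-01' AS order_date, 100 AS amount UNION ALL
  SELECT 1, DATE '2023-05-01', 250 UNION ALL
  SELECT 2, DATE '2023-05-02', 80
)
SELECT
  user_id,
  MAX(IF(order_date = '2023-05-01', 1, 0))      AS has_order_0501,   -- 1/0 flag
  SUM(IF(order_date = '2023-05-01', 1, 0))      AS order_cnt_0501,   -- number of orders
  SUM(IF(order_date = '2023-05-01', amount, 0)) AS amount_sum_0501,  -- total amount
  MAX(IF(order_date = '2023-05-01', amount, 0)) AS amount_max_0501   -- largest single order
FROM orders
GROUP BY user_id
ORDER BY user_id;
-- user 1: 1, 2, 350, 250
-- user 2: 0, 0, 0, 0
```

With one row per user_id x order_date the sum and max columns coincide, which is why the lecture could use MAX or ANY_VALUE; once duplicates exist, only SUM returns the total and only MAX keeps the flag at 1.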
-
Unresolved · BigQuery (SQL) Applied Course (Funnel Analysis, Retention Analysis)
[빠짝스터디 Week 1 Assignment] ARRAY, STRUCT, PIVOT, FUNNEL
1. ARRAY, STRUCT

Key syntax: CROSS JOIN UNNEST( )

Exercise 1) Basic extraction from ARRAY data

SELECT
  title,
  genre
FROM `advanced.array_exercises`
CROSS JOIN UNNEST(genres) AS genre

Exercise 2) Basic extraction from STRUCT data

SELECT
  title,
  actor
FROM `advanced.array_exercises`
CROSS JOIN UNNEST(actors) AS actor
-- Extracted this way, the actor and character fields do not come out as clearly named columns, so:

SELECT
  title,
  actor.actor,
  actor.character
FROM `advanced.array_exercises`
CROSS JOIN UNNEST(actors) AS actor
-- Naming each field explicitly, as above, makes the columns easy to check.

Exercise 3) Using CROSS JOIN twice

SELECT
  title,
  actor.actor,
  actor.character,
  genre
FROM `advanced.array_exercises`
CROSS JOIN UNNEST(actors) AS actor
CROSS JOIN UNNEST(genres) AS genre

-- OR

SELECT
  title,
  actor.actor,
  actor.character,
  genre
FROM `advanced.array_exercises`, UNNEST(actors) AS actor, UNNEST(genres) AS genre

Exercise 4) Flattening the log data

SELECT
  event_date,
  event_timestamp,
  event_name,
  event_param.key AS key,
  event_param.value.string_value AS string_value,
  event_param.value.int_value AS int_value,
  user_id,
  user_pseudo_id,
  platform
FROM `advanced.app_logs`
CROSS JOIN UNNEST(event_params) AS event_param

2. PIVOT

Key syntax: IF(condition, value_if_true, value_if_false)
To use a number as a column name, wrap it in backticks (`).

Exercise 1) Basic PIVOT

SELECT
  order_date,
  SUM(IF(user_id = 1, total_amount, 0)) AS user_id_1,
  SUM(IF(user_id = 2, total_amount, 0)) AS user_id_2,
  SUM(IF(user_id = 3, total_amount, 0)) AS user_id_3
FROM (
  SELECT
    order_date,
    user_id,
    SUM(amount) AS total_amount
  FROM `advanced.orders`
  GROUP BY order_date, user_id
)
GROUP BY order_date
ORDER BY order_date

Exercise 2) Basic PIVOT 2

SELECT
  user_id,
  SUM(IF(order_date = '2023-05-01', amount, 0)) AS `2023-05-01`,
  SUM(IF(order_date = '2023-05-02', amount, 0)) AS `2023-05-02`,
  SUM(IF(order_date = '2023-05-03', amount, 0)) AS `2023-05-03`,
  SUM(IF(order_date = '2023-05-04', amount, 0)) AS `2023-05-04`,
  SUM(IF(order_date = '2023-05-05', amount, 0)) AS `2023-05-05`
FROM `advanced.orders`
GROUP BY user_id
ORDER BY user_id

Exercise 3) Changing the TRUE value

SELECT
  user_id,
  MAX(IF(order_date = '2023-05-01', 1, 0)) AS `2023-05-01`,
  MAX(IF(order_date = '2023-05-02', 1, 0)) AS `2023-05-02`,
  MAX(IF(order_date = '2023-05-03', 1, 0)) AS `2023-05-03`,
  MAX(IF(order_date = '2023-05-04', 1, 0)) AS `2023-05-04`,
  MAX(IF(order_date = '2023-05-05', 1, 0)) AS `2023-05-05`
FROM `advanced.orders`
GROUP BY user_id
ORDER BY user_id

Exercise 4) Organizing columns from the log data

SELECT
  event_date,
  event_timestamp,
  event_name,
  user_id,
  user_pseudo_id,
  MAX(IF(param.key = 'firebase_screen', param.value.string_value, NULL)) AS firebase_screen,
  MAX(IF(param.key = 'food_id', param.value.int_value, NULL)) AS food_id,
  MAX(IF(param.key = 'session_id', param.value.string_value, NULL)) AS session_id
FROM `advanced.app_logs`
CROSS JOIN UNNEST(event_params) AS param
GROUP BY ALL

Takeaways: understanding GROUP BY ALL, and deciding how to organize the columns after UNNEST.

3. Funnel analysis

Funnel analysis query

WITH logs AS (
  SELECT
    event_date,
    event_timestamp,
    event_name,
    user_id,
    user_pseudo_id,
    MAX(IF(param.key = 'firebase_screen', param.value.string_value, NULL)) AS firebase_screen,
    MAX(IF(param.key = 'food_id', param.value.int_value, NULL)) AS food_id,
    MAX(IF(param.key = 'session_id', param.value.string_value, NULL)) AS session_id
  FROM `advanced.app_logs`
  CROSS JOIN UNNEST(event_params) AS param
  WHERE event_date BETWEEN '2022-12-01' AND '2022-12-31'
  GROUP BY ALL
), filter_logs AS (
  SELECT
    * EXCEPT(event_name, firebase_screen, event_timestamp),
    CONCAT(event_name, '-', firebase_screen) AS event_name_screen,
    DATETIME(TIMESTAMP_MICROS(event_timestamp)) AS event_time
  FROM logs
  WHERE event_name IN ('screen_view', 'click_payment')
),  -- the comma between these two CTEs was missing in the original post
funnel AS (
  SELECT
    event_date,
    event_name_screen,
    CASE
      WHEN event_name_screen = 'screen_view-welcome' THEN 1
      WHEN event_name_screen = 'screen_view-home' THEN 2
      WHEN event_name_screen = 'screen_view-food_category' THEN 3
      WHEN event_name_screen = 'screen_view-restaurant' THEN 4
      WHEN event_name_screen = 'screen_view-cart' THEN 5
      WHEN event_name_screen = 'click_payment-cart' THEN 6
      ELSE NULL
    END AS flow,
    COUNT(DISTINCT user_pseudo_id) AS cnt
  FROM filter_logs
  GROUP BY ALL
  HAVING flow IS NOT NULL
  ORDER BY 1, 3
)
-- PIVOT
SELECT
  event_date,
  MAX(IF(flow = 1, cnt, 0)) AS screen_view_welcome,
  MAX(IF(flow = 2, cnt, 0)) AS screen_view_home,
  MAX(IF(flow = 3, cnt, 0)) AS screen_food_category,
  MAX(IF(flow = 4, cnt, 0)) AS screen_restaurant,
  MAX(IF(flow = 5, cnt, 0)) AS screen_cart,
  MAX(IF(flow = 6, cnt, 0)) AS click_payment_cart,
FROM funnel
GROUP BY event_date
ORDER BY 1
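The IF-based pivot pattern from section 2 of this post can be checked on a few inline rows. This is a sketch under stated assumptions: `sample_orders` and its four rows are invented for illustration and only mimic the shape of advanced.orders.

```sql
-- Hypothetical rows shaped like advanced.orders (order_date, user_id, amount).
WITH sample_orders AS (
  SELECT *
  FROM UNNEST(ARRAY<STRUCT<order_date DATE, user_id INT64, amount INT64>>[
    (DATE '2023-05-01', 1, 100),
    (DATE '2023-05-01', 2, 200),
    (DATE '2023-05-02', 1, 150),
    (DATE '2023-05-02', 1, 50)
  ])
)
SELECT
  user_id,
  -- SUM(IF(...)) adds up only the amounts that belong to the column's date.
  SUM(IF(order_date = '2023-05-01', amount, 0)) AS `2023-05-01`,
  SUM(IF(order_date = '2023-05-02', amount, 0)) AS `2023-05-02`,
  -- MAX(IF(..., 1, 0)) is the "ordered at least once" flag from exercise 3.
  MAX(IF(order_date = '2023-05-02', 1, 0)) AS ordered_on_0502
FROM sample_orders
GROUP BY user_id
ORDER BY user_id
```

With these rows, user 1 gets 100 and 200 for the two dates with the flag set to 1, while user 2 gets 200, 0, and a flag of 0. Two orders on the same day are summed by SUM but still count once for the MAX flag.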
-
Unresolved · BigQuery (SQL) in Practice (Funnel Analysis, Retention Analysis)
[바짝스터디 Week 1 Assignment] ARRAY, STRUCT exercises / PIVOT exercises
1. ARRAY, STRUCT exercises

1) In the array_exercises table, UNNEST the genres (genres) for each movie (title).

SELECT
  title,
  genre
FROM advanced.array_exercises
CROSS JOIN UNNEST(genres) AS genre;

2) In the array_exercises table, show the actor (actor) and character (character) for each movie (title). Actor and character must come out as separate columns.

SELECT
  title,
  ac.actor,      -- the original post wrote actors.actor, which does not match the alias ac
  ac.character
FROM advanced.array_exercises
CROSS JOIN UNNEST(actors) AS ac;

3) In the array_exercises table, output the actor (actor), character (character), and genre (genre) for each movie (title). Actor, character, and genre must all appear in a single row.

SELECT
  title,
  actor.actor,
  actor.character,
  genre
FROM `advanced.array_exercises`
CROSS JOIN UNNEST(actors) AS actor
CROSS JOIN UNNEST(genres) AS genre;

4) Flatten the arrays in the app log data (app_logs).

SELECT
  user_id,
  event_date,
  event_name,
  user_pseudo_id,
  event_param.key,
  event_param.value.string_value,
  event_param.value.int_value
FROM `advanced.app_logs`
CROSS JOIN UNNEST(event_params) AS event_param

2. PIVOT exercises

1) In the orders table, PIVOT the sum of the order amount (amount) per user (user_id). The date (order_date) must become the rows and user_id the columns.

SELECT
  order_date,
  COALESCE(SUM(IF(user_id = 1, amount, NULL)), 0) AS user_1,
  COALESCE(SUM(IF(user_id = 2, amount, NULL)), 0) AS user_2,
  COALESCE(SUM(IF(user_id = 3, amount, NULL)), 0) AS user_3
FROM advanced.orders
GROUP BY order_date
ORDER BY order_date

2) In the orders table, PIVOT the sum of the users' order amounts (amount) per date (order_date). user_id must become the rows and order_date the columns.

SELECT
  user_id,
  COALESCE(SUM(IF(order_date = '2023-05-01', amount, NULL)), 0) AS `2023-05-01`,
  COALESCE(SUM(IF(order_date = '2023-05-02', amount, NULL)), 0) AS `2023-05-02`,
  COALESCE(SUM(IF(order_date = '2023-05-03', amount, NULL)), 0) AS `2023-05-03`,
  COALESCE(SUM(IF(order_date = '2023-05-04', amount, NULL)), 0) AS `2023-05-04`,
  COALESCE(SUM(IF(order_date = '2023-05-05', amount, NULL)), 0) AS `2023-05-05`,
FROM advanced.orders
GROUP BY user_id
ORDER BY user_id;

3) In the orders table, PIVOT per user (user_id) and per date (order_date) to 1 if there is an order and 0 if there is none. user_id becomes the rows and order_date the columns; multiple orders on the same day still count as 1.

SELECT
  user_id,
  MAX(IF(order_date = '2023-05-01' AND order_id IS NOT NULL, 1, 0)) AS `2023-05-01`,
  MAX(IF(order_date = '2023-05-02' AND order_id IS NOT NULL, 1, 0)) AS `2023-05-02`,
  MAX(IF(order_date = '2023-05-03' AND order_id IS NOT NULL, 1, 0)) AS `2023-05-03`,
  MAX(IF(order_date = '2023-05-04' AND order_id IS NOT NULL, 1, 0)) AS `2023-05-04`,
  MAX(IF(order_date = '2023-05-05' AND order_id IS NOT NULL, 1, 0)) AS `2023-05-05`,
FROM advanced.orders
GROUP BY user_id
ORDER BY user_id;

*) When user_id = 32888 clicked add-to-cart (click_cart), which food (food_id) did they add?

WITH app_order_raw AS (
  SELECT
    user_id,
    event_date,
    event_name,
    user_pseudo_id,
    pr.key,
    pr.value.string_value,
    pr.value.int_value
  FROM advanced.app_logs, UNNEST(event_params) AS pr
  WHERE event_date = '2022-08-01'
)
SELECT
  user_id,
  event_date,
  event_name,
  user_pseudo_id,
  MAX(IF(key = 'firebase_screen', string_value, NULL)) AS firebase_screen,
  MAX(IF(key = 'food_id', int_value, NULL)) AS food_id,
  MAX(IF(key = 'session_id', string_value, NULL)) AS session_id,
FROM app_order_raw
GROUP BY user_id, event_date, event_name, user_pseudo_id;

3. Reflections

I was able to learn in detail what a funnel is. (None of my university courses really cover funnels, so I had never had a chance to learn this properly.) Before starting, I knew little about BigQuery and worried a lot about whether I could keep up, but the lectures were easy to follow, and I now feel I can manage if I work hard. Looking forward to week 2 as well~
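The last query above pivots every event on 2022-08-01 and does not yet narrow the result to the user and event named in the question. One possible way to do that is sketched below. It assumes that user_id is stored as INT64 and that event_date accepts the '2022-08-01' literal, as in the post's own WHERE clause; neither type is confirmed by the post. Adding event_timestamp to the output is also a choice made here so that separate clicks stay on separate rows.

```sql
-- Sketch only: assumes user_id is INT64 in advanced.app_logs (not confirmed by the post).
SELECT
  user_id,
  event_date,
  event_timestamp,  -- kept so that each click_cart event stays on its own row
  event_name,
  MAX(IF(pr.key = 'firebase_screen', pr.value.string_value, NULL)) AS firebase_screen,
  MAX(IF(pr.key = 'food_id', pr.value.int_value, NULL)) AS food_id
FROM advanced.app_logs
CROSS JOIN UNNEST(event_params) AS pr
WHERE event_date = '2022-08-01'
  AND event_name = 'click_cart'
  AND user_id = 32888
GROUP BY ALL
ORDER BY event_timestamp
```

If user_id turns out to be a STRING column, the comparison would need to be written as user_id = '32888' instead.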
-
Unresolved · BigQuery (SQL) in Practice (Funnel Analysis, Retention Analysis)
[바짝스터디 Week 1 Assignment] ARRAY, STRUCT exercises / PIVOT exercises / Funnel query exercises
SELECT
  movie_id,
  title,
  genre
FROM `advanced.array_exercises`  -- the original had a stray comma here before CROSS JOIN
CROSS JOIN UNNEST(genres) AS genre

SELECT
  title,
  actors.actor,
  actors.character
FROM `advanced.array_exercises`
CROSS JOIN UNNEST(actors) AS actors

SELECT
  title,
  actor.actor,
  actor.character,
  genre
FROM `advanced.array_exercises`
CROSS JOIN UNNEST(actors) AS actor
CROSS JOIN UNNEST(genres) AS genre

SELECT
  event_date,
  event_timestamp,
  event_name,
  event_param.key AS key,
  event_param.value.string_value AS string_value,
  event_param.value.int_value AS int_value,
  user_id,
  user_pseudo_id,
  platform
FROM `advanced.app_logs`
CROSS JOIN UNNEST(event_params) AS event_param

SELECT
  order_date,
  SUM(IF(user_id = 1, amount, 0)) AS user_1,
  SUM(IF(user_id = 2, amount, 0)) AS user_2,
  SUM(IF(user_id = 3, amount, 0)) AS user_3
FROM advanced.orders
GROUP BY ALL
ORDER BY order_date

-- 2. In the orders table, PIVOT the sum of the users' order amounts (amount) per date (order_date). user_id must become the rows and order_date the columns.
SELECT
  user_id,
  SUM(IF(order_date = "2023-05-01", amount, 0)) AS `2023-05-01`,
  SUM(IF(order_date = "2023-05-02", amount, 0)) AS `2023-05-02`,
  SUM(IF(order_date = "2023-05-03", amount, 0)) AS `2023-05-03`,
  SUM(IF(order_date = "2023-05-04", amount, 0)) AS `2023-05-04`,
  SUM(IF(order_date = "2023-05-05", amount, 0)) AS `2023-05-05`,
FROM advanced.orders
GROUP BY ALL
ORDER BY user_id

-- 3. In the orders table, PIVOT per user (user_id) and per date (order_date) to 1 if there is an order and 0 if there is none. user_id becomes the rows and order_date the columns; multiple orders still count as 1.
SELECT
  user_id,
  MAX(IF(order_date = "2023-05-01", 1, 0)) AS `2023-05-01`,
  MAX(IF(order_date = "2023-05-02", 1, 0)) AS `2023-05-02`,
  MAX(IF(order_date = "2023-05-03", 1, 0)) AS `2023-05-03`,
  MAX(IF(order_date = "2023-05-04", 1, 0)) AS `2023-05-04`,
  MAX(IF(order_date = "2023-05-05", 1, 0)) AS `2023-05-05`,
FROM advanced.orders
GROUP BY ALL
ORDER BY user_id

-- 4. App log PIVOT
WITH base AS (
  SELECT
    event_date,
    event_timestamp,
    event_name,
    user_id,
    user_pseudo_id,
    MAX(IF(param.key = "firebase_screen", param.value.string_value, NULL)) AS firebase_screen,
    MAX(IF(param.key = "food_id", param.value.int_value, NULL)) AS food_id,
    MAX(IF(param.key = "session_id", param.value.string_value, NULL)) AS session_id,
  FROM advanced.app_logs
  CROSS JOIN UNNEST(event_params) AS param
  GROUP BY ALL
)
SELECT
  event_date,
  COUNT(user_id) AS user_cnt
FROM base
WHERE event_name = "click_cart"
GROUP BY ALL

FUNNEL

with base as (
  select
    event_date,
    event_timestamp,
    event_name,
    user_id,
    user_pseudo_id,
    platform,
    -- event_param
    max(if(event_param.key = 'firebase_screen', event_param.value.string_value, null)) as firebase_screen,
    -- max(if(event_param.key = 'food_id', event_param.value.int_value, null)) as food_id,
    max(if(event_param.key = 'session_id', event_param.value.string_value, null)) as session_id
  from advanced.app_logs
  cross join unnest(event_params) as event_param
  where event_date between '2022-08-01' and '2022-08-18'
  group by all
), filter_event_and_concat_event_and_screen as (
  select
    * except(event_name, firebase_screen, event_timestamp),
    concat(event_name, '-', firebase_screen) as event_name_with_screen,
    datetime(timestamp_micros(event_timestamp), 'Asia/Seoul') as event_datetime
  from base
  where event_name in ('screen_view', 'click_payment')
)
select
  event_date,
  event_name_with_screen,
  case
    when event_name_with_screen = 'screen_view-welcome' then 1
    when event_name_with_screen = 'screen_view-home' then 2
    when event_name_with_screen = 'screen_view-food_category' then 3
    when event_name_with_screen = 'screen_view-restaurant' then 4
    when event_name_with_screen = 'screen_view-cart' then 5
    when event_name_with_screen = 'click_payment-cart' then 6
    else null
  end as step_number,
  count(distinct user_pseudo_id) as cnt
from filter_event_and_concat_event_and_screen
group by all
having step_number is not null
order by 1, 3

-- Funnel problem 3
-- The final SELECT in the original post read FROM main without defining it;
-- here the funnel query above is wrapped as the CTE main so that the pivot runs.
with base as (
  select
    event_date,
    event_timestamp,
    event_name,
    user_id,
    user_pseudo_id,
    platform,
    -- event_param
    max(if(event_param.key = 'firebase_screen', event_param.value.string_value, null)) as firebase_screen,
    -- max(if(event_param.key = 'food_id', event_param.value.int_value, null)) as food_id,
    max(if(event_param.key = 'session_id', event_param.value.string_value, null)) as session_id
  from advanced.app_logs
  cross join unnest(event_params) as event_param
  where event_date between '2022-08-01' and '2022-08-18'
  group by all
), filter_event_and_concat_event_and_screen as (
  select
    * except(event_name, firebase_screen, event_timestamp),
    concat(event_name, '-', firebase_screen) as event_name_with_screen,
    datetime(timestamp_micros(event_timestamp), 'Asia/Seoul') as event_datetime
  from base
  where event_name in ('screen_view', 'click_payment')
), main as (
  select
    event_date,
    event_name_with_screen,
    case
      when event_name_with_screen = 'screen_view-welcome' then 1
      when event_name_with_screen = 'screen_view-home' then 2
      when event_name_with_screen = 'screen_view-food_category' then 3
      when event_name_with_screen = 'screen_view-restaurant' then 4
      when event_name_with_screen = 'screen_view-cart' then 5
      when event_name_with_screen = 'click_payment-cart' then 6
      else null
    end as step_number,
    count(distinct user_pseudo_id) as cnt
  from filter_event_and_concat_event_and_screen
  group by all
  having step_number is not null
)
SELECT
  event_date,
  MAX(IF(event_name_with_screen = "screen_view-welcome", cnt, NULL)) AS `screen_view-welcome`,
  MAX(IF(event_name_with_screen = "screen_view-home", cnt, NULL)) AS `screen_view-home`,
  MAX(IF(event_name_with_screen = "screen_view-food_category", cnt, NULL)) AS `screen_view-food_category`,
  MAX(IF(event_name_with_screen = "screen_view-restaurant", cnt, NULL)) AS `screen_view-restaurant`,
  MAX(IF(event_name_with_screen = "screen_view-cart", cnt, NULL)) AS `screen_view-cart`,
  MAX(IF(event_name_with_screen = "click_payment-cart", cnt, NULL)) AS `click_payment-cart`
FROM main
GROUP BY ALL
ORDER BY event_date
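The step-numbering idea used in the funnel queries above can be seen in isolation on a handful of inline rows. This is a rough sketch, not the course solution: `sample_events`, the user IDs, and the `screen_view-search` screen are invented, and only the first two funnel steps are mapped to keep it short.

```sql
-- Hypothetical events; 'screen_view-search' is deliberately not part of the funnel.
WITH sample_events AS (
  SELECT *
  FROM UNNEST(ARRAY<STRUCT<event_date DATE, user_pseudo_id STRING, event_name_with_screen STRING>>[
    (DATE '2022-08-01', 'u1', 'screen_view-welcome'),
    (DATE '2022-08-01', 'u1', 'screen_view-home'),
    (DATE '2022-08-01', 'u2', 'screen_view-welcome'),
    (DATE '2022-08-01', 'u2', 'screen_view-welcome'),
    (DATE '2022-08-01', 'u2', 'screen_view-search')
  ])
), step_counts AS (
  SELECT
    event_date,
    -- Events outside the funnel fall through the CASE and become NULL.
    CASE event_name_with_screen
      WHEN 'screen_view-welcome' THEN 1
      WHEN 'screen_view-home' THEN 2
    END AS step_number,
    -- DISTINCT counts a user once per step, even with repeated events.
    COUNT(DISTINCT user_pseudo_id) AS cnt
  FROM sample_events
  GROUP BY event_date, step_number
)
SELECT
  event_date,
  step_number,
  cnt
FROM step_counts
WHERE step_number IS NOT NULL
ORDER BY event_date, step_number
```

On these rows step 1 counts two users and step 2 counts one: u2's duplicate welcome view is collapsed by DISTINCT, and the unmapped search screen is dropped by the NULL filter, which plays the same role as HAVING step_number IS NOT NULL in the posts.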