인프런 커뮤니티 질문&답변

sjb2774님의 프로필 이미지
sjb2774

작성한 질문수

BigQuery(SQL) 활용편(퍼널 분석, 리텐션 분석)

[빠짝스터디 1주차 과제] ARRAY, STRUCT 연습 문제/ PIVOT 연습문제/ 퍼널 쿼리 연습 문제

작성

·

60

·

수정됨

0

ARRAY, STRUCT 연습문제

-- 1. array_exercises 테이블에서 title 별로 genres를 UNNEST하기 
SELECT
  title
  , genre
FROM advanced.array_exercises
CROSS JOIN UNNEST(genres) as genre;


-- 2. array_exercises 테이블에서 title 별로 actor, character 추출
-- actor, character는 별도의 컬럼으로 빼기 (struct의 key로써 존재하면 안 됨.)
SELECT 
  title
  , ACTORS.actor AS actor
  , ACTORS.character AS character
FROM advanced.array_exercises
CROSS JOIN UNNEST(actors) AS ACTORS;


-- 3. array_exercises 테이블에서 title 별로 actor, character, genre 추출
-- 여러 ARRAY 컬럼을 UNNEST할 경우, 각 컬럼별로 UNNEST한 것을 CROSS JOIN 진행하면 됨.

SELECT 
  title
  , ACTORS.actor AS actor
  , ACTORS.character AS character
  , genre
FROM advanced.array_exercises
CROSS JOIN UNNEST(actors) AS ACTORS
CROSS JOIN UNNEST(genres) as genre;


-- 4. app_logs 테이블(약 73만 건의 로그 데이터)의 ARRAY를 UNNEST 하기
-- event_params 형태?
-- ARRAY<STRUCT<key STRING, value STRUCT<string_value STRING, int_value INT64>>>[
--  STRUCT('firebase_screen', STRUCT('food_detail', NULL))
--  , ...
-- ]
SELECT
  user_id
  , event_date
  , event_name
  , user_pseudo_id
  , EVENT_PARAMS.key AS key
  , EVENT_PARAMS.value.string_value AS string_value
  , EVENT_PARAMS.value.int_value AS int_value
FROM advanced.app_logs
CROSS JOIN UNNEST(event_params) AS EVENT_PARAMS

 

PIVOT 연습문제

-- 1. orders 테이블에서 order_date별 user_id간 amount 합계를 PIVOT하기
SELECT 
  order_date
  , SUM(IF(user_id=1, amount, 0)) AS user_1
  , SUM(IF(user_id=2, amount, 0)) AS user_2
  , SUM(IF(user_id=3, amount, 0))AS user_3
FROM advanced.orders
GROUP BY order_date
ORDER BY order_date ASC;


-- 2. orders 테이블에서 user_id별 order_date 간 amount 합계를 PIVOT하기
SELECT 
  user_id
  , SUM(IF(order_date='2023-05-01', amount, 0)) AS `2023-05-01`
  , SUM(IF(order_date='2023-05-02', amount, 0)) AS `2023-05-02`
  , SUM(IF(order_date='2023-05-03', amount, 0)) AS `2023-05-03`
  , SUM(IF(order_date='2023-05-04', amount, 0)) AS `2023-05-04`
  , SUM(IF(order_date='2023-05-05', amount, 0)) AS `2023-05-05`
FROM advanced.orders
GROUP BY user_id
ORDER BY user_id ASC;


-- 3. orders 테이블에서 user_id별 order_date간 주문존재 여부 PIVOT하기
-- 주문 존재하면 1, 없으면 0; 주문횟수가 아님에 유의
SELECT
  user_id
  , MAX(IF(order_date='2023-05-01', 1, 0)) AS `2023-05-01`
  , MAX(IF(order_date='2023-05-02', 1, 0)) AS `2023-05-02`
  , MAX(IF(order_date='2023-05-03', 1, 0)) AS `2023-05-03`
  , MAX(IF(order_date='2023-05-04', 1, 0)) AS `2023-05-04`
  , MAX(IF(order_date='2023-05-05', 1, 0)) AS `2023-05-05`
FROM advanced.orders
GROUP BY user_id
ORDER BY user_id ASC;


-- 4. app_logs 테이블 PIVOT 하기
-- user_id=32888이 카트 추가하기(click_cart)를 누를 때 어떤 음식(food_id)를 담았나요?
WITH temp AS(
  SELECT
    user_id
    , event_date
    , event_name
    , event_timestamp
    , user_pseudo_id
    , EVENT_PARAMS.key AS key
    , EVENT_PARAMS.value.string_value AS string_value
    , EVENT_PARAMS.value.int_value AS int_value
  FROM advanced.app_logs
  CROSS JOIN UNNEST(event_params) AS EVENT_PARAMS
)

SELECT
  user_id
  , event_date
  , event_name
  , event_timestamp
  , user_pseudo_id
  , MAX(IF(key='firebase_screen', string_value, null)) AS `firebase_screen`
  , MAX(IF(key='food_id', int_value, null)) AS `food_id`
  , MAX(IF(key='session_id', string_value, null)) AS `session_id`
FROM temp
-- GROUP BY user_id, event_date, event_timestamp, event_name, user_pseudo_id
GROUP BY ALL

 

퍼널 쿼리 연습문제

WITH main AS (
  SELECT

    event_date,
    CONCAT(event_name,'-', event_param.value.string_value) AS event_name_with_screen,
    CASE
      WHEN event_name = 'screen_view' AND event_param.value.string_value = 'welcome' THEN 1
      WHEN event_name = 'screen_view' AND event_param.value.string_value = 'home' THEN 2
      WHEN event_name = 'screen_view' AND event_param.value.string_value = 'food_category' THEN 3
      WHEN event_name = 'screen_view' AND event_param.value.string_value = 'restaurant' THEN 4
      WHEN event_name = 'screen_view' AND event_param.value.string_value = 'cart' THEN 5
      WHEN event_name = 'click_payment' AND event_param.value.string_value = 'cart' THEN 6
    END AS step_num,
    COUNT(DISTINCT user_pseudo_id) AS cnt

  FROM advanced.app_logs
  CROSS JOIN UNNEST(event_params) AS event_param
  WHERE 1=1
  AND event_date BETWEEN '2022-08-01' AND '2022-08-18'
  AND event_param.key = 'firebase_screen'
  AND event_name IN ("screen_view",'click_payment')
  GROUP BY 1,2,3
  HAVING step_num IS NOT NULL
)

SELECT
  event_date,
  SUM(IF(step_num = 1, cnt, 0)) AS `screen_view-welcome`,
  SUM(IF(step_num = 2, cnt, 0)) AS `screen_view-home`,
  SUM(IF(step_num = 3, cnt, 0)) AS `screen_view-food_category`,
  SUM(IF(step_num = 4, cnt, 0)) AS `screen_view-restaurant`,
  SUM(IF(step_num = 5, cnt, 0)) AS `screen_view-cart`,
  SUM(IF(step_num = 6, cnt, 0)) AS `click_payment-cart`
FROM main
GROUP BY 1
ORDER BY event_date ASC

답변 1

0

안녕하세요, 인프런 AI 인턴입니다. sjb2774님.

BigQuery에서 ARRAY, STRUCT, PIVOT 등을 활용하여 다양한 데이터를 처리하는 연습을 하셨군요. 이러한 연습은 데이터를 효과적으로 분석하고 시각화하는 데 큰 도움이 됩니다.

아래에 관련된 기존 질문 및 답변 링크를 통해 더 많은 도움을 얻을 수 있습니다:

저는 질문자님의 이해를 돕고자 지식공유자님의 기존 답변을 참고하여 유사한 답변 링크를 추천해드리고 있습니다. 현재 베타 기능으로 답변이 만족스럽지 않을 수 있는 점 양해 부탁드립니다. 🙏 추가적으로 궁금한 점이 있으시면, 이어서 질문해 주세요. 곧 지식공유자께서 답변해 주실 것입니다.

sjb2774님의 프로필 이미지
sjb2774

작성한 질문수

질문하기