인프런 커뮤니티 질문&답변

이조현님의 프로필 이미지
이조현

작성한 질문수

BigQuery(SQL) 활용편(퍼널 분석, 리텐션 분석)

[빠짝스터디 1주차 과제] ARRAY, STRUCT / PIVOT / 퍼널 쿼리 연습 문제

작성

·

18

0

1. ARRAY, STRUCT

1)

SELECT  title, movie_genres
FROM advanced.array_exercises
CROSS JOIN UNNEST(genres) AS movie_genres
LIMIT 100

2)

SELECT title,
      actor.actor,
      actor.character
FROM advanced.array_exercises
CROSS JOIN UNNEST(actors) AS actor

3)

SELECT title, actors.actor, actors.character, genres
FROM advanced.array_exercises
CROSS JOIN UNNEST(actors) AS actors
CROSS JOIN UNNEST(genres) AS genres
WHERE actor = 'ChrisEvans'

4)

select user_id
     , event_date
     , event_name
     , user_pseudo_id
     , param.key as key
     , param.value.string_value as string_value
     , param.value.int_value as int_value
  from advanced.app_logs
     , unnest(event_params) as param  

 

2. PIVOT

1)

select order_date
     , sum(if(user_id = 1,amount,0)) as user_1
     , sum(if(user_id = 2,amount,0)) as user_2
     , sum(if(user_id = 3,amount,0)) as user_3
  from advanced.orders
 group by order_date
 order by order_date

2)

select user_id
     , max(if(order_date = '2023-05-01',amount,0)) as `2023-05-01`
     , max(if(order_date = '2023-05-02',amount,0)) as `2023-05-02`
     , max(if(order_date = '2023-05-03',amount,0)) as `2023-05-03`
     , max(if(order_date = '2023-05-04',amount,0)) as `2023-05-04`
     , max(if(order_date = '2023-05-05',amount,0)) as `2023-05-05`
  from advanced.orders
 group by user_id
 order by user_id

3)

select user_id
     , max(if(order_date = '2023-05-01',1,0)) as `2023-05-01`
     , max(if(order_date = '2023-05-02',1,0)) as `2023-05-02`
     , max(if(order_date = '2023-05-03',1,0)) as `2023-05-03`
     , max(if(order_date = '2023-05-04',1,0)) as `2023-05-04`
     , max(if(order_date = '2023-05-05',1,0)) as `2023-05-05`
  from advanced.orders
 group by user_id
 order by user_id

4)

WITH base AS (
  SELECT
    user_id,
    event_date,
    event_name,
    user_pseudo_id,
    event_param.key AS key,
    event_param.value.string_value AS string_value,
    event_param.value.int_value AS int_value
  FROM advanced.app_logs
  CROSS JOIN UNNEST(event_params) AS event_param
)

SELECT
  user_id,
  event_date,
  event_name,
  user_pseudo_id,
  MAX(IF(key = 'firebase_screen', string_value, NULL)) AS firebase_screen,
  MAX(IF(key = 'food_id', int_value, NULL)) AS food_id,
  MAX(IF(key = 'session_id', string_value, NULL)) AS session_id
FROM base
WHERE 
 user_id = 32888
 AND event_name = 'click_cart'
GROUP BY ALL

3. 퍼널

1)

with base as
(
  select event_date     
       , event_name
       , event_timestamp     
       , user_id
       , user_pseudo_id
       , platform
       , max(if(param.key = 'firebase_screen', param.value.string_value, null)) as firebase_screen
    from advanced.app_logs
       , unnest(event_params) as param
   where event_date between '2022-08-01' and '2022-08-18'
   group by all
),
filter_event as 
(
  select 
    * except(event_name, firebase_screen)
    , concat(event_name, "-", firebase_screen) as event_name_with_screen      
    from base    
  where event_name in ('screen_view', 'click_payment')
)
select event_date
     , event_name_with_screen
     , case when event_name_with_screen = 'screen_view-welcome' then 1
            when event_name_with_screen = 'screen_view-home' then 2
            when event_name_with_screen = 'screen_view-food_category' then 3
            when event_name_with_screen = 'screen_view-restaurant' then 4
            when event_name_with_screen = 'screen_view-cart' then 5
            when event_name_with_screen = 'click_payment-cart' then 6
       else null
       end as step_number  
     , count(distinct user_pseudo_id) as cnt
  from filter_event
 group by all
 having step_number is not null
 order by event_date

2)

with base as
(
  select event_date     
       , event_name
       , event_timestamp     
       , user_id
       , user_pseudo_id
       , platform
       , max(if(param.key = 'firebase_screen', param.value.string_value, null)) as firebase_screen
    from advanced.app_logs
       , unnest(event_params) as param
   where event_date between '2022-08-01' and '2022-08-18'
   group by all
),
filter_event as 
(
  select 
    * except(event_name, firebase_screen)
    , concat(event_name, "-", firebase_screen) as event_name_with_screen      
    from base    
  where event_name in ('screen_view', 'click_payment')
),
daily_group as
(
select event_date
     , event_name_with_screen
     , case when event_name_with_screen = 'screen_view-welcome' then 1
            when event_name_with_screen = 'screen_view-home' then 2
            when event_name_with_screen = 'screen_view-food_category' then 3
            when event_name_with_screen = 'screen_view-restaurant' then 4
            when event_name_with_screen = 'screen_view-cart' then 5
            when event_name_with_screen = 'click_payment-cart' then 6
       else null
       end as step_number  
     , count(distinct user_pseudo_id) as cnt
  from filter_event
 group by all
 having step_number is not null
 order by event_date
)
select event_date
     , max(if(event_name_with_screen = 'screen_view-welcome',cnt,null)) as screen_view_welcome
     , max(if(event_name_with_screen = 'screen_view-home',cnt,null)) as screen_view_home
     , max(if(event_name_with_screen = 'screen_view-food_category',cnt,null)) as screen_view_food_category
     , max(if(event_name_with_screen = 'screen_view-restaurant',cnt,null)) as screen_view_restaurant
     , max(if(event_name_with_screen = 'screen_view-cart',cnt,null)) as screen_view_cart
     , max(if(event_name_with_screen = 'click_payment-cart',cnt,null)) as click_payment_cart
  from daily_group
 group by all
 order by event_date

 

답변 1

0

안녕하세요, 인프런 AI 인턴입니다. 이조현님.

SQL을 활용하여 다양한 분석을 시도하는 것은 매우 유용한 스킬입니다. 특히 BigQuery에서 ARRAY, STRUCT, PIVOT 및 퍼널 쿼리를 다루는 것은 데이터 분석의 기본이 되죠. 각 쿼리에 대해 간단히 설명드리겠습니다.

  1. ARRAY, STRUCT

    • ARRAY와 STRUCT 타입을 활용하여 데이터를 처리하는 데 도움이 되는 인사이트를 얻기 위해 쿼리를 사용하는 예제입니다. UNNEST를 통해 배열의 각 요소를 개별 행으로 확장하여 다양한 데이터를 병합할 수 있습니다.
  2. PIVOT

    • PIVOT 쿼리는 데이터의 특정 측면을 기준으로 데이터를 전환하여 더 간편한 형태로 나타내는 데 사용됩니다. 예를 들어, 날짜별 사용자 주문을 다양한 유저 ID에 맞춰 피벗팅하여 시각화하는 과정은 비즈니스 인사이트 도출에 유용합니다.
  3. 퍼널 분석

    • 퍼널 쿼리는 사용자 행동 흐름을 파악하는 데 중요한 역할을 하며, 사용자의 단계별 진행 상황을 확인할 수 있습니다. 퍼널 내에서 사용자의 경로를 추적하고 어떤 단계에서 이탈이 발생하는지를 파악할 수 있습니다.

각 쿼리에 대해서는 다음의 유사 질문을 참조하시면 더 깊이 있는 이해를 얻을 수 있습니다:

저는 질문자님의 이해를 돕고자 지식공유자님의 기존 답변을 참고하여 유사한 답변 링크를 추천해드리고 있습니다.
현재 베타 기능으로 답변이 만족스럽지 않을 수 있는 점 양해 부탁드립니다. 🙏
추가적으로 궁금한 점이 있으시면, 이어서 질문해 주세요. 곧 지식공유자께서 답변해 주실 것입니다.

이조현님의 프로필 이미지
이조현

작성한 질문수

질문하기