Can the GROUP BY inside the subquery be omitted in a correlated subquery?

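Hello.

While working through the course exercises, I ran into something I would like to ask about.

In the query that finds the most recent salary record with a correlated subquery, the GROUP BY inside the subquery appears to have been omitted.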

select * from hr.emp_salary_hist a where todate = (select max(todate) from hr.emp_salary_hist x where a.empno = x.empno);

Is it because of where a.empno = x.empno inside the subquery

that max(todate) is returned per employee even without a group by?

If so, can the group by also be omitted in the query below?

-- Customers who have placed 2 or more orders

select * from nw.customers a
where exists (select 1 from nw.orders x where x.customer_id = a.customer_id
              group by customer_id having count(*) >= 2);

Hello,

select * from hr.emp_salary_hist a where todate = (select max(todate) from hr.emp_salary_hist x where a.empno = x.empno);

Is it because of where a.empno = x.empno inside the subquery

that max(todate) is returned per employee even without a group by?

=> Yes, that's right. And rather than the group by merely being optional here, leaving it out can actually make the SQL clearer. An aggregate such as max() does not necessarily require a group by; aggregate functions can also be applied to the whole row set.

In other words, consider the subquery (select max(todate) from hr.emp_salary_hist x where a.empno = x.empno).

Without max(todate), the condition where a.empno = x.empno could return several rows for one employee, but max(todate) reduces them to a single value: the most recent todate. There is no need to add group by x.empno.
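
To make this concrete, here is a minimal sketch in PostgreSQL-style syntax. The CTE named emp_salary_hist is a made-up stand-in for hr.emp_salary_hist: only the empno and todate columns come from the query above, while the sal column and all sample rows are invented for illustration.

```sql
-- Hypothetical sample data standing in for hr.emp_salary_hist.
-- empno and todate appear in the original query; sal is an assumed column.
with emp_salary_hist (empno, todate, sal) as (
    values (7369, date '2020-12-31', 800),
           (7369, date '2021-12-31', 900),
           (7499, date '2021-06-30', 1600)
)
select *
from emp_salary_hist a
where a.todate = (
    -- No GROUP BY here: the correlation predicate already restricts x
    -- to the current employee, and max() collapses those rows into one value.
    select max(x.todate)
    from emp_salary_hist x
    where x.empno = a.empno
)
order by a.empno;
```

With this sample data the query should return one row per employee: the 2021-12-31 row for empno 7369 and the 2021-06-30 row for empno 7499.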

The query below is different, however.

select * from nw.customers a
where exists (select 1 from nw.orders x where x.customer_id = a.customer_id
              group by customer_id having count(*) >= 2);

The subquery explicitly groups by customer_id and keeps only the customer_ids whose count(*) is 2 or more. The condition where x.customer_id = a.customer_id on its own can still match customer_ids that have fewer than 2 orders.

So the GROUP BY should not be removed from the SQL above.
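
A small sketch of the same pattern, again in PostgreSQL-style syntax. The customers and orders CTEs are hypothetical stand-ins for nw.customers and nw.orders: customer_id comes from the query above, while order_id and every sample value are assumptions made for illustration.

```sql
-- Hypothetical sample data standing in for nw.customers and nw.orders.
with customers (customer_id) as (
    values ('ALFKI'), ('ANATR'), ('ANTON')
),
orders (order_id, customer_id) as (
    values (1, 'ALFKI'),   -- ALFKI has two orders
           (2, 'ALFKI'),
           (3, 'ANATR')    -- ANATR has one order, ANTON has none
)
select *
from customers a
where exists (
    select 1
    from orders x
    where x.customer_id = a.customer_id
    group by x.customer_id      -- one group per matching customer
    having count(*) >= 2        -- keep the group only if it has 2+ orders
)
order by a.customer_id;
```

With this sample data only ALFKI should be returned. One caveat, stated as my understanding of PostgreSQL rather than a general rule: a HAVING clause with no GROUP BY is documented to treat the filtered rows as a single group, so the ungrouped form may well return the same customers on that engine. Keeping the explicit GROUP BY makes the per-customer intent visible and avoids relying on that behavior.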

Thank you.

Inflearn Community Q&A