Yahoo Finance에서 URL 가져오기

파이썬 증권 데이터 수집과 분석으로 신호와 소음 찾기

[3/8] 수집하고자 URL로 데이터를 가져올 수 없다면? requests사용하기

작성

530

이 강의를 듣고 나면 네이버금융 뿐만 아니라 다른 웹 스크래핑도 가능할 것이라 하셨는데, 처음부터 너무 막히니 속상하네요. 스크립트는 다음과 같습니다.

import pandas as pd
from bs4 import BeautifulSoup as bs

url = f"https://finance.yahoo.com/quote/YM%3DF/history?p=YM%3DF"
table = pd.read_html(url)
response = requests.get(url, headers = headers)
html = bs(response.text)
table = html.select("table")
temp = pd.read_html(str(table))
temp[0]

여기서 url부터 인식을 못하고 HTTPError가 뜹니다. 해결방법이 있을까요?

HTTPError: HTTP Error 404: Not Found

url numpy 스크래핑 python pandas 웹 크롤링 seaborn 웹 스크래핑 plotly matplotlib

답변 1

박조은

지식공유자

import pandas as pd
from bs4 import BeautifulSoup as bs

url = f"https://finance.yahoo.com/quote/YM%3DF/history?p=YM%3DF"
table = pd.read_html(url) <= 이 부분 제거가 필요합니다.
response = requests.get(url, headers = headers)
html = bs(response.text)
table = html.select("table")
temp = pd.read_html(str(table))
temp[0]

1) headers 가 있는데 다른곳에서 선언해 준것인가요?

2) 오류가 나는 문장이 중간에 있으니 한 줄 한 줄 해보세요.

import requests
import pandas as pd
from bs4 import BeautifulSoup as bs

headers = {"user-agent": "Mozilla/5.0"}
url = f"https://finance.yahoo.com/quote/YM%3DF/history?p=YM%3DF"

response = requests.get(url, headers = headers)
html = bs(response.text)
table = html.select("table")
temp = pd.read_html(str(table))
temp[0]

잘 가져와집니다.

hmin

질문자

감사합니다!! requests는 import 하는 것 한줄로 되는지 몰랐네요 ㅠ

인프런 커뮤니티 질문&답변

Yahoo Finance에서 URL 가져오기