I keep getting errors. Is there something wrong with my code?

Inflearn Community Q&A

임헌각

Introduction to Programming and Work Automation for Office Workers

Web scraping selenium

from bs4 import BeautifulSoup
import requests
import pandas as pd
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.suppor import expected_conditions as EC
from selenium.webdriver.common.by import By

browser = webdriver.Chrome('chromdriver')
bills = list()

for i in range(1, 2):
    response = requests.get('http://watch.peoplepower21.org/index.php?mid=Euian&show=1&page={}&title=&rec_num=15&lname=&sangim=&bill_result='.format(i))
    html = response.text


    soup = BeautifulSoup(html, 'lxml')
    body = soup.body

    div_ea_list = body.find(id='ea_list')
    table = div_ea_list.table
    tbody = table.tbody

    lines = tbody.find_all('tr')

    for line in lines:
        td_list = line.find_all('td')
        bills.append(
            [td_list[0].text, td_list[1].text, td_list[2].text, td_list[3].text, td_list[4].text]
        )
        bill_url = 'http://watch.peoplepower21.org' +  td_list[1].a.get('href')

        print(bill_url)

        browser.get(bill_url)
        browser.implicitly_wait(5)
        WebDriverWait(browser, 20).until(EC.presence_of_element_located(
                (By.ID, 'collapseTwo')

        ))

        html = browser.page_source
        soup = BeautifulSoup(html, 'lxml')

        body = soup.body
        proposers = body.find(id='collapseTwoe').text.replace('','')

        bills.append(
            [td_list[0].text, td_list[1].text, td_list[2].text, td_list[3].text, td_list[4].text]
        )



df = pd.DataFrame(bills, columns=['제안일', '의안명', '발의자명단','상임위','상태' ])
df.to_excel('bill.xlsx')

browser.quit()

python

2 Answers

임헌각

Original poster

Thank you.

SungYong Lee

Instructor

Hello. Did you manage to solve this problem? I can see a few typos here and there.

On line 6,

from selenium.webdriver.suppor import expected_conditions as EC

the module name should be support, not suppor.

And on line 9,

browser = webdriver.Chrome('chromdriver')

the driver name should be chromedriver, not chromdriver.

Once these two are fixed, you may see the following error message at line 17.

Traceback (most recent call last):
  File "C:/github/python_tutorial_for_salarymen/hk.py", line 17, in <module>
    soup = BeautifulSoup(html, 'lxml')
  File "C:\github\python_tutorial_for_salarymen\venv\lib\site-packages\bs4\__init__.py", line 208, in __init__
    % ",".join(features))
bs4.FeatureNotFound: Couldn't find a tree builder with the features you requested: lxml. Do you need to install a parser library?

You can fix this by running the following command in the terminal:

pip install lxml

The error means that the lxml module is not installed.

Once you get past line 17, there is another typo on line 46.

proposers = body.find(id='collapseTwoe').text.replace('','')

collapseTwoe here looks like a typo, doesn't it? Change it to collapseTwo, the same id that the WebDriverWait call above is waiting for.
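One more detail on that same line: .replace('','') swaps the empty string for the empty string, so it leaves the text exactly as it was. If the intent was to get rid of the line breaks and extra spaces in the scraped text (that is my assumption, the question does not say), something along these lines may be closer. The sample string is made up for illustration and is not real data from the site.

```python
# Made-up sample standing in for the text of the 'collapseTwo' element.
raw = '\n  Hong Gildong \n Kim Cheolsu \n'

# Replacing '' with '' changes nothing, so this is the same string.
unchanged = raw.replace('', '')

# Assumption: the goal is to collapse line breaks and repeated spaces.
# split() with no arguments cuts on any whitespace run, join() rebuilds
# the text with single spaces.
cleaned = ' '.join(raw.split())

print(unchanged == raw)
print(cleaned)
```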

Also, have you installed openpyxl? If not, you may run into a problem when the file is saved. In that case, install it with

pip install openpyxl
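If you would rather find out about missing modules before the script gets as far as line 17 or the to_excel call, a small check like the one below may help. It is only a sketch: missing_modules and PIP_NAMES are names I made up, and the list of modules is simply what the question's code imports, plus lxml and openpyxl.

```python
from importlib.util import find_spec

# Import name -> name to pass to pip (they differ for Beautiful Soup).
PIP_NAMES = {
    'bs4': 'beautifulsoup4',
    'requests': 'requests',
    'pandas': 'pandas',
    'selenium': 'selenium',
    'lxml': 'lxml',
    'openpyxl': 'openpyxl',
}


def missing_modules(names):
    """Return the top-level module names that cannot be found here."""
    return [name for name in names if find_spec(name) is None]


missing = missing_modules(PIP_NAMES)
if missing:
    packages = ' '.join(PIP_NAMES[name] for name in missing)
    print('Install first: pip install ' + packages)
else:
    print('All required modules are installed.')
```

Run it inside the same virtual environment as the scraper, because each environment has its own set of installed packages.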

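If you read each error message and work out where the problem occurred, you will pick up Python much faster. You have made it almost to the end of the course, so keep going until the finish!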
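For reference, here is a rough sketch of how the script could look with the typos above corrected. It goes a little beyond the fixes in this answer, so treat these parts as my assumptions: the question's code appends every bill twice and never uses proposers, and I am guessing that the text from the detail page is meant to fill the '발의자명단' column. I also dropped implicitly_wait, since the explicit WebDriverWait already covers the wait. build_row and scrape are names I introduced, and I have not run this against the live site.

```python
BASE_URL = 'http://watch.peoplepower21.org'
LIST_URL = (BASE_URL + '/index.php?mid=Euian&show=1&page={}'
            '&title=&rec_num=15&lname=&sangim=&bill_result=')
COLUMNS = ['제안일', '의안명', '발의자명단', '상임위', '상태']


def build_row(cells, proposers):
    """Return one output row with the proposers text in the third column.

    Assumption: the detail-page text belongs in the '발의자명단' column.
    """
    return [cells[0], cells[1], proposers, cells[3], cells[4]]


def scrape(pages=range(1, 2)):
    # Third-party imports live here so build_row can be tried on its own.
    import pandas as pd
    import requests
    from bs4 import BeautifulSoup
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    # Assumption: a recent Selenium that locates the driver by itself.
    # With the older version from the course, pass the path as in the
    # answer: webdriver.Chrome('chromedriver').
    browser = webdriver.Chrome()
    bills = []
    try:
        for page in pages:
            html = requests.get(LIST_URL.format(page)).text
            soup = BeautifulSoup(html, 'lxml')
            for line in soup.find(id='ea_list').table.tbody.find_all('tr'):
                td_list = line.find_all('td')
                cells = [td.text for td in td_list[:5]]
                bill_url = BASE_URL + td_list[1].a.get('href')

                browser.get(bill_url)
                # Wait until the proposers panel exists before parsing.
                WebDriverWait(browser, 20).until(
                    EC.presence_of_element_located((By.ID, 'collapseTwo')))
                detail = BeautifulSoup(browser.page_source, 'lxml')
                proposers = detail.find(id='collapseTwo').text.strip()

                bills.append(build_row(cells, proposers))  # once per bill
    finally:
        browser.quit()  # close the browser even if a page fails

    pd.DataFrame(bills, columns=COLUMNS).to_excel('bill.xlsx')
```

Calling scrape() runs the whole thing and writes bill.xlsx. The try/finally is there so that the Chrome window does not stay open when one of the pages raises an error.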
