Naver Now-Showing Movie Rankings
In [1]:
%%html
<style type='text/css'>
.CodeMirror{ font-size: 14px; font-family: callable}
</style>
In [2]:
# Libraries
import requests
from bs4 import BeautifulSoup
from datetime import date, timedelta
In [3]:
# Date (yesterday, formatted as YYYYMMDD)
yesterday = date.today() - timedelta(1)
time = yesterday.strftime('%Y%m%d')
# URL
url = 'https://movie.naver.com/movie/sdb/rank/rmovie.nhn'
#params = {'sel':'cnt', 'date':time}  # by view count
params = {'sel':'cur', 'date':time}  # by rating (now showing)
#params = {'sel':'pnt', 'date':time}  # by rating (all movies)
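To check which URL these parameters produce without sending a request, the query string can be assembled with the standard library. This is only a minimal sketch: `build_rank_url` and `BASE_URL` are hypothetical names that do not appear in the notebook, and the sketch assumes the same `sel` and `date` parameters as above.

```python
from datetime import date, timedelta
from urllib.parse import urlencode

BASE_URL = 'https://movie.naver.com/movie/sdb/rank/rmovie.nhn'

def build_rank_url(sel, day):
    """Return the ranking URL for a sort key and a date (hypothetical helper)."""
    # The date is sent as YYYYMMDD, the same format the notebook uses
    params = {'sel': sel, 'date': day.strftime('%Y%m%d')}
    return BASE_URL + '?' + urlencode(params)

yesterday = date.today() - timedelta(days=1)
print(build_rank_url('cur', yesterday))
```

Passing the date in as an argument, instead of reading `date.today()` inside the helper, makes it easy to rebuild the URL for any past day.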
In [4]:
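# Send the GET request
response = requests.get(url, params=params)
status_code = response.status_code
print(status_code)
if status_code == 200:
    text = response.text
else:
    # Stop here: the later cells need the page HTML in `text`
    raise RuntimeError(f'request failed with status {status_code}')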
In [5]:
# str ==> BeautifulSoup (name the parser explicitly to avoid the bs4 parser warning)
soup = BeautifulSoup(text, 'html.parser')
In [6]:
# The whole ranking table
movie_all = soup.select_one('.list_ranking')
#movie_all
In [7]:
# Rank
img_one = movie_all.select_one('img')  # first match only
img_all = movie_all.select('img')  # all matches
In [8]:
# Title
a_one = movie_all.select_one('a')  # first match only
a_all = movie_all.select('a')  # all matches
In [9]:
# Rating
point_one = movie_all.select_one('td.point')
point_all = movie_all.select('td.point')
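The cells above rely on the difference between `select_one` and `select`. The sketch below shows that difference on a small, made-up HTML fragment. The markup is an assumption for illustration only and is not copied from the Naver page.

```python
from bs4 import BeautifulSoup

# Made-up markup for illustration; the real page layout may differ.
sample_html = """
<table class="list_ranking">
  <tr><td><a href="#" title="Movie A">Movie A</a></td><td class="point">9.10</td></tr>
  <tr><td><a href="#" title="Movie B">Movie B</a></td><td class="point">8.75</td></tr>
</table>
"""

sample = BeautifulSoup(sample_html, 'html.parser')
table = sample.select_one('.list_ranking')

first_point = table.select_one('td.point')  # a single Tag, or None if nothing matches
all_points = table.select('td.point')       # a list of Tags, empty if nothing matches

print(first_point.text)
print([p.text for p in all_points])
print(table.select_one('td.missing'))
```

Because `select_one` returns `None` when nothing matches, a check such as `if tag:` is needed before reading `.text` from its result.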
In [10]:
# Collect the ranks in a list
rank_list = []
for item in img_all:
    alt = item['alt']
    # A three-character alt text is treated as rank 10
    if len(alt) == 3:
        alt = '10'
    rank_list.append(alt)
# Keep every third <img>; the others in each row are not rank icons
rank_list = rank_list[0::3]
rank_list
Out[10]:
In [11]:
# Collect the titles in a list
movie_title_one = a_one.text
title_list = []
for item in a_all:
    movie_title_one = item.text
    title_list.append(movie_title_one)
# Keep every second <a>; the other link in each row is not a title
title_list = title_list[0::2]
title_list
Out[11]:
In [12]:
# Collect the ratings in a list
movie_point_one = point_one.text
point_list = []
for item in point_all:
    movie_point_one = item.text
    point_list.append(movie_point_one)
point_list
Out[12]:
In [13]:
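# Combine rank, title, and rating into one list of dicts
movie_rank_list = []
# Stop at the shortest list rather than assuming exactly 50 rows
for i in range(min(len(rank_list), len(title_list), len(point_list))):
    movie_rank_dict = dict()
    movie_rank_dict['rank'] = rank_list[i]
    movie_rank_dict['title'] = title_list[i]
    movie_rank_dict['rating'] = point_list[i]
    movie_rank_list.append(movie_rank_dict)
print(time)
movie_rank_list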
Out[13]:
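The index-based loop can also be written with `zip`, which pairs the three lists element by element and stops at the shortest one. The values below are made up to stand in for the scraped lists, and the English dictionary keys are an assumption for illustration.

```python
# Made-up values standing in for the scraped lists
rank_list = ['01', '02', '03']
title_list = ['Movie A', 'Movie B', 'Movie C']
point_list = ['9.10', '8.75']  # deliberately one item short

# zip stops at the shortest input, so no index can run out of range
movie_rank_list = [
    {'rank': rank, 'title': title, 'rating': point}
    for rank, title, point in zip(rank_list, title_list, point_list)
]
print(len(movie_rank_list))
print(movie_rank_list)
```

The trade-off is that a missing element is dropped silently, so comparing the three lengths first is worthwhile when a mismatch would indicate a parsing problem.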
In [14]:
# A more efficient approach: read the title and rating row by row
movie_table = soup.select_one('.list_ranking')
movie_tr_all = movie_table.select('tr')
# Inspect a single row first
movie_tr_one = movie_tr_all[2]
movie_title = movie_tr_one.select_one('a[title]').text
movie_point = movie_tr_one.select_one('td.point').text
movie_title_point_list = []
for item in movie_tr_all:
    movie_title = item.select_one('a[title]')
    movie_point = item.select_one('td.point')
    # Skip rows that lack a title link or a rating cell
    if not movie_title or not movie_point:
        continue
    movie_title_point_list.append((movie_title.text, movie_point.text))
movie_title_point_list
Out[14]:
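The row-by-row approach can be tried offline against a small fragment. The sketch below is not the notebook's code: `extract_rows` is a hypothetical helper, the HTML is made up, and converting the rating to `float` is an added assumption about how the values would be used.

```python
from bs4 import BeautifulSoup

# Made-up markup: a header row, a spacer row, and two movie rows
sample_html = """
<table class="list_ranking">
  <tr><th>Rank</th><th>Title</th><th>Rating</th></tr>
  <tr><td colspan="3" class="blank"></td></tr>
  <tr><td><a href="#" title="Movie A">Movie A</a></td><td class="point">9.10</td></tr>
  <tr><td><a href="#" title="Movie B">Movie B</a></td><td class="point">8.75</td></tr>
</table>
"""

def extract_rows(html):
    """Return (title, rating) tuples for rows that contain both (hypothetical helper)."""
    soup = BeautifulSoup(html, 'html.parser')
    rows = []
    for tr in soup.select('.list_ranking tr'):
        title = tr.select_one('a[title]')
        point = tr.select_one('td.point')
        # Header and spacer rows have neither element, so they are skipped
        if title is None or point is None:
            continue
        rows.append((title.text.strip(), float(point.text)))
    return rows

print(extract_rows(sample_html))
```

Reading both fields from the same `tr` keeps each title paired with its own rating, which the separate-list approach can only do by slicing with fixed strides.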