Posts in the 'Python/Python Crawling' category

06. pandas practice 04

import pandas as pd  # data processing
from pandas import DataFrame
from matplotlib import pyplot as plt  # visualization
def korean_font():
    # Korean font support
    plt.rc('font', family='Malgun Gothic')
    # Fix rendering of the minus sign
    plt.rcParams['axes.unicode_minus'] = False
def pandas_basic():
    df = pd.read_csv('test1.csv', sep='/')
    print(df)
    ax = df.plot(kind='bar')
    ax.set_title('학생 성적표', fontsize=16)
    ax.set_xlabel('학생 이름')
    ax.set_ylabel('각과목 점수'..

Python/Python Crawling 2023.03.05
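
The excerpt above reads a '/'-separated file with pd.read_csv. A minimal sketch of that call follows. The contents of test1.csv are not shown in the excerpt, so the column names and scores here are made up, and an in-memory string stands in for the file.

```python
import io

import pandas as pd

# Hypothetical stand-in for test1.csv: fields are separated by '/'.
raw = "name/kor/eng/math\nKim/90/85/70\nLee/80/95/100\n"

# sep='/' tells read_csv to split fields on '/' instead of the default ','.
df = pd.read_csv(io.StringIO(raw), sep="/")

# Using the name column as the index gives a bar chart one group per student
# if df.plot(kind='bar') is called afterwards.
scores = df.set_index("name")
print(scores)
```

Plotting is left out so the sketch runs without a display or a Korean font installed.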

05. pandas 03

import pandas as pd
import requests
import ora_module as db
# pandas, numpy, matplotlib
URL = 'http://kind.krx.co.kr/corpgeneral/corpList.do?' \
      'method=download&searchType=13'
# pd.set_option('display.max_columns', None)
df = pd.read_html(URL)[0]
# print(df)
# print(df.head()); # print(df.tail())
print(df.info())
#print(df.describe())
#print(df['종목코드'].map('{:06d}'.format))
df = df[['회사명', '종목코..

Python/Python Crawling 2023.03.05
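
The commented-out line df['종목코드'].map('{:06d}'.format) in the excerpt above zero-pads numeric stock codes to six digits. A small sketch of that idea follows, using a hand-built DataFrame in place of the downloaded table. The company names and code values are placeholders, not data from the KRX list.

```python
import pandas as pd

# Placeholder rows; the real table comes from pd.read_html(URL)[0].
df = pd.DataFrame({
    "회사명": ["Company A", "Company B"],
    "종목코드": [5930, 660],  # read as integers, so leading zeros are lost
})

# '{:06d}'.format turns each integer into a six-character, zero-padded string.
df["종목코드"] = df["종목코드"].map("{:06d}".format)
print(df)
```

After the mapping the column holds strings rather than integers, which keeps the leading zeros when the data is saved or inserted into a database.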

04. pandas 02

import pandas as pd
from matplotlib import pyplot as plt
from pandas import DataFrame
def korean_font():
    plt.rc('font', family='Malgun Gothic')
    plt.rcParams['axes.unicode_minus'] = False
# scalar: constant, vector: 1-D, matrix: 2-D, tensor: 3-D or higher
# Series: vector, DataFrame: matrix
def serise_pa():
    # Population of each city in 2015
    s = pd.Series([9904312, 3448737, 2890451, 2466052], index=['서울', '부산', '인천', '대구'])
    # s = p..

Python/Python Crawling 2023.03.05
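
The Series in the excerpt above pairs each population value with a city label. A brief sketch of how such a labeled one-dimensional Series can be queried follows. The lookup and the 3,000,000 threshold are illustrative choices, not taken from the original post.

```python
import pandas as pd

# Same values and labels as the excerpt: 2015 population per city.
s = pd.Series([9904312, 3448737, 2890451, 2466052],
              index=["서울", "부산", "인천", "대구"])

# Label-based lookup, like a dictionary key.
seoul = s["서울"]

# Boolean filtering keeps the labels attached to the remaining values.
over_3m = s[s > 3_000_000]

print(seoul)
print(over_3m)
```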

03. pandas 01

import pandas as pd  # data analysis library
from matplotlib import pyplot as plt  # data visualization library
from pandas import DataFrame
Using Korean text in matplotlib (PyCharm setup)
def korean_font():
    plt.rc('font', family='Malgun Gothic')
    plt.rcParams['axes.unicode_minus'] = False  # render the minus sign correctly
Reading a CSV file
CSV fields are separated by commas (,)
def pandas_basic():
    df = pd.read_csv('../DataBaseConnect 데이터베이스 연결/test.csv', sep=',')
    print(df)
    print(type..

Python/Python Crawling 2023.02.24

02. Selenium 02

import time
from selenium import webdriver
from selenium.webdriver.common.by import By
driver = webdriver.Chrome()
driver.get('http://www.daum.net')
time.sleep(2)
input_ele = driver.find_element(By.NAME, 'q')
time.sleep(1)
input_ele.send_keys('빅데이터')
time.sleep(2)
input_ele.submit()
time.sleep(2)
link_info = driver.find_element(By.LINK_TEXT, '빅데이터(Big Data)란 무엇일까?')
time.sleep(2)
link_info.click..

Python/Python Crawling 2023.02.19

02. Selenium 01

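# Data collection - web crawling and scraping
# Build the system -> collect data -> preprocess
# Structured, semi-structured, unstructured (three categories)
# Structured - Excel, DB, CSV
# Semi-structured - html, xml, json
# Unstructured - sound (voice), images, video, etc. (analog data from everyday life)
# selenium
import time
from selenium import webdriver
# Chrome -> Help -> About Chrome -> check the version: 109.0.5414.75
# After the version update, ChromeDriver no longer needs to be installed
from selenium.webdriver.common.by import By
"""
browser = webdriver.Chrome() # obtain the browser information
brows..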

Python/Python Crawling 2023.02.15

01. BeautifulSoup 02

import requests
from bs4 import BeautifulSoup
import re
HTTP protocol (network rules): request (client) -> response (server)
Naver breaking news
URL = 'https://news.naver.com/main/list.nhn'
res = requests.get(URL, headers={'User-Agent':'Mozilla/5.0'})
html = res.text
soup = BeautifulSoup(html, 'html.parser')
print(soup)
for i in soup.select('span[class=lede]'):
    print(i.text.strip())
URL = 'https://news.naver..

Python/Python Crawling 2023.02.13
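
The loop in the excerpt above selects every span whose class is lede and prints its stripped text. A minimal offline sketch of that selector follows. The HTML snippet is invented for illustration and only imitates the shape of a news list, so no network request is made.

```python
from bs4 import BeautifulSoup

# Invented markup standing in for the downloaded news page.
html = """
<ul>
  <li><a href="#">Headline one</a><span class="lede">  First summary.  </span></li>
  <li><a href="#">Headline two</a><span class="lede">Second summary.</span></li>
</ul>
"""

soup = BeautifulSoup(html, "html.parser")

# select() takes a CSS selector; span[class=lede] matches spans whose
# class attribute is exactly "lede". strip() removes surrounding whitespace.
ledes = [tag.text.strip() for tag in soup.select("span[class=lede]")]
print(ledes)
```

The shorter selector span.lede would probably be the more common spelling, but the attribute form is kept here to match the excerpt.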

01. BeautifulSoup 01

BeautifulSoup
A Python library for parsing HTML
Requires installing the bs4 library
from bs4 import BeautifulSoup
import re
import urllib.request
Usage 1
with open('example.html', 'r', encoding="UTF-8") as fp:
    soup = BeautifulSoup(fp, 'html.parser')  # parse the HTML -> loads the whole document
    print(soup)
Usage 2
url = 'http://movie.daum.net/magazine/new'
with urllib.request.urlopen(url) as res:
    html = res.read()
    soup = BeautifulSoup(html, 'html.parse..

Python/Python Crawling 2023.02.12

HicKee

