Papa Smurf's Blog

Wednesday, June 29, 2016

Python - A demo of implementing the kNN algorithm

k-Nearest Neighbors (kNN) is one of the simplest predictive models.

Import the modules as follows.

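from __future__ import division, unicode_literals
from collections import Counter
import math, random
import matplotlib.pyplot as plt
# linear_algebra and plot_state_borders are assumed to be local helper
# modules sitting next to this script, not pip packages
from linear_algebra import distance
from statistics import mean
from plot_state_borders import plot_state_borders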

Using a small sample data set, the following script plots the preferred language of each city.

cities = [(-86.75,33.5666666666667,'Python'),(-88.25,30.6833333333333,'Python'),(-112.016666666667,33.4333333333333,'Java'),(-110.933333333333,32.1166666666667,'Java'),(-92.2333333333333,34.7333333333333,'R'),(-121.95,37.7,'R')]

# reshape each record into ([longitude, latitude], language)
cities = [([longitude, latitude], language) for longitude, latitude, language in cities]

def plot_cities():
    # key is language, value is pair (longitudes, latitudes)
    plots = { "Java" : ([], []), "Python" : ([], []), "R" : ([], []) }

    # we want each language to have a different marker and color
    markers = { "Java" : "o", "Python" : "s", "R" : "^" }
    colors = { "Java" : "r", "Python" : "b", "R" : "g" }

    for (longitude, latitude), language in cities:
        plots[language][0].append(longitude)
        plots[language][1].append(latitude)

    # create a scatter series for each language
    # (items() works on both Python 2 and 3; iteritems() is Python 2 only)
    for language, (x, y) in plots.items():
        plt.scatter(x, y, color=colors[language], marker=markers[language],
                    label=language, zorder=10)

    plot_state_borders() # assume we have a function that does this
    plt.legend(loc=0) # let matplotlib choose the location
    plt.axis([-130,-60,20,55]) # set the axes
    plt.title("Very popular language")
    plt.show()

plot_cities()
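
The script above only draws the cities. As a rough sketch of the kNN idea itself (not necessarily how the original demo classified points), the code below labels a new point by letting its k nearest cities vote. The names euclidean, majority_vote and knn_classify, as well as the query point, are my own assumptions for illustration; the six sample cities are the ones listed above.

```python
import math
from collections import Counter

def euclidean(p, q):
    """Straight-line distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def majority_vote(labels):
    """labels are ordered nearest to farthest.
    On a tie, drop the farthest label and vote again."""
    counts = Counter(labels)
    winner, winner_count = counts.most_common(1)[0]
    num_winners = sum(1 for c in counts.values() if c == winner_count)
    if num_winners == 1:
        return winner
    return majority_vote(labels[:-1])

def knn_classify(k, labeled_points, new_point):
    """labeled_points is a list of (point, label) pairs."""
    by_distance = sorted(labeled_points,
                         key=lambda pair: euclidean(pair[0], new_point))
    k_nearest_labels = [label for _, label in by_distance[:k]]
    return majority_vote(k_nearest_labels)

# same six cities as in the post, already in ([longitude, latitude], language) form
sample = [([-86.75, 33.5666666666667], 'Python'),
          ([-88.25, 30.6833333333333], 'Python'),
          ([-112.016666666667, 33.4333333333333], 'Java'),
          ([-110.933333333333, 32.1166666666667], 'Java'),
          ([-92.2333333333333, 34.7333333333333], 'R'),
          ([-121.95, 37.7], 'R')]

# hypothetical query point close to the two 'Python' cities
prediction = knn_classify(3, sample, [-87.0, 32.0])
print(prediction)
```

With k = 3 the two nearby Python cities outvote the single R city, so the query point is labeled Python. Dropping the farthest neighbor on a tie is only one of several reasonable tie-breaking rules.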

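Python - A demo that crawls data from the web, analyzes it, and plots the result

After installing the Anaconda distribution, install these additional packages:
pip install beautifulsoup4
pip install html5lib
pip install requests
Then run the script below to see how the number of data-related books has changed over the years.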

from bs4 import BeautifulSoup
import requests
from time import sleep
from collections import Counter
import re

def is_video(td):
    # a video entry has exactly one pricelabel, and its text starts with "video"
    pricelabels = td('span', 'pricelabel')
    return (len(pricelabels) == 1 and pricelabels[0].text.strip().startswith("video"))

def book_info(td):
    """given a BeautifulSoup <td> Tag representing a book,
    extract the book's details and return a dict"""
    title = td.find("div", "thumbheader").a.text
    by_author = td.find('div', 'AuthorName').text
    authors = [x.strip() for x in re.sub("^By ", "", by_author).split(",")]
    isbn_link = td.find("div", "thumbheader").a.get("href")
    # raw string so that \. is read as a regex escape, not a string escape
    isbn = re.match(r"/product/(.*)\.do", isbn_link).groups()[0]
    date = td.find("span", "directorydate").text.strip()

    return {
        "title" : title,
        "authors" : authors,
        "isbn" : isbn,
        "date" : date
    }

base_url = "http://shop.oreilly.com/category/browse-subjects/" + \
           "data.do?sortby=publicationDate&page="
books = []
NUM_PAGES = 31

for page_num in range(1, NUM_PAGES + 1):
    print("souping page", page_num, ",", len(books), " found so far")
    url = base_url + str(page_num)
    soup = BeautifulSoup(requests.get(url).text, "html5lib")
    for td in soup("td", "thumbtext"):
        if not is_video(td):
            books.append(book_info(td))

import matplotlib.pyplot as plt

def get_year(book):
    return int(book["date"].split()[1])

year_counts = Counter(get_year(book) for book in books if get_year(book) <= 2014)
years = sorted(year_counts)
book_counts = [year_counts[year] for year in years]
plt.bar([x - 0.5 for x in years], book_counts)
plt.xlabel("year")
plt.ylabel("# of data books")
plt.title("Data is Big!")
plt.show()
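
The two parsing steps in the script (pulling the ISBN out of the link and the year out of the date text) can be tried without hitting the web. The following is a minimal sketch on made-up inputs: extract_isbn is a hypothetical helper name, the href and the dates are invented sample values, and the "Month YYYY" date format is an assumption inferred from how get_year splits the text.

```python
import re
from collections import Counter

def extract_isbn(href):
    """Return the part between /product/ and .do, or None if the link does not match."""
    match = re.match(r"/product/(.*)\.do", href)
    return match.group(1) if match else None

def get_year(date_text):
    """Assumes text shaped like 'June 2014' (month, then year)."""
    return int(date_text.split()[1])

# invented sample values, not real scraped data
sample_href = "/product/0000000000000.do"
sample_dates = ["June 2014", "May 2014", "January 2013", "March 2015"]

isbn = extract_isbn(sample_href)

# same filter as in the script: ignore anything after 2014
year_counts = Counter(get_year(d) for d in sample_dates if get_year(d) <= 2014)
years = sorted(year_counts)
book_counts = [year_counts[year] for year in years]

print(isbn, years, book_counts)
```

Unlike the script above, this helper returns None when the link does not match instead of raising an error, which may be easier to handle when a page contains an unexpected link.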

Counting the words in a file with Python

The input is the King James Version of the English Bible. The code is as follows.

# most_common_words.py
import sys
from collections import Counter

if __name__ == "__main__":
    # the first argument is the number of words to print
    try:
        num_words = int(sys.argv[1])
    except (IndexError, ValueError):
        print("usage: most_common_words.py num_words")
        sys.exit(1)

    # count lowercased words read from standard input
    counter = Counter(word.lower()
                      for line in sys.stdin
                      for word in line.strip().split()
                      if word)

    for word, count in counter.most_common(num_words):
        sys.stdout.write(str(count))
        sys.stdout.write("\t")
        sys.stdout.write(word)
        sys.stdout.write("\n")

Run it as follows.

C:\work>type the_bible.txt | python most_common_words.py 20
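
To try the counting logic without the text file, the same Counter expression can be fed an in-memory stream instead of sys.stdin. This is only a sketch: most_common_words as a function name and the two sample lines are my own assumptions, not part of the original script.

```python
import io
from collections import Counter

def most_common_words(stream, num_words):
    """Count lowercased, whitespace-separated words from any line iterator."""
    counter = Counter(word.lower()
                      for line in stream
                      for word in line.strip().split()
                      if word)
    return counter.most_common(num_words)

# io.StringIO stands in for sys.stdin; the text is a made-up sample
sample_text = io.StringIO("In the beginning\nThe beginning of the end\n")
top = most_common_words(sample_text, 2)

for word, count in top:
    print(str(count) + "\t" + word)
```

Because words are split on whitespace only, punctuation stays attached to a word ("God," and "God" are counted separately), which is worth keeping in mind when reading the real output.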

Tuesday, June 28, 2016

WWDC 2016 developer video - What's New in Foundation for Swift

https://developer.apple.com/videos/play/wwdc2016/207/

There is a serious amount to study. Haha

Monday, June 27, 2016

WWDC 2016 developer video - What's New in Auto Layout

https://developer.apple.com/videos/play/wwdc2016/236/

This video does not have subtitles yet. If you have used Auto Layout before, it is not hard to follow. ^^

WWDC 2016 SiriKit developer video

https://developer.apple.com/videos/play/wwdc2016/225/

It explains how to integrate SiriKit while building the UnicornChat app.
The content is not difficult, though it is not much fun either.

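I am using the Xcode 8.0 and iOS 10.0 betas.

They still feel unstable.
Still, this is what we will all be using by October, so I am trying it out in advance. Haha
For a beta 1, I have been using it without any major problems so far. Swiping left or right brings up the new widget screen.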

Learning Xcode 8.0 beta 1 and Swift 3.0

https://developer.apple.com/videos/play/wwdc2016/402/

Subtitles have now been added to the WWDC 2016 developer videos, so I am watching them. As expected, a great deal has been changed or added. I strongly recommend watching this one. ^^

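Monday, June 13, 2016

WWDC 2016 has opened.

WWDC, held every June, has opened again this year.
There do not seem to be other big changes, but a large number of APIs have been opened up.

OS X has been renamed macOS, the Siri API has been opened to third-party app developers, and so on.
Swift 3.0 is something I will need to study and write up.

Swift Playgrounds is coming to the iPad.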

https://www.apple.com/education/everyone-can-code/

Renamed to the new macOS!

The long-awaited Swift 3.0 announcement

Visual Studio "15" Preview 2 has been released.

I had only used versions up to Visual Studio 2015, but version 15 has been released separately.

https://www.visualstudio.com/en-us/downloads/visual-studio-next-downloads-vs

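The installer is lightweight and lets you install only the parts you need. I chose the basic desktop option for the setup.

I will need to use it more before I can really judge it.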

Thursday, June 9, 2016

Warcraft introduction video

Video with subtitles