Recent Works [Python] – Old Kaskus & New Kaskus Data Crawling

This was a small weekend project at my office, neither critical nor difficult. I was asked to scrape Kaskus, an Indonesian forum. Kaskus has moved to a new site with a fancier look, while the old site can still be accessed at old.kaskus.co.id. The goal of the project is to scrape a thread: Scrapy saves the posts, user information, and page information into a MySQL database. There is no GUI, just a command line.

Libraries I used:

  1. Scrapy (crawling and extracting data with XPath)
  2. MySQLdb (library for connecting Python to MySQL)
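
To give a rough idea of the flow (pick out posts with XPath, then insert them as rows), here is a minimal sketch. It is not the project's actual code. So that it runs without any installation, it uses standard-library stand-ins: `xml.etree.ElementTree` instead of Scrapy's selectors and `sqlite3` instead of MySQLdb. The sample markup, the class names, the `posts` table, and the function names `extract_posts` and `save_posts` are all made up for illustration; real Kaskus pages look different and are rarely well-formed enough for ElementTree.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical, simplified thread markup (assumption, not real Kaskus HTML).
SAMPLE_PAGE = """
<html><body>
  <div class="post" id="p1">
    <span class="author">alice</span>
    <div class="content">First post</div>
  </div>
  <div class="post" id="p2">
    <span class="author">bob</span>
    <div class="content">Second post</div>
  </div>
</body></html>
"""


def extract_posts(page):
    """Return one dict per post found in the page markup."""
    root = ET.fromstring(page.strip())
    posts = []
    # XPath: every <div class="post"> anywhere below the root.
    for node in root.findall(".//div[@class='post']"):
        posts.append({
            "post_id": node.get("id"),
            "author": node.findtext("span[@class='author']", default="").strip(),
            "content": node.findtext("div[@class='content']", default="").strip(),
        })
    return posts


def save_posts(conn, posts):
    """Insert the posts with a parameterized query; return the table's row count."""
    cur = conn.cursor()
    # Hypothetical schema (assumption): one row per post.
    cur.execute(
        "CREATE TABLE IF NOT EXISTS posts "
        "(post_id TEXT PRIMARY KEY, author TEXT, content TEXT)"
    )
    # sqlite3 uses "?" placeholders; MySQLdb would use "%s" instead.
    cur.executemany(
        "INSERT INTO posts (post_id, author, content) VALUES (?, ?, ?)",
        [(p["post_id"], p["author"], p["content"]) for p in posts],
    )
    conn.commit()
    cur.execute("SELECT COUNT(*) FROM posts")
    return cur.fetchone()[0]


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    print(save_posts(connection, extract_posts(SAMPLE_PAGE)))
```

In a Scrapy spider the same kind of XPath expression would be passed to `response.xpath(...)`, and the insert step would typically sit in an item pipeline holding a MySQLdb connection.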

And here are some screenshots of it (open in a new tab for higher resolution):
