Python Redis [How to]: Redis performance with Python 2.6.5 and PyPy 1.9


Redis performance with Python 2.6.5 and PyPy 1.9

I have read that PyPy is faster than standard Python. But how do the two compare when they are used with Redis? Let’s try it.

Redis is blazing fast, and that’s why I love it. Somebody said, “Time is money”, so faster means more money 😀. I ran this test on my laptop, an Acer 4736Z with an Intel dual-core T4200 (2 GHz) and 1 GB of RAM.

The Tests

The test is simple: I measure how long it takes to insert values into a Redis set with the SADD command, which automatically rejects duplicate values. I insert about 200k URLs (not all unique) into the set. I use Redis to detect and remove duplicate URLs, and it is very fast.
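To make the duplicate handling concrete, here is a minimal sketch of the counting logic. It assumes a client whose `sadd(key, value)` returns 1 when the value is new and 0 when it is already in the set, which is how SADD behaves for a single value. `InMemorySets` and `insert_unique` are hypothetical names used only for illustration; the stand-in class lets the sketch run without a Redis server.

```python
# Hypothetical stand-in for a Redis server, so the sketch runs without one.
class InMemorySets(object):
    def __init__(self):
        self._sets = {}

    def sadd(self, key, value):
        """Mimic SADD for one value: return 1 if newly added, else 0."""
        members = self._sets.setdefault(key, set())
        if value in members:
            return 0
        members.add(value)
        return 1


def insert_unique(client, key, urls):
    """Add every URL to the set at `key`; return (newly_added, total_seen)."""
    added = 0
    total = 0
    for url in urls:
        if client.sadd(key, url):
            added += 1
        total += 1
    return added, total


urls = ["http://a.example/1", "http://a.example/2", "http://a.example/1"]
added, total = insert_unique(InMemorySets(), "link:demo", urls)
print("added %d of %d" % (added, total))  # prints "added 2 of 3"
```

With a real server, the same function should work if `client` is a `redis.Redis` instance, since it only relies on `sadd`.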

Here’s the flow:

  1. Select all URLs from the database (I’m using MySQL): select url from table
  2. Loop over those URLs and insert each one into a Redis set key.
  3. Run the same script with both standard Python and PyPy.
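Step 3 comes down to running the same file under both interpreters and comparing the wall-clock time. The sketch below shows one way to do that. The helper `timed` and the placeholder workload `build_set` are assumptions for illustration and are not part of the original script; in the real test the workload would be the Redis insert loop.

```python
import platform
import time


def timed(func, *args):
    """Call func(*args) and return (result, elapsed_seconds)."""
    start = time.time()
    result = func(*args)
    return result, time.time() - start


def build_set(n):
    # Placeholder workload: deduplicate n generated URLs in a local set.
    seen = set()
    for i in range(n):
        seen.add("http://example.com/%d" % (i % 1000))
    return len(seen)


size, elapsed = timed(build_set, 200000)
# python_implementation() reports which interpreter ran the script.
print("%s: %d unique, %.3f s" % (platform.python_implementation(), size, elapsed))
```

Saved under a hypothetical name such as bench.py, it can be run once with `python bench.py` and once with `pypy bench.py`, and the two printed lines compared.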

Code

# Author  : Fajri Abdillah a.k.a clasense4
# Twitter : @clasense4
# mail    : clasense4@gmail.com

# Import the modules used here
import sys
import redis
from datetime import datetime
import MySQLdb

# Connect to the MySQL server
try:
    conn = MySQLdb.connect(
        host="localhost",
        user="root",
        passwd="testroot",
        db="django_crawler"
    )
except MySQLdb.Error as e:
    print "Error %d: %s" % (e.args[0], e.args[1])
    sys.exit(1)

# START TIME
startTime = datetime.now()

# CURSOR DB
CURSOR = conn.cursor()

# Redis Object
R_SERVER = redis.Redis("localhost")

sql = "select link_href from cg_news"

# Create a key
key = "link:rss2"
print "Created Key\t\t : %s" % key

def insert_redis(sql):
    # INPUT  : SQL query that returns one URL per row
    # OUTPUT : none; prints the timings and counters

    # Run the MySQL query and fetch every row
    CURSOR.execute(sql)
    data = CURSOR.fetchall()
    print sql
    print "MySQL execution time\t : %s" % str(datetime.now() - startTime)

    # timer inserting with redis
    start_time_redis = datetime.now()
    counter = 0
    count_data = 0

    for row in data:
        # Each row is a 1-tuple, so take the URL itself.
        # SADD returns 1 only when the URL was not already in the set.
        if R_SERVER.sadd(key, row[0]):
            counter += 1
        count_data += 1

    print "Redis Execution time\t : %s " % (str(datetime.now()-start_time_redis))
    print "Inserted data\t\t : %s" % (counter)
    print "count data\t\t : %s" % (count_data)
    CURSOR.close()        

# Run the benchmark when this file is executed as a script.
if __name__ == '__main__':
    insert_redis(sql)

Result

Conclusion

As the results showed, PyPy is faster than standard Python, but it uses more memory: standard Python used about 50 MB of RAM, while PyPy used about 80 MB (I forgot to capture a screenshot of this part). What do you think? Please share your comments below.
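Since the memory figures were not captured, one way to record them from inside the script is to print the peak memory at the end of the run. This is only a sketch, assuming a Unix-like system where the standard `resource` module is available; `peak_rss_mb` is a hypothetical helper name, and the numbers it reports may not match what a system monitor shows.

```python
import resource
import sys


def peak_rss_mb():
    """Return this process's peak resident set size in megabytes."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux but in bytes on macOS.
    if sys.platform == "darwin":
        return peak / (1024.0 * 1024.0)
    return peak / 1024.0


print("peak memory: %.1f MB" % peak_rss_mb())
```

Calling it as the last line of the benchmark under each interpreter would give two comparable figures.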
