Crawling Chinese Websites with Scrapy

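The items returned by Scrapy are printed using Python's repr(), so Chinese characters are displayed as Unicode escape sequences (for example, u'\u4e2d\u6587') rather than as readable text.

A quick fix is to encode the value as UTF-8: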

item['name'][0].encode('utf-8')

Some characters are still displayed incorrectly, though. Find the source code here.
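
To make the difference concrete, here is a minimal sketch of the escaped form next to the UTF-8 form. The item dict is a hypothetical stand-in for a scraped item and does not come from the actual spider. Only built-in string methods are used, so it should behave the same with or without Scrapy installed.

```python
# Hypothetical item, shaped like the one in the quick fix above.
# The value is the two-character word for "Chinese (language)".
item = {'name': [u'\u4e2d\u6587']}

name = item['name'][0]

# The escaped form is what a repr()-style dump shows: backslash-u sequences
# instead of the characters themselves.
escaped = name.encode('unicode_escape')

# The quick fix: encode the single string (not the enclosing list) as UTF-8.
utf8_bytes = name.encode('utf-8')

# Decoding the bytes gives back the original text, so nothing is lost.
round_trip = utf8_bytes.decode('utf-8')

print(escaped)
print(len(utf8_bytes))     # each of these characters takes 3 bytes in UTF-8
print(round_trip == name)
```

Note that the fix is applied to one element of the list. A list is always displayed through repr() of its elements, so the individual strings have to be picked out before encoding.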

More about Unicode in Python:
- http://evanjones.ca/python-utf8.html
- http://www.b-list.org/weblog/2007/nov/10/unicode/