Getting a strange difference inside Enthought Canopy vs. the command line when trying to load and use urllib and/or urllib.request
Here's what I mean. I'm running Python 3.5 on macOS 10.11.3. I've tried on a Windows 10 machine too, and I'm getting the same results. The difference appears to be between using Canopy and using the command line.
I'm trying to do basic screen scraping. Based on my reading, I think I should be doing:
    from urllib.request import urlopen

    html = urlopen("http://pythonscraping.com/pages/page1.html")
    print(html.read())
This works at the command prompt.
But inside Canopy, it does not work. Inside Canopy I get the error
    ImportError: No module named request
when Canopy tries to execute from urllib.request import urlopen.
Inside Canopy, this works:
    import urllib

    html = urllib.urlopen("http://pythonscraping.com/pages/page1.html")
    print(html.read())
I'd like to understand what is happening, because I don't want my Canopy Python scripts to fail when I run them outside of Canopy. Also, the Canopy approach does not seem consistent with the docs I've read... I just got there by trial and error.
The urllib.request module only exists in Python 3. The Enthought Canopy distribution still ships with a version of Python 2.7 (2.7.10 as of the current version, 1.6.2).
In Python 2.x, you have the choice of using either urllib or urllib2, which expose functions like urlopen at the top level (e.g. urllib.urlopen rather than urllib.request.urlopen).
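To confirm which interpreter an environment is really running, a quick check along these lines may help. This is only a minimal sketch: has_module is a hypothetical helper name, not something from Canopy or the standard library, and it is written to run unchanged on both Python 2.7 and Python 3.

```python
from __future__ import print_function
import importlib
import sys


def has_module(name):
    """Return True if the named module can be imported (hypothetical helper)."""
    try:
        importlib.import_module(name)
    except ImportError:
        return False
    return True


# Show the interpreter version and which urllib variants it offers.
print(sys.version)
print("urllib.request available:", has_module("urllib.request"))
print("urllib2 available:", has_module("urllib2"))
```

Run inside Canopy and again at the command prompt, this should make it obvious if the two are not using the same Python.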
If you want your scripts to be able to run through either Python 3.x or Enthought Canopy's Python distribution, there are two possible solutions:
Use requests - this is generally the recommended library to use for interacting with HTTP in Python. It's a third-party module that you can install using standard pip or easy_install, or from the Canopy Package Index. Your equivalent code would look similar to:
    # Allows use of the print() function inside Python 2.x
    from __future__ import print_function
    import requests

    response = requests.get("http://pythonscraping.com/pages/page1.html")
    print(response.text)
Use conditional importing to bring in the function you need regardless of version. This uses only built-in features of Python and does not require third-party libraries.
Your code would then look similar to:
    # Allows use of the print() function inside Python 2.x
    from __future__ import print_function
    import sys

    try:
        # Try importing Python 3's urllib.request first.
        from urllib.request import urlopen
    except ImportError:
        # Looks like we're running Python 2.something.
        from urllib import urlopen

    response = urlopen("http://pythonscraping.com/pages/page1.html")

    # urllib.urlopen's response object is different based
    # on Python version.
    if sys.version_info[0] < 3:
        print(response.read())
    else:
        # Python 3's urllib responses return the
        # stream as a byte-stream, and it's up to you
        # to properly set the encoding of the stream.
        # This block just checks if the response declares a charset
        # in its Content-Type header and, if not, defaults to utf-8.
        encoding = response.headers.get_content_charset()
        if not encoding:
            encoding = 'utf-8'
        print(response.read().decode(encoding))
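If the version branch starts spreading through a script, the decoding step can be folded into one small function instead. The following is a sketch under assumptions, not the only way to do it: read_text is a hypothetical name, and it checks for get_content_charset (which Python 3 response headers provide) rather than testing sys.version_info. Unlike the code above, it returns decoded text on Python 2 as well.

```python
try:
    # Python 3
    from urllib.request import urlopen
except ImportError:
    # Python 2
    from urllib import urlopen


def read_text(response, default_encoding="utf-8"):
    """Read a urlopen() response and return its body as text.

    Hypothetical helper: uses the charset declared by the server when
    the headers object can report one, else falls back to the default.
    """
    raw = response.read()
    # Python 3 headers offer get_content_charset(); Python 2's do not.
    get_charset = getattr(response.headers, "get_content_charset", None)
    encoding = get_charset() if get_charset else None
    return raw.decode(encoding or default_encoding)
```

It would then be called as print(read_text(urlopen("http://pythonscraping.com/pages/page1.html"))) under either interpreter.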