python - urllib vs. urllib.request in Python3 - Enthought Canopy


I'm getting a strange difference inside Enthought Canopy vs. the command line when trying to load and use urllib and/or urllib.request.

Here's what I mean. I'm running Python 3.5 on macOS 10.11.3. I've tried it on a Windows 10 machine too, and I'm getting the same results. The difference appears to be between using Canopy and using the command line.

I'm trying to do basic screen scraping. Based on my reading, I think I should be doing:

    from urllib.request import urlopen
    html = urlopen("http://pythonscraping.com/pages/page1.html")
    print(html.read())

This works at the command prompt.

But inside Canopy, it does not work. Inside Canopy I get the error

    ImportError: No module named request

when Canopy tries to execute from urllib.request import urlopen.

Inside Canopy, this works:

    import urllib
    html = urllib.urlopen("http://pythonscraping.com/pages/page1.html")
    print(html.read())

I want to understand what is happening, and I don't want my Canopy Python scripts to fail when I run them outside of Canopy. Also, the Canopy approach does not seem consistent with the docs I've read... I just got there by trial & error.

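The urllib.request module exists only in Python 3. The Enthought Canopy distribution still ships with a version of Python 2.7 (2.7.10 as of the current version, 1.6.2), so scripts run inside Canopy use that interpreter rather than the Python 3.5 found at the command line.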
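To confirm which interpreter a given environment is actually using, it can help to print the version and executable path from inside that environment. The following is only a minimal sketch; describe_interpreter is a hypothetical helper name, not something provided by Canopy or the standard library.

```python
import sys


def describe_interpreter():
    """Return a one-line summary of the interpreter running this script.

    Hypothetical helper for illustration; not part of Canopy or the stdlib.
    """
    # sys.version_info starts with (major, minor, micro).
    version = ".".join(str(part) for part in sys.version_info[:3])
    # sys.executable is the path of the Python binary in use.
    return "Python {0} at {1}".format(version, sys.executable)


print(describe_interpreter())
```

Running the same file in Canopy and at the command prompt should make the version mismatch visible if the two environments use different interpreters.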

In Python 2.x, you have the choice of using either urllib or urllib2, both of which expose functions like urlopen at the top level (e.g. urllib.urlopen rather than urllib.request.urlopen).

If you want your scripts to be able to run through either Python 3.x or Enthought Canopy's Python distribution, there are two possible solutions:

  1. Use requests, the recommended library to use for interacting with HTTP in Python. It's a third-party module that you can install using standard pip or easy_install, or from the Canopy package index.

    Your equivalent code would be similar to:

        # Allows use of the print() function inside Python 2.x
        from __future__ import print_function
        import requests

        response = requests.get("http://pythonscraping.com/pages/page1.html")
        print(response.text)

  2. Use conditional importing to bring in the function you need regardless of version. This uses only built-in features of Python and does not require third-party libraries.

    Your code would then be similar to:

        # Allows use of the print() function inside Python 2.x
        from __future__ import print_function
        import sys

        try:
            # Try importing Python 3's urllib.request first.
            from urllib.request import urlopen
        except ImportError:
            # Looks like we're running Python 2.something.
            from urllib import urlopen

        response = urlopen("http://pythonscraping.com/pages/page1.html")

        # urlopen's response object is different based
        # on the Python version.
        if sys.version_info[0] < 3:
            print(response.read())
        else:
            # Python 3's urllib responses return the
            # stream as a byte stream, and it's up to you
            # to properly set the encoding of the stream.
            # This block checks whether the response declares
            # a charset and, if not, defaults to using UTF-8.
            encoding = response.headers.get_content_charset()
            if not encoding:
                encoding = 'utf-8'
            print(response.read().decode(encoding))
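If several scripts need the conditional-import approach, it could be wrapped in one small function so the version check lives in a single place. This is a sketch under assumptions: fetch_text and its default_encoding parameter are hypothetical names introduced here, and the Python 2 branch simply passes the raw str through, as the code above does.

```python
import sys

try:
    # Python 3 keeps urlopen in the urllib.request submodule.
    from urllib.request import urlopen
except ImportError:
    # Python 2 exposes urlopen at the top level of urllib.
    from urllib import urlopen


def fetch_text(url, default_encoding="utf-8"):
    """Fetch url and return its body (hypothetical helper, not part of urllib)."""
    response = urlopen(url)
    try:
        raw = response.read()
        if sys.version_info[0] < 3:
            # Python 2: read() already returns a str, so pass it through.
            return raw
        # Python 3: read() returns bytes. Decode using the charset the
        # response declares, or the default when it declares none.
        encoding = response.headers.get_content_charset() or default_encoding
        return raw.decode(encoding)
    finally:
        # Release the underlying connection or file handle.
        response.close()
```

With this in place, print(fetch_text("http://pythonscraping.com/pages/page1.html")) should work both at the command prompt and inside Canopy. Note that on Python 2 the result is a byte string rather than decoded text.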
