Using gcloud-python in GAE

2.7k views Asked by At

I've got a bunch of little Raspberry Pis running some python code which saves directly to the Datastore (skips GAE) using the gcloud-python datastore package. This works great. I now want to present the data via web and mobile clients using Google App Engine. On my MacBook I installed GAE using the installer and gcloud via pip. I can write a simple python script and execute it directly from the terminal which is able to write and read from the datastore via gcloud - that also works just fine.

However, when I try to incorporate that same code into GAE it fails. Based on my research, I expect that it is a PATH issue but after several hours of various attempts I am unable to resolve this. Suggestions would be appreciated.

I believe this post is similar to my issue: Google App Engine, Change which python version

Here are some pieces of information which may be relevant:

Python version

$ which python
/usr/bin/python

Per the referenced Stack Overflow issue I set the preferences of GAE to have a Python Path of /usr/bin/python. I have tried it

$PATH

$ echo $PATH
/Users/sheridangray/Projects/adt-bundle-mac-x86_64-20140702/sdk/google-cloud-sdk/bin:/Users/sheridangray/bin:/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/bin:/opt/X11/bin

os.path from Python interpreter

$ python
Python 2.7.6 (default, Sep  9 2014, 15:04:36)
[GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.39)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> os.path
<module 'posixpath' from '/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/posixpath.pyc'>

gcloud installation

$ pip install --upgrade gcloud
Requirement already up-to-date: gcloud in /Library/Python/2.7/site-packages
...

Python 2.7 references

$ sudo find / -name python2.7
Password:
/Applications/Dropbox.app/Contents/Frameworks/Python.framework/Versions/2.7/include/python2.7
/Applications/Dropbox.app/Contents/Frameworks/Python.framework/Versions/2.7/lib/python2.7
/Applications/Dropbox.app/Contents/Resources/include/python2.7
/Applications/Dropbox.app/Contents/Resources/lib/python2.7
/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.10.sdk/System/Library/Frameworks/Python.framework/Versions/2.7/include/python2.7
/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.10.sdk/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7
/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.10.sdk/usr/include/python2.7
/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.10.sdk/usr/lib/python2.7
find: /dev/fd/3: Not a directory
find: /dev/fd/4: Not a directory
/System/Library/Frameworks/Python.framework/Versions/2.7/bin/python2.7
/System/Library/Frameworks/Python.framework/Versions/2.7/include/python2.7
/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7
/usr/bin/python2.7
/usr/lib/python2.7

Relevant Code

from gcloud import datastore

class MainHandler(webapp.RequestHandler):
  def get(self):
     dataset = datastore.get_dataset(dataset_id, email_address, private_key_file)

GAE Log File

*** Running dev_appserver with the following flags:
--skip_sdk_update_check=yes --port=8080 --admin_port=8000
Python command: /usr/bin/python
INFO     2014-11-21 09:02:03,276 devappserver2.py:745] Skipping SDK update check.
INFO     2014-11-21 09:02:03,288 api_server.py:172] Starting API server at: http://localhost:49183
INFO     2014-11-21 09:02:03,292 dispatcher.py:185] Starting module "default" running at: http://localhost:8080
INFO     2014-11-21 09:02:03,294 admin_server.py:118] Starting admin server at: http://localhost:8000
INFO     2014-11-21 09:02:34,319 module.py:709] default: "GET / HTTP/1.1" 200 2
INFO     2014-11-21 09:02:34,455 module.py:709] default: "GET /favicon.ico HTTP/1.1" 200 8348
INFO     2014-11-21 09:02:44,359 module.py:387] Detected file changes:
  /Users/sheridangray/Projects/city-pulse-web/main.py
ERROR    2014-11-21 09:02:46,443 webapp2.py:1552] gcloud
Traceback (most recent call last):
      File "/Users/sheridangray/Projects/adt-bundle-mac-x86_64-20140702/sdk/google-cloud-sdk/platform/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/lib/webapp2-2.5.2/webapp2.py", line 1535, in __call__
rv = self.handle_exception(request, response, e)
  File "/Users/sheridangray/Projects/adt-bundle-mac-x86_64-20140702/sdk/google-cloud-sdk/platform/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/lib/webapp2-2.5.2/webapp2.py", line 1529, in __call__
rv = self.router.dispatch(request, response)
  File "/Users/sheridangray/Projects/adt-bundle-mac-x86_64-20140702/sdk/google-cloud-sdk/platform/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/lib/webapp2-2.5.2/webapp2.py", line 1278, in default_dispatcher
return route.handler_adapter(request, response)
  File "/Users/sheridangray/Projects/adt-bundle-mac-x86_64-20140702/sdk/google-cloud-sdk/platform/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/lib/webapp2-2.5.2/webapp2.py", line 1102, in __call__
return handler.dispatch()
  File "/Users/sheridangray/Projects/adt-bundle-mac-x86_64-20140702/sdk/google-cloud-sdk/platform/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/lib/webapp2-2.5.2/webapp2.py", line 572, in dispatch
return self.handle_exception(e, self.app.debug)
  File "/Users/sheridangray/Projects/adt-bundle-mac-x86_64-20140702/sdk/google-cloud-sdk/platform/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/lib/webapp2-2.5.2/webapp2.py", line 570, in dispatch
return method(*args, **kwargs)
  File "/Users/sheridangray/Projects/city-pulse-web/main.py", line 63, in get
dataset = datastore.get_dataset(dataset_id, email_address, private_key_file)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/gcloud/datastore/__init__.py", line 103, in get_dataset
connection = get_connection(client_email, private_key_path)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/gcloud/datastore/__init__.py", line 65, in get_connection
    from gcloud.datastore.connection import Connection
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/gcloud/datastore/connection.py", line 3, in <module>
    from gcloud import connection
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/gcloud/connection.py", line 8, in <module>
class Connection(object):
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/gcloud/connection.py", line 24, in Connection
USER_AGENT = "gcloud-python/{0}".format(get_distribution('gcloud').version)
  File "/Users/sheridangray/Projects/adt-bundle-mac-x86_64-20140702/sdk/google-cloud-sdk/platform/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/lib/setuptools-0.6c11/pkg_resources.py", line 311, in get_distribution
if isinstance(dist,Requirement): dist = get_provider(dist)
  File "/Users/sheridangray/Projects/adt-bundle-mac-x86_64-20140702/sdk/google-cloud-sdk/platform/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/lib/setuptools-0.6c11/pkg_resources.py", line 197, in get_provider
return working_set.find(moduleOrReq) or require(str(moduleOrReq))[0]
  File "/Users/sheridangray/Projects/adt-bundle-mac-x86_64-20140702/sdk/google-cloud-sdk/platform/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/lib/setuptools-0.6c11/pkg_resources.py", line 666, in require
needed = self.resolve(parse_requirements(requirements))
  File "/Users/sheridangray/Projects/adt-bundle-mac-x86_64-20140702/sdk/google-cloud-sdk/platform/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/lib/setuptools-0.6c11/pkg_resources.py", line 565, in resolve
    raise DistributionNotFound(req)  # XXX put more info here
DistributionNotFound: gcloud
INFO     2014-11-21 09:02:46,456 module.py:709] default: "GET / HTTP/1.1" 500 5010
INFO     2014-11-21 09:02:46,514 module.py:709] default: "GET /favicon.ico HTTP/1.1" 304 -
2

There are 2 answers

3
minou On

GAE on your Mac can't access Python packages installed in the default place on your Mac. You need to do this:

ln -s /Library/Python/2.7/site-packages/.../gcloud /Users/sheridangray/Projects/city-pulse-web/gcloud

(need to replace ... with the relevant path info)

2
bossylobster On

You can run gcloud-python on App Engine, but it requires a little extra work. Check out a skeleton project I wrote which gets this working.

The main bases to cover are:

Getting the dependencies

In install_gcloud.sh, pip is used to install gcloud and its dependencies inside the application using

pip install --target="application/vendor/" gcloud

(As mentioned in the other answer, local installs don't get uploaded to App Engine on deploy.)

Modifying the dependencies

By pip-installing with --target set, pkg_resources.get_distribution will work as expected (it failed in your stack trace).

In addition pytz by default has too many reads to work well on App Engine, so gae-pytz is used instead. As a result, some pytz imports need to be modified.

In addition, to reduce a Compute Engine check overhead (network overhead), the oauth2client.client module can be modified.

Both of these tweaks can be found in a single commit.

Making Imports Work

The script above puts all dependencies in a directory called vendor/ and appengine_config.py adds this to the import path via Darth Vendor:

import darth
darth.vendor('vendor')

In addition, since the protobuf dependency is also part of the google package (just like all the App Engine imports, e.g. google.appengine.ext.ndb) you need to modify the __path__ associated with that package:

import os
import google

curr_dir = os.path.abspath(os.path.dirname(__file__))
vendor_dir = os.path.join(curr_dir, 'vendor')
google.__path__.append(os.path.join(vendor_dir, 'google'))

CAVEAT (as of January 22, 2015)

Be very aware that using gcloud-python within App Engine will be 3-5x slower than using the native App Engine libraries db or ndb. This is because those use direct RPCs into the App Engine runtime while gcloud-python will use HTTP outside of App Engine to talk to the Cloud Datastore API.


NOTE: I updated this after the initial posting, which referenced a previous point in history.