PyCon 2003 "Debugging Zope"

I formulated some techniques for debugging Zope and wrote a paper about it, Conversing With Zope: Interactive Debugging Using the Python Prompt, and presented the paper at the Python 2003 conference. This is a substantial but incomplete copy of the original, preserved here in case the old Zope develpers wiki is retired.

Ken Manheimer, klm@python.org
March, 2003

Introduction

"All parts should go together without forcing. You must remember that the parts you are reassembling were disassembled by you. Therefore, if you can't get them together again, there must be a reason. By all means, do not use a hammer." 
-- IBM maintenance manual, 1925 (via someone on python-dev)

This paper presents some basic techniques for interactively delving into and intervening in the operation of a running Zope application.

Developing and debugging Zope applications can be tricky, particularly since the system usually runs as a daemon process in the background. The system, however, and typical applications built with it, are very much about interactivity, and are good candidates for incremental development. Open availability of the source, support for shared client access (with ZEO), and the fact that its built on a comprehensible, dynamic like Python, provide rich avenues for interactive development, debugging, and exploration - even while Zope is serving normal visitor's requests.

We'll explore practical examples involving sessions independent of other server activity, and also intervention in Zope's service of ongoing requests. Examples will involve interaction with the server at the level of trusted (security-exempt) filesystem-based code and also at the level of through-the-web untrusted (security-regulated) code and web requests.

Appendix A has a complete recipe for setting up a Zope site running under ZEO (on Linux, but you should be able to adapt it to other platforms) so you can play along, poking and prodding your running Zope instance as you read. Working familiarity with Python is a must - see the python.org documentation section for leads.

In addition to debugging opportunites, interacting with the server directly provides immediate, hands-on experience with sophisticated Zope mechanisms like the object database, transactions, acquisition, object publishing, fine-grained security, and so on. With direct experience you can substantiate and solidify your understanding not only of your application but also of Zope, itself.

Here's one final introductory note, about debugging as a development practice. The most powerful development and debugging tool is careful thought. When debugging, it usually helps to have or seek a good conceptual model of the situation, not just poke and prod. The tools and techniques we'll be exploring can help you to discover, test, and act on your model in useful - and sometimes crucial - ways.

Python Prompt vs Intervening On An In-Progress Request

There are two fundamental access modes we'll use:

  • from the python prompt - starting up Python and opening access to the Zope server
  • trapping and triggering a debugging session in the midst of normal service of a network request.

We'll start with the simpler situation, entry from the Python prompt.

For both we'll be using a ZEO-served Zope instance so that you can open a connection to a running storage server while other clients - like a Zope web site or another interactive session - are concurrently interacting with the server. We'll use the Python prompt or a pdb session triggered in the server launch process.

Setup - Establishing a ZEO-based Zope

Appendix A has a complete recipe for setting up Zope running with ZEO, useful for both modes.

The recipe includes a script, izctl.py, for starting the server, with an option to start an interactive prompt instead of the web (and ftp, webdav, ...) server. You can adapt this configuration to an existing Zope/ZEO setup, if you already have one that you want to use, by changing the settings in zope.conf. You can also just emulate what the script does for an existing installation, if you prefer.

Starting From the Python Prompt

Interacting with Zope using the Python prompt gives you maximum latitude. You can navigate the object filesystem, grab objects, and "manipulate" them by calling methods on them directly. (You do not get the effects of the Zope publisher, like conveyance of authorization credentials, unless you explicitly invoke it - which we'll be discussing, later.) By running Zope under ZEO you can get python prompt access without interfering with simultaneous access by others to your site. With it you can interact with your site simultaneously via the network and via the python prompt. We'll be exploiting this in our explorations.

The script izctl.py from the recipe (Appendix A) provides the means to situate yourself in the Python prompt talking to Zope: izctl.py debug. This:

  • ensures that the ZEO server is going on the port indicated in zope.conf, starting the server if necessary
  • sets environment variables INSTANCE_HOME and ZEO_SERVER_PORT
  • starts Python so that your SOFTWARE_HOME/lib/python directory is on the path
  • imports Zope
  • sets app = Zope.app() - Zope.app() returns the application instance, by which we connect to the server.
  • prints an orientation greeting

izctl.py debug leaves you at the python prompt with a connection to the ZEO storage server, in the context of a transaction within which your work will occur. We will use the application instance, assigned to app, to get at objects in the application. Changes we make to those objects will be hidden from other transactions until we commit our transaction. We'll explain this as we proceed.

Looking Around The Root Folder

Once we've invoked izctl.py debug from the shell prompt, we have a hold of the Zope application object, app.

% ./izctl.py debug
Stopping ZEO pid: 17178 17179
sh: line 1: kill: (17178) - No such process
sh: line 1: kill: (17179) - No such process
Starting ZEO server on 4000
waiting for ZEO server to start 1/160 waiting for ZEO server to start 2/160 ZEO server ready.

Zope debugging session for /home/klm/work/zope/instance1 The root application object is bound to name app. To let other people see your changes, you must: get_transaction().commit() To see other people's changes, you must:

>>> app._p_jar.sync()

The Zope application object is the root folder of your Zope site. Among other things it operates as a Zope object manager, containing the top-level object in the site . Those objects are what you would see if you visit the management screen (manage_main) of the site's root through the web.

The Zope2 ObjectManager is a wild and wooly beast. It has evolved from one of the earliest elements of the Zope repertoire, and has accumulated a lot of history along with a lot of functionality - not all of it tidy or clear. Some things may seem less intuitive than they need be, and, in fact, they sometimes are. Zope3 should rectify a lot of that - but we're getting acquainted with Zope2...

Let's take a look at the names of the objects it holds:

>>> app.objectIds()
       ['acl_users', 'Control_Panel', 'temp_folder',
       'browser_id_manager', 'session_data_manager', 'Examples',
       'error_log', 'index_html', 'standard_error_message',
       'standard_html_footer', 'standard_html_header',
       'standard_template.pt']

.objectIds(), is an object manager method which returns the ids of the immediately contained objects, as the list you see. A companion method, .objectItems(), obtains a list of tuples of the objects it contains, (id, contained_object).objectValues() lists just the contained objects.

Some notable entries in the root folder:

  • acl_users - the base user folder, which holds user accounts that encompass the entire site
  • Control_Panel - the site's online control center
  • index_html - the page presented when you visit the root folder, itself, through the web

Actually, all the items are interesting in one way or another, dictating general behavior across the site.

Create a Scratch Area to Play In

Now we'll create a folder and put it in the root - both to demonstrate interacting with the application and to create a place for experimenting without disturbing the site infrastructure:

>>> from OFS.Folder import Folder                          # 1.1
>>> rawscratch = Folder()                                  # 1.2
>>> rawscratch.id = "scratch"                              # 1.3
>>> app._setObject("scratch", rawscratch)                  # 1.4
'scratch'
>>> scratch = app.scratch                                  # 1.5
  • 1.1 OFS is the "Object File System", with the various basic building blocks for the stuff that get put on the site - including folders, which we're using here.
  • 1.2 We create a folder instance
  • 1.3 ... and give it an id, scratch.
  • 1.4 We put the new folder in the root folder, which is the application object.
  • 1.5 And fetch the new scratch folder back from the root folder.

What we achieve in step 1.5, by fetching the thing named scratch, is essentially the folder we added, except for a subtle but important difference. The item we get back is actually an "acquisition wrapper" instance which contains the original folder, and generally emulates its behavior, with one addition: it retains and conveys information about the container (in this case, the app object) from which it was obtained.

The acquisition wrapper does everything it can to make the wrapped object behave similarly to the unwrapped one, so it looks equal but not identical:

>>> scratch == rawscratch
1
>>> scratch is rawscratch
0

and has the same attributes:

>>> rawscratch.objectIds
<bound method Folder.objectIds of <Folder instance at 8b6adf0>>
>>> scratch.objectIds
<bound method Folder.objectIds of <Folder instance at 8b6adf0>>

The acquisition wrapped object, however, conveys the containment context:

>>> app <Application instance at 8b4d640>
>>> scratch.aq_parent
<Application instance at 8b4d640>

while the unwrapped folder does not:

>>> getattr(rawscratch, 'aq_parent', 'no aq_parent')
'no aq_parent'

Given the wrapped object, we can get at the object that contains it, and the container's container, and so on. Acquisition enables you to find and use attributes of containing objects in the contained ones. For instance, we haven't given our scratch folder its own acl_users UserFolder, so it will appear to have the one of the root folder:

>>> scratch.acl_users
<UserFolder instance at 8b4f6b0>
>>> app.acl_users <UserFolder instance at 8b4f6b0>
>>> scratch.acl_users is app.acl_users
1

A note about fetching objects from a folder or other ObjectManager. We used simple attribute access - "app.scratch" - but could also use dictionary-style lookup: "app['scratch']". This alternate method is handy if the contained object has a name that would break Python attribute lookup syntax, eg starting with a number or containing a dot:

>>> app.ColDev.klmcol <Collector 'klmcol' (2 issues) at 0x83cf0b0>
>>> app.ColDev.klmcol.1
File "<stdin>", line 1 app.ColDev.klmcol.1
^ SyntaxError: invalid syntax
>>> app.ColDev.klmcol['1'] <CollectorIssue 1 "test" at 0x84b4078>

The unwrapped version does not include this important context.

The connection to acl_users happens to be particularly important. Zope's security system determines access permissions according to the acl_users object most closely situated to the object, containment-wise. If no acl_users folder is found, then access simply fails.

Note that our direct access to objects via the python-prompt is not regulated by Zope's security system, so we can obtain and examine objects with impunity - we're operating at the level of "trusted code".

As things currently stand, though we've added the scratch folder to the site, if you visit your site through the web (or though a concurrent ZEO python-prompt session) you won't see your scratch folder, because it's not yet visible outside the transaction - we haven't yet committed it. Let's do so:

>>> get_transaction().commit()

That's it. Now, if you visit your site via the web (eg, in management view), you should see the scratch folder in the root manage_main view. (If you used the recipe in Appendix A to create your site, you can log in as user admin with password 123.)

get_transaction is a global function which is added to the Python builtins when you import Zope. When called, it returns the transaction object within which your current actions have effect. .commit() is the method which commits the changes residing in the transaction. There's also an.abort() method, to ditch changes. Both conclude the current transaction and open a new one as the context for subsequent actions.

Neither get_transaction.commit() nor .abort() fetch changes that were made in other transactions while you were operating in yours. When you wish your session to get changes committed elsewhere since you started the current transaction, without committing yours, you would use a method on an object of the application:

The effects of ._p_jar.sync() can be a little beguiling. While it does indeed refresh persistent objects, including ones you've already fetched, it cannot refresh primitive or other Python objects that lack the persistence mixin. You must re-obtain such objects from their persistent container, after a ._p_jar.sync()

>>> app._p_jar.sync()

This tells the app to refresh its view of the ZODB, so any subsequent accesses to objects will ask the ZODB for the current versions as of the synccall.

Zope Runs On Python - and Also Runs Python, With Security Regulation

Zope is not only built on Python, it also is used to execute submitted Python code. This and other features enables Zope applications to delegate control across the network. Zope provides fine-grained security mechanisms by which the developer can designate exactly how such privileges are delegated.

The system generally treats code originating in the local file system, like core code and products, as "trusted": running in and with the implicit unrestricted privileges of the Zope system itself, not subject to the security regulations. (The security mechanisms can be explicitly applied when desired.) Network-instigated operations are subject to security regulation. Such operations occur within the context of a network request attributed to some authenticated user. That user authenticated in the request is associated by configuration with roles and permissions.

Essentially, the local trusted code asserts security declarations which governs all network-initiated actions, from network visits and also from remote-originating, "untrusted" code.

Stepping through evaluation of "untrusted" code involves some significant twists. We're going to start with the basic case of debugging trusted, local-filesystem-based code, and gradually introduce aspects of untrusted code from there.

Use the Python Debugger to Step Through Trusted, Filesystem-Based Code

For this example, we'll add an object to our scratch folder, stepping across it with the debugger. We'll be using trusted code, for demonstration purposes, to create a through-the-web, untrusted code object which will be fodder for later examples concerning untrusted code. We're not aiming at this point for elaborate debugging, just the experience of getting in and stepping through.

(If you use emacs, recent versions of python-mode, if loaded, will detect pdb sessions running in a shell or Python interaction buffer, and pop up the source files and track through them as you proceed with the debugger. Python IDEs like IDLE and Wing offer similar benefits.)

We'll be using the scratch folder we created aboveapp.scratch, and step into the standard mechanism for creating "Script (Python)" Python scripts:

>>> from Products.PythonScripts.PythonScript import manage_addPythonScript
>>> import pdb
>>> pdb.run("manage_addPythonScript(app.scratch, 'visitor_id')")
> /home/klm/work/zope/Zope-2.6.1-linux2-x86/lib/python/<string>(0)?()
(Pdb) s
  # To minimize clutter i'm going to show only selected debugger output...
(Pdb) s
[...]
-> """
  # ... at the end of the function docstring - let's step over stuff ...
(Pdb) n
[...]
-> id = str(id)
(Pdb) n
> [...]
-> id = self._setObject(id, PythonScript(id))
(Pdb) n
> [...]
-> if REQUEST is not None:
(Pdb) n
> [...]
-> return ''
(Pdb) n
--Return--
> [...]
-> return ''
(Pdb) n
--Return--
> [...]
(Pdb) 
  # Don't forget to commit the transaction!
>>> get_transaction().commit()
>>> 

I tracked down the manage_addPythonScript by examining what the ZMI add-product form calls, and unraveling from there until i found the routine that did the actual work. You often can use this approach to track stuff down, starting with something in the web UI which does the thing you want to do.

We've added an empty Python script named visitor_id to the scratch folder. Now lets use the Python prompt to add some code to the script:

>>> visitor_id = app.scratch.visitor_id
>>> text = "return 'hello world'"
>>> visitor_id.ZPythonScript_edit('', text)
>>> get_transaction().commit()

This script does not do very much!

>>> visitor_id()
'hello world'

Since you've 'commit()'ed it, you can visit the script over the web and give it a try. Eg, if you're editing it on the host and using the recipe configuration, it would be at:

http://localhost:8080/scratch/visitor_id/manage_main

You may have to adjust the host and/or port for your installation. When you visit there and hit the Test tab, you should see a mostly blank page with the words hello world. Nothing spectacular - we're just acquainted, at this point, with doing things directly.

In this section we've use pdb with Zope objects just as we can with any Python objects. In the next section we'll explore how to explicitly apply the publisher with options to generate authentication credentials, debug the publi shing process, and so forth.

Engaging the Zope Publisher From the Python Prompt

Up to now we've interacted directly with the Zope objects as python objects, without the intervention of the Zope publisher. The publisher is responsible for capturing and conveying authentication information, crucial for security operation. Sometimes you need to investigate the situation with the publisher engaged, and the Zope module that gets imported in izctl.py debug startup provides a way to do that - Zope.debug().

To demonstrate, first we change the body of the script to report the name of the authenticated user:

>>> text = "return context.REQUEST.AUTHENTICATED_USER.getUserName()"
>>> app.scratch.visitor_id.ZPythonScript_edit('', text)
>>> get_transaction().commit()

Now if you point your web browser at the ZMI view of the visitor_id script and hit the test tab, you should see a mostly blank page with just name of the user, eg admin.

If you try to run the script directly, however, without the intervention of the Zope publisher, you get a traceback:

>>> visitor_id()
Traceback (most recent call last):
[...]
  File "Script (Python)", line 1, in visitor_id
AttributeError: REQUEST
>>>

What's going on? Here's the deal. The REQUEST is an artifact of the Zope publishing process - when you visit an object via the web (or webdav, etc), the publisher handles the traffic, fabricating a REQUEST object which eventually propagates to the invoked method. When we directly use the script from the Python prompt, without the intervention of the publisher, there just isn't any request created. (It's possible to fabricate a request ourselves - for leads, some unit tests in the CMF do that - but the following is an easier and more comprehensive tool for interaction.)

The Zope module (from which we got the app application object) provides controlled access to the publication process with a function,Zope.debug(). Zope.debug() takes options which affect the publishing process.

Unadorned, you pass in what would be the path portion of the URL you would use to visit the method on the web (sans scheme, user, host, and port), and you get back the entire response as a string:

>>> Zope.debug('/scratch/visitor_id')
Status: 200 OK
X-Powered-By: Zope (www.zope.org), Python (www.python.org)
Content-Type: text/plain
Content-Length: 14

Anonymous User
____________________________________________________________

The script returned "Anonymous User" - because the publisher provided the default authentication credentials, for anonymous.

Zope.debug() takes an u option to specify authentication to be used for the request - a string with user and password separated by a : colon:

 >>> Zope.debug('/scratch/visitor_id', u="admin:123")
Status: 200 OK
[...]
admin

The path you pass in actually is just like the path portion of a URL. You can pass parameters using ?x=1&y=2 query strings. Let's change visitor_id to take an optional parameters:

>>> text = """greeting = 'Hello, '
... greeting += context.REQUEST.AUTHENTICATED_USER.getUserName()
... greeting += ", " + message
... return greeting"""
>>> app.scratch.visitor_id.ZPythonScript_edit('message=None', text)
>>> get_transaction().commit()

Now our script takes a message parameter, defaulting to None, and adds that value to the greeting before returning it. (There's a bug when you let message take the default value - we'll explore that bug a little later. For the moment we'll just play with the parameters.)

>>> Zope.debug('/scratch/visitor_id?message=You+rock', u="admin:123")
Status: 200 OK
X-Powered-By: Zope (www.zope.org), Python (www.python.org)
Content-Type: text/plain
Content-Length: 22

Hello, admin, You rock
____________________________________________________________

Now we see the message appended to the string.

Zope.debug() takes options specifically for the purposes of debugging, and we're going to explore them next.

Generally, if you make a call with the publisher that yields an uncaught exception, the publisher itself will catch the exception and report it. This is handy, but "swallows" the exception, so you can't use pdb's post-mortem operation to visit the stack frame where the exception was raised.Zope.debug() has the pm flag just for the purpose of retaining the traceback for post-mortem examination. (Here's where we exploit the bug we introduced into the script, above.):

>>> app.scratch.visitor_id.ZPythonScript_edit('message=None', text)
>>> get_transaction().commit()
  # Zope.debug() with 'pm=1', anticipating an uncaught exception:
>>> Zope.debug('/scratch/visitor_id', u="admin:123", pm=1)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "/home/klm/work/zope/Zope-2.6.1-linux2-x86/lib/python/Zope/__init__.py", line 57, in debug
  [...]
  File "Script (Python)", line 3, in visitor_id
TypeError: <html>
  <head><title>scratch</title></head>
  <h2>Site Error</h2>
  <p>An error was encountered while publishing this resource.
  </p>
  <p>
  <strong>Error Type: TypeError</strong><br />
  <strong>Error Value: cannot add type &quot;None&quot; to string</strong><br /> 
  </P>
  <p>Troubleshooting Suggestions</p>
  [...]
>>> import pdb
>>> pdb.pm()
> /home/klm/work/zope/Zope-2.6.1-linux2-x86/lib/python/Script (Python)(3)visitor_id()
(Pdb) print message
None
(Pdb) print "some string" + message
*** TypeError: cannot add type "None" to string
(Pdb) q
>>> 

We found the problem - when no message is actually passed in, message has the default value None, and the script fails when it tries to add that value to a string.

We had to do some estimation to figure out which line we were actually at in the script, because pdb is only able to report the line numbers - the python script compilation process does not convey the original source code. (As of recently, emacs python-mode will step properly through python scripts if you put copies of them in buffers set to python-mode and named for the script.)

Sometimes it's valuable to examine the Zope publishing process, itself, including operations like object file system traversal and security validation of the method being invoked. Zope.debug() has a nice provision for that, signalled by the command flag d=1. When you invoke it you are presented with a little description that guides you to the crucial stages of the publishing process. We're not going to investgate that here, but it is worth exploring on your own.

Instead, we're going to continue using the debugger with the code we've created. We're going to focus, in particular, on using pdb's set_trace()function to enter the debugger in the midst of activity. This technique, trapping operation at a certain point, is especially useful for examining operation enclosed by many layers of execution machinery - the case with any untrusted code and requests from the network. We'll start by introducing the technique with our example python script.

Enabling pdb.set_trace() requires a special measure of its own. It is not available to through-the-web scripts by default, because it can be abused to exhaust server worker threads:

>>> text = """import pdb; pdb.set_trace()
... greeting = 'Hello, '
... greeting += context.REQUEST.AUTHENTICATED_USER.getUserName()
... return greeting"""
>>> app.scratch.visitor_id.ZPythonScript_edit('', text)
>>> get_transaction().commit()
>>> Zope.debug('/scratch/visitor_id')
Status: 500 Internal Server Error
[...]
Bobo-Exception-Type: ImportError
Bobo-Exception-Value: bobo exception
[...]
  <p>An error was encountered while publishing this resource.
  <strong>Error Type: ImportError</strong><br />
  <strong>Error Value: import of &quot;pdb&quot; is unauthorized</strong><br /> 
____________________________________________________________

To make it available for through-the-web use, you must authorize it for import, using facilities of the Zope AccessControl module:

>>> from AccessControl import ModuleSecurityInfo
>>> ModuleSecurityInfo('pdb').declarePublic('set_trace')
>>> Zope.debug('/scratch/visitor_id')
--Return--
> /home/klm/work/zope/Zope-2.6.1-linux2-x86/lib/python2.1/pdb.py(895)set_trace()->None
-> Pdb().set_trace()
(Pdb) s
> /home/klm/work/zope/Zope-2.6.1-linux2-x86/lib/python/Script (Python)(2)visitor_id()

We're in. When stepping through and out of the script, we find ourselves deep in the Zope publishing machinery, in the code responsible for executing python scripts. There's plenty to explore there, but again it's beyond the scope of this effort, so we continue on our way from there, back to the result and then the python prompt:

(Pdb) n
--Return--
> /home/klm/work/zope/Zope-2.6.1-linux2-x86/lib/python/Script (Python)(4)visitor_id()\
  ->'Hello, Anonymous User'
(Pdb) n
> /home/klm/work/zope/Zope/lib/python/Products/PythonScripts/PythonScript.py(316)_exec()
-> if keyset is not None:
(Pdb) c
Status: 200 OK
X-Powered-By: Zope (www.zope.org), Python (www.python.org)
Content-Type: text/plain
Content-Length: 21

Hello, Anonymous User
____________________________________________________________

--Return--
> /home/klm/work/zope/Zope-2.6.1-linux2-x86/lib/python/<stdin>(1)?()->None
(Pdb) n
>>> 

For the next section, we want to be able to embed the set_trace() call in code that will be called by the server in the process of serving a request, rather than directly by us from the python prompt. That's not a problem with trusted code like filesystem-based Products, which can import pdb and call set_trace with impunity. Since we want to use it from the script, and won't have the opportunity to enable the import outside the script, we need a way to enable it from within the script.

Appendix B presents some code you can install in your server as an external method, so you can call it from your script to enable the import. Once it's installed, you can invoke it and then can use set_trace:

>>> text = """context.enable_set_trace(); import pdb; pdb.set_trace()
... greeting = 'Hello, '
... greeting += context.REQUEST.AUTHENTICATED_USER.getUserName()
... return greeting"""
>>> app.scratch.visitor_id.ZPythonScript_edit('', text)
>>> get_transaction().commit()
>>> Zope.debug('/scratch/visitor_id')
--Return--
> /home/klm/work/zope/Zope-2.6.1-linux2-x86/lib/python2.1/pdb.py(895)set_trace()->None
-> Pdb().set_trace()
(Pdb) q

The difference in the code is the "context.enable_set_trace()" added at the start. The difference in effect is that you don't anymore need the independent security declaration allowing pdb.set_trace() - it's now self contained. You can exit your izctl.py debug session, for instance, and reenter without having to do a security declaration. That's taken care of from within the script. We're going to be depending on that in the next section.

In this section we've explored interacting directly with the objects in the object file system using the python prompt. We've also engaged the publisher with the python prompt, accessing objects with with the same environment they would have if reached via the network, with authentication.

We've also begun to explore embedding a pdb.set_trace() in our code, to trap into the debugger in the midst of operation. We'll use that technique heavily in the next section to debug operation processing requests from the network.

Intervening On Ongoing Requests

Sometimes you just can't reproduce the circumstances of some event in an application using the python prompt.

  • You may not be able to pinpoint the specific conditions that trigger it, and have to wait for a visit to determine them
  • The condition may involve circumstances you cannot artificially create

In such cases, you cannot use the Zope.debug() from the python prompt. There is something you can do, but it involves some drawbacks which make other methods preferable when they suffice.

This method involves putting your pdb.set_trace() in the code in question, possibly conditioned to trigger only when the situation you want to investigate is occurring, eg within a try/except on an unintended exception. Then you let restart your site server in debug mode (which i'll explain) so that the server produces the debugging prompt when the event occurs.

To do this, you have to be running the server in "debug" mode and have access to the terminal from which the server was started. izctl.py start -D (izctl.py is part of the recipe in Appendix A) will do that for you. (That passes -D to ZServer, for ZServer's debug mode.) Then, when the set_trace() triggers, the debug session will occur on the terminal where the server is running.

To demonstrate we'll use the visitor_id script as we last prepared it, with 'context.enable_set_trace(); import pdb; pdb.set_trace()'. We'll visit it from the web and then step through with the pdb prompt in the server process.

To begin with, we'll start the server in debug mode:

% ./izctl.py start -D
Stopping ZEO pid: 2155 2156
sh: line 1: kill: (2155) - No such process
sh: line 1: kill: (2156) - No such process
Starting ZEO server on 4000
waiting for ZEO server to start 1/160
waiting for ZEO server to start 2/160
ZEO server ready.
Starting default (8000) Zope...

(What's all this? izctl.py, unable to reach the storage server, ensures the server process is in fact terminated and then starts a new server process. It repeatedly probes for presence of the new storage server instance it then starts the Zope server process.)

Now, when we visit the visitor_id url over the web - at:

http://localhost:8080/scratch/visitor_id

... we're presented with the pdb prompt by the server process:

--Return--
> /home/klm/work/zope/Zope-2.6.1-linux2-x86/lib/python2.1/pdb.py(895)set_trace()->None
-> Pdb().set_trace()
(Pdb) s
> /home/klm/work/zope/instance1/var/Script (Python)(2)visitor_id()
(Pdb) n
> /home/klm/work/zope/instance1/var/Script (Python)(3)visitor_id()
(Pdb) 
> /home/klm/work/zope/instance1/var/Script (Python)(4)visitor_id()
(Pdb) 
--Return--
> /home/klm/work/zope/instance1/var/Script (Python)(4)visitor_id()->'Hello, admin'
(Pdb) 
> /home/klm/work/zope/Zope/lib/python/Products/PythonScripts/PythonScript.py(316)_exec()
-> if keyset is not None:
(Pdb) c
> /home/klm/work/zope/Zope-2.6.1-linux2-x86/ZServer/PubCore/ZServerPublisher.py(24)__init__()

At this point we're at the main loop of the Zope publisher - it's here that the publisher dispatches the response back to the visiting browser and frees up the data structures:

      -> finally:
      (Pdb) c
      > /home/klm/work/zope/Zope-2.6.1-linux2-x86/ZServer/PubCore/ZServerPublisher.py(25)__init__()
      -> response._finish()
      (Pdb) c
      > /home/klm/work/zope/Zope-2.6.1-linux2-x86/ZServer/PubCore/ZServerPublisher.py(26)__init__()
      -> request=response=None
      (Pdb) q

Note that I was saying c in those steps around the publisher loop! This is a problem which is the chief pitfall of this method.

Due to a pdb bug (reputed to be fixed only in very recent python versions, after 2.1 and maybe even after 2.2), in certain rare situations the 'c'ontinue pdb command doesn't work right, and the debugger continues single stepping. Unfortuntely, the ZServerPublisher loop is one such situation. This means that the thread where you started debugging is stuck in debug mode.

Since you don't want to deplete the worker thread pool - it's small by default, consisting of four or so threads - the workaround is to restart the server after you've returned the request. Running the server in -D debug mode exposes a "Restart" button on the site's Control Panel (under linux, but last i checked not on windows), and recent versions of Zope generally restart fairly quickly. Still, this is one reason to prefer usingZope.debug() instead of instrumenting the server.

Note that quitting the pdb session within the guts of the publishing process - eg, within your application code - probably will not sacrifice a thread. This is because the publish_module() function catches exceptions, preventing them from aborting the thread. Doing so also usually returns an error to the calling browser and no commitment of the transaction, and so may not be an acceptable option.

As if that weren't enough, this approach is delicate for another reason - the possibility of separate requests coinciding so that multiple trace sessions overlap. In that case you will be faced with pdb prompts from both (or more) requests, simultaneously. The IO for the prompts just doesn't sort out right when that happens, and you're hosed.

If you're operating in an uncontrolled environment - eg, with public visits to your server - you must prevent such coincidences by judicious placement and/or conditioning of the set_trace() calls. (You probably want to prevent them from being triggered by unsuspecting visitors, anyway, since the visitors browser hang waiting for a response until you conclude debugging.)

To be on the safe side you can create a separate site instance running on a different port than your public site. This doesn't help in cases where the problem is due to numerous visitors or to undetermined actions by the visitors - then you may need to do more work to emulate or trap the conditions.

With pdb fixed, the more disruptive of the drawbacks will be remedied. Still, this active-server-based debugging session is best used with a development server, not in production service.

A Loose End

It's important to note that these techniques apply equally well to filesystem-based code, including Zope Product and even core code, as they do to python scripts. It's even easier in some ways - filesystem-based trusted code is exempt from security regulation, and so can import pdb and invokeset_trace() without enable_set_trace(). The python script we used in this paper was a convenient vehicle, a consistent landmark for progression through the examples. I wouldn't expect, however, to see python scripts as the only or even the most common targets for tracing.

In Summary

We've explored some variants which give you interior access to your application:

  • With python prompt access you can navigate the object file system, obtain objects from your site, and examine and articulate them directly.
  • With Zope.debug() you can engage the publisher to supply authentication information and debug the security-related and other through-the-web aspects of your application, as well.
  • With judicious situation of pdb.set_trace() you can trap specific conditions and trigger the debugger for requests coming from the network, while they're being handled by Zope / your application. (This one presents serious drawbacks for production sites.)

We've haven't delved deeply into the actual debugging - that deserves a paper of its own. What we have done is explore ways to find your way whenever you need to interactively examine the state and operation of a Zope application from the inside.

Appendix A - Setting up your ZEO-based Zope site for exploring

You can use the following recipe to set up and control a complete ZEO-based Zope site from scratch, and control it. This includes a very handy startup script, izctl.py, which makes it easy to get into the python prompt connected to Zope.

(This should work on most linux systems - i use redhat 8.0. You should be able to substitute binaries for your platform, found in the Zope.org products download area, http://dev.zope.org/Products .) :

% mkdir zexplore
% cd zexplore
% mkdir dist
  # Stable binary Zope release:
% wget http://dev.zope.org/Products/Zope/2.6.1/Zope-2.6.1-linux2-x86.tgz
% tar -xzf Zope-2.6.1-linux2-x86.tgz 
% ln -s Zope-2.6.1-linux2-x86 Zope
% mv Zope-2.6.1-linux2-x86.tgz dist
  # Stable ZEO release:
% wget http://dev.zope.org/Products/ZEO/ZEO-1.0.tgz
% tar -xzf ZEO-1.0.tgz
% mv ZEO-1.0.tgz dist
% cp -r ZEO-1.0/ZEO Zope/lib/python
  # Cook the Zope software home:
% cd Zope
% install
  # ... zope installation gets cleaned ...
% cd ..
% mkdir instance1
% cd instance1
% mkdir var import Products Extensions
% wget http://www.zope.org/Members/klm/ZopeDebugging/izctl.py
  # Edit top line of izctl.py so it has the path to the python
  # executable in your Zope installation, SOFTWARE_HOME/bin/python
% chmod +x izctl.py
% wget http://www.zope.org/Members/klm/ZopeDebugging/zope.conf
% wget http://www.zope.org/Members/klm/ZopeDebugging/custom_zodb.py
  # Create an initial 'admin' user, password '123' (you can change
  # it on the site by visiting the /acl_users 'admin' entry):
% wget http://www.zope.org/Members/klm/ZopeDebugging/inituser
  # Now (if you appropriately edited izctl.py's #! "magic number")
  # we're ready to enter the debug prompt:
% ./izctl.py debug
  [... your interactive session ...]

  # And/or:
% ./izctl.py start
  # ... The Zope server would be going detached from the terminal ...

% ./izctl.py start -D
  # ... The Zope server would be going, still attached to the terminal ...

  # To shutdown the Zope server, when running detached:
% ./izctl.py stop_zope

  # When you want to shutdown the ZEO server:
% ./izctl.py stop_zeo

Appendix B - An external method for use setting traces in python scripts

Here's a script you can use as an external method on your site to enable pdb.set_trace() in your through-the-web scripts. Follow the instructions in the doc strings to use it:

    """Enable use of pdb.set_trace() for TTW debugging in python scripts,
    templates, etc.

    NOTE WELL that you don't want to these exposed for general consumption
    on production sites - eg, calls accidentally left in production code
    will cause them to consume zope threads, leading to quick starvation.

    To use, Add a python "external method" in your site home:

      Id: enable_set_trace
      Module name: Products.WhateverProduct.enable_set_trace
      Function name: enable_set_trace

    To use, add the following line to your python script:

      context.enable_set_trace(); import pdb; pdb.set_trace()

    and start your site with the '-D' option, so it stays attached to the
    terminal in the foreground.  The 'pdb.set_trace()' will produce a pdb
    interaction prompt from that process.
    """

    def enable_set_trace():
        """After running this function, through-the-web methods on your site will
        allow import of pdb and call of pdb.set_trace():

          context.enable_set_trace(); import pdb; pdb.set_trace()

You should start the Zope process with -D so it stays attached to the controlling terminal. Then the pdb prompt will appear in that process.

import AccessControl; AccessControl.ModuleSecurityInfo(pdb).declarePublic(set_trace)