Crafty Hacks

Software utilities, tools, and techniques I developed that have been particularly useful.

Zope (and Plone, etc) debugging - "Conversing With zope"

While working at Zope Corporation I developed and described some essential techniques for interactive python-prompt Zope debugging.

During a Zope Corp customer project (working with Chris McDonough and Evan Wallace), I devised the original precursor to zdctl/zopectl/zctl operational control scripts, to be able to launch Zope/ZEO for interactive debugging. I included a 'debug' command which starts the application server instance or zeo client such that the developer can interact with the Zope run time in the Python interpreter's interactive command loop.  Evan, Chris, Tres Seaver and I all embellished on it, as have others, but the facility is the basis for the Plone client shell 'debug' commands - 'instance debug', 'client1 debug', etc.

Having developed this while working at Zope Corp, I was able to more easily influence fixes and accommodations in the Zope server operation for this special operating mode. The actual, original zopectl code was only ten or twenty lines of python code, though other code was affected along the way.
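The idea behind a 'debug' command can be illustrated with a minimal Python sketch. This is not the original zopectl code: `FakeApp`, `build_debug_namespace`, and `debug` are hypothetical names, and a real instance would open its database and bind the actual application root rather than a stand-in object.

```python
import code


class FakeApp:
    """Hypothetical stand-in for the application root object."""

    def objectIds(self):
        return ["index_html", "acl_users"]


def build_debug_namespace(app):
    """Collect the names the interactive prompt should start with."""
    return {"app": app, "__name__": "__console__", "__doc__": None}


def debug(app, banner="Debug prompt: the application root is bound to 'app'."):
    """Drop into Python's interactive loop with the application in scope."""
    code.interact(banner=banner, local=build_debug_namespace(app))
```

Calling `debug(FakeApp())` opens a prompt where `app.objectIds()` can be evaluated directly, which is the kind of conversation with the running system described above.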

Together with Emacs PDB track (below), this provides a versatile Emacs-based visual Zope debugging facility.

I wrote and presented a paper at PyCon 2003, Conversing With Zope, about the essentials of debugging Zope, including some techniques made possible with this facility.

Python debugging with Emacs - 'PDB track'

Some (often challenging) portion of many programming efforts involves investigating and fixing operating code - debugging.  One crucial tool is a screen-oriented debugger, where the code being executed is presented visually as the program proceeds under control of the developer. Emacs, a prevalent programmer's editor, includes facilities for interacting with other programs, and an elaborate (and massive, code-wise) facility for interacting with debuggers - 'GUD' - the Grand Unified Debugger.

GUD is not often good, unfortunately, for the kind of debugging one does with interpreted systems, like Python, where it's common to want to enter debugging mode upon discovering a malfunction, after the program has been launched. This is particularly the case with long-running server programs, like Zope. GUD is organized more as an executive, intended to control the target program as a subordinate, from launch. It can attach to already running programs only with special, burdensome provisions. Having to start a new session (under GUD control) and reproduce the problem context is often infeasible. I came up with an alternative to a GUD-based interface which works quite well, and took a proportionally tiny amount of effort compared to tailoring something based on GUD.

Instead of creating code that takes executive control of the debugging process, I created an add-on filter to the shell-process interaction mechanism (comint) which tracks the shell output, where Python's line-oriented debugger (PDB) can be running. Whenever this filter notices something which looks like a report from pdb identifying the current line and file of code being debugged, it presents that line and file in another emacs window.

With this in place, you need merely add "import pdb; pdb.set_trace()" to running code in order to get pdb going. If Zope was started in an emacs shell, and brought to the foreground, emacs will notice the debugging situation and act like a screen-oriented debugger.

That's the essence of it. It takes care not to intrude at the wrong time, and also provides for non-file-system files for which you have a local copy (eg, when running Zope internal Python Scripts), and other nuances.
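The central step, recognizing a pdb location report in a stream of shell output, can be sketched in Python. The actual filter is emacs lisp. The names `PDB_LOCATION` and `latest_pdb_location` are hypothetical, and the pattern assumes pdb's usual "> file(line)function()" report format.

```python
import re

# pdb announces its position with a line such as:
#   > /srv/app/views.py(42)render()
PDB_LOCATION = re.compile(
    r"^> (?P<file>.+?)\((?P<line>\d+)\)(?P<func>[^\s(]+)\(\)",
    re.MULTILINE,
)


def latest_pdb_location(output):
    """Return (file, line, function) for the last pdb report, or None."""
    matches = list(PDB_LOCATION.finditer(output))
    if not matches:
        return None
    last = matches[-1]
    return last.group("file"), int(last.group("line")), last.group("func")
```

An editor-side filter would call something like this on each chunk of process output and, when it gets a result, display that file at that line in another window.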

Why is this a crafty hack?

  1. It takes less than 200 lines of emacs lisp code, including copious comments, docstrings, etc, serving instead of gud's 3400 ui and 3400 backend lines. (To be fair, GUD does a lot more, but much of that is not relevant or useful when debugging Python or other interpreted code.)
  2. It works a lot better than GUD for many things, particularly for activities typically involving interpreted code.
  3. It's generally invaluable, significantly improving emacs for python development.

Barry Warsaw integrated my PDB tracking into python-mode.el not long after, and it has been adopted as part of GNU emacs python.el.

(The paper i wrote and presented at PyCon 2003 about the essentials of Zope Debugging includes a little info about using PDB track.)

I've created a stand-alone version of pdbtrack, available in my EmacsUtils github repository.

ZWiki "parenting"

Wikis are great, but they suffer the same drawback as many link-navigated collections (eg, the web:-): there's little or no scalable notion of relative location. The connections are organized like spaghetti is organized on your plate. Figuratively, navigating link- and search-oriented collections is like living in a world where every room (page) is connected to every other by teleporters. There is no firm basis for general orientation, for a sense of location within progressively scaled contexts.

Many web sites have site maps, and provide navigation aids like breadcrumbs and (occasionally) "next", "previous", and "up", and so forth. The measures, however, are usually provided as afterthoughts, and many are almost incongruous in the multiply-connected context of the web. To be useful they might impose disciplines that restrict the free flow of interconnection, and they often depend on deliberate assignment of metadata on top of the authoring process, which is notoriously prone to be neglected.

In WikiForZopeOrgAndCom, I observed that subtopics often are created from their parent pages. I built that into pages - when new pages are created (via following an unsatisfied wiki link), they start registered with the originating page as their parent. The "parentage" setting is adjustable, for those cases where the creation order doesn't fit, and/or the page belongs in multiple places - but it turns out to be a pretty conducive guideline, collecting the organizational metadata based on activity rather than deliberate assignment. It proves to be effective, providing a basis for recognizing (casual, multiply-connected) hierarchies within each wiki - providing some transitive sense of topical relationship with related pages in the collection.
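A minimal Python sketch of the parenting idea follows. It is not the ZWiki code: the `Wiki` class and its method names are hypothetical, and breadcrumbs here simply follow each page's first parent.

```python
class Wiki:
    """Pages remember which page they were created from."""

    def __init__(self):
        self.parents = {}  # page name -> list of parent page names

    def create_page(self, name, from_page=None):
        """Register a new page, parented by the page whose link created it."""
        self.parents[name] = [from_page] if from_page else []

    def reparent(self, name, parents):
        """Adjust parentage when creation order doesn't fit."""
        self.parents[name] = list(parents)

    def breadcrumbs(self, name):
        """Trail from a root page down to `name`, via first parents."""
        trail = [name]
        seen = {name}
        while self.parents.get(trail[0]):
            parent = self.parents[trail[0]][0]
            if parent in seen:  # guard against parent cycles
                break
            seen.add(parent)
            trail.insert(0, parent)
        return trail
```

The organizational metadata accumulates as a side effect of `create_page`, so authors only need `reparent` for the exceptions.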

The assignment is not perfect, nor is the organizational paradigm. Wiki gardening is still necessary as a wiki grows, but at least there is an actual, usable expression of the organizational maps that the gardeners often have in mind when they organize their wikis. Though not ideal, it does promote suitable organizing and it delivers (through breadcrumbs and a structured wiki map) a distinct and useful sense of location as you navigate the wiki. It hints, in principle, at scalable ways to progressively organize collaboratively developed knowledge, with the effort suitably proportioned and distributed, and with significant gain.

"Scout" Knowbot

For my example (pet) knowbot on the CNRI KOE project, i developed an interactive knowbot - kpshell.py - a read-eval-print loop (repl) within a kp (knowbot program). (One of the things i worked on at CNRI with the other python guys was a prototype "knowbot" platform, for networked hosting of programs that can migrate, at their discretion, from node to node.)

The operator can launch this kp on their local node and interact with it much like a regular python shell. They can examine their local knowbot operating context and experiment with knowbot commands, including directing the knowbot to clone and/or migrate to another node. The read/eval/print loop connection remains, so the operator can continue to interact with the knowbot in the remote environment - examine the environment, put data in its "suitcase", and continue to migrate to other nodes and exercise the system. The operator can also tell the kp to clone itself, and other interesting things.

This turned out to be a quite handy interactive experimenting and debugging tool, which is a lot of the reason that i wrote the thing - i'm a glutton for interactive development.

Better CVS Modules Knitting

For years, zope community development hinged on the public Zope CVS repository, where resided many modules that were stitched together to form various Zope bundles, including the central distribution itself. The CVS provision for knitting things together, 'modules', has severe drawbacks that proved to be show stoppers for our purposes. (See the modules drawbacks explanation i posted to the Zope Announcements mailing list.)

My remedy (described in the drawbacks posting) was some scripts hooked into CVS which would implement symbolic links as dictated by a text file, adjusted whenever that file was updated in each CVS repository. Running on the repository server, the scripts could guarantee sanity and hygiene of the desired links, and report (via automatic checkin notices) the results.

The beauty of this approach is that people entrusted with checkin privileges had the means to make adjustments to the module-like knitting structure, without gaining any of the other privileges that would come, eg, with login access to the machine. All of the mechanism and linking recipes were under version control, as well. The recipes themselves were the height of simplicity - file/link path pairs. (see instructions for committers for more details. A table distinct from the items being linked is not the ideal way to manifest it, but it has its benefits as well as drawbacks.)
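The recipe-driven linking can be sketched in Python. This is an illustration only, not the original scripts: the whitespace-separated "target link" line format, the `#` comments, and the function names are all assumptions.

```python
import os


def parse_recipes(text):
    """Parse 'target link' path pairs, one per line; '#' starts a comment."""
    pairs = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if line:
            target, link = line.split()
            pairs.append((target, link))
    return pairs


def inside(root, path):
    """True if path, taken relative to root, stays within root."""
    full = os.path.normpath(os.path.join(root, path))
    return os.path.commonpath([root, full]) == root


def apply_recipes(root, pairs):
    """Create each link under root and return a report of what happened."""
    root = os.path.abspath(root)
    report = []
    for target, link in pairs:
        if not (inside(root, target) and inside(root, link)):
            report.append(f"refused {link}: path escapes the repository")
            continue
        link_path = os.path.join(root, link)
        if os.path.islink(link_path):
            os.remove(link_path)  # refresh a link left by an earlier run
        elif os.path.exists(link_path):
            report.append(f"refused {link}: a real file is in the way")
            continue
        link_dir = os.path.dirname(link_path)
        os.makedirs(link_dir, exist_ok=True)
        target_path = os.path.join(root, target)
        os.symlink(os.path.relpath(target_path, link_dir), link_path)
        report.append(f"linked {link} -> {target}")
    return report
```

Because the checks run where the links are made, a committer can change the recipes but cannot point a link outside the repository, and the returned report is the sort of thing that could be sent out as a checkin notice.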

Moreover, the checkin reporting mechanisms, which posted checkin notices to various mailing lists concerned with particular repository areas (and which i also engineered), would see checkins for their areas whether they were contained directly or via symbolic links - so all the right parties were notified of changes that concerned them, even when the code in question fell into the purview of several distinct projects.

This mechanism served well for many years, bypassing a critical shortcoming in CVS, and has been superseded only as most public Zope projects moved to more modern source code management systems.

icomplete - another emacs tweak

icomplete is another high-leverage hack of which i'm proud. in this case, a small amount of code provides modest but frequent value.

icomplete is an add-on which operates in the emacs minibuffer, incrementally indicating available alternatives when entering input that has automatic completion. I modeled it on a similar extension that someone made just for buffer names, but as a mini-buffer mode. That means that it works generally, providing a consistent UI that extends all emacs commands that do simple (table-oriented) completion.
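The incremental indication can be sketched in Python as a function from the typed prefix and a completion table to a hint string. The real package is emacs lisp. The function name `completion_hint` is hypothetical, and the bracket-and-brace layout is only an approximation of the minibuffer display.

```python
import os


def completion_hint(typed, candidates):
    """Describe the completions still available for what has been typed."""
    matches = sorted(c for c in candidates if c.startswith(typed))
    if not matches:
        return " [No matches]"
    if len(matches) == 1:
        # Sole completion: show the text that would be filled in.
        return f" [{matches[0][len(typed):]}]"
    common = os.path.commonprefix(matches)
    # Text shared by every match beyond what was typed, if any.
    stem = f"[{common[len(typed):]}]" if len(common) > len(typed) else ""
    rest = ",".join(m[len(common):] for m in matches)
    return f" {stem}{{{rest}}}"
```

Because the function only needs a prefix and a table of candidates, the same display works for any command that completes against a table, which is where the generality comes from.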

Adding visual decorations to allout outlines using events

In an application of event-driven techniques, I implemented automatic maintenance of graphic outline decorations - topic bullets and nesting graph lines - in allout purely as handlers for events generated when an outline's structure is changed - topic collapse and expansion, cuts and pastes, bullet changes, etc. This approach not only prevented the allout code from becoming significantly more complicated by the added functionality, it also provided an event framework that other enhancements could use.
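The shape of this arrangement can be sketched in Python. Allout itself is emacs lisp, and the `Outline` and `BulletDecorator` classes and the event names here are hypothetical. The point is that the outline only announces structure changes, while decoration lives entirely in a subscribed handler.

```python
class Outline:
    """Holds topics and announces structure changes to subscribers."""

    def __init__(self):
        self.handlers = []
        self.topics = []  # list of [depth, text]

    def subscribe(self, handler):
        self.handlers.append(handler)

    def _emit(self, event, index):
        for handler in self.handlers:
            handler(self, event, index)

    def add_topic(self, depth, text):
        self.topics.append([depth, text])
        self._emit("topic-added", len(self.topics) - 1)

    def shift(self, index, delta):
        self.topics[index][0] += delta
        self._emit("topic-shifted", index)


class BulletDecorator:
    """Keeps a decorated rendering of each topic up to date."""

    BULLETS = "*+-"

    def __init__(self):
        self.rendered = {}

    def __call__(self, outline, event, index):
        depth, text = outline.topics[index]
        bullet = self.BULLETS[(depth - 1) % len(self.BULLETS)]
        self.rendered[index] = "  " * (depth - 1) + bullet + " " + text
```

The outline code never mentions bullets, so another enhancement can subscribe its own handler without touching either class.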

8mmbackup

while i was maintaining computer infrastructure for an automated manufacturing research lab, 8mm streaming-tape backup devices became available well before packaged network backup software could handle the new device's storage capacity. cpio and other standard Unix utilities didn't have a size limit, though. together with awk, sed, cron and other unix "tiny tools" i had more than adequate building blocks for a decent backup system.

i scripted such a system, and it worked well.

i have a datapoint from around when we were fully using the system, around 1990. at that point the lab had something like 60 to 70 researchers with Sun workstations depending on around 15 sun servers with around 8 to 9 GB of used disk capacity. the backup servers were diskless client nodes sitting around the network, doing backups via NFS. (see http://www.sunmanagers.org/archives/1990/0214.html for some remarks.)

among other things, this experience opened my eyes to the ramifications of developing something useful.

walking to the server room to restore two weeks' worth of forty researchers' work from an 8mm backup tape, i was a bit taken aback to realize that i was responsible for more than just running the restore correctly. if something was wrong with the backup, it might have been due to the script that i wrote, an inadequate implementation of a verification regime, or some other extra way i had involved myself in the backup process. i was a little daunted by that realization, and it redoubled my understanding of the importance of software development thoroughness and quality. as it happened, that and other restores worked fine.

one side note. i'm a bit embarrassed to say i used csh as my scripting language. with no proper functions, i used labelled goto for some of the control flow! still, though primitive in many ways, it was able to use surprisingly sophisticated methods. for instance, i was able to use guarded interrupts (onintr) for controlled handling of "exceptions" (errors), aliases for macro-style simple function calls, and generally leveraged unix tools to make a fairly robust, useful system. i eventually switched over to bash, which is a much better programming language - but still have some fondness for what was possible with csh.

during the first two years that the script was available via anonymous ftp, there were over 800 retrievals. though that's small by today's open source standards, it was a striking success by my standards at the time. five or six years after i first made it available i was still getting messages from people around the world with questions about using it and favorable comments.

playlistsync - synchronize iTunes playlists to fs, eg for android

i'm a playlist fanatic - i organize my music so i can easily put on the kind of thing i want to listen to when i have a hankering for it. i'm also quite partial to android phones and an early adopter, so for a while i had no music synchronization options besides building my own. fortunately that wasn't too hard to do, given a python port of mac applescript, and my initial effort has paid off because i was able to tailor the synchronization to my way of organizing my playlists in ways that the emerging alternatives don't begin to do.

in addition to reasonable full and incremental synchronization, my python playlistsync script provides a workaround for an inexcusable oversight in the shuffle play on the standard android players (where the first track on the playlist is always played if you start a playlist in shuffle play mode), conveys playlist folder nesting to the target playlists, provides for deliberate ordering of playlists in the device players (which sort them alphabetically by playlist name), and more.

see iTunes Playlist Sync for more details and a usable version of the open-source script.

a few handy scripts

the thing about a decent shell scripting language is that it's for expressing the commonplace things you do - and commonplace means that some of the things you use it for, you frequently repeat. it is a realm where automation is going to be instrumental in flowing and growing your everyday activities. as with most seasoned shell users, particularly those who tinker, over time i've developed some invaluable scripts. here are a few which have grown over time into Things Worth Sharing.

  • killp
    • a script for terminating processes with discretion. candidates whose invoking commands match the arguments you pass to killp are presented for interactive selection. one of my early shell scripting ventures, and one that has stuck with me and grown more refined and more useful over the years. killp -? for instructions. here's a current copy of the killp script.
  • freq
    • A very welcome addition to my daily operations, this script presents menus of commands from files, located at various places in your directories. You select which command you want to execute by name or number.

      The command lines are globbed (wildcard-expanded) and variable expanded, and can contain mixtures of compound, pipelined, and backgrounded commands, so you can express elaborate actions.

      I find it useful as much for documentation of elaborate actions as it is for frequently executed elaborate actions. By first presenting those commands from the .frequents file in the current working directory, you can partition your activity lists by directory, since there are some activities specific to some directories.
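The menu idea behind freq can be sketched in Python. This is not the actual script: the "name: command" line format, the `#` comments, and the function names `load_menu` and `choose` are assumptions made for illustration, and the sketch only looks commands up rather than executing them.

```python
import os


def load_menu(paths):
    """Read (name, command) entries from each existing file, in order."""
    menu = []
    for path in paths:
        if not os.path.isfile(path):
            continue
        with open(path) as handle:
            for raw in handle:
                line = raw.strip()
                if not line or line.startswith("#"):
                    continue
                name, sep, command = line.partition(":")
                if sep:
                    menu.append((name.strip(), command.strip()))
                else:
                    # No explicit name: fall back to the command's first word.
                    menu.append((line.split()[0], line))
    return menu


def choose(menu, selection):
    """Look up a command by 1-based menu number or by name."""
    if selection.isdigit():
        index = int(selection) - 1
        return menu[index][1] if 0 <= index < len(menu) else None
    for name, command in menu:
        if name == selection:
            return command
    return None
```

Passing the current directory's `.frequents` first in `paths` puts the directory-specific commands at the top of the menu, and the chosen string would then be handed to a shell so that globbing, variables, and pipelines are expanded there.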