Building and Packaging a Python command-line tool for Debian

Python packaging has a chequered past.

Distutils was, and still is, the original tool included with the standard library. But then setuptools was created to overcome the limitations of distutils, gained wide adoption, subsequently stagnated, and a fork called distribute was created to address some of the issues. Distutils2 was an attempt to take the best of previous tools to support Python 3, but it failed. Then distribute grew to support Python 3, was merged back into setuptools, and everything else became moot!

Unfortunately, it’s hard to find reliable information on Python packaging, because many articles you might find in a DuckDuckGo search were written before setuptools was reinvigorated. Many reflect practices that are sub-optimal today, and I would disregard anything written before the distribute merge, which happened in March 2013.

While which packaging tool to use was ambiguous in the past, it’s now much easier to recommend one. At the time of writing (September 2016), you should use setuptools. It’s what most packages use, is fully supported by PyPI and pip, and works pretty well. For an overview of Python packaging tools, this page covers them all very well. For an authoritative reference, see packaging.python.org.

This article will show you how to package a simple command line application for Debian with setuptools, pytest and stdeb.

There is nothing that makes me an authority on this; however, I am currently implementing a Continuous Integration pipeline for Python packages at Gumtree. I’ve also been packaging Python packages (with varying degrees of success) for more than two years, and am lucky to work with some talented engineers with extensive knowledge of Jenkins and packaging generally.

But most importantly, I’ve done Python packaging wrong many times, and an article such as this would have helped me immensely when I started out.

Audience

I will try to use plain English where possible, but I’m aiming this at people who know at least a bit of Python, are familiar with Debian or Linux, and want to package their Python software properly for the OS. Familiarity with argparse also helps if you want to build on what you learn here.

This is an opinionated article, and reflects what I consider to be current best-practice. However at the end of the day this is just how I do it, and there could well be superior methods or tools that I’m not aware of. If you have any suggestions for this article, please comment or contact me. I expect this to be a living document as practices evolve.

One tool I am not covering here is Spotify’s dh-virtualenv. The reason for this is simple – it’s overkill for small projects, and isn’t particularly useful if you don’t understand what’s happening underneath it. If you have a large complex app with many dependencies that aren’t in Debian, or you need newer versions, it’s a great tool, but for simple stuff which can happily use the base Debian Python distribution, I recommend sticking to stdeb.

The resulting package can’t be uploaded to pypi because there is already a “pysay” package (which I have no affiliation with), but the package built here could be uploaded in principle by another name. (sidenote: check out ponysay. It takes the cowsay phenomenon to a whole new level)

I also won’t cover hosting an internal repository. Devpi is the tool to use for this, and the documentation is quite good. If you want to host your own pypi repository it’s leaps and bounds ahead of anything else.

Required tools

A text editor/IDE of your choice (I use vim and PyCharm), a bash terminal, Python (2 or 3, this should be compatible with both), git, python-setuptools, python-stdeb, python-argparse, and a Debian or Ubuntu machine should do it. Bash is also available on OSX machines, and Windows as well if you install Windows Subsystem for Linux, which is probably the way to go, given the nightmare that is Python development tooling on native Windows.

Everything except the Debian package build can be done on OSX, and with WSL you could probably do the whole lot without a real Debian machine at all.

Outline

Packaging Python code for Debian involves several layers (module, python package, debian package), and this article will address all of them:

  1. Creating a simple Python module. You can safely skip this section if you already have code to use, or if you’re comfortable writing modules you can use my example on Github and skim through.
  2. Packaging our module for Python by creating a setup.py file which configures setuptools.
  3. Building a python package for Debian with stdeb.
  4. Adding unit tests to our package.

You may want to create a repo online somewhere, such as on Github, but this is optional – Git is not the focus of this tutorial, it’s just what I use.

The completed project is available at https://github.com/al4/python-tutorial-pysay.

Creating a Python module

You should always create a module for any code that could be reused or extended in the future. The overhead is low, and you can import your module into other projects (code reuse). Having a well-organised project is a solid foundation to build on.

What distinguishes a module from a simple file? A Python file is simply a .py file that can be executed directly by calling python file.py $args.
You can also call ./file.py $args if it is an executable file with a shebang (i.e. something like #!/usr/bin/python) at the top.

This is fine for throwaway scripts, but it’s no way to write Python software, and doesn’t encourage code reuse at all.

A Python module, in the sense used in this article, is simply a directory with an __init__.py file in it (strictly speaking, Python calls this a package), which makes it importable by calling import module from within Python. It is still executable with arguments by calling python -c "import module; module.some_function()" $args from the command line (if the module is in your path, which by default includes the current directory), but this is not the preferred interface, as you will see later in the tutorial.

It’s worth stressing that, while we’re going to put code in __init__.py, generally it is considered bad practice to put lots of logic there. __init__ is supposed to be code that initialises the module, and it is often used to provide aliases to classes or functions for convenience, i.e. to abstract the API interface of your module from the implementation.

For the case of a command line application though, I think parsing the arguments inside __init__ is acceptable, but if this makes you squeamish you can create a “cli.py” file and import it if you wish.
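If you do go the “cli.py” route, the split might look something like the sketch below. It builds a throwaway package in a temporary directory purely so the example can run on its own; the name demo_pkg and the return value are made up for illustration, and it assumes Python 3.

```python
import importlib
import os
import sys
import tempfile

# Build a throwaway package on disk; 'demo_pkg' is a hypothetical name
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, 'demo_pkg')
os.mkdir(pkg_dir)

# cli.py holds the real logic
with open(os.path.join(pkg_dir, 'cli.py'), 'w') as handle:
    handle.write("def main():\n    return 'hello from demo_pkg.cli'\n")

# __init__.py only re-exports, so callers can still use demo_pkg.main()
with open(os.path.join(pkg_dir, '__init__.py'), 'w') as handle:
    handle.write("from .cli import main\n")

sys.path.insert(0, root)        # make the temporary directory importable
importlib.invalidate_caches()   # pick up files created after startup
demo_pkg = importlib.import_module('demo_pkg')
print(demo_pkg.main())
```

The point is that __init__.py stays a one-liner, while users of the package never need to know that main() actually lives in cli.py.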

Create a directory and initialise a git repository inside:

mkdir python-pysay
cd python-pysay
git init
git remote add origin http://path/to/git/repo  # if you created an online repo

The root of the project does not have to be the name of the project or package, it can be anything really. But for usability’s sake it is good for it to bear some relation to the content, while distinguishable from the module itself. I prefer to use what would be its Debian package name.

Next, we need to create the python module. After this the module will actually be ready to import!

From the project root (~/src/python-pysay):

mkdir pysay # from project root
echo "print('pysay imported!')" > pysay/__init__.py

Our module is now importable:

$ python -c 'import pysay'
pysay imported!

Now we need a function to execute. Edit __init__.py, and insert a function at the end of the file with the following content:

def main():
    print('This is the main function')

If we now import and execute main(), the result is predictable:

$ python -c 'import pysay; pysay.main()'
pysay imported!
This is the main function

Executing it directly at this point of course won’t execute main(). If you do python pysay/__init__.py, you will get:

$ python pysay/__init__.py
pysay imported!

To actually do something when called directly like this (again, I don’t recommend this), you would need to do the usual if __name__ == '__main__': main() dance that you are probably familiar with.
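In case you haven’t seen it before, the guard looks roughly like this. It’s a minimal sketch; the main() here is a stand-in that also returns its message so it is easy to check, which the tutorial’s version doesn’t do.

```python
def main():
    # Stand-in for the real entry point
    message = 'This is the main function'
    print(message)
    return message


# __name__ is '__main__' only when the file is run directly
# (python pysay/__init__.py); on import it is the module's name instead.
if __name__ == '__main__':
    main()
```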

A better idea (at least until we have set up an entry point), is to create a shell wrapper for testing, so create a file called pysay.sh in the project root with the following contents:

#!/bin/bash
python -c "import pysay; pysay.main()" "${@}"

This passes the command line arguments to the module, just as executing a python file directly would do.

You can also do this as a python script:

#!/usr/bin/env python
import pysay
pysay.main()

The effect of both is the same:

$ chmod +x ./pysay.*
$ ./pysay.sh
pysay imported!
This is the main function
$ ./pysay.py
pysay imported!
This is the main function

(the above examples are in the example repository as pysay.sh and pysay.py)

The arguments (accessible via sys.argv) are global to the python interpreter, so argparse and any arguments you pass to this script will still work.
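To see what that means in practice, here is a small sketch. The optional argv parameter is my own addition for illustration (it isn’t in the tutorial code); when it is left as None, argparse reads sys.argv[1:] itself.

```python
import argparse
import sys


def parse_args(argv=None):
    # argv=None makes argparse fall back to sys.argv[1:]
    parser = argparse.ArgumentParser()
    parser.add_argument('text', help='Text to pysay')
    return parser.parse_args(argv)


# Passing an explicit list, which is handy in tests
print(parse_args(['hello']).text)

# Simulating "pysay.py world" by setting the interpreter-wide sys.argv
saved_argv = sys.argv
sys.argv = ['pysay.py', 'world']
try:
    print(parse_args().text)
finally:
    sys.argv = saved_argv
```

Because sys.argv belongs to the interpreter rather than to any one file, it doesn’t matter whether the parsing happens in the wrapper script or deep inside the module.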

Speaking of argparse, we should set up argparse to parse our command line arguments. Add the following function to __init__.py:

import argparse

def parse_args():
    parser = argparse.ArgumentParser()
    parser.add_argument('text', help='Text to pysay')
    return parser.parse_args()

We also need to add it to main(). Edit the main function so it reads as follows:

def main():
    print('This is the main function')
    args = parse_args()
    print('pysays {}'.format(args.text))

When we execute our wrapper script we should now see:

$ ./pysay.py -h
pysay imported!
This is the main function
usage: pysay.py [-h] text

positional arguments:
  text        Text to pysay

optional arguments:
  -h, --help  show this help message and exit

Auto-generated help is just one of the reasons that I love Python’s argparse module.

When you pass an argument it should handle it like so:

$ ./pysay.sh test
pysay imported!
This is the main function
pysays test

Referencing other files

In more complex applications, you might want to spread code across multiple files. In this scenario, creating a module is not just desirable, it is essential.

As a contrived example, say we want to make pysay print out an ascii python logo.

Create a separate file inside the module directory called ascii_art.py, with the following contents:

pylogo = """
          .?77777777777777$.            
          777..777777777777$+           
         .77    7777777777$$$           
         .777..77777777$$$$$$           
         ..........:77$$$$$$$           
  .77777777777777777$$$$$$$$$.=======.  
 777777777777777777$$$$$$$$$$.========  
777777777777$$$$$$$$$$$$$$$$ :========+.
77777777777$$$$$$$$$$$$$$+..=========++~
777777777$$..~=====================+++++
77777$$$$.~~==================++++++++: 
 7$$$$$$$.==================++++++++++. 
 .,$$$$$$.================++++++++++~. \\
         .=========~.........           \\
         .=============++++++            {text}
         .==========+++.  .++           
          ,=======++++++,,++,           
          ..=====+++++++++=.            
                .~+=...                 
"""

def py_format(text):
    # Sorry for terrible example
    return pylogo.format(text=text)

Add the import to the top of __init__.py:

from .ascii_art import py_format

And the call to main() (also in __init__.py):

    print(py_format(args.text))

(you can find the completed files here)

Now we should see the following on execution:

$ ./pysay.sh "Argparse is awesome"
pysay imported!
This is the main function

          .?77777777777777$.
          777..777777777777$+
         .77    7777777777$$$
         .777..77777777$$$$$$
         ..........:77$$$$$$$
  .77777777777777777$$$$$$$$$.=======.
 777777777777777777$$$$$$$$$$.========
777777777777$$$$$$$$$$$$$$$$ :========+.
77777777777$$$$$$$$$$$$$$+..=========++~
777777777$$..~=====================+++++
77777$$$$.~~==================++++++++:
 7$$$$$$$.==================++++++++++.
 .,$$$$$$.================++++++++++~. \
         .=========~.........           \
         .=============++++++            Argparse is awesome
         .==========+++.  .++
          ,=======++++++,,++,
          ..=====+++++++++=.
                .~+=...
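One thing to watch if you edit the art: py_format relies on str.format, so any literal curly braces in the template have to be doubled. A quick sketch of the behaviour, where render() and the templates are hypothetical stand-ins for py_format and pylogo:

```python
def render(template, text):
    # Same mechanism as py_format: fill the {text} placeholder
    return template.format(text=text)


# Doubled braces come out as single literal braces
print(render('{{art}} {text}', 'hi'))

# Braces in the user's text are safe, because the text is substituted, not parsed
print(render('{text}', '{not a field}'))

# A stray single brace pair in the template is treated as a field and fails
try:
    render('{ oops } {text}', 'hi')
except (KeyError, IndexError, ValueError) as exc:
    print('template error:', repr(exc))
```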

So at this point, we have a working Python command line module, but no way to install it. Now the real fun begins.

Packaging for Python

Python packaging, in this example, is simply a matter of creating a setup.py file, which configures setuptools with the metadata of our project.

Create a file called setup.py in the root of your project:

from setuptools import setup

__author__ = 'Your name'


setup(
    name='pysay',
    version='0.0.1',
    packages=['pysay'],
)

This is about the simplest setup.py you could create, and it is enough to make the package installable.

Let’s test it. From the root of your pysay project:

$ python setup.py install
running install
running bdist_egg
running egg_info
writing pbr to pysay.egg-info/pbr.json
writing pysay.egg-info/PKG-INFO
writing top-level names to pysay.egg-info/top_level.txt
writing dependency_links to pysay.egg-info/dependency_links.txt
reading manifest file 'pysay.egg-info/SOURCES.txt'
writing manifest file 'pysay.egg-info/SOURCES.txt'
installing library code to build/bdist.macosx-10.11-x86_64/egg
running install_lib
running build_py
creating build/bdist.macosx-10.11-x86_64/egg
creating build/bdist.macosx-10.11-x86_64/egg/pysay
copying build/lib/pysay/__init__.py -> build/bdist.macosx-10.11-x86_64/egg/pysay
copying build/lib/pysay/ascii_art.py -> build/bdist.macosx-10.11-x86_64/egg/pysay
byte-compiling build/bdist.macosx-10.11-x86_64/egg/pysay/__init__.py to __init__.pyc
byte-compiling build/bdist.macosx-10.11-x86_64/egg/pysay/ascii_art.py to ascii_art.pyc
creating build/bdist.macosx-10.11-x86_64/egg/EGG-INFO
copying pysay.egg-info/PKG-INFO -> build/bdist.macosx-10.11-x86_64/egg/EGG-INFO
copying pysay.egg-info/SOURCES.txt -> build/bdist.macosx-10.11-x86_64/egg/EGG-INFO
copying pysay.egg-info/dependency_links.txt -> build/bdist.macosx-10.11-x86_64/egg/EGG-INFO
copying pysay.egg-info/pbr.json -> build/bdist.macosx-10.11-x86_64/egg/EGG-INFO
copying pysay.egg-info/top_level.txt -> build/bdist.macosx-10.11-x86_64/egg/EGG-INFO
zip_safe flag not set; analyzing archive contents...
creating 'dist/pysay-0.0.1-py2.7.egg' and adding 'build/bdist.macosx-10.11-x86_64/egg' to it
removing 'build/bdist.macosx-10.11-x86_64/egg' (and everything under it)
Processing pysay-0.0.1-py2.7.egg
Copying pysay-0.0.1-py2.7.egg to /usr/local/lib/python2.7/site-packages
Adding pysay 0.0.1 to easy-install.pth file

Installed /usr/local/lib/python2.7/site-packages/pysay-0.0.1-py2.7.egg
Processing dependencies for pysay==0.0.1
Finished processing dependencies for pysay==0.0.1

From this point, you could fire up a python interpreter from anywhere on your file system, call “import pysay”, and it would work – you are no longer bound to calling it from the source directory.
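If you ever want to confirm which copy of a module Python would pick up (the installed one, or the one in your source directory), something like the following works on Python 3. The helper name module_origin is mine; json is only used because it is always available.

```python
import importlib.util


def module_origin(name):
    """Return the file a top-level module would be loaded from, or None."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


print(module_origin('json'))   # a standard library package, always present
print(module_origin('pysay'))  # None unless pysay is importable from here
```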

We can also see that pip knows about it:

$ pip freeze | grep pysay
pysay==0.0.1

It’s a bit cumbersome to use however. Calling python -c "import pysay; pysay.main()" doesn’t really qualify as a user interface. Using a shell script is also problematic, because we don’t actually know where the module will be installed.

To address these problems, setuptools provides a mechanism called an entry point.

Adding an Entry Point

Add the following to your setup() declaration in setup.py, and bump the version to 0.0.2 while you’re at it:

    entry_points={
        'console_scripts': [
            'pysay=pysay:main',
        ]
    },

This simply tells setuptools that it should create a console script to call the pysay.main function.
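To make the 'pysay=pysay:main' syntax less magical, here is a rough sketch of how such a string maps to a callable. This is not setuptools’ actual code, and resolve_entry_point is a name I made up; it uses os.path:basename as the target so it runs without pysay installed.

```python
import importlib


def resolve_entry_point(spec):
    """Resolve a 'script_name=module:function' console_scripts entry."""
    script_name, target = spec.split('=', 1)
    module_name, func_name = target.split(':', 1)
    module = importlib.import_module(module_name.strip())
    return script_name.strip(), getattr(module, func_name.strip())


name, func = resolve_entry_point('basename=os.path:basename')
print(name, func('/usr/local/bin/pysay'))  # basename pysay
```

The generated console script does essentially the same lookup for pysay:main, then calls the function and exits with its return value.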

Uninstall the old version (pip uninstall pysay), as setuptools does not support upgrading packages. That’s a job for pip. Now run python setup.py install again:

$ python setup.py install
running install
[...]
Installing pysay script to /usr/local/bin

Installed /usr/local/lib/python2.7/site-packages/pysay-0.0.2-py2.7.egg
Processing dependencies for pysay==0.0.2
Finished processing dependencies for pysay==0.0.2

Notice how it installed a script to /usr/local/bin. Now we can cd out of the project directory and run it from anywhere:

~ $ pysay -h
pysay imported!
This is the main function
usage: pysay [-h] text
[...]

Everything is working, but you should add some more metadata to the setup() declaration in your setup.py file. For example:

    description='Simple tutorial from al4.co.nz',
    author='Your Name',
    author_email='you@domain',
    license='MIT',
    url='https://github.com/al4/python-tutorial-pysay',

The more metadata you add for your users, and for stdeb to build your Debian package, the better. Bump your package version (now 0.0.3) and uninstall+install again.

Developing without installing

Of course, when you’re developing your package, you don’t want to have to run setup.py install every time. Wouldn’t it be nice if changes were immediately reflected? Strangely enough, this functionality is built in:

python setup.py develop

The “develop” command symlinks the current directory into your python packages directory, so any subsequent executions of pysay will actually be running the code from your work-space.

Try it, by making a simple change, then calling pysay test from any directory. You should see your changes reflected.

Finished developing? Run pip uninstall pysay.

Building a Debian package with stdeb

To be honest, I was surprised how easy this is. Once you have a proper python package you have all the information required; from there it’s just a matter of transposing it into Debian’s format. And stdeb does this for you, provided your packaging is not too esoteric.

Stdeb can only build binary packages on a Debian machine though, as it depends on the standard Debian build tools.

As I primarily run OSX, I use a Vagrant machine for this purpose (the simple Vagrantfile is included with the github project), but you can do this however you like.

On a Debian machine (Jessie was used for this tutorial), install some dependencies:

sudo apt-get update
sudo apt-get -y install build-essential debhelper tar gzip python-stdeb devscripts

If you used my Vagrantfile, the source directory will be available at /home/vagrant/build/pysay.

You can use stdeb from pypi if you wish, but the goal here is to create software compatible with the base Debian Python distribution, so this isn’t necessary. If you need newer software, you probably don’t want to use Debian’s python packages, so have a look at dh-virtualenv.

To build a .deb file, run:

python setup.py --command-packages=stdeb.command bdist_deb

After the build process has completed, the .deb file will be at ./deb_dist/python-pysay_0.0.3-1_all.deb.
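That filename follows the usual Debian pattern of package_version-revision_architecture.deb. As far as I can tell, stdeb adds the python- prefix and a Debian revision of 1 by default, while the 0.0.3 comes from setup.py. A small sketch that pulls the parts out (the function name and dictionary keys are my own):

```python
def split_deb_filename(filename):
    """Split '<package>_<version>-<revision>_<arch>.deb' into its parts."""
    stem = filename[:-len('.deb')] if filename.endswith('.deb') else filename
    package, full_version, arch = stem.split('_')
    if '-' in full_version:
        # The Debian revision is whatever follows the last hyphen
        upstream, revision = full_version.rsplit('-', 1)
    else:
        upstream, revision = full_version, ''
    return {
        'package': package,
        'upstream_version': upstream,
        'debian_revision': revision,
        'architecture': arch,
    }


print(split_deb_filename('python-pysay_0.0.3-1_all.deb'))
```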

Test it by installing:

$ sudo dpkg -i ./deb_dist/python-pysay_0.0.3-1_all.deb
Selecting previously unselected package python-pysay.
(Reading database ... 50815 files and directories currently installed.)
Preparing to unpack .../python-pysay_0.0.3-1_all.deb ...
Unpacking python-pysay (0.0.3-1) ...
Setting up python-pysay (0.0.3-1) ...
$ pysay -h
pysay imported!
[...]
$ which pysay
/usr/bin/pysay

Here we can see that pysay is now installed in the Debian-managed directory of /usr/bin (pip would install to /usr/local/bin). You can also view all the files in the package with dpkg -L python-pysay, which will show your module installed to /usr/lib/python2.7/dist-packages.

And that is all there is to it!

Adding Unit Tests

Test-driven development is beyond the scope of this article, but many people know what testing is, they just don’t know how to start. In this section, we will create a basic unit test class, and a simple test to ensure that our string contains the input text.

First, ensure that you are in “develop” mode by running python setup.py develop. Create the tests directory:

mkdir tests  # from project root
cd tests

Now, create a new file called test_ascii_art.py, to reflect the fact that we are testing functions in the ascii_art.py file. Other than starting with “test_” the name is not significant, but naming by files you’re testing is a convenient convention to follow.

Add the following to test_ascii_art.py:

from unittest import TestCase
from pysay.ascii_art import py_format


class TestAsciiArt(TestCase):
    def test_py_format_contains_input_text(self):
        self.assertIn(
            'test-string',
            py_format('test-string')
        )

Now, we can run our tests like so:

$ python -m unittest discover
pysay imported!
.
----------------------------------------------------------------------
Ran 1 test in 0.000s

OK

From the project root however, this doesn’t work. Adding a __init__.py file will allow tests to be imported and it will detect them automatically, but instead I recommend simply running python -m unittest discover tests.

When you add more tests, they will also be discovered and executed, so long as they “look like tests”. This can mean different things to different test runners, but generally, if they are functions that start with “test_”, in a TestCase class, in a file that also starts with “test_”, in a directory called “tests”, any runner that doesn’t detect them needs to evaluate its life choices.

Now, let’s add a failing test and observe the result. Add the following function to the TestAsciiArt class:

    def test_fails(self):
        """ A test that always fails
        """
        self.assertEqual(
            'foo', 'bar'
        )

And observe the result:

$ python -m unittest discover
pysay imported!
F.
======================================================================
FAIL: test_fails (test_ascii_art.TestAsciiArt)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/alforbes/src/public/python-pysay/tests/test_ascii_art.py", line 17, in test_fails
    'foo', 'bar'
AssertionError: 'foo' != 'bar'

----------------------------------------------------------------------
Ran 2 tests in 0.000s

FAILED (failures=1)

For documentation on the unittest assertions you can call, see the unittest documentation. I recommend you use these rather than the assert keyword, as output from the unittest.TestCase assertions is much more useful.
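As a taste of what those assertions give you, here is a self-contained sketch. The py_format below is a stand-in so the example runs without pysay installed, and the test names are my own.

```python
import io
import unittest


def py_format(text):
    # Stand-in for pysay.ascii_art.py_format
    return 'logo ... {text}'.format(text=text)


class TestAssertions(unittest.TestCase):
    def test_contains_text(self):
        self.assertIn('test-string', py_format('test-string'))

    def test_returns_str(self):
        self.assertIsInstance(py_format('x'), str)

    def test_failure_message_names_both_operands(self):
        # assertRaises as a context manager captures the failure
        with self.assertRaises(AssertionError) as ctx:
            self.assertIn('absent', py_format('present'))
        self.assertIn("'absent' not found in", str(ctx.exception))


# Run the suite programmatically, sending the report to a buffer
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestAssertions)
result = unittest.TextTestRunner(stream=io.StringIO()).run(suite)
print(result.testsRun, result.wasSuccessful())
```

The last test shows why I prefer these methods: the failure message tells you what was searched for and what it was searched in, rather than a bare AssertionError.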

Using Pytest

Pytest has a few advantages, but the main benefits are improved test discovery, and cleaner output with better error reporting.

The good practices section of the pytest documentation is a great resource.

To install pytest, do pip install pytest. While you have to be in the tests directory or specify it with unittest, Pytest will detect the tests and execute them from the project root:

$ pytest  # from root
================================= test session starts ==================================
platform darwin -- Python 2.7.12, pytest-3.0.2, py-1.4.31, pluggy-0.3.1
rootdir: /Users/alforbes/src/public/python-pysay, inifile:
collected 2 items

tests/test_ascii_art.py F.

======================================= FAILURES =======================================
_______________________________ TestAsciiArt.test_fails ________________________________

self = <test_ascii_art.TestAsciiArt testMethod=test_fails>

    def test_fails(self):
        """ A test that always fails
            """
        self.assertEqual(
>           'foo', 'bar'
        )
E       AssertionError: 'foo' != 'bar'

tests/test_ascii_art.py:18: AssertionError
========================== 1 failed, 1 passed in 0.05 seconds ==========================

Remove the always failing test before continuing.
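Pytest will happily run unittest-style classes, as above, but it also collects plain functions whose names start with “test_”, and it rewrites plain assert statements so that failures show the values involved. A minimal sketch of that style, again with a stand-in py_format so the example is self-contained:

```python
def py_format(text):
    # Stand-in for pysay.ascii_art.py_format
    return 'logo ... {text}'.format(text=text)


def test_py_format_contains_input_text():
    # Under pytest, a failing plain assert reports both operands
    assert 'test-string' in py_format('test-string')


def test_py_format_keeps_spaces():
    assert py_format('two words').endswith('two words')
```

Saved as a test_*.py file in the tests directory, these would be picked up by running pytest from the project root. Whether you prefer this style or TestCase classes is largely a matter of taste.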

Testing with setuptools

Setuptools supports the unittest module natively, but in order to run your tests this way you need to be able to import them, which means adding a __init__.py file and going against pytest recommendations. But if you want to run them this way (it can be useful for some automated tools), add the following line to your setup() declaration:

    test_suite='tests',

And you can now run your suite like so:

$ python setup.py test
running test
running egg_info
writing pbr to pysay.egg-info/pbr.json
writing pysay.egg-info/PKG-INFO
writing top-level names to pysay.egg-info/top_level.txt
writing dependency_links to pysay.egg-info/dependency_links.txt
writing entry points to pysay.egg-info/entry_points.txt
reading manifest file 'pysay.egg-info/SOURCES.txt'
writing manifest file 'pysay.egg-info/SOURCES.txt'
running build_ext
pysay imported!
test_py_format_contains_input_text (tests.test_ascii_art.TestAsciiArt)
Test that py_format output contains the input string ... ok

----------------------------------------------------------------------
Ran 1 test in 0.000s

OK

Alternatively, you can integrate pytest, which is what I would recommend. See the pytest documentation for more information.

Recap

From this, I hope you can take away that it really is easy to build Python command line applications for Debian, and have them nicely install like any Debian package. There really is no excuse to deploy on production servers with pip!

We have created a simple module from scratch, configured setuptools to create a python package, added a console script to easily access our code, and wrapped the whole thing up in a .deb package.

There are many more customisation options that can be applied to both the Python and Debian packages, but those are beyond the scope of this article. See the stdeb and python packaging documentation for more information.

I hope you found the article useful. If you have any questions, please hit me up in the comments.
