Creating Your First Python Package

Offered by: Digital Work at Otto-Friedrich-Universität Bamberg
License: CC BY 4.0

graph TD
    A["<b>Goal:</b> Create a sharable Python tool"] --> B["<b>1.</b> Learn the Anatomy of a Package"];
    B --> C["<b>2.</b> Initialize Project with <code>uv init</code>"];
    C --> D["<b>3.</b> Create the Directory Structure"];
    D --> E["<b>4.</b> Install in Editable Mode"];
    E --> F["<b>5.</b> Add a Function and a Test"];
    F --> G["<b>6.</b> Manage Dependencies"];
    G --> H["<b>Result:</b> A foundational, working package"];


    style A fill:#a2a2e2,stroke:#000,stroke-width:1px,color:#000
    style B fill:#f2d08b,stroke:#000,stroke-width:1px,color:#000
    style C fill:#f2d08b,stroke:#000,stroke-width:1px,color:#000
    style D fill:#f2d08b,stroke:#000,stroke-width:1px,color:#000
    style E fill:#f2d08b,stroke:#000,stroke-width:1px,color:#000
    style F fill:#f2d08b,stroke:#000,stroke-width:1px,color:#000
    style G fill:#f2d08b,stroke:#000,stroke-width:1px,color:#000
    style H fill:#d199f1,stroke:#000,stroke-width:1px,color:#000

1. Introduction: What is a Python Package?

A Python package is a standardized way to bundle and distribute reusable code so that others (or your future self) can easily install and use it with a simple pip install command.

In this module, you will learn the fundamentals by building a complete, working package from scratch. Our goal is to demystify the process and understand the why behind each step. We will build a simple but practical utility package that standardizes journal names in bibliographic data, which is a core task in literature review tools like Colrev. This will give you the foundational skills needed to later build your own Colrev plugins.

1b. Prerequisites: Setting Up Your Tools

Before we begin, we need to ensure you have the necessary command-line tools. These do not come with Python and must be installed separately.

Installing uv

uv is a modern, extremely fast tool for managing Python packages and projects. We will use it to initialize our project and handle dependencies.

Open your terminal or command prompt and run the command that matches your operating system:

# For macOS, Linux, and Windows WSL
curl -LsSf https://astral.sh/uv/install.sh | sh

# For Windows (PowerShell)
irm https://astral.sh/uv/install.ps1 | iex

After installation, close and reopen your terminal. Verify it was installed correctly by running:

uv --version

You should see the installed version number printed.
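If you prefer to check from Python, the following is a minimal sketch using only the standard library. The helper name tool_available is a hypothetical choice for illustration; it only reports whether an executable can be found on your PATH, not which version is installed.

```python
import shutil


def tool_available(name: str) -> bool:
    """Return True if an executable called `name` is found on the PATH."""
    return shutil.which(name) is not None


if __name__ == "__main__":
    # Report the tools this tutorial relies on.
    for tool in ("uv", "python"):
        status = "found" if tool_available(tool) else "missing"
        print(f"{tool}: {status}")
```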

Installing pytest

pytest is the framework we will use to write and run tests for our code. While we could install it globally like uv, it’s a best practice to install testing tools as development dependencies for each project. We will do this in Step 3.5.

2. The Anatomy of a Modern Python Package

A consistent structure is key. It allows automated tools and other developers to understand your project instantly.

colrev-journal-formatter/
├── src/
│   └── colrev_journal_formatter/
│       ├── __init__.py
│       └── formatter.py
├── tests/
│   └── test_formatter.py
├── LICENSE
├── README.md
└── pyproject.toml
  • pyproject.toml: The control center. It contains the package name, version, dependencies, and build instructions.
  • src/colrev_journal_formatter/: The source folder. Your actual Python code lives here. Using src is a modern best practice that prevents many common import errors.
  • tests/: The testing folder. Contains code that automatically checks if your source code works correctly.
  • README.md & LICENSE: Your project’s documentation and legal rules.
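Once you have built the project, a short script can confirm that the layout described above is in place. This is a sketch under assumptions: the list EXPECTED_PATHS and the helper missing_paths are hypothetical names, and the script should be run from the project root.

```python
from __future__ import annotations

from pathlib import Path

# Paths taken from the layout shown above, relative to the project root.
EXPECTED_PATHS = [
    "pyproject.toml",
    "README.md",
    "LICENSE",
    "src/colrev_journal_formatter/__init__.py",
    "tests",
]


def missing_paths(project_root: Path) -> list[str]:
    """Return the expected paths that do not exist under project_root."""
    return [p for p in EXPECTED_PATHS if not (project_root / p).exists()]


if __name__ == "__main__":
    missing = missing_paths(Path.cwd())
    print("Layout looks complete." if not missing else f"Missing: {missing}")
```

An empty result means every expected file and folder was found; anything listed still needs to be created.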

3. Step-by-Step Guide to Creating Your Package

Let’s build a package named colrev-journal-formatter.

Step 3.1: Initialize the Project with uv

First, create a directory and initialize the project.

mkdir colrev-journal-formatter
cd colrev-journal-formatter
uv init

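This creates your pyproject.toml file. The exact contents depend on your uv version and on the options passed to uv init; a plain uv init may, for example, leave out the [build-system] section, in which case you can add it by hand. For this tutorial, the file should look like this: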

[project]
name = "colrev-journal-formatter"
version = "0.1.0"
description = "A short description of the project."
authors = [ { name = "Your Name", email = "you@example.com" } ]
requires-python = ">=3.8"
dependencies = []

[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"
  • [project]: A standard section holding all your project’s metadata.
  • name: The name used to install your package (pip install colrev-journal-formatter).
  • version: The current version. You should increment this with each new release.
  • dependencies = []: A list of other packages that your package needs to function. uv add will populate this list for you.
  • [build-system]: This section tells pip how to build your package. It specifies the “build backend” (hatchling) which acts as a “factory” to assemble your code into a distributable format.

Step 3.2: Create the Directory Structure

Now, let’s create the necessary files and folders for our code.

mkdir -p src/colrev_journal_formatter tests
touch src/colrev_journal_formatter/__init__.py
touch src/colrev_journal_formatter/formatter.py
touch tests/test_formatter.py

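Note that the folder is named colrev_journal_formatter with underscores (_), while the project name colrev-journal-formatter uses hyphens (-). The folder name is what you write in an import statement, and Python import names must be valid identifiers, so hyphens are not allowed there.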
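The relationship between the two names can be sketched in a few lines. The helper to_import_name is a hypothetical illustration, not part of any packaging tool; it applies the common convention of replacing hyphens with underscores and then checks that the result is a valid Python identifier.

```python
import keyword


def to_import_name(distribution_name: str) -> str:
    """Convert a project name into a candidate import name (hypothetical helper)."""
    candidate = distribution_name.lower().replace("-", "_").replace(".", "_")
    if not candidate.isidentifier() or keyword.iskeyword(candidate):
        raise ValueError(f"{candidate!r} is not a valid Python module name")
    return candidate


# A hyphenated name cannot be imported directly ...
print("colrev-journal-formatter".isidentifier())   # False
# ... but its underscore form can.
print(to_import_name("colrev-journal-formatter"))  # colrev_journal_formatter
```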

Step 3.3: Install the Package in Editable Mode

Install your package locally so you can use and test it as you develop. Run this from the project’s root directory.

pip install -e .

The -e or --editable flag is essential for development. It creates a link to your source code instead of copying it. This means any changes you make to your Python files are immediately usable without needing to reinstall.
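One way to confirm that the editable install worked is to ask Python where it would load the package from. The sketch below uses importlib from the standard library; the helper name locate_package is hypothetical. After a successful editable install, the reported path should point into your project's src folder rather than into site-packages.

```python
import importlib.util
from typing import Optional


def locate_package(import_name: str) -> Optional[str]:
    """Return the file Python would import `import_name` from, or None if not found."""
    spec = importlib.util.find_spec(import_name)
    if spec is None:
        return None
    return spec.origin


if __name__ == "__main__":
    location = locate_package("colrev_journal_formatter")
    print(location or "Package not installed in this environment.")
```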

Step 3.4: Add Your Core Logic

Let’s add our core logic. Open src/colrev_journal_formatter/formatter.py and add this code:

# src/colrev_journal_formatter/formatter.py

def standardize_journal_name(name: str) -> str:
    """
    Standardizes a journal name by replacing common abbreviations.
    """
    abbreviations = {
        "J": "Journal",
        "Comput": "Computing",
        "Syst": "Systems",
        "Sci": "Science",
    }
    
    words = name.split()
    standardized_words = [abbreviations.get(word, word) for word in words]
    
    return " ".join(standardized_words)
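A few sample calls show what the function does and where its limits are. The snippet repeats the function in compact form only so that it runs on its own; inside the project you would import it from colrev_journal_formatter.formatter instead. The lookup is an exact, case-sensitive match on whitespace-separated words, so abbreviations with a trailing period or different capitalization are left unchanged.

```python
def standardize_journal_name(name: str) -> str:
    # Same logic as formatter.py above, repeated so this snippet runs on its own.
    abbreviations = {
        "J": "Journal",
        "Comput": "Computing",
        "Syst": "Systems",
        "Sci": "Science",
    }
    return " ".join(abbreviations.get(word, word) for word in name.split())


print(standardize_journal_name("J of Comput Syst"))  # Journal of Computing Systems
print(standardize_journal_name("J.  Comput. Sci."))  # J. Comput. Sci.
print(standardize_journal_name("j comput sci"))      # j comput sci
```

Note that repeated spaces are collapsed to a single space, because split() discards the extra whitespace before the words are joined again.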

Step 3.5: Add pytest and Write Your First Test

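Now we’ll add pytest as a development dependency.

uv add --dev pytest

This command installs pytest into the project’s virtual environment and records it in pyproject.toml as a development dependency (recent uv versions use a [dependency-groups] table for this). It stays out of the dependencies list in [project], so people who only install your package do not need pytest.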

Next, open tests/test_formatter.py and add your tests:

# tests/test_formatter.py
from colrev_journal_formatter.formatter import standardize_journal_name

def test_standardize_journal_name():
    """Tests that abbreviations are correctly expanded."""
    input_name = "J of Comput Syst"
    expected_name = "Journal of Computing Systems"
    assert standardize_journal_name(input_name) == expected_name

def test_no_abbreviations():
    """Tests that a name with no abbreviations remains unchanged."""
    input_name = "Journal of Modern Science"
    assert standardize_journal_name(input_name) == input_name
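As the number of cases grows, a table of input and expected pairs can keep the test file short. The following is a sketch of that pattern in plain Python, which pytest collects like any other test function. The name CASES and the stand-in copy of the function are assumptions made so the example runs on its own; in tests/test_formatter.py you would keep the import from colrev_journal_formatter.formatter.

```python
def standardize_journal_name(name: str) -> str:
    # Stand-in copy so the sketch runs on its own. In tests/test_formatter.py,
    # import the function from colrev_journal_formatter.formatter instead.
    abbreviations = {
        "J": "Journal",
        "Comput": "Computing",
        "Syst": "Systems",
        "Sci": "Science",
    }
    return " ".join(abbreviations.get(word, word) for word in name.split())


# (input, expected output) pairs; adding a case is a one-line change.
CASES = [
    ("J of Comput Syst", "Journal of Computing Systems"),
    ("Journal of Modern Science", "Journal of Modern Science"),
    ("Sci", "Science"),
    ("", ""),
]


def test_standardize_journal_name_cases():
    """Checks every (input, expected) pair in CASES."""
    for input_name, expected_name in CASES:
        result = standardize_journal_name(input_name)
        assert result == expected_name, f"{input_name!r} gave {result!r}"
```

The message after the assert names the failing input, which makes it easier to see which row of the table broke.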

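Finally, run the tests from the project’s root directory. If your shell cannot find pytest because the project’s virtual environment is not activated, run uv run pytest instead: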

pytest

4. Applying Your Skills: The Colrev Plugin Context

You have just built a complete, tested Python package. This is the exact skill set needed to create a Colrev plugin. A Colrev plugin is simply a standard Python package that is designed to interact with the Colrev framework.

The standardize_journal_name function you wrote is a perfect example of a data cleaning operation that a colrev prep package might perform. To turn your package into a real plugin, you would add more Colrev-specific code to register it and have it process bibliographic records. The core work of writing clean, testable functions and packaging them is exactly what you have just learned.

5. Conclusion and Further Steps

Congratulations! You have successfully created, installed, and tested a complete Python package from scratch. You’ve learned the fundamental skills of a modern Python developer:

  • Structuring a project with pyproject.toml and a src layout.
  • Initializing a project with uv init.
  • Developing efficiently using an editable install (pip install -e .).
  • Ensuring code quality with automated tests using pytest.
  • Managing dependencies declaratively with uv add.

These are the universal building blocks of sharable and maintainable Python code.

Further Steps

Your journey as a package developer is just beginning. The logical next steps in a package’s lifecycle are:

  • Documentation: A good README.md is essential. For larger projects, tools like Sphinx or MkDocs can build a full documentation website from your code’s docstrings. Clear documentation is what separates a good project from a great one.
  • Publishing: To share your package with the world, you can publish it to the Python Package Index (PyPI). This makes it available to anyone via pip install your-package-name. This process is typically automated using GitHub Actions.