3  Research Best Practices

Here are some links to resources on good coding practices and project setup.

Resources

  • Better Code Better Science - Full guide on project setup, good coding practices, software testing
  • The Good Research Code Handbook - This handbook is for grad students, postdocs and PIs who do a lot of programming as part of their research.
  • The Turing Way: Reproducible Research for Everyone - Guide on reproducible research practices, including code organization and version control
  • Use git and GitHub for version control
  • Use Datalad for version control of datasets. Kind of like git for data.
  • Work remotely using VS Code
  • Use linters to highlight semantic and stylistic issues with your code
  • Use formatters to automatically apply stylistic changes
  • Use uv for management of Python versions, virtual environments, Python projects, and packages
  • Use cookiecutter to set up projects based on a template.
  • Great books: Clean Code (Robert C. Martin), Naming Things (Tom Benner), Design Patterns: Elements of Reusable Object-Oriented Software (Erich Gamma, Richard Helm, Ralph Johnson, John Vlissides)

Guide for project setup:

For an extensive guide on which tools to use, read through Essential Tools and Techniques; for a more elaborate guide on how to set up projects, read Project Setup. What follows is a brief guide on how to set up a Python project using uv and cookiecutter.

Step 1: Install basic tools

  1. Install git for version control.
  2. Install uv. You can install it via pip, but it is recommended to install it standalone.
  3. Install cookiecutter via uv:
uv tool install cookiecutter
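
Before moving on, it can help to confirm that all three tools are actually on your PATH. The following is a minimal sketch, not part of any of the tools themselves; check_tools is a hypothetical helper name introduced here for illustration.

```shell
# check_tools: hypothetical helper that reports whether each named
# command can be found on PATH. Returns 1 if any tool is missing.
check_tools() {
    status=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool"
        else
            echo "missing: $tool"
            status=1
        fi
    done
    return "$status"
}

# Check the tools used in this guide.
check_tools git uv cookiecutter || echo "Install the missing tools before continuing."
```

If cookiecutter is reported missing right after installation, the uv tool directory may not be on your PATH yet; opening a new terminal often resolves this.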

Step 2: Create a new project

Initialize a new Python project using cookiecutter. There are many different templates available online. A simple one for Python projects could be gh:RDoerfel/simple-cookiecutter or gh:patrickmineault/true-neutral-cookiecutter. A more comprehensive one for complex projects could be gh:CBS-HPC/replication_package.

In the directory where the new project folder should be created, run:

cookiecutter gh:RDoerfel/simple-cookiecutter

This will prompt you to enter some information about your project, like the project name, author name, etc. After you have entered all the information, cookiecutter will create a new directory with the project structure based on the template. Once done, navigate into your new project directory:

cd your_project_name
ls -a

If you used the simple-cookiecutter template, you should see a structure like this:

your-project-name/
├── your_package_name/     # Main source code
├── data/                  # Data files
├── docs/                  # Documentation
├── notebooks/             # Jupyter notebooks
├── results/               # Analysis results
├── scripts/               # Utility scripts
├── tests/                 # Test files
├── .gitignore
├── LICENSE
├── README.md
└── pyproject.toml         # Project metadata and dependencies

Step 3: Set up Python environment

Now select a specific Python version and initialize a new environment using uv:

uv python install 3.12
uv venv --python 3.12

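This will install Python 3.12 and create a new virtual environment for your project in the .venv directory. Note that uv venv does not create a pyproject.toml file; if your project does not have one yet, run uv init to create it.

If you already have a pyproject.toml available, for example one created by cookiecutter, you can instead just run the following command, which creates the virtual environment if needed and installs the dependencies listed in pyproject.toml: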

uv sync

You can now add dependencies to your project using uv. For example, to add numpy and pandas, run:

uv add numpy pandas

This will install the packages and add them to your pyproject.toml file.
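
To confirm that the environment works, you can run a command inside it with uv run. The following is a minimal sketch, assuming it is called from the project root and that numpy and pandas were added as shown above; verify_env is a hypothetical helper name, not a uv command.

```shell
# verify_env: hypothetical helper that checks the uv-managed environment
# of the project in the current directory.
verify_env() {
    if ! command -v uv >/dev/null 2>&1; then
        echo "uv is not installed" >&2
        return 1
    fi
    if [ ! -f pyproject.toml ]; then
        echo "no pyproject.toml in $(pwd)" >&2
        return 1
    fi
    # uv run executes the command inside the project's virtual environment,
    # so the import only succeeds if the packages are installed there.
    uv run python -c "import numpy, pandas; print(numpy.__version__, pandas.__version__)"
}

# Usage, from the project root:
#   verify_env
```

Using uv run this way avoids having to activate the virtual environment manually before running scripts.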

Step 4: Version control

Now you can start adding your code and files to the project. Remember to commit your changes regularly using git. Start by initializing your local git repository:

git init

If you want to add a remote repository, for example on GitHub, you can do so by running:

git remote add origin your_remote_repository_url
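
The remote URL can be written in an SSH or an HTTPS form. The sketch below shows both, using a throwaway repository so that it is safe to run as-is; the account and repository names (your-username, your-project) are placeholders, not real locations.

```shell
set -e
# Throwaway repository, so the demo does not touch a real project.
demo_dir=$(mktemp -d)
git init -q "$demo_dir"
cd "$demo_dir"

# SSH form: git@<host>:<user>/<repo>.git (authenticates with your SSH key)
git remote add origin git@github.com:your-username/your-project.git

# HTTPS form, for comparison:
#   git remote add origin https://github.com/your-username/your-project.git

# An existing remote can be switched to the SSH form later on.
git remote set-url origin git@github.com:your-username/your-project.git

# List the configured remotes and their URLs.
git remote -v
```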

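It is recommended to use SSH for authentication instead of HTTPS. Now add all files, create your first commit, and push it. Renaming the branch to main first ensures that the push works even if your git installation uses a different default branch name:

git add .
git commit -m "Initial commit"
git branch -M main
git push -u origin main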

In the future, you can just add, commit and push your changes regularly. Keep commits small, i.e., one commit per feature or bugfix, and write meaningful commit messages. A fine-grained git history makes it much easier to revert to a previous version if something goes wrong.
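
The benefit of small commits can be sketched as follows. The example runs in a throwaway repository with a placeholder identity and two hypothetical files (load.py and plot.py); because each change has its own commit, the second one can be undone without touching the first.

```shell
set -e
# Throwaway repository, so the demo does not touch a real project.
demo_dir=$(mktemp -d)
cd "$demo_dir"
git init -q
git config user.name "Demo User"          # placeholder identity for the demo
git config user.email "demo@example.com"

# One commit per logical change.
echo "print('load data')" > load.py
git add load.py
git commit -q -m "Add data loading script"

echo "print('plot results')" > plot.py
git add plot.py
git commit -q -m "Add plotting script"

# Undo only the most recent commit; this adds a new commit that
# reverses it and leaves the earlier history untouched.
git revert --no-edit HEAD

# Show the resulting history, one line per commit.
git log --oneline
```

Since git revert records a new commit instead of rewriting history, it is also safe to use on commits that have already been pushed.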

Reproducible workflow