NOTE: you can skip ahead to **Problem formulation**. The Motivation section just explains how I found myself asking this.
# Motivation

I use and love nbdev. It makes developing a Python package iteratively as easy as it gets, especially since I generally do this alongside a project which uses said package. Does this question require knowledge of nbdev? No, it only motivates why I am asking. Normally when I create a new nbdev project (`nbdev_new`), I get a `settings.ini` file and a `setup.py` file. In order to keep working on different packages / projects simple, I will immediately create a conda environment file `env.yml` for the project (see **Files > Example conda file** below).
I know that the environment one develops a package in does NOT necessarily reflect the package's minimum dependencies. Further, a package's dependencies are a subset of those one may need when working on a project utilizing said package. In MY use case it is clear that I am double-dipping: I am developing the package as I use it on a project.

So for the sake of this question let's assume that the package dependencies == project dependencies. In other words, the `env.yml` file contains all of the `requirements` for the `settings.ini` file.
# nbdev workflow

- make a new empty repo `current_project` and clone it
- `cd path/to/current_project`
- `nbdev_new`
- make the `env.yml` file
- create / update the environment:

  ```sh
  # create conda environment
  $ mamba env create -f env.yml

  # update conda environment as needed
  $ mamba env update -n current_project --file env.yml
  # $ mamba env update -n current_project --file env.mac.yml
  ```

- activate the environment:

  ```sh
  # activate conda environment
  $ conda activate current_project
  ```

- install `current_project`:

  ```sh
  # install for local development
  $ pip install -e .
  ```
# Problem formulation

I am developing a package in Python using a `setup.py` file. My package may have requirements (listed under `settings.ini` with the key `requirements`) that get automatically imported and used in the `setup.py` file. While developing my package I have a conda environment which is specified in a yaml file `env.yml` (see **Files > Example conda file** below).

I also have some GitHub Actions that test my package. I dislike having to update `settings.ini` manually (especially since it doesn't allow for multiple lines) to get the requirements into `setup.py`, especially as I have already listed them out nice and neatly in my `env.yml` file. So my question is as follows:
# Question

Given a conda environment yaml file (e.g. `env.yml`), how can one iterate through its contents and convert the dependencies (and their versions) to the correct `pypi` form (required by `setup.py`), storing them in `settings.ini` under the keyword `requirements`?
Caveats:

- version specifier requirements in conda are not the same as for `pypi`. Most notably `=` vs `==`, amongst others.
- package names for conda may not be the same as for `pypi`. For example PyTorch is listed as `torch` for `pypi` and `pytorch` for conda.
- the environment yaml file may have channel specifiers, e.g. `conda-forge::<package-name>`.
- the environment yaml file may specify the python version, e.g. `python>=3.10`, which shouldn't be a requirement.
- MY ideal solution works with my workflow. That means the contents of `env.yml` need to get transferred to `settings.ini`.
# Desired outcome

My desired outcome is that I can store all of my package requirements in the conda environment file `env.yml` and have them automatically find their way into the `setup.py` file under `install_requires`. Since my workflow is built around reading the requirements in from a `settings.ini` file (from nbdev), I would like the solution to take the values of `env.yml` and put them in `settings.ini`.
Note: I am sharing my current solution as an answer below. Please help improve it!
# Files

## Example conda file
```yaml
# EXAMPLE YAML FILE
name: current_project

channels:
  - pytorch
  - conda-forge
  - fastai

dependencies:
  - python>=3.10

  # Utilities
  # -------------------------------------------------------------------------
  - tqdm
  - rich
  - typer

  # Jupyter Notebook
  # -------------------------------------------------------------------------
  - conda-forge::notebook
  - conda-forge::ipykernel
  - conda-forge::ipywidgets
  - conda-forge::jupyter_contrib_nbextensions

  # nbdev
  # -------------------------------------------------------------------------
  - fastai::nbdev>=2.3.12

  # PyTorch & Deep Learning
  # -------------------------------------------------------------------------
  - pytorch>=2
  # NOTE: add pytorch-cuda if using a CUDA enabled GPU. You will need to
  # remove this if you are on Apple Silicon
  # - pytorch::pytorch-cuda
  - conda-forge::pytorch-lightning

  # Plotting
  # -------------------------------------------------------------------------
  - conda-forge::matplotlib
  - conda-forge::seaborn

  # Data Wrangling
  # -------------------------------------------------------------------------
  - conda-forge::scikit-learn
  - pandas>=2
  - numpy
  - scipy

  # Pip / non-conda packages
  # -------------------------------------------------------------------------
  - pip
  - pip:
      # PyTorch & Deep Learning
      # ---------------------------------------------------------------------
      - dgl
```
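For reference, loading a file like the one above and separating the conda dependencies from the nested `pip:` list can be done with PyYAML's `yaml.safe_load` (the inline `EXAMPLE` string here is a trimmed-down stand-in for the real `env.yml`):

```python
import yaml  # pip install pyyaml

# trimmed-down stand-in for the real env.yml
EXAMPLE = """\
name: current_project
dependencies:
  - python>=3.10
  - conda-forge::notebook
  - pandas>=2
  - pip
  - pip:
      - dgl
"""

env = yaml.safe_load(EXAMPLE)

conda_deps, pip_deps = [], []
for dep in env["dependencies"]:
    if isinstance(dep, dict) and "pip" in dep:
        # the nested "pip:" mapping holds pip-only packages like dgl
        pip_deps.extend(dep["pip"])
    else:
        conda_deps.append(dep)
```

Both lists can then be fed through whatever conda-to-pypi conversion you settle on before writing `settings.ini`.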
# Current Solution

The current solution is the file `env_to_ini.py` (see **Files > env_to_ini.py** below).

NOTE: this solution uses `rich` and `typer` to create an informative command line interface (CLI) which will show you which dependencies were added, changed, removed, or unchanged. This should make working with the script easier (especially should there be bugs in it), as it has not been extensively tested.

## How to use `env_to_ini.py`
Assumptions:

- `env.yml` or `env.mac.yml` under the project root
- `settings.ini` under the project root
- `env_to_ini.py` under the project root

This script is provided so that if the `env.yml` (or `env.mac.yml`) file changes, you can automatically update the dependencies of the `current_project` package (under `settings.ini`) to match.

## Caveats
This is a bit hacky. You can modify it per project as needed. The so-called "hackiness" is primarily located under the two `TODO`s, which I will now explain. Note that `TODO` 2 is more important than `TODO` 1.

### TODO 1: exclusion
Search for the first `TODO` in the provided script.

Per the original question, some packages (namely `python`) are to be excluded when reading the conda environment file `env.yml`. Under the first `TODO`, this is currently achieved via an `if / elif / else` statement. You could change the script to accept an additional argument listing packages to exclude, or read in an additional file containing them.

NOTE: it is unclear to me whether modifying your `env.yml` file to have a section after `dependencies` called `ignore` would mess up your usage with conda.

### TODO 2: mapping
Search for the second `TODO` in the provided script.

Per the original question, some packages need to be renamed because their name on conda differs from their name on `pypi`. The example given is PyTorch, which is listed as `torch` for `pypi` and `pytorch` for conda.

Under the function `requirements_to_ini` this is currently achieved using an `if / elif / else` statement. You could change the script to accept an additional argument listing packages to rename, or read in an additional file containing them.

NOTE: it is unclear to me whether modifying your `env.yml` file to have a section after `dependencies` called `rename` would mess up your usage with conda.

NOTE: it is unclear to me how you could determine these renames automatically, per your question's desired outcome.
## Files

### env_to_ini.py
# Update

`env2ini`: it is now a CLI available on pypi, conda, and GitHub, with "docs".