By @jmmshn, Jesper Kristensen and 0xbrainjar
At Advanced Blockchain Research and Composable Labs, we primarily work with Python for our data analytics. However, as Julia becomes more popular, its performance and the ease with which it expresses abstract mathematical operations should not be ignored.
For the near future, we foresee that mixed Python and Julia codebases will be the most successful for complex data analytics tasks, especially when the operations hit the performance ceiling of Python. This post therefore demonstrates how to set up a minimal GitHub repository (repo) containing both Python and Julia code.
Julia is most often used for abstract mathematics and scientific computing, and writing mathematical code without proper testing is a sure way to invite pain and frustration. Our goal is therefore to help you set up a repo with mixed Python and Julia code and to implement automated testing for it. The full example is available on our GitHub.
We are actively developing and simulating new bridging technologies in these languages, which will be the topic of future posts.
More on Julia
Julia is a high-level, high-performance, dynamic programming language optimized for scientific computing. It is a general-purpose, dynamically typed language built around multiple dispatch and compiled just-in-time via LLVM. Because the compiler specializes code on the types of the arguments, Julia is often significantly faster than interpreted languages such as Python and Ruby for numerical work.
Despite these advantages, most of the scientific community, especially in data science and machine learning, relies on well-established Python libraries, and it is still difficult to beat the universality of Python. A Julia-only codebase is therefore often a good choice for a research project but a poor choice for a broader production system that has to integrate with existing libraries and frameworks. (Depending on the production system, neither language may be the right choice, which is why we also work in other languages.)
About PyJulia
PyJulia is a Python package that provides a high-level interface for calling Julia from Python. The mechanics of embedding Julia in Python code using PyJulia have been covered in other posts. This post focuses on best practices for connecting Julia and Python code in a new codebase, as well as on configuring GitHub Actions to run automated continuous integration (CI) workflows for your mixed Python + Julia code.
Installing and configuring PyJulia for a project
To get started, make sure that Python and Julia are installed on your machine, along with the package manager for each language. We recommend using Miniconda for Python. Assuming you have set up Python with pip, the easiest way to install PyJulia is from PyPI using the pip command:
pip install julia
Once you have installed PyJulia, you can use its built-in install command to install the required PyCall Julia package and perform the necessary setup:
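A minimal sketch of that install step, assuming the julia package from PyPI is already installed and a Julia executable is on the PATH. The wrapper name install_julia_backend is ours, for illustration only; julia.install() is PyJulia's own installer.

```python
def install_julia_backend() -> None:
    """Install PyCall.jl and configure it for the current Python interpreter."""
    # Imported inside the function so that this module can be loaded
    # even on machines where PyJulia is not installed yet.
    import julia

    # PyJulia's built-in installer: adds PyCall on the Julia side and
    # builds it against the Python interpreter running this code.
    julia.install()
```

Calling install_julia_backend() once per environment should be enough. The same step can be run from a shell as python -c "import julia; julia.install()".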
According to PyJulia's documentation, the Python interpreter from conda is statically linked to libpython, and PyJulia does not fully support such interpreters yet. The recommended workaround is to pass compiled_modules=False to the Julia constructor once, which disables Julia's precompilation cache mechanism.
Note: this does affect performance, so you may want to switch to a different Python interpreter if performance is an issue. More information on this issue can be found in the troubleshooting section of the PyJulia documentation.
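The workaround can be sketched as follows. The function name is hypothetical; the Julia class and its compiled_modules keyword come from julia.api in PyJulia.

```python
def start_julia_without_precompile_cache():
    """Start the embedded Julia runtime with the precompilation cache disabled."""
    # Lazy import: keeps this module importable when PyJulia is absent.
    from julia.api import Julia

    # compiled_modules=False is the documented workaround for Python
    # interpreters that are statically linked to libpython (e.g. conda).
    return Julia(compiled_modules=False)
```

This needs to run once per Python process, before the first "from julia import Main" or any other import of a Julia module.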
Now you can test your PyJulia installation by running the following code:
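One possible smoke test is to evaluate a trivial Julia expression and compare the result on the Python side. This is a sketch, not necessarily the check used in the repo: pyjulia_smoke_test and its evaluate parameter are illustrative names, and the parameter exists only so the function can be exercised without a Julia runtime.

```python
def pyjulia_smoke_test(evaluate=None) -> bool:
    """Return True if a trivial Julia expression evaluates correctly."""
    if evaluate is None:
        # Main is PyJulia's handle on Julia's Main module;
        # Main.eval runs a string of Julia code and converts the result.
        from julia import Main
        evaluate = Main.eval

    # The string is Julia code; the result comes back as a Python int.
    return evaluate("1 + 1") == 2
```

With a working installation, pyjulia_smoke_test() should return True. If you rely on the conda workaround, create Julia(compiled_modules=False) first.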
For most Python + Julia projects, you will also need to install dependencies for both languages. For Python, simply run pip install -r requirements.txt. In Julia, you can install the dependencies listed in the Project.toml file by instantiating the project environment (Pkg.instantiate()), assuming the current directory (.) contains the Project.toml file.
For the present example, we will just have the Example package as a placeholder dependency.
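One common way to do this is to call the julia executable with the --project flag and Pkg.instantiate(). The sketch below builds and runs that command from Python; it assumes julia is on the PATH, and the helper names are hypothetical.

```python
import subprocess


def instantiate_command(project_dir: str = ".") -> list:
    """Build the shell command that installs the Julia dependencies."""
    return [
        "julia",
        f"--project={project_dir}",      # use the Project.toml found here
        "-e",
        "using Pkg; Pkg.instantiate()",  # resolve and install dependencies
    ]


def instantiate(project_dir: str = ".") -> None:
    """Run the command; raises CalledProcessError if Julia exits non-zero."""
    subprocess.run(instantiate_command(project_dir), check=True)
```

Keeping the command builder separate from the runner makes it easy to print the command or reuse it in a CI script.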
Once the dependency tree has been built, you can access this particular Julia environment from within your Python code by activating it through Pkg.
Here, the path you pass in is the directory that contains the Project.toml file.
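A sketch of the activation step, assuming PyJulia's "from julia import Pkg" import and Julia's Pkg.activate. The function name and the pkg parameter are illustrative; the parameter only allows a stand-in object to be passed in for testing.

```python
from pathlib import Path


def activate_julia_project(project_dir, pkg=None) -> str:
    """Activate the Julia environment whose Project.toml is in project_dir."""
    # Resolve to an absolute path so the result does not depend on the
    # working directory of the embedded Julia process.
    path = str(Path(project_dir).resolve())
    if pkg is None:
        # PyJulia exposes Julia's Pkg standard library as a Python module.
        from julia import Pkg as pkg
    pkg.activate(path)
    return path
```

After activation, packages listed in that Project.toml can be imported through PyJulia in the same Python process.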
Setting up a Mixed Python and Julia Project
Following these steps, we now have all the individual pieces for the skeleton of a mixed Python and Julia project. For the full code, go to our GitHub.
For now, we will discuss all the different parts of this skeleton repo. The structure of the project is as follows:
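As a quick sanity check on the layout, the following sketch looks for the files named in this post anywhere under the repo root. It makes no assumption about which subdirectory each file lives in, and EXPECTED_FILES and missing_files are illustrative names that are not part of the repo.

```python
from pathlib import Path

# File names mentioned in this post; their exact locations are not assumed.
EXPECTED_FILES = [
    "setup.py",
    "setup.cfg",
    "Project.toml",
    "julia_funcs.jl",
    "julia_funcs.py",
]


def missing_files(root) -> list:
    """Return the expected file names that are not found under root."""
    root = Path(root)
    # rglob searches root and all of its subdirectories for the name.
    return [name for name in EXPECTED_FILES if not any(root.rglob(name))]
```

Running missing_files(".") from the repo root should return an empty list for a complete skeleton.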
Here the setup.py and setup.cfg files are used to install the present package as py_jl.
Since testing is crucial for the kind of numerical problems people typically use Julia to solve, the setup.cfg also contains additional information about how to run tests using pytest.
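To give a sense of what such a test can look like, here is a sketch of a pytest-style test for a numerical function. The square function is a pure-Python stand-in for a wrapped Julia function, and both names are hypothetical.

```python
import math


def square(x) -> float:
    # Stand-in for a Python wrapper around a Julia function.
    return float(x) ** 2


def test_square_matches_reference():
    # Compare against a known value with a relative tolerance, since
    # floating-point results rarely match exactly across languages.
    assert math.isclose(square(3), 9.0, rel_tol=1e-12)
    # The wrapper should always hand back a Python float.
    assert isinstance(square(3), float)
```

pytest collects functions whose names start with test_ from files named test_*.py, so a file like this runs with a plain pytest invocation.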
Since Python is strongly object-oriented while Julia is organized around functions and multiple dispatch, it is helpful to have a direct one-to-one mapping between all the user-exposed functions in the two languages. To accomplish this, we use the julia_funcs.jl file to define all the Julia functions we want to expose and a corresponding julia_funcs.py file to define simple Python wrappers around these Julia functions. These wrappers serve two important purposes:
- They are used to provide docstrings for the exposed functions.
- They might be needed to clean up the type signatures on the Python side.
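As an example, consider the pair of files julia_funcs.jl and julia_funcs.py, where the Python wrapper converts its argument to a floating-point number before calling a Julia function that accepts only Float64.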
For well-established Julia packages, the kind of type conversion described above should not be necessary. In fact, the example works better if the type signature of the Julia function's input is relaxed from Float64 to Number. For some packages, however, the exposed functions have less flexible type signatures, so conversion on the Python side may be necessary.
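The wrapper pattern can be sketched in pure Python as follows. This is not the code from the repo: wrap_float64, strict_double, and double are hypothetical names, and strict_double imitates a Julia method declared only for Float64 so that the example runs without Julia.

```python
from typing import Callable


def wrap_float64(jl_func: Callable[[float], float], doc: str) -> Callable:
    """Wrap a Float64-only Julia function with a docstring and conversion."""
    def wrapper(x):
        # A Python int reaches Julia as Int64 and would miss a method
        # declared for Float64, so convert on the Python side first.
        return jl_func(float(x))
    wrapper.__doc__ = doc
    return wrapper


def strict_double(x):
    # Pure-Python stand-in for a Julia method f(x::Float64).
    if not isinstance(x, float):
        raise TypeError("no method matching strict_double(::Int64)")
    return 2.0 * x


double = wrap_float64(strict_double, "Return twice the input as a float.")
```

With PyJulia, jl_func would instead be a function loaded from julia_funcs.jl, for example via Main.include("julia_funcs.jl") followed by attribute access on Main.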
Continuous Integration with GitHub Actions
Once you have set up the project, you can set up GitHub Actions to run the tests. The workflow is a YAML file placed in the .github/workflows directory, and it runs automatically whenever a new commit is pushed.
Note that the important lines in the workflow are the ones that ensure the PyCall package is installed on the Julia side and that all the Julia functions in the present repo are made available to PyJulia.
Conclusion
We have covered the basics of setting up a Python + Julia project as well as how to set up continuous integration with GitHub Actions. We hope this tutorial has been helpful to you and that you can continue to learn more about the Julia language and how to use it in your own projects.
If you are a developer with a project you think fits our ecosystem and goals, and you would like to participate in our interactive testing landscape at Composable Labs, reach out on Telegram at @brainjar.