Welcome to swak's documentation! ================================= *Swiss army knife for functional data-science projects.* Introduction ------------ This package is a collection of small, modular, and composable building blocks implementing frequently occurring operations in typical data-science applications. In abstracting away boiler-plate code, it thus saves time and effort. - Consolidate all ways to configure your project (command-line arguments, environment variables, and config files) with the :doc:`cli` and :doc:`text` packages, respectively. - Wrap the project config into a versatile :doc:`jsonobject`. - Focus on writing small, configurable, modular, reusable, and testable building blocks. Then use the flow controls in :doc:`funcflow` to compose them into arbitrarily complex workflows, that are still easy to maintain and to expand. - Quickly set up projects on Google BigQuery and Google Cloud as well as AWS object Storage, and efficiently download lots of data in parallel with the :doc:`cloud` sub-package. - Build powerful neural-network architectures from the elements in :doc:`pt` and train your deep-learning models with early stopping and checkpointing. From feature embedding, over feature importance, to repeated residual blocks, a broad variety of options is available. - And much more ... Installation ------------ - Create a new virtual environment running at least ``python 3.12``. - The easiest way of installing :mod:`swak` is from the python package index `PyPI `_, where it is hosted. Simply type .. code-block:: bash pip install swak or treat it like any other python package in your dependency management. - If you need support for interacting with the Google Cloud Project, in particular Google BigQuery and Google Cloud Storage, install *extra* dependencies with: .. code-block:: bash pip install swak[cloud] - In order to use the subpackage :mod:`swak.pt`, you need to have `PyTorch `_ installed. Because there is no way of knowing whether you want to run it on CPU only or also on GPU and, if so, which version of CUDA (or ROC) you have installed on your machine and how, it is not an explicit dependency of :mod:`swak`. You will have to install it yourself, *e.g.*, following `these instructions `_. If you are using `pipenv` for dependency management, you can also have a look at the `Pipfile `_ in the root of the `swak repository `_ and taylor it to your needs. Personally, I go .. code-block:: bash pipenv sync --categories=cpu for a CPU-only installation of PyTorch and .. code-block:: bash pipenv sync --categories=cuda if I want GPU support. Usage ----- Try making a new repository using the `swak-template `_ as a, well, template. .. toctree:: :hidden: :maxdepth: 1 :caption: API Reference dictionary funcflow text cli jsonobject pd pl cloud pt misc Indices and tables ================== * :ref:`genindex` * :ref:`modindex` * :ref:`search`