installation
package
Create and activate a new virtual environment running at least
Python 3.12. The easiest way of installing
slangmod is from the Python Package Index (PyPI), where it is hosted. Simply type
pip install slangmod
or treat it like any other Python package in your dependency management.
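Before installing, it can help to confirm that the interpreter in the active environment meets the version requirement. The following is a minimal sketch, not part of slangmod: the helper names python_is_recent_enough and running_in_virtualenv are made up for illustration, and only the Python standard library is used.

```python
import sys

MIN_PYTHON = (3, 12)  # minimum version stated in this section


def python_is_recent_enough(version=None, minimum=MIN_PYTHON):
    """Return True if the given (or the running) interpreter meets the minimum.

    Hypothetical helper for illustration; not provided by slangmod.
    """
    current = sys.version_info[:2] if version is None else version[:2]
    return tuple(current) >= tuple(minimum)


def running_in_virtualenv():
    """Return True if the interpreter runs inside a virtual environment."""
    # Inside a venv/virtualenv, sys.prefix points at the environment,
    # while sys.base_prefix still points at the base installation.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


if __name__ == "__main__":
    print("Python version OK:", python_is_recent_enough())
    print("Inside virtual environment:", running_in_virtualenv())
```

Run it with the Python executable of the environment you intend to install into. Both lines should report True before you continue.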
While it is, in principle, possible to run
slangmod on the CPU, this is only intended for debugging purposes. To get any
results in finite time, you also need a decent graphics card, and you must have
a working installation of PyTorch to make good use of it. Because there is no
way of knowing which version of CUDA (or ROCm) you have installed on your
machine and how you installed it, PyTorch is not an explicit dependency of
slangmod. You will have to install it yourself, e.g., following these
instructions. If you are using pipenv for dependency management, you can also
have a look at the Pipfile in the root of the slangmod repository and tailor it
to your needs. Personally, I go
pipenv sync --categories=cpu
for a CPU-only installation of PyTorch (for debugging only) and
pipenv sync --categories=cuda
if I want GPU support.
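Since PyTorch has to be installed separately, a quick check of what the environment actually provides can save some confusion later. The sketch below is an illustration only and assumes nothing about slangmod itself: describe_torch_device is a hypothetical helper name, and the only PyTorch call it relies on is torch.cuda.is_available().

```python
import importlib.util


def describe_torch_device():
    """Return "cuda", "cpu", or "missing" for the current environment.

    Hypothetical helper for illustration; not provided by slangmod.
    """
    # Look for the package first so that a missing PyTorch does not raise.
    if importlib.util.find_spec("torch") is None:
        return "missing"
    import torch

    # True only if this PyTorch build has GPU support and a usable GPU is visible.
    return "cuda" if torch.cuda.is_available() else "cpu"


if __name__ == "__main__":
    print("PyTorch device:", describe_torch_device())
```

A result of "cpu" means that PyTorch is installed but cannot see a GPU, which is fine for debugging but not for training. A result of "missing" means that PyTorch still has to be installed.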
Finally, with the virtual environment you just created active, open a console and type
slangmod -h
to check that everything works.
docker
A Docker image with GPU-enabled PyTorch and all other dependencies is available on Docker Hub.
docker pull yedivanseven/slangmod
To use it, you must have a host machine that
has an NVIDIA GPU,
has the drivers for it installed, and
exposes it to containers via the NVIDIA Container Toolkit.
Change into a working directory, i.e., one where slangmod will read its
config file slangmod.toml from and where it will save outputs to, and mount
this directory to the path /workdir inside the container when you run it.
docker run --rm --gpus all -v ./:/workdir yedivanseven/slangmod
This will invoke slangmod -h. If all went well, the “device” entry under
the section “data” should read “cuda”.
In the event that you still want to clean your raw text with the help of
slangmod, you will also have to mount the folder with those dirty files
when you start a docker container.
docker run --rm --gpus all -v ./:/workdir -v /path/to/raw/docs:/raw yedivanseven/slangmod clean ...
For all other command-line options and to find out about this config TOML file, read on …