chat
Once you have successfully trained a model, it is time to talk to it. To do so, invoke:
slangmod chat --name my-first-experiment
All chat options can be set with --chat.<KEY> <VALUE> on the command line
and/or in the [chat] section of your config TOML file:
work_dir = "/absolute/path/to/your/working/directory"
log_level = 10
progress = true
[files]
raw = "/absolute/path/to/data/files"
suffix = "pqt"
column = "document"
min_doc_len = 32
cleaners = ["quotes", "encoding"]
encoding = "cp1252"
tokenizer = "tokenizer.json"
checkpoint = "check.pt"
weights = "state.pt"
model = "final.pt"
[tokens]
algo = "bpe"
vocab = 30000
eos_string = "\n\n"
eos_regex = "\n{2,}"
[data]
...
[model]
...
[model.feedforward]
...
[train]
...
[chat]
...
Keep in mind, though, that you have only pre-trained a model. It is not yet fine-tuned to make for an engaging conversation partner or to excel at any specific task. It simply continues writing the text from what you prompted it with. How exactly it does so can be influenced by the following parameters.
- generator = “greedy”
Which algorithm to use to generate text. The default, “greedy”, is the most obvious one: it simply picks the most probable next token, one at a time. Other options are:
“top_k” Randomly draw the next token from the top-k most likely ones. The probability with which any of these top-k tokens will be picked is proportional to the probability produced by the model.
“top_p” Randomly draw the next token from as many of the most likely ones as are needed to reach a total probability of p (0.8 by default).
“beam” Perform a beam search for the most probable sequence of tokens.
- max_tokens = 256
How many tokens the model should at most predict if no end-of-sequence is encountered first.
- temperature = 1.0
Temperature of the (categorical) probability distribution the next token is sampled from with the top_k and top_p methods. A higher value spreads out the probability among candidate tokens, making model responses more “creative”, while lower values will make responses tend towards greedy.
- k = 0.1
The fraction or number of most likely tokens to draw from for the top_k method. Will be interpreted as a fraction if strictly smaller than 1.0 and as a count if larger.
- p = 0.8
The sum of the individual probabilities of the most likely next tokens to draw from with the top_p method.
- width = 8
The width of the beam search.
- boost = 1.0
The higher this value, the longer answers produced by a beam search will become. Conversely, the lower this value, the shorter the model response will be. Must be larger than 0.0.
- style = “space”
Depending on which data you trained your model on, you might still be able to affect the style of the conversation.
“space” The model is asked to directly continue whatever text you prompt it with, with no separator whatsoever.
“paragraph” An end-of-sequence token is appended to the user input to incentivize the model to answer with an entire paragraph itself.
“quotes” The user prompt is wrapped into double quotes and terminated by a comma to incentivize the model to comment on what the user said.
“dialogue” User input is wrapped in double quotes and an end-of-sequence token is appended, thus incentivizing the model to respond with a quoted paragraph itself, in the style one might find in a book.
- user = “USR”
The prefix of the user prompt on the console.
- bot = “BOT”
The prefix of the model responses on the console.
- stop = “Stop!”
The text the user should enter to stop the chat client.
- system = “”
The system prompt. This can either be a string or the path to a file with the text of the system prompt in it.
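To make the interplay of temperature, k, and p more tangible, here is a minimal sketch of the sampling strategies described above, written in plain Python. It is not slangmod's actual implementation. All function names are hypothetical, and details such as applying the temperature before selecting candidates, rounding a fractional k down (with a minimum of one token), and treating k = 1.0 as a count are assumptions made for illustration.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; a higher temperature flattens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def greedy(probs):
    """Index of the single most probable token."""
    return max(range(len(probs)), key=lambda i: probs[i])


def top_k_candidates(probs, k):
    """Indices of the k most likely tokens; k < 1.0 is read as a fraction."""
    # Assumption: fractions are rounded down, but at least one token is kept.
    count = max(1, int(k * len(probs))) if k < 1.0 else int(k)
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return ranked[:count]


def top_p_candidates(probs, p):
    """Fewest most likely tokens whose total probability reaches p."""
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    chosen, total = [], 0.0
    for i in ranked:
        chosen.append(i)
        total += probs[i]
        if total >= p:
            break
    return chosen


def draw(probs, candidates, rng=random):
    """Pick one candidate with probability proportional to the model's."""
    weights = [probs[i] for i in candidates]
    return rng.choices(candidates, weights=weights, k=1)[0]
```

With these helpers, one top_p step would look like `probs = softmax(logits, temperature)` followed by `draw(probs, top_p_candidates(probs, 0.8))`. As the temperature approaches zero, almost all probability mass lands on the most likely token, which is why low temperatures behave like greedy.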
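The four style options described above can be pictured as simple string transformations of the user input. The following is only an illustrative sketch: the function name format_prompt is hypothetical, the end-of-sequence marker is assumed to be the eos_string "\n\n" from the example [tokens] section, and the exact formatting slangmod applies internally may differ.

```python
EOS = "\n\n"  # assumption: the eos_string from the example [tokens] section


def format_prompt(text, style="space", eos=EOS):
    """Wrap user input according to the chat style (illustrative only)."""
    if style == "space":
        # Hand the text to the model unchanged so it simply continues it.
        return text
    if style == "paragraph":
        # Close the paragraph so the model answers with one of its own.
        return text + eos
    if style == "quotes":
        # Quoted input plus a comma invites a comment on what was said.
        return f'"{text}",'
    if style == "dialogue":
        # Quoted input plus end-of-sequence invites a quoted reply.
        return f'"{text}"' + eos
    raise ValueError(f"unknown style: {style!r}")
```

Printing `repr(format_prompt("Hello there", style))` for each style is a quick way to see which context the model would be asked to continue from.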