.. _commands:

Bash commands reference
=======================

List of wordseg commands
------------------------

Once wordseg is installed on your platform, the commands are available
from the terminal like any other command-line tool. The commands for
which we provide documentation are:

* **wordseg-prep** takes as input a text in phonological-like form
  with tags, and prepares it for segmentation. It checks the format,
  removes all tags but the word boundaries to generate a gold version,
  and removes all tags but the minimal unit boundaries to generate
  what we will call prepared.txt, which is the input for segmentation.

* **wordseg-<algorithm>** always takes as input a prepared.txt file
  and outputs the same text with word boundaries added. Please note
  that some algorithms require more input than just that. For details,
  see the :ref:`overview` page. The calls for the algorithms are:

  - **wordseg-baseline** for the random baseline,
  - **wordseg-dibs** for the diphone based segmentation,
  - **wordseg-tp** for the transitional probabilities,
  - **wordseg-puddle** for PUDDLE,
  - **wordseg-ag** for the adaptor grammar,
  - **wordseg-dpseg** for DPSeg or DMCMC (please note that this
    algorithm is not functional at present).

* **wordseg-eval** takes as input a segmented text and a gold version,
  and computes the precision, recall and f-score at type, token and
  boundary levels. See more on the :ref:`overview` page.

* **wordseg-stats** takes as input a segmented or gold text and
  computes basic statistics.

.. note::

   * To get all the details of a command's arguments, run
     ``wordseg-<command> --help``.

   * More information on the algorithms and functions is in
     :ref:`overview`.

   * For an example of use with all algorithms, see the
     :ref:`tutorial`.

Commands input/output
---------------------

All the commands **read from standard input** and **write to standard
output** by default, which allows easy communication with other tools.
You can also specify input and output files as arguments.
For example, the command::

    cat my_input.txt | wordseg-prep > my_output.txt

is equivalent to::

    wordseg-prep my_input.txt -o my_output.txt

Logging messages
----------------

The commands write log messages to standard error. Each message is
either an **error**, a **warning**, an **info** or a **debug**
message.

You can choose the level of logging to display with the following
arguments (available for all commands):

* ``-v | --verbose`` displays errors, warnings and infos,
* ``-vv | --verbose --verbose`` displays all the messages,
* ``-q | --quiet`` does not display any message.

When running several commands in scripts or in parallel, the log
messages of the different commands get interleaved on standard error
and become hard to read. It is possible to redirect stderr to a file
using::

    wordseg-<command> 2> ./my_log.txt
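
The separation between the three streams can be tried out without
wordseg installed. The sketch below is only an illustration: ``demo_cmd``
is a hypothetical stand-in function, not a wordseg command, and the file
names are arbitrary. It copies its standard input to standard output and
writes one log line to standard error, which is the behavior described
above.

```shell
# Hypothetical stand-in for a wordseg command (NOT part of wordseg):
# copies stdin to stdout and writes one log message to stderr.
demo_cmd() {
    echo "info: processing input" >&2
    cat
}

# Work in a temporary directory so that no existing file is overwritten.
workdir=$(mktemp -d)
printf 'hello world\n' > "$workdir/my_input.txt"

# stdout goes to the output file, stderr goes to a separate log file.
demo_cmd < "$workdir/my_input.txt" > "$workdir/my_output.txt" 2> "$workdir/my_log.txt"

# '2>>' appends to the log instead of truncating it, which is useful
# when several commands of a script share a single log file.
demo_cmd < "$workdir/my_input.txt" > /dev/null 2>> "$workdir/my_log.txt"
```

After running it, ``my_output.txt`` holds only the data and
``my_log.txt`` holds only the two log messages. When commands run in
parallel, giving each one its own log file avoids mixing their
messages.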