Standalone Scripts

nlpnet includes standalone scripts that can be called from the command line. They are copied to the scripts subdirectory of your Python installation, which can be added to the system PATH variable. There are four such scripts:

nlpnet-train: Script to train a new model or further train an existing one. See Training for detailed information on how to use it.
nlpnet-load-embeddings: Script to load word embeddings trained externally. It accepts different formats. See Importing Word Representations for detailed information on how to use it.
nlpnet-test: Script to measure the performance of a model against a gold data set.
nlpnet-tag: Script to call a model and tag some given text.

Of these, nlpnet-tag and nlpnet-test are explained below.

nlpnet-tag

This is the simplest nlpnet script. It runs the system on a given text input and should be called with the following syntax:

$ nlpnet-tag.py TASK

Where TASK is either pos or srl. The script also accepts the following command line options:

`-v`	Verbose mode.
`-t`	Disables built-in tokenizer. Tokens are assumed to be separated by whitespace and one sentence per line.
`--lang`	Sets the tokenizer language (ignored if `-t` is used). Currently, it only accepts `pt` and `en`.
`--no-repeat`	Forces the classification step to avoid repeated argument labels (SRL only).
`--data`	The directory with the trained models (defaults to the current one).

For example:

$ nlpnet-tag.py pos --data /path/to/nlpnet-data/ --lang pt
O rato roeu a roupa do rei de Roma.
O_ART rato_N roeu_V a_ART roupa_N do_PREP+ART rei_N de_PREP Roma_NPROP ._PU
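
The tagged line above can be turned into (token, tag) pairs by other programs. The following is a minimal sketch, assuming the output always has the `token_TAG` form shown in the example and that tags themselves contain no underscore. The function name `parse_tagged_line` is hypothetical and is not part of nlpnet.

```python
def parse_tagged_line(line):
    """Split whitespace-separated 'token_TAG' items into (token, tag) pairs."""
    pairs = []
    for item in line.split():
        # Split on the LAST underscore so that tokens containing '_' stay intact
        # (assumption: tags never contain an underscore).
        token, _, tag = item.rpartition("_")
        pairs.append((token, tag))
    return pairs


tagged = "O_ART rato_N roeu_V a_ART roupa_N do_PREP+ART rei_N de_PREP Roma_NPROP ._PU"
pairs = parse_tagged_line(tagged)
for token, tag in pairs:
    print(token, tag)
```

Splitting from the right keeps contracted tags such as `PREP+ART` whole, since only the underscore acts as a separator.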

Or with semantic role labeling:

$ nlpnet-tag.py srl --data /path/to/nlpnet-data/ --lang pt
O rato roeu a roupa do rei de Roma.
O rato roeu a roupa do rei de Roma .
roeu
    A1: a roupa do rei de Roma
    A0: O rato
    V: roeu

The first line was typed by the user, and the second one is the result of tokenization.
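
The SRL output can be read back into a simple structure. This is only a sketch under the assumption that the layout is exactly as in the example above: each predicate on an unindented line, followed by indented `LABEL: text` lines for its arguments. The tokenized sentence line is assumed to have been removed beforehand, and `parse_srl_block` is a hypothetical helper, not an nlpnet function.

```python
def parse_srl_block(text):
    """Parse predicate/argument lines into a list of (predicate, [(label, span), ...])."""
    frames = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        if line[0].isspace():
            # Indented line: an argument of the most recent predicate.
            if not frames:
                raise ValueError("argument line found before any predicate")
            label, _, span = line.strip().partition(": ")
            # A list is used instead of a dict because labels may repeat
            # when --no-repeat is not given.
            frames[-1][1].append((label, span))
        else:
            # Unindented line: a new predicate.
            frames.append((line.strip(), []))
    return frames


output = """roeu
    A1: a roupa do rei de Roma
    A0: O rato
    V: roeu
"""
frames = parse_srl_block(output)
for predicate, arguments in frames:
    print(predicate, arguments)
```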

nlpnet-test

This script evaluates the system performance against a gold standard.

General options

The arguments below are valid for both tasks.

`--task TASK`	Task for which the network should be used. Either `pos` or `srl`.
`-v`	Verbose mode.
`--gold FILE`	File with gold standard data.
`--data DIRECTORY`	Directory with trained models.

POS

`--oov FILE`	Analyze performance on the words listed in the given file.

The --oov option requires a UTF-8 file containing one word per line. Despite its name, the option is not restricted to OOV (out-of-vocabulary) words: it accepts any list of words you want to evaluate.
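
A word list in that format (UTF-8, one word per line) can be produced with a few lines of Python. The sketch below is illustrative only: the helper `write_word_list`, the file name `words.txt`, and the sample words are all hypothetical, and a temporary directory is used just to keep the example self-contained.

```python
import os
import tempfile


def write_word_list(words, path):
    """Write one word per line to a UTF-8 encoded file, skipping empty entries."""
    with open(path, "w", encoding="utf-8") as f:
        for word in words:
            word = word.strip()
            if word:
                f.write(word + "\n")


# Sample words taken from the tagging example; any word list would do.
words = ["roeu", "rei", "Roma"]
path = os.path.join(tempfile.mkdtemp(), "words.txt")
write_word_list(words, path)
print(path)
```

The resulting path can then be passed as the FILE argument of --oov.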

SRL

SRL evaluation is performed in different ways, depending on whether it is aimed at argument identification, classification, predicate detection or all of them. In the future, there may be a more standardized version for this test.

`--id`	Evaluate only argument identification (SRL only). The script will output the score.
`--class`	Evaluate only argument classification (SRL only). The script will output the score.
`--preds`	Evaluate only predicate identification (SRL only). The script will output the score.
`--2steps`	Execute SRL with two separate steps. The script will output the results in CoNLL format.
`--no-repeat`	Forces the classification step to avoid repeated argument labels (two-step SRL only).
`--auto-pred`	Determines SRL predicates automatically. Only used when evaluating the full process (identification + classification).

The CoNLL output can be evaluated against a gold file using the official SRL eval script (see http://www.lsi.upc.edu/~srlconll/soft.html).
