ASSIN (Avaliação de Similaridade Semântica e INferência textual) is a dataset with semantic similarity scores and entailment annotations. It was used in a shared task at the PROPOR 2016 conference.
The full dataset has 10,000 sentence pairs, half in Brazilian Portuguese and half in European Portuguese, and can be downloaded here. Each language variant has 2,500 pairs for training, 500 for validation and 2,000 for testing. This differs from the split used in the shared task, in which the training set of each variant had 3,000 pairs and there was no validation set. The shared task training set can be reconstructed by merging the training and validation sets of a variant (2,500 + 500 = 3,000 pairs).
You can also see the list of annotators who took part in the creation of the dataset.
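The reconstruction of the shared task training set mentioned above can be sketched in a few lines of Python. This is only an illustration under assumptions: the root element name `entailment-corpus`, the `pair` element with its attributes, and the sample sentences are placeholders chosen for the example, and the helper `merge_splits` is hypothetical, not part of the official tools. Check the downloaded files for the actual layout before adapting it.

```python
import xml.etree.ElementTree as ET

# Tiny in-memory stand-ins for the training and validation files.
# ASSUMPTION: element and attribute names are illustrative only.
TRAIN_XML = """<entailment-corpus>
  <pair id="1" entailment="None" similarity="2.0"><t>a</t><h>b</h></pair>
  <pair id="2" entailment="Entailment" similarity="4.5"><t>c</t><h>d</h></pair>
</entailment-corpus>"""

VALIDATION_XML = """<entailment-corpus>
  <pair id="3" entailment="None" similarity="1.5"><t>e</t><h>f</h></pair>
</entailment-corpus>"""


def merge_splits(train_xml, validation_xml):
    """Return a new root element holding the pairs of both splits,
    training pairs first, validation pairs after (hypothetical helper)."""
    train_root = ET.fromstring(train_xml)
    validation_root = ET.fromstring(validation_xml)
    merged = ET.Element(train_root.tag, train_root.attrib)
    for root in (train_root, validation_root):
        for pair in root.findall("pair"):
            merged.append(pair)
    return merged


merged = merge_splits(TRAIN_XML, VALIDATION_XML)
merged_ids = [pair.get("id") for pair in merged.findall("pair")]
```

With real files, `ET.parse(path).getroot()` would replace `ET.fromstring`, and `ET.ElementTree(merged).write(path, encoding="utf-8")` would save the result.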
Evaluation Script and Baselines
You can find the official ASSIN evaluation script and baseline implementations in the GitHub repository. They are written in Python and require NumPy, SciPy and sklearn. One of the baselines also requires NLTK.
The evaluation script reports accuracy and macro F1 (the mean of the F1 scores of all classes) for textual entailment recognition, and Pearson correlation and mean squared error for semantic similarity. It can be run as follows:
python assin-eval.py gold-file.xml system-file.xml
Or see its usage instructions with:
python assin-eval.py -h
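For readers who want to see what the four reported metrics mean, here is a minimal sketch of their textbook definitions in plain Python. It is not the official script, which may differ in details such as how it handles classes that never occur; the function names and the sample labels and scores below are made up for illustration.

```python
import math


def accuracy(gold, pred):
    """Fraction of pairs whose predicted label equals the gold label."""
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def macro_f1(gold, pred):
    """Unweighted mean of the per-class F1 scores.
    ASSUMPTION: classes are taken from the union of gold and predicted labels."""
    labels = sorted(set(gold) | set(pred))
    scores = []
    for label in labels:
        tp = sum(g == label and p == label for g, p in zip(gold, pred))
        fp = sum(g != label and p == label for g, p in zip(gold, pred))
        fn = sum(g == label and p != label for g, p in zip(gold, pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def pearson(x, y):
    """Pearson correlation coefficient between two score lists."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def mean_squared_error(x, y):
    """Mean of the squared differences between gold and predicted scores."""
    return sum((a - b) ** 2 for a, b in zip(x, y)) / len(x)


# Illustrative data only; not taken from the dataset.
gold_labels = ["None", "Entailment", "Paraphrase", "None"]
pred_labels = ["None", "Entailment", "None", "None"]
gold_scores = [1.0, 2.0, 3.0, 4.0]
pred_scores = [1.5, 2.5, 3.5, 4.5]

print("accuracy:", accuracy(gold_labels, pred_labels))
print("macro F1:", macro_f1(gold_labels, pred_labels))
print("Pearson:", pearson(gold_scores, pred_scores))
print("MSE:", mean_squared_error(gold_scores, pred_scores))
```

The example data shows why both similarity metrics are reported: a constant offset in the predictions leaves the Pearson correlation untouched but still produces a nonzero mean squared error.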
Published Results on ASSIN
This is a list of the published results on ASSIN that we are aware of, apart from those of the shared task participants. Task indicates whether the paper addresses Textual Entailment (TE), Semantic Similarity (SS) or both.