cort: coreference resolution toolkit
This Python library consists of two parts. The coreference resolution component implements a framework for coreference resolution based on latent variables, which allows you to rapidly devise approaches to coreference resolution (described in our TACL and ACL'15 demo papers). The error analysis component provides extensive functionality for analyzing and visualizing errors made by coreference resolution systems (described in our EMNLP'14 and NAACL'15 demo papers). cort also provides an implementation of the deterministic multigraph coreference resolution system (described in my ACL-SRW'13 paper).
Furthermore, branches in the GitHub repository linked above contain implementations of an extended version of cort, described in my PhD thesis, and of the k-best coreference resolution system described in our EMNLP'17 paper.
art: approximate randomization testing
This package performs approximate randomization tests for corpus-wide differences in F1 score or accuracy. It is easy to extend to other metrics. It also ships with a script that converts the output of the CoNLL scorer for coreference resolution into the format the package expects.
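To illustrate the general idea behind approximate randomization testing, here is a minimal sketch for accuracy. It is not the code or the interface of art: the names `accuracy` and `approximate_randomization_test`, the input representation (one 0/1 correctness value per test item and system), and the default number of trials are all assumptions made for this example.

```python
import random


def accuracy(outcomes):
    """Fraction of correct items; `outcomes` is a list of 0/1 values."""
    return sum(outcomes) / len(outcomes)


def approximate_randomization_test(outcomes_a, outcomes_b, metric=accuracy,
                                   trials=10000, seed=0):
    """Estimate a two-sided p-value for the difference in `metric`.

    Hypothetical helper for illustration only, not part of art.
    `outcomes_a` and `outcomes_b` hold the per-item results of two
    systems on the same test items, in the same order.
    """
    if len(outcomes_a) != len(outcomes_b):
        raise ValueError("both systems must be scored on the same items")

    rng = random.Random(seed)
    observed = abs(metric(outcomes_a) - metric(outcomes_b))

    at_least_as_extreme = 0
    for _ in range(trials):
        shuffled_a, shuffled_b = [], []
        # Under the null hypothesis the system labels are exchangeable,
        # so swap each paired result with probability 0.5.
        for a, b in zip(outcomes_a, outcomes_b):
            if rng.random() < 0.5:
                a, b = b, a
            shuffled_a.append(a)
            shuffled_b.append(b)
        if abs(metric(shuffled_a) - metric(shuffled_b)) >= observed:
            at_least_as_extreme += 1

    # Add-one smoothing keeps the estimate strictly above zero.
    return (at_least_as_extreme + 1) / (trials + 1)


if __name__ == "__main__":
    system_a = [1] * 50
    system_b = [0] * 50
    print(approximate_randomization_test(system_a, system_b, trials=1000))
```

A small p-value suggests that the observed difference is unlikely to arise from randomly reassigning outputs between the two systems. For a corpus-wide F1 score, the same scheme would shuffle per-document counts rather than 0/1 values and pass a metric that aggregates those counts before computing F1.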