In order to train the Catalan part-of-speech tagger, you need to place
some corpora here:
* For supervised training: 
  - A file called 'ca.tagged' should contain the hand-tagged corpus.
  - A file called 'ca.tagged.txt' should contain the same corpus as
    'ca.tagged', but in RAW (plain, untagged text) format.

* For unsupervised training:
  - A file called 'ca.crp.txt' should contain a large corpus in RAW format.
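
Before starting a training run, it can help to check which of the files
listed above are actually present. The following is a minimal sketch in
Python, not part of the Apertium tooling: the helper names
(missing_files, available_training_modes) are hypothetical, and treating
an empty file as missing is an assumption made here for illustration.

```python
from pathlib import Path

# File names taken from the list above.
SUPERVISED_FILES = ("ca.tagged", "ca.tagged.txt")
UNSUPERVISED_FILES = ("ca.crp.txt",)


def missing_files(directory, names):
    """Return the names that are absent or empty in `directory`.

    Counting empty files as missing is an assumption of this sketch.
    """
    base = Path(directory)
    return [
        name
        for name in names
        if not (base / name).is_file() or (base / name).stat().st_size == 0
    ]


def available_training_modes(directory="."):
    """Return the training modes whose corpora are all in place."""
    modes = []
    if not missing_files(directory, SUPERVISED_FILES):
        modes.append("supervised")
    if not missing_files(directory, UNSUPERVISED_FILES):
        modes.append("unsupervised")
    return modes


if __name__ == "__main__":
    # Check the current directory (assumed to be this data directory).
    for mode in ("supervised", "unsupervised"):
        names = SUPERVISED_FILES if mode == "supervised" else UNSUPERVISED_FILES
        absent = missing_files(".", names)
        status = "ready" if not absent else "missing " + ", ".join(absent)
        print(f"{mode}: {status}")
```

Run from this directory, the script prints one line per training mode,
either "ready" or the list of files still to be added.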