Use our language resources
Data
All the data you need to give you the edge for your translation work within the South African language sphere
Corpora
Data available for all official South African languages
PARALLEL CORPORA
MONOLINGUAL CORPORA
Machine Translation Evaluation Data Sets
EVALUATION DATA SETS
TRANSLATION MEMORIES & GLOSSARIES
A wide variety of translation memories and glossaries are available for download and can be used free of charge. These resources are in standard translation tool format and can be used with the Autshumato ITE or any other TMX-enabled software package.
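Because the resources are in TMX, which is an XML-based format, they can also be inspected outside a translation tool. The following is a minimal sketch in Python, not part of the Autshumato tooling: the function name `read_tmx_units` is hypothetical, and it assumes a typical TMX layout in which each `<tu>` element holds one `<tuv>` per language with the text in a `<seg>` child.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under the XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def read_tmx_units(source):
    """Yield one {language: segment text} dict per <tu> element.

    `source` is a file path or a binary file object (hypothetical helper).
    """
    tree = ET.parse(source)
    for tu in tree.getroot().iter("tu"):
        unit = {}
        for tuv in tu.findall("tuv"):
            # Older TMX versions use a plain "lang" attribute instead of xml:lang.
            lang = tuv.get(XML_LANG) or tuv.get("lang")
            seg = tuv.find("seg")
            if lang is not None and seg is not None:
                # itertext() also collects text nested inside inline markup.
                unit[lang] = "".join(seg.itertext())
        yield unit
```

Each yielded dictionary maps a language code to the segment text for that translation unit, so a downloaded memory can be filtered or counted before it is loaded into a translation tool.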
The TMG is a crowd-sourced platform through which translation resources can be supplied and obtained. By sharing collective translation resources, everyone can benefit. Sharing translation resources between various affiliations (translation units) and freelance translators can ensure better consistency and increased productivity throughout translation projects, which in turn can give everyone more access to information in their native language.
Users can rate and comment on resources to give others an indication of the quality of a specific resource. The system also remembers the resources that you have uploaded and downloaded, and can serve as a cloud storage facility for your translation resources. Should you lose all your translation resources for any reason, you will be able to recover them easily.
The TMG also makes provision for managers to manage their personnel and their resources on the system.
The data is provided as a single UTF-8 text file, with each segment on a new line. The dataset contains existing data sourced for the DAC-funded Autshumato project as well as new data sourced for the SADiLaR: Parallel corpora for English into isiXhosa project.
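A file in this layout can be loaded with a few lines of Python. This is a minimal sketch rather than an official loader: the function name `read_segments` is hypothetical, and it assumes only what is stated above, namely UTF-8 encoding and one segment per line.

```python
def read_segments(path):
    """Read a UTF-8 text file that holds one segment per line."""
    with open(path, encoding="utf-8") as handle:
        # Strip only the line terminator, so an empty line stays in the
        # list as an empty segment and line positions are preserved.
        return [line.rstrip("\r\n") for line in handle]
```

Empty lines are kept on purpose in this sketch, so that the position of each segment in the list still matches its line number in the file.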