bootcat:release_notes:1.21

Differences

This shows you the differences between two versions of the page.

bootcat:release_notes:1.21 [2019/07/01 10:04] eros
bootcat:release_notes:1.21 [2019/07/10 11:53] eros
Line 1:
 ====== Version 1.21 ======
  
-  * **NEW (feature)**: pseudo-XML versions of the extracted plain text files are now created in the ''xml_corpus'' folder;
+  * **NEW (feature)**: pseudo-XML versions of the extracted plain text files are now created in the ''xml_corpus'' folder; a single ''corpus.xml'' file is also created, containing the merged version of the pseudo-XML corpus; the XML version of the corpus contains more metadata than the plain text version: ''id'' (the URL of the original file), ''content_type'' of the original file, ''filename'' of the downloaded file; it's also possible to add custom XML attributes to the corpus (see next bullet point);
  
-  * **NEW (feature)**: two new files are created, ''corpus.txt'' and ''corpus.xml'' containing the merged versions of the plain text and pseudo-XML corpus;
+  * **NEW (feature)**: in the "Project Definition" step, you can now add up to three user-defined XML attributes to the XML version of the corpus;
+
+  * **NEW (feature)**: a random string is now appended to the names of downloaded files, individual corpus text files and XML corpus files; this makes it possible to easily merge different corpora in the same folder; file names still start with a progressive number;
  
   * **BUGFIX**: fixed a bug that prevented the download timeout from working properly, causing BootCaT to wait forever for certain URLs to download
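
The file naming change described above (a progressive number followed by a random string) can be illustrated with a minimal sketch. This is not BootCaT's actual code: the function name ''corpus_file_name'', the zero-padded width, the underscore separator and the suffix length are all assumptions made for illustration only.

```python
import secrets
import string

# Hypothetical alphabet for the random part; BootCaT's real choice is not documented here.
ALPHABET = string.ascii_lowercase + string.digits


def corpus_file_name(index: int, extension: str = "txt", suffix_len: int = 8) -> str:
    """Build a name that starts with a progressive number and ends with a random string.

    The pattern '<number>_<random>.<ext>' is an assumption, not BootCaT's documented format.
    """
    suffix = "".join(secrets.choice(ALPHABET) for _ in range(suffix_len))
    return f"{index:04d}_{suffix}.{extension}"


# Two independent corpora both contain a file number 1, 2 and 3; the random
# part makes a name clash very unlikely when both are copied into one folder.
corpus_a = [corpus_file_name(i) for i in range(1, 4)]
corpus_b = [corpus_file_name(i) for i in range(1, 4)]
```

Because the progressive number stays at the front of the name, sorting the merged folder by file name still groups files by their original position in each corpus.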
  • bootcat/release_notes/1.21.txt
  • Last modified: 2019/10/29 14:47
  • by eros