bootcat:release_notes:0.72 [2013/06/17 17:56] – created eros | bootcat:release_notes:0.72 [2014/09/09 15:24] (current) – removed eros |
---|
====== Version 0.72 ====== | |
| |
* **NEW (requirement)**: the BootCaT frontend now requires Java 7 (a.k.a. Java 1.7); | |
| |
* **NEW (feature)**: full UTF-8 support, which means you'll be able to build corpora in languages that don't use the Latin alphabet (e.g. Russian, Japanese, Arabic, Chinese); | |
| |
* **NEW (feature)**: you can now edit tuples directly, so you can tweak your corpus any way you like (see [[bootcat:help:corpus_creation_mode|this page]] for more info); | |
| |
* **NEW (feature)**: you can now skip directly to the corpus building process, which means you can just feed a list of URLs to BootCaT and it will download and clean them for you, without having to go through the process of selecting seeds, building tuples, etc. (see [[bootcat:help:corpus_creation_mode|this page]] for more info); | |
| |
* **NEW (feature)**: language filtering: pages in the wrong language are discarded automatically; | |
| |
* **NEW (feature)**: document size filtering: after downloading and cleaning a web page, BootCaT can count the words in the document and discard it if the word count is too low or too high; | |
| |
* **NEW (feature)**: added "Copy" and "Paste" buttons to the "seed selection" and "search engine key" panels; cut/copy/paste has always been available through keyboard shortcuts, but many users are not aware of them; | |
| |
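
The document size filter described above can be pictured with a minimal sketch. This is not BootCaT's actual code: the function names, the thresholds and the whitespace-based word count are all assumptions made for illustration, and the release notes do not state the real defaults or how words are counted.

```python
# Hypothetical thresholds; BootCaT's real defaults are not given in these notes.
MIN_WORDS = 5
MAX_WORDS = 1000


def count_words(text):
    """Count whitespace-separated tokens (an assumed, simplified definition of 'word')."""
    return len(text.split())


def keep_document(text, min_words=MIN_WORDS, max_words=MAX_WORDS):
    """Return True if the cleaned document's word count lies within the allowed range."""
    return min_words <= count_words(text) <= max_words


if __name__ == "__main__":
    cleaned_pages = ["too short", "this cleaned page has exactly seven words"]
    kept = [page for page in cleaned_pages if keep_document(page)]
    print(len(kept), "of", len(cleaned_pages), "documents kept")
```

Note that splitting on whitespace is only a rough approximation: it would undercount words in languages written without spaces between words, such as Chinese or Japanese, so a real implementation would likely need a different counting strategy for those.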