Version 0.71

  • NEW (requirement): the BootCaT frontend now needs Java 7 (a.k.a. Java 1.7) to work (visit this page to see which version of Java you have);
  • NEW (feature): full UTF-8 support, which means you'll be able to build corpora in languages that don't use the Latin alphabet (e.g. Russian, Japanese, Arabic, Chinese, etc.);
  • NEW (feature): you can now edit tuples directly, so you can tweak your corpus any way you like (see this page for more info);
  • NEW (feature): you can now skip directly to the corpus building process, which means you can just feed a list of URLs to BootCaT and it will download and clean them for you, without having to go through the process of selecting seeds, building tuples, etc. (see this page for more info);
  • NEW (feature): language filtering: pages in the wrong language are discarded automatically;
  • NEW (feature): document size filtering: after downloading and cleaning a web page, BootCaT can count the characters in the document and discard it if the character count is too low or too high;
  • NEW (feature): added “Copy” and “Paste” buttons to the “seed selection” and “search engine key” panels; cut/copy/paste has always been supported through keyboard shortcuts, but many users don't know about them.
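
The document size filter described above can be pictured with a minimal sketch. This is not BootCaT's own code: the function names and the two threshold values are hypothetical, chosen only to illustrate discarding a cleaned page when its character count is too low or too high.

```python
# Illustrative sketch only; names and thresholds are assumptions,
# not BootCaT's actual implementation or defaults.

def keep_document(text, min_chars=500, max_chars=500_000):
    """Return True if the cleaned text's character count is within bounds.

    len() counts Unicode characters rather than bytes, so non-Latin
    text is measured in characters, not in UTF-8 byte length.
    """
    return min_chars <= len(text) <= max_chars


def filter_documents(docs, min_chars=500, max_chars=500_000):
    """Keep only the (url, text) pairs whose text passes the size check."""
    return {
        url: text
        for url, text in docs.items()
        if keep_document(text, min_chars, max_chars)
    }


if __name__ == "__main__":
    # Hypothetical cleaned pages keyed by URL.
    pages = {
        "http://example.org/a": "x" * 1_000,    # within bounds
        "http://example.org/b": "too short",    # below the minimum
        "http://example.org/c": "y" * 600_000,  # above the maximum
    }
    print(sorted(filter_documents(pages)))
```

Both bounds are inclusive in this sketch; suitable values depend on the kind of pages being collected.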

Last modified: 2014/09/10 14:07 by eros