Using an external downloader

BootCaT sometimes crashes when it downloads a very long list of URLs. We're trying to fix the problem, but in the meantime here's a handy workaround.

Even if BootCaT crashes while downloading files, the folder created for the failed corpus-building attempt still contains a file called urls_list_final.txt. That file lists all the URLs you collected in the first stage of the corpus creation process.

The simplest option is to try again with the same list of URLs, using the Custom URLs corpus creation mode.

Another solution is to download the files with an external program and then have BootCaT clean them using the Local files corpus creation mode.
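To give a rough idea of what "downloading the files with an external program" involves, here is a minimal Python sketch. It is not part of BootCaT, and it makes several assumptions: that urls_list_final.txt contains one URL per line, that every page can be saved with an .html extension, and that a numbered file name such as page_00001.html is acceptable. The function names (read_urls, fetch, download_all) are made up for this illustration.

```python
import sys
import urllib.request
from pathlib import Path


def read_urls(list_path):
    """Return the non-empty lines of the URL list, stripped and de-duplicated.

    Assumption: the list holds one URL per line.
    """
    seen = set()
    urls = []
    for line in Path(list_path).read_text(encoding="utf-8").splitlines():
        url = line.strip()
        if url and url not in seen:
            seen.add(url)
            urls.append(url)
    return urls


def fetch(url, timeout=30):
    """Download one URL and return its raw bytes."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def download_all(urls, out_dir, fetch_fn=fetch):
    """Save each URL in out_dir as a numbered file; return the URLs that failed."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    failed = []
    for index, url in enumerate(urls, start=1):
        try:
            data = fetch_fn(url)
        except Exception as error:
            # One unreachable page should not stop the whole run.
            print(f"skipped {url}: {error}", file=sys.stderr)
            failed.append(url)
            continue
        # Assumption: an .html extension is fine for every downloaded page.
        (out / f"page_{index:05d}.html").write_bytes(data)
    return failed


# Example call (hypothetical output folder name):
# failed = download_all(read_urls("urls_list_final.txt"), "downloaded_pages")
```

The point of the sketch is that a failed URL is logged and skipped instead of aborting the run, so a single bad page cannot interrupt a long download. The output folder could then be used as the input for the Local files corpus creation mode.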

Here's a step-by-step guide to downloading files using the freeware external downloader WinWget and then turning them into a corpus with BootCaT.

Download and configure WinWget

Downloading URLs

Creating the corpus