Differences
This shows you the differences between two versions of the page.
bootcat:tutorials:basic_4 [2018/02/07 15:04] – eros | bootcat:tutorials:basic_4 [2018/02/07 15:26] – eros | ||
---|---|---|---|
Line 8: | Line 8: | ||
In this step you can choose to remove URLs you think might not be interesting. Just for fun, try unchecking the box next to a couple of URLs: notice how the number of " | In this step you can choose to remove URLs you think might not be interesting. Just for fun, try unchecking the box next to a couple of URLs: notice how the number of " | ||
- | {{bootcat: | + | {{ bootcat: |
:!: Notice how the number of " | :!: Notice how the number of " | ||
Line 25: | Line 25: | ||
The purpose of this stage is to get rid of elements that are part of the downloaded web pages but are very unlikely to be of interest to corpus users. Because this cleaning is automated, however, it is far from perfect, so be aware that some unwanted elements will still be present in the corpus. | The purpose of this stage is to get rid of elements that are part of the downloaded web pages but are very unlikely to be of interest to corpus users. Because this cleaning is automated, however, it is far from perfect, so be aware that some unwanted elements will still be present in the corpus. | ||
- | {{bootcat: | + | {{ bootcat: |
Click on "Build corpus" | Click on "Build corpus" | ||
Line 31: | Line 31: | ||
Go make a cup of tea while you wait. | Go make a cup of tea while you wait. | ||
- | {{bootcat: | + | {{ bootcat: |
Once the download is complete, click "Open corpus folder" | Once the download is complete, click "Open corpus folder" | ||
- | {{bootcat: | + | {{ bootcat: |
The contents of the folder where the corpus data is stored will be displayed. | The contents of the folder where the corpus data is stored will be displayed. | ||
- | {{bootcat: | + | {{ bootcat: |
====== ====== | ====== ====== | ||
---- | ---- | ||
[[bootcat: | [[bootcat: | ||
[[bootcat: | [[bootcat: |