
Select folder

Click this button to choose the folder that contains the files in your corpus.

File list

Select any file in the file list to preview its contents.

Encoding

The encoding is the way the characters in your text files are stored. If characters in the preview appear garbled, change the encoding setting until every file in your corpus previews correctly. The wrong encoding affects both the results you see and the accuracy of your searches.

[Screenshot: preview rendered incorrectly with UTF8 encoding]

[Screenshot: preview corrected by switching to MacRoman]

Supported encodings:

MacRoman      An encoding typically used on Macs.
UTF8          An encoding used on most computers. It supports many writing systems, including Arabic, Chinese, Japanese, and Russian.
ISO Latin 9   An encoding, also known as ISO 8859-15, typically used on Windows machines.
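
To see why the encoding setting matters, the following Python sketch decodes the same stored bytes in three different ways. It is an illustration only, not the application's own code: the `CODECS` mapping from the labels above to Python codec names and the `preview` helper are assumptions made for this example.

```python
# Assumed mapping from the encoding labels on this page to Python codec names.
CODECS = {
    "MacRoman": "mac_roman",
    "UTF8": "utf_8",
    "ISO Latin 9": "iso8859_15",
}


def preview(raw: bytes, encoding_label: str) -> str:
    """Decode raw file bytes with the chosen encoding (hypothetical helper).

    Bytes that are invalid in the chosen encoding are shown as the
    replacement character U+FFFD instead of raising an error.
    """
    return raw.decode(CODECS[encoding_label], errors="replace")


# A word saved in MacRoman: the accented letter is stored as one byte.
sample = "café".encode("mac_roman")

if __name__ == "__main__":
    for label in CODECS:
        # repr() makes invisible or substituted characters easy to spot.
        print(label, repr(preview(sample, label)))
```

Only the MacRoman setting recovers the original word here. The other two settings turn the accented letter into a replacement or control character, so a search for the word would miss it.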

Line ending

The line ending is an invisible character, or pair of characters, that separates one line from the next in your corpus. If the lines of your corpus do not break correctly in the preview, change the line ending setting until every file previews correctly.

Supported line endings:

unix      A line ending (LF) used on the most recent Macs and on Unix and Linux systems.
mac       A line ending (CR) typically used on older Macs.
windows   A line ending (CR followed by LF) typically used on Windows machines.
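
The effect of the line ending setting can be sketched in Python as follows. This is a minimal illustration, not the application's own code: the `LINE_ENDINGS` mapping from the labels above to characters and both helper functions are assumptions made for this example.

```python
# Assumed mapping from the line ending labels on this page to characters.
LINE_ENDINGS = {
    "unix": "\n",       # LF
    "mac": "\r",        # CR
    "windows": "\r\n",  # CR followed by LF
}


def split_lines(text: str, style: str) -> list[str]:
    """Split text into lines using the chosen line ending (hypothetical helper)."""
    return text.split(LINE_ENDINGS[style])


def guess_line_ending(text: str) -> str:
    """Guess the line ending style by checking for the longest marker first."""
    if "\r\n" in text:
        return "windows"
    if "\r" in text:
        return "mac"
    return "unix"


if __name__ == "__main__":
    text = "one\r\ntwo\r\nthree"
    print(guess_line_ending(text))
    print(split_lines(text, "windows"))
    print(split_lines(text, "unix"))
```

With the matching setting the sample text splits into three clean lines. With the unix setting each line keeps a stray CR character at its end, and with the mac setting each later line would begin with a stray LF, which is the kind of mismatch that shows up as a badly rendered preview.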