PiscesLogoSmallerStill  Preparing large data sets in a spreadsheet program

Top  Previous  Next

You do not need to create data sets within CAP; in fact the best way to enter large data sets is to use a spreadsheet such as Excel or Lotus 1-2-3, which will give access to a wide range of sorting and editing procedures to ease your task.


A data set should have the following format:

                         Sample 1        Sample 2        Sample 3                

Variable 1                    21                     1                     5                

Variable 2                    15                     5                     0                

Variable 3                     0                     7                     0                



The normal arrangement of community data within CAP is to have the samples (quadrats) as columns and the species as the rows. However, older versions of Excel have a maximum number of columns of 255, which can prove difficult if you have a data set with a very large number of sites/samples. If this is the case, the data can be arranged in Excel with the variables forming the columns, as the Transpose option within CAP can be used to switch columns and rows. Numbers can be either integer or real; some methods may require integers, in most such cases the program will run with real data, which will be automatically rounded.


The above table and the image below show you how the data will look in Excel. The samples are arranged in columns. Each sample has a title field. Start the first sample in column 2. The data consists of the number of individuals observed in the sample. Put in zeros rather than leaving cells blank. The species names (variable names) are input from row 2 in column 1.




When using Excel use the Save As function to save your data as a *.csv file. This will result in a data file in the format used by CAP. Alternatively, you can save as an Excel file (.xls, not .xlsx) which can be imported into CAP. Ensure that the work sheet you are saving only holds the tabulated data for analysis. If your data set has been created using the convention that a blank cell means zero then use the Find and Replace function available in all common spreadsheets to search for blank cells and replace them with 0 (zero).


Occasionally, errors occur because a blank space or a character has been accidentally entered into a cell outside the data matrix. To prevent this happening, it is good practice, before saving your data set as a .csv file, to highlight the first 10 or so blank rows and columns below and to the right of the data matrix, and press 'Delete'. This will clear the cells of any accidentally-entered contents.


The csv file can also be opened and edited in a text editor (e.g. Notepad) or word processor (Microsoft Word), in which case it will appear like this:


Variable 1,21,1,5

Variable 2,15,5,0

Variable 3,0,7,0

Variable 4,1,9,0

Variable 5,0,0,8

Note the leading comma on the first row which will make the first cell blank.


see also: Quick guide to running a dataset; Data entry from within CAP