To request information or make comments, contact:

D. Sankoff
Centre de recherches mathématiques
Université de Montréal
C.P. 6128, succursale centre-ville
Montréal (Québec) H3C 3J7
Telephone: (514) 343-7574
E-mail: sankoff@CRM.UMontreal.CA
This manual describes only how to use GoldVarb. The underlying mathematics as well as the linguistic interpretation and pertinence of variable rule analysis are discussed elsewhere. GoldVarb is a stand-alone application, requiring no software other than the operating system. The text-editing capabilities of GoldVarb were adapted from the text editor Utile+, which interested users can obtain from its author.
We assume that the reader has some experience with at least one other Macintosh application. The glossary in §9 contains a few tips for beginners. It is important to use a recent version of the system software: that is, the System, Finder and MultiFinder files, and other associated files.
GoldVarb may be run on any member of the Macintosh family of computers. Nevertheless, it is not recommended for use on models older than the Macintosh Plus.
This type of file will be discussed in more detail in §4. The data in the figure bear on the phenomenon of plural morpheme expression in Nepean noun phrases. Each line starting with an open parenthesis contains the information on one adjective, noun or determiner token from a corpus of informal interaction among Nepean adolescent gang members. The "1"s and "0"s in the first column indicate the presence or absence of the plural marker -enmas. The "a", "n" and "d" in the second column refer to the grammatical category (adjective, noun or determiner) of the word eligible to be marked. The "c" and "s" in the third column indicate whether the token comes from an object or subject noun phrase, respectively.
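The coding scheme just described can be pictured with a minimal sketch in Python. It is only an illustration of the three-column layout, not part of GoldVarb; the names Token and parse_token and the sample lines are hypothetical.

```python
from typing import NamedTuple, Optional


class Token(NamedTuple):
    marker: str    # "1" = plural marker present, "0" = absent
    category: str  # "a" (adjective), "n" (noun) or "d" (determiner)
    position: str  # "c" (object) or "s" (subject)


def parse_token(line: str) -> Optional[Token]:
    """Return the three coded factors of a token line.

    Lines that do not start with an open parenthesis are not tokens,
    so None is returned for them.
    """
    if not line.startswith("("):
        return None
    factors = line[1:4]  # the three columns that follow the parenthesis
    if len(factors) < 3:
        raise ValueError(f"token too short: {line!r}")
    return Token(*factors)


# Hypothetical sample lines, not taken from Nepean.Tok.
sample = ["(1ns", "(0dc", "; a comment line"]
parsed = [parse_token(line) for line in sample]
```

Here parsed holds one Token per token line and None for the line that is not a token.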
The key output of a variable rule analysis consists of a list of numbers, one associated with each factor which may affect the variable being studied. Figure 2 contains part of a GoldVarb output file (see §7 for more detail) from an analysis of the data on Nepean plural marker expression of which the tokens in Figure 1 form part. The variable is plural marker presence or absence, the factors are "a", "n", "d", "c" and "s", grouped into two factor groups, one containing "a", "n" and "d" and the other "c" and "s". The numbers are factor weights, indicating in this instance the degree to which the factor favours marker presence rather than deletion. Higher numbers indicate that the corresponding factor favours marker presence more than the other factor(s) within the same factor group. The input is a kind of average tendency for marker presence. The precise interpretation of all these numbers can be found in the document "Variable Rules".
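As a rough guide to how the input and the factor weights fit together, the following Python sketch assumes the logistic form commonly used in variable rule analysis, in which the odds of the input are multiplied by the odds of one weight per factor group. This is an assumption made for illustration; the authoritative account is the document "Variable Rules". The numerical values below are invented and are not the results shown in Figure 2.

```python
from typing import Sequence


def odds(p: float) -> float:
    """Convert a probability (strictly between 0 and 1) to odds."""
    return p / (1.0 - p)


def predicted_probability(input_value: float, weights: Sequence[float]) -> float:
    """Combine the input with one factor weight per group (logistic form).

    A weight of 0.5 has odds of 1 and therefore leaves the prediction
    unchanged; weights above 0.5 raise it, weights below 0.5 lower it.
    """
    combined = odds(input_value)
    for weight in weights:
        combined *= odds(weight)
    return combined / (1.0 + combined)


# Invented values: an input of 0.6, one favouring and one neutral factor.
p = predicted_probability(0.6, [0.7, 0.5])
```

Under this reading, a factor weight is meaningful only relative to the other weights in its own group, which is why the text compares factors within a group rather than across groups.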
Open the file Nepean.Tok by double-clicking on its icon; this will automatically open GoldVarb. The screen should appear as in Figure 4. Nepean.Tok contains data on Nepean plural expression; indeed Figure 1 portrays part of this same file. The figure in the lower left corner of its window indicates the largest block of memory still available to GoldVarb. Ignore for the moment the Factor specification box associated with the token file. Token files and factor specifications will be discussed in more detail in §4.
We will use the information contained in Nepean.Tok as is. That is, we will not recode any of the factors. Thus we choose No recode in the Tokens menu. A dialogue box will appear asking for a name for a condition file. Type "Nepean.Cnd" (or whatever name you wish) and click Save. The program will produce a default condition file as in Figure 5. As with the token document, available memory is indicated in the lower left corner of the window. The second line of this document is just a comment, as indicated by the semi-colon in the first column. Note that the token file remains open. Condition files will be discussed in more detail in §5.
The next step is to construct a cell file by choosing Load cells to memory... in the Cells menu. A series of dialogue boxes will appear to establish options and to name files. The first simply requires confirmation that the cells be constructed on the basis of the currently open token and condition files. After clicking Yes, the user is requested to name the cell file, for example "Nepean.Cel". The tokens in Nepean.Tok are counted as they are checked for consistency with the values declared in the Factor specification box discussed in §4. A message appears when this is complete. Click OK to continue. A third dialogue box then appears, asking about application and non-application values. Ignore this for the moment; it will be discussed in §5. Just click OK to accept the default values furnished by the program. Finally, a fourth dialogue box asks for a name for a result file. After the user clicks Save, GoldVarb will construct a cell file by combining all those tokens which are identically coded on the independent variables. This file, Nepean.Cel, appears in Figure 6. The format of a cell file is discussed in §6.
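The combining step described above, in which identically coded tokens are pooled into one cell, can be sketched in Python as follows. This is an illustration only: the function name build_cells and the sample tokens are hypothetical, and the actual layout of a cell file is the one described in §6.

```python
from collections import Counter, defaultdict


def build_cells(tokens, application_value="1"):
    """Pool tokens that share the same coding on the independent variables.

    tokens: iterable of factor strings (without the opening parenthesis),
            dependent variable in the first column.
    Returns {independent factors: (applications, non-applications)}.
    """
    counts = defaultdict(Counter)
    for token in tokens:
        dependent, independent = token[0], token[1:]
        kind = "app" if dependent == application_value else "non"
        counts[independent][kind] += 1
    return {cell: (c["app"], c["non"]) for cell, c in sorted(counts.items())}


# Hypothetical tokens, not taken from Nepean.Tok.
cells = build_cells(["1ns", "0ns", "1ns", "1dc"])
```

With these four tokens, the three coded "ns" collapse into a single cell with two applications and one non-application, and the "dc" token forms a cell of its own.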
At the same time as the cell file is being constructed, certain information is written to the result file, Nepean.Res. This includes the date and time, the name of the token file, the condition file in its entirety and various statistics computed for Nepean.Tok, Nepean.Cnd and Nepean.Cel. See Figure 7.
Finally, choose Binomial, 1 level in the Cells menu. This will carry out the actual variable rule analysis on Nepean.Cel. This and other options will be discussed in §6. The results of the analysis are written to Nepean.Res, which already contains some results from the cell construction procedure. See Figure 8. In addition, GoldVarb draws a scattergram, shown in Figure 9, comparing the proportion of plural markers actually expressed in each cell of Nepean.Cel with the proportion predicted by the statistical model constructed by the program.
In this section we have followed the pathway from token data (contained in the token file) through cell file (constructed according to definitions in the condition file), to the results of variable rule analysis (contained in the result file). In the following sections we will examine each of these types of files in more detail.
The user who begins working with GoldVarb without having a native token file will need to proceed in one of two ways:
(a) Use the New... command to create a new empty document into which tokens will be typed, or
(b) Use the Import... command to import a non-native token file, i.e. a file created by another application.
These two approaches are discussed in §4.1 and §4.2. In version 2 of GoldVarb, a new feature, discussed in §4.3, allows the creation of a new token file from existing token and condition files. Finally in §4.4 we discuss another new feature introduced in version 2: searching and replacing factors in the token file.
§4.1 Creating a New Token File
When New... is chosen from the File menu, the user is first given the opportunity to name the document which then appears on the screen with the Factor specification box below it. The document will be empty except for a single open parenthesis, prompting the user to type a token. Here are some guidelines for the entering of tokens:
In addition to the tokens, the user must enter the factor specifications in the appropriate box. These are a set of declarations indicating which factors may appear in each group (i.e. each column of the token file) as well as the total number of groups. Here are some guidelines for the entering of factor specifications:
After both tokens and factor specifications have been entered, the user should choose Check tokens from the Tokens menu. This command (which is executed automatically when cells are created from tokens and conditions) determines whether all tokens contain only legal factors, extends any short token with the appropriate fill character, and replaces the character "." by the appropriate default value.
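The three actions of Check tokens can be pictured with the Python sketch below. It is a simplified model, not GoldVarb's own procedure: the function name check_token is hypothetical, the fill character is passed in as a parameter because its actual value is not stated here, and the order of the three steps is an assumption.

```python
def check_token(token, legal, defaults, fill):
    """Validate and normalise one token (given without its opening parenthesis).

    legal:    one string of legal factors per group, e.g. ["10", "and", "cs"]
    defaults: one default factor per group, substituted for "."
    fill:     character used to extend a token that is too short (assumed)
    """
    if len(token) > len(legal):
        raise ValueError(f"token {token!r} has more than {len(legal)} groups")
    # Step 1: every coded factor must be legal for its column.
    for column, factor in enumerate(token):
        if factor != "." and factor not in legal[column]:
            raise ValueError(
                f"illegal factor {factor!r} in group {column + 1} of {token!r}"
            )
    # Step 2: extend a short token with the fill character.
    padded = token.ljust(len(legal), fill)
    # Step 3: replace "." by the default value of its group.
    return "".join(
        defaults[column] if factor == "." else factor
        for column, factor in enumerate(padded)
    )


legal = ["10", "and", "cs"]
defaults = ["1", "n", "c"]  # hypothetical default values
```

For example, with a fill character of "/", the short token "1n" would become "1n/" and the token "1.s" would become "1ns".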
§4.2 Importing a Non-Native Token File
If the user has already prepared a token file using a program other than GoldVarb, this file may be processed by starting up GoldVarb and using the Import... command. This command, unlike Open..., allows the user to open any text file created by any application. Assuming that the format of this file conforms to the guidelines listed in §4.1, there is a short-cut which avoids the sometimes laborious task of entering the factors in the Factor specification box. When the user chooses Generate factor spec's... from the Tokens menu and clicks Yes in the ensuing dialogue box, GoldVarb scans the tokens and builds up a list of factor specifications based on the occurrences of factors in the token file. We can think of this command as a sort of inverse of the command Check tokens discussed above. That is:
If an imported file is saved by GoldVarb, it becomes "native", i.e. it adopts the appropriate GoldVarb document icon as if it had been created by the program.
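The scan performed by Generate factor spec's... on an imported file, as described earlier in this section, amounts to collecting the distinct characters found in each column. The Python sketch below shows the idea; the function name and the sample tokens are hypothetical, and the order in which GoldVarb lists the factors is not specified here (the sketch uses order of first occurrence).

```python
def generate_factor_specs(tokens):
    """Build one string of observed factors per group (column).

    tokens: iterable of factor strings without the opening parenthesis.
    Factors are listed in order of first occurrence (an assumption).
    """
    groups = []
    for token in tokens:
        for column, factor in enumerate(token):
            if column == len(groups):
                groups.append([])  # first time this column is seen
            if factor not in groups[column]:
                groups[column].append(factor)
    return ["".join(factors) for factors in groups]


specs = generate_factor_specs(["1ns", "0dc", "1as"])
```

Because the specifications are derived from the tokens themselves, a typing error in the token file would show up as an unexpected extra factor in the generated list, so the result is worth inspecting before it is accepted.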
§4.3 Creating a Token File from an Existing Token File
Normally one uses a token file, subject to recodings defined in a condition file, in order to create a cell file. However, a new feature of GoldVarb version 2 allows one to recode a token file in order to create not a cell file but a new token file. This is convenient if one wishes to do several recodings of the same data using complicated sets of conditions which differ only slightly. A single initial recoding can be used to create a new set of data (tokens) to which much simpler conditions will subsequently be applied.
To use this feature, choose Recode to new token file... from the Tokens menu. When the recoding has been completed, the old token and condition files will be closed and the newly created token file will be opened on screen.
§4.4 Searching and Replacing
For the purposes of preparing and correcting the raw data, i.e. the tokens, GoldVarb provides several search and replace functions. These are accessible via the last two items in the Tokens menu. When the item Search & Replace... is chosen, the dialogue box shown in Figure 11 will appear on screen. As the title of this little window implies, the search and replace functions apply only to the token document, and searching is columnar, i.e. GoldVarb will search for a factor or factors starting only in a particular column. This dialogue contains four buttons, a pair of arrows used to change the column number, and two slots for entering the string to be sought and its replacement, if any. If, for example, one wishes to search for the factor "s" in the third group, type "s" in the first rectangular slot, and then use the little arrows in the bottom right corner to set the column number to 3. To begin searching, click the button Find, or use the corresponding Command-key shortcut. The information about what is being sought will also appear in the last item in the Tokens menu illustrated in Figure 24.
The search example just described is very simple, involving only a single character. A less trivial example would be to search for, say, the string "nc" starting in column #2 -- that is, to search for the simultaneous occurrence of the factor "n" in the second group and the factor "c" in the third group.
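Columnar searching of this kind can be sketched in Python as below. The sketch is an illustration, not GoldVarb's implementation; the function name find_in_column and the sample tokens are hypothetical, and tokens are given without their opening parenthesis so that column 1 is the first factor group.

```python
def find_in_column(tokens, pattern, column):
    """Return the indices of tokens in which pattern starts at the given column.

    column is 1-based and counts factor groups, as in the dialogue box.
    """
    start = column - 1
    return [
        index
        for index, token in enumerate(tokens)
        if token.startswith(pattern, start)
    ]


tokens = ["1ns", "0nc", "1dc", "0nc"]   # hypothetical tokens
hits_nc = find_in_column(tokens, "nc", 2)  # "n" in group 2 AND "c" in group 3
hits_s = find_in_column(tokens, "s", 3)    # "s" in group 3
```

Note that the token "1dc" is not a match for "nc" in column 2 even though it contains "c" in the third group: the whole string must match starting at the chosen column.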
Here are some guidelines for searching:
The buttons Change and Change & Find in Figure 11 are used for replacing one occurrence at a time. The former is undoable, using Undo in the Edit menu, as if the replacement had been typed from the keyboard. The latter is not undoable and is equivalent to performing Change followed by Find.
Note that after performing Change all..., only the last modification may be undone using the Undo command.
When using any of these four functions, the text in the token document scrolls automatically so that each occurrence becomes visible as it is found.
As with token files, a condition file may be created by GoldVarb using New..., or it may be created by another program and imported into GoldVarb using Import.... Many users find it difficult to master the syntax required for the construction of a condition file. Thus GoldVarb includes a simplified procedure for this task -- yet another way of creating a condition file -- implemented through the Recode setup... command in the Tokens menu. There is also the No recode option discussed in §3.
§5.1 Using the "Recode setup..." dialogue box
We illustrate some simple recodings on the Nepean plural data. Open Nepean.Tok and choose the Recode setup... command in the Tokens menu. After asking the user to enter a name for the condition file which GoldVarb will generate (we suggest "a & n vs. d" for reasons which will soon be evident) the recoding dialogue box in Figure 13 appears.
On the left side of this box we see a list of the groups and factors in the token file. A list of recoded groups will be built up on the right side. The six buttons in the middle (discussed in Table 1 below) perform various forms of recoding; they are all initially deactivated because none of the groups has as yet been selected. GoldVarb uses the empty space at the bottom of the dialogue box to display occasional messages telling the user what to do next.
Suppose we have reason to ignore the distinction between adjectives and nouns in their effect on plural expression, and to concentrate on the difference between the determiners and the other two categories. Click #1 in the column on the left of the left hand list, and then click Copy. The group with factors 1 and 0 then appears in the right hand list. Now click #2, followed by Recode. The group with factors a, n and d appears in the right hand list, with the a flashing. Type a. Now the n starts flashing. Type a again to indicate that the nouns are being reclassified with the adjectives. The screen should look like Figure 14, with the d flashing. Type d. Finally, copy the #3 factor group to the right hand list and click OK. GoldVarb automatically generates a new condition file, shown in Figure 15.
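The effect of this recoding on individual tokens can be sketched in Python. This models only the outcome of the recode (nouns reclassified with adjectives), not the syntax of the condition file in Figure 15; the names used are hypothetical.

```python
# Recode table for group #2: nouns are merged with adjectives.
RECODE_GROUP_2 = {"a": "a", "n": "a", "d": "d"}


def recode_a_and_n_vs_d(token):
    """Copy groups #1 and #3 unchanged and recode group #2.

    token: three-character factor string without the opening parenthesis.
    """
    return token[0] + RECODE_GROUP_2[token[1]] + token[2]


recoded = [recode_a_and_n_vs_d(t) for t in ["1ns", "0as", "0dc"]]
```

After the recode, group #2 contains only the two factors "a" and "d", so the analysis contrasts determiners with the other two categories taken together.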
Let us consider another example using the Recode setup... command. We begin in the same way as above, assigning a file name (here we suggest "Nouns with subjects") and then copying group #1 from the left to the right side. But this time let us construct a new factor group by combining groups #2 and #3 into two new factors: x will represent tokens which consist of a noun in subject position (i.e. those tokens with n in column 2 and s in column 3) and y will represent all other tokens. To do this, select groups #2 and #3 on the left, then click in the AND button to indicate that we are interested in simultaneity of factors in these two groups. Still on the left side of the box, select the factors n and s and type x which will appear as the first factor in group #2 on the right. Since this is the only combination required, click in the space at the bottom as instructed.
The dialogue box should now appear as in Figure 16 with the small black rectangle beside the x flashing, indicating that we must type one more letter as the recode value for all other combinations, that is, all tokens which do not consist of a noun in subject position. Type y, and we are done. Click in the OK button. The dialogue box disappears and the newly generated condition file (Figure 17) appears on screen.
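The outcome of this second recode, in which two groups are combined into one new group, can be sketched in Python as follows. Again this models only the result for each token, not the condition file of Figure 17, and the function name is hypothetical.

```python
def recode_nouns_with_subjects(token):
    """Keep group #1 and replace groups #2 and #3 by a single new group.

    The new factor is "x" for a noun in subject position ("n" AND "s")
    and "y" for every other combination.
    """
    is_noun_subject = token[1] == "n" and token[2] == "s"
    return token[0] + ("x" if is_noun_subject else "y")


recoded = [recode_nouns_with_subjects(t) for t in ["1ns", "1nc", "0as"]]
```

The recoded tokens have only two columns, since the original groups #2 and #3 have been merged into one.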
A word about the numbering of groups: In Figures 14 and 16 the groups on the right side of the dialogue box ("Groups after recoding") are numbered consecutively starting at 1. The number appearing in square brackets indicates the group's origin, i.e. the group's number before recoding. Thus the notation "1[1]" means that recoded group number one was taken from group #1 on the left side. If the letter "n" appears instead of a number, this indicates that the group is a new one which did not exist as such in the token file. Thus the notation "2[n]" in Figure 16 indicates that the second recoded group was constructed from a combination of several groups (in this case, two) on the left. In the resulting condition files (Figures 15 and 17), the original group number is shown at the beginning of each set of recode conditions. The number "0" is used for new groups, as shown in Figure 17.
Finally, it should be noted that the Recode setup... dialogue allows easy construction of only the simplest and most common forms of recode. More complicated recodes must be entered by typing directly into the condition file window. Nevertheless, the two methods may be combined -- i.e. one may generate a preliminary condition file using the dialogue and then modify or extend it by typing.
See Table 1 for an explanation of the six buttons in this dialogue box.
* On each button, the arrows, if any, indicate the direction of the operation which the button performs.
* The list on the left side of the Recode setup... box shows the groups contained in the token file, as specified in the Factor specification box. The list on the right side contains the recoded groups, i.e. the groups which will be generated by the condition file which this dialogue box builds.
* A group or factor is selected or de-selected by clicking with the mouse. A factor is selected if it is shown on a black background, while a group is selected if its group number is shown on a black background. It is possible, for example, to select a group without selecting any of its factors, but the opposite is not possible. At most one factor at a time may be selected in a selected group.
* A button is activated only if an appropriate group, set of groups, factor, or set of factors has been selected. For example, the Copy button is activated when one or more groups are selected on the left, while Remove is activated only if one or more groups are selected on the right.
§5.2 The Application Values
Along with the condition file, GoldVarb must be told which values of the dependent variable (the first recoded group) are pertinent for constructing cells. During the final stages of cell creation, i.e. just after processing the token and condition files, GoldVarb will display a dialogue box in which the user is asked to enter the application values. If only one value is entered, then this will be the application value, and all other factors in the group will be counted as non-applications. If more than one value is specified, only these values will be used, while recoded tokens with any other factor in column one will be ignored. The maximum number of values of the dependent variable is 9.
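The rule for application values can be summarised in a short Python sketch. It is an illustration under stated assumptions: the function name classify is hypothetical, and when several values are entered the sketch simply returns the value itself, since the roles of the individual values are not described at this point.

```python
def classify(dependent, application_values):
    """Classify one recoded token by its dependent-variable factor.

    application_values: string of the values entered by the user (at most 9).
    Returns "application" or "non-application" when a single value was
    entered; otherwise returns the factor itself if it is one of the
    listed values, or None if the token is to be ignored.
    """
    if not 1 <= len(application_values) <= 9:
        raise ValueError("between 1 and 9 values are allowed")
    if len(application_values) == 1:
        if dependent == application_values:
            return "application"
        return "non-application"
    return dependent if dependent in application_values else None


# A dependent variable with the four factors a, b, c and d:
single = [classify(f, "a") for f in "abcd"]
several = [classify(f, "ab") for f in "abcd"]
```

With the single value "a", tokens coded b, c or d all count as non-applications; with the two values "ab", tokens coded c or d drop out of the cell file altogether.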
Example: Suppose the dependent variable has factors "abcd" (after recoding). Consider the following choices:
In the current version of GoldVarb, variable rule computations are possible only for the binomial case. However, cell file creation and cross-tabulation are possible in all cases. The adjustment of condition files to eliminate knockout and singleton factors is discussed at the end of §6. Condition file syntax is discussed in further detail in the Appendix.
A cell file, such as the one depicted in Figure 6, consists of the following parts:
With some data sets a knockout factor or a singleton may be flagged by the program beside the tabular results (cf. Figure 7) which show the counts of factors in the cells. A knockout is a factor for which applications occur with frequency 0% or 100%. A singleton is a group which contains only one factor. Variable rule computations cannot logically be performed on a cell file which contains knockouts or singletons. One must generate a new cell file by recoding the original tokens using a different condition file.
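The two definitions above translate directly into a check on the cell counts. The Python sketch below is an illustration, not the program's own test; the function name find_problems and the sample cells are hypothetical, and groups are numbered from 1 among the independent groups only.

```python
from collections import defaultdict


def find_problems(cells):
    """Locate knockout factors and singleton groups.

    cells: {independent factors: (applications, non-applications)},
           all keys having the same length (one character per group).
    Returns (knockouts, singletons): knockouts is a list of
    (group number, factor) pairs, singletons a list of group numbers.
    """
    n_groups = len(next(iter(cells)))
    totals = [defaultdict(lambda: [0, 0]) for _ in range(n_groups)]
    for factors, (apps, non_apps) in cells.items():
        for group, factor in enumerate(factors):
            totals[group][factor][0] += apps
            totals[group][factor][1] += non_apps
    knockouts = [
        (group + 1, factor)
        for group, by_factor in enumerate(totals)
        for factor, (apps, non_apps) in sorted(by_factor.items())
        if apps == 0 or non_apps == 0  # 0% or 100% application
    ]
    singletons = [
        group + 1 for group, by_factor in enumerate(totals) if len(by_factor) == 1
    ]
    return knockouts, singletons


# Hypothetical cells: "d" always applies, so it is a knockout.
problems = find_problems({"ns": (2, 1), "nc": (1, 2), "dc": (3, 0)})
```

A knockout found in this way is typically removed by recoding, for instance by merging the offending factor with another factor of the same group or by excluding its tokens.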
In Figure 8, the line starting with "Iterations" keeps track of GoldVarb's progress in finding the "maximum likelihood" estimation of the factor weights to a certain degree of accuracy, at which point "convergence" is indicated. If the number of iterations reaches 20 without convergence, no further iteration is attempted and the current values of the estimates are presented.
A new feature of GoldVarb is the option of comparing the log likelihood of a run and the maximum possible value of such a likelihood. The usefulness of this test remains to be evaluated.
The scattergram (Figure 9) drawn at the completion of a 1-level analysis may be printed, or it may be copied to the clipboard and subsequently pasted into the Scrapbook or into a document of a graphics application in order to save it. Further, while the scattergram window remains open on screen, detailed information about any data point in the scattergram may be obtained (and optionally written to the result document) by positioning the cross-hair cursor over the point and clicking with the mouse button. The size of each point is proportional to the number of tokens in the corresponding cell(s), so that a large point far from the diagonal suggests interaction among its factors.
For compatibility with some previous variable rule programs, GoldVarb also displays, at the end of a 1-level analysis, the Chi-square contribution from each cell as well as the average Chi-square per cell.
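For orientation, the Python sketch below computes a per-cell Chi-square contribution in the standard Pearson form for a binomial cell, together with the average over cells. The exact formula used by GoldVarb is not given in this manual, so the form shown is an assumption, and the function names and sample figures are hypothetical.

```python
def cell_chi_square(applications, total, predicted_p):
    """Pearson Chi-square contribution of one binomial cell (assumed form).

    Summing (observed - expected)^2 / expected over applications and
    non-applications simplifies to the expression below. predicted_p
    must lie strictly between 0 and 1.
    """
    expected = total * predicted_p
    return (applications - expected) ** 2 / (expected * (1.0 - predicted_p))


def chi_square_summary(cells):
    """cells: list of (applications, total, predicted probability).

    Returns the list of contributions and their average per cell.
    """
    contributions = [cell_chi_square(a, n, p) for a, n, p in cells]
    return contributions, sum(contributions) / len(contributions)


# Invented figures: one well-fitted cell and one poorly fitted cell.
contributions, average = chi_square_summary([(5, 10, 0.5), (8, 10, 0.5)])
```

A cell whose observed proportion equals the predicted proportion contributes nothing, while a cell far from its prediction contributes heavily, which mirrors the distance from the diagonal in the scattergram.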
Binomial, Up & Down performs a step-by-step analysis, at each level of which only a subset of the factor groups are included and cells are contracted by combining together in one cell all those which differ only in excluded groups. At level 0 no groups are included so the cell list contracts into a single cell, at level 1 only one group is included, at level 2 two groups, etc. GoldVarb begins at level 0 and steps up until it finds no group whose inclusion would significantly (p < 0.05) increase the log likelihood. GoldVarb then starts again but at level "n" (n = the number of independent groups) at which all groups are included and all cells are used without contraction (as in the Binomial, 1 level analysis), and steps down to lower levels until it can no longer find a group whose exclusion does not significantly decrease the log likelihood.
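Two ingredients of the step-by-step procedure can be sketched in Python: the contraction of cells when some groups are excluded, and a significance test on the change in log likelihood. The contraction follows the description above. The test shown is a likelihood-ratio Chi-square test at the 0.05 level, which is an assumption, since the manual states the threshold but not the test; the function names are hypothetical and only three standard critical values are tabulated.

```python
def contract(cells, included):
    """Merge cells that differ only in excluded groups.

    cells:    {independent factors: (applications, non-applications)}
    included: 0-based indices of the groups kept at this level
    """
    merged = {}
    for factors, (apps, non_apps) in cells.items():
        key = "".join(factors[i] for i in included)
        old_apps, old_non = merged.get(key, (0, 0))
        merged[key] = (old_apps + apps, old_non + non_apps)
    return merged


# Standard Chi-square critical values for p = 0.05, by degrees of freedom.
CRITICAL_05 = {1: 3.841, 2: 5.991, 3: 7.815}


def significantly_better(ll_without, ll_with, extra_parameters):
    """Likelihood-ratio test (assumed): does adding a group help at p < 0.05?

    extra_parameters: number of additional free parameters, assumed here
    to be the number of factors in the added group minus one.
    """
    statistic = 2.0 * (ll_with - ll_without)
    return statistic > CRITICAL_05[extra_parameters]


cells = {"ns": (2, 1), "nc": (1, 2), "dc": (3, 0)}  # hypothetical cells
level_0 = contract(cells, [])    # no groups included: a single cell
level_1 = contract(cells, [0])   # only the first independent group
```

At level 0 the whole cell list collapses to one cell holding the overall counts, as the text describes; each higher level splits that cell along the included groups.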
The results from the "Best stepping up run" (see Figure 19) are usually identical to those from the "Best stepping down run". When they are not identical, this indicates some uncertainty about the status of the factor groups included in one analysis but excluded from the other.
Figure 18 shows the Macintosh screen during a Binomial, Up & Down analysis. In addition to the menu bar, the figure shows two windows. The large window is the result document. The small window above it is a dialogue box, called the monitor, indicating the status of the computation and including buttons which allow the user to cancel or temporarily to suspend the analysis. The horizontal bar is filled in as the computation proceeds. In the figure it is less than half filled, since the step-up part has not yet been completed.
As the analysis in Figure 18 proceeds, the cursor rotates, imitating a rolling beach ball. If the user hits the Pause button, the analysis will be suspended and the cursor will change form until the computation is either resumed by hitting Continue or cancelled by hitting Cancel.
If the program is running under the MultiFinder, then during a variable rule computation (or while pausing), the user may switch to another application by clicking in the little GoldVarb icon in the upper right corner of the screen. This allows one to use the computer for other purposes while the analysis continues in the background. The pause feature is useful in order to allow another application near-exclusive use of the CPU on a temporary basis. If GoldVarb completes its work in the background, the user will be alerted. The user may switch back to GoldVarb by clicking in any of its windows (e.g. the monitor or the result document), or by clicking (possibly more than once) in the little icon in the top right corner of the screen, or by using the Apple menu.
In addition to the three buttons just described, the monitor in Figure 18 also contains a small control giving access to GoldVarb's Auto-save feature. This feature is equivalent to the option "Automatically save textual results" which appears in the Editing options... dialogue box accessible via the Edit menu discussed in §8.3. It is included in the monitor because the menus are inaccessible during variable rule analysis.
Figure 19 shows the result document as it appears immediately after completion of this Up & Down analysis. Unlike the 1-level case, a scattergram is not drawn at the completion of an Up & Down analysis. If a scattergram is desired, a recoding must be done so that only the appropriate groups are used to make cells for a Binomial 1-level analysis.
§8.1 The Apple Menu
In addition to desk accessories, this menu contains the item About GoldVarb..., which, when chosen, displays a box containing information about the program and its authors. Some general documentation summarizing this manual is available by clicking the Help button in this box. More specific information about the various GoldVarb windows is available through the Info & Help... command in the Window menu, discussed in §8.6.
§8.2 The File Menu
The File menu, illustrated in Figure 20, contains commands for opening, closing and printing files.
The commands New..., Open..., Import..., Save... and Save as... are for the three types of data files (tokens, conditions and cells) and for result files. With Open... only files created by this program can be opened, while Import... allows one to open any text file created by any application. The length of documents is limited only by the amount of memory (RAM) available. When the user closes a GoldVarb file which has been modified, or when Save... is chosen from the File menu, the ensuing dialogue box displays the icon of the document so one can tell at a glance what type of file is to be saved.
When a token file is opened, the Factor specification dialogue box which appears is for entering groups and factors which will be used to check the tokens before recoding. On the other hand, the groups and factors listed at the beginning of a cell file are only the independent groups obtained after recoding. For information about the format for entering data, see the Info & Help... command in the Window menu.
The Close command applies to the active (topmost) window, which may be one of the three data types mentioned above, or some other document such as the clipboard, the windows for displaying results (there are two: one for textual results, the other for pictorial results), the window which displays documentation, or the dialogue box for searching.
Only text results, not pictorial, can be saved to a disk file with this version of GoldVarb. However, a picture -- such as a scattergram -- may be printed. It may also be copied to the clipboard and then pasted into the Scrapbook, or into a document in another application such as MacPaint or MacWrite.
The command Print setup... displays a standard dialogue box which allows the user to select some basic options, such as the page orientation, for the printer which is currently chosen.
GoldVarb has two main printing functions. The command Print selection... (shown chosen in Figure 20) is used to print the selection in a text document. If no text is selected, then this function is disabled. The next command Print document... is used to print an entire text or pictorial document. Either item will cause the dialogue box of Figure 21 to be displayed. In this figure the items in the dotted rectangle are proper to GoldVarb while the rest depend on the type of printer currently in use. The button Page layout... gives access to another dialogue box (not illustrated here) which allows the user to change the margins and choose a header and/or a footer. If the option "Preview printing, page-by-page" is chosen, a representation of each printed page will appear on the screen so it can be viewed before being sent to the printer.
A picture will always be printed on a single page, with vertical or horizontal reduction if necessary. When printing text, the number of lines per page depends on the font, font size and font style used to display the document, as well as on the choices of margin, header and footer. The total number of pages of text to be printed is not precalculated in the current version of GoldVarb (hence the question mark in Figure 21) but will be in a future version.
The command Transfer..., an alternative to Quit, allows one to go directly to another program without going first to the Finder. For example one may wish to transfer to another application in order to process result files created by GoldVarb or to prepare data files (although this can also be done within GoldVarb). Of course, if the Macintosh is operating under the MultiFinder then other programs can run simultaneously with GoldVarb, in which case there is no need to transfer or quit in order to switch applications.
§8.3 The Edit Menu
GoldVarb allows text editing in the token, condition and cell windows, and optionally in the result document. A single click with the mouse button sets the insertion point for typing. Double-clicking with the mouse button will select a word, while triple-clicking will select a line. If line numbers are displayed in a window in which text-editing is enabled, then several lines may be selected by dragging the cross-cursor over the appropriate line numbers while holding down the mouse button. The clipboard displays either the last piece of text copied (which will be inserted if the Paste command is executed) or the last picture copied, for example a scattergram or a cross-tabulation.
The Edit menu implements the standard text-editing commands, plus a few options. The command Undo (which changes to Redo when appropriate) allows one to undo or redo the last Cut, Copy, Paste or Clear operation, or the last sequence of characters typed from the keyboard, including backspaces. However, if the user leaves a window in order to work in another and then returns to the first, it will no longer be possible to undo/redo Cut, Copy or Paste, because the contents of the clipboard may have been changed. Keyboard entries and Clear remain undoable/redoable.
The appearance of the Undo item in this menu changes, depending on what operation, if any, can be undone or redone. In Figure 22 it reads Undo Copy, indicating that the user has just copied some text to the clipboard. If this copy is undone, then the text previously stored on the clipboard, if any, will be restored. This restoration of the old clipboard also occurs when Cut is undone. However, if the previous contents were a picture, they cannot be restored by undoing a Cut or Copy operation.
The item Line numbers allows the display of line numbers in the left margin of any window containing text. It works like a toggle switch. A check mark appears on the left if it is "on".
Finally, the last item in the Edit menu, Editing options..., causes GoldVarb to display the dialogue box illustrated in Figure 23. This dialogue allows the user to set several text-editing options.
§8.4 The Tokens Menu
This menu, enabled only if a token document is open, is illustrated in Figure 24. It contains a variety of commands which allow the user to verify, recode or modify the tokens.
Finally, the last two items in the Tokens menu deal with searching and replacing as discussed in §4.4.
§8.5 The Cells Menu
The Cells menu performs various operations on cells, including variable rule analysis. The menu is divided into four parts. The first part contains four items:
The second set of items in the Cells menu performs variable rule analysis on the cells currently in memory. Binomial, Up & Down and Binomial, 1 level are discussed in §7. Multinomial variable rule analysis (i.e. with more than two application values) has not yet been implemented. Hence the item Multinomial, 1 level is not functional.
The third part of the Cells menu is used to set various options. Each functions like a toggle switch, with a check mark displayed on the left if the option is "on".
§8.6 The Window Menu
This menu contains a variety of commands controlling the arrangement of, and giving information about, the windows on the screen.
§8.7 The Font & Style Menus
The Font menu is used to select the font for the active window. Non-proportional fonts such as Courier or Monaco are recommended for tables so that their columns will be properly aligned. However, if the user does a Cross tabulation... (Cells menu) and chooses "Picture", the columns will be properly aligned regardless of which font is used. For such pictorial results, the font or font size may be changed only by recreating the window's contents; i.e. one must re-execute the appropriate command after choosing a different font or font size.
Using the Style menu, illustrated in Figure 28, one may change the size of the font used to display the text in the active window. The appearance will be best if a size displayed in bold/outline in the menu is chosen. For example, in the figure, sizes 9, 10, 12, 14 and 18 will appear best. Font sizes greater than 12 are not recommended for scattergrams.
The Style menu also contains several items which control the style of the text. As with fonts and font sizes, the choice of style applies to the entire document, not just the selection. The style Bold may, for example, be used to improve readability when displaying the Macintosh screen on an overhead projector. The style Condense may facilitate printing a text document containing long lines.
The data within a condition file is in the form of a LISP list. Each element of the list is itself a LISP list consisting of two parts: a group number (column number within the coding string) and an optional set of recode conditions. If no recode conditions are specified, the group is used exactly as it is coded in the token. All groups specified in the condition file, and only those groups, are used to build cells. The first group in the condition file list is used as the group containing the dependent variable. The order of groups specified within the condition file determines the order of groups within the cell.
Each recode condition is again a LISP list consisting of two parts: the first part is the recode value, the value to be used for the group for those tokens which meet the second part of the condition, the test clause. The recode value is either a single character or "NIL". If it is "NIL" then tokens which fulfill the test clause are excluded when building cells. Similarly, if the recode value for the dependent variable group is "/", tokens which fulfill the test clause are also excluded. If the recode value for an independent group is "/", the remaining groups for tokens which fulfill the test clause are used in the construction of cells.
There are five test clause predicates: "AND", "OR", "NOT", "COL" and "ELSEWHERE". Case is irrelevant for predicates, e.g. "OR" and "or" are equivalent. However, case is significant for factors! For example "b" and "B" are two distinct factors.
"AND", "OR" and "NOT" are the standard logical operators; "AND" and "OR" take two to 20 predicates as arguments, "NOT" takes a single predicate as argument. If it is necessary to define a recode condition with more than 20 arguments for "AND" or "OR", two or more of the arguments can be nested more deeply. For example:
(AND a1 a2 a3 a4 a5 a6 a7 a8 a9 a10 a11 a12 a13 a14 a15 a16 a17 a18 a19 (AND a20 a21 a22 a23 a24))
"COL" takes two arguments, a group number (i.e. column number within the coding strings) and a single character representing a legal factor value for that group; "COL" is true if and only if that column of the coding string contains the specified value.
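The nesting described above for "AND" and "OR" lists with more than 20 arguments can be generated mechanically. The following Python sketch is only an illustration and is not part of GoldVarb; the function name `nest` and its `limit` parameter are hypothetical names introduced here.

```python
def nest(op, args, limit=20):
    """Write (op a1 a2 ...) so that no list has more than `limit` arguments."""
    args = list(args)
    if len(args) > limit:
        # Keep limit-1 arguments at this level and wrap the rest in a nested
        # list, which then counts as the last argument of this level.
        args = args[:limit - 1] + [nest(op, args[limit - 1:], limit)]
    return "(" + " ".join([op] + args) + ")"


# 24 arguments, as in the example in the text.
print(nest("AND", [f"a{i}" for i in range(1, 25)]))
```

With 24 arguments the sketch keeps a1 to a19 at the top level and nests a20 to a24, which is the same arrangement as the example in the text. Since the nested list always receives at least two arguments, it remains a legal "AND" or "OR".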
"ELSEWHERE" is always true; it is used as the last test clause within a set of test clauses for a group, and forces the recoding of the group to the specified value if none of the previous conditions for that group has been met.,
Here is an example of a condition file:
(
(4 (d (OR (COL 4 d) (COL 4 c)))
   (s (ELSEWHERE)))
(5)
(3 (/ (OR (COL 3 s) (COL 3 t) (COL 3 u)))
   (m (OR (OR (COL 3 n) (COL 3 h))
          (OR (COL 3 1) (COL 3 2) (COL 3 3) (COL 3 w) (COL 3 u)
              (COL 3 y) (COL 3 p) (COL 3 t) (COL 3 r) (COL 3 x))))
   (x (AND (OR (COL 3 n) (COL 3 h)) (COL 7 n)))
   (NIL (ELSEWHERE)))
; interactive group
(0 (1 (AND (COL 2 x) (COL 8 a)))
   (2 (AND (COL 2 x) (COL 8 b)))
   (3 (AND (COL 2 y) (COL 8 a)))
   (4 (AND (COL 2 y) (COL 8 b)))
   (/ (ELSEWHERE)))
)
Lines with a semi-colon in column 1 are comment lines and are ignored when processing the condition file. In the above example, group #4 is the dependent variable while groups 5, 3, and 0 are the independent variables which will be used to build cells. Group #5 has no recode conditions and therefore will be used exactly as it is coded in the token file. Groups 2 and 8 are interactive factor groups; the interaction is investigated by creating a new group, which is given the number 0.
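For readers who want to inspect a condition file outside GoldVarb, its list structure can be read with a few lines of code. The Python sketch below is an illustration only, not GoldVarb's own parser. The function name `parse_conditions` is hypothetical, and the sketch assumes that parentheses are never used as factor characters and that items are separated by spaces.

```python
def parse_conditions(text):
    """Parse condition-file text into nested Python lists of strings."""
    # Drop comment lines (semi-colon in column 1), then split into tokens.
    lines = [ln for ln in text.splitlines() if not ln.startswith(";")]
    tokens = " ".join(lines).replace("(", " ( ").replace(")", " ) ").split()
    if not tokens or tokens[0] != "(":
        raise ValueError("a condition file must start with '('")

    def read(pos):
        # tokens[pos] is "(": collect items until the matching ")".
        items = []
        pos += 1
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                sub, pos = read(pos)
                items.append(sub)
            else:
                items.append(tokens[pos])
                pos += 1
        return items, pos + 1

    tree, _ = read(0)
    return tree


example = """(
(4 (d (OR (COL 4 d) (COL 4 c))) (s (ELSEWHERE)))
; a comment line
(5)
)"""
tree = parse_conditions(example)
groups = [group[0] for group in tree]
print("dependent variable group:", groups[0])
print("independent groups:", groups[1:])
```

Each element of the resulting list starts with the group number, followed by zero or more recode conditions, and the first element corresponds to the dependent variable, as described in the text.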
Note that GoldVarb processes the recode conditions for each group in the order in which they appear in the conditions. The first condition that is satisfied for each token is used for recoding, and the rest of the conditions for that group are ignored for that particular token. For this reason, "ELSEWHERE" should be used only as the last condition for a group: it is always true, so any conditions listed after it (for the same group) will be ignored. For example, if the conditions for group #5 are:
(5 (t (ELSEWHERE)) (a (COL 5 x)) (b (COL 5 w)))
then all tokens will be recoded "t" for group #5, even if they were originally coded "x" or "w".
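The first-match rule for group #5 can be imitated in a short Python sketch. This is an illustration under stated assumptions, not GoldVarb's implementation: the names `matches` and `recode` are hypothetical, the coding strings are invented, columns are assumed to be numbered from 1, and only "COL" and "ELSEWHERE" are handled.

```python
def matches(test, coding):
    """Evaluate a test clause limited to COL and ELSEWHERE."""
    if test[0].upper() == "ELSEWHERE":
        return True
    _, column, factor = test
    return coding[column - 1] == factor


def recode(conditions, coding):
    """Return the recode value of the first condition whose test clause holds."""
    for value, test in conditions:
        if matches(test, coding):
            return value
    return None  # no condition matched; this case is not modelled here


# Conditions for group #5 from the text, with ELSEWHERE misplaced first.
misplaced = [("t", ("ELSEWHERE",)), ("a", ("COL", 5, "x")), ("b", ("COL", 5, "w"))]
# The same conditions with ELSEWHERE moved to the end.
reordered = [("a", ("COL", 5, "x")), ("b", ("COL", 5, "w")), ("t", ("ELSEWHERE",))]

codings = ["1ancx", "0dnsw", "1nacq"]  # invented coding strings
print([recode(misplaced, c) for c in codings])  # ['t', 't', 't']
print([recode(reordered, c) for c in codings])  # ['a', 'b', 't']
```

With "ELSEWHERE" first, every token is recoded "t"; with "ELSEWHERE" last, tokens coded "x" and "w" in column 5 receive "a" and "b" and only the remaining token falls through to "t".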
Similarly, the user should make sure that a condition with recode value "NIL" is placed correctly within the list of conditions. Consider the following two sets of recode conditions for group #3:
1. (3 (a (COL 3 s)) (b (COL 3 w)) (NIL (AND (COL 3 s) (COL 4 t))))
2. (3 (NIL (AND (COL 3 s) (COL 4 t))) (b (COL 3 w)) (a (COL 3 s)))
In the first example, any token coded "s" for group #3 will be recoded "a", and the remaining conditions will be ignored for that token; therefore, the third condition can never be met. In the second example, any token coded "s" for group #3 and "t" for group #4 will not be used to build cells; therefore, only tokens coded "s" for group #3 and NOT coded "t" for group #4 will be recoded "a".
If the user wishes to eliminate unconditionally certain tokens from the cells, it is suggested that these recodes to NIL be placed as the first recodes within the list for the dependent variable. For example, if the dependent variable is factor group #2, and all tokens coded "s" in group #3 and "t" in group #4 are not to be used to build cells, then the first part of the condition file might look as follows:
( (2 (NIL (AND (COL 3 s) (COL 4 t))) (a (OR (COL 2 x) (COL 2 y))) ...
When creating new groups which do not exist in the original tokens (indicated by a group number of 0, as in one of the examples above), the user should include an "ELSEWHERE" condition if there is a possibility that for some of the tokens, none of the conditions for creating the new group will be true; otherwise, the first such token will generate an error message and cell creation will be aborted.
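The effect of where a "NIL" recode is placed, and of a missing "ELSEWHERE" in a new group, can be checked with a small Python sketch. It is an illustration only and makes several assumptions: the names `holds`, `recode` and `interaction` are hypothetical, the coding strings are invented, columns are numbered from 1, and the error raised for a new group merely stands in for GoldVarb's error message.

```python
def holds(test, coding):
    """Evaluate one test clause against a coding string."""
    op, args = test[0].upper(), test[1:]  # predicate names ignore case
    if op == "ELSEWHERE":
        return True
    if op == "COL":
        column, factor = args
        return coding[column - 1] == factor  # factors are case-sensitive
    if op == "NOT":
        return not holds(args[0], coding)
    if op == "AND":
        return all(holds(a, coding) for a in args)
    if op == "OR":
        return any(holds(a, coding) for a in args)
    raise ValueError(f"unknown predicate: {test[0]}")


def recode(conditions, coding, new_group=False):
    """Apply the first matching condition; "NIL" marks a token to be left out."""
    for value, test in conditions:
        if holds(test, coding):
            return value
    if new_group:
        # Per the text, a new group (number 0) with no matching condition
        # produces an error and cell creation is aborted.
        raise LookupError(f"no condition matched token {coding!r}")
    return None  # behaviour for existing groups is not modelled in this sketch


NIL_TEST = ("AND", ("COL", 3, "s"), ("COL", 4, "t"))
first = [("a", ("COL", 3, "s")), ("b", ("COL", 3, "w")), ("NIL", NIL_TEST)]
second = [("NIL", NIL_TEST), ("b", ("COL", 3, "w")), ("a", ("COL", 3, "s"))]

print(recode(first, "xxst"), recode(second, "xxst"))  # a NIL
print(recode(first, "xxsq"), recode(second, "xxsq"))  # a a

# A new group built from columns 2 and 8, without an ELSEWHERE condition.
interaction = [("1", ("AND", ("COL", 2, "x"), ("COL", 8, "a")))]
print(recode(interaction, "1x00000a", new_group=True))  # 1
```

In this sketch a token such as "1y00000a" would raise the error for the new group, which is the situation an "ELSEWHERE" condition is meant to prevent.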
For those users not familiar with LISP syntax, note that:
The default font for GoldVarb documents is Courier 12. This font has a number of advantages: it prints well on laser printers, it is monospaced (i.e. almost all its characters are of the same width), and it contains very few undefined characters. This last property makes it especially convenient for token files in which some groups have many different factors.
Characters 32 through 255 of the font Courier 12 are illustrated in the table in Figure 29. The ASCII code of any character in the table is found by adding the small figure at the top of the column to the small figure at the left end of the row. Note that three characters have special significance in tokens: `(' used to introduce a token, `/' indicating a token or group to be excluded, and `.' which will be replaced by the default factor. The space (#32) and the option-space (#202) should not be used in tokens. In addition, the following ASCII codes correspond to undefined and/or invisible characters: 127, 174, 190, 206, 207, 222, 223 and 228. With these exclusions, more than 200 characters are still available. In the current version of GoldVarb the maximum number of factors per group is 200.
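The count of available characters can be verified with a short calculation. The Python sketch below works only with the character codes listed in the text; the variable names are introduced here for illustration, and treating the three special characters as a separate exclusion is an assumption about how one might choose factor characters, not a rule stated by the manual.

```python
SPACES = {32, 202}  # space and option-space
UNDEFINED = {127, 174, 190, 206, 207, 222, 223, 228}  # undefined or invisible
SPECIAL = {ord("("), ord("/"), ord(".")}  # special significance in tokens

# Codes 32 through 255, minus the spaces and the undefined characters.
usable = [code for code in range(32, 256) if code not in SPACES | UNDEFINED]

# Additionally setting aside the three characters with special significance.
plain_factors = [code for code in usable if code not in SPECIAL]

print(len(usable))         # 214
print(len(plain_factors))  # 211
```

Either way the total stays above 200, in agreement with the text.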
27 May 1999