GoldVarb

Version 2

A Variable Rule Application for the Macintosh

David Rand and David Sankoff


GoldVarb - Version 2.0 - April, 1990

On-line Manual

	To request information or make comments contact:

		D. Sankoff
		Centre de recherches mathématiques
		Université de Montréal
		C.P. 6128, succursale centre-ville
		Montréal (Québec) H3C 3J7

	Telephone:

		(514) 343-7574

	E-mail:

		sankoff@CRM.UMontreal.CA

Table of Contents

  1. Introduction

  2. Data and Results

  3. Getting Started

  4. Token Files

  5. Condition Files

  6. Cell Files

  7. Variable Rule Analysis

  8. Using the Menus

  9. Glossary

Appendices

  1. More on Condition Files

  2. The Macintosh Character Set


§1. Introduction

GoldVarb is a Macintosh application for carrying out variable rule analysis and associated data manipulations and displays. It is based on programs previously circulated by David Sankoff, Pascale Rousseau, Don Hindle and Susan Pintzuk, but, as well as containing many new features, it has been completely restructured and reprogrammed in PASCAL by David Rand. Successive versions have been extensively tested by researchers in the linguistics departments of the University of Pennsylvania and the University of Ottawa and in the Département d'anthropologie of the Université de Montréal. GoldVarb 1.6 was distributed in October of 1988 at the XVIIth NWAVE colloquium held at the Centre de recherches mathématiques of the Université de Montréal[1]. This manual describes version 2.0 which includes improvements implemented in the intervening year and a half, as well as several extensions to the user interface.

This manual describes only how to use GoldVarb. The underlying mathematics as well as the linguistic interpretation and pertinence of variable rule analysis are discussed elsewhere[2]. GoldVarb is a stand-alone application, requiring no software other than the operating system. The text-editing capabilities of GoldVarb were adapted from the text editor Utile+[3], which interested users can obtain from its author.

We assume that the reader has some experience with at least one other Macintosh application. The glossary in §9 contains a few tips for beginners. It is important to use a recent version of the system software [4]: that is, the System, Finder and MultiFinder files, and other associated files.

GoldVarb may be run on any member of the Macintosh family of computers. Nevertheless, it is not recommended for use on models older than the Macintosh Plus.


§2. Data and Results

The data for variable rule analysis consist of a list of tokens coded for a certain number of factors. Figure 1 shows part of a GoldVarb data file.


Figure 1.

This type of file will be discussed in more detail in §4. The data in the figure bear on the phenomenon of plural morpheme expression in Nepean noun phrases. Each line starting with an open parenthesis contains the information on one adjective, noun or determiner token from a corpus of informal interaction among Nepean adolescent gang members. The "1"'s and "0"'s in the first column indicate the presence or absence of the plural marker -enmas. The "a", "n" and "d" in the second column refer to the grammatical category (adjective, noun or determiner) of the word eligible to be marked. The "c" and "s" in the third column indicate whether the token comes from an object or subject noun phrase, respectively.
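The layout just described can be illustrated with a short Python sketch. It is not part of GoldVarb. It assumes that the coding string is the run of non-blank characters immediately following the open parenthesis and that lines not starting with a parenthesis are ignored. The function name `read_tokens` and the sample lines are hypothetical.

```python
def read_tokens(text):
    """Return the coding string of every token line in a token file.

    Assumption: the coding string is the run of non-blank characters that
    follows the open parenthesis; anything after it is free text.
    """
    tokens = []
    for line in text.splitlines():
        if line.startswith("("):
            parts = line[1:].split()
            if parts:  # skip a bare "(" with no coding string yet
                tokens.append(parts[0])
    return tokens


# Hypothetical lines in the style described for Figure 1.
sample = """free text that is not a token
(1ns  first example
(0dc  second example
(1as
"""

for token in read_tokens(sample):
    marker, category, position = token[0], token[1], token[2]
    print(marker, category, position)
```

Each coding string is then simply indexed by column: the first character is the plural marker, the second the grammatical category, the third the noun phrase position.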


Figure 2.

The key output of a variable rule analysis consists of a list of numbers, one associated with each factor which may affect the variable being studied. Figure 2 contains part of a GoldVarb output file (see §7 for more detail) from an analysis of the data on Nepean plural marker expression of which the tokens in Figure 1 form part. The variable is plural marker presence or absence, the factors are "a", "n", "d", "c" and "s", grouped into two factor groups, one containing "a", "n" and "d" and the other "c" and "s". The numbers are factor weights, indicating in this instance the degree to which the factor favours marker presence rather than deletion. Higher numbers indicate that the corresponding factor favours marker presence more than the other factor(s) within the same factor group. The input is a kind of average tendency for marker presence. The precise interpretation of all these numbers can be found in the document "Variable Rules"[2].
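This manual leaves the interpretation of these numbers to "Variable Rules"[2]. As a rough orientation only, the following Python sketch assumes the logistic form commonly used in variable rule analysis, in which the input and the factor weights are combined on the odds scale. The function name and the numerical values are hypothetical and are not taken from Figure 2.

```python
def predicted_probability(input_value, weights):
    """Combine an input and one factor weight per group on the odds scale.

    Assumption: the logistic variable rule model, with every value
    strictly between 0 and 1.
    """
    odds = input_value / (1.0 - input_value)
    for weight in weights:
        odds *= weight / (1.0 - weight)
    return odds / (1.0 + odds)


# A weight of 0.5 leaves the input unchanged; a weight above 0.5 raises
# the predicted probability and a weight below 0.5 lowers it.
print(predicted_probability(0.6, [0.5, 0.5]))
print(predicted_probability(0.6, [0.7, 0.5]))
print(predicted_probability(0.6, [0.3, 0.5]))
```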


§3. Getting Started

The different icons associated with different types of GoldVarb files are illustrated in Figure 3. The program comes with a sample token file, entitled "Nepean.Tok", which we will use to illustrate its functioning. In this section we will show how to get from input to output in the simplest cases. In succeeding sections we discuss the various steps and options in more detail.


Figure 3.

Open the file Nepean.Tok by double-clicking on its icon; this will automatically open GoldVarb. The screen should appear as in Figure 4. Nepean.Tok contains data on Nepean plural expression; indeed Figure 1 portrays part of this same file. The figure in the lower left corner of its window indicates the largest block of memory still available to GoldVarb [5]. Ignore for the moment the Factor specification box associated with the token file. Token files and factor specifications will be discussed in more detail in §4.

We will use the information contained in Nepean.Tok as is. That is, we will not recode any of the factors. Thus we choose No recode in the Tokens menu. A dialogue box will appear asking for a name for a condition file. Type "Nepean.Cnd" (or whatever name you wish) and click Save. The program will produce a default condition file as in Figure 5. As with the token document, available memory is indicated in the lower left corner of the window. The second line of this document is just a comment, as indicated by the semi-colon in the first column. Note that the token file remains open. Condition files will be discussed in more detail in §5.


Figure 4.


Figure 5.

The next step is to construct a cell file by choosing Load cells to memory... in the Cells menu[6]. A series of dialogue boxes will appear to establish options and to name files. The first simply requires confirmation that the cells be constructed on the basis of the currently open token and condition files. After clicking Yes, the user is requested to name the cell file, for example "Nepean.Cel". The tokens in Nepean.Tok are counted as they are checked for consistency with the values declared in the Factor specification box discussed in §4. A message appears when this is complete. Click OK to continue. A third dialogue box then appears, asking about application and non-application values. Ignore this for the moment; it will be discussed in §5. Just click OK to accept the default values furnished by the program. Finally, a fourth dialogue box asks for a name for a result file. After the user clicks Save, GoldVarb will construct a cell file by combining all those tokens which are identically coded on the independent variables. This file, Nepean.Cel, appears in Figure 6. The format of a cell file is discussed in §6.
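The combining of identically coded tokens into cells can be sketched in Python as follows. This is an illustration of the idea, not GoldVarb's own code. It assumes that no recoding is done, that the dependent variable is the first character of each coding string, and that a single application value has been chosen. The name `build_cells` is hypothetical.

```python
from collections import Counter


def build_cells(tokens, application_value):
    """Group tokens by their independent factors.

    Returns {independent_factors: (applications, total)}.
    """
    applications = Counter()
    totals = Counter()
    for token in tokens:
        dependent, independent = token[0], token[1:]
        totals[independent] += 1
        if dependent == application_value:
            applications[independent] += 1
    return {key: (applications[key], totals[key]) for key in sorted(totals)}


# Hypothetical tokens: three are coded "ns" on the independent variables.
cells = build_cells(["1ns", "0ns", "1ns", "0dc"], application_value="1")
print(cells)
```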


Figure 6.

At the same time as the cell file is being constructed, certain information is written to the result file, Nepean.Res. This includes the date and time, the name of the token file, the condition file in its entirety and various statistics computed for Nepean.Tok, Nepean.Cnd and Nepean.Cel. See Figure 7.


Figure 7.

Finally, choose Binomial, 1 level in the Cells menu. This will carry out the actual variable rule analysis on Nepean.Cel. This and other options will be discussed in §7. The results of the analysis are written to Nepean.Res, which already contains some results from the cell construction procedure. See Figure 8. In addition, GoldVarb draws a scattergram, shown in Figure 9, comparing the proportion of plural markers actually expressed in each cell of Nepean.Cel with the proportion predicted by the statistical model constructed by the program.

In this section we have followed the pathway from token data (contained in the token file) through cell file (constructed according to definitions in the condition file), to the results of variable rule analysis (contained in the result file). In the following sections we will examine each of these types of files in more detail.


Figure 8.


Figure 9.


§4. Token Files

In §3 we saw how to use a previously saved token file. It may be opened by double-clicking on its icon (this starts up GoldVarb), or by choosing Open... from the File menu after GoldVarb has been started. We will use the term native to refer to a file created or modified by GoldVarb. Such a file will be represented in the Finder by one of the four distinctive GoldVarb document icons (see Figure 3).

The user who begins working with GoldVarb without having a native token file will need to proceed in one of two ways:

(a) Use the New... command to create a new empty document into which tokens will be typed, or

(b) Use the Import... command to import a non-native token file, i.e. a file created by another application.

These two approaches are discussed in §4.1 and §4.2. In version 2 of GoldVarb, a new feature, discussed in §4.3, allows the creation of a new token file from existing token and condition files. Finally in §4.4 we discuss another new feature introduced in version 2: searching and replacing factors in the token file.

§4.1 Creating a New Token File

When New... is chosen from the File menu, the user is first given the opportunity to name the document which then appears on the screen with the Factor specification box below it. The document will be empty except for a single open parenthesis, prompting the user to type a token. Here are some guidelines for the entering of tokens:


Figure 10.

In addition to the tokens, the user must enter the factor specifications in the appropriate box[7]. These are a set of declarations indicating which factors may appear in each group (i.e. each column of the token file) as well as the total number of groups. Here are some guidelines for the entering of factor specifications:

After both tokens and factor specifications have been entered, the user should choose Check tokens from the Tokens menu. This command (which is executed automatically when cells are created from tokens and conditions) determines whether all tokens contain only legal factors, extends any short token with the appropriate fill character, and replaces the character "." by the appropriate default value.
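The three actions of Check tokens can be sketched in Python as follows. This is an illustration under stated assumptions. The list of dictionaries is a hypothetical stand-in for the Factor specification box, with one entry per group giving its legal factors, its default value and its fill character. The sketch also assumes that the fill character is accepted without being declared as a factor, and it ignores any characters beyond the last declared group.

```python
def check_token(token, groups):
    """Return (normalised_token, errors) for one coding string.

    errors is a list of (column_number, offending_character) pairs.
    """
    chars = list(token)
    errors = []
    for index, group in enumerate(groups):
        if index >= len(chars):
            chars.append(group["fill"])       # extend a short token
        elif chars[index] == ".":
            chars[index] = group["default"]   # "." stands for the default
        factor = chars[index]
        if factor not in group["legal"] and factor != group["fill"]:
            errors.append((index + 1, factor))
    return "".join(chars), errors


# Hypothetical specifications for the three Nepean groups.
specs = [
    {"legal": "10", "default": "1", "fill": "/"},
    {"legal": "and", "default": "n", "fill": "/"},
    {"legal": "cs", "default": "c", "fill": "/"},
]

print(check_token("1.", specs))
print(check_token("1xs", specs))
```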

§4.2 Importing a Non-Native Token File

If the user has already prepared a token file using a program other than GoldVarb, this file may be processed by starting up GoldVarb and using the Import... command. This command, unlike Open..., allows the user to open any text file created by any application. Assuming that the format of this file conforms to the guidelines listed in §4.1, there is a short-cut which avoids the sometimes laborious task of entering the factors in the Factor specification box. When the user chooses Generate factor spec's... from the Tokens menu and clicks Yes in the ensuing dialogue box, GoldVarb scans the tokens and builds up a list of factor specifications based on the occurrences of factors in the token file. We can think of this command as a sort of inverse of the command Check tokens discussed above. That is:

It is far easier to check the factor specifications by eye than to enter them or, of course, to check the tokens by eye.
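The scan performed by Generate factor spec's... can be pictured with the following Python sketch. It is not GoldVarb's own code. It assumes one single-character factor per column and lists the factors of each group in order of first appearance. The function name is hypothetical.

```python
def generate_factor_specs(tokens):
    """Collect, column by column, the factors that occur in the tokens."""
    groups = []
    for token in tokens:
        for index, factor in enumerate(token):
            if index == len(groups):
                groups.append([])          # first token reaching this column
            if factor not in groups[index]:
                groups[index].append(factor)
    return ["".join(group) for group in groups]


# Hypothetical tokens; the result lists one string of factors per group.
print(generate_factor_specs(["1ns", "0dc", "1as"]))
```

A mistyped factor in the tokens shows up as an unexpected character in one of these short lists, which is why the lists are easy to check by eye.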

If an imported file is saved by GoldVarb, it becomes "native", i.e. it adopts the appropriate GoldVarb document icon as if it had been created by the program.

§4.3 Creating a Token File from an Existing Token File

Normally one uses a token file, subject to recodings defined in a condition file, in order to create a cell file. However, a new feature of GoldVarb version 2 allows one to recode a token file in order to create not a cell file but a new token file. This is convenient if one wishes to do several recodings of the same data using complicated sets of conditions which differ only slightly. A single initial recoding can be used to create a new set of data (tokens) to which much simpler conditions will subsequently be applied.

To use this feature, choose Recode to new token file... from the Tokens menu. When the recoding has been completed, the old token and condition files will be closed and the newly created token file will be opened on screen.

§4.4 Searching and Replacing

For the purposes of preparing and correcting the raw data, i.e. the tokens, GoldVarb provides several search and replace functions. These are accessible via the last two items in the Tokens menu. When the item Search & Replace... is chosen, the dialogue box shown in Figure 11 will appear on screen. As the title of this little window implies, the search and replace functions apply only to the token document, and searching is columnar, i.e. GoldVarb will search for a factor or factors starting only in a particular column. This dialogue contains four buttons, a pair of arrows used to change the column number, and two slots for entering the string to be sought and its replacement if any. If, for example, one wishes to search for the factor "s" in the third group, then type "s" in the first rectangular slot, and then use the little arrows in the bottom right corner to set the column number to 3. To begin searching, click the button Find, or use the command key `. The information about what is being sought will also appear in the last item in the Tokens menu illustrated in Figure 24.

The search example just described is very simple, involving only a single character. A less trivial example would be to search for, say, the string "nc" starting in column #2 -- that is, to search for the simultaneous occurrence of the factor "n" in the second group and the factor "c" in the third group.

Here are some guidelines for searching:

Let us suppose that the user wishes to replace all occurrences of the factor "s" in group #3 by "ð". To do so, they would type this latter character in the second rectangular slot in Figure 11 and then hit the button Change all.... The Replace All dialogue box of Figure 12 will then appear, asking the user to confirm or cancel the requested operation. If the user chooses Continue, then GoldVarb will proceed to replace all such occurrences, and the dialogue box (called the monitor) in Figure 12 will remain open until the operation is completed. (If the document is very long, it is possible to switch to another application under the MultiFinder and let GoldVarb continue in the background. The user will be informed when GoldVarb has completed its task.)
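Columnar searching and replacing can be sketched in Python as follows. This is only an illustration: the function names are hypothetical, column numbers start at 1 as in the dialogue box, and the tokens are represented as bare coding strings.

```python
def find_in_column(tokens, sought, column):
    """Return the positions of the tokens in which `sought` starts at `column`."""
    start = column - 1
    return [i for i, token in enumerate(tokens) if token.startswith(sought, start)]


def replace_in_column(tokens, sought, replacement, column):
    """Replace `sought` by `replacement` wherever it starts at `column`."""
    start = column - 1
    end = start + len(sought)
    return [
        token[:start] + replacement + token[end:]
        if token.startswith(sought, start)
        else token
        for token in tokens
    ]


tokens = ["1ns", "0nc", "1dc", "0ns"]
print(find_in_column(tokens, "s", 3))          # factor "s" in the third group
print(find_in_column(tokens, "nc", 2))         # "n" in group 2 and "c" in group 3
print(replace_in_column(tokens, "s", "ð", 3))  # the Change all... example
```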


Figure 11.


Figure 12.

The buttons Change and Change & Find in Figure 11 are used for replacing one occurrence at a time. The former is undoable, using Undo in the Edit menu, as if the replacement had been typed from the keyboard. The latter is not undoable and is equivalent to performing Change followed by Find.

Note that after performing Change all..., only the last modification may be undone using the Undo command.

When using any of these four functions, the text in the token document scrolls automatically so that each occurrence becomes visible as it is found.


§5. Condition Files

A condition file is necessary when generating cells from a token file. It specifies which groups to use as dependent and independent variables, and how to recode the tokens. By specifying factor groups and recodes within the condition file, the user is able to select and modify the coding strings used to build cells without changing the data in the token file. Since only those factor groups specified in the conditions are used to build cells, some factor groups within the coding string can be used, for example, as flags for sorting the data rather than as variables for variable rule analysis. The recodes allow the user to select the factor groups to be used both for the dependent variable and for the independent variables, to select the tokens to be used to build cells, to combine factors within a factor group, to create new factors within a factor group, and to create new factor groups.

As with token files, a condition file may be created by GoldVarb using New..., or it may be created by another program and imported into GoldVarb using Import.... Many users find it difficult to master the syntax required for the construction of a condition file. Thus GoldVarb includes a simplified procedure for this task -- yet another way of creating a condition file -- implemented through the Recode setup... command in the Tokens menu. There is also the No recode option discussed in §3.

§5.1 Using the "Recode setup..." dialogue box

We illustrate some simple recodings on the Nepean plural data. Open Nepean.Tok and choose the Recode setup... command in the Tokens menu. After asking the user to enter a name for the condition file which GoldVarb will generate (we suggest "a & n vs. d" for reasons which will soon be evident) the recoding dialogue box in Figure 13 appears.

On the left side of this box we see a list of the groups and factors in the token file. A list of recoded groups will be built up on the right side. The six buttons in the middle (discussed in Table 1 below) perform various forms of recoding; they are all initially deactivated because none of the groups has as yet been selected. GoldVarb uses the empty space at the bottom of the dialogue box to display occasional messages telling the user what to do next.

Suppose we have reason to ignore the distinction between adjectives and nouns in their effect on plural expression, and to concentrate on the difference between the determiners and the other two categories. Click #1 in the column on the left of the left hand list, and then click Copy. The group with factors 1 and 0 then appears in the right hand list. Now click #2, followed by Recode. The group with factors a, n and d appears in the right hand list, with the a flashing. Type a. Now the n starts flashing. Type a again to indicate that the nouns are being reclassified with the adjectives. The screen should look like Figure 14, with the d flashing. Type d. Finally, copy the #3 factor group to the right hand list and click OK. GoldVarb automatically generates a new condition file, shown in Figure 15.
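The effect of this recoding on a single token can be sketched in Python as follows. The sketch mirrors what the recoding does, not the syntax of the condition file that GoldVarb generates. The function name and the representation of a recode as (source column, mapping) pairs are hypothetical.

```python
def recode_token(token, recodes):
    """Build a recoded token, one character per recoded group.

    recodes is a list of (source_column, mapping) pairs, columns counted from 1.
    """
    return "".join(mapping[token[column - 1]] for column, mapping in recodes)


# "a & n vs. d": copy group 1, merge n into a in group 2, copy group 3.
a_and_n_vs_d = [
    (1, {"1": "1", "0": "0"}),
    (2, {"a": "a", "n": "a", "d": "d"}),
    (3, {"c": "c", "s": "s"}),
]

print(recode_token("1ns", a_and_n_vs_d))
print(recode_token("0dc", a_and_n_vs_d))
```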


Figure 13.


Figure 14.


Figure 15.

Let us consider another example using the Recode setup... command. We begin in the same way as above, assigning a file name (here we suggest "Nouns with subjects") and then copying group #1 from the left to the right side. But this time let us construct a new factor group by combining groups #2 and #3 into two new factors: x will represent tokens which consist of a noun in subject position (i.e. those tokens with n in column 2 and s in column 3) and y will represent all other tokens. To do this, select groups #2 and #3 on the left, then click in the AND button to indicate that we are interested in simultaneity of factors in these two groups. Still on the left side of the box, select the factors n and s and type x which will appear as the first factor in group #2 on the right. Since this is the only combination required, click in the space at the bottom as instructed.

The dialogue box should now appear as in Figure 16 with the small black rectangle beside the x flashing, indicating that we must type one more letter as the recode value for all other combinations, that is, all tokens which do not consist of a noun in subject position. Type y, and we are done. Click in the OK button. The dialogue box disappears and the newly generated condition file (Figure 17) appears on screen.
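In the same spirit, the AND combination just constructed behaves like the following Python sketch. Again this is not condition file syntax, and the function name is hypothetical.

```python
def nouns_with_subjects(token):
    """Recode a three-column Nepean token into two groups.

    Group 1 is copied; the new group 2 is "x" for a noun in subject
    position (n in column 2 AND s in column 3) and "y" otherwise.
    """
    dependent = token[0]
    combined = "x" if token[1] == "n" and token[2] == "s" else "y"
    return dependent + combined


for token in ["1ns", "0nc", "1ds"]:
    print(token, "->", nouns_with_subjects(token))
```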

A word about the numbering of groups: In Figures 14 and 16 the groups on the right side of the dialogue box ("Groups after recoding") are numbered consecutively starting at 1. The number appearing in square brackets indicates the group's origin, i.e. the group's number before recoding. Thus the notation "1[1]" means that recoded group number one was taken from group #1 on the left side. If the letter "n" appears instead of a number, this indicates that the group is a new one which did not exist as such in the token file. Thus the notation "2[n]" in Figure 16 indicates that the second recoded group was constructed from a combination of several groups (in this case, two) on the left. In the resulting condition files (Figures 15 and 17), the original group number is shown at the beginning of each set of recode conditions. The number "0" is used for new groups, as shown in Figure 17.


Figure 16.


Figure 17.

Finally, it should be noted that the Recode setup... dialogue allows easy construction of only the simplest and most common forms of recode. More complicated recodes must be entered by typing directly into the condition file window. Nevertheless, the two methods may be combined -- i.e. one may generate a preliminary condition file using the dialogue and then modify or extend it by typing.

See Table 1 for an explanation of the six buttons in this dialogue box.

  >> Copy >>
      Copy the selected group(s) from the left to the right.

  Exclude
      Exclude the factor(s) selected on the left. The condition file will contain recoding conditions which will exclude from the cells all tokens which contain an excluded factor.

  >> Recode >>
      Recode the group selected on the left and insert the resulting group on the right.

  >> AND >>
      Combine the two or more groups selected on the left into a single new group on the right, using the predicate AND.

  >> OR >>
      Combine the two or more groups selected on the left into a single new group on the right, using the predicate OR.

  Remove <<
      Remove the group(s) which is/are selected on the right. This is useful for correcting errors.

Table 1: Buttons in the "Recode setup..." dialogue box

NOTE:

* On each button, the arrows, if any, indicate the direction of the operation which the button performs.

* The list on the left side of the Recode setup... box shows the groups contained in the token file, as specified in the Factor specification box. The list on the right side contains the recoded groups, i.e. the groups which will be generated by the condition file which this dialogue box builds.

* A group or factor is selected or de-selected by clicking with the mouse. A factor is selected if it is shown on a black background, while a group is selected if its group number is shown on a black background. It is possible, for example, to select a group without selecting any of its factors, but the opposite is not possible. At most one factor at a time may be selected in a selected group.

* A button is activated only if an appropriate group, set of groups, factor, or set of factors has been selected. For example, the Copy button is activated when one or more groups are selected on the left, while Remove is activated only if one or more groups are selected on the right.

§5.2 The Application Values

Along with the condition file, GoldVarb must be told which values of the dependent variable (the first recoded group) are pertinent for constructing cells. During the final stages of cell creation, i.e. just after processing the token and condition files, GoldVarb will display a dialogue box in which the user is asked to enter the application values. If only one value is entered, then this will be the application value, and all other factors in the group will be counted as non-applications. If more than one value is specified, only these values will be used, while recoded tokens with any other factor in column one will be ignored. The maximum number of values of the dependent variable is 9.

Example: Suppose the dependent variable has factors "abcd" (after recoding). Consider the following choices:

  "a"     binomial case; a = application; b, c & d = non-application.
  "ab"    binomial case; a = application; b = non-application; c & d omitted.
  "abc"   trinomial case; d omitted.

If the user wishes to use any other combination of factors (e.g. a & b applications, c & d non-applications), this must be done by first recoding.
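The rules in this example can be restated as a small Python sketch. The function name and the returned labels are hypothetical. With three or more values the sketch simply returns the factor itself as the variant.

```python
def classify(factor, application_values):
    """Classify one dependent-variable factor given the declared values.

    Returns "application", "non-application", the factor itself in the
    multinomial case, or None when the token is omitted.
    """
    if len(application_values) == 1:
        # A single value: everything else counts as a non-application.
        return "application" if factor == application_values else "non-application"
    if factor not in application_values:
        return None  # token ignored
    if len(application_values) == 2:
        # Following the "ab" example: the first value is the application.
        return "application" if factor == application_values[0] else "non-application"
    return factor


for values in ["a", "ab", "abc"]:
    print(values, [classify(f, values) for f in "abcd"])
```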

Variable rule computations are possible only for the binomial case (in the current version of GoldVarb). However, cell file creation and cross-tabulation are possible in all cases. The adjustment of condition files to eliminate knockout and singleton factors is discussed at the end of §6. Condition file syntax is discussed in further detail in Appendix 1.


§6. Cell Files

A native cell file is created by GoldVarb from a token file and a condition file using the Load cells to memory... command in the Cells menu. A previously saved cell file may be opened by double-clicking on its icon (this starts up GoldVarb) or by choosing Open... in the File menu after starting GoldVarb. However, as with token and condition files, the user can create a new cell file by choosing New... and then typing directly into the new window, or a non-native cell file created elsewhere may be imported using the Import... command. In order to facilitate data importation, the format of GoldVarb cell files has been chosen to be compatible with the format used by Susan Pintzuk's IBM-PC programs.

A cell file, such as the one depicted in Figure 6, consists of the following parts:

  1. On the first line, the number of variants (up to 9) of the dependent variable in column 1, followed immediately by a list of these variants.
  2. On the second line, the number of factor groups, not including the dependent variable, right-justified in columns 1 and 2.
  3. One line for each of these factor groups, the number of factors in the group right-justified in columns 1 to 4, immediately followed by a list of the factors in the group.
  4. In the subsequent lines, the cells, each cell occupying two lines.
  5. The end of the cell list is indicated by the value -1 in columns 3 and 4.

When GoldVarb creates a cell file from tokens and conditions, the date and time and the names of the token and condition files are inserted at the end of the list of cells, i.e. after the line containing the "-1".
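The header of a cell file (parts 1 to 3 above) can be read with a few lines of Python, sketched below. The sample header is hypothetical and merely follows the column positions listed above; it is not copied from Figure 6. The cell lines themselves are not parsed, since their layout is not specified in this section.

```python
def parse_cell_header(lines):
    """Return (variants, groups) from the first lines of a cell file."""
    n_variants = int(lines[0][0])              # column 1
    variants = lines[0][1:1 + n_variants]      # the variants follow immediately
    n_groups = int(lines[1][:2])               # right-justified in columns 1-2
    groups = []
    for line in lines[2:2 + n_groups]:
        n_factors = int(line[:4])              # right-justified in columns 1-4
        groups.append(line[4:4 + n_factors])
    return variants, groups


# Hypothetical header: two variants, then two independent factor groups.
header = ["210", " 2", "   3and", "   2cs"]
print(parse_cell_header(header))
```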

With some data sets a knockout factor or a singleton may be flagged by the program beside the tabular results (cf. Figure 7) which show the counts of factors in the cells. A knockout is a factor for which applications occur with frequency 0% or 100%. A singleton is a group which contains only one factor. Variable rule computations cannot logically be performed on a cell file which contains knockouts or singletons. One must generate a new cell file by recoding the original tokens using a different condition file.
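The two definitions can be made concrete with the following Python sketch, which is not GoldVarb's own procedure. It assumes that the cells are held as a dictionary mapping each string of independent factors to a pair (applications, total), and it numbers the independent groups from 1. The function name is hypothetical.

```python
def find_problems(cells, n_groups):
    """List the knockouts and singletons in a set of binomial cells."""
    problems = []
    for g in range(n_groups):
        counts = {}
        for factors, (apps, total) in cells.items():
            a, t = counts.get(factors[g], (0, 0))
            counts[factors[g]] = (a + apps, t + total)
        if len(counts) == 1:
            # Only one factor ever occurs in this group.
            problems.append(("singleton", g + 1, next(iter(counts))))
        for factor, (a, t) in sorted(counts.items()):
            if a == 0 or a == t:
                # Applications occur 0% or 100% of the time with this factor.
                problems.append(("knockout", g + 1, factor))
    return problems


# Hypothetical cells: "d" and "c" never co-occur with an application.
print(find_problems({"ns": (2, 3), "dc": (0, 1), "as": (1, 2)}, n_groups=2))
```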


§7. Variable Rule Analysis

Variable rule analysis can be performed provided that:

  1. cells have been successfully loaded into memory (either generated from tokens and conditions or read from a previously created cell file), with neither knockouts nor singletons, and
  2. only one or two values have been declared in the Choose application value(s) dialogue box. This is the binomial case [9]. When only one application value is declared, this is still binomial since the factors in the first recoded group are split into two sets: the factor which is the application value, and all other factors, which count as non-applications.

In the Cells menu, the items which perform variable rule analysis are "Binomial, Up & Down" and "Binomial, 1 level". The latter performs an analysis based on all groups and all cells. An example of this was discussed in §3 and illustrated in Figures 8 and 9.

In Figure 8, the line starting with "Iterations" keeps track of GoldVarb's progress in finding the "maximum likelihood" estimation of the factor weights to a certain degree of accuracy, at which point "convergence" is indicated. If the number of iterations reaches 20 without convergence, no further iteration is attempted and the current values of the estimates are presented [10].

A new feature of GoldVarb is the option of comparing the log likelihood of a run and the maximum possible value of such a likelihood. The usefulness of this test remains to be evaluated.

The scattergram (Figure 9) drawn at the completion of a 1-level analysis may be printed, or it may be copied to the clipboard and subsequently pasted into the Scrapbook or into a document of a graphics application in order to save it. Further, while the scattergram window remains open on screen, detailed information about any data point in the scattergram may be obtained (and optionally written to the result document) by positioning the cross-hair cursor over the point and clicking with the mouse button. The size of each point is proportional to the number of tokens in the corresponding cell(s), so that a large point far from the diagonal suggests interaction among its factors.

For compatibility with some previous variable rule programs, GoldVarb also displays, at the end of a 1-level analysis, the Chi-square contribution from each cell as well as the average Chi-square per cell.
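The manual does not give the formula used for these values. The following Python sketch assumes the usual Pearson form for a binomial cell, in which the squared difference between observed and expected applications is divided by N p (1 - p). The function name and the numbers are hypothetical.

```python
def chi_square_per_cell(cells):
    """Return (contributions, average) for a list of binomial cells.

    Each cell is (applications, total, predicted_probability), with the
    predicted probability strictly between 0 and 1.
    Assumption: Pearson chi-square, (apps - N p)^2 / (N p (1 - p)).
    """
    contributions = []
    for apps, total, p in cells:
        expected = total * p
        contributions.append((apps - expected) ** 2 / (expected * (1.0 - p)))
    return contributions, sum(contributions) / len(contributions)


# The first cell matches its prediction exactly; the second does not.
contributions, average = chi_square_per_cell([(5, 10, 0.5), (8, 10, 0.5)])
print(contributions, average)
```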

Binomial, Up & Down performs a step-by-step analysis, at each level of which only a subset of the factor groups are included and cells are contracted by combining together in one cell all those which differ only in excluded groups. At level 0 no groups are included so the cell list contracts into a single cell, at level 1 only one group is included, at level 2 two groups, etc. GoldVarb begins at level 0 and steps up until it finds no group whose inclusion would significantly (p < 0.05) increase the log likelihood. GoldVarb then starts again but at level "n" (n = the number of independent groups) at which all groups are included and all cells are used without contraction (as in the Binomial, 1 level analysis), and steps down to lower levels until it can no longer find a group whose exclusion does not significantly decrease the log likelihood.
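The contraction of cells at a given level can be sketched in Python as follows. Only the contraction is shown, not the significance test. Cells are assumed to be held as a dictionary from strings of independent factors to (applications, total) pairs, and the included groups are given as positions counted from 0. These conventions and the function name are hypothetical.

```python
def contract_cells(cells, included):
    """Merge all cells which differ only in the excluded groups."""
    contracted = {}
    for factors, (apps, total) in cells.items():
        key = "".join(factors[i] for i in included)
        a, t = contracted.get(key, (0, 0))
        contracted[key] = (a + apps, t + total)
    return contracted


cells = {"ns": (2, 3), "nc": (1, 4), "ds": (0, 1)}
print(contract_cells(cells, []))      # level 0: a single cell
print(contract_cells(cells, [0]))     # level 1: only the first group
print(contract_cells(cells, [0, 1]))  # level n: no contraction
```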

The results from the "Best stepping up run" (see Figure 19) are usually identical to those from the "Best stepping down run". When they are not identical, this indicates some uncertainty about the status of the factor groups included in one analysis but excluded from the other.

Figure 18 shows the Macintosh screen during a Binomial, Up & Down analysis. In addition to the menu bar, the figure shows two windows. The large window is the result document. The small window above it is a dialogue box, called the monitor, indicating the status of the computation and including buttons which allow the user to cancel or temporarily to suspend the analysis. The horizontal bar is filled in as the computation proceeds. In the figure it is less than half filled, since the step-up part has not yet been completed.

As the analysis in Figure 18 proceeds, the cursor rotates, imitating a rolling beach ball. If the user hits the Pause button, the analysis will be suspended and the cursor will change form until the computation is either resumed by hitting Continue or cancelled by hitting Cancel[11].

If the program is running under the MultiFinder, then during a variable rule computation (or while pausing), the user may switch to another application by clicking in the little GoldVarb icon in the upper right corner of the screen. This allows one to use the computer for other purposes while the analysis continues in the background. The pause feature is useful in order to allow another application near-exclusive use of the CPU on a temporary basis. If GoldVarb completes its work in the background, the user will be alerted. The user may switch back to GoldVarb by clicking in any of its windows (e.g. the monitor or the result document), or by clicking (possibly more than once) in the little icon in the top right corner of the screen, or by using the Apple menu.

In addition to the three buttons just described, the monitor in Figure 18 also contains a small control giving access to GoldVarb's Auto-save feature. This feature is equivalent to the option "Automatically save textual results" which appears in the Editing options... dialogue box accessible via the Edit menu discussed in §8.3. It is included in the monitor because the menus are inaccessible during variable rule analysis.

Figure 19 shows the result document as it appears immediately after completion of this Up & Down analysis. Unlike the 1-level case, a scattergram is not drawn at the completion of an Up & Down analysis. If a scattergram is desired, a recoding must be done so that only the appropriate groups are used to make cells for a Binomial 1-level analysis.


Figure 18.


Figure 19.


§8. Using the Menus

In this chapter we describe each of GoldVarb's eight menus.

§8.1 The Apple Menu

In addition to desk accessories, this menu contains the item About GoldVarb... which, when chosen, displays a box containing information about the program and its authors. Some general documentation summarizing this manual is available by clicking the Help button in this box. More specific information about the various GoldVarb windows is available through the Info & Help... command in the Window menu, discussed in §8.6.

§8.2 The File Menu

The File menu, illustrated in Figure 20, contains commands for opening, closing and printing files.


Figure 20.

The commands New..., Open..., Import..., Save... and Save as... are for the three types of data files (tokens, conditions and cells) and for result files. With Open... only files created by this program can be opened, while Import... allows one to open any text file created by any application. The length of documents is limited only by the amount of memory (RAM) available. When the user closes a GoldVarb file which has been modified, or when Save... is chosen from the File menu, the ensuing dialogue box displays the icon of the document so one can tell at a glance what type of file is to be saved.

When a token file is opened, the Factor specification dialogue box which appears is for entering groups and factors which will be used to check the tokens before recoding. On the other hand, the groups and factors listed at the beginning of a cell file are only the independent groups obtained after recoding. For information about the format for entering data, see the Info & Help... command in the Window menu.

The Close command applies to the active (topmost) window, which may be one of the three data types mentioned above, or some other document such as the clipboard, the windows for displaying results (there are two: one for textual results, the other for pictorial results), the window which displays documentation, or the dialogue box for searching.

Only text results, not pictorial, can be saved to a disk file with this version of GoldVarb. However, a picture -- such as a scattergram -- may be printed. It may also be copied to the clipboard and then pasted into the Scrapbook, or into a document in another application such as MacPaint or MacWrite.

The command Print setup... displays a standard dialogue box which allows the user to select some basic options, such as the page orientation, for the printer which is currently chosen.

GoldVarb has two main printing functions. The command Print selection... (shown chosen in Figure 20) is used to print the selection in a text document. If no text is selected, then this function is disabled. The next command Print document... is used to print an entire text or pictorial document. Either item will cause the dialogue box of Figure 21 to be displayed. In this figure the items in the dotted rectangle are proper to GoldVarb while the rest depend on the type of printer currently in use. The button Page layout... gives access to another dialogue box (not illustrated here) which allows the user to change the margins and choose a header and/or a footer. If the option "Preview printing, page-by-page" is chosen, a representation of each printed page will appear on the screen so it can be viewed before being sent to the printer.

A picture will always be printed on a single page, with vertical or horizontal reduction if necessary. When printing text, the number of lines per page depends on the font, font size and font style used to display the document, as well as on the choices of margin, header and footer. The total number of pages of text to be printed is not precalculated in the current version of GoldVarb (hence the question mark in Figure 21) but will be in a future version.


Figure 21.

The command Transfer..., an alternative to Quit, allows one to go directly to another program without going first to the Finder. For example one may wish to transfer to another application in order to process result files created by GoldVarb or to prepare data files (although this can also be done within GoldVarb). Of course, if the Macintosh is operating under the MultiFinder then other programs can run simultaneously with GoldVarb, in which case there is no need to transfer or quit in order to switch applications.

§8.3 The Edit Menu


Figure 22.

GoldVarb allows text editing in the token, condition and cell windows, and optionally in the result document. A single click with the mouse button sets the insertion point for typing. Double-clicking with the mouse button will select a word, while triple-clicking will select a line. If line numbers are displayed in a window in which text-editing is enabled, then several lines may be selected by dragging the cross-cursor over the appropriate line numbers while holding down the mouse button. The clipboard displays either the last piece of text copied (which will be inserted if the Paste command is executed) or the last picture copied, for example a scattergram or a cross-tabulation.

The Edit menu implements the standard text-editing commands, plus a few options. The command Undo (which changes to Redo when appropriate) allows one to undo or redo the last Cut, Copy, Paste or Clear operation, or the last sequence of characters typed from the keyboard, including backspaces. However, if the user leaves a window in order to work in another, and then returns to the first, it will no longer be possible to undo/redo Cut, Copy or Paste, because the contents of the clipboard may have been changed. Keyboard entries and Clear nevertheless remain undoable/redoable.

The appearance of the Undo item in this menu changes, depending on what operation, if any, can be undone or redone. In Figure 22 it reads Undo Copy, indicating that the user has just copied some text to the clipboard. If this copy is undone, then the text previously stored on the clipboard, if any, will be restored. This restoration of the old clipboard also occurs when Cut is undone. However, if the previous contents were a picture, they cannot be restored by undoing a Cut or Copy operation.

The item Line numbers allows the display of line numbers in the left margin of any window containing text. It works like a toggle switch. A check mark appears on the left if it is "on".

Finally, the last item in the Edit menu, Editing options..., causes GoldVarb to display the dialogue box illustrated in Figure 23. This dialogue allows the user to set several text-editing options.

Automatically save textual results
when checked, causes the text in the result document to be automatically written to the appropriate disk file during variable rule analysis. This is useful protection against unexpected interruptions, such as a power failure or a fatal error occurring in another program running simultaneously under the MultiFinder. This option is also available in the monitor as discussed in §7.
Allow editing of textual results
when checked, enables text-editing in the window used for text results. By default this option is chosen, but it may be turned off in order to prevent inadvertent modification of the result document. (No editing is possible in the pictorial result window used for cross-tabulation and scattergrams.)
Automatic indentation
when checked, means that when a carriage return is typed the new line will be indented with the same number of spaces (if any) as in the line just above it. If the "Option" key is held down while the return key is pressed, automatic indentation is temporarily suppressed if currently active, or temporarily activated if not.
Replace tab by ... blanks
when chosen, means that when a tab character is typed in a document, the specified number of blanks is inserted instead of the tab character (ASCII code #9). This has no effect on any tabs already present in the document.
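
The behaviour of the last two options can be sketched in a few lines of Python. This is only an illustration of the rules described above, not GoldVarb's own code; the function names and parameters are hypothetical.

```python
def indent_for_new_line(previous_line, auto_indent, option_key_down=False):
    """Return the leading blanks to insert when Return is pressed.

    Holding the Option key inverts the current setting for this one keypress.
    """
    active = auto_indent != option_key_down
    if not active:
        return ""
    stripped = previous_line.lstrip(" ")
    # Same number of leading spaces as in the line just above.
    return " " * (len(previous_line) - len(stripped))


def text_for_tab_key(replace_tabs, blanks):
    """Return what is inserted when Tab is typed (existing tabs are untouched)."""
    return " " * blanks if replace_tabs else "\t"
```

For instance, with automatic indentation on, pressing Return after the line "    (COL 3 s)" starts the new line with four blanks, while Option-Return starts it at the left margin.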


Figure 23.

§8.4 The Tokens Menu

This menu, enabled only if a token document is open, is illustrated in Figure 24. It contains a variety of commands which allow the user to verify, recode or modify the tokens.

Generate factor spec's...
is used to scan through the tokens in order to determine what factors they contain. These factors will be displayed in the Factor specification dialogue box, replacing any factors currently stored. This is convenient when importing an already existing token file, so that one does not have to type the factor specifications directly into the box.
Show factor spec's
causes the groups and factors (which were entered one-by-one in the dialogue box, or which were generated using the above command) to be displayed in their entirety in the result window.
Set fill character...
allows one to specify the character to use to fill short tokens.
Check tokens
verifies that each column in each token contains a valid factor, replaces the character "." by the default character for the appropriate group, and fills short tokens with the specified character.
Note the difference between the commands Generate factor spec's... and Check tokens described above. The former assumes that the tokens are correct and uses them to determine the number of groups and what factors are used for each group, while the latter command uses the factor specifications in the box in order to determine whether the tokens conform to these specifications.
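
The contrast between the two commands can be illustrated with a rough Python sketch. It is not GoldVarb's implementation: tokens are represented here simply as coding strings (one character per group, without the opening parenthesis), and the function names, the order of the checks and the error format are all assumptions made for the example.

```python
def generate_factor_specs(tokens):
    """Trust the tokens: collect the set of factors seen in each column."""
    n_groups = max(len(t) for t in tokens)
    specs = [set() for _ in range(n_groups)]
    for token in tokens:
        for col, factor in enumerate(token):
            specs[col].add(factor)
    return specs


def check_tokens(tokens, specs, defaults, fill_char):
    """Trust the specifications: clean each token and report invalid factors.

    Returns (cleaned tokens, errors); each error is (line, column, factor),
    with lines and columns counted from 1.
    """
    cleaned, errors = [], []
    for line_no, token in enumerate(tokens, start=1):
        # Fill short tokens with the fill character.
        chars = list(token.ljust(len(specs), fill_char))
        for col, factor in enumerate(chars):
            if col >= len(specs):
                errors.append((line_no, col + 1, factor))  # extra column
                continue
            if factor == ".":
                # "." stands for the default factor of the group.
                chars[col] = factor = defaults[col]
            if factor not in specs[col]:
                errors.append((line_no, col + 1, factor))
        cleaned.append("".join(chars))
    return cleaned, errors
```

With specifications generated from "axn", "byn" and "axm", checking the tokens "a.n", "bzm" and "ay" (defaults "a", "x", "n"; fill character "n") yields "axn", "bzm" and "ayn", and reports the factor "z" in column 2 of the second token.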

No recode
generates a condition file which corresponds to inclusion of all groups and factors from the token file when creating cells.
Recode setup...
is a short cut for constructing a condition file, allowing one to define in a dialogue box several basic kinds of recode without having to learn the syntax for entering conditions. It is discussed in §5.1.
Recode to new token file...
allows the user to create a new token file from an existing token file and a condition file. The old token and condition files will be closed and the newly created token file will be opened on screen. This command is an alternative to the command Load cells to memory... in the Cells menu discussed in §8.5. The latter is used to create a cell file from tokens and conditions.
Note that each of the three commands just described creates a new GoldVarb file. In the menu, the icon which accompanies the command indicates the type of file which will be created. The same is true of Load cells to memory... in the Cells menu.

Finally, the last two items in the Tokens menu deal with searching and replacing as discussed in §4.4.

Search & Replace...
opens the dialogue box illustrated in Figure 11.
Find next
The appearance of the last item changes, depending on what character string, if any, has been entered in the search-and-replace dialogue box. If no such string has been entered, then this item reads Find next, but is disabled. In Figure 24 this item reads Find next "s" @ col. #3 -- this is how it would appear if the search-and-replace dialogue appears as in Figure 11.


Figure 24.

§8.5 The Cells Menu

The Cells menu performs various operations on cells, including variable rule analysis. The menu is divided into four parts. The first part contains four items:

Load cells to memory...
The most important item since it is used either to generate a new cell file from tokens and conditions, or to load cells from a previously created cell file. In either case, counts of the groups and factors in the cells are written to the result document in tabular form.
Show cells in memory
Once cells have been loaded using the previous command, this command causes a list of them to be written to the result document.
Show application values...
Puts up a small dialogue box displaying the application values which are currently in use.
Cross tabulation...
Allows the user to cross-tabulate the factors in any two groups. The table may be displayed either as a "picture" in a separate window as in Figure 26, or as text which will be written to the result document. The pictorial method gives a more attractive table which, although it cannot be saved with the result document, can be printed or exported (see §8.3).


Figure 25.

The second set of items in the Cells menu performs variable rule analysis on the cells currently in memory. Binomial, Up & Down and Binomial, 1 level are discussed in §7. Multinomial variable rule analysis (i.e. with more than two application values) has not yet been implemented. Hence the item Multinomial, 1 level is not functional.

The third part of the Cells menu is used to set various options. Each functions like a toggle switch, with a check mark displayed on the left if the option is "on".

Rapid computation
Greatly accelerates calculations by sacrificing some accuracy. Results obtained with and without this option should nevertheless agree to about the third or fourth digit after the decimal point.
Centre factors
Determines which method will be used to determine the average probability of the factors in a given group. When this option is chosen, each factor in a group is given equal weight. Otherwise each factor is weighted according to its occurrences relative to total occurrences of all factors in the group. (By "occurrences" is meant the sum of applications and non-applications.)
Shut up
An explanation of this option is left as an exercise for the reader.
Show model fit
Causes GoldVarb to display the maximum likelihood and Chi-square fit during variable rule computations involving more than one independent variable.
Finally, the last command in the Cells menu, Show memory info., is of interest only to the technically oriented user. It displays in the result document various information about GoldVarb's use of dynamically allocated memory.
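
The two weighting schemes selected by Centre factors can be compared with a small Python sketch. It only illustrates the difference between equal weighting and weighting by occurrences; the function name is hypothetical, and the manual does not specify on which scale GoldVarb averages the factor values, so the sketch simply averages the numbers it is given.

```python
def group_average(values, occurrences, centre_factors):
    """Average the factor values of one group.

    centre_factors=True  : every factor has equal weight.
    centre_factors=False : each factor is weighted by its occurrences
                           (applications plus non-applications).
    """
    if centre_factors:
        weights = [1.0] * len(values)
    else:
        weights = [float(n) for n in occurrences]
    total = sum(weights)
    return sum(w * v for w, v in zip(weights, values)) / total


# Two factors with values 0.2 and 0.8, occurring 30 and 10 times.
print(group_average([0.2, 0.8], [30, 10], centre_factors=True))
print(group_average([0.2, 0.8], [30, 10], centre_factors=False))
```

In this example the equal-weight average is 0.5, while the occurrence-weighted average is about 0.35, because the more frequent factor pulls the average towards its own value.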


Figure 26.

§8.6 The Window Menu

This menu contains a variety of commands controlling the arrangement of, and giving information about, the windows on the screen.


Figure 27.

§8.7 The Font & Style Menus

The Font menu is used to select the font for the active window. Non-proportional fonts such as Courier or Monaco are recommended for tables so that their columns will be properly aligned. However, if the user does a Cross tabulation... (Cells menu) and chooses "Picture", the columns will be properly aligned regardless of which font is used. For such pictorial results, the font or font size may be changed only by recreating the window's contents; i.e. one must re-execute the appropriate command after choosing a different font or font size.

Using the Style menu, illustrated in Figure 28, one may change the size of the font used to display the text in the active window. The appearance will be best if a size displayed in bold/outline in the menu is chosen. For example, in the figure sizes 9, 10, 12, 14 and 18 will appear best. Font sizes greater than 12 are not recommended for scattergrams.

The Style menu also contains several items which control the style of the text. As with fonts and font sizes, the choice of style applies to the entire document, not just the selection. The style Bold may, for example, be used to improve readability when displaying the Macintosh screen on an overhead projector. The style Condense may facilitate printing a text document containing long lines.


Figure 28.


§9. Glossary

active window
The topmost window. Its title bar is filled in. In an inactive window, the title bar is white except for the title. An inactive window may be made active by clicking anywhere in it. If the active window is a document (as opposed to a dialogue box) then GoldVarb displays in its bottom left corner the amount of memory (RAM) still available to the program. This information is also displayed in the topmost document when it is not active, i.e. when covered by one or more dialogue boxes.
apple menu
The first menu on the left. The first menu item is About GoldVarb... which displays information about the authors and the current version of the program, and allows access to some documentation. This menu also contains desk accessories.
application
1. A computer program.
2. Presence of the appropriate value of the dependent variable, as in "application value".
caret
The tiny vertical bar which flashes indicating the insertion point in a text document in which editing is enabled. It appears only when the selection is of length zero. It is sometimes called the cursor, at the risk of confusing it with the mouse point.
close box
The small square in the upper left corner of document windows and some dialogue boxes. Clicking in this box will close the window.
cursor
The small symbol which follows the movements of the mouse and which changes form depending on what active window, if any, is beneath it. Also called the mouse point. When over the contents of a text document in which editing is enabled, it takes the form of an I-beam and allows the user to change the position of the caret by clicking. When over a scattergram, or the line numbers in a text document, it takes the form of a cross-hair. When over a scroll bar, or when no active window is beneath it, it takes the form of an arrow. During long computations, the cursor takes the form of a rolling beach ball. When a long computation is temporarily suspended, the cursor takes yet another form.
deactivated button
In a dialogue box, a button whose title is drawn in gray (rather than black) and which is not currently functional.
default button
In a dialogue box, the button which is boldly outlined. Pressing the Return key or the Enter key on the keyboard is equivalent to clicking with the mouse in this button.
desk accessory
A small program which can be called using the apple menu and which is available for use without leaving the current application. Examples: Calculator, Alarm Clock, Key Caps.
Finder
An essential component of Macintosh system software, the Finder is the user interface of the Macintosh file system. It is a program which pictorially displays the files on the disk(s) currently in use, with different icons for different types of files, and which provides basic file management operations, for example: placing files in folders (directories), copying files, deleting files, copying disks, launching applications, etc.
grow box
The small symbol in the lower right corner of a document window. The window's size may be changed by placing the arrow cursor in this box and holding the mouse button down while dragging.
monitor
The dialogue box which is displayed during some time-consuming processes (replace all, variable rule analysis) and which displays the status of the process. Its title indicates the type of process. It contains three buttons -- Continue, Pause and Cancel -- which can be used to suspend or cancel the operation. As the operation proceeds (or while pausing), the user may move or resize windows in GoldVarb. If the program is running under the MultiFinder, the user may switch to another application.
mouse point
See cursor.
MultiFinder
A component of Macintosh system software which allows the user to run several different applications simultaneously. For example, GoldVarb, the Finder and a word processing program can be used simultaneously provided that the computer has sufficient memory (RAM). By default GoldVarb requests 512 kilobytes of RAM when running under the MultiFinder, although the user may easily change this by entering the desired amount in the lower right corner of the "Info" window. To open this window, select the GoldVarb application icon and choose Get Info from the Finder's File menu.
native file
A file created or saved by the application. GoldVarb has four kinds of native files, each with its own distinctive icon: token files, condition files, cell files and result files. See Figure 3.
selection
In a text document or in a list in a dialogue box, the selection is shown on a black background (or possibly on a gray or a colour background on a Macintosh II). Selecting is done by clicking and dragging with the mouse. Text editing operations such as Cut, Copy, Paste and Clear apply to the currently selected text; if no text is selected, then the insertion point is indicated by the flashing caret. These comments apply to the active window only. When a GoldVarb text document is inactive, the selection, if any, is outlined; if there is no selection then the caret is invisible. When an inactive text document containing a selection is made active, the outlined region is filled with the appropriate background colour.
title bar
In a document window and in some dialogue boxes, the bar across the top of the window which contains the title. The window's position on the screen may be changed by placing the arrow cursor in this bar and holding the mouse button down while dragging.

Appendix I. More on Condition Files

The following discussion of condition file syntax has been adapted from Susan Pintzuk's documentation of her IBM-PC variable rule programs.

The data within a condition file is in the form of a LISP list. Each element of the list is itself a LISP list consisting of two parts: a group number (column number within the coding string) and an optional set of recode conditions. If no recode conditions are specified, the group is used exactly as it is coded in the token. All groups specified in the condition file, and only those groups, are used to build cells. The first group in the condition file list is used as the group containing the dependent variable. The order of groups specified within the condition file determines the order of groups within the cell.

Each recode condition is again a LISP list consisting of two parts: the first part is the recode value, the value to be used for the group for those tokens which meet the second part of the condition, the test clause. The recode value is either a single character or "NIL". If it is "NIL" then tokens which fulfill the test clause are excluded when building cells. Similarly, if the recode value for the dependent variable group is "/", tokens which fulfill the test clause are also excluded. If the recode value for an independent group is "/", the remaining groups for tokens which fulfill the test clause are used in the construction of cells.

There are five test clause predicates: "AND", "OR", "NOT", "COL" and "ELSEWHERE". Case is irrelevant for predicates, e.g. "OR" and "or" are equivalent. However, case is significant for factors! For example "b" and "B" are two distinct factors.

"AND", "OR" and "NOT" are the standard logical operators; "AND" and "OR" take two to 20 predicates as arguments, "NOT" takes a single predicate as argument. If it is necessary to define a recode condition with more than 20 arguments for "AND" or "OR", two or more of the arguments can be nested more deeply. For example:


	  (AND  a1  a2  a3  a4  a5  a6  a7  a8  a9  a10
		a11 a12 a13 a14 a15 a16 a17 a18 a19 (AND a20 a21 a22 a23 a24)).
"COL" takes two arguments, a group number (i.e. column number within the coding strings) and a single character representing a legal factor value for that group; "COL" is true if and only if that column of the coding string contains the specified value.

"ELSEWHERE" is always true; it is used as the last test clause within a set of test clauses for a group, and forces the recoding of the group to the specified value if none of the previous conditions for that group has been met.

Here is an example of a condition file:


	(
	(4 (d (OR (COL 4 d) (COL 4 c)))
	   (s (ELSEWHERE)))
	(5)
	(3 (/ (OR (COL 3 s) (COL 3 t) (COL 3 u)))
	   (m (OR (OR (COL 3 n) (COL 3 h))
		  (OR (COL 3 1) (COL 3 2) (COL 3 3) (COL 3 w) (COL 3 u)
		      (COL 3 y) (COL 3 p) (COL 3 t) (COL 3 r) (COL 3 x))))
	   (x (AND (OR (COL 3 n) (COL 3 h)) (COL 7 n)))
	   (NIL (ELSEWHERE)))
	; interactive group
	(0 (1 (AND (COL 2 x) (COL 8 a)))
	   (2 (AND (COL 2 x) (COL 8 b)))
	   (3 (AND (COL 2 y) (COL 8 a)))
	   (4 (AND (COL 2 y) (COL 8 b)))
	   (/ (ELSEWHERE)))
	)
Lines with a semi-colon in column 1 are comment lines and are ignored when processing the condition file. In the above example, group #4 is the dependent variable while groups 5, 3, and 0 are the independent variables which will be used to build cells. Group #5 has no recode conditions and therefore will be used exactly as it is coded in the token file. Groups 2 and 8 are interactive factor groups; the interaction is investigated by creating a new group, which is given the number 0.

Note that GoldVarb processes the recode conditions for each group in the order in which they appear in the conditions. The first condition that is satisfied for each token is used for recoding, and the rest of the conditions for that group are ignored for that particular token. For this reason, "ELSEWHERE" should be used only as the last condition for a group -- it is always true, so any conditions listed after it (for the same group) will be ignored. For example, if the conditions for group #5 are:


	(5 (t (ELSEWHERE))
	   (a (COL 5 x))
	   (b (COL 5 w)))
then all tokens will be recoded "t" for group #5, even if they were originally coded "x" or "w".
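
The first-match rule can be made concrete with a small Python sketch that reads a condition list and recodes one group of a coding string. It is an illustration written for this manual, not GoldVarb's own code: the function names are hypothetical, the special recode value "/" is not interpreted, and keeping the original factor when no condition matches an existing group is an assumption.

```python
def tokenize(text):
    """Split condition text into parentheses and atoms, skipping comment lines."""
    lines = [ln for ln in text.splitlines() if not ln.startswith(";")]
    return " ".join(lines).replace("(", " ( ").replace(")", " ) ").split()


def parse(tokens):
    """Build nested Python lists from the token list (consumed in place)."""
    tok = tokens.pop(0)
    if tok != "(":
        return tok
    items = []
    while tokens[0] != ")":
        items.append(parse(tokens))
    tokens.pop(0)  # discard the matching ")"
    return items


def eval_clause(clause, coding):
    """Evaluate one test clause against a coding string (columns count from 1)."""
    op = clause[0].upper()  # case is irrelevant for predicates
    if op == "ELSEWHERE":
        return True
    if op == "COL":
        return coding[int(clause[1]) - 1] == clause[2]
    if op == "NOT":
        return not eval_clause(clause[1], coding)
    if op == "AND":
        return all(eval_clause(c, coding) for c in clause[1:])
    if op == "OR":
        return any(eval_clause(c, coding) for c in clause[1:])
    raise ValueError(f"unknown predicate: {clause[0]}")


def recode(group, coding):
    """Return the recoded factor for one group, or None if the token is excluded."""
    number, conditions = int(group[0]), group[1:]
    for value, clause in conditions:
        if eval_clause(clause, coding):  # the first satisfied condition wins
            return None if value == "NIL" else value
    if number == 0:
        raise LookupError("no condition satisfied for new group 0")
    return coding[number - 1]  # assumption: unmatched tokens keep their factor


bad = parse(tokenize("((5 (t (ELSEWHERE)) (a (COL 5 x)) (b (COL 5 w))))"))[0]
good = parse(tokenize("((5 (a (COL 5 x)) (b (COL 5 w)) (t (ELSEWHERE))))"))[0]
print(recode(bad, "abcdx"))   # t
print(recode(good, "abcdx"))  # a
```

With "ELSEWHERE" listed first, a token coded "x" in column 5 is recoded "t"; with "ELSEWHERE" moved to the end, the same token is recoded "a".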

Similarly, the user should make sure that a condition with recode value "NIL" is placed correctly within the list of conditions. Consider the following two sets of recode conditions for group #3:


	1. (3 (a (COL 3 s))
	      (b (COL 3 w))
	      (NIL (AND (COL 3 s) (COL 4 t))))

	2. (3 (NIL (AND (COL 3 s) (COL 4 t)))
	      (b (COL 3 w))
	      (a (COL 3 s)))
In the first example, any token coded "s" for group #3 will be recoded "a", and the remaining conditions will be ignored for that token; therefore, the third condition can never be met. In the second example, any token coded "s" for group #3 and "t" for group #4 will not be used to build cells; therefore, only tokens coded "s" for group #3 and NOT coded "t" for group #4 will be recoded "a". If the user wishes to eliminate unconditionally certain tokens from the cells, it is suggested that these recodes to NIL be placed as the first recodes within the list for the dependent variable. For example, if the dependent variable is factor group #2, and all tokens coded "s" in group #3 and "t" in group #4 are not to be used to build cells, then the first part of the condition file might look as follows:

	(
	(2 (NIL (AND (COL 3 s) (COL 4 t)))
	   (a (OR (COL 2 x) (COL 2 y))) ...
When creating new groups which do not exist in the original tokens (indicated by a group number of 0, as in one of the examples above), the user should include an "ELSEWHERE" condition if there is a possibility that for some of the tokens, none of the conditions for creating the new group will be true; otherwise, the first such token will generate an error message and cell creation will be aborted.

For those users not familiar with LISP syntax, note that:

  1. The list of conditions, i.e. the entire contents of the file, must be enclosed within a set of parentheses;
  2. Each element of the list, i.e. group number plus optional recode conditions, must be enclosed in parentheses;
  3. Each recode, i.e. recode value plus condition, must be enclosed in parentheses;
  4. Each predicate must be enclosed in parentheses.
Other than the restrictions specified above, the format of a condition file is fairly free: it is not necessary that any particular element appear in any particular position on a line, since parentheses completely determine the structure of the data within the condition file. Individual elements within the file are terminated by a space, open parenthesis, close parenthesis, or end of line. Comment lines may be placed anywhere in the condition file. The close parenthesis for the entire condition file must be the last character within the file, except for spaces, carriage returns, or comment lines. All comment lines must be signalled by a semi-colon in column 1.
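
Since the parentheses alone determine the structure, a misplaced parenthesis is the most likely error in a hand-typed condition file. The following Python sketch checks the outer structure described above. It is a hypothetical helper, not part of GoldVarb, and it cannot cope with a factor that is itself a parenthesis.

```python
def check_condition_text(text):
    """Return a list of problems found in the parenthesis structure (empty if none)."""
    depth = 0
    closed = False  # becomes True once the outermost list has been closed
    for line_no, line in enumerate(text.splitlines(), start=1):
        if line.startswith(";"):
            continue  # comment line: semi-colon in column 1
        for ch in line:
            if ch.isspace():
                continue
            if closed:
                return [f"line {line_no}: text after the final close parenthesis"]
            if ch == "(":
                depth += 1
            elif ch == ")":
                depth -= 1
                if depth < 0:
                    return [f"line {line_no}: unmatched close parenthesis"]
                if depth == 0:
                    closed = True
            elif depth == 0:
                return [f"line {line_no}: text outside the outer parentheses"]
    if depth > 0:
        return [f"{depth} parenthesis(es) left open at end of file"]
    if not closed:
        return ["no condition list found"]
    return []
```

A well-formed file such as "(", "(5)", ")" on three lines produces an empty list, while a group that is closed one parenthesis too early is reported as text following the final close parenthesis.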


Appendix II. The Macintosh Character Set

The Macintosh character set is an extension of the familiar ASCII character set and includes a total of 256 characters, numbered from 0 through 255. Of these, the first 32, numbered from 0 through 31, are reserved for special control characters (for example: carriage return, tab, form feed, etc.). This leaves a total of 224 characters available for use in documents. The 52 upper and lower case letters of the English alphabet, as well as basic punctuation characters, are contained in the range 32...127. Characters numbered 128 through 255 (sometimes referred to as high ASCII) are used for characters with diacritical marks and for further punctuation as well as special symbols. In some fonts, many characters -- especially high ASCII characters -- are not even defined. Such characters appear on screen as nondescript little rectangles and may be blank when printed.

The default font for GoldVarb documents is Courier 12. This font has a number of advantages: it prints well on laser printers, it is monospaced (i.e. almost all its characters are of the same width), and it contains very few undefined characters. This last property makes it especially convenient for token files in which some groups have many different factors.

Characters 32 through 255 of the font Courier 12 are illustrated in the table in Figure 29. The ASCII code of any character in the table is found by adding the small figure at the top of the column to the small figure at the left end of the row. Note that three characters have special significance in tokens: `(' used to introduce a token, `/' indicating a token or group to be excluded, and `.' which will be replaced by the default factor. The space (#32) and the option-space (#202) should not be used in tokens. In addition, the following ASCII codes correspond to undefined and/or invisible characters: 127, 174, 190, 206, 207, 222, 223 and 228. With these exclusions, more than 200 characters are still available. In the current version of GoldVarb the maximum number of factors per group is 200.
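
The count of remaining characters can be verified with a short Python calculation. It uses only the character codes listed above; the variable names are chosen for the example, and treating the three special characters as unavailable follows the wording "with these exclusions".

```python
# Characters with special significance in tokens.
SPECIAL = {ord("("), ord("/"), ord(".")}
# Space and option-space, which should not be used in tokens.
BLANKS = {32, 202}
# Codes listed as undefined and/or invisible in Courier 12.
UNDEFINED = {127, 174, 190, 206, 207, 222, 223, 228}

excluded = SPECIAL | BLANKS | UNDEFINED
usable = [code for code in range(32, 256) if code not in excluded]

print(len(range(32, 256)))  # 224 characters outside the control range
print(len(excluded))        # 13 exclusions
print(len(usable))          # 211 characters remain
```

The 211 remaining characters exceed the limit of 200 factors per group, so the character set is not the limiting constraint.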


Figure 29.



27 May 1999