Special Edition Using Visual FoxPro 6
Special Edition Using Visual FoxPro 6
Selecting, Viewing, and Ordering Data
- Opening a Data Table
- Opening Tables in Work Areas
- Listing Data to the Screen or Printer
- Sorting Data in the Table
- Planning Indexes for Rushmore
A fundamental operation of any data-management system is opening
tables and viewing the data that they contain. With Visual FoxPro
6, the way that you open and view data is determined by how you
store data. In this new release, you have the capability to store
tables of information either as free tables or as part of an associated
group of tables called a database. Although a database offers
many new data-management options that previously were unavailable
to FoxPro users, you might find yourself supporting many existing
applications that still use the standalone table model. Fortunately,
working with tables bound into databases is not much more difficult
than working with free tables. In fact, databases provide additional
features, such as persistent relations, that make working with
Critical to the viewing of data is the effective use of indexes. The use (and misuse) of indexes greatly affects the way that you retrieve and view data from a table. Visual FoxPro 6 supports three major types of indexes: independent single-index files, structural compound indexes, and non-structural compound indexes.
Indexes serve two primary purposes. You use some indexes primarily to form relations between tables; you use others to help retrieve data and to improve the performance of queries and reports.
Before you view data that is stored in a table or work with table data in any other way, you need to open the table in a work area. Visual FoxPro 6 supports 32,767 work areas. Each work area can hold one table and its related indexes, both independent and compound. It is unlikely that you will ever need to use all 32,767 areas-at least, not at the same time. Other system restrictions typically restrict most users to fewer than 256 files.
Visual FoxPro does have more than 32,767 work areas available. If you use private data sessions, you will have 32,767 available in each private data session you open. See Chapter 9 "Creating Forms," and Chapter 10, "Creating Reports," for more information about private data sessions.
How do you open a table to look at it? If the table is a free table, select one of the 32,767 areas and issue a USE command, as shown in this example:
SELECT 2 USE CUST
The first command tells Visual FoxPro 6 to use the second work area. The second command tries to open CUST.DBF from the current directory. If CUST.DBF is not there, Visual FoxPro 6 displays the following message, which includes the current drive and directory:
File '<current directory>\CUST.DBF' does not exist
You can also retrieve the current directory with this command:
A safer way to open a file is to first test to see whether it exists before issuing the USE command. You can do this with the FILE function, as shown in Listing 5.1.
Listing 5.1 05CODE01.PRG-The FILE Function
* Program 5.1 - Testing if a file exists before using it SELECT 2 IF FILE('CUST.DBF') USE CUST ELSE = MESSAGEBOX('File CUST.DBF does not exist in ' ; + CurDir()) ENDIF
Of course, you can include a drive and directory with the table's name to open it:
The preceding examples select the second work area without first checking to see whether another file is open there. Visual FoxPro 6 automatically closes any file that is open in the selected area when you issue a USE command, whether the command succeeds or not. This means that even if the file being opened is not in the current directory or if VFP cannot open it, VFP closes the file that was open in the selected area. Therefore, you should find out whether the work area contains a file before you use it. Use the ALIAS function, which returns the alias name of a file when present, to test whether a work area is empty, as shown in Listing 5.2.
Listing 5.2 05CODE02.PRG-The ALIAS Function
* Program 5.2 - Use ALIAS to see if table is already open * Check if work area 2 is available (empty) IF EMPTY(ALIAS(2)) && Returns .T. If no file is open in area 2 SELECT 2 ELSE = MESSAGEBOX('Work area 2 is in use') ENDIF
Of course, you should test successive work areas to find an empty one. Although you could write the code to perform such an iteration, there is a faster way. If you use SELECT 0, rather than open a specific work area, Visual FoxPro 6 begins with work area 1 and continues searching until it finds the first empty one. You do not really need to know in which work area VFP opens a file. Visual FoxPro assigns an alias name to a work area when a table is open in it. This alias name is usually the same as the table's name, but it doesn't have to be. Always reference work areas by their alias name after opening them instead of using their work-area numbers.
You don't even need a separate SELECT statement to open a table. Select the work area and open the table at the same time by adding the IN clause to the USE statement. But before I show you the modified code that you use to open a file this way, consider an additional enhancement. If the file is not in the current or specified directory, open a dialog box to enable the user to locate the file. The GETFILE function, shown in Listing 5.3, provides this capability.
Listing 5.3 05CODE03.PRG-The GETFILE Function
* Program 5.3 - Use ALIAS and FILE to open table in next * available work area. * * Open CUST.DBF in the next available work area IF FILE('CUST.DBF') USE CUST IN 0 ELSE = MESSAGEBOX('File CUST.DBF does not exist in ' ; + CURDIR()) lcNewFile = GETFILE('DBF', 'Pick table:', 'Select', 1) IF !EMPTY(lcNewFile) USE (lcNewFile) IN 0 ENDIF ENDIF
Notice that GETFILE() enables you to select a table from any drive or directory and returns the fully qualified pathname to the variable lcNewFile. Use this variable in USE to open the file. The parentheses in this statement tell Visual FoxPro 6 to use the contents of the variable pcNewFile, rather than the name pcNewFile itself.
What if the file that you want to open is already open in a different work area? By default, Visual FoxPro 6 considers this situation to be an error and displays the message File is in use.
Test for this condition with the USED() function, which requires the file alias as its parameter. As shown in Listing 5.4, the function checks to see whether the file alias exists in any work area.
Listing 5.4 05CODE04.PRG-USED() Function
CLOSE DATABASES CLEAR * Test if CUST is already in use IF USED("CUST") = MESSAGEBOX("File CUST.DBF is already in use.") SELECT CUST ELSE * If file CUST.DBF exists, open it cCurDirectory = CURDIR() CD \VFP5BOOK\DATA IF FILE("CUST.DBF") USE CUST.DBF IN 0 = MESSAGEBOX("File CUST.DBF successfully opened.") ELSE * File CUST.DBF is not in default drive, * ask user to locate it = MESSAGEBOX("File CUST.DBF does not exist in "; + CURDIR()) lcNewFile = GETFILE("DBF", "Select a table:", "Select", 1) lcFileName = SUBSTR(lcNewFile, RAT("\", lcNewFile) + 1) lcRootName = LEFT(lcFileName, LEN(lcFileName) - 4) * Check if used selected file is already open. If not, open it IF !EMPTY(lcNewFile) IF USED(lcRootName) = MESSAGEBOX("File " + lcNewFile + " is already in use.") SELECT(lcRootName) ELSE USE (lcNewFile) IN 0 = MESSAGEBOX("File CUST.DBF successfully opened.") ENDIF ENDIF ENDIF CD &cCurDirectory ENDIF
Notice that you have to extract the filename from the value returned by GETFILE before you can use USED() to test for it.
If you ever try to open a table in an empty work area and VFP tells you that the file is in use, but you don't see that file listed when you use the View window to look at the work areas, don't panic. Remember that USED() looks for files based on their alias names. If you open the file CUST.DBF and assign an alias of BUYERS to it, USED('CUST') returns false, even though any attempt to open CUST.DBF a second time in a different work area fails with a File is in use error message. If you open tables with alias names other than their filenames, document this information in the program so that other programmers can find it.
At times, you might want to open a file twice, such as when you perform a recursive reference on a file. Chapter 3 "Defining Databases, Tables, and Indexes," includes an example of a self-referencing relation that finds the supervisor of every employee. That example uses a query and the relation to find each supervisor's name. You can achieve the same result by opening the file in two different work areas using the AGAIN clause of USE. AGAIN enables you to open a file more than once. The advantage is that you can use different indexes and move the record pointer independently in each work area. The following two lines show how to open a table a second time:
USE empl1 IN 0 USE empl1 IN 0 AGAIN
Although Visual FoxPro 6 enables you to open a file multiple times by including the AGAIN clause in USE, remember that it is really the same file. You can even turn an index on in one area and set a relation to it from the other; the record pointer is independent in each work area. Therefore, you can step through the first file to read the employee information, including supervisor IDs, and then you can switch to the second work area to search for the employee who has the first employee's supervisor ID and report their name. Listing 5.5 shows this technique.
Listing 5.5 05CODE05.PRG-Including the AGAIN Clause in USE
CLOSE DATABASES CLEAR * Open EMPL1.DBF in two work areas SELECT 0 USE \VFP5BOOK\DATA\EMPL1 SELECT 0 USE \VFP5BOOK\DATA\EMPL1 AGAIN ; ALIAS SUPERVISOR ; ORDER EMPLOYEEID * Step through the first area SELECT Empl1 SCAN ? EmplID, ; LastName FONT "Foxfont", 10, ; FirstName FONT "Foxfont", 10, ; Supervisor FONT "Foxfont", 10 * Now find the employee corresponding to the supervisor id IF SEEK(Supervisor, 'SUPERVISOR') ?? supervisor.LastName FONT "Foxfont", 10, ; supervisor.FirstName FONT "Foxfont", 10 ENDIF ENDSCAN
If you try to open a table that is bound to a database without first opening the database, VFP uses a back link to identify and open the corresponding database automatically.
VFP stores this back link in the table to identify its parent database. It stores this information at the end of the field list in the table header. The reference includes the name of the database and a relative path to the database from the table. If you open the table with a full pathname, VFP bases its relative path search for the database on that pathname rather than the current directory.
If you ever attempt to open a table and VFP reports that it cannot locate the database, one of two things probably happened: Either you moved the table relative to the database, or you moved the database relative to the table.
Suppose that you begin with database PTOFSALE and table CUSTOMER in the same directory. In this case, the back link in CUSTOMER simply is PTOFSALE.DBC. In other words, VFP looks in the same directory as the table for the database.
On the other hand, suppose that you store the database as \app1\PTOFSALE.DBC and the table as \app1\data\CUSTOMER.DBF. Now the back link is ..\PTOFSALE. In this case, the double dot at the beginning of the back link means to return to the parent directory of the current directory.
Similarly, suppose that you store the database as \app1\database\PTOFSALE.DBC and the table as \app1\data\CUSTOMER.DBF. Now the back link becomes ..\database\PTOFSALE. In this case, the back link returns first to the parent directory and then goes down a different branch: the \database subdirectory.
Following this logic, you can see that moving the entire tree structure that contains the database and table continues to work because the relative paths remain the same. You can even move the tree to a different directory or drive without a problem. You cannot move the tables relative to the database, however.
To be more explicit in your coding, precede commands that open a bound table with a command that opens the database, as shown in the following example:
OPEN DATABASE C:\VFP5BOOK\DATA\PTOFSALE USE C:\VFP5BOOK\DATA\CUSTOMER
VFP requires the database to be open because bound tables can use long table names, validation rules, triggers, or other database enhancements. Visual FoxPro would know nothing about these new features unless it opened the database first.
Visual FoxPro has two commands-DISPLAY and LIST-that create simple data listings. Following is the syntax for both commands:
DISPLAY [[FIELDS] FieldList] [Scope][FOR IExpression1][WHILE Iexpression2] [OFF] [TO PRINTER [PROMPT]|TO FILE FileName] [NOCONSOLE] [NOOPTIMIZE] LIST [FIELDS FieldList] [Scope][FOR IExpression1][WHILE Iexpression2] [OFF] [TO PRINTER [PROMPT]|TO FILE FileName] [NOCONSOLE] [NOOPTIMIZE]
In addition to listing the contents of tables, Table 5.1 lists
other clauses that are available with LIST and DISPLAY.
|CONNECTIONS||Information on named connections to external data sources, such as SQL Server, Access, dBASE, and Paradox|
|DATABASE||Information about the current database|
|FIELDS||Contents of table fields|
|MEMORY||Contents of memory variables|
|OBJECTS||Information about objects|
|PROCEDURES||Information on stored procedures|
|TABLES||Tables in an open database|
|VIEWS||Information on SQL views|
DISPLAY with no additional clauses shows the current
record of the current table. On the other hand, LIST
shows all the records from the current table, beginning with the
first one. You can modify the scope of either command with ALL,
NEXT, RECORD, or REST. Table 5.2
defines these modifiers.
|ALL||Every record in the table|
|NEXT n||The next n records, where n can be any integer, beginning with the current record|
|RECORD n||The nth record in the table (based on physical record numbers, not logical orders defined by a sort)|
|REST||All records from the current one to the end of the table|
You can also specify a subset of fields by using the FIELDS clause for both commands. You need to list only the fields; the keyword FIELDS is optional. Following are the various syntax forms for these two commands:
USE C:\VFP5BOOK\DATA\CUST DISPLAY && Displays the first record only LIST && Lists all records, scrolling if necessary DISPLAY NEXT 4 && Displays the next four records DISPLAY REST && Display the rest of the record in the table && from the current record pointer position DISPLAY ALL && Display all the records in the table LIST cLast, cFirst && List the last and first names of && every record in the table
Another difference between these two commands appears when you are displaying more records than will fit onscreen. When LIST reaches the bottom of the screen, it does not stop; it simply clears the screen and begins again from the top without pausing. DISPLAY pauses when it reaches the bottom of the screen and waits until you press a key to continue. Therefore, you might want to use LIST with the TO PRINTER clause to redirect output to the printer so that Visual FoxPro 6 does not pause. On the other hand, when you are viewing records onscreen, use DISPLAY to see each record. You can also list records to a file by using the TO FILE clause.
To display records onscreen, use the following:
DISPLAY ALL cCustId, cCompany
To list records to the printer, use the following:
LIST cCustId, cCompany TO PRINTER
To list records to a file, use the following:
LIST cCustId, cCompany TO FILE CUSTOMER.TXT
When you issue a LIST REST, you might not get all the records that you expect to get. The scope modifiers REST and NEXT both begin retrieving records from the current record pointer. If you move the record pointer before issuing either of these commands, you might not get the records that you expect.
If you have only a few records in a table, listing or displaying all records in a table to find the one that you want might not seem to be a bad idea. As tables grow to hundreds or thousands of records, however, LIST and DISPLAY by themselves become impractical. You need to jump immediately to the record or records that match the desired condition.
LOCATE helps you find records for a specific condition, even if the records are scattered throughout the table. Suppose that you want to see records for customers who have outstanding bills. Unless you have a really bad collection history, you really do not want to see every record. The following statement jumps to the first record that matches the search condition:
SELECT CUST LOCATE FOR cBillDue > 0
LOCATE always begins at the first record in a file and checks records sequentially until it finds the first one that passes the condition test. To see more information about this customer, use DISPLAY by itself to print to the screen all fields for the current record. You can also use LIST NEXT 1.
But suppose you suspect that more customers have outstanding bills. If you reissue the LOCATE command, Visual FoxPro 6 merely displays the same record because it always begins its search from the top of the file. If you use CONTINUE instead, the search begins at the record immediately after the current one and finds the next record that matches the condition. In fact, you can continue issuing CONTINUE until VFP reaches the end of the file. The code segment in Listing 5.6 shows you how to use these commands to view all customers who have outstanding bills.
Listing 5.6 05CODE06.PRG-Using LOCATE and CONTINUE
* View customers with outstanding bills USE CUST LOCATE FOR cBillDue > 0 DISPLAY OFF cCustId, cBillDue DO WHILE !EOF() CONTINUE IF NOT EOF() DISPLAY OFF cCustId, cBillDue ENDIF ENDDO
When you use LOCATE or CONTINUE, EOF() remains .F. until the search fails to find additional records that match the condition. Alternatively, you can use FOUND(), which returns .T. as long as records are found.
The advantage with LOCATE is that you can define a condition on any field or portion of a field in the table; it does not need an index. To find all customers who have the word CONSULTANTS in their company names, use the commands in Listing 5.7.
Listing 5.7 05CODE07.PRG-Putting Them All Together
* Locate all customers who are consultants USE CUST LOCATE FOR 'CONSULTANTS' $ UPPER(cCompany) LIST OFF cCompany DO WHILE !EOF() CONTINUE IF FOUND() LIST OFF cCompany ENDIF ENDDO
When you compare strings, convert both sides of the string to all uppercase or all lowercase before comparing them to make comparisons independent of case.
Even though the LOCATE command does not require that any indexes exist, it is a good idea to maintain indexes on all the fields that you refer to in your LOCATE commands. This is because the optimizing technology in Visual FoxPro 6, Rushmore, uses available indexes to improve the speed at which it can evaluate your FOR clause on the LOCATE command.
You can even use LOCATE with a condition that combines several fields. Although LOCATE does not require an index, if one exists that matches the form of the condition, LOCATE automatically optimizes the search for matching records. What this means to you is improved performance with almost instantaneous response, even in tables that have millions of records.
Another command that you can use to find selected records in a table is SEEK. Unlike LOCATE, SEEK requires the use of an index. Furthermore, if the index is a compound one, or if you open multiple simple indexes, SEEK works only with the current one. Therefore, first SET ORDER TO the index that you want SEEK to use. SEEK has a simple syntax. Simply follow SEEK with the value that you want it to find. To find CUSTOMER 0025, enter the code shown in Listing 5.8.
Listing 5.8 05CODE08.PRG-Using SEEK
* Locate customer '0025' USE CUST SET ORDER TO TAG custid SEEK '0025' IF FOUND() DISPLAY ELSE = MESSAGEBOX('CUSTOMER 0025 was not found') ENDIF
SEEK requires a value of the same data type as the current index. A seek expression does not always have to match the index value exactly, however. If SET EXACT is OFF, the index value needs to match character-for-character for only as many characters as are in the seek value. Suppose that you want to search CUST.DBF for a company whose name begins with Laser. Use the following code:
USE C:\VFP5BOOK\DATA\CUST SET ORDER TO TAG company SEEK 'Laser'
If SET EXACT is OFF, Visual FoxPro 6 finds the record. If SET EXACT is ON, the seek value must match the index value exactly, in total number of characters as well as each individual character. Thus, if you index CUSTOMER.DBF by company and perform the following SEEK with SET EXACT OFF, VFP places the record pointer at the first company whose name begins with L:
SET EXACT OFF SEEK 'L'
The SET EXACT command also works for other conditional tests, such as in the FOR clause described earlier in this chapter. In these cases, however, an inexact search matches characters from left to right until a mismatch is found or until the expression on the right side of the = operator ends. You can temporarily override inexact comparisons for a single conditional test by using the == operator. Therefore, the following two LOCATEs are equivalent:
SET EXACT ON LOCATE FOR cFirst = 'NATASHA '
SET EXACT OFF LOCATE FOR cFirst == 'NATASHA '
Even though == and SET EXACT ON seem to be equivalent, they are slightly different. SET EXACT ON compares two strings until the one on the right side of the = is exhausted, and then checks to see if both strings are of equal length. ==, on the other hand, pads the shorter string with spaces to force it to be the same length and then checks the two strings to see if they are the same.
Normally, when a LOCATE or SEEK fails, Visual FoxPro 6 leaves the record pointer at the end of the table. In some cases, however, you might not know exactly what value to search for, so you guess. You want the pointer to stop at the next record after the position of the search value, if that value exists.
Suppose that you don't know the exact name of the company, but you think that it's Golf Heaven, so you write the code shown in Listing 5.9. If no company name in CUST.DBF has this exact spelling, the record pointer moves to the end of the file. If you first enter the command SET NEAR ON, however, VFP stops at the first company name that alphabetically follows Golf Heaven: Goofer's Arcade Heaven in table CUST.DBF.
Table 5.3 might help clarify interpreting a SEEK with
SET NEAR ON and SET EXACT OFF.
|Exact match of search string found.|
|No exact match found, but record pointer is on the next record alphabetically.|
|No exact match found; search string is beyond the last value in the table.|
Listing 5.9 05CODE09.PRG-Results
of SEEK with
SET NEAR ON
* Locate customer data for GOLF HEAVEN * Open table and set tag to COMPANY USE C:\VFP5BOOK\DATA\CUST SET ORDER TO TAG company * Save current setting of NEAR, and turn NEAR ON curnear = SYS(2001, 'NEAR') SET NEAR ON * Find record closest to 'GOLF HEAVEN' SEEK 'GOLF HEAVEN' * If search goes beyond last records, display last record. IF EOF() GOTO BOTTOM ENDIF DISPLAY * Reset NEAR to original value SET NEAR &curnear
Use SET EXACT ON only if you search or compare with strings of the same length as the index or comparison field. Otherwise, SET EXACT OFF enables you to search by entering a substring. If you know that the substring that you want to search with matches a field in the table character for character, you can SET NEAR OFF. Otherwise, SET NEAR ON finds the next record alphabetically after the substring.
If you are not sure that an entered value will exactly match a value in the table, use SET NEAR ON to find the next record alphabetically. Use this technique when you are setting ranges for parameters in reports. On the other hand, if you need to create a relation between customer ID in the order file and in the customer file, you want an exact match and should use both SET EXACT ON and SET NEAR OFF.
Many operations call for a subset of the table that is being used. Perhaps you want to see only the customers in Pennsylvania or those customers in California who have outstanding bills. In either case, define a filter condition using the SET FILTER TO command followed by LIST or DISPLAY, as follows:
USE C:\VFP5BOOK\DATA\CUST SET FILTER TO cBState = 'PA' LIST
Even though it does not use a FOR clause, SET FILTER is Rushmore-optimizable.
You have the same capability to filter records directly from within both LIST and DISPLAY. The FOR clause enables you to define a condition that selects records from the table. To select records for Pennsylvania customers, for example, use an expression like the following:
FOR cBState = 'PA'
Use the following expression to DISPLAY the results:
DISPLAY ALL cCustId, cCompany FOR cBState = 'PA'
This command steps through the records sequentially from the first record and displays only those that match the condition of the FOR expression. The potential problem with this method is apparent if you envision that the customer records will be sorted but not indexed by state: Visual FoxPro reads more records than it needs to. When Visual FoxPro reads the table using a FOR clause, it tests each record to find the Pennsylvania ones. Furthermore, even after VFP displays the last record for Pennsylvania customers, it continues reading records until it reaches the end of the file. The FOR clause makes no assumptions about sort orders; it is a brute-force way of processing all records that match the condition.
If the FOR expression is Rushmore-optimizable, performance is greatly improved. If the FOR expression is not optimized, it can be extremely slow because it reads every record in the table. The programmer has the responsibility of ensuring that the FOR expression is Rushmore-optimizable.
An alternative method of finding records that match an expression uses the WHILE clause. To begin, open a table and an index that uses the same order as the search condition. Next, find the first record that matches the search condition. To do so, use LOCATE or SEEK. If EOF() returns .T. or FOUND() returns .F., no records match the search condition. Having found the first record that matches the condition, you know that all additional matching records in the table follow it sequentially. Read these records one at a time, using a simple loop such as DO WHILE or SCAN, until the search condition fails. Then you can ignore the rest of the table without searching it because you know that no other records will match.
Listing 5.10 shows a variation on the preceding search. It requires a nonoptimizable search, because it uses a substring expression to find a customer whose name begins with the letter M.
Listing 5.10 05CODE10.PRG-Using WHILE
SELECT CUST SET ORDER TO TAG COMPANY LOCATE FOR LEFT(cCompany,1) = 'M' SCAN WHILE LEFT(cCompany,1) = 'M' ? cCustId, cCompany ENDSCAN
The real difference between FOR and WHILE is that FOR will check every record in the command scope and only affect those that meet the condition. WHILE will affect records until it encounters one that does not meet the condition. This means that WHILE is always faster than FOR because it deals with fewer records.
You can combine FOR and WHILE clauses in the same command line. For example, if you want to see the customers in Texas that owe you more than $1,000, you could use the following:
USE Customer ORDER TAG State SEEK 'TX' DISPLAY WHILE State = 'TX' FOR Balance > 1000
You might wonder which condition is best to put in the WHILE and which in the FOR. You want to put the most restrictive condition-that is, the one that provides the smallest number of records-in the WHILE. In the preceding example, if there were 20,000 customers in Texas and 1,000 that owe more than $1,000, the fastest way would be the following:
USE Customer ORDER TAG Balance SEEK 1000 DISPLAY WHILE Balance>1000 FOR State = 'TX'
If you need to retrieve records based on an expression that does not have a corresponding index, you probably need to use a nonoptimized FOR. Because such expressions are slow, you might want to consider creating a separate index for this situation if it occurs frequently.
Using SEEK and a WHILE clause provides performance that is better than that of a Rushmore-optimized FOR expression.
One additional enhancement can be added to this example. SEEK has an equivalent function call that returns .T. when it finds a match. You can place this function directly in the DO WHILE statement, as shown in Listing 5.11.
Listing 5.11 05CODE11.PRG-Using SEEK and DO WHILE
SELECT CUST SET ORDER TO TAG CUSTSTATE IF SEEK('PA') && Execute the code block on if PA is found DISPLAY REST cCustId, cCompany ; WHILE cBState = 'PA' ENDIF
In the example in Listing 5.11, notice the scope word REST used on the DISPLAY command. This scope word is optional in this situation because a WHILE clause implicitly includes a scope of REST.
You can sort data in a table in two fundamental ways. The first is a physical sort, also called a permanent sort. To create a physical sort, you need to rewrite the table in the desired sort order. The advantage of this method is that you need no additional file other than the data file itself. The disadvantage is that the sort order is difficult to maintain if users make frequent updates to the data. A physical sort requires more code to maintain or requires frequent re-sorting.
A further disadvantage to sorting occurs with large files. If your tables get large, a sort will take considerable time to execute.
The second type of sort uses a second file called an index. An index stores values for the indexed field(s) and pointers to the location of those values in the database. The pointer is a record number. Visual FoxPro 6 stores the indexed field values in a b-tree (balanced binary tree) structure that it can search quickly.
As mentioned earlier in this chapter, you are unlikely to add records to the table in a sorted order. But if you do, what happens if you need more than one sort order? You certainly cannot have two physical sort orders in one table. The alternative-maintaining two or more separate copies of the data in tables with different sort orders-is a major nightmare. Yet at times, you want to store data in sorted order. Other than entering the records that way, how can you create a sorted table?
The SORT command creates another copy of the table and enables you to sort on any field or combination of fields. SORT's syntax is rather rich, as follows:
SORT TO TableName ON FieldName1 [/A][/D][/C] [FieldName2 [/A][/D][/C]...] [ASCENDING|DESCENDING] [Scope][FOR Iexpression1][WHILE Iexpression2] [FIELDS FieldNameList|FIELDS LIKE Skeleton |FIELDS EXCEPT Skeleton] [NOOPTIMIZE]
Suppose that you need to generate a report that lists customers alphabetically by state, and that within each state you want to list customers with the largest annual purchases first and continue to the smallest. You need an ascending sort on state and a descending sort on purchases. SORT can create this file with the following statement:
USE \VFP5BOOK\DATA\CUSTOMER SORT TO STATSALE ON cBillingStateProvince /A/C + ; nPurchasesYearToDate /D
SORT creates a free table. Therefore, it truncates the long field names defined in the original table of the example to 10-character fields.
You can create indexes that have different orders (ascending/descending) on the parts of the index, such as:
Another disadvantage of using SORT is the quick obsolescence of the data. As soon as you create the table, it begins to become obsolete. Someone else might immediately add, delete, or modify a record. One situation in which physically sorting the records in a table can benefit an application is when a table has a very large number of records and there is one primary order in which those records are used. In this case, it is beneficial to sort the table on that order whenever you are doing maintenance on the table (such as re-creating the indexes or something else).
In Chapter 6 "Creating Basic Queries," you learn to create CURSORS with the SQL SELECT command. Cursors provide the same advantages as SORT (sort directions by field). In addition, Visual FoxPro 6 automatically deletes cursors when you exit VFP or open another table in the same work area.
Indexes are the best way to provide different sort orders for data. Visual FoxPro 6 can create, use, and maintain three types of indexes:
- Independent indexes (.IDX)
- Compound structural indexes (.CDX)
- Compound non-structural indexes (.CDX)
Indexes are not static in the same way that tables created with SORT are. Rather, indexes are dynamic. Indexes adapt automatically to added, deleted, and modified records, as long as you keep them open. The following is the syntax for the INDEX command:
INDEX ON eExpression ; TO IDXFileName|TAG TagName [OF CDXFileName] [FOR IExpression] [COMPACT] [ASCENDING|DESCENDING] [UNIQUE|CANDIDATE] [ADDITIVE]
Chapter 3 "Defining Databases, Tables, and Indexes," discusses how to create compound structural indexes by using the index page of the Table Designer. You can also create and use any of the three indexes directly from the Command window or from within a program.
Simple Independent Indexes Independent indexes on a single expression were the first indexes used by FoxBase, the predecessor of FoxPro. For each index expression, you create a separate index file with its own name. The root names of the table (.DBF) and index (.IDX) do not have to be the same. In fact, if you have more than one simple index, it is not possible for all the indexes to have the same root name. Many programmers developed the habit of naming the primary index with the same root name as the table. The following lines use the INDEX command to create an independent index on customer ID in CUST.DBF, which has 10-character field names:
USE CUST INDEX ON cCustId TO CUSTID.IDX
At this point, the table and index are open. But in most programs, you need to specifically open previously created indexes with the table, as in the following command:
USE CUST INDEX CUSTID
If you have more than one index for a table, open the indexes at the same time if you plan to make modifications to the records, as follows:
USE CUST INDEX CUSTID, COMPANY, ZIPCODE
If you do not open each index, VFP cannot maintain the indexes that are not open if you add, delete, or modify records. (Because the indexes can have any name, how would FoxPro know about them?) If you attempt to use the table with one of these indexes later, VFP might display records out of order or even point outside the table due to deleted records.
To get the best performance, use the COMPACT option with single indexes. Except for backward compatibility, continued use of independent indexes is not recommended. It is too easy for the indexes to get out of sync, and they are much slower than compact indexes. The exception is when you need to use an index temporarily and will delete it when the task is complete. n
Compact independent indexes are not compatible with older versions of FoxBase.
Compound Indexes Although independent indexes work, compound indexes are a better alternative for most indexing needs. These indexes enable an unlimited number of separate index expressions in one file. You cannot lose files or forget the names of all the index files that belong to one table. If you assign the same root name to the index files used by the table, you do not even have to worry about opening the index file; Visual FoxPro 6 opens that file automatically when it opens the table. These compound indexes have a special name: structural indexes.
When you are working with compound indexes, you need to know not only how to set a tag, but also which tag is current, the names of the defined tags associated with a table, and their index expressions. You saw earlier that SET ORDER TO sets the current index when you are using standalone index files and SET ORDER TO TAG sets the current tag for compound tags. But how can you determine which indexes are associated with a table?
The TAG word in SET ORDER TO is optional. SET ORDER TO TAG City and SET ORDER TO City are equivalent as long as there is no simple index file named City.idx open.
You can use a combination of TAG() and TAGCOUNT() to get a list of the index tags in a compound index. TAGCOUNT() tells you how many tags the index has and TAG() tells you name of each tag.
No relation exists between a table and its independent indexes or non-structural indexes. The only index that you can automatically associate with a table is its structural index. Even in the context of a program that opens independent and non-structural indexes with their table, you cannot be sure that you have opened all the indexes.
For all the reasons just mentioned, it is recommended that you use structural compound indexes on your tables. The only exception is when you are creating a temporary index that will be deleted immediately after its use.
You can examine the indexes opened with a table, however, by using several functions. The NDX() function returns the names of any open .IDX files. (Originally, index files had an .NDX extension, and they still do in dBASE. Therefore, the function to list them became NDX(). The name NDX() remains the same in Visual FoxPro 6, even though its indexes have an .IDX extension.)
To find the names of open compound indexes, use the CDX() or MDX() functions. These two functions perform the same task. At minimum, each function requires a numeric parameter. A value of 1 returns the name of the structural index, if present. Subsequent values return the names of other compound indexes in the order in which they are opened. These two functions, along with NDX(), return an empty string when the index number exceeds the number of open indexes. Therefore, test for this condition to determine when to stop.
Both the CDX() and MDX() functions support a second parameter that enables you to determine the index names associated with tables in another work area. Enter the second parameter as the work-area number or the file alias, as in the following example:
When you know the names of the compound indexes, find their tag names. The TAG() function returns the name of each tag in the current compound index, as in the following example:
i = 1 DO WHILE !EMPTY(TAG(i)) ? TAG(i) i = i + 1 ENDDO
You also can include arguments to specify the compound-index name, as well as the table name or work area.
Although the preceding functions help you define the index files and tag names that are associated with an open table, they do not tell you which index or tag currently controls the order. Two functions provide this information. ORDER() returns the controlling index tag name, and SET('ORDER') returns the controlling index tag name and the compound index file's name with the full path.
Listing 5.12 provides a routine that documents the tables and indexes that are currently open in an application.
Listing 5.12 05CODE12.PRG-Determining Which Tables and Indexes Are Open
LOCAL lnCurArea, lnHighestArea, lnIwork, lnIndx, lnIcdx, lnItag CLEAR * Current information on open tables * Save the current work area lnCurArea = SELECT() * Find the highest work area in use SELECT 0 lnHighestArea = SELECT(0) - 1 * Loop through the work areas FOR lnWork = 1 to lnHighestArea WAIT WINDOW "Examining workarea: "+STR(m.lnWork,5) NOWAIT SELECT (lnWork) IF EMPTY(DBF()) LOOP ENDIF ? 'Work area ' + STR(m.lnWork,5) + ': ' ? ' Table: ' + DBF() * Next scan for simple indexes ? ' Simple Indexes' FOR lnIdx = 1 to 256 IF EMPTY(NDX(lnIdx)) IF lnIdx = 1 ? ' NONE' ENDIF EXIT ENDIF ? ' Index: ' + NDX(lnIdx) ? ' Expression: ' + TAG(lnIdx) * Check if this IDX is the master index IF ORDER(ALIAS(),1) = NDX(lnIdx) ? ' This is the MASTER index' ENDIF ENDFOR * Scan for compound indexes ? ' Compound Indexes' FOR lnCdx = 1 to 256 IF EMPTY(CDX(lnCdx)) IF lnCdx = 1 ? ' NONE' ENDIF EXIT ENDIF ? ' Index: ' + CDX(lnCdx) * Check if this CDX holds the master index IF ORDER(ALIAS(),1) = CDX(lnCdx) ? ' MASTER index: ' + SYS(22) ? ' expression: ' + TAG(VAL(SYS(21))) ENDIF * Loop for each tag in the compound index FOR lnTag = 1 TO 256 IF EMPTY(TAG(CDX(lnCdx),lnTag)) EXIT ENDIF ? ' Tag Name: ' + TAG(CDX(lnCdx),lnTag) ? ' Expression: ' + KEY(CDX(lnCdx),lnTag) ENDFOR ? ENDFOR ENDFOR * Return to original area SELECT (lnCurArea)
To keep the index tags balanced, Visual FoxPro 6 retains the old tags and marks them obsolete in a .CDX file after re-creating a specific tag. This arrangement causes the .CDX file to grow. To prevent this growth, DELETE TAG ALL for the table and re-create the tags using INDEX ON. How can you do this in a program? Use the concepts from the preceding program to save the tag names and expressions for your selected .CDX in an array; then delete the .CDX and re-create the tags from the array.
When dealing with tables in a database, things can get tricky. The persistent relations stored in the database are based on index tags. If you use DELETE TAG ALL, you will destroy any persistent relationships for that table in the database. To solve this problem, you need to use DbGetProp to get all the relations for the table into memory variables or an array. After you create the index tags again, you need to use the ALTER TABLE command to reset all the persistent relationships.
Indexing is more than just sorting records for views or reports. Indexes have a more direct impact on the application's performance than ever before. Since the release of FoxPro 2.0, when Microsoft introduced Rushmore, this optimization technique has dramatically improved the performance of applications by locating records in large tables faster. Rushmore achieves this task through the use of indexes. It uses any index type-.IDX files, compact .IDXs, or .CDXs. In fact, it is common today to define indexes that you never intend to use for retrieving or viewing data; rather, these indexes exist solely to enhance the performance of SEEKs and REPORTs. Developers have reported search improvements 100 to 1,000 times faster when they use an optimizable Rushmore expression compared with a nonoptimized expression. The greatest improvements come from using compact .IDX or .CDX files, principally because VFP can read more of the index into memory at one time.
Listing 5.13 times how long it takes to count the number of zip codes in Pennsylvania.
Listing 5.13 05CODE13.PRG-Testing Index Search Time
CLOSE ALL CLEAR SET OPTIMIZE OFF ? "Counting the number of zip codes in Pennsylvania." USE \VFP5BOOK\DATA\ZIPCODE nStart = SECONDS() COUNT FOR (STATE == 'PA') TO nZipCnt nEnd = SECONDS() ? ' COUNT WAS: ' + STR(nZipCnt, 12, 0) ? 'ELAPSED TIME WAS: ' + STR(nEnd - nStart, 8, 2) SET OPTIMIZE ON
On a test system, it took 4.61 seconds to find 2,217 records out of 42,818. Elapsed time on your system will vary depending on the processor speed, drive speed, physical sector allocation, and other factors. If you remove both SET OPTIMIZE statements, the required time drops to 0.16 second. Although this figure is an increase of only 29 times, the sample table is small; improvements become more dramatic as the number of records grows. The difference can easily extend into minutes and even hours for large tables that have millions of records.
How can Rushmore provide such improved performance? First, by reading an index, it quickly locates the first record reference required by the condition and then reads the index only until the condition changes. Second, it reads only the index, rather than the table; it does not have to use the record pointer to read the actual table data. In the zip code table, Rushmore reads the index tag STATE, which contains an entry for each record in the table. This entry contains the state name and a pointer to the table record. There is no reason to go to the table. Because Rushmore can directly read the state name stored in the index, it can simply count the number of times that the state PA occurs.
Rushmore must compare the index expression with a memory variable or constant. In other words, you can compare the index expression with a string, number, or date value, or store a value in a constant and use it. But Rushmore cannot optimize an expression that compares an index expression with another variable in the table. Suppose that you have a table that includes birth dates and hire dates. Rushmore cannot optimize an expression such as the following, which lists records for people 16-years old or older:
To use Rushmore, you don't need to know the details of how it works internally any more than you need to know the details of how an internal-combustion engine works to drive a car. But obviously, you need to know how to define indexes that Rushmore can optimize. More important, you need to know what Rushmore likes and dislikes.
This chapter has already mentioned a few rules that you must obey to use Rushmore successfully. Another rule is that Rushmore can optimize only those expressions based on regular, primary, and candidate indexes. These indexes (as mentioned in Chapter 3 contain a reference for every record in the table, even if the index value is not unique. If you think about it, how else would Rushmore count the number of zip codes in a state if you define the state index as unique? You would never get a value other than 0 or 1.
Rushmore also requires that any search or query exactly match the index key. This statement has led many developers to believe that they needed a separate index tag for every field in their tables. Then the developers realized that they also needed more complex index expressions for reports or other activities, so they created additional indexes. Before long, the total number of indexes grew out of control, and the size of the .CDX file rivaled or exceeded that of the data itself. Performance degrades not just because of the size of the .CDX, but because Visual FoxPro 6 has more indexes to maintain every time you add, delete, or modify a record.
Often, you can delete many of these indexes without seriously affecting the application's performance. Returning to the customer file, suppose that you have the following set of indexes (using short field names in CUST.DBF):
INDEX ON UPPER(cLast) TAG LASTNAME INDEX ON UPPER(cFirst) TAG FIRSTNAME INDEX ON UPPER(cCompany) TAG COMPANY INDEX ON UPPER(cLast)+UPPER(cFirst) TAG STAFF INDEX ON UPPER(cCompany) + UPPER(cLast) + ; UPPER(cFirst) TAG EMPLOYEES
Although these tags appear to cover every contingency, they affect performance when you add, delete, or modify records because VFP must change more information in more tags. An alternative method takes advantage of the fact that Rushmore can optimize concatenated indexes. If you use SET EXACT OFF, you do not need to supply every field as an index.
If you want to search on the company field, for example, you could use either the COMPANY or EMPLOYEES tag, as follows:
LOCATE FOR cCompany = 'RENARD CONSULTANTS'
Alternatively, you could use the following:
SET EXACT OFF LOCATE FOR cCompany + cLast + cFirst = ; 'RENARD CONSULTANTS'
In the first LOCATE, Rushmore looks at the index expression on the left and decides to use the COMPANY tag. In the second, LOCATE uses the EMPLOYEES tag. The second expression is only slightly slower than the first. The only consideration is that you must include the entire index expression on the left side.
A similar argument applies when you are searching for names and choosing between the tags STAFF and LASTNAME. The only individual field tag that you might still need is FIRSTNAME. Therefore, you can eliminate at least two of these five indexes, and maybe more, depending on what the application really needs to do.
Although Rushmore can optimize an expression that matches only a portion of the concatenated fields, when you have SET EXACT OFF, the shorter string must be on the right side of the expression. Also, there is no way to skip a field in a multiple-field index to search only for later components.
Notice that if you eliminate tags LASTNAME and COMPANY, Rushmore does not optimize a search like the following because the index expression no longer exactly matches an index tag:
LOCATE FOR cCompany = 'RENARD CONSULTANTS'
Other things that Rushmore does not like and will not handle include index expressions that contain NOT or !. Surprisingly, you cannot include NOT or ! in the FOR expression and have Rushmore optimize it.
Rushmore optimizes expressions that contain exactly equal relational operators (==), but not the contained-in operator ($), AT(), ATC(), or RAT(). It will not use indexes that include FOR conditions.
On the other hand, you can build compound FOR expressions that Rushmore can optimize (or at least partially optimize). A compound expression joins two or more simple expressions with an AND or an OR. In this case, each expression can reference a different index expression. When it is looking at compound expressions two at a time, Rushmore optimizes the entire expression only if it can optimize both individual expressions. If it can only optimize one of the expressions, and you join the expressions with AND, Rushmore executes the part that it can optimize first. Then Rushmore takes this intermediate result set and performs the nonoptimizable portion on it, rather than on the entire database. This arrangement usually results in some improvement over a completely nonoptimizable expression.
Visual FoxPro 6 will partially optimize a compound expression that contains one optimizable expression and one nonoptimizable expression joined with an OR because it still must read the entire table to evaluate the nonoptimizable expression. Finally, if both individual expressions are nonoptimizable, Rushmore cannot optimize any part of the expression.
Table 5.4 displays possible combinations of expressions and their
Potentially, Rushmore can optimize every VFP command that supports a FOR clause. You must work with a single table, however, and follow the rules described earlier in this section. The following list shows the commands that Rushmore supports:
AVERAGE DISPLAY REPORT FORM BROWSE EDIT SCAN CALCULATE EXPORT SET FILTER TO CHANGE LABEL SORT COPY TO LIST SQL SELECT COPY TO ARRAY LOCATE SUM COUNT RECALL TOTAL DELETE REPLACE
Remember that Rushmore works with these commands only if you use them with a single table. To query or gather data from multiple tables, use SQL SELECT. Only the SQL SELECT command supports Rushmore optimization across multiple tables.
If you have a simple SQL SELECT on a single table with no special functions, groups, or sort orders, Visual FoxPro 6 uses Rushmore to create a filter on the database that returns the result set extremely fast. VFP does use Rushmore on multiple-table queries, but only if you use SQL SELECT.
At the beginning of this section, you learned that SET OPTIMIZE OFF can turn Rushmore off. Why turn off a tool that usually results in better performance? You typically don't want to turn Rushmore off when you are using expressions that it can optimize. When you are using nonoptimizable expressions, however, leaving it on actually lowers performance. In such cases, Rushmore needs extra machine cycles just to know that it is not needed. Therefore, turn optimization off by using the SET OPTIMIZE statement or by adding the clause NOOPTIMIZE after the FOR expression, as follows:
LOCATE FOR cFirst = 'NATASHA' NOOPTIMIZE
Another reason for turning off optimization involves the particular command you are using; for example, the REPLACE command can have a FOR clause. If you were replacing one of the fields used in the Rushmore optimized FOR clause, the actual command itself would be causing the Rushmore selected set of records to be incomplete.
REPLACE ALL State WITH 'NY' FOR City = 'New York' ; OR State = 'NY'
You can improve the performance of Rushmore by deselecting any default index or tag before executing a command that Rushmore can optimize. The following command leaves the index file open but turns off its active use:
SET ORDER TO 0
When there is an index order set, Rushmore must sort the records it finds on the index expression. This extra step will slow down the result. Although this is true, Rushmore-optimized expressions will still be faster than nonoptimizable expressions even with an index order set.
As a general rule, you add as many indexes as necessary to search for and display data. You should periodically review whether you require all of them, however.
When an application's performance slows due to index overload, you need to identify indexes that you use only occasionally. You should delete those index tags and enable the program to re-create them as needed.
To delete a tag, open the table and use the command DELETE TAG tagname.
You might want to create temporary indexes on a local rather than a network drive. You might even gain some performance by creating a cursor with a SQL SELECT statement to gather and sort the data for a report. Listing 5.14 shows a method that might appear to be awkward at first, yet yields quite respectable performance when the selected records are widely scattered and need to be reused. The routine begins by creating a temporary index; then it uses a Rushmore-assisted SEEK to find the first record. The routine processes the desired records using a SCAN loop; generates a report with the same index; and, finally, deletes the temporary index.
Listing 5.14 05CODE14.PRG-Using Rushmore
CLOSE DATABASES CLEAR * Open the customer table (free table) USE \VFP5BOOK\DATA\cust * Set NEAR ON to find first customer with more than * $200 of outstanding bills cCurNear = SYS(2001, 'NEAR') cCurExact = SYS(2001, 'EXACT') cCurSafety = SYS(2001, 'SAFETY') SET NEAR ON SET EXACT OFF SET SAFETY OFF * Create a temporary index on outstanding bills * This assumes that you have previously created a directory * with the name C:\TEMP INDEX ON nBillDue TO C:\TEMP\BILLS.IDX * Find records and change default billing method to cash 'CA' SEEK 200 SCAN WHILE nBillDue >= 200 REPLACE cPayMeth WITH 'CA' ENDSCAN * Create a report for sales representative of customer that * must pay cash REPORT FORM \VFP5BOOK\PROGRAMS\BillMeth ; FOR nBillDue => 200 * Reset environment SET NEAR &cCurNear SET EXACT &cCurExact SET SAFETY &cCurSafety * Delete temporary index USE ERASE C:\temp\bills.idx RETURN
A SEEK combined with SCAN WHILE structure always executes faster than Rushmore. Also notice the use of SET NEAR ON to find the first record above a specific value, just in case no records have a value of exactly 200.
The best indexing method varies from one situation to the next. Your choice depends on the specific task that you want to accomplish, the current organization of the data, the specific commands that you use, and (probably) the phase of the moon.
The BROWSE command illustrates this point. Reopen ZIPCODE.DBF and turn off the index, as follows:
USE C:\VFP5BOOK\DATA\ZIPCODE SET ORDER TO
The natural order of this data is city within state. Thus, all the zip codes for Alaska appear first, and all the zip codes for Wyoming appear at the end of the file. Now try the following simple BROWSEs:
BROWSE FOR cBState = 'AK' BROWSE FOR cBState = 'WY'
The first BROWSE comes up almost immediately. The second BROWSE opens a window immediately but takes several seconds to display the first screen of data. BROWSE displays the first screen of data as soon as it has enough records to populate it. Thus, the second command pauses with a blank window, while BROWSE searches through the file until it finds Wyoming.
Suppose you know that the selected data either appears very near the beginning of the file or that it encompasses the majority of the data (such as BROWSE FOR !DELETED()). A simple BROWSE statement without an index might appear to be as quick as one with an index using Rushmore, because BROWSE does not need to find every record that matches the criteria-only enough to display a screen of records. That explanation might be stretching the point just a little, but you should try several alternatives before you assume that Rushmore and another index tag provide the best performance.
© Copyright, Sams Publishing. All rights reserved.