File sd4feats.tar.Z is a Unix-compressed tar file. Uncompressing and then untarring it produces the directory sd4feats, which contains the text files train.dat and test.dat. File train.dat contains training feature vectors made from the 2000 first-rolling fingerprints of NIST Special Database 4, and file test.dat contains testing feature vectors made from the 2000 corresponding second-rolling fingerprints of that database. Along with the feature vectors, these files contain target vectors indicating the correct classes of the fingerprints. These files are provided for use in testing pattern classification programs.

Each feature vector consists of 112 floating point numbers, made by a feature extractor that ends with the Karhunen-Loeve transform. Details about the feature extraction process may be found in:

    C. L. Wilson, G. T. Candela, P. J. Grother, C. I. Watson, and R. A. Wilkinson. Massively Parallel Neural Network Fingerprint Classification System. Technical Report NIST IR 4880, National Institute of Standards and Technology, July 1992.

    G. T. Candela and R. Chellappa. Comparative Performance of Classification Methods for Fingerprints. Technical Report NIST IR 5163, National Institute of Standards and Technology, April 1993.

The above reports are publicly available on the anonymous FTP server sequoyah.nist.gov in the directory /pub/nist_internal_reports, as compressed PostScript files. The file README in that directory contains a comprehensive listing of all reports there.

Documentation for Special Database 4, and for other databases published by the Image Recognition Group, is also available on the FTP server, in the directory /pub/databases/manuals. The Special Database 4 documentation is in the file fingerdoc.ps.Z. Special Database 4 is too large to make available on the FTP server; it is available on CD-ROM. For information about obtaining Special Database 4 or other Image Recognition Group databases, phone NIST Standard Reference Data at 301/975-2208.

The format of files train.dat and test.dat is as follows. The first two lines of each file are header information; following lines consist of the Karhunen-Loeve feature vector, then the target vector (indicating correct class), for each fingerprint.

Line 1:
    N: Number of examples in the file (N = 2000)
    n: Dimensionality of the features contained in the file (n = 112)
    L: Number of classes (L = 5)

Line 2:
    L strings. Text labels for the L classes. (Abbreviations A, L, R, T, W for classes Arch, Left Loop, Right Loop, Tented Arch, Whorl.)

Lines 3 and onwards:
    For each of the N examples:
        n floating point values of the leading K-L coefficients
        L floating point target values. The position of the 1.000000 indicates which class the pattern belongs to. In all other positions there is a 0.000000.

For example, if a fingerprint is a Left Loop, then this is class 2, and the target vector for this fingerprint is therefore:

    0.000000 1.000000 0.000000 0.000000 0.000000

If you have any questions about train.dat and test.dat, contact:

    G. T. Candela
    email: jerry@magi.nist.gov
    phone: (301) 975-3388
    mail : Image Recognition Group
           Building 225, Room A-216
           National Institute of Standards and Technology
           Gaithersburg, MD 20899
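
The file format described above could be read with something like the following Python sketch. It is not part of the original distribution: the function name parse_feature_file is hypothetical, and the code assumes that line 1 holds N, n, and L as three whitespace-separated integers and that all values are whitespace-separated. Because the description does not say whether each example occupies exactly one line, the sketch reads everything after the header as one flat stream of numbers and cuts it into records of n + L values.

```python
import io


def parse_feature_file(stream):
    """Parse a train.dat/test.dat style text stream.

    Returns (labels, features, classes), where classes holds the
    0-based index of the correct class for each example.
    """
    # Assumption: line 1 is "N n L" as whitespace-separated integers.
    n_examples, n_features, n_classes = (int(t) for t in stream.readline().split())

    # Line 2: one text label per class (e.g. A L R T W).
    labels = stream.readline().split()
    if len(labels) != n_classes:
        raise ValueError("expected %d class labels, got %d" % (n_classes, len(labels)))

    # Remaining content: read as a flat stream so that line wrapping
    # inside a record (if any) does not matter.
    values = [float(t) for t in stream.read().split()]
    record_len = n_features + n_classes
    if len(values) != n_examples * record_len:
        raise ValueError("file does not contain N * (n + L) values")

    features, classes = [], []
    for i in range(n_examples):
        record = values[i * record_len:(i + 1) * record_len]
        target = record[n_features:]
        features.append(record[:n_features])
        # The position of the 1.000000 marks the class (0-based here,
        # so a Left Loop, class 2 in the text, has index 1).
        classes.append(target.index(max(target)))
    return labels, features, classes


# Tiny synthetic sample (made-up numbers, 3 features instead of 112).
sample = io.StringIO(
    "2 3 5\n"
    "A L R T W\n"
    "0.5 -1.25 2.0 0.000000 1.000000 0.000000 0.000000 0.000000\n"
    "1.5 0.75 -3.0 0.000000 0.000000 0.000000 0.000000 1.000000\n"
)
labels, features, classes = parse_feature_file(sample)
print(labels[classes[0]], labels[classes[1]])  # prints: L W
```

For the real data, the same function would be called on an opened file, for example with open("sd4feats/train.dat") as f: parse_feature_file(f), provided the header layout matches the assumption stated above.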