To evaluate the performance of the proposed multimodal fusion in real-world applications, real multimodal databases are built for the two fusions.
(a) Database for NIR Face and VL Face. For the fusion of NIR and VL face, the capture device consists of two CMOS cameras, one for the NIR image and the other for the VL image, with a resolution of 640 × 480 pixels. A pair of NIR and VL face images is therefore captured from each subject simultaneously. All the face images are near-frontal but taken in an uncontrolled indoor environment with varying pose, expression, and lighting. Some typical NIR and VL face pairs in the database are shown in Figure 9.6. Both NIR and VL face images are then cropped to 144 × 112 pixels according to automatically detected eye coordinates. Figure 9.7 shows some examples of the cropped images.
Figure 9.7. Cropped VL face examples (upper) and NIR face examples (lower) in the database.
The NIR and VL face database is composed of 3940 pairs of images from 197 subjects, with 20 pairs per person. All the images are randomly divided into a training set and a test set. The training set includes 3000 pairs of images from 150 subjects, while the test set includes the remaining 940 pairs of images from 47 subjects, so the two sets share neither subjects nor images. In the training phase, we construct the AdaBoost classifiers for the NIR and VL face modalities respectively and use the training set to learn the LDA- and PSM-based fusion. In the testing phase, each input NIR and VL face image pair is matched against all of the other image pairs in the test set. This generates $47 \times C_{20}^{2} = 8930$ intra-class (positive) and $20 \times 20 \times C_{47}^{2} = 432{,}400$ extra-class (negative) samples.
(b) Database for NIR Face and Iris. To capture a high-resolution image that contains sufficient face and iris information, we use a 10-megapixel CCD digital camera with up to 3648 × 2736 pixels. The camera is placed about 60 to 80 cm away from the subject. Active NIR LED lights of 850 nm are mounted around the camera lens to provide frontal lighting. We use a band-pass optical filter on the camera lens to cut off visible light while allowing NIR light to pass. An NIR face + iris database is built containing 560 high-resolution (2736 × 3648 pixels) NIR images. It includes 112 subjects, 55 females and 57 males, aged from 17 to 35, with 10 images for 76 subjects and 5 images for the other 34 subjects. Figure 9.8 shows some examples of face images and the segmented iris parts. The training set includes 250 images from 50 subjects. The test set includes 310 images from 62 subjects, none of whom appear in the training set.
Figure 9.8. A high-resolution face image and the separated face and iris images. (a) High-resolution NIR face image. (b) NIR face image segmented from (a). (c) Left iris segmented from (a). (d) Right iris segmented from (a).
Table 9.1. Relationship Between GAR and the Order of the Power Series Model on Training Data
R    GAR (%) (FAR = 0.1%)    EER (%)
1    95.6                    1.21
2    94.9                    1.35
3    95.7                    1.17
4    95.9                    1.11
5    94.8                    1.33
6    95.4                    1.20
7    95.7                    1.16
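The intra-class and extra-class sample counts quoted for the NIR and VL face test set (47 subjects, 20 image pairs each) can be checked with a few lines of arithmetic. The sketch below assumes that every unordered pair of test samples is matched exactly once; the variable names are illustrative and do not come from the original experiments.

```python
from math import comb

subjects = 47            # subjects in the NIR + VL test set
pairs_per_subject = 20   # NIR/VL image pairs captured per subject

# Intra-class (positive): unordered pairs of samples from the same subject.
intra_class = subjects * comb(pairs_per_subject, 2)

# Extra-class (negative): every sample of one subject against every sample
# of another subject, over all unordered pairs of subjects.
extra_class = pairs_per_subject * pairs_per_subject * comb(subjects, 2)

# Sanity check: together they cover all unordered pairs of the 940 samples.
total_samples = subjects * pairs_per_subject
assert intra_class + extra_class == comb(total_samples, 2)

print(intra_class, extra_class)
```

The final assertion confirms that the two counts partition all pairwise matches among the 940 test samples.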
Results for NIR Face and VL Face Fusion
For PSM-based fusion, the order parameter R influences the performance of the fusion algorithm, so it needs to be optimized first. To determine the value of R, we evaluate the fusion performance on the training set for different values of R. Table 9.1 shows the genuine acceptance rate (GAR) at a false acceptance rate (FAR) of 0.1% and the equal error rate (EER) for various values of R. The PSM-based fusion method achieves the highest GAR and the lowest EER on the training set when R is 4. Therefore, in the following experiments, we choose R = 4 for the PSM-based method.
In this experiment, an AdaBoost classifier is used for both NIR and VL face recognition. The output score of AdaBoost is a posterior probability P(y = +1|x) that ranges from 0 to 1. Thus the output scores of both the NIR and VL face classifiers are already normalized to [0, 1] by AdaBoost, and no further score normalization is needed when fusing them. We compare six score-level fusion methods: PSM [17], LDA [29], sum rule, product rule, min rule, and max rule [30]. Table 9.2 shows the match results of
Table 9.2. GAR and EER for Score Fusion of NIR Face and VL Face
Method     GAR (%) (FAR = 0.1%)    EER (%)
PSMSF      93.2                    1.84
LDA        92.2                    1.94
SUM        91.7                    2.80
PRODUCT    91.7                    2.81
MIN        89.4                    4.56
MAX        90.7                    2.04
NIR        90.1                    2.34
VIL        84.0                    5.27
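To make the four fixed fusion rules and the two reported metrics concrete, here is a minimal sketch in plain Python. It is not the authors' implementation. The learned PSM and LDA fusions are not shown, the sum rule is written as an average (an assumption that keeps the fused score in [0, 1] and leaves the ranking unchanged), the accept convention "score > threshold" is assumed, and the score pairs are hypothetical toy values rather than data from the experiments.

```python
def fuse(nir_score, vl_score, rule):
    """Combine two match scores in [0, 1] with a fixed score-level rule."""
    if rule == "sum":
        # Assumption: averaged sum, which ranks samples like the plain sum.
        return (nir_score + vl_score) / 2.0
    if rule == "product":
        return nir_score * vl_score
    if rule == "min":
        return min(nir_score, vl_score)
    if rule == "max":
        return max(nir_score, vl_score)
    raise ValueError("unknown fusion rule: " + rule)


def gar_at_far(genuine, impostor, far_target):
    """GAR at a threshold chosen so that FAR does not exceed far_target.

    Assumed convention: a comparison is accepted when score > threshold.
    """
    allowed = int(far_target * len(impostor))  # impostor accepts we can afford
    ranked = sorted(impostor, reverse=True)
    threshold = ranked[allowed] if allowed < len(ranked) else float("-inf")
    return sum(s > threshold for s in genuine) / len(genuine)


def equal_error_rate(genuine, impostor):
    """Approximate EER: average of FAR and FRR where they are closest."""
    best_gap, eer = float("inf"), None
    for t in sorted(set(genuine) | set(impostor)):
        far = sum(s > t for s in impostor) / len(impostor)
        frr = sum(s <= t for s in genuine) / len(genuine)
        if abs(far - frr) < best_gap:
            best_gap, eer = abs(far - frr), (far + frr) / 2.0
    return eer


# Hypothetical (NIR, VL) score pairs, for illustration only.
genuine_pairs = [(0.9, 0.8), (0.7, 0.9), (0.5, 0.2), (0.95, 0.85)]
impostor_pairs = [(0.2, 0.1), (0.3, 0.5), (0.1, 0.2), (0.65, 0.3), (0.4, 0.2)]

for rule in ("sum", "product", "min", "max"):
    gen = [fuse(n, v, rule) for n, v in genuine_pairs]
    imp = [fuse(n, v, rule) for n, v in impostor_pairs]
    # FAR target of 20% only because the toy set has five impostor scores.
    print(rule, gar_at_far(gen, imp, 0.2), equal_error_rate(gen, imp))
```

With the roughly 432,400 impostor scores of the real test set, the same function could be called with `far_target=0.001` to obtain the GAR at FAR = 0.1% operating point used in the tables.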
