monitoring%illegal%fishing%through%image%classificaon%cs229.stanford.edu/proj2016/poster/geislerfrostmahajan-monitoring... ·...
TRANSCRIPT
Modeling
SIFT
Fisher Encoding (FE) (GMM Model on SIFT to construct
a visual dic:onary)
Linear SVM
Bag of Visual Words (K-‐Means on SIFT)
Linear SVM
CIFA
R-‐10
CNN
VGG-‐16 Random Forest (15 trees)
SVM (RBF, linear)
Monitoring Illegal Fishing through Image Classifica:on Jason Frost
[email protected] Taylor Geisler
[email protected] Antariksh Mahajan [email protected]
ProblemNature Conservancy, an NGO, monitors illegal fishing by installing cameras on fishing boats. Image classification using computers can help them to speed up and automate the process. We built a multiclass classification system for 8 different species.
Feature Extraction
1. Scale-Invariant Feature Transform (SIFT)• Identifies keypoints (shapes with
directional orientations)• Invariant to rotations and translations• Dense SIFT (dSIFT) runs SIFT on a grid
2. VGG-16 Convolutional Neural Network• 16-layer CNN, has won image recognition
competitions by ImageNet• Pre-trained weights, outputs 1000 features
3. CIFAR-10 Convolutional Neural Network• 10 convolutional layers• Trained our own weights
Data• Original dataset: 3777 images, 8 labels• Data preprocessing and augmentation: • Feature-wise centering set input mean to 0• Random rotations, shifts, flips increased
dataset size to >20,000 images• Rescaled to 224x224 for easier processing
• Challenge: fish are tiny part of most images
Label: ALB BET
Baseline models• Naïve Bayes (by occurrences of colors): 8%• 1-nearest neighbor: 97%!• But this is because similar boats catch
similar fish. Need more complex model for greater generalizability.
Training image with SIFT keypoints
Image 2(conv-‐64), maxpool, ReLU
2(conv-‐128), maxpool, ReLU
3(conv-‐256), maxpool, ReLU
6(conv-‐512), maxpool, ReLU
2(FC-‐4096), 1(FC-‐1000)
ResultsExtrac'on Method Model Training Acc. Test Acc.
VGG-‐16 RF 99.5% 83.4%
SVM-‐RBF 44.3% 45.6% SVM-‐Linear 54.1% 53.3%
-‐ CIFAR-‐10 95.6% 77.3%
Conclusions• VGG-16 to RF performed best• However, difficult to extract most important
features from VGG-16• Better and more generalizable results can
be obtained by extracting fish from images, possibly using another CNN
Monitoring program for pharmaceuticals, illegal substances, and contaminants in farmed fish. Revised
Annual report 2008: Monitoring program for residues of therapeutic agents, illegal substances, pollu