object detection and pixel classification algorithm

43
Video and Image Analytics for Multiple Environments (VIAME) Quick-Start and Installation Manual Object Detection and Pixel Classification Stereo Measurement Object Tracking Video Search and Rapid Model Generation Image Enhancement + = Image Registration and Mosaicing Calibration, 3D and Altitude Estimation Algorithm Evaluation

Upload: others

Post on 16-Oct-2021

20 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Object Detection and Pixel Classification Algorithm

Video and Image Analytics for Multiple Environments (VIAME)

Quick-Start and Installation Manual

Object Detection and Pixel Classification

Stereo Measurement

Object Tracking

Video Search and Rapid Model Generation

Image Enhancement

+ =

Image Registration and Mosaicing

Calibration, 3D and Altitude Estimation

Algorithm Evaluation

Page 2: Object Detection and Pixel Classification Algorithm

Main website:

https://www.viametoolkit.org/

Github:

https://github.com/VIAME/VIAME

Discussion forum:

https://github.com/VIAME/VIAME/discussions

Tutorial videos:

https://www.youtube.com/channel/UCpfxPoR5cNyQFLmqlrxyKJw

Bug reporting:

https://github.com/VIAME/VIAME/issues, https://github.com/VIAME/DIVE/issues

All documentation links:

https://viame.readthedocs.io/en/latest/section_links/documentation_overview.html

Important LinksInstallers (pre-built binaries), docker images, and source code are hosted on GitHub. Pre-built binaries are for users, the source code and build instructions are for developers.

There are 5 types of documentation: the quick-start guide (this document), tutorial videos, user forums, example readmes, and the full manual.

Page 3: Object Detection and Pixel Classification Algorithm

VIAME FlavorsVIAME comes in a few different interfaces with slightly different capabilities, as shown in the following slides and tables.

Some interfaces are deprecated (e.g. VIEW) in favor of newer replacements (DIVE), though will still remain as an option in VIAME for a few specialized cases, just not developed significantly further.

Lastly, not listed in this document thoroughly, are programming APIs for developers.

DIVE – Web and Desktop Annotator

Originally created as the VIAME-Web interface (with a public server hosted at https://viame.kitware.com), a desktop version of this web annotator and model trainer is to be released in January 2021 in both windows .msiinstallers and .zip release formats.

This tool is currently the most general purpose annotator, and supports polygons, lines, points, or boxes, and can train models over multiple videos or image sequences using standard models.

Page 4: Object Detection and Pixel Classification Algorithm

VIEW- Original Desktop Annotator

SEARCH - Standalone Search Tool

SEAL – Multi-Model Desktop Annotator

An older tool, used explicitly for image/video search and rapid model generation through a procedure called iterative query refinement (IQR), wherein the user provides an exemplar of what they’re looking for then the system provides new results for the user to accept or reject. While this is happening, a simple model is trained for the query which can be saved out and re-used in annotators.

Original VIAME desktop annotator for generating either detection or track-level annotations (boxes or polygons). Contains many optimizations for annotating and running pipelines on large (high resolution) imagery. Coded in C++ for efficiency. Can only train models over a single video or image sequence, with limited model selection. Some users prefer its style of track annotation or use it on high resolution clips.

Supports annotating detection or track boxes in multiple camera views simultaneously, if a transformation is loaded mapping pixels from one view to the other (e.g. boxes created in one camera view will show up in the other). Supports 2-4 cameras side-by-side in the viewer. Can only train models on one sequence from one camera at a time.

Page 5: Object Detection and Pixel Classification Algorithm

Project files are a collection of scripts targeting either groups of images or videos. They are documented later on in this guide. Project files are also used to launch some of the annotation GUIs in the desktop version of the software, or to train models across multiple sequences headless (without a GUI) to prevent the GUI from using any system resources while training, e.g. VRAM, reserving more for the training process.

Feature Example Files

Project Files

VIEW Interface

SEARCH Interface

SEAL Interface

DIVE Interface

Runnable from Desktop Installers on Local Desktop

Runnable Remotely over RDP (Windows) or VNC (Linux)

Runnable in Web Browser using Remote Server

Windows .zip / Linux .tar.gz Installations Provided

Windows .exe / .msi Installers Provided

Docker Instances Provided

Project Files

Example Folders

Full Support Partial Support Planned (Short-Term) Planned (Long-Term) No Support

In the “examples” folder of a VIAME install are a series of standalone .bat (Windows) or .sh (Linux) launchers broken down based on functionality covering all aspects of the system.

Capabilities Breakdown

Page 6: Object Detection and Pixel Classification Algorithm

Feature Example Files

Project Files

VIEW Interface

SEARCH Interface

SEAL Interface

DIVE Interface

Standard box-level annotation support in GUI THRU → THRU → Can only confirm or reject boxes

Polygon-level annotation support in GUI THRU → THRU →

Pixel-mask annotation support in GUI

Key-point annotation support in GUI THRU → THRU → Via drawing small boxes

Joint annotation across 2 to 4 cameras simultaneously

Detection model training on single sequence or video SVM models only

Detection model training on multiple sequences or video SVM models only

Ability to run arbitrary detection or tracking pipelines

Ability to run detection pipelines on multiple cameras

Ability to perform image search and iterative refinement THRU → THRU →

Annotation support on very large images in GUI THRU → THRU → Basic support longer load time

Basic support longer load time

Annotation support on images of varying resolutions THRU → THRU →

Ability to run stereo measurement pipelines Poor visualization of results

Ability to run image enhancement under the hood

Ability to output enhanced images THRU →

Ability to output mosaiced images

Automatic scoring and evaluation of detections

Page 7: Object Detection and Pixel Classification Algorithm

- Many algorithms can run with less and a generic 4 Gb patch is available on install page

- Also depends on if talking about just inference (pre-trained model running, uses less) or train

VIAME is designed to run on 8 Gb+ VRAM NVIDIA Graphics cards (1 or more), but….

Some algorithms are meant to run on CPU(motion tracker, baseline pixel classification)

Some algorithms are meant to run on GPU, but can run on CPU(deep frame classification)

Some are designed for GPU, and can run on but take forever on CPU(most deep CNN detectors, many deep learning training routines [classifiers or detection])

GPU (typically a Graphics Card) vs CPU Installations

Additionally…

- This is just for algorithms and processing pipelines; annotation GUIs can be run on CPU

Page 8: Object Detection and Pixel Classification Algorithm

How do I know if I have a GPU already?

https://www.google.com/search?q=how+do+i+know+if+i+have+a+gpu

On windows look in device manager, sometimes computers have more than one card (one embedded on the motherboard, then a 2nd in a plugin slot). Next, search for the card to know it’s specifications. On Linux, many

terminal commands can tell you which you have.

Page 9: Object Detection and Pixel Classification Algorithm

Types of Annotation and Detection Models

9

Box-Level Frame-Level

Pixel-Level Keypoints

Page 10: Object Detection and Pixel Classification Algorithm

10

Detections vs Tracks

Detections and tracks are synonymous across examples and user-interfaces. A track is a (temporal) sequence of single-frame detections, but a detection can also be viewed

as a track with just a single state.

Detection Track

Page 11: Object Detection and Pixel Classification Algorithm

3 Core Model Training Workflows

11

New Classifier Training via IQR

New Detector or Tracker Training

via Deep Learning

Higher-Level Analytics(Automated Detection Quantities, Heatmaps, Occurrence vs Time Plots)

Input Data

New Annotations or Annotation Correction

User Fixes Boxes, Labels, or Masks

Running Existing Detectors and Trackers

(Traditional or Deep)

Pre-Trained Box Generators

Page 12: Object Detection and Pixel Classification Algorithm

Workflow #1: Traditional Deep Learning Model Generation from Scratch.

Process:

(a) Load up Imagery in Annotator

(b) Annotate Imagery Manually

(c) Export Detection or Tracks Files

(d) Repeat for as Many Sequences as Possible in Diverse Backgrounds

(e) Run Model Training

(f) Evaluate Model Performance

(g) Repeat (d) thru (f) as Desired on Detector Fail Cases

- Focusing additional annotation on sequences with the most errors

Pros:

• Models perform better than most other solutions when trained with enough training data

Cons:

• Requires a large amount of training data and user time to generate it

3 Core Model Training Workflows

Page 13: Object Detection and Pixel Classification Algorithm

Workflow #2: Traditional Deep Learning Model Generation with Partial Automation

Process:

(a) Load up Imagery in Annotator

(b) Run an Automated Detector• Can be IQR based

• Can be default model or other pre-trained detector

• Can be a user generated deep detector

(c) Correct and Export Detection or Tracks Files

(d) Repeat for as Many Sequences as Desired in Diverse Backgrounds

(e) Run Model Training

(f) Evaluate Model Performance

(g) Repeat (b) thru (f) as Desired on Detector Fail Cases

Pros:

• Can speed up annotation if automated detector is decent enough

Cons:

• If automated detector is poor it can take more effort to correct automated outputs instead doing annotations from scratch

3 Core Model Training Workflows

Page 14: Object Detection and Pixel Classification Algorithm

3 Core Model Training Workflows

Workflow #3: IQR (Video Search with Adjudication) for Rapid Model Generation

Process:

(a) Create searchable index for a video archive- Either at full frame level, detection level on top of pre-trained detectors, or track level

(b) Launch search GUI

(c) Use search GUI to generate IQR (.svm) models

(d) Save models to category directory

(e) Evaluate models

Pros:

• Can be done with very little user effort, mostly computer runtime

• Can be used to rapidly generate models for new classes

Cons:

• GUI generally crashes after about 6 iterations due to memory issues

• Models generally not as good as deep models trained on enough training data (but can be better for cases with not a lot of training data, often, depends on the exact case)

Page 15: Object Detection and Pixel Classification Algorithm

(Supported by some GUIs)

VIAME CSV is the primary input/output format supported by default, with also some support for COCO

(Supported by each GUI)

Annotation Formats

https://viame.readthedocs.io/en/latest/section_links/detection_file_conversions.html

COCO JSON Adaptation

https://cocodataset.org/#home Added track support

VIAME-CSV

Page 16: Object Detection and Pixel Classification Algorithm

Annotation Best Practices

16

Need to consider efficiency (time) vs quality tradeoffs when deciding to do

boxes vs pixel masks, box quality, keypoints + boxes, etc…

Page 17: Object Detection and Pixel Classification Algorithm

Installation Methods

.zip files (Windows) or .tar.gz files (Linux) – Desktop or RDP/VNC

Installers provided are provided in compressed .zip or .tar.gz format for full desktop installations. Using these types of installers are documented in the following sections. This format of installer isalso useful for when users do not have admin privileges on their machines.

.exe / .msi (Windows Only) – Desktop or RDP/VNC

Desktop installers will also be provided via Windows installation wizards in early 2021. This will allow users to select which components they want on, with known defaults.

Algorithm Docker Containers – Desktop or Web

Algorithm-only docker containers with examples, project file, and command-line interface support are provided on docker hub. See github page for more examples and links. This allows users to install the algorithms on local, remote or 3rd-party (e.g. AWS, Azure) servers easier for deployments, e.g. if one wants to just train up models as opposed to hosting a full web GUI interface.

Web-App Docker Containers – Web / Web Browser App

Installers for the full web application are available for hosting instances of the data manager graphical interface (which is accessed via web browser) plus annotation pipelines on local or remote servers. See github page for more information. An example of this is hosted at: viame.kitware.com. In this case users don’t need GPUs, rather just one central server does.

.dmg / .app (Mac Only) – Desktop

A reduced version of the software with just certain annotators (DIVE) to be released in early 2021. Most Macs stopped shipping with NVIDIA cards a few years back, making training pipelines difficult.

Page 18: Object Detection and Pixel Classification Algorithm

Desktop Installation from ZIP/TAR Files

Windows 7/8/10, 64-Bit

Requirements:Windows 7, 8, or 10, 64-Bit

Recommendations:NVIDIA GPU with >= 4 Gb Video RAM (partial image processing support)

>= 8 Gb Video RAM (full image processing support)

A. Download Binaries

Goto: https://github.com/VIAME/VIAME

Download Correct Pre-Built Binary for your Operating System:

Binaries are currently large (~4Gb) due to the inclusion of multiple model files for training different methods.

B. Uninstall Previous Versions

Only perform this step if you have a previously installed version.

Typically located at C:\Program Files\VIAME

Remove this directory, optionally backing it up until you validate your new installation

Linux (Ubuntu, CentOS, RHEL, etc…)

Requirements:Ubuntu (e.g. 16.04, 18.04), CentOS 7/8, many others

Recommendations:NVIDIA GPU with >= 4 Gb Video RAM (partial image processing support)

>= 8 Gb Video RAM (full image processing support)

Goto: https://github.com/VIAME/VIAME

Download Correct Pre-Built Binary for your Operating System:

Binaries are currently large (~4Gb) due to the inclusion of multiple model files for training different methods.

Only perform this step if you have a previously installed version.Typically located at /opt/noaa/viame

Remove this directory (‘rm –rf /opt/noaa/viame’), optionally backing it up until you validate your new installation. To optionally backup, open terminal:

cd /opt/noaamv viame viame-bckup

After validating the new installation, remove old version:rm -rf /opt/noaa/viame-bckup

Page 19: Object Detection and Pixel Classification Algorithm

19

C. Install Dependency – NVIDIA Drivers

Only perform this step if you don’t have CUDA or appropriate NVIDIA drivers installed ahead of time and are using GPU-enabled binaries.

Drivers can be found here:

https://www.nvidia.com/Download/index.aspx?lang=en-us

(Version 451.82 or above is required for installation)

Or alternatively get CUDA (installing CUDA is no longer required, even though it use to be, only the drivers are, but they are included in CUDA providing another path to get the drivers. CUDA also has some other useful tools, such as nvidia-smi.exe, which is useful for monitoring GPUresources):

*Windows 7, unlike 8 and 10, requires some updates and service packs installed, e.g. KB2533623 alongside drivers or else you will get errors using GPU-dependent code.

Only perform this step if you don’t have CUDA or appropriate NVIDIA drivers installed ahead of time. Drivers can be found here:

https://linuxhint.com/ubuntu_nvidia_ppa/ (Ubuntu)https://www.nvidia.com/Download/index.aspx?lang=en-us (Other)

(Version 450.51 or above is required for installation)

Or alternatively get CUDA (CUDA is no longer required, only the drivers are, even though it use to be, but they are included in CUDA):

Goto: https://developer.nvidia.com/cuda-toolkit-archive

The best way to install the drivers depends on your Linux version. We recommend using package managers (like the above PPA for Ubuntu) when able, but if that fails falling back to one of NVIDIA’s standalone installers.

Desktop Installation from ZIP/TAR Files

Page 20: Object Detection and Pixel Classification Algorithm

20

D. Extract Downloaded VIAME Binaries

Choose an installation directory for VIAME.

We recommend C:\Program Files\VIAME, from here on out this will be known as [viame-install]

Extract the binaries from step A, for example if using winrar select the below, or alternatively ‘Extract All’ using the default windows option:

Installation Complete

Choose an installation directory for VIAME.

We recommend /opt/noaa/viame, from here on out this will be known as [viame-install]

Extract the binaries from step A, for example, right click on the downloaded .tar.gz binaries file and click “Extract Here”

The alternative is to untar the file on the command line type:

tar -xvf VIAME-v*-Ubuntu-64Bit.tar.gz

Navigate to the folder with the extracted ‘viame’ folder and move it to [viame-install], for example:

mkdir –p /opt/noaa/mv viame /opt/noaa

The contents of the folder should look like the below.

Depending on your system, you may need to get permission to modify your install directory (e.g. /opt/noaa/viame)

Installation Complete

Desktop Installation from ZIP/TAR Files

Page 21: Object Detection and Pixel Classification Algorithm

21

Your GPU isn’t very powerful, or maybe you’re running too many things at once, or have an old

process which has stuck around. Maybe try logging out and in again if you suspect the later.

Your install looks like the left, not the right. Something went wrong and your download and

extraction of the binaries was only partially complete (Low disk space? Dropped connection?),

redownload and re-extraction is recommended.

Common Installation or Run Issues

Page 22: Object Detection and Pixel Classification Algorithm

Ways to Run System Components (Desktop)

22

[viame-install] = Where you installed the software to (C:\Program Files\VIAME, /opt/noaa/viame)

(1) Running examples in the [viame-install]/examples sub-folder

The full manual for these is available here: https://viame.readthedocs.io/en/latest/

And are mirrored in the comments of the github repository here: https://github.com/VIAME/VIAME/tree/master/examples

Both the manual and corresponding examples are broken down by functionality (annotation, object detection, measurement, etc…). This method can be useful if you just want to run one component in a standalone capacity without much effort.

(2) Project files located in [viame-install]/configs/prj-(linux or windows)/*

Useful for bulk processing data, as well as training object detection and tracking models with three to four different techniques. Contains run scripts for multiple functions in the same folder. Can be easily copied to a new directory, e.g. your computer’s desktop (or anywhere else of choice) and run from there.

(3) Run graphical user interfaces using desktop or menu links (exe/msi installer only)

[RECCOMENDED]

Page 23: Object Detection and Pixel Classification Algorithm

How to Run Assorted Functionality in Examples or Project Scripts

23

On Windows:All examples and project file launchers are .bat scripts

You just need to double click them to run them

(sometimes a security dialogue will pop up but just ignore them)

On Linux:You may need to know 3 command line commands:

'bash' - for running commands, e.g. 'bash run_annotation_gui.sh' which launches the application

'ls' - for making file lists of images to process, e.g. ‘ls *.png > input_list.txt' to list all png image files in a folder

'cd' - go into an example directory, e.g. 'cd annotation_and_visualization' to move down into the annotation_and_visualization example directory. 'cd ..' is another useful command which moves one directory up, alongside a lone 'ls' command to list all files in the current directory.

Alternatively, you can make .sh scripts double-clickable via:

https://askubuntu.com/questions/138908/how-to-execute-a-script-just-by-double-clicking-like-exe-files-in-windows

to not have to deal with launching things from the terminal.

As a second alternative you can make shortcuts to your VIAME annotation GUI and common scripts:

https://askubuntu.com/questions/854373/how-to-create-a-desktop-shortcut

Page 24: Object Detection and Pixel Classification Algorithm

How to Run From Examples Folder

24

(1) Go to the examples directory

(2) Choose your example

(3) Read the corresponding entry in either the manual of example webpage

(4) Run the example scripts

In some examples knowing how to make image lists are also helpful for some examples. There are scripts showing how to do this in the project files which can be edited in notepad++ if necessary on Windows. On Linux running ‘ls [path]*.png > input_list.txt’ will generate a list as well, for a directory of .pngs in ‘path’. You can also mount network drives and do your processing off them.

Example run files can also be run from different directories if you open up the launch script in a notepad editor and change the ‘VIAME_INSTALL’ path to point to your VIAME installation directory.

Page 25: Object Detection and Pixel Classification Algorithm

Using Project Files

25

Project files are broken down based on operating system (Windows vs Linux/Mac) then class, in [viame-install]/configs:

Lot of launchers here, but more on that later….

Page 26: Object Detection and Pixel Classification Algorithm

Using Project Files

26

for_images - is designed to process a single list of imagery at time

(except for training detectors which can be on multiple lists)

for_videos - is designed to process either a folder of videos, a single video, or a folder of folders of images

To use them copy them to a folder you want to work out of, hopefully one with a bit of disk space if you plan on doing model training.

Possible requirement: if you installed VIAME to a non-default location, in order to run any of the scripts you will need to change VIAME_INSTALL at the top of your run script to point to your installation, i.e.:

Page 27: Object Detection and Pixel Classification Algorithm

Using Project Files

27

The video project file processes a folder of videos called ‘videos’ in the project directory by default. You can alternatively let ‘videos’ be a shortcut to a network drive or other non-local location (e.g. an external hard drive). Or have the project file be on the alternative drive or network.

‘videos’ can also be a directory of directory of images

Page 28: Object Detection and Pixel Classification Algorithm

Using Project Files

28

Also in project files (for videos) are frame rates to process videos at, these can be changed along with the video input directory. The later can be modified to point to other locations on disk or network as desired.

Page 29: Object Detection and Pixel Classification Algorithm

Using Project Files

29

The image project file processes a list of images called “input_list.txt” at either train (model generation) or test time (model application). This file point to imagery to process, 1 file per line.

On windows, the ‘make_image_list.bat’ command can be used

to make this list if you put your images in a folder called

‘images’ then double click make_image_list.bat; similarly on

Linux the ‘ls’ command can be used. The glob pattern can also

be changed from ‘*’ to ‘*.png” to list images of only a certain

type and not all files in the directory.

Example link to image directory stored on external drive

Page 30: Object Detection and Pixel Classification Algorithm

30

How to Perform Annotation - VIEW (for workflows #1 and #2)

Annotation should be performed in the format of what you want your classifier to output. For example, if you only care about full frame properties, you should only annotate full frame properties

If you want to perform species-level classification, you need to assign types to your

detections or tracks, the type being the species or subspecies identifier. Annotating

at a finer level is usually best, as later you can then train on the merging of lower-

level categories, but still have the option to train at the finest level.

A full annotation guide and annotation example videos can be found below, we

recommend reading through them and learning how to use the annotator:

https://viame.readthedocs.io/en/latest/section_links/annotation_and_visualization.html

In the project file, just like in the examples, ‘launch_annotation_gui.bat’ launches

the annotator.

Page 31: Object Detection and Pixel Classification Algorithm

31

Running Trained Detector Pipelines

Detector pipelines can be run from the GUI on a single sequence or video of data via the Tools->Execute Pipeline->[pipeline] option, as shown above; results then appear in the GUI.

Detector pipelines can be run from the project file in batch over multiple sequences and then the result of each sequence opened in the annotation GUI automatically

Page 32: Object Detection and Pixel Classification Algorithm

32

Running Trained Detector Pipelines

Empty full frame labels can be generated via this pipeline

Page 33: Object Detection and Pixel Classification Algorithm

33

VIEW Annotation Tips n’ Tricks

• When you select a directory of images to process a glob pattern appears in the dialogue (below) as “*”. “*” means try to open any file in the directory, but if you have non-images in a directory it can also be helpful to put “*.png” or “*.sgi” to only load images of a known output extendion. This can also be used to select only the left or right camera if you have stereo data in the same folder.

• VIAME provides 3 main pre-trained pipelines: a default, generic, and motion pipeline.

– Default: Detect vertebrate and invertebrates using a default per-image CNN

– Generic: Put boxes around all arbitrary objects in an image using a CNN

– Motion: Put boxes around all moving targets using traditional approaches

• These models can be run from the GUI in either per-frame detection-only or tracking form

• Other pipelines runnable from the GUI allow the running of local (user-trained) models or ones generated from IQR rapid model generation. VIAME add-ons add more runnable models from this dialogue as well.

• Pixel level classification via polygons currently only works with scene element annotations right now, track polygons are broken

Page 34: Object Detection and Pixel Classification Algorithm

34

VIEW Annotation Tips n’ Tricks

When saving out detections/tracks they can be saved out either as filtered tracks or non-filtered tracks. Filtered tracks will have the threshold you set in the filtering options enabled and is generally the way to go if correcting detectors.

Page 35: Object Detection and Pixel Classification Algorithm

3 Core Detection Training WorkflowsIn workflows #1 and 2 model training is performed via saving your annotations and source data (or creating shortcuts to it) into the ‘training_data’ folder in a project file. The folder structure required for this is shown in the below example and also below:

https://github.com/VIAME/VIAME/tree/master/examples/object_detector_training

Project files use the same data partition strategy, just in their ‘training_data’ folders, then the train_deep_model_[arch].sh/.bat scripts can be used to train models in bulk, where [arch] is the desired detection architecture.

The labels.txt contains 1 line per desired output

category and a list of synonyms on the same line,

e.g. the above would produce a detector which

produces 2 outputs: fish and shark, but including the

other categories in those super-categories.

Page 36: Object Detection and Pixel Classification Algorithm

Supported Detector Architectures

36

There are several detection and classification models in the system:

• YOLOv2, Gridded YOLOv2, YOLOv3, Gridded YOLOv3, YOLO-WTF

• Cascade Faster RCNN, Mask Cascade Faster RCNN

• Scallop-TK

• ResNet50 Full Frame Classifier

• SVM Models on Top of ResNet50 Late Features

• AdaBoost on top of Manual Features for Pixel Classification

• FCNN variants for feature-point identification

Cascade FRNN (CFRNN) is typically more accurate than YOLO, but slower and requires much more annotations to train successfully. The SVM models require the least number of annotations to train successfully, but is dependent on the off-the-shelf generic object detector producing decent results for your problem. It can often be hard to know which method is best without looking at the problem and trying different techniques.

The with temporal features (WTF) variants are typically better at detecting small movers in the background and worse at using color as a discriminator between categories, but they are still experiment. Unlike the others, Mask RCNN and AdaBoost methods also performs pixel-level classification.

Page 37: Object Detection and Pixel Classification Algorithm

Training from the VIEW Annotation GUI

37

Models can be trained from the VIEW GUI, but currently only on a single

sequence, unlike the examples and project files. It will use all loaded data and all

categories in the sequence.

Page 38: Object Detection and Pixel Classification Algorithm

3 Core Detection Training Workflows

Method #3: Rapid Model Generation (IQR) Example Manuals:

https://viame.readthedocs.io/en/latest/section_links/search_and_rapid_model_generation.html

https://github.com/VIAME/VIAME/tree/master/examples/search_and_rapid_model_generation/viame_ingest

The above manuals correspond to the examples folder, which are similar to the project file. The project file contains scripts for creating indexes and models both on bounding box detections, bounding box tracks, and full frame classifiers with similar names.

IQR uses active learning to progressively fine-tune a model towards a target query via progressive rounds of supplied user feedback

Page 39: Object Detection and Pixel Classification Algorithm

Stereo Techniques

The system contains two methods for performing stereo measurement: a simple segmentation approach and between learned fixed points. To annotate points to train the fixed-point classifier, small boxes should be placed where the center of the box is on the fixed key-point of interest. Keypoints should be given consistent names, e.g. ‘head’ or ‘tail’.

Page 40: Object Detection and Pixel Classification Algorithm

Running Trained Models in Project Files

All trained output models get put in the ‘category_models’ folder regardless of which technique you are using. The ‘deep_training’ and ‘augmented_images’ folders contain intermediate outputs that can be deleted when finished, although ‘deep_training’ will output models during the training process that can also be used if the training processes are terminated early.

All of the ‘generate’ scripts in the project file use the detection models in the category_modelto produce automatically computed results. They are divided based on whether or not they produce tracks or just per-frame detection, and whether or not they’re applying deep models to data or svm (IQR) models.

These scripts look for all .svm files or a detector.pipe file which get produced by training scripts. Note: older versions of the tool did not generate detector.pipe files at training time, so if you need help upgrading them ping the developers email.

After using the generate scripts on a larger archive of videos, whenever you launch the annotation GUI it will also ask you if you want to load the detections or tracks you computed on the new data.

Page 41: Object Detection and Pixel Classification Algorithm

Running Trained Models in Project Files

Outputs for all generation scripts (e.g. computed track files) get put in the ‘database’ folder for every sequence or video.

For ‘for_image’ project files the ‘input_list.txt’ file is similarly consumed for the generate_* scripts.

For ‘for_videos’ project files every folder or video file in the ‘videos’ folder is consumed

Outputs can then be opened either in the annotation GUI new project dialogue or via the launch script which will query which stored computed results in the database to load.

Page 42: Object Detection and Pixel Classification Algorithm

Running Trained Models in Project Files

Lastly, there are many features that can be added to custom project files not in this document. One example is the ‘MOUSS Project’ which outputs per-species plots and Max-N counts for each input video using the trained models, but there are others as well. The ‘launch_timeline_viewer.bat/sh’ option also allows for this in a custom GUI.

Page 43: Object Detection and Pixel Classification Algorithm

For Questions:

[email protected] – General Technical

[email protected] – VIAME Web Questions and Support

https://github.com/VIAME/VIAME/discussions - Open Discussion Board

https://github.com/VIAME/VIAME/issues – Bug Reporting (General)

https://github.com/VIAME/DIVE/issues – Bug Reporting (DIVE Interface)