Monday, January 29, 2007

Current work

Now that we've gotten familiar with Piotr's code, our next step is to run it using the entire smart vivarium dataset. We'll look at the results to find which video clips were labeled incorrectly and figure out how positional information can help prevent those errors.

cuboids!

We've been going over Piotr's MATLAB code for cuboids and feel comfortable using it now. We first ran the recognition demo on the face dataset, then modified the demo to run on the mouse behavior dataset. Due to memory constraints, we couldn't finish running it, but we'll fix this by tonight.

Here's a sample clip from the smart vivarium dataset, drink02.avi from set00:



Here are the cuboids obtained from that video clip, set to loop 10 times; each cuboid lasts approximately 1 second:


Here we have a sample clip of the cuboids clustered together by prototypes from the smart vivarium dataset:




copyright info:
This database is Copyright © 2005 The Regents of the University of California. All Rights Reserved. Permission to use, copy, modify, and distribute this database and its documentation for educational, research and non-profit purposes, without fee, and without a written agreement is hereby granted, provided that the above copyright notice, this paragraph and the following three paragraphs appear in all copies. Permission to incorporate this database into commercial products may be obtained by contacting:

Technology Transfer Office
9500 Gilman Drive,
Mail Code 0910
University of California
La Jolla, CA 92093-0910
(858) 534-5815
invent@ucsd.edu

This database and documentation are copyrighted by The Regents of the University of California. The database and documentation are supplied "as is", without any accompanying services from The Regents.

IN NO EVENT SHALL THE UNIVERSITY OF CALIFORNIA BE LIABLE TO ANY PARTY FOR DIRECT, INDIRECT, SPECIAL, INCIDENTAL, OR CONSEQUENTIAL DAMAGES, INCLUDING LOST PROFITS, ARISING OUT OF THE USE OF THIS DATABASE AND ITS DOCUMENTATION, EVEN IF THE UNIVERSITY OF CALIFORNIA HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. THE UNIVERSITY OF CALIFORNIA SPECIFICALLY DISCLAIMS ANY WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE DATABASE PROVIDED HEREUNDER IS ON AN "AS IS" BASIS, AND THE UNIVERSITY OF CALIFORNIA HAS NO OBLIGATIONS TO PROVIDE MAINTENANCE, SUPPORT, UPDATES, ENHANCEMENTS, OR MODIFICATIONS.

Wednesday, January 24, 2007

Cuboids Code

Piotr gave us access to his cuboids package. We will start learning how the code works and will show the cuboids on Monday.

Monday, January 22, 2007

Our Behavior Descriptor

We e-mailed Piotr last week asking for his implementation of cuboids. Since we'll be going through his code, we've decided to start working on our behavior descriptor, specifically on how we'll represent the spatial relationships between cuboids.
Agarwal et al. keep track of spatial relationships between detected parts by dividing the angle between each pair into bins of 45 degrees and measuring the distance between parts by window size. They represent this information in the feature vector of each training image. Their feature vector is set up as a series of binary features, indicating whether or not a part or relationship is present.
Our task is to extend this into the spatio-temporal domain.
Possible ways to do this are:
  1. Calculate the distance and angle between each pair of cuboids in x, y coordinates, and store the time difference in a separate field.
  2. Calculate the Euclidean distance between each pair of cuboids in 3D using x, y, and t coordinates.
Once we have the relationships between parts, we include them in our final behavior descriptor.
We'll most likely be using a histogram of the cuboid types present and the relationships between these.
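Here's a rough sketch of the two options above, assuming each cuboid is summarized by a hypothetical (x, y, t) center. The 45-degree bin width follows Agarwal et al.; the function names and the `time_scale` parameter are placeholders we made up for illustration and are not part of Piotr's package.

```python
import math
from itertools import combinations

def relation_2d_plus_time(a, b, angle_bin_deg=45):
    """Option 1: spatial distance and binned angle in (x, y); time difference kept separately."""
    (xa, ya, ta), (xb, yb, tb) = a, b
    dx, dy = xb - xa, yb - ya
    dist = math.hypot(dx, dy)
    # Angle of b relative to a, mapped to [0, 360) and quantized into bins.
    angle = math.degrees(math.atan2(dy, dx)) % 360.0
    n_bins = 360 // angle_bin_deg
    angle_bin = int(angle // angle_bin_deg) % n_bins
    return dist, angle_bin, tb - ta

def relation_3d(a, b, time_scale=1.0):
    """Option 2: Euclidean distance in (x, y, t).

    time_scale (an assumption) converts frames into pixel-like units so that
    space and time are comparable.
    """
    (xa, ya, ta), (xb, yb, tb) = a, b
    return math.sqrt((xb - xa) ** 2 + (yb - ya) ** 2 + (time_scale * (tb - ta)) ** 2)

def pairwise_relations(cuboids):
    """Compute both relationship variants for every unordered pair of cuboids."""
    return [
        (i, j, relation_2d_plus_time(cuboids[i], cuboids[j]), relation_3d(cuboids[i], cuboids[j]))
        for i, j in combinations(range(len(cuboids)), 2)
    ]
```

Option 1 keeps the temporal offset as its own field, so it can be binned independently of the spatial bins; option 2 folds everything into one number, which is simpler but depends on how time is scaled against space.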

Last week's presentation

Uploaded last week's presentation at:
http://www.sharebigfile.com/file/66521/cuboidsIntro-ppt.html
It's based on the presentations linked in the previous post, below.

Wednesday, January 10, 2007

Useful Links

Here are a few links and brief summaries for some of the papers we'll be working from:

Behavior Recognition via Sparse Spatio-Temporal Features by Dollár et al.
The main paper we'll be using for this project. It introduces a response function based on a quadrature pair of Gabor filters applied temporally and a 2D Gaussian applied along the spatial dimensions. Cuboids (small spatio-temporal video clips) are then extracted at each local maximum of the response function applied to a clip of video. A transformation is applied to each cuboid (the paper tests several) to create a feature vector. Since the number of possible cuboids is large but the number of distinct types is small, similar cuboids are clustered together to form cuboid prototypes. Each behavior is then described as a video clip in which a given set of cuboid prototypes is present.
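To get a feel for the temporal half of that response function, here is a minimal sketch as we currently understand it: an even (cosine) and an odd (sine) Gabor filter are applied to the intensity of a single pixel over time, and the two outputs are squared and summed. The spatial Gaussian smoothing is left out, and the function names, the truncation of the filter at 2*tau, and treating `omega` as cycles per frame are all our own assumptions, not taken from the paper or from Piotr's code.

```python
import math

def gabor_pair(tau, omega):
    """Build an even/odd (quadrature) pair of 1D temporal Gabor filters.

    tau controls the width of the Gaussian envelope (in frames);
    omega is the oscillation frequency (assumed here to be cycles per frame).
    """
    half = int(math.ceil(2 * tau))  # truncation length is an arbitrary choice
    ts = range(-half, half + 1)
    h_ev = [-math.cos(2 * math.pi * t * omega) * math.exp(-t * t / tau ** 2) for t in ts]
    h_od = [-math.sin(2 * math.pi * t * omega) * math.exp(-t * t / tau ** 2) for t in ts]
    return h_ev, h_od

def temporal_response(signal, tau, omega):
    """Response at each frame: (signal * h_ev)^2 + (signal * h_od)^2.

    signal is the intensity of one pixel over time; frames outside the
    clip are treated as zero.
    """
    h_ev, h_od = gabor_pair(tau, omega)
    half = len(h_ev) // 2
    n = len(signal)
    out = []
    for i in range(n):
        ev = od = 0.0
        for k in range(-half, half + 1):
            j = i - k
            if 0 <= j < n:
                ev += signal[j] * h_ev[k + half]
                od += signal[j] * h_od[k + half]
        out.append(ev * ev + od * od)
    return out
```

A pixel whose intensity oscillates near the filter frequency should give a much larger response than a pixel with constant intensity, which is why periodic motion tends to produce interest points.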








Learning to Detect Objects in Images via a Sparse, Part-Based Representation by Agarwal et al.
One of the first papers to use sparse features for object detection. Agarwal et al. use the Förstner corner detector to find interest points, then use 2D windows around those points to create a vocabulary of parts. Objects are described by the presence of each part and its position relative to the other parts. SNoW is used to train a classifier based on those features.

Unsupervised Learning of Human Action Categories Using Spatial-Temporal Words by Niebles et al.
Based on Dollár et al., this paper uses the same response function. However, it uses a probabilistic Latent Semantic Analysis (pLSA) model to determine behavior (we're not quite sure how this works yet).


Some useful tutorials and manuals:
Gabor Filters
A fairly complex tutorial; we're still having a hard time completely understanding Gabor filters.
SNoW (Sparse Network of Winnows)
Described by Roth as a "multi-class classifier". The executable is available on their website. We might use this to include relative cuboid positioning in our project.

Presentations:
Object Recognition using sparse features
Presentation for Agarwal et al.'s paper. Includes a short demo on SNoW.
Behavior Recognition using Cuboids
Presentation for Dollár et al.'s paper.

Datasets:
Mouse behavior dataset
Sets of clips obtained from the smart vivarium.