2011年6月8日 星期三

[aMMAI] Paper Summary: Online Dictionary Learning for Sparse Coding

Title: Online Dictionary Learning for Sparse Coding
Author: Julien Mairal, Francis Bach, Jean Ponce, Guillermo Sapiro
Publication: ICML 2009



Sparse coding: modeling data vectors as sparse linear combination of basis elements.


This paper propose a new stochastic online algorithm for learning dictionaries adapted to sparse coding task, and proven its convergence.

Sparse coding optimize the empirical cost function:
n: number of samples
D: m*k dimension dictionary
Loss function l(x,D) should be small if D is “good” at representing the signal x.

Define l( x , D ) as the optimal value of the ℓ 1 -sparse coding problem:
To prevent D from being arbitrarily large, it is common to constrain its columns d to have an ℓ 2 norm less than or equal to one.
Since the computation of α dominates the cost of each iteration, a second-order optimization technique can be used in this case to accurately estimate D at each step when α is fixed.

Contrary to classical first-order stochastic gradient descent, it does not require explicit learning rate tuning and minimizes a sequentially quadratic local approximations of the expected cost.

Online Dictionary Learning
Two motivations:


Dictionary Update
Updating the dictionary uses block-coordinate descent with warm restarts, and one of its main advantages is that it is parameter-free and does not require any learning rate tuning, which can be difficult in a constrained optimization setting.


Optimizing the Algorithm
  • Handling Fixed-Size Datasets.
    The size of the training set is often finite. In this situation, the same data points may be examined several times, and it is very common in online algorithms to simulate an i.i.d. sampling of p( x ) by cycling over a randomly permuted training set
  • Mini-Batch Extension.
    Improve the convergence speed of our algorithm by drawing η  >  1 signals at each iteration instead of a single one, which is a classical heuristic in stochastic gradient descent algorithm
  • Purging the Dictionary from Unused Atoms.
    Some of the dictionary atoms are never used. A common practice is to replace them during the optimization by elements of the training set.



Contribution:
  • The dictionary learning problem as the optimization of a smooth nonconvex objective function over a convex set, minimizing the (desired) expected cost when the training set size goes to infinity.

  • Propose an iterative online algorithm that solves this problem by efficiently minimizing at each step a quadratic surrogate function of the empirical cost over the set of constraints.
  • Experimentally their algorithm is significantly faster than previous approaches to dictionary learning on both small and large datasets of natural images.




2011年6月1日 星期三

[aMMAI] Paper Summary: Fast concurrent object localization and recognition

Title: Fast concurrent object localization and recognition
Author: Tom Yeh, John J. Lee, Trevor Darrell
Publication: CVPR, 2009



這篇paper提出一個方法來對多個不同物件做 localizationrecognition

傳統ESS的方式,要找多個不同的物件,直接的想法就是每個物件都去找一次。
但是這樣很耗時,ESS的複雜度: O(N^4),用ESSM個物件: O(M*N^4)

作法:
  1. 基於ESS的方法,加上物件的維度。
  2. (xj,yj,vj(i)): feature j 在位置(x,y)且該feature對物件i的分數v(i)
  3. v(i)的計算方式: 就是將有屬於該物件i的總和,最後在乘上SVM的權重。
  4. 計算出分數後,再用branch-of-bound的方式去切割。
  5. 切割時是切在feature上面,不過最後面框出來的視覺上看起來差別不大。
  6. 切物件時,每次都會針對分數作排序,因為是堆疊的行式,所以會先放低方再放高分。


上面的做法是有幾項假設:
  1. 物件M是固定的,paper中用了120種物件。
  2. 每張圖中的物間只有一種。如果有多種物件在同一張圖裡面,作法上就要多加上,找到一種物件後,就將該範圍的feature拿掉。
  3. 如果不同物件間位置太相近,就很可能出錯。


最後面有與幾項baseline做實驗。




BOF最快,但是無法定位。

ISM最準,但是如果不做orientation位置的正確度就會下降,如果做了,時間複雜度就會上升。

然後有提到可以用非長方體的方式來做bounding box,像是利用多個長方體或是三角形來框想要的物件,算出來的分數會比較好。