A Comparison of Imitation Learning Algorithms for Bimanual Manipulation: A Close Reading of the Paper

Concepts

Behavior Cloning

\begin{align*} \mathbf{\hat{\theta}} = \argmax_{\theta} \mathbb{E}_{(\mathbf{s},\mathbf{a})\sim \mathbf{\tau_E}}[\log \pi_{\theta}(\mathbf{a} \mid \mathbf{s})] \end{align*}
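
To make the objective concrete, here is a minimal sketch of behavior cloning as maximum likelihood. It assumes a one-dimensional state and action and a Gaussian policy with fixed standard deviation whose mean is linear in the state; the names `log_likelihood`, `fit_bc`, and the synthetic expert `a = 2s - 0.5` are illustrative assumptions, not anything taken from the paper. Under these assumptions, maximizing the expected log-likelihood reduces to least-squares regression on the expert's state-action pairs.

```python
import math
import random


def log_likelihood(theta, demos, sigma=1.0):
    """Mean of log pi_theta(a | s) over the demonstrations.

    Assumption: pi_theta(a | s) is Gaussian with mean w * s + b and fixed sigma.
    """
    w, b = theta
    total = 0.0
    for s, a in demos:
        mu = w * s + b
        total += -0.5 * ((a - mu) / sigma) ** 2
        total -= math.log(sigma * math.sqrt(2.0 * math.pi))
    return total / len(demos)


def fit_bc(demos, lr=0.1, steps=500, sigma=1.0):
    """Gradient ascent on the mean log-likelihood with respect to (w, b)."""
    w, b = 0.0, 0.0
    n = len(demos)
    for _ in range(steps):
        # Gradient of the Gaussian log-likelihood w.r.t. the mean parameters.
        grad_w = sum((a - (w * s + b)) * s for s, a in demos) / (n * sigma ** 2)
        grad_b = sum((a - (w * s + b)) for s, a in demos) / (n * sigma ** 2)
        w += lr * grad_w
        b += lr * grad_b
    return w, b


# Synthetic "expert" trajectories tau_E (hypothetical): a = 2 s - 0.5 plus small noise.
rng = random.Random(0)
states = [rng.uniform(-1.0, 1.0) for _ in range(200)]
demos = [(s, 2.0 * s - 0.5 + rng.gauss(0.0, 0.05)) for s in states]

theta_hat = fit_bc(demos)
print("estimated (w, b):", theta_hat)
print("mean log-likelihood:", log_likelihood(theta_hat, demos))
```

In practice the policy would be a neural network and the expectation would be estimated over mini-batches, but the objective being optimized has the same form.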

Action Chunking Transformer

Implicit Behavior Cloning

\begin{align*} \mathbf{\hat{a}} = \argmin_{\mathbf{a}} E_{\theta}(\mathbf{s},\mathbf{a}) \end{align*}
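
The inference step above can be sketched as a search over candidate actions. The example below assumes a one-dimensional, bounded action space and uses plain uniform sampling to approximate the arg min; the `energy` function is a hand-written stand-in for a learned $E_{\theta}$, and `ibc_act` is a hypothetical helper name. It illustrates the idea only and is not necessarily the optimizer used in the paper.

```python
import math
import random


def energy(s, a):
    """Hypothetical stand-in for a learned E_theta(s, a).

    Its minimum over a lies at a = sin(s), chosen only so the result is easy to check.
    """
    return (a - math.sin(s)) ** 2


def ibc_act(s, energy_fn, low=-1.0, high=1.0, n_samples=2048, seed=0):
    """Approximate argmin_a E(s, a) by scoring uniformly sampled candidate actions."""
    rng = random.Random(seed)
    candidates = [rng.uniform(low, high) for _ in range(n_samples)]
    # Return the candidate with the lowest energy for this state.
    return min(candidates, key=lambda a: energy_fn(s, a))


a_hat = ibc_act(0.5, energy)
print("selected action:", a_hat, "target:", math.sin(0.5))
```

Unlike explicit behavior cloning, the policy here never outputs an action directly: the model only scores state-action pairs, so each action costs an optimization at inference time.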

Methodology
