
All posts (37)
Mamba, Mamba-2 and Post-Transformer Architectures for Generative AI with Albert Gu - 693 https://www.youtube.com/watch?v=yceNl9C6Ir0 Attention vs. state-space model: Attention keeps a KV cache and selects from it (softmax selection), whereas a state-space model compresses its history into a fixed-size state. A state-space model has difficulty recovering its past data exactly. Attention works well with a well-defined tokenizer, where every token carries meaningful information, but it needs compression. Many works are integrating these two a..
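A minimal NumPy sketch of the contrast described in this excerpt: attention appends to a growing KV cache and selects over it with a softmax, while a (linear) state-space model folds all past inputs into a fixed-size state. Shapes and names are illustrative assumptions, not from the talk.

```python
import numpy as np

def attention_step(kv_cache, k_t, v_t, q_t):
    """Attention: append to a growing KV cache, then softmax-select over the whole past."""
    kv_cache["K"].append(k_t)
    kv_cache["V"].append(v_t)
    K = np.stack(kv_cache["K"])          # (t, d) -- memory grows with sequence length
    V = np.stack(kv_cache["V"])
    scores = K @ q_t / np.sqrt(len(q_t))
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()             # softmax selection
    return weights @ V

def ssm_step(state, x_t, A, B, C):
    """State-space model: compress all past inputs into a fixed-size state (constant memory)."""
    state = A @ state + B @ x_t          # past tokens are not exactly recoverable from the state
    y_t = C @ state
    return state, y_t
```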
DepGraph: Towards Any Structural Pruning Problem : Structural pruning enables model acceleration by removing structurally-grouped parameters from neural networks. However, parameter-grouping patterns vary widely across models, making architecture-specific pruners, which rely on manually-designed grouping schemes, non-generalizable to new architectures. Abstract : We study any structural pruning, to tackle general structural pruning of ..
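A minimal PyTorch sketch of why structurally coupled parameters must be pruned as a group. This hand-written Conv → BN → Conv group is only an illustration of the dependency problem, not DepGraph's automatic grouping algorithm.

```python
import torch
import torch.nn as nn

conv1 = nn.Conv2d(3, 16, 3, padding=1)
bn1   = nn.BatchNorm2d(16)
conv2 = nn.Conv2d(16, 32, 3, padding=1)

keep = torch.tensor([i for i in range(16) if i not in {2, 7, 11}])  # drop 3 channels

# Pruning output channels of conv1 forces matching edits in bn1 and conv2's input channels.
conv1.weight = nn.Parameter(conv1.weight.data[keep])
conv1.bias   = nn.Parameter(conv1.bias.data[keep])
conv1.out_channels = len(keep)

bn1.weight = nn.Parameter(bn1.weight.data[keep])
bn1.bias   = nn.Parameter(bn1.bias.data[keep])
bn1.running_mean = bn1.running_mean[keep]
bn1.running_var  = bn1.running_var[keep]
bn1.num_features = len(keep)

conv2.weight = nn.Parameter(conv2.weight.data[:, keep])
conv2.in_channels = len(keep)

x = torch.randn(1, 3, 32, 32)
print(conv2(bn1(conv1(x))).shape)  # still runs: torch.Size([1, 32, 32, 32])
```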
EfficientML.ai Lecture 3 Pruning and Sparsity Part II
EfficientML.ai Lecture 2 Pruning and Sparsity Part I
EfficientML.ai Lecture 1 Basics of Neural Network
YOLOv10 1. Abstract We aim to further advance the performance-efficiency boundary of YOLOs from both the post-processing and the model architecture. We first tackle the problem of redundant predictions in the post-processing by presenting a consistent dual assignments strategy for NMS-free YOLOs, with dual label assignments { a one-to-many head and a one-to-one head } and a consistent matching metric. It al..
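A minimal sketch of a task-aligned matching metric in the style this excerpt describes. The exact form, symbols, and default (alpha, beta) values below are assumptions for illustration; the key point is that the one-to-many and one-to-one heads rank candidates with the same metric, which is what makes the dual label assignments consistent and enables NMS-free inference.

```python
def matching_metric(cls_score: float, iou: float, inside_gt: bool,
                    alpha: float = 0.5, beta: float = 6.0) -> float:
    """Assumed form m = s * p**alpha * iou**beta, where p is the classification score,
    iou the overlap with the target box, and s a binary spatial prior
    (whether the anchor lies inside the ground-truth box)."""
    spatial_prior = 1.0 if inside_gt else 0.0
    return spatial_prior * (cls_score ** alpha) * (iou ** beta)
```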
Contrastive Representation Learning : 1. Contrastive Training Objectives Reference from : https://lilianweng.github.io/posts/2021-05-31-contrastive/#contrastive-loss The goal of contrastive representation learning is to learn an embedding space in which similar sample pairs stay close to each other while dissimilar ones are far apart. Contrastive learning can be applied to both supervised and unsupervised settings..
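A minimal PyTorch sketch of an InfoNCE-style contrastive objective matching the goal stated above: embeddings of positive pairs are pulled together while all other in-batch samples act as negatives. The temperature and batch construction are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def info_nce(z_a: torch.Tensor, z_b: torch.Tensor, temperature: float = 0.1) -> torch.Tensor:
    """z_a, z_b: (N, d) embeddings of two views of the same N samples."""
    z_a = F.normalize(z_a, dim=1)
    z_b = F.normalize(z_b, dim=1)
    logits = z_a @ z_b.t() / temperature     # (N, N) cosine similarities
    targets = torch.arange(z_a.size(0))      # positives lie on the diagonal
    return F.cross_entropy(logits, targets)

loss = info_nce(torch.randn(8, 128), torch.randn(8, 128))
```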
VQ-VAE : Neural Discrete Representation Learning