A blog I made so I don't forget

All Posts

Literature Review

[Paper Review] Attention Is All You Need

Paper link: https://arxiv.org/abs/1706.03762
1. Introduction 1.1 Why I chose this paper: "Attention Is All You Need" is the landmark work that proposed the Transformer..

Literature Review

[Paper Review] DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

Paper link: https://arxiv.org/abs/2501.12948
1. Introduction 1.1 ..

Literature Review

[Paper Review] AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head

Paper link: https://arxiv.org/abs/2304.12995
1. Introduction 1.1 Why I chose this ..

Literature Review

[Paper Review] FastSpeech 2: Fast and High-Quality End-to-End Text to Speech

Paper link: https://arxiv.org/abs/2006.04558
1. Introduction 1.1 Why I chose this paper: FastSpeech 2 is ..

Literature Review

[Paper Review] FastSpeech: Fast, Robust and Controllable Text to Speech

Paper link: https://arxiv.org/abs/1905.09263
1. Introduction 1.1 Why I chose this paper: FastSpeech is a deep-learning-based ..

Languages/Python

[Pytorch] How to use view, reshape, transpose, and permute, and the contiguous property

When writing code with PyTorch, the functions most often used to change or manipulate the dimensions of data are view, reshape, transpose, and permute. I thought it was worth properly writing down how to use these functions and what their characteristics are, so I am leaving this summary here. I also want to cover which function to use depending on whether the tensor's memory is contiguous.
view()
- Changes the shape while sharing memory with the original
- Works only on contiguous memory
- Shares memory with the original tensor (i.e., modifying the result of view() also modifies the original)
import torch
x = torch.arange(6)  # [0, 1, 2, 3, 4, 5]
y = x.view(2, 3)
print(y)
# Output
# tensor([[0, 1, 2],
# ..
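The contrast above (view requires contiguous memory, reshape does not, and transpose/permute produce non-contiguous tensors) can be sketched in a minimal example; the tensor shapes here are purely illustrative:

```python
import torch

x = torch.arange(6).view(2, 3)   # x is contiguous, so view() works directly

# transpose() returns a tensor that shares memory with x but is not contiguous
t = x.transpose(0, 1)            # shape (3, 2)
print(t.is_contiguous())         # False

# Calling view() on a non-contiguous tensor raises a RuntimeError,
# so either call .contiguous() first or use reshape(),
# which falls back to copying when it cannot return a view.
flat = t.contiguous().view(-1)
same = t.reshape(-1)             # same values; may or may not copy

# permute() generalizes transpose() to any reordering of dimensions
p = torch.zeros(2, 3, 4).permute(2, 0, 1)
print(p.shape)                   # torch.Size([4, 2, 3])
```

Note that `flat` and `same` hold identical values, but only `reshape()` hides the contiguity question from the caller; when you need to know whether memory was copied, the explicit `.contiguous().view()` form makes the intent visible.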

AlienCoder