A list of demo websites for automatic music generation research
text-to-music/audio
- Multi-Aspect Conditioning (diffusion; maman24): https://benadar293.github.io/multi-aspect-conditioning/
- Presto (diffusion; novack24arxiv): https://presto-music.github.io/web/
- MMGen (diffusion; wei24arxiv): https://awesome-mmgen.github.io/
- Seed-Music (diffusion+Transformer; bai24arxiv): https://team.doubao.com/en/special/seed-music
- SongCreator (diffusion; lei24arxiv): https://songcreator.github.io/
- MSLDM (diffusion; xu24arxiv): https://xzwy.github.io/MSLDMDemo/
- Multi-Track MusicLDM (diffusion; karchkhadze24arxiv): https://mt-musicldm.github.io/
- FluxMusic (diffusion; fei24arxiv): https://github.com/feizc/FluxMusic
- control-transfer-diffusion (diffusion; demerlé24ismir): https://nilsdem.github.io/control-transfer-diffusion/
- AP-Adapter (diffusion; tsai24arxiv): https://rebrand.ly/AP-adapter
- MusiConGen (Transformer; lan24arxiv): https://musicongen.github.io/musicongen_demo/
- Stable Audio Open (diffusion; evans24arxiv): https://stability-ai.github.io/stable-audio-open-demo/
- MEDIC (diffusion; liu24arxiv): https://medic-zero.github.io/
- MusicGenStyle (Transformer; rouard24ismir): https://musicgenstyle.github.io/
- MelodyFlow (Transformer + diffusion; lelan24arxiv): https://melodyflow.github.io/
- MelodyLM (Transformer + diffusion; li24arxiv): https://melodylm666.github.io/
- JASCO (flow; tal24ismir): https://pages.cs.huji.ac.il/adiyoss-lab/JASCO/
- MusicFlow (diffusion; prajwal24icml): N/A
- Diff-A-Riff (diffusion; nistal24ismir): https://sonycslparis.github.io/diffariff-companion/
- DITTO-2 (diffusion; novack24ismir): https://ditto-music.github.io/ditto2/
- SoundCTM (diffusion; saito24arxiv): N/A
- Instruct-MusicGen (Transformer; zhang24arxiv): https://foul-ice-5ea.notion.site/Instruct-MusicGen-Demo-Page-Under-construction-a1e7d8d474f74df18bda9539d96687ab
- QA-MDT (diffusion; li24arxiv): https://qa-mdt.github.io/
- Stable Audio 2 (diffusion; evans24ismir): https://stability-ai.github.io/stable-audio-2-demo/
- Melodist (Transformer; hong24arxiv): https://text2songmelodist.github.io/Sample/
- SMITIN (Transformer; koo24arxiv): https://wide-wood-512.notion.site/SMITIN-Self-Monitored-Inference-Time-INtervention-for-Generative-Music-Transformers-Demo-Page-983723e6e9ac4f008298f3c427a23241
- Stable Audio (diffusion; evans24arxiv): https://stability-ai.github.io/stable-audio-demo/
- MusicMagus (diffusion; zhang24ijcai): https://wry-neighbor-173.notion.site/MusicMagus-Zero-Shot-Text-to-Music-Editing-via-Diffusion-Models-8f55a82f34944eb9a4028ca56c546d9d
- DITTO (diffusion; novack24arxiv): https://ditto-music.github.io/web/
- MAGNeT (Transformer; ziv24arxiv): https://pages.cs.huji.ac.il/adiyoss-lab/MAGNeT/
- Mustango (diffusion; melechovsky24naacl): https://github.com/AMAAI-Lab/mustango
- Music ControlNet (diffusion; wu24taslp): https://musiccontrolnet.github.io/web/
- InstrumentGen (Transformer; nercessian23ml4audio): https://instrumentgen.netlify.app/
- Coco-Mulla (Transformer; lin23arxiv): https://kikyo-16.github.io/coco-mulla/
- JEN-1 Composer (diffusion; yao23arxiv): https://www.jenmusic.ai/audio-demos
- UniAudio (Transformer; yang23arxiv): http://dongchaoyang.top/UniAudio_demo/
- MusicLDM (diffusion; chen23arxiv): https://musicldm.github.io/
- InstructME (diffusion; han23arxiv): https://musicedit.github.io/
- JEN-1 (diffusion; li23arxiv): https://www.futureverse.com/research/jen/demos/jen1
- MusicGen (Transformer; copet23arxiv): https://ai.honu.io/papers/musicgen/
- MeLoDy (Transformer + diffusion; lam23arxiv): https://efficient-melody.github.io/
- MusicLM (Transformer; agostinelli23arxiv): https://google-research.github.io/seanet/musiclm/examples/
- Noise2Music (diffusion; huang23arxiv): https://noise2music.github.io/
- ERNIE-Music (diffusion; zhu23arxiv): N/A
- Riffusion (diffusion;): https://www.riffusion.com/
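Most entries in this list are tagged by their backbone generative family (diffusion, Transformer, flow). For readers new to the diffusion tag, the shared core is a closed-form forward noising process plus a noise predictor that is inverted at sampling time. A minimal NumPy sketch with an illustrative linear schedule — here the true `eps` stands in for the learned predictor, so recovery is exact:

```python
import numpy as np

# Shared skeleton of DDPM-style models: closed-form forward noising
# q(x_t | x_0) and the x0-prediction formula used during sampling.
# The beta schedule is illustrative; a trained network supplies eps_hat.
rng = np.random.default_rng(0)
T = 1000
betas = np.linspace(1e-4, 0.02, T)
alphas_bar = np.cumprod(1.0 - betas)

def q_sample(x0, t, eps):
    # Jump straight to noise level t without iterating over steps
    return np.sqrt(alphas_bar[t]) * x0 + np.sqrt(1.0 - alphas_bar[t]) * eps

def predict_x0(x_t, t, eps_hat):
    # Invert the forward formula given a noise estimate
    return (x_t - np.sqrt(1.0 - alphas_bar[t]) * eps_hat) / np.sqrt(alphas_bar[t])

x0 = rng.normal(size=8)             # stand-in for an audio latent
eps = rng.normal(size=8)
x_t = q_sample(x0, 500, eps)
x0_hat = predict_x0(x_t, 500, eps)  # exact recovery when eps_hat is the true noise
```

In a real system the network's `eps_hat` is only approximate, so sampling iterates this estimate over many decreasing noise levels instead of inverting in one step.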
text-to-audio
- MambaFoley (Mamba; xie24arxiv): N/A
- PicoAudio (diffusion; xie24arxiv): https://zeyuxie29.github.io/PicoAudio.github.io/
- AudioLCM (diffusion; liu24arxiv): https://audiolcm.github.io/
- UniAudio 1.5 (Transformer; yang24arxiv): https://github.com/yangdongchao/LLM-Codec
- Tango 2 (diffusion; majumder24mm): https://tango2-web.github.io/
- Baton (diffusion; liao24arxiv): https://baton2024.github.io/
- T-FOLEY (diffusion; chung24icassp): https://yoonjinxd.github.io/Event-guided_FSS_Demo.github.io/
- Audiobox (diffusion; vyas23arxiv): https://audiobox.metademolab.com/
- Amphion (zhang23arxiv): https://github.com/open-mmlab/Amphion
- VoiceLDM (diffusion; lee23arxiv): https://voiceldm.github.io/
- AudioLDM 2 (diffusion; liu23arxiv): https://audioldm.github.io/audioldm2/
- WavJourney (; liu23arxiv): https://audio-agi.github.io/WavJourney_demopage/
- CLIPSynth (diffusion; dong23cvprw): https://salu133445.github.io/clipsynth/
- CLIPSonic (diffusion; dong23waspaa): https://salu133445.github.io/clipsonic/
- SoundStorm (Transformer; borsos23arxiv): https://google-research.github.io/seanet/soundstorm/examples/
- AUDIT (diffusion; wang23arxiv): https://audit-demo.github.io/
- VALL-E (Transformer; wang23arxiv): https://www.microsoft.com/en-us/research/project/vall-e/ (for speech)
- multi-source diffusion models (diffusion; 23arxiv): https://gladia-research-group.github.io/multi-source-diffusion-models/
- Make-An-Audio (diffusion; huang23arxiv): https://text-to-audio.github.io/ (for general sounds)
- AudioLDM (diffusion; liu23arxiv): https://audioldm.github.io/ (for general sounds)
- AudioGen (Transformer; kreuk23iclr): https://felixkreuk.github.io/audiogen/ (for general sounds)
- AudioLM (Transformer; borsos23taslp): https://google-research.github.io/seanet/audiolm/examples/ (for general sounds)
text-to-MIDI
- text2midi (Transformer; bhandari25aaai): https://huggingface.co/spaces/amaai-lab/text2midi
- MuseCoco (Transformer; lu23arxiv): https://ai-muzic.github.io/musecoco/
audio-domain music generation
- VampNet (Transformer; garcia23ismir): https://hugo-does-things.notion.site/VampNet-Music-Generation-via-Masked-Acoustic-Token-Modeling-e37aabd0d5f1493aa42c5711d0764b33
- Fast Jukebox (Jukebox + knowledge distillation; pezzat-morales23mdpi): https://soundcloud.com/michel-pezzat-615988723
- DAG (diffusion; pascual23icassp): https://diffusionaudiosynthesis.github.io/
- Musika (GAN; pasini22ismir): https://huggingface.co/spaces/marcop/musika
- Jukebox (VQ-VAE + Transformer; dhariwal20arxiv): https://openai.com/blog/jukebox/
- UNAGAN (GAN; liu20arxiv): https://github.com/ciaua/unagan
- dadabots (SampleRNN; carr18mume): http://dadabots.com/music.php
given singing, generate accompaniment
- Llambada (Transformer; trinh24arxiv): https://songgen-ai.github.io/llambada-demo/
- FastSAG (diffusion; chen24arxiv): https://fastsag.github.io/
- SingSong (VQ-VAE + Transformer; donahue23arxiv): https://storage.googleapis.com/sing-song/index.html
given drum-free audio, generate drum accompaniment
- JukeDrummer (VQ-VAE + Transformer; wu22ismir): https://legoodmanner.github.io/jukedrummer-demo/
audio-domain singing voice synthesis
- InstructSing (DDSP; zeng24slt): https://wavelandspeech.github.io/instructsing/
- Freestyler (Transformer; ning24arxiv): https://nzqian.github.io/Freestyler/
- Prompt-Singer (Transformer; wang24naacl): https://prompt-singer.github.io/
- StyleSinger (diffusion; zhang24aaai): https://stylesinger.github.io/
- BiSinger (Transformer; zhou23asru): https://bisinger-svs.github.io/
- HiddenSinger (diffusion; hwang23arxiv): https://jisang93.github.io/hiddensinger-demo/
- Make-A-Voice (Transformer; huang23arxiv): https://make-a-voice.github.io/
- RMSSinger (diffusion; he23aclf): https://rmssinger.github.io/
- NaturalSpeech 2 (diffusion; shen23arxiv): https://speechresearch.github.io/naturalspeech2/
- NANSY++ (Transformer; choi23iclr): https://bald-lifeboat-9af.notion.site/Demo-Page-For-NANSY-67d92406f62b4630906282117c7f0c39
- UniSyn (; lei23aaai): https://leiyi420.github.io/UniSyn/
- VISinger 2 (zhang22arxiv): https://zhangyongmao.github.io/VISinger2/
- xiaoicesing 2 (Transformer + GAN; wang22arxiv): https://wavelandspeech.github.io/xiaoice2/
- WeSinger 2 (Transformer + GAN; zhang22arxiv): https://zzw922cn.github.io/wesinger2/
- U-Singer (Transformer; kim22arxiv): https://u-singer.github.io/
- Singing-Tacotron (Transformer; wang22arxiv): https://hairuo55.github.io/SingingTacotron/
- KaraSinger (GRU/Transformer; liao22icassp): https://jerrygood0703.github.io/KaraSinger/
- VISinger (flow; zhang2): https://zhangyongmao.github.io/VISinger/
- MLP Singer (mixer blocks; tae21arxiv): https://github.com/neosapience/mlp-singer
- LiteSing (WaveNet; zhuang21icassp): https://auzxb.github.io/LiteSing/
- DiffSinger (diffusion; liu22aaai) [no duration modeling]: https://diffsinger.github.io/
- HiFiSinger (Transformer; chen20arxiv): https://speechresearch.github.io/hifisinger/
- DeepSinger (Transformer; ren20kdd): https://speechresearch.github.io/deepsinger/
- xiaoice-multi-singer: https://jiewu-demo.github.io/INTERSPEECH2020/
- xiaoicesing: https://xiaoicesing.github.io/
- ByteSing: https://bytesings.github.io/
- mellotron: https://nv-adlr.github.io/Mellotron
- Lee's model (lee19arxiv): http://ksinging.mystrikingly.com/
- http://home.ustc.edu.cn/~yiyh/interspeech2019/
audio-domain singing style transfer / singing voice conversion
- ROSVC (; takahashi22arxiv): https://t-naoya.github.io/rosvc/
- DiffSVC (diffusion; liu21asru): https://liusongxiang.github.io/diffsvc/
- FastSVC (CNN; liu21icme): https://nobody996.github.io/FastSVC/
- SoftVC VITS (): https://github.com/svc-develop-team/so-vits-svc
- Assem-VC (; kim21nipsw): https://mindslab-ai.github.io/assem-vc/singer/
- iZotope-SVC (conv encoder/decoder; nercessian20ismir): https://sites.google.com/izotope.com/ismir2020-audio-demo
- VAW-GAN (GAN; lu20arxiv): https://kunzhou9646.github.io/singvaw-gan/
- polyak20interspeech (GAN; polyak20interspeech): https://singing-conversion.github.io/
- SINGAN (GAN; sisman19apsipa): N/A
- [MSVC-GAN] (GAN): https://hujinsen.github.io/
- https://mtg.github.io/singing-synthesis-demos/voice-cloning/
- https://enk100.github.io/Unsupervised_Singing_Voice_Conversion/
- Yong&Nam (DSP; yong18icassp): https://seyong92.github.io/singing-expression-transfer/
- cybegan (CNN + GAN; wu18faim): http://mirlab.org/users/haley.wu/cybegan/
audio-domain speech-to-singing conversion
- AlignSTS (encoder/adaptor/aligner/diff-decoder; li23facl): https://alignsts.github.io/
- speech2sing2 (GAN; wu20interspeech): https://ericwudayi.github.io/Speech2Singing-DEMO/
- Speech2sing (encoder/decoder; parekh20icassp): https://jayneelparekh.github.io/icassp20/
audio-domain singing correction
- deep-autotuner (CGRU; wagner19icassp): http://homes.sice.indiana.edu/scwager/deepautotuner.html
audio-domain style transfer (general)
- WaveTransfer (diffusion; baoueb24mlsp): https://wavetransfer.github.io/
- MusicTI (diffusion; li24aaai): https://lsfhuihuiff.github.io/MusicTI/
- DiffTransfer (diffusion; comanducci23ismir): https://lucacoma.github.io/DiffTransfer/
- RAVE-Latent Diffusion (diffusion;): https://github.com/moiseshorta/RAVE-Latent-Diffusion
- RAVE (VAE; caillon21arxiv): https://anonymous84654.github.io/RAVE_anonymous/; https://github.com/acids-ircam/RAVE
- VAE-GAN (VAE-GAN; bonnici22ijcnn): https://github.com/RussellSB/tt-vae-gan
- VQ-VAE (VQ-VAE; cifka21icassp): https://adasp.telecom-paris.fr/rc/demos_companion-pages/cifka-ss-vq-vae/
- MelGAN-VC (GAN; pasini19arxiv): https://www.youtube.com/watch?v=3BN577LK62Y&feature=youtu.be
- RaGAN (GAN; lu19aaai): https://github.com/ChienYuLu/Play-As-You-Like-Timbre-Enhanced-Multi-modal-Music-Style-Transfer
- TimbreTron (GAN; huang19iclr): https://www.cs.toronto.edu/~huang/TimbreTron/samples_page.html
- string2woodwind (DSP; wagner17icassp): http://homes.sice.indiana.edu/scwager/css.html
TTS
- NaturalSpeech 3 (diffusion; ju24arxiv): https://speechresearch.github.io/naturalspeech3/
- VITS (Transformer + flow + GAN; kim21icml): https://github.com/jaywalnut310/vits
voice conversion / voice cloning
- Applio (): https://github.com/IAHispano/Applio
vocoder (general)
- MusicHiFi (GAN + diffusion; zhu24arxiv): https://musichifi.github.io/web/
- BigVGAN (GAN; lee23iclr): https://bigvgan-demo.github.io/
- HifiGAN (GAN; kong20neurips): https://jik876.github.io/hifi-gan-demo/
- DiffWave (diffusion; kong21iclr): https://diffwave-demo.github.io/
- Parallel WaveGAN (GAN; yamamoto20icassp): https://r9y9.github.io/projects/pwg/
- MelGAN (GAN; kumar19neurips): https://melgan-neurips.github.io/
vocoder (singing)
- GOLF (DDSP; yu23ismir): https://yoyololicon.github.io/golf-demo/
- DSPGAN (GAN; song23icassp): https://kunsung.github.io/DSPGAN/
- SiFi-GAN (GAN; yoneyama23icassp): https://chomeyama.github.io/SiFiGAN-Demo/
- SawSing (DDSP; wu22ismir): https://ddspvocoder.github.io/ismir-demo/
- Multi-Singer (WaveNet; huang21mm): https://multi-singer.github.io/
- SingGAN (GAN; chen21arxiv): https://singgan.github.io/
audio tokenizer
- Improved RVQGAN (VQ; kumar23arxiv): https://descript.notion.site/Descript-Audio-Codec-11389fce0ce2419891d6591a68f814d5
- HiFi-Codec (VQ; yang23arxiv): https://github.com/yangdongchao/AcademiCodec
- EnCodec (VQ; défossez22arxiv): https://github.com/facebookresearch/encodec
- SoundStream (VQ; zeghidour21arxiv): https://google-research.github.io/seanet/soundstream/examples/
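The codecs above (SoundStream, EnCodec, HiFi-Codec, Improved RVQGAN) all tokenize audio with residual vector quantization (RVQ): each stage quantizes the residual left by the previous stage, so later codebooks refine coarser ones. A toy NumPy sketch with hand-picked 1-D codebooks — illustrative only, not the learned codebooks of any specific codec:

```python
import numpy as np

def rvq_encode(x, codebooks):
    # Each stage picks the nearest codeword for the residual left over
    # by the previous stage, then subtracts it.
    residual = x.astype(float).copy()
    codes = []
    for cb in codebooks:              # cb has shape (codebook_size, dim)
        idx = int(np.argmin(np.linalg.norm(residual - cb, axis=1)))
        codes.append(idx)
        residual = residual - cb[idx]
    return codes

def rvq_decode(codes, codebooks):
    # The reconstruction is the sum of one codeword per stage.
    return sum(cb[i] for cb, i in zip(codebooks, codes))

# Toy 1-D example: a coarse stage (step 1.0) plus a fine stage (step 0.25).
coarse = np.array([[0.0], [1.0], [2.0], [3.0]])
fine = np.array([[-0.5], [-0.25], [0.0], [0.25]])
codes = rvq_encode(np.array([2.6]), [coarse, fine])
x_hat = rvq_decode(codes, [coarse, fine])
print(codes, x_hat)  # [3, 0] [2.5] — quantization error 0.1
```

In the real codecs the input is a frame of latent features rather than a scalar, and the number of stages used at inference trades bitrate against fidelity.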
audio super-resolution
- AudioSR (diffusion; liu23arxiv): https://audioldm.github.io/audiosr/
audio-domain loop generation
- PJLoopGAN (GAN; yeh22ismir): https://arthurddd.github.io/PjLoopGAN/
- LoopGen (GAN; hung21ismir): https://loopgen.github.io/
given a score, generate musical audio (performance): piano only
- TTS-based MIDI-to-audio (Transformer-TTS; shi23icassp): https://nii-yamagishilab.github.io/sample-midi-to-audio/
- Wave2Midi2Wave (Transformer + WaveNet; hawthorne19iclr): https://magenta.tensorflow.org/maestro-wave2midi2wave
- BasisMixer (RNN + FFNN; chacon16ismir-lbd): https://www.youtube.com/watch?v=zdU8C6Su3TI
given a score, generate musical audio (performance): not limited to piano [aka MIDI-to-audio]
- Deep Performer (Transformer; dong22icassp): https://salu133445.github.io/deepperformer/
- PerformanceNet (CNN + GAN; wang19aaai): https://github.com/bwang514/PerformanceNet
- conditioned WaveNet (WaveNet; manzelli18ismir): http://people.bu.edu/bkulis/projects/music/index.html
audio/timbre synthesis
- gen-inst (Transformer; nercessian24ismir): https://gen-inst.netlify.app/
- GANStrument (narita22arxiv): https://ganstrument.github.io/ganstrument-demo/
- NEWT (DDSP; hayes21ismir): https://benhayes.net/projects/nws/
- CRASH (diffusion; rouard21ismir): https://crash-diffusion.github.io/crash/
- DarkGAN (GAN; nistal21ismir): https://an-1673.github.io/DarkGAN.io/
- MP3net (GAN; broek21arxiv): https://korneelvdbroek.github.io/mp3net/
- Michelashvili (DSP-inspired; michelashvili20iclr): https://github.com/mosheman5/timbre_painting
- GAAE (GAN+AAE; haque20arxiv): https://drive.google.com/drive/folders/1et_BuZ_XDMrdsYzZDprLvEpmmuZrJ7jk
- MANNe (): https://github.com/JTColonel/manne
- DDSP (DSP-inspired; lamtharn20iclr): https://storage.googleapis.com/ddsp/index.html
- MelNet (autoregressive; vasquez19arxiv): https://audio-samples.github.io/
- AdVoc (; neekhara19arxiv): http://chrisdonahue.com/advoc_examples/
- GANSynth (CNN + GAN; engel19iclr): https://magenta.tensorflow.org/gansynth
- SynthNet (schimbinschi19ijcai): https://www.dropbox.com/sh/hkp3o5xjyexp2x0/AADvrfXTbHBXs9W7GN6Yeorua?dl=0
- TiFGAN (CNN + GAN; marafioti19arxiv): https://tifgan.github.io/
- SING (defossez18nips): https://research.fb.com/wp-content/themes/fb-research/research/sing-paper/
- WaveGAN (CNN + GAN; donahue19iclr): https://github.com/chrisdonahue/wavegan
- WaveNet Autoencoder (WaveNet; engel17arxiv): https://magenta.tensorflow.org/nsynth
image-to-music/audio
- Art2Mus (diffusion; rinaldi24ai4va): https://drive.google.com/drive/u/1/folders/1dHBxLWnyBqhVMJgUkTk0hKnFbGDVhw__
- MeLFusion (diffusion; chowdhury24cvpr): https://schowdhury671.github.io/melfusion_cvpr2024/
- Vis2Mus (encoder/decoder; zhang22arxiv): https://github.com/ldzhangyx/vis2mus
- ConchShell (encoder/decoder; fan22arxiv): N/A
video-to-music/audio
- SONIQUE (diffusion; zhang24arxiv): https://github.com/zxxwxyyy/sonique
- Herrmann-1 (LLM + Transformer; haseeb24icassp): https://audiomatic-research.github.io/herrmann-1/
- Diff-BGM (diffusion; li24cvpr): https://github.com/sizhelee/Diff-BGM
- Frieren (diffusion; wang24arxiv): https://frieren-v2a.github.io/
- Video2Music (Transformer; kang23arxiv): https://github.com/AMAAI-Lab/Video2Music
- LORIS (diffusion; yu23icml): https://justinyuu.github.io/LORIS/
interactive multi-track music composition
- Jamming with Yating (RNN; hsiao19ismir-lbd): https://www.youtube.com/watch?v=9ZIJrr6lmHg
interactive piano composition
- Piano Genie (RNN; donahue18nips-creativity): https://piano-genie.glitch.me/
- AI Duet (RNN; roberts16nips-demo): https://experiments.withgoogle.com/ai/ai-duet/view/
interactive monophonic music composition
- [musicalspeech] (Transformer; d'Eon20nips-demo): https://jasondeon.github.io/musicalSpeech/
compose melody
- MelodyT5 (Transformer; wu24ismir): https://github.com/sanderwood/melodyt5
- MelodyGLM (Transformer; wu23arxiv): https://nextlab-zju.github.io/melodyglm/
- TunesFormer (Transformer; wu23arxiv): https://github.com/sander-wood/tunesformer
- MeloForm (Transformer; lu22arxiv): https://ai-muzic.github.io/meloform/
- parkR (Markov; frieler22tismir): https://github.com/klausfrieler/parkR
- xai-lsr (VAE; bryankinns21nipsw): https://xai-lsr-ui.vercel.app/
- Trans-LSTM (Transformer+LSTM; dai21ismir): N/A
- diffusion (diffusion+MusicVAE; mittal21ismir): https://storage.googleapis.com/magentadata/papers/symbolic-music-diffusion/index.html
- MELONS (Transformer; zhou21arxiv): https://yiathena.github.io/MELONS/
- Sketchnet (VAE + GRU; chen20ismir): https://github.com/RetroCirce/Music-SketchNet
- SSMGAN (VAE+LSTM+GAN; jhamtani19ml4md): https://drive.google.com/drive/folders/1TlOrbYAm7vGUvRrxa-uiH17bP-4N4e9z
- StructureNet (LSTM; medeot18ismir): https://www.dropbox.com/sh/yxkxlnzi913ba50/AAA_mDbhdmaGJC9qj0zSlqCea?dl=0
- MusicVAE (LSTM + VAE; roberts18icml): https://magenta.tensorflow.org/music-vae
- MidiNet (CNN + GAN; yang17ismir): https://richardyang40148.github.io/TheBlog/midinet_arxiv_demo.html
- C-RNN-GAN (LSTM + GAN; mogren16cml): http://mogren.one/publications/2016/c-rnn-gan/
- folkRNN (LSTM): https://folkrnn.org/
compose single-track piano music
- MusicMamba (Mamba; chen24arxiv): N/A
- EMO-Disentanger (Transformer; huang24ismir): https://emo-disentanger.github.io/
- MuseBarControl (Transformer; shu24arxiv): https://ganperf.github.io/musebarcontrol.github.io/musebarcontrol/
- WholeSong (diffusion; 24iclr): https://wholesonggen.github.io/
- MGM (Transformer; 24tmm): https://github.com/hu-music/MGM
- Polyffusion (diffusion; min23ismir): https://polyffusion.github.io/
- EmoGen (Transformer; kang23arxiv): https://ai-muzic.github.io/emogen/
- Compose & Embellish (Transformer; wu22arxiv): https://drive.google.com/drive/folders/1Y7HfExAz3PpPbFl0OnccxYDNF1KZUP-3
- Theme Transformer (Transformer; shih21arxiv): https://atosystem.github.io/ThemeTransformer/
- EMOPIA (Transformer; hung21ismir): https://annahung31.github.io/EMOPIA/
- dadagp (Transformer; sarmento21ismir): https://drive.google.com/drive/folders/1USNH8olG9uy6vodslM3iXInBT725zult
- CP Transformer (Transformer; hsiao21aaai): https://ailabs.tw/human-interaction/compound-word-transformer-generate-pop-piano-music-of-full-song-length/
- PIANOTREE VAE (VAE + GRU; wang20ismir): https://github.com/ZZWaang/PianoTree-VAE
- Guitar Transformer (Transformer; chen20ismir): https://ss12f32v.github.io/Guitar-Transformer-Demo/
- Pop Music Transformer (Transformer; huang20mm): https://github.com/YatingMusic/remi
- conditional music Transformer (Transformer; choi19arxiv): https://storage.googleapis.com/magentadata/papers/music-transformer-autoencoder/index.html; and https://magenta.tensorflow.org/transformer-autoencoder
- PopRNN (RNN; yeh19ismir-lbd): https://soundcloud.com/yating_ai/sets/ismir-2019-submission/
- VGMIDI (LSTM; ferreira19ismir): https://github.com/lucasnfe/music-sentneuron
- Amadeus (LSTM+RL; kumar19arxiv): https://goo.gl/ogVMSq
- modularized VAE (GRU+VAE; wang19icassp): https://github.com/MiuLab/MVAE_Music
- BachProp (GRU; colombo18arxiv): https://sites.google.com/view/bachprop
- Music Transformer (Transformer; huang19iclr): https://magenta.tensorflow.org/music-transformer
Rearrangement (e.g., pop2piano)
- PiCoGen2 (Transformer; tan24ismir): https://tanchihpin0517.github.io/PiCoGen/
- PiCoGen (Transformer; tan24icmr): https://tanchihpin0517.github.io/PiCoGen/
- Pop2Piano (Transformer; choi23icassp): https://sweetcocoa.github.io/pop2piano_samples/
- audio2midi (GRU; wang21arxiv): https://github.com/ZZWaang/audio2midi
- InverseMV (GRU; lin21arxiv): https://github.com/linchintung/VMT
compose single-track polyphonic music by combining existing ones
- CollageNet (VAE; wuerkaixi21ismir): https://github.com/urkax/CollageNet
compose multi-track music
- Cadenza (Transformer; lenz24ismir): https://lemo123.notion.site/Cadenza-A-Generative-Framework-for-Expression-Ideas-Variations-7028ad6ac0ed41ac814b44928261cb68
- SymPAC (Transformer; chen24ismir): N/A
- MMT-BERT (Transformer; zhu24ismir): N/A
- Nested Music Transformer (Transformer; ryu24ismir): https://github.com/JudeJiwoo/nmt
- MMT-GI (Transformer; xu23arxiv): https://goatlazy.github.io/MUSICAI/
- MorpheuS: https://dorienherremans.com/morpheus
- Anticipatory Music Transformer (; thickstun23arxiv): https://crfm.stanford.edu/2023/06/16/anticipatory-music-transformer.html
- SCHmUBERT (diffusion; plasser23ijcai): https://github.com/plassma/symbolic-music-discrete-diffusion
- DiffuseRoll (diffusion; wang23arxiv): N/A
- Museformer (Transformer; yu22neurips): https://ai-muzic.github.io/museformer/
- SymphonyNet (Transformer; liu22ismir): https://symphonynet.github.io/
- CMT (Transformer; di21mm): https://wzk1015.github.io/cmt/
- CONLON (GAN; angioloni20ismir): https://paolo-f.github.io/CONLON/
- MMM (Transformer; ens20arxiv): https://jeffreyjohnens.github.io/MMM/
- MahlerNet (RNN + VAE; lousseief19smc): https://github.com/fast-reflexes/MahlerNet
- Measure by Measure (RNN): https://sites.google.com/view/pjgbjzom
- JazzRNN (RNN; yeh19ismir-lbd): https://soundcloud.com/yating_ai/sets/ismir-2019-submission/
- MIDI-Sandwich2 (RNN + VAE; liang19arxiv): https://github.com/LiangHsia/MIDI-S2
- LakhNES (Transformer; donahue19ismir): https://chrisdonahue.com/LakhNES/
- MuseNet (Transformer): https://openai.com/blog/musenet/
- MIDI-VAE (GRU + VAE; brunner18ismir): https://www.youtube.com/channel/UCCkFzSvCae8ySmKCCWM5Mpg
- Multitrack MusicVAE (LSTM+VAE; simon18ismir): https://magenta.tensorflow.org/multitrack
- MuseGAN (CNN + GAN; dong18aaai): https://salu133445.github.io/musegan/
compose multi-track covers (cover generation; needs a reference MIDI)
- FIGARO (Transformer; rütte22arxiv): https://github.com/dvruette/figaro
given chords, compose melody
- MelodyDiffusion (diffusion; li23mathematics): https://www.mdpi.com/article/10.3390/math11081915/s1
- H-EC2-VAE (GRU+VAE; wei21ismir): N/A
- MINGUS (Transformer; madaghiele21ismir): https://github.com/vincenzomadaghiele/MINGUS
- BebopNet (LSTM): https://shunithaviv.github.io/bebopnet/
- JazzGAN (GAN; trieu18mume): https://www.cs.hmc.edu/~keller/jazz/improvisor/
- XiaoIce Band (GRU; zhu18kdd): http://tv.cctv.com/2017/11/24/VIDEo7JWp0u0oWRmPbM4uCBt171124.shtml
given melody, compose chords (melody harmonization)
- ReaLchords (RL; wu24icml): https://storage.googleapis.com/realchords/index.html
- EMO-Harmonizer (Transformer): https://yuer867.github.io/emo_harmonizer/
- LHVAE (VAE+LSTM; ji23arxiv): N/A
- DeepChoir (Transformer; wu23icassp): https://github.com/sander-wood/deepchoir
- DAT-CVAE (Transformer-VAE; zhao22ismir): https://zhaojw1998.github.io/DAT_CVAE
- SurpriseNet (VAE; chen21ismir): https://github.com/scmvp301135/SurpriseNet
- MT Harmonizer (RNN; yeh21jnmr)
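As a point of reference for the melody-harmonization task in this section, even a trivial rule-based baseline can score candidate triads by how many melody pitch classes each covers per bar. A toy sketch — hypothetical, not one of the systems listed; C major only, with I/IV/V chords:

```python
# Toy rule-based melody harmonizer: pick, per bar, the triad that
# covers the most melody pitch classes. Illustrative baseline only.
TRIADS = {"C": {0, 4, 7}, "F": {5, 9, 0}, "G": {7, 11, 2}}  # pitch-class sets

def harmonize(bars):
    """bars: list of bars, each a list of MIDI note numbers.
    Returns one chord label per bar."""
    labels = []
    for bar in bars:
        pcs = [n % 12 for n in bar]
        best = max(TRIADS, key=lambda c: sum(pc in TRIADS[c] for pc in pcs))
        labels.append(best)
    return labels

melody = [[60, 64, 67, 62], [65, 69, 60, 64], [67, 71, 62, 65]]
print(harmonize(melody))  # ['C', 'F', 'G']
```

The learned systems above replace this greedy per-bar vote with sequence models that also capture chord-to-chord transitions and style.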
given lyrics, compose melody
- CSL-L2M (LLM; wang25aaai): https://lichaiustc.github.io/CSL-L2M/
- MuDiT/MuSiT (LLM; wang24arxiv): N/A
- SongComposer (LLM; ding24arxiv): https://pjlab-songcomposer.github.io/
- ROC (Transformer; lv22arxiv): https://ai-muzic.github.io/roc/
- pop melody (Transformer; zhang22ismir): N/A
- ReLyMe (Transformer; chen22mm): https://ai-muzic.github.io/relyme/
- TeleMelody (Transformer; ju21arxiv): https://github.com/microsoft/muzic
- Conditional LSTM-GAN (LSTM + GAN; yu19arxiv): https://github.com/yy1lab/Lyrics-Conditioned-Neural-Melody-Generation
- iComposer (LSTM; lee19acl): https://www.youtube.com/watch?v=Gstzqls2f4A
- SongWriter (GRU; bao18arxiv): N/A
compose MIDI drums
- Conditional drum generation by Makris (BiLSTM/Transformer): https://github.com/melkor169/CP_Drums_Generation
- Nuttall's model (Transformer; nuttall21nime): https://nime.pubpub.org/pub/8947fhly/release/1?readingCollection=71dd0131
- Wei's model (VAE + GAN; wei19ismir): https://github.com/Sma1033/drum_generation_with_ssm
- DrumNet (GAE; lattner19waspaa): https://sites.google.com/view/drum-generation
- DrumVAE (GRU+VAE; thio19milc): http://vibertthio.com/drum-vae-client
compose melody+chords (two tracks)
- Emotional lead sheet generation (seq2seq): https://github.com/melkor169/LeadSheetGen_Valence
- EmoMusicTV (Transformer; ji23tmm): https://github.com/Tayjsl97/EmoMusicTV
- Jazz Transformer (Transformer; wu20ismir): https://drive.google.com/drive/folders/1-09SoxumYPdYetsUWHIHSugK99E2tNYD
- Transformer VAE (Transformer + VAE; jiang20icassp): https://drive.google.com/drive/folders/1Su-8qrK__28mAesSCJdjo6QZf9zEgIx6
- two-stage RNN (RNN; deboom20arxiv): https://users.ugent.be/~cdboom/music/
- LeadsheetGAN (CRNN + GAN; liu18icmla): https://liuhaumin.github.io/LeadsheetArrangement/results
- LeadsheetVAE (RNN + VAE; liu18ismir-lbd): https://liuhaumin.github.io/LeadsheetArrangement/results
given any MIDI tracks, compose other MIDI tracks
- GETMusic (discrete diffusion): https://getmusicdemo.github.io/
given melody or lead sheet, compose arrangement
- AccoMontage3 (; zhao23arxiv): https://zhaojw1998.github.io/AccoMontage-3
- GETMusic (discrete diffusion): https://getmusicdemo.github.io/
- SongDriver (Transformer-CRF; wang22mm):
- AccoMontage2: https://billyyi.top/accomontage2/
- AccoMontage (template-based; zhao21ismir): https://github.com/zhaojw1998/AccoMontage
- CP Transformer (Transformer; hsiao21aaai): https://ailabs.tw/human-interaction/compound-word-transformer-generate-pop-piano-music-of-full-song-length/
- PopMAG (Transformer; ren20mm): https://music-popmag.github.io/popmag/
- LeadsheetGAN: see above
- LeadsheetVAE: see above
- XiaoIce Band (the "multi-instrument co-arrangement model"): N/A
given mix (audio), compose bass
- latent diffusion (diffusion; pasini24arxiv): https://sonycslparis.github.io/bass_accompaniment_demo/
- BassNet (GAE + CNN; ren20mm): https://sonycslparis.github.io/bassnet/
given lead melody, compose melody + chords
- local_conv_music_Generation (CNN; ouyang18arxiv): https://somedaywilldo.github.io/local_conv_music_Generation/
given lead melody, compose melody + chords + bass
- BandNet (RNN; zhou18arxiv): https://soundcloud.com/yichao-zhou-555747812/sets/bandnet-sound-samples-1
given piano score, compose orchestration
- LOP (RBM; crestel17smc): https://qsdfo.github.io/LOP/results.html
piano infilling
- Polyffusion (diffusion; min23ismir): https://polyffusion.github.io/
- structure-aware infilling: https://tanchihpin0517.github.io/structure-aware_infilling
- VLI (Transformer; chang21ismir): https://jackyhsiung.github.io/piano-infilling-demo/
- The Piano Inpainting Application (): https://ghadjeres.github.io/piano-inpainting-application/
melody infilling
- CLSM (Transformer + LSTM; akama21ismir): https://contextual-latent-space-model.github.io/demo/
symbolic-domain genre style transfer
- Pop2Jazz (RNN; yeh19ismir-lbd): https://soundcloud.com/yating_ai/sets/ismir-2019-submission/
- Groove2Groove (RNN; cífka19ismir, cífka20taslp): https://groove2groove.telecom-paris.fr/
- CycleGAN2 (CNN + GAN; brunner19mml): https://drive.google.com/drive/folders/1Jr_p6pnKvhA2YW9sp-ABChiFgV3gY1aT
- CycleGAN (CNN + GAN; brunner18ictai): https://github.com/sumuzhao/CycleGAN-Music-Style-Transfer
- FusionGAN (GAN; chen17icdm): http://people.cs.vt.edu/czq/publication/fusiongan/
symbolic-domain arrangement style transfer
- UnetED (CNN + Unet; hung19ijcai): https://biboamy.github.io/disentangle_demo/result/index.html
symbolic-domain emotion / rhythm / pitch style transfer
- MuseMorphose (Transformer + VAE; wu21arxiv): https://slseanwu.github.io/site-musemorphose/
- Kawai (VAE+GRU+adversarial; kawai20ismir): https://lisakawai.github.io/music_transformation/
- Wang (VAE + GRU; wang20ismir): https://github.com/ZZWaang/polyphonic-chord-texture-disentanglement
- Music FaderNets (VAE; tan20ismir): https://music-fadernets.github.io/
- Deep Music Analogy (yang19ismir): https://github.com/cdyrhjohn/Deep-Music-Analogy-Demos
performance generation (given MIDI, generate human-like MIDI): piano only
- ScorePerformer (Transformer; borovik23ismir): https://github.com/ilya16/scoreperformer
- CVRNN (CVRNN; maezawa19ismir): https://sites.google.com/view/cvrnn-performance-render
- GGNN (graph NN + hierarchical attention RNN; jeong19icml)
- VirtuosoNet (LSTM + hierarchical attention network; jeong18nipsw): https://www.youtube.com/playlist?list=PLkIVXCxCZ08rD1PXbrb0KNOSYVh5Pvg-c
- PerformanceRNN (RNN): https://magenta.tensorflow.org/performance-rnn
given MIDI, generate human-like MIDI: drums only
- GrooVAE (seq2seq+VAE; gillick19icml): https://magenta.tensorflow.org/groovae
compose ABC MIDI via LLM
- ComposerX (LLM; deng24arxiv): https://lllindsey0615.github.io/ComposerX_demo/
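The LLM composers here (and ABC-based models above such as TunesFormer and MelodyT5) emit ABC notation as plain text. A toy converter from bare ABC pitch tokens to MIDI note numbers illustrates the format's octave conventions — a simplification that ignores accidentals, durations, and header fields, assuming the common mapping of ABC's `C` to middle C (MIDI 60):

```python
# ABC pitch tokens: uppercase C..B sit in the middle-C octave, lowercase
# letters are one octave up, "'" raises and "," lowers a further octave.
ABC_BASE = {"C": 60, "D": 62, "E": 64, "F": 65, "G": 67, "A": 69, "B": 71}

def abc_to_midi(tokens):
    notes = []
    for tok in tokens:
        letter = tok[0]
        midi = ABC_BASE[letter.upper()]
        if letter.islower():           # lowercase letter: one octave up
            midi += 12
        midi += 12 * tok.count("'")    # each apostrophe: another octave up
        midi -= 12 * tok.count(",")    # each comma: one octave down
        notes.append(midi)
    return notes

print(abc_to_midi(["C", "G", "c", "c'", "C,"]))  # [60, 67, 72, 84, 48]
```

Because ABC is compact line-oriented text, it fits an LLM's token stream far better than raw MIDI bytes, which is why these systems adopt it as an intermediate representation.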