learn_jax/parallel
Latest commit: 7e1f45f466 by Richard Wong, "Chore: removing old files and some experiments" (2024-10-06 23:53:57 +09:00)
File                                 Last commit                                                             Date
t5_model                             Feat: increased learning rate for effective large batch size learning  2024-09-22 22:28:41 +09:00
.gitignore                           Feat: t5_jax_simple_parallel implements a working example of fsdp      2024-09-20 23:42:51 +09:00
dataload.py                          Chore: removing old files and some experiments                         2024-10-06 23:53:57 +09:00
flax_pjit_tutorial.py                Feat: t5_jax_simple_parallel implements a working example of fsdp      2024-09-20 23:42:51 +09:00
fully_sharded_data_parallelism.py    Feat: flax pjit example                                                2024-09-16 12:19:07 +09:00
gpt-neo-125m.json                    Feat: t5_jax_simple_parallel implements a working example of fsdp      2024-09-20 23:42:51 +09:00
gptneo_partition_test.py             Feat: t5_jax_simple_parallel implements a working example of fsdp      2024-09-20 23:42:51 +09:00
intro_to_distributed.py              Feat: fsdp demo                                                        2024-09-15 22:41:00 +09:00
partitions.py                        Feat: t5_jax_simple_parallel implements a working example of fsdp      2024-09-20 23:42:51 +09:00
single_gpu_optimizations.py          Feat: fsdp demo                                                        2024-09-15 22:41:00 +09:00
t5.json                              Feat: t5_jax_simple_parallel implements a working example of fsdp      2024-09-20 23:42:51 +09:00
t5_jax_train_2.py                    Feat: flax pjit example                                                2024-09-16 12:19:07 +09:00
t5_jax_train_fail.py                 Feat: flax pjit example                                                2024-09-16 12:19:07 +09:00
t5_pjit.py                           Feat: t5_jax_simple_parallel implements a working example of fsdp      2024-09-20 23:42:51 +09:00
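The commit messages above refer to a working FSDP (fully sharded data parallelism) example. Without reproducing those files, the following is a minimal, hedged sketch of the general pattern they appear to exercise with JAX's `jax.sharding` API: build a device mesh, shard a parameter array across it with a `NamedSharding`, and run a jitted computation over the sharded array. All names here (`forward`, the `"fsdp"` mesh axis, the array shapes) are illustrative, not taken from the repository; on a single CPU the sharding degenerates to one shard, but the code path is the same as on multiple devices.

```python
import numpy as np
import jax
import jax.numpy as jnp
from jax.sharding import Mesh, NamedSharding, PartitionSpec as P

# Build a 1-D device mesh over all available devices.
# (On a CPU-only machine this is a mesh of one device.)
devices = np.array(jax.devices())
mesh = Mesh(devices, axis_names=("fsdp",))

# Shard a parameter along its leading axis across the "fsdp" mesh axis,
# leaving the second axis replicated. The leading dimension (8) must be
# divisible by the number of devices.
params = jnp.ones((8, 4))
sharding = NamedSharding(mesh, P("fsdp", None))
sharded_params = jax.device_put(params, sharding)

@jax.jit
def forward(w, x):
    # Under jit, XLA sees the sharding of `w` and partitions the
    # matmul accordingly; no manual collectives are written here.
    return x @ w

x = jnp.ones((2, 8))
y = forward(sharded_params, x)
print(y.shape)  # (2, 4)
```

The same structure scales to a real mesh: with eight devices, each one materializes only a (1, 4) shard of `params`, which is the memory saving FSDP is after.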