Back to work
Private/research language model
bangla-llm
Bengali model quality depends on corpus cleaning, tokenizer coverage, contamination control, and training discipline.
What I built
A 306M-parameter Bengali decoder-only model trained from scratch with a curated corpus, Bengali-specific normalization, tokenizer audit, and A100 training run.
Stack
PythonTransformersSentencePieceBF16A100
Status
Research/private direction; secondary