TOBA Multilingual 1.2B

TOBA Multilingual 1.2B is a 1.2-billion-parameter multilingual language model based on the GPT-2 architecture. It is trained on Indonesian, Batak, Minangkabau, Javanese, and Sundanese using syllabic-agglutinative tokenization. The architecture integrates an Engram Memory mechanism, an adaptive n-gram-based memory system that captures morphological dependencies through bigram and trigram pathways.
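
To make the terms "bigram" and "trigram" concrete, the sketch below enumerates the 2-token and 3-token windows over a syllable sequence. It only illustrates what such windows look like. It is not the model's Engram Memory implementation, and both the helper name `ngrams` and the syllable split are hypothetical.

```python
def ngrams(tokens, n):
    """Return all contiguous n-token windows as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# Hypothetical syllable split of "horas amang"; the real tokenizer may segment differently.
syllables = ["ho", "ras", "a", "mang"]

bigrams = ngrams(syllables, 2)   # pairs of adjacent syllables
trigrams = ngrams(syllables, 3)  # triples of adjacent syllables
print(bigrams)
print(trigrams)
```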

Model Files

Safetensors:

safetensors/

GGUF (4-bit, Q4_K_M):

gguf/toba-sft-20mei-multilang-1.2b-Q4_K_M.gguf
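
Before running inference, it can help to confirm that the model files are where the scripts expect them. The following is a minimal sketch, assuming the weights inside `safetensors/` use the `.safetensors` extension (the individual file names are not listed here). The helper `list_model_files` is hypothetical and not part of the repository.

```python
from pathlib import Path


def list_model_files(repo_root="."):
    """List weight files found under safetensors/ and gguf/ in the repository root."""
    root = Path(repo_root)
    return {
        # Assumption: Safetensors weights end in ".safetensors".
        "safetensors": sorted(p.name for p in (root / "safetensors").glob("*.safetensors")),
        "gguf": sorted(p.name for p in (root / "gguf").glob("*.gguf")),
    }


print(list_model_files("."))
```

An empty list for either key suggests the download is incomplete or the script was not started from the repository root.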

Install

Install PyTorch first according to your CPU/CUDA environment, then install the package requirements:

pip install -r requirements.txt
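
After installation, a quick check like the one below can confirm whether PyTorch is importable and whether it sees a CUDA device. This is a sketch using standard PyTorch calls (`torch.__version__`, `torch.cuda.is_available()`); the helper name `torch_status` is hypothetical.

```python
import importlib.util


def torch_status():
    """Report whether PyTorch is importable and, if so, whether CUDA is visible."""
    if importlib.util.find_spec("torch") is None:
        return "torch not installed"
    import torch  # imported lazily so the check itself never fails on a missing package

    if torch.cuda.is_available():
        return f"torch {torch.__version__} with CUDA"
    return f"torch {torch.__version__} (CPU only)"


print(torch_status())
```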

Safetensors Usage

Run from the repository root.

Interactive chat:

python infer.py --interactive --mode chat

Single prompt, chat mode:

python infer.py --mode chat --prompt "Horas amang inang saluhutna"

Single prompt, completion mode:

python infer.py --mode completion --prompt "Horas amang inang saluhutna"

Exit interactive mode:

/q
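
To call `infer.py` from another Python program, the commands above can be assembled as an argument list and passed to `subprocess.run`. The sketch below uses only the flags shown in this section (`--interactive`, `--mode`, `--prompt`). The helper names are hypothetical, and it assumes the current working directory is the repository root.

```python
import subprocess
import sys


def build_infer_command(mode="chat", prompt=None, interactive=False):
    """Assemble the infer.py command line from the flags documented above."""
    if mode not in ("chat", "completion"):
        raise ValueError(f"unsupported mode: {mode!r}")
    cmd = [sys.executable, "infer.py", "--mode", mode]
    if interactive:
        cmd.append("--interactive")
    if prompt is not None:
        cmd += ["--prompt", prompt]
    return cmd


def run_infer(**kwargs):
    """Run infer.py; raises CalledProcessError if the script exits with an error."""
    return subprocess.run(build_infer_command(**kwargs), check=True)


# Example: run_infer(mode="completion", prompt="Horas amang inang saluhutna")
```

Passing the prompt as a list element avoids shell quoting problems, because no shell parses the string.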

GGUF Usage

Run from the repository root.

Interactive chat:

python gguf/infer_gguf.py --interactive

Single prompt:

python gguf/infer_gguf.py --prompt "siapa presiden pertama indonesia?"

If you are already inside the gguf folder:

python infer_gguf.py --interactive
python infer_gguf.py --prompt "siapa presiden pertama indonesia?"
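
The script path differs depending on whether you are in the repository root or inside the gguf folder. A wrapper can resolve it automatically, as in this sketch. The helper `gguf_script_path` is hypothetical and not part of the repository.

```python
from pathlib import Path


def gguf_script_path(cwd="."):
    """Return the relative path to infer_gguf.py for the given working directory."""
    here = Path(cwd).resolve()
    if (here / "gguf" / "infer_gguf.py").is_file():
        return Path("gguf") / "infer_gguf.py"  # running from the repository root
    if (here / "infer_gguf.py").is_file():
        return Path("infer_gguf.py")           # already inside the gguf folder
    raise FileNotFoundError(f"infer_gguf.py not found from {here}")
```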

PowerShell wrapper:

.\gguf\infer_gguf.ps1 -Interactive
.\gguf\infer_gguf.ps1 -Prompt "kapan indonesia merdeka?"

From Command Prompt, run the PowerShell wrapper like this:

powershell -ExecutionPolicy Bypass -File .\gguf\infer_gguf.ps1 -Interactive

Translation

Both the Safetensors and GGUF inference scripts use translation_wrapper.py to handle explicit translation prompts:

python infer.py --prompt 'Terjemahkan paragraf berikut dari bahasa batak ke bahasa indonesia:

Oii, ito, sungkun-sungkunmu on memang na pas! Molo gabe produser au di musik dangdut koplo, on ma rencana na adong di otakku: 1. Dangdut Koplo "Go Internasional": Unang sai holan di Jawa manang Sumatera, ito. Bahenonta ma dangdut koplo on boi dihagiot halak di mancanegara. Carana? Kolaborasi dohot musisi internasional. Bayangkon ma, adong remix dangdut koplo dohot sentuhan musik latin manang reggae. Mantap! 2. Video Klip Na Unik: Unang ma video klip na standar.'

The prompt above is wrapped in single quotes because the text itself contains double quotes. This form works in POSIX shells such as bash; other shells may need different quoting.

python gguf/infer_gguf.py --prompt "Terjemahkan paragraf berikut dari bahasa Indonesia ke bahasa Jawa: Akun saya udah 2 Minggu tidak dapat diskon resto tapi pas bantuan malah kaya robot yang jawab"
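
Translation prompts in the examples above follow a fixed pattern: "Terjemahkan paragraf berikut dari bahasa X ke bahasa Y:" followed by the text. The sketch below builds such a prompt programmatically. The helper `translation_prompt` is hypothetical, and the exact wording that translation_wrapper.py recognizes is an assumption inferred from these examples.

```python
import sys


def translation_prompt(text, source, target):
    """Build a translation prompt in the pattern used by the examples above."""
    return f"Terjemahkan paragraf berikut dari bahasa {source} ke bahasa {target}: {text}"


prompt = translation_prompt("Horas amang inang saluhutna", "batak", "indonesia")

# Argument list for subprocess.run; no shell is involved, so quotes inside
# the text need no escaping.
command = [sys.executable, "infer.py", "--prompt", prompt]
print(command)
```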

Default Generation Settings

temperature=0.4
top_k=30
top_p=0.85
do_sample=1
repetition_penalty=1.2
no_repeat_ngram_size=3
max_new_tokens=200
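
To show what temperature, top_k, and top_p do to the next-token distribution, here is a small pure-Python sketch. It follows one common ordering (temperature, then top-k, then top-p) and is not taken from the repository's inference code, which may order or implement these steps differently. The function name `sampling_distribution` is hypothetical.

```python
import math


def sampling_distribution(logits, temperature=0.4, top_k=30, top_p=0.85):
    """Return {token_id: probability} after temperature, top-k, and top-p filtering."""
    # Temperature: divide logits before the softmax; values below 1 sharpen the distribution.
    scaled = [x / temperature for x in logits]

    # Numerically stable softmax.
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Top-k: keep the k most probable token ids, most probable first.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]

    # Top-p: keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for i in ranked:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break

    # Renormalize over the surviving tokens so the result sums to 1.
    mass = sum(probs[i] for i in kept)
    return {i: probs[i] / mass for i in kept}


print(sampling_distribution([2.0, 1.0, 0.0, -1.0]))
```

With the low default temperature of 0.4, most of the probability mass concentrates on the top few tokens, so the top_p cutoff often leaves only a handful of candidates.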
