CS @ UT Dallas

Shifat Santo

I built a vector database in C++ because I wanted to know how they actually work. I trained a language model because I wanted to understand tokenization. I built a voice AI platform because I wanted to hear latency, not read about it. Most of what I know, I learned by building it wrong first.

View Projects · Read Writing

// projects

Things I built to understand things

Not wrappers. Not tutorials. Systems I wrote from scratch because the only way to really learn something is to build it.

Soniq

Phone rings, AI picks up, handles the whole call. Checks availability, books appointments, takes orders, escalates to a human if it needs to. LiveKit + Deepgram + Cartesia. Had to get it under 200ms end-to-end or the conversation feels broken.

Python · TypeScript · LiveKit · Voice AI

Deep dive →

VectorVault

I wanted to understand how vector databases actually work. Not the API -- the graph construction, the distance math, the persistence. So I built one. HNSW in C++20 with AVX2 SIMD, mmap, and CRC32 checksums.
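To make the checksum part concrete, here is a minimal Python sketch of CRC32-protected records. It is an illustration only, not VectorVault's on-disk format: the record layout (dimension, float32 payload, trailing CRC32) and the names `pack_record` and `unpack_record` are assumptions made for this example, and the mmap layer is left out.

```python
import struct
import zlib


def pack_record(vector):
    """Serialize floats as little-endian float32 and append a CRC32 of the payload.

    Hypothetical layout: [uint32 dim][dim * float32][uint32 crc32(payload)].
    """
    payload = struct.pack(f"<{len(vector)}f", *vector)
    checksum = zlib.crc32(payload)
    return struct.pack("<I", len(vector)) + payload + struct.pack("<I", checksum)


def unpack_record(blob):
    """Return the stored vector, or raise ValueError if the checksum does not match."""
    (dim,) = struct.unpack_from("<I", blob, 0)
    payload = bytes(blob[4:4 + 4 * dim])
    (stored,) = struct.unpack_from("<I", blob, 4 + 4 * dim)
    if zlib.crc32(payload) != stored:
        raise ValueError("CRC32 mismatch: record is corrupt")
    return list(struct.unpack(f"<{dim}f", payload))
```

Flipping any payload byte changes the CRC32, so a damaged record is rejected on read instead of silently returning a wrong vector.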

C++20 · HNSW · SIMD · AVX2

Deep dive →

Quantum Tunnel

Solves the time-dependent Schrödinger equation on a 3D grid. Split-operator FFT, OpenMP parallelism, two render modes (marching cubes and volume raycasting). Built it because I wanted to see quantum tunneling, not just read the math.

C++ · FFTW3 · OpenGL · OpenMP

Deep dive →

nightshift

Sits between your agent and LLM APIs. Compresses context with T5, deduplicates with SHA-256, routes queries through a UCB1 bandit between cheap and expensive models. I built it so I could run research agents overnight without burning my API budget. 141 tests.
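The routing and dedup ideas can be sketched in a few lines of Python. This is a generic UCB1 implementation under stated assumptions, not nightshift's code: the class name `UCB1Router`, the arm names, the reward scale of [0, 1], and the whitespace normalization in `prompt_key` are all hypothetical choices for this example.

```python
import hashlib
import math


class UCB1Router:
    """UCB1 bandit over a fixed set of model names (the arms)."""

    def __init__(self, arms):
        self.arms = list(arms)
        self.counts = {arm: 0 for arm in self.arms}
        self.totals = {arm: 0.0 for arm in self.arms}

    def select(self):
        # Try every arm once before trusting the confidence bound.
        for arm in self.arms:
            if self.counts[arm] == 0:
                return arm
        n = sum(self.counts.values())

        def score(arm):
            mean = self.totals[arm] / self.counts[arm]
            bonus = math.sqrt(2.0 * math.log(n) / self.counts[arm])
            return mean + bonus

        return max(self.arms, key=score)

    def update(self, arm, reward):
        # Assumption: reward is in [0, 1], e.g. answer quality with a cost penalty.
        self.counts[arm] += 1
        self.totals[arm] += reward


def prompt_key(prompt):
    """SHA-256 hex digest of the whitespace-normalized prompt, usable as a dedup key."""
    normalized = " ".join(prompt.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()
```

A caller would check `prompt_key(prompt)` against a cache first, and only on a miss call `select()`, send the request to that model, and report the observed reward through `update()`. The exploration bonus shrinks as an arm is used more, so the expensive model keeps getting occasional traffic only while its estimated payoff is still uncertain.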

Python · T5 · MiniLM · ChromaDB

Deep dive →

undertone

Hold a hotkey, speak, release. Words typed at cursor. Groq Whisper for speed, local faster-whisper when you want privacy. Published on PyPI because existing voice typing tools were either Mac-only or had garbage latency.

Python · PyPI · Whisper

Deep dive →

Bengali Tokenizer Research

Tested 14 tokenizers on Bengali text. Multilingual LLM tokenizers are 5-9x less efficient than dedicated ones. Not 'a few percent.' A completely different cost structure. Root cause: one missing Unicode character (Bengali Nukta) accounts for 89.8% of byte-fallback tokens.
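For a sense of why one missing character is so costly: when a SentencePiece model with byte fallback has no piece for a character, it emits one `<0xNN>` piece per UTF-8 byte. The Bengali Nukta (U+09BC) is three bytes in UTF-8, so each occurrence becomes three tokens. The Python sketch below shows how that share could be measured; the helper names and the example segmentation are hypothetical, not taken from the study.

```python
import re

# SentencePiece byte-fallback pieces look like "<0xE0>".
BYTE_PIECE = re.compile(r"^<0x[0-9A-Fa-f]{2}>$")


def fallback_pieces_for(char):
    """Byte-fallback pieces a character would produce if it is missing from the vocab."""
    return [f"<0x{byte:02X}>" for byte in char.encode("utf-8")]


def byte_fallback_share(pieces):
    """Fraction of pieces in a tokenized sequence that are byte-fallback tokens."""
    if not pieces:
        return 0.0
    fallback = sum(1 for piece in pieces if BYTE_PIECE.match(piece))
    return fallback / len(pieces)


# Hypothetical segmentation of "ড়" (DDA + NUKTA) when the Nukta is out of vocabulary.
pieces = ["▁ড"] + fallback_pieces_for("\u09bc")
share = byte_fallback_share(pieces)
```

Running `byte_fallback_share` over a tokenized corpus and grouping the fallback runs by the character they decode to is one way to attribute fallback tokens to specific characters.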

Python · NLP · SentencePiece

Deep dive →

// writing

Writing

I built in silence for too long. Now I write about what I build and what breaks along the way.

I Scanned 20 MCP Server Configs. 19 Failed.

These aren't contrived attack scenarios. These are the configs developers copy from documentation and paste into their settings.

Mar 2026

A $9/Month Content Pipeline That Does Everything

One git push cross-posts to Dev.to and Hashnode, generates social content, and tracks what went where. I'm an engineer. I automated it.

Mar 2026

Why I'm Writing Now

26 repos and 3 followers. That's what building in silence gets you.

Mar 2026

// proof of work

What I've actually done

Not claims. Specific things I built, measured, and shipped.

200K+

Lines Shipped

306M

Params Trained

<200ms

Voice Latency

141

Tests Green

I built a voice AI platform

Inbound phone calls hit a SIP trunk, route to LiveKit, and an AI agent handles the conversation. Bookings, orders, escalation. Sub-200ms or it's useless.

I wrote a vector database in C++

Not a wrapper. HNSW graph construction, AVX2 distance functions, mmap persistence with CRC32 checksums. Because I wanted to know how they actually work.

I trained a language model

306M parameters. Bengali. Custom data pipeline, custom tokenizer, pre-training on A100s. Found that one missing Unicode character causes 89.8% of byte-fallback tokens.

I ship across the whole stack

C++ and Rust at the bottom. Python in the middle. TypeScript at the top. The language doesn't matter. Understanding the problem does.

// about

About

CS student at UT Dallas. I spend most of my time building things that probably don't make sense for a student to build — vector databases, language models, voice AI platforms. I do it anyway because using an API without understanding what's behind it bothers me.

I write C++, Rust, Go, Python, and TypeScript. Not because I'm trying to collect languages, but because different problems need different tools. You don't write a Schrödinger equation solver in JavaScript. You don't build a dashboard in C.

Currently looking for internships where I can work on hard problems with people who care about getting them right.

Technical Stack

Languages

C++ · Rust · Go · Python · TypeScript

AI / ML

LLMs · RAG · Agents · Whisper · Embeddings · Fine-tuning

Infrastructure

Docker · Linux · CUDA · A100/H100 · FastAPI

Frontend

Next.js · React · Tailwind · OpenGL

// contact

Say hi

Looking for internships. Also happy to talk about systems, AI, or whatever you're building.

Email · GitHub · LinkedIn