TANISH_SARKAR

I Build ML Systems That Are
Boring To Maintain And
Interesting To Read.

AI/ML Engineer in Progress // Specializing in GenAI architecture, MLOps, Agentic Workflows, robust data pipelines, and turning academic papers into production-grade iron // Open to AI/ML & GenAI Roles

INITIALIZE SEQUENCE →

GITHUB LINKEDIN X KAGGLE

01 // SYSTEM_OUTLOOK

Core Statement // 2026

"I am deeply obsessed with the gap between a machine learning model that scores 94% on an offline benchmark and one that actually survives production deployment. I don't build generic tutorial wrappers. I design clean pipelines, implement architectures from scratch to truly understand their boundaries, and build systems optimized for reliable execution over academic metric inflation."

02 // SYSTEM_CAPABILITIES

CORE COMPETENCY

Deep Learning & Transformers

CNNs, ANN models, Encoder-Decoder, Attention Mechanism, ViT, BERT-style, GPT-style, model tuning, regularization.

-> Built GPT-2 transformer from first principles with BPE tokenizer.

CORE COMPETENCY

AI-ML & Data Science

NumPy, Pandas, Scikit-learn, TensorFlow, Matplotlib, Seaborn, OpenCV, HuggingFace, PyTorch, NLP.

-> Optimized CNNs with custom regularization.

CORE COMPETENCY

Programming & Tools

Python, C++, JavaScript, SQL. Git, GitHub, AWS Certified Cloud Practitioner.

CORE COMPETENCY

Backend & Databases

REST APIs, Flask, FastAPI, PostgreSQL (pgvector), MongoDB, MySQL.

-> Architected production semantic search engine bypassing LangChain.

03 // PROJECTS

PRO_01 PYTORCH / CUDA

DECONSTRUCTING THE TRANSFORMER

Built GPT-2 entirely from scratch to understand the fundamental mechanics of self-attention. Implemented custom training loops and optimized inference paths without relying on high-level abstraction libraries.
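
For readers who want the core mechanic in one place, here is a minimal sketch of single-head causal self-attention in plain Python. It is an illustration only, not the repository's code: the function names are hypothetical, the inputs are assumed to be already-projected query, key, and value vectors, and a real implementation would use batched tensor operations in PyTorch.

```python
import math


def softmax(row):
    # Subtract the max before exponentiating for numerical stability.
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def causal_self_attention(q, k, v):
    """Single-head scaled dot-product attention with a causal mask.

    q, k, v: lists of T vectors (already projected); q and k share length d.
    Returns (outputs, weights), where weights is a T x T row-stochastic matrix.
    """
    seq_len = len(q)
    scale = 1.0 / math.sqrt(len(q[0]))
    outputs, weights = [], []
    for t, q_t in enumerate(q):
        # Position t may only attend to positions 0..t (no looking ahead).
        scores = [
            scale * sum(a * b for a, b in zip(q_t, k[s]))
            for s in range(t + 1)
        ]
        w = softmax(scores)
        # Weighted sum of the visible value vectors.
        out = [
            sum(w[s] * v[s][j] for s in range(t + 1))
            for j in range(len(v[0]))
        ]
        outputs.append(out)
        # Pad masked (future) positions with zero weight.
        weights.append(w + [0.0] * (seq_len - t - 1))
    return outputs, weights


if __name__ == "__main__":
    x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    out, attn = causal_self_attention(x, x, x)
    for row in attn:
        print([round(a, 3) for a in row])
```

Each row of the weight matrix sums to 1, and every entry above the diagonal is zero. That mask is what lets a GPT-style model train on next-token prediction without seeing the answer.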

VIEW REPOSITORY ↘

PRO_02 [UNDER_CONSTRUCTION]

#PGVECTOR

PRODUCTION SEMANTIC SEARCH

A raw implementation using pgvector. Bypassed LangChain to reduce latency and dependencies, proving that raw SQL and bare-metal embeddings often outperform bloated frameworks in production.
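
A rough sketch of what a framework-free lookup can look like. The table and column names (documents, id, content, embedding) and the helper names are hypothetical and not taken from the project. The `<=>` operator is pgvector's cosine-distance operator, and the `%(name)s` placeholders assume a driver with the pyformat parameter style, such as psycopg.

```python
import math


def to_vector_literal(embedding):
    # pgvector accepts a text literal of the form '[0.1,0.2,0.3]'.
    return "[" + ",".join(repr(float(x)) for x in embedding) + "]"


# `<=>` returns cosine distance, so similarity is 1 - distance.
# Table and column names here are hypothetical.
SEARCH_SQL = """
SELECT id, content, 1 - (embedding <=> %(q)s::vector) AS similarity
FROM documents
ORDER BY embedding <=> %(q)s::vector
LIMIT %(k)s
"""


def build_search(embedding, k=5):
    # Returns (sql, params), ready to pass to a DB-API cursor.execute call.
    return SEARCH_SQL, {"q": to_vector_literal(embedding), "k": k}


def cosine_distance(a, b):
    # Pure-Python reference for what the `<=>` operator computes.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


if __name__ == "__main__":
    sql, params = build_search([0.1, 0.2, 0.3], k=3)
    print(sql.strip())
    print(params)
```

Ordering by the distance expression directly, rather than by the computed similarity alias, is the form pgvector's approximate indexes are designed to accelerate.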

PRO_03 COMPUTER VISION

CV OPTIMIZATION

Architected a highly optimized CNN for CIFAR-10. Focused on memory efficiency and parameter reduction without sacrificing accuracy. Demonstrated pruning techniques for edge deployment.
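
The project description does not say which pruning method was used. As one common option, here is a minimal sketch of unstructured magnitude pruning on a flat list of weights. The function name and the sparsity level are assumptions for illustration.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the smallest-magnitude fraction of weights.

    weights: flat list of floats.
    sparsity: fraction of weights to remove, in [0, 1).
    Returns (pruned_weights, mask), where mask[i] is 1 for kept weights.
    """
    if not 0.0 <= sparsity < 1.0:
        raise ValueError("sparsity must be in [0, 1)")
    n_prune = int(len(weights) * sparsity)
    # Indices ordered by absolute value, smallest first.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned_idx = set(order[:n_prune])
    mask = [0 if i in pruned_idx else 1 for i in range(len(weights))]
    pruned = [w if m else 0.0 for w, m in zip(weights, mask)]
    return pruned, mask


if __name__ == "__main__":
    layer = [0.5, -0.01, 0.2, -0.9]
    pruned, mask = magnitude_prune(layer, sparsity=0.5)
    print(pruned, mask)
```

In practice the mask is kept and reapplied after each fine-tuning step so that pruned weights stay at zero. Real memory savings on edge hardware also depend on sparse storage or structured pruning, which this sketch does not cover.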

VIEW REPOSITORY ↘ VIEW LIVE DEMO ↘

DATA_STREAM // LOSS CURVE

PRO_04 COMPUTER VISION

Real-time Facial Detection

Designed and deployed a real-time facial detection pipeline using modern computer vision techniques. Optimized for low latency inference suitable for live video streams.

VIEW REPOSITORY ↘

PRO_05 DEEP LEARNING

Autoencoder from Scratch

Developed a neural autoencoder from the ground up to explore representation learning, latent space compression, and data reconstruction without reliance on high-level wrappers.

VIEW REPOSITORY ↘

PRO_06 MACHINE LEARNING

Heart Disease Classification

Built an end-to-end classification pipeline for predicting heart disease presence. Deployed as a robust predictive API endpoint for reliable integration.

VIEW REPOSITORY ↘ VIEW LIVE DEMO ↘

04 // TECHNICAL_BLOGS

Active tracking & architectural deep dives hosted at tanishverse.hashnode.dev

How LLMs actually learn and generate: A complete overview

Verifying numerical parity by loading the raw OpenAI weights directly into a scratch-built architecture and comparing outputs.
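
A parity check of this kind usually comes down to feeding the same input to both models and bounding the largest difference in their outputs. The sketch below assumes the logits have already been flattened into plain lists. The function names and the tolerance are illustrative assumptions, not taken from the post.

```python
def max_abs_diff(a, b):
    # Largest element-wise absolute difference between two equal-length lists.
    if len(a) != len(b):
        raise ValueError("outputs must have the same length")
    return max(abs(x - y) for x, y in zip(a, b))


def check_parity(reference_logits, scratch_logits, atol=1e-4):
    """Return (ok, diff): whether the two outputs agree within atol."""
    diff = max_abs_diff(reference_logits, scratch_logits)
    return diff <= atol, diff


if __name__ == "__main__":
    ok, diff = check_parity([1.0, 2.0, 3.0], [1.0, 2.00001, 3.0])
    print(ok, diff)
```

A small nonzero difference is expected from floating-point reordering. A large one usually points to a transposed weight matrix or a mismatched layer-norm or activation detail.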

→

Diffusion Models: From Sandcastles to Stable Diffusion

Unpacking the probabilistic mechanics and the step-by-step noise schedules behind generative models.
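
As a concrete reference point, here is a minimal sketch of a linear noise schedule and the closed-form forward noising step, q(x_t | x_0) = N(sqrt(abar_t) * x_0, (1 - abar_t) * I). The post may cover other schedules. The linear form and the default values (1000 steps, betas from 1e-4 to 0.02) follow the original DDPM setup and are assumptions here, as are the function names.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    # Per-step noise variances, evenly spaced from beta_start to beta_end.
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    # abar_t = product over s <= t of (1 - beta_s): the surviving signal fraction.
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def add_noise(x0, t, alpha_bars, rng):
    # Sample x_t directly from x_0 without iterating through earlier steps.
    signal = math.sqrt(alpha_bars[t])
    noise = math.sqrt(1.0 - alpha_bars[t])
    return [signal * x + noise * rng.gauss(0.0, 1.0) for x in x0]


if __name__ == "__main__":
    betas = linear_beta_schedule(1000)
    alpha_bars = cumulative_alphas(betas)
    rng = random.Random(0)
    print(alpha_bars[0], alpha_bars[-1])
    print(add_noise([1.0, -1.0], 500, alpha_bars, rng))
```

Because abar_t shrinks toward zero as t grows, the final step is close to pure Gaussian noise. That is the starting point the reverse process samples from.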

→
