TANISH_SARKAR

I Build ML Systems That Are
Boring To Maintain And
Interesting To Read.

AI/ML Engineer in Progress // Specializing in GenAI architecture, MLOps, Agentic Workflows, robust data pipelines, and turning academic papers into production-grade iron // Open to AI/ML & GenAI Roles

01 // SYSTEM_OUTLOOK

Core Statement // 2026
"I am deeply obsessed with the gap between a machine learning model that scores 94% on an offline benchmark and one that actually survives production deployment. I don't build generic tutorial wrappers. I design clean pipelines, implement architectures from scratch to truly understand their boundaries, and build systems optimized for reliable execution over academic metric inflation."

02 // SYSTEM_CAPABILITIES

CORE COMPETENCY

Deep Learning & Transformers

CNNs, ANN models, Encoder-Decoder, Attention Mechanism, ViT, BERT-style, GPT-style, model tuning, regularization.

-> Built a GPT-2 transformer from first principles, with a BPE tokenizer.

CORE COMPETENCY

AI-ML & Data Science

NumPy, Pandas, Scikit-learn, TensorFlow, Matplotlib, Seaborn, OpenCV, HuggingFace, PyTorch, NLP.

-> Optimized CNNs with custom regularization.

CORE COMPETENCY

Programming & Tools

Python, C++, JavaScript, SQL. Git, GitHub, AWS Certified Cloud Practitioner.

CORE COMPETENCY

Backend & Databases

REST APIs, Flask, FastAPI, PostgreSQL (pgvector), MongoDB, MySQL.

-> Architected a production semantic search engine that bypasses LangChain.

03 // PROJECTS

PRO_01 PYTORCH / CUDA

DECONSTRUCTING THE TRANSFORMER

Built GPT-2 entirely from scratch to understand the fundamental mechanics of self-attention. Implemented custom training loops and optimized inference paths without relying on high-level abstraction libraries.
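As a rough illustration of the mechanism this project centers on, here is a minimal sketch of single-head causal self-attention. It is not the project's code: it uses plain Python lists instead of PyTorch tensors so it runs anywhere, and the names `softmax` and `causal_self_attention` are hypothetical. It also omits the learned query/key/value projections and multi-head splitting that a GPT-2 block would include.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def causal_self_attention(q, k, v):
    """Single-head scaled dot-product attention with a causal mask.

    q, k, v: lists of T vectors (q and k share length d).
    Position t may only attend to positions 0..t.
    Returns (outputs, attention_weights).
    """
    d = len(q[0])
    scale = math.sqrt(d)
    outputs, weights = [], []
    for t, q_t in enumerate(q):
        # Score the query against the visible prefix only;
        # future positions are masked by never being scored.
        scores = [
            sum(a * b for a, b in zip(q_t, k[s])) / scale
            for s in range(t + 1)
        ]
        w = softmax(scores)
        # Pad with zeros so every row has length T.
        weights.append(w + [0.0] * (len(q) - t - 1))
        # Output is the attention-weighted sum of the visible value vectors.
        outputs.append(
            [sum(w[s] * v[s][j] for s in range(t + 1)) for j in range(len(v[0]))]
        )
    return outputs, weights


x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
out, attn = causal_self_attention(x, x, x)
```

Each row of `attn` sums to 1, and every entry above the diagonal is zero, which is what prevents a token from seeing later tokens during training.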

PRO_02 [UNDER_CONSTRUCTION]
#PGVECTOR

PRODUCTION SEMANTIC SEARCH

A raw implementation using pgvector. Bypassed LangChain to reduce latency and dependencies, proving that raw SQL and bare-metal embeddings often outperform bloated frameworks in production.
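To give a sense of what a framework-free pgvector lookup can look like, here is a minimal sketch that only builds the SQL and the vector literal, without opening a database connection. The table name `documents`, the columns `id`, `content`, and `embedding`, and both helper functions are hypothetical, and the `%s` placeholders assume a psycopg-style driver. The `<=>` operator is pgvector's cosine distance.

```python
def to_vector_literal(embedding):
    """Format a sequence of numbers as a pgvector text literal, e.g. '[0.1,2.0]'."""
    return "[" + ",".join(repr(float(x)) for x in embedding) + "]"


def build_search_query(table="documents", column="embedding"):
    """Return a parameterized nearest-neighbour query using cosine distance.

    The table and column names are hypothetical. Identifiers cannot be bound
    as query parameters, so they are validated here and must come from
    trusted code, never from user input.
    """
    if not (table.isidentifier() and column.isidentifier()):
        raise ValueError("unsafe identifier")
    # Two placeholders: the query vector literal and the row limit.
    return (
        f"SELECT id, content FROM {table} "
        f"ORDER BY {column} <=> %s::vector LIMIT %s"
    )


sql = build_search_query()
params = (to_vector_literal([0.1, 2, -3.5]), 5)
```

With a psycopg cursor, the pair would be passed as `cur.execute(sql, params)`. Ordering directly by the distance expression is the form that lets PostgreSQL use an approximate index on the column, if one exists.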

PRO_03 COMPUTER VISION

CV OPTIMIZATION

Architected a highly optimized CNN for CIFAR-10. Focused on memory efficiency and parameter reduction without sacrificing accuracy. Demonstrated pruning techniques for edge deployment.
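The project description does not say which pruning technique was used. As one common possibility, the sketch below shows unstructured magnitude pruning on a flat list of weights: the smallest-magnitude fraction is set to zero. The function name `magnitude_prune` and the sample values are illustrative assumptions, not taken from the project.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the smallest-magnitude fraction of weights.

    weights: flat list of floats.
    sparsity: fraction in [0, 1] of weights to remove.
    Returns a new list; the input is left untouched.
    """
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be in [0, 1]")
    n_prune = int(len(weights) * sparsity)
    # Indices ordered by absolute value, smallest first.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    drop = set(order[:n_prune])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]


pruned = magnitude_prune([0.5, -0.1, 0.05, -2.0], sparsity=0.5)
# pruned == [0.5, 0.0, 0.0, -2.0]
```

In practice the pruned model is usually fine-tuned afterwards to recover accuracy, and real memory savings on edge devices depend on storing the result in a sparse or structured format.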

DATA_STREAM // LOSS CURVE

PRO_04 COMPUTER VISION

Real-time Facial Detection

Designed and deployed a real-time facial detection pipeline using modern computer vision techniques. Optimized for low-latency inference suitable for live video streams.

PRO_05 DEEP LEARNING

Autoencoder from Scratch

Developed a neural autoencoder from the ground up to explore representation learning, latent space compression, and data reconstruction without reliance on high-level wrappers.

PRO_06 MACHINE LEARNING

Heart Disease Classification

Built an end-to-end classification pipeline for predicting heart disease presence. Deployed as a robust predictive API endpoint for reliable integration.

04 // TECHNICAL_BLOGS

Active tracking & architectural deep dives hosted at tanishverse.hashnode.dev

How LLMs actually learn and generate: A complete overview

Mathematically verifying parity by loading raw OpenAI weights directly into scratch-built architectures.

Diffusion Models: From Sandcastles to Stable Diffusion

Unpacking the probabilistic mechanics and step-by-step noise scheduling algorithms behind generative models.
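For readers who want a concrete picture of noise scheduling, here is a minimal sketch of a linear beta schedule and the closed-form forward noising step, in the style of DDPM. It is an illustration written for this page, not code from the blog post. The function names are hypothetical, and the defaults (1000 steps, betas from 1e-4 to 0.02) are the commonly cited DDPM settings, assumed here rather than taken from the article.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Evenly spaced noise variances beta_1..beta_T."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """alpha_bar_t = product of (1 - beta_s) for s <= t."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def add_noise(x0, t, alpha_bars, rng):
    """Sample x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * eps."""
    a = alpha_bars[t]
    return [
        math.sqrt(a) * x + math.sqrt(1.0 - a) * rng.gauss(0.0, 1.0)
        for x in x0
    ]


betas = linear_beta_schedule(1000)
alpha_bars = cumulative_alphas(betas)
rng = random.Random(0)  # fixed seed so the sample is reproducible
noisy = add_noise([1.0, -1.0, 0.5], t=500, alpha_bars=alpha_bars, rng=rng)
```

Because `alpha_bars` shrinks toward zero as `t` grows, late timesteps are almost pure Gaussian noise, while early ones stay close to the original data.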
