Fontlume #7: The infrastructure layer is where the money is hiding

01 / Build This Weekend

DeFi protocols pay up to $250k and wait up to six weeks for a smart contract audit. An LLM chained to symbolic execution can do it in hours with far fewer false positives.

What just became possible

A September 2025 arXiv paper showed that chaining LLMs with symbolic execution and constraint-solving tools (MythX, Manticore, Z3) cuts false positives by 70%+ and detects 50% more vulnerabilities than LLM-only or static-analysis baselines. SymGPT and Knowdit demonstrated formal proofs of defects, not just flags. Halmos and bounded model checking handle the scalability piece.
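
The pipeline shape is easy to sketch: a model proposes candidate defects, and a checker keeps only the ones it can back with a concrete counterexample. The toy below runs under loud assumptions: `llm_candidates` is a hard-coded stand-in for a model call, the "contract" is a Python function mimicking a uint8 balance, and bounded exhaustive search stands in for a real symbolic executor or SMT solver. None of these names come from the paper.

```python
from itertools import product
from typing import Callable, Optional

UINT8_MAX = 255


def transfer_unchecked(balance: int, amount: int) -> int:
    """Toy 'contract': subtracts with no balance check and wraps like a uint8."""
    return (balance - amount) % (UINT8_MAX + 1)


def llm_candidates() -> list[dict]:
    """Hypothetical stand-in for an LLM pass: returns unverified flags."""
    return [
        {"id": "underflow", "claim": "balance can increase after a transfer"},
        {"id": "negative-result", "claim": "result can drop below zero"},
    ]


# Each candidate maps to a property that must hold for every (balance, amount).
PROPERTIES: dict[str, Callable[[int, int, int], bool]] = {
    "underflow": lambda bal, amt, out: out <= bal,
    "negative-result": lambda bal, amt, out: out >= 0,
}


def find_counterexample(
    prop: Callable[[int, int, int], bool], bound: int = UINT8_MAX
) -> Optional[tuple[int, int]]:
    """Bounded exhaustive search (assumed stand-in for symbolic execution)."""
    for bal, amt in product(range(bound + 1), repeat=2):
        if not prop(bal, amt, transfer_unchecked(bal, amt)):
            return bal, amt
    return None


def audit() -> list[dict]:
    """Keep only the flags backed by a concrete witness; drop the rest."""
    confirmed = []
    for cand in llm_candidates():
        witness = find_counterexample(PROPERTIES[cand["id"]])
        if witness is not None:
            confirmed.append({**cand, "witness": witness})
    return confirmed


report = audit()
```

The second flag is a plausible-sounding false positive: the checker finds no input that violates it, so it never reaches the report. That filtering step is where the false-positive reduction would come from in a real system.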

Why now

DeFi TVL is around $200B with the top 100 protocols holding $150B+. Smart contract audits cost $50k to $250k each with 2-6 week turnaround. The smart contract audit market goes from $940M in 2024 to $7.6B by 2033 at 22.7% CAGR. Web3 protocols continue iterating fast, and manual audits are the bottleneck.
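
Market projections like these are worth a quick compounding check before they go in a deck. A minimal sketch, assuming the 2024 base year and 2033 end year quoted above; the function names are illustrative.

```python
def implied_cagr(start: float, end: float, years: int) -> float:
    """Annual growth rate that takes `start` to `end` over `years`."""
    return (end / start) ** (1 / years) - 1


def project(start: float, cagr: float, years: int) -> float:
    """Value after compounding `start` at `cagr` for `years`."""
    return start * (1 + cagr) ** years


years = 2033 - 2024  # assumed window: nine compounding periods

# Audit market, in $B: $0.94B in 2024, $7.6B in 2033.
audit_cagr = implied_cagr(0.94, 7.6, years)

# Same start, compounded at the quoted 22.7% rate instead.
audit_2033_at_quoted_rate = project(0.94, 0.227, years)

print(f"implied CAGR: {audit_cagr:.1%}")
print(f"2033 value at 22.7%: ${audit_2033_at_quoted_rate:.2f}B")
```

Under those assumptions the endpoints imply roughly 26% a year, while 22.7% compounding from $940M lands closer to $6B. The source report probably uses a different base year or forecast window, so treat the headline numbers as directional.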

What you'd build

An automated smart contract auditing SaaS that returns instant, false-positive-free vulnerability reports with formal proofs. Sell to DeFi protocols managing $100M+ TVL on continuous auditing contracts. Pricing: 20-30% premium over manual at 100x speed.

Who's already moving

ConsenSys Diligence (MythX), CertiK (formal verification + AI), OpenZeppelin (AI-assisted), Zellic (acquired Code4rena), ChainGPT (LLM auditor with grants). SymGPT, Knowdit, SmartLLM exist as academic projects. No fully automated, false-positive-free solution exists. All require human-in-the-loop.

The gap

Scalability of symbolic execution on complex Solidity/Vyper. Fine-tuning CodeLlama-70B or DeepSeek-Coder on Solidity patterns. Cryptographic audit trails for the formal proofs. ChainGPT offers up to $50k grants. Kadena Ecosystem Fund: $25M for AI projects. Ethereum Foundation has $5k to $50k academic grants. Senior security engineers earn $150k to $350k.

02 / AI Makes This Possible

97% KV cache reduction with under 1% accuracy loss. LLM inference economics just inverted.

What just became possible

A March 2026 arXiv paper introduced position-aware pseudo queries that predict token importance during generation, enabling near-lossless 97% KV cache compression. This sits on top of vLLM's paged attention and prefix caching (which max out at 7-14x compression). The breakthrough makes high-concurrency, low-cost LLM serving real on commodity hardware.
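
The general idea behind importance-based KV eviction fits in a few lines: score every cached token against a small set of probe queries, then keep only the top scorers. This is a generic top-k eviction toy in plain Python, not the paper's position-aware method; `pseudo_queries`, `keep_ratio`, and the scoring rule are assumptions for illustration.

```python
import math


def dot(a: list[float], b: list[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def softmax(scores: list[float]) -> list[float]:
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def token_importance(keys, pseudo_queries) -> list[float]:
    """Mean attention weight each cached key receives from the probe queries."""
    scale = 1.0 / math.sqrt(len(keys[0]))
    importance = [0.0] * len(keys)
    for query in pseudo_queries:
        weights = softmax([dot(query, key) * scale for key in keys])
        for i, weight in enumerate(weights):
            importance[i] += weight / len(pseudo_queries)
    return importance


def compress_kv(keys, values, pseudo_queries, keep_ratio: float):
    """Keep the highest-importance tokens, preserving their original order."""
    budget = max(1, round(len(keys) * keep_ratio))
    importance = token_importance(keys, pseudo_queries)
    ranked = sorted(range(len(keys)), key=lambda i: importance[i], reverse=True)
    kept = sorted(ranked[:budget])
    return [keys[i] for i in kept], [values[i] for i in kept], kept


# Toy cache: 100 tokens with 4-dim keys; tokens 7 and 42 align with the probe.
keys = [[0.0, 0.0, 0.0, 0.1] for _ in range(100)]
keys[7] = [4.0, 0.0, 0.0, 0.0]
keys[42] = [3.0, 0.0, 0.0, 0.0]
values = [[float(i)] for i in range(100)]
pseudo_queries = [[1.0, 0.0, 0.0, 0.0]]

small_keys, small_values, kept = compress_kv(
    keys, values, pseudo_queries, keep_ratio=0.03
)
```

With `keep_ratio=0.03` the cache shrinks from 100 entries to 3, and the two tokens the probe attends to survive. The hard part, and presumably the paper's contribution, is choosing probes that predict what future decoding steps will actually need.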

Why now

LLM cost optimization market: $2.69B in 2026 going to $9.2B by 2030 at 36% CAGR. Inference providers over-provision H100/A100 clusters and pay for HBM that sits idle. The drop-in promise (no model retraining) makes this addressable by every team running open-source models at scale, not just frontier labs.

What you'd build

A drop-in inference optimization library for LLM providers and AI-native SaaS. Cuts KV memory by 97%, increases concurrency 10x at the same SLA. Sell to Hugging Face, Together AI, Anyscale, and the hundreds of mid-sized teams running their own inference. Senior LLM inference engineers earn $200k to $400k+ in 2026.
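
To see what a 97% cut means in practice, it helps to run the standard KV cache sizing arithmetic. The model shape below (32 layers, 8 KV heads, head dimension 128, FP16 values) is an assumed Llama-style configuration with grouped-query attention, and the 40 GiB budget is hypothetical; none of these figures come from the paper.

```python
def kv_bytes_per_token(
    layers: int, kv_heads: int, head_dim: int, bytes_per_value: int = 2
) -> int:
    """Bytes of KV cache per token: one key and one value per layer per head."""
    return 2 * layers * kv_heads * head_dim * bytes_per_value


def sequences_that_fit(
    hbm_budget_bytes: int,
    context_len: int,
    per_token_bytes: int,
    keep_ratio: float = 1.0,
) -> int:
    """How many full-length sequences fit in the cache budget."""
    per_sequence = context_len * per_token_bytes * keep_ratio
    return int(hbm_budget_bytes // per_sequence)


per_token = kv_bytes_per_token(layers=32, kv_heads=8, head_dim=128)
budget = 40 * 1024**3  # assumed: 40 GiB of HBM left over for KV cache
context_len = 32_768

baseline = sequences_that_fit(budget, context_len, per_token)
compressed = sequences_that_fit(budget, context_len, per_token, keep_ratio=0.03)

print(f"{per_token // 1024} KiB per token")
print(f"baseline: {baseline} sequences, compressed: {compressed} sequences")
```

On those assumed numbers the cache-bound ceiling moves from 10 to 333 concurrent 32k-token sequences, about 33x. Real concurrency gains will be lower because compute and scheduling limits bite first, which is why a 10x target at the same SLA is the more defensible pitch.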

Who's already moving

vLLM (paged attention, default in production), SGLang (RadixAttention, agentic), TensorRT-LLM (speculative decoding, multi-GPU), LMDeploy (China trusted hardware). Neural Magic was acquired by Red Hat/IBM. Together AI reached ~$300M ARR by September 2025. The whitespace: cross-framework KV compression that ships as a wheel.

The gap

Architectural genericity: works across Llama, Qwen, DeepSeek, Mistral without per-model tuning. Hardware compatibility (NVIDIA FP8, AMD MI300, Ascend 910B). NSF NDIF grants up to $2M for LLM transparency research. NSF AI Institutes are $20M+ over five years. The play: open-source core, enterprise paid tier with SLA on accuracy/throughput.

03 / Deep Tech Bet

The 4.7kb AAV packaging limit has held back large-gene therapy programs for a decade. A split-hexamer stabilization approach may have lifted the cap.

What just became possible

A November 2023 bioRxiv paper engineered viral capsids to package larger DNA cargoes by stabilizing split hexamers, without altering capsid architecture. Single-vector delivery for dystrophin (DMD), factor VIII (hemophilia), and other genes that exceed 4.7kb. This sidesteps the dual-AAV split-intein workaround that has been the bottleneck for a decade.

Why now

AAV gene therapy market: $5.4B in 2026 to $112B by 2035 at 40% CAGR. AAV vector manufacturing market: $0.85B to $6B over the same window. Sarepta has DMD programs needing larger packaging. Roche partnered with Dyno Therapeutics for $1B+ on AI capsid engineering. AAVnerGene shipped AAVone2.1 in May 2026 with ~1e16 GC/L and >70% full capsids.

What you'd build

A next-generation AAV vector manufacturing platform that packages large-gene therapies in a single vector. Sell to Sarepta, BioMarin, Pfizer, Novartis, and the rest of the gene therapy programs stuck on packaging. CDMO services alone: $1.24B in 2024 to $5.14B by 2034.

Who's already moving

AAVnerGene (single-plasmid AAVone2.1, ATHENA platform). Capsigen (TRADE mRNA screening). Dyno Therapeutics (CapsidMap, $1B+ Roche partnership). 4DMT (Fit4Function pipeline). Novartis (Zolgensma plus RegenxBio NAV). Sarepta (SRP-9003 platform technology designation). No one is selling split-hexamer capsids as a productized library.

The gap

High-throughput screening on hundreds of variant capsids. GMP-compliant manufacturing at the new packaging limit. IP positioning around split-hexamer stabilization specifically. NIH put U01MH130700 and R01NS126397 into capsid engineering. UK Transforming Medicines Manufacturing has additional funds. First mover on a clinical-grade large-gene capsid library captures the next decade of in vivo therapies.

04 / Hidden in Plain Sight

Most national malaria control programs (NMCPs) still deploy dual-active-ingredient (dual-AI) nets uniformly. A March 2026 paper showed that net mix should follow the resistance map, not the country border.

What just became possible

A March 2026 medRxiv paper modeled optimized deployment strategies for dual-AI insecticide-treated nets (ITNs) and indoor residual spraying (IRS) by sub-district resistance profile. Net efficacy increases 20-40% when chlorfenapyr nets go to chlorfenapyr-susceptible districts and piperonyl butoxide (PBO) synergist nets go to metabolic-resistance districts. The technical insight: resistance is heterogeneous and procurement should be too.
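
The allocation rule above reduces to a lookup from each district's resistance profile to a product class. A minimal sketch with made-up districts; the boolean profile, the tie-break when a district matches both rules, and the one-net-per-two-people quantification are all assumptions that the paper may not share.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class District:
    name: str
    chlorfenapyr_susceptible: bool
    metabolic_resistance: bool
    population: int


def recommend_net(district: District) -> str:
    """Map a resistance profile to a net class (tie-break order is assumed)."""
    if district.chlorfenapyr_susceptible:
        return "chlorfenapyr dual-AI"
    if district.metabolic_resistance:
        return "pyrethroid-PBO"
    return "needs review"  # no rule applies; escalate to an entomologist


def procurement_plan(districts: list[District], people_per_net: int = 2) -> dict:
    """Total nets to order per product class, rounding up per district."""
    plan: Counter = Counter()
    for district in districts:
        nets = -(-district.population // people_per_net)  # integer ceiling
        plan[recommend_net(district)] += nets
    return dict(plan)


# Hypothetical sub-districts, for illustration only.
districts = [
    District("District A", True, False, 180_000),
    District("District B", False, True, 90_000),
    District("District C", True, True, 36_000),
]

plan = procurement_plan(districts)
print(plan)
```

A production tool would replace the booleans with bioassay-derived resistance estimates and their uncertainty, but the output stays the same shape: an order quantity per product class per district.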

Why now

Global ITN/IRS market: $2.5B to $3.1B annually. Transition to dual-AI nets requires $132M to $159M extra per year in sub-Saharan Africa. The Global Fund and UNICEF project demand of 300 million nets for 2024-2026. The Bill and Melinda Gates Foundation put $85M into IVCC (2025-2030) for next-gen ITN/IRS. Optimized deployment is the next funding cycle's priority.

What you'd build

A procurement-decision SaaS for NMCPs: resistance-mapped recommendations for net mix, IRS chemistry, and timing. Sell as a service to NMCPs, UNICEF, and the Global Fund Revolving Facility. Or build VectorCam-style AI surveillance hardware. Johns Hopkins CBID is hiring for AI vector surveillance roles right now.

Who's already moving

BASF (Interceptor G2), DCT (Royal Guard), Vestergaard (PermaNet Dual) on the net side. V.K.A. Polymers and Tianjin Yorkool among emerging manufacturers. UC Irvine (Yan Lab) on population genetics. WHO and IVCC coordinate, but neither sells a procurement-grade decision tool. No commercial competitor on resistance-mapped deployment recommendations.

The gap

Real-time resistance surveillance feeds need to plug into the procurement model. NMCPs need WHO endorsement to act on data-driven net mix decisions. WHO Vector Control Advisory Group governs prequalification. FNIH Grand Challenges has $6M for novel insecticide discovery. The American Mosquito Control Association has an open call for synergists. The first SaaS to land 5 NMCP contracts owns the data layer.

05 / Watch This Space

Diffusion image models can now run 2-3x faster without retraining, on the same hardware. The next wave of image gen pricing wars starts here.

What just became possible

A November 2025 arXiv paper showed training-free 2-3x acceleration of DiT and FLUX diffusion models via cross-scale invariance caching. No retraining, no fine-tuning, no quality loss. The math: feature reuse across timesteps with adaptive corrections. The library wraps PyTorch and TensorRT.
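
Timestep feature caching in general works by skipping an expensive block when its input has barely changed since the last full computation. The toy below shows that generic pattern with a scalar "feature" and a fixed tolerance. It is not the paper's cross-scale invariance method; `expensive_block`, `tolerance`, and the additive correction are illustrative assumptions.

```python
from typing import Callable, Optional


class CachedBlock:
    """Wraps a block and reuses its last output while inputs stay close."""

    def __init__(self, block: Callable[[float], float], tolerance: float):
        self.block = block
        self.tolerance = tolerance
        self.cached_input: Optional[float] = None
        self.cached_output: Optional[float] = None
        self.full_calls = 0

    def __call__(self, x: float) -> float:
        if (
            self.cached_input is not None
            and abs(x - self.cached_input) <= self.tolerance
        ):
            # Cache hit: reuse the stored output plus a cheap additive correction.
            return self.cached_output + (x - self.cached_input)
        # Cache miss: pay for the full computation and refresh the cache.
        self.full_calls += 1
        self.cached_input = x
        self.cached_output = self.block(x)
        return self.cached_output


def expensive_block(x: float) -> float:
    """Stand-in for a transformer block; a real one would be a tensor op."""
    return 2.0 * x + 1.0


cached = CachedBlock(expensive_block, tolerance=0.025)

# Assumed denoising trajectory: the input drifts by 0.01 per step for 30 steps.
inputs = [1.0 - 0.01 * step for step in range(30)]
outputs = [cached(x) for x in inputs]

max_error = max(abs(out - expensive_block(x)) for out, x in zip(outputs, inputs))
print(f"full calls: {cached.full_calls} of {len(inputs)}, max error: {max_error:.3f}")
```

In this toy the block runs in full on 10 of 30 steps, so the 3x reduction in block calls is bought with a small, bounded approximation error. That is the trade any "no quality loss" claim has to quantify on real images.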

Why now

AI image generation market: $4.8B in 2026 to $30B by 2033 at 32.5% CAGR. GPU-as-a-service: $7.38B to $37B in the same window. Mid-market AI SaaS companies spending $50k+/month on GPUs are over-provisioning to handle traffic spikes. The savings drop straight to gross margin. ML infrastructure engineers cost $130k to $200k+.

What you'd build

A drop-in optimization library targeted at logo generators, ad creative platforms, e-commerce image services. Pricing as a percentage of cloud spend saved. Or open-source the core and sell support and integration. The buyer is the engineering lead at a Series A/B image gen SaaS who needs to cut burn this quarter.
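
The pricing model is simple arithmetic: a speedup of s means the same traffic needs 1/s of the GPU hours. A sketch assuming the $50k/month spend cited above, GPU usage that scales down perfectly with throughput, and a hypothetical 20% fee on realized savings:

```python
def monthly_savings(gpu_spend: float, speedup: float) -> float:
    """Spend avoided when the same workload needs 1/speedup of the GPU hours."""
    return gpu_spend * (1 - 1 / speedup)


def monthly_fee(gpu_spend: float, speedup: float, fee_share: float = 0.20) -> float:
    """Vendor fee as a share of savings (fee_share is a hypothetical rate)."""
    return monthly_savings(gpu_spend, speedup) * fee_share


spend = 50_000  # $/month on GPUs

low = monthly_savings(spend, 2.0)   # 2x speedup
high = monthly_savings(spend, 3.0)  # 3x speedup

print(f"savings: ${low:,.0f} to ${high:,.0f} per month")
print(f"fee at 20%: ${monthly_fee(spend, 2.0):,.0f} to ${monthly_fee(spend, 3.0):,.0f}")
```

On these assumptions a single customer saves $25k to about $33k a month and pays $5k to about $6.7k of it back. Reserved instances and minimum cluster sizes will eat into the savings in practice, so the contract should define how "spend saved" is measured.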

Who's already moving

ComfyUI + NVIDIA TensorRT (2-3x speedup with NVFP4/FP8, but NVIDIA-only). Hugging Face Diffusers + Optimum (open source, manual tuning). BentoML for deployment. Together AI's optimized stack. The whitespace: hardware-agnostic, drop-in, no code changes for non-technical product teams.

The gap

Maintaining the kernel-level optimization across CUDA versions and AMD ROCm releases. Zero-quality-loss SLAs. NSF Advancements in AI for Science: $100k to $350k per year for 3 years. DOE has the same. The right wedge: ship as a paid Hugging Face Spaces plugin first, then expand to enterprise on-prem.

See you next week.

- Theis
