Triveni — Query-First IoT Pipeline

A Query-First IoT Pipeline

Triveni is an educational system study that turns fresh device telemetry into typed batches, durable table snapshots, hot cached reads, and a stable query surface for dashboards, APIs, and analytical workflows.

Prototype Notice

This is a prototype built for educational purposes after studying GHz Architecture. It highlights the concepts and system ideas, but substantially more engineering work would be required before it could be treated as production-ready.


System Promise

Ingest Once. Query Many.

Instead of treating analytics as a downstream cleanup job, Triveni shapes telemetry for repeated reads from the beginning.

Prototype Status

Built To Explain, Not To Ship

The implementation is meant to communicate the architecture clearly. It is intentionally framed as a study prototype, not as a finished production system.

Prototype Study / Inspired By GHz Architecture

Demo / Overview

SYSTEM WALKTHROUGH

01 / Architecture

CORE COMPONENTS

Query-First Pipeline Diagram

Devices to SQL over Arrow, Parquet, Delta snapshots, and hybrid cache

Architecture diagram for the Triveni query-first IoT data pipeline

No.01

Query-First Foundation

Triveni starts from the read path, not just ingestion, so fresh telemetry is shaped for fast querying from the beginning.

No.02

Serialization And Storage Layout

This part explains why Arrow and Parquet fit repeated analytical reads better than keeping telemetry as row-shaped messages.

No.03

Persistence, Consistency, And Cost

This section shows why durable analytics needs table snapshots, transaction-aware storage, and control over file economics.

No.04

Caching And Hot Data

Triveni keeps metadata and frequently reused data close, so object storage remains the durable source of truth without making every read cold.

No.05

Query Serving And System Fit

The final part focuses on the query surface, engine choices, and how Triveni serves dashboards, APIs, and analytical access.

02 / Documentation

CORE DOCUMENTS

Part 101

Query-First Is a Different System Design Goal

Why immediate queryability changes the whole backend shape and pushes design effort into the read path early.

Part 202

Serialization and Storage Layout

How JSON, Protobuf, Arrow, and Parquet affect CPU cost, data shape, and the efficiency of repeated scans.

Part 303

Persistence, Consistency, and Cost

Why raw files are not enough, how table metadata defines truth, and where batching and compaction matter.

Part 404

Caching and Hot Data

How memory, disk, and object storage work together to keep the hot working set fast without breaking correctness.

Part 505

Query Engines and System Fit

How Triveni exposes one query surface while keeping the execution layer flexible for different serving models.

03 / Flow

PIPELINE ARCHITECTURE

01 / INPUT

Device Events Enter The System

Telemetry arrives from devices as raw events, but Triveni treats querying, not just delivery, as the next important operation.

02 / INGEST

Schema-Aware Ingest Shapes The Data

Incoming events are validated and grouped by a known schema so the pipeline can build typed batches early.
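The validate-and-group step can be sketched as follows. This is a minimal illustration, not Triveni's implementation: the schema names, the event envelope with "schema" and "payload" keys, and the helper functions are all hypothetical.

```python
from collections import defaultdict

# Hypothetical schema registry: schema name -> {field: expected Python type}.
SCHEMAS = {
    "temperature.v1": {"device_id": str, "ts": int, "celsius": float},
    "door.v1": {"device_id": str, "ts": int, "open": bool},
}

def validate(event):
    """Return the schema name if the event matches its declared schema, else None."""
    name = event.get("schema")
    fields = SCHEMAS.get(name)
    if fields is None:
        return None
    payload = event.get("payload", {})
    if set(payload) != set(fields):
        return None
    for field, expected in fields.items():
        value = payload[field]
        # bool is a subclass of int in Python, so reject it for non-bool fields.
        if isinstance(value, bool) and expected is not bool:
            return None
        if not isinstance(value, expected):
            return None
    return name

def group_by_schema(events):
    """Split events into per-schema groups plus a list of rejected events."""
    groups, rejected = defaultdict(list), []
    for event in events:
        name = validate(event)
        if name is None:
            rejected.append(event)
        else:
            groups[name].append(event["payload"])
    return dict(groups), rejected

events = [
    {"schema": "temperature.v1", "payload": {"device_id": "a1", "ts": 1, "celsius": 21.5}},
    {"schema": "door.v1", "payload": {"device_id": "d7", "ts": 2, "open": True}},
    {"schema": "temperature.v1", "payload": {"device_id": "a2", "ts": 3, "celsius": "hot"}},
    {"schema": "unknown.v9", "payload": {}},
]
groups, rejected = group_by_schema(events)
```

Grouping by schema at this point means every group has a uniform shape, which is what makes building typed batches in the next stage straightforward.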

03 / BATCH

Arrow RecordBatches Form The Query-Native Layer

The pipeline converts event fields into typed columns so later scans avoid repeated row-by-row reconstruction work.
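To illustrate the row-to-column pivot without any external dependency, the sketch below uses Python's standard array module as a stand-in for Arrow's typed columns. It is not an Arrow RecordBatch; the column layout and the to_columns helper are assumptions made for the example.

```python
from array import array

# Assumed column layout for one schema: field name -> array typecode
# ("q" = signed 64-bit integer, "d" = 64-bit float). Strings stay in a plain list.
COLUMN_TYPES = {"device_id": None, "ts": "q", "celsius": "d"}

def to_columns(rows):
    """Pivot row-shaped dicts into one typed column per field."""
    columns = {
        name: ([] if code is None else array(code))
        for name, code in COLUMN_TYPES.items()
    }
    for row in rows:
        for name in COLUMN_TYPES:
            columns[name].append(row[name])
    return columns

rows = [
    {"device_id": "a1", "ts": 1, "celsius": 21.5},
    {"device_id": "a2", "ts": 2, "celsius": 19.0},
    {"device_id": "a1", "ts": 3, "celsius": 22.5},
]
batch = to_columns(rows)

# A scan over one column reads only that column's contiguous values;
# the other fields are never touched or re-parsed.
mean_celsius = sum(batch["celsius"]) / len(batch["celsius"])
```

The pivot happens once at ingest; every later aggregate over celsius reuses the same typed column instead of rebuilding it from row-shaped messages.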

04 / SNAPSHOT

Parquet Files And Table Snapshots Get Written

Batches are persisted as columnar files and published through snapshot-aware table metadata for durable, consistent reads.

05 / METADATA

Metadata Resolves What The Table Is Right Now

Readers do not guess from directory listings; they resolve the latest valid snapshot and scan only active files.
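A minimal sketch of snapshot resolution, assuming a commit log of numbered JSON files in which each commit lists files added to or removed from the table. The layout is loosely modeled on the idea of a transaction log; the action shape and file names here are simplified assumptions, not the Delta Lake format.

```python
import json
import tempfile
from pathlib import Path

def commit(log_dir, version, actions):
    """Publish one snapshot as a numbered JSON commit file.

    Opening with mode "x" fails if the version already exists, which on a
    local filesystem stands in for an atomic put-if-absent.
    """
    path = Path(log_dir) / f"{version:020d}.json"
    with open(path, "x") as f:
        json.dump(actions, f)

def resolve_active_files(log_dir):
    """Replay commits in version order and return (latest_version, active files)."""
    active, latest = set(), -1
    for path in sorted(Path(log_dir).glob("*.json")):
        for action in json.loads(path.read_text()):
            if "add" in action:
                active.add(action["add"])
            elif "remove" in action:
                active.discard(action["remove"])
        latest = int(path.stem)
    return latest, sorted(active)

with tempfile.TemporaryDirectory() as log_dir:
    commit(log_dir, 0, [{"add": "part-000.parquet"}, {"add": "part-001.parquet"}])
    commit(log_dir, 1, [{"add": "part-002.parquet"}])
    # A compaction commit replaces two small files with one larger file.
    commit(log_dir, 2, [
        {"remove": "part-000.parquet"},
        {"remove": "part-001.parquet"},
        {"add": "part-003.parquet"},
    ])
    version, files = resolve_active_files(log_dir)
```

After the compaction commit, the two replaced files may still exist in storage, but a reader that resolves the log sees only part-002.parquet and part-003.parquet as the table at version 2. A directory listing alone could not make that distinction.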

06 / CACHE

Hybrid Cache Keeps Recent And Reused Data Close

Memory protects critical metadata, disk keeps warm files nearby, and object storage remains the durable source of truth.
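The tiering can be sketched as a read-through cache. In this illustration plain dicts stand in for the disk tier and for object storage, and the HybridCache class, its capacity, and its counters are hypothetical. It also assumes cached files are immutable, so a cached copy can never be stale.

```python
from collections import OrderedDict

class HybridCache:
    """Read-through cache: memory tier (LRU), then disk tier, then object storage."""

    def __init__(self, object_store, memory_capacity=2):
        self.object_store = object_store   # durable source of truth (a dict here)
        self.disk = {}                     # stand-in for a local disk tier
        self.memory = OrderedDict()        # small LRU tier
        self.memory_capacity = memory_capacity
        self.reads = {"memory": 0, "disk": 0, "object_store": 0}

    def get(self, key):
        if key in self.memory:
            self.memory.move_to_end(key)   # mark as most recently used
            self.reads["memory"] += 1
            return self.memory[key]
        if key in self.disk:
            self.reads["disk"] += 1
            value = self.disk[key]
        else:
            self.reads["object_store"] += 1
            value = self.object_store[key]
            self.disk[key] = value         # keep a warm copy nearby
        self._promote(key, value)
        return value

    def _promote(self, key, value):
        self.memory[key] = value
        if len(self.memory) > self.memory_capacity:
            self.memory.popitem(last=False)  # evict the least recently used entry

store = {"f1": b"aaa", "f2": b"bbb", "f3": b"ccc"}
cache = HybridCache(store)
for key in ["f1", "f2", "f1", "f3", "f2"]:
    cache.get(key)
```

In this access pattern each file is fetched from object storage exactly once. The final read of f2 has been evicted from memory but is still served from the disk tier rather than going cold.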

07 / QUERY

SQL Runs On A Stable Serving Layer

Triveni exposes one query-facing contract so dashboards, APIs, and analysis tools can access fresh telemetry consistently.
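As a rough illustration of a single query-facing contract, the sketch below uses Python's built-in sqlite3 module purely as a stand-in engine. SQLite is a row store and is not claimed to be what Triveni runs; the table name, columns, and the query helper are assumptions. The point is only that consumers share one entry point that takes SQL and returns rows.

```python
import sqlite3

def open_serving_layer(rows):
    """Load telemetry into an in-memory SQLite table (a stand-in engine)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE telemetry (device_id TEXT, ts INTEGER, celsius REAL)")
    conn.executemany("INSERT INTO telemetry VALUES (?, ?, ?)", rows)
    return conn

def query(conn, sql, params=()):
    """The single query-facing contract: SQL in, list of dict rows out."""
    cursor = conn.execute(sql, params)
    names = [d[0] for d in cursor.description]
    return [dict(zip(names, row)) for row in cursor.fetchall()]

conn = open_serving_layer([("a1", 1, 21.5), ("a2", 2, 19.0), ("a1", 3, 22.5)])

# A dashboard and an API can issue different SQL through the same function.
dashboard = query(
    conn,
    "SELECT device_id, AVG(celsius) AS avg_c FROM telemetry "
    "GROUP BY device_id ORDER BY device_id",
)
latest = query(
    conn,
    "SELECT ts, celsius FROM telemetry WHERE device_id = ? "
    "ORDER BY ts DESC LIMIT 1",
    ("a1",),
)
```

Because callers depend only on the query function, the engine behind it could be swapped without changing the dashboard or API code, which is the flexibility the stable serving layer is meant to provide.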

08 / CONSUMER

Results Power Dashboards, APIs, And Analytics

End users consume the data through product interfaces, not by managing the pipeline internals or storage mechanics.

04 / Project Framing

CORE PRINCIPLES

01 / QUERY-FIRST

QUERY-FIRST THINKING

This prototype starts from the read path. The goal is to explain why queryability changes ingest, storage, and table design from the beginning.

02 / EDUCATIONAL

EDUCATIONAL PROTOTYPE

Triveni is intentionally framed as a study project. It demonstrates concepts clearly, while leaving the heavier production engineering work explicit and unfinished.

03 / GHZ

GHZ INSPIRATION

The system ideas here were shaped after studying GHz Architecture. This project adapts those patterns into a learning-oriented walkthrough for telemetry systems.

04 / MORE

MORE ENGINEERING

Operational hardening, deeper testing, production safety, and runtime validation still need substantial engineering before this should be treated as a real deployable system.