
Portfolio

This page lists selected projects and systems I have built or contributed to. Each entry includes a short, factual description and links to source code or live documentation where available.


Lucendex

A neutral, non-custodial execution layer for XRPL trading.

Repository: https://github.com/2pk03/lucendex
Website: https://lucendex.com

Lucendex is a non-custodial, deterministic routing engine for the XRPL decentralized exchange.
It indexes AMM pools and orderbook data, evaluates available paths, and produces quotes using a deterministic QuoteHash mechanism.
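
As a rough illustration of what a deterministic quote hash can look like, the sketch below hashes a canonical JSON encoding of a quote. This is not Lucendex's actual QuoteHash implementation: the function name `quote_hash`, the field names, and the choice of SHA-256 over sorted-key JSON are all assumptions made for the example.

```python
import hashlib
import json


def quote_hash(quote: dict) -> str:
    """Return a SHA-256 hex digest over a canonical JSON encoding of a quote."""
    # Sorted keys and fixed separators make the encoded bytes independent of
    # dict insertion order, so equal quotes always produce the same digest.
    canonical = json.dumps(quote, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical quote fields; amounts are strings to avoid float rounding.
quote_a = {"pay": "XRP", "receive": "USD", "amount_in": "100",
           "amount_out": "52.31", "path": ["amm:XRP/USD"]}
quote_b = {"path": ["amm:XRP/USD"], "amount_out": "52.31",
           "amount_in": "100", "receive": "USD", "pay": "XRP"}

# Same content in a different key order yields the same hash.
print(quote_hash(quote_a) == quote_hash(quote_b))
```

The point of the canonical encoding is that two parties can recompute the hash independently and compare results without trusting each other's serialization.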

The service uses PostgreSQL and PL/pgSQL for indexing and routing logic, and provides Ed25519-authenticated API access.


docAI Toolkit

A Python toolkit for document analysis workflows.

Repository: https://github.com/2pk03/docai
PyPI: https://pypi.org/project/docai-toolkit/

Provides utilities for loading documents, splitting and preprocessing text, integrating embeddings or ML-based processing, and preparing inputs for AI pipelines or further downstream processing.
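
To make the splitting step concrete, here is a minimal sketch of fixed-size chunking with overlap, a common preprocessing step before computing embeddings. It does not use or reproduce the docai-toolkit API: `split_text` and its parameters are hypothetical names chosen for this example.

```python
def split_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into chunks of at most chunk_size characters.

    Consecutive chunks share `overlap` characters so that a sentence cut at
    a chunk boundary still appears with some context in the next chunk.
    """
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("require chunk_size > 0 and 0 <= overlap < chunk_size")

    step = chunk_size - overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        # Stop once a chunk reaches the end, to avoid emitting a trailing
        # chunk that is fully contained in the previous one.
        if start + chunk_size >= len(text):
            break
        start += step
    return chunks


print(split_text("abcdefghij", chunk_size=4, overlap=2))
```

Character-based splitting is the simplest variant; splitting on sentences or tokens follows the same pattern with a different unit.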

Available as a published package on PyPI.


kaf-s3 Connector

A Kafka-to-S3 connector implemented in Python.

Repository: https://github.com/2pk03/kaf-s3
PyPI: https://pypi.org/project/kaf-s3-connector/

Consumes records from Kafka topics, batches them, and writes them to S3 (or compatible object storage) using configurable batching parameters and storage formats.
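
The batching idea can be sketched without any Kafka or S3 dependency. The example below is not the kaf-s3 code or its configuration: the `Batcher` class, its `max_records` and `max_age_seconds` parameters, and the `flush` callback (which stands in for an upload to object storage) are assumptions made for illustration.

```python
import time
from typing import Callable


class Batcher:
    """Collect records and hand them to `flush` by size or by age."""

    def __init__(self, flush: Callable[[list], None],
                 max_records: int = 500, max_age_seconds: float = 30.0):
        self._flush = flush
        self._max_records = max_records
        self._max_age = max_age_seconds
        self._records: list = []
        self._opened_at = 0.0

    def add(self, record) -> None:
        if not self._records:
            # The age of a batch is measured from its first record.
            self._opened_at = time.monotonic()
        self._records.append(record)
        too_big = len(self._records) >= self._max_records
        too_old = time.monotonic() - self._opened_at >= self._max_age
        if too_big or too_old:
            self.flush()

    def flush(self) -> None:
        """Emit the current batch, if any, and start a new one."""
        if self._records:
            self._flush(self._records)
            self._records = []


# Stand-in for an upload: collect each flushed batch in a list.
written: list = []
batcher = Batcher(written.append, max_records=3)
for i in range(7):
    batcher.add(f"record-{i}")
batcher.flush()  # flush the final partial batch on shutdown

print([len(batch) for batch in written])
```

In this sketch the age check only runs when a record arrives; a real consumer loop would also call `flush` on a timer so that a quiet topic does not hold a batch open indefinitely.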

Published as a package on PyPI.


Scalytics-Federated / Schema→Iceberg Application

Internal system combining data ingestion, schema normalization and processing pipelines to produce “AI-ready” data views for analytics or ML.

The architecture integrates data from arbitrary source systems or message topics, normalizes their schemas, and writes unified results into an Iceberg-based data lakehouse.
Processing runs on Apache Flink, which supports both streaming and batch workloads.
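
Schema normalization in general can be sketched as a per-source field mapping onto one unified record shape. The example below is a toy illustration only and says nothing about how the internal system is implemented: the source names, field names, and the `normalize` helper are all hypothetical.

```python
# Hypothetical per-source mappings: unified field name -> source field name.
FIELD_MAPPINGS = {
    "crm": {"customer_id": "id", "name": "full_name", "created_at": "signup_ts"},
    "webshop": {"customer_id": "userId", "name": "displayName", "created_at": "registered"},
}


def normalize(source: str, record: dict) -> dict:
    """Map a source-specific record onto the unified field names."""
    mapping = FIELD_MAPPINGS[source]
    # Missing source fields become None so every output row has the same columns.
    return {unified: record.get(raw) for unified, raw in mapping.items()}


rows = [
    normalize("crm", {"id": 7, "full_name": "Ada", "signup_ts": "2024-01-05"}),
    normalize("webshop", {"userId": 9, "displayName": "Lin"}),
]
for row in rows:
    print(row)
```

Once every source produces rows with the same columns, they can be appended to a single table regardless of where they came from.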


Apache Wayang (as committer and member of the PMC)

Project: Apache Wayang (incubating) — https://wayang.apache.org/ 

What Apache Wayang is:

Wayang is a unified data processing framework that allows developers to write data workflows in a platform-agnostic way. It translates logical plans into an intermediate representation (WayangPlan), then optimizes them and executes them across one or more processing engines (e.g. relational databases, batch engines, stream engines) without requiring users to write engine-specific code.

Why Wayang matters:

  • Enables cross-platform execution: the same data flow code can run on different engines (PostgreSQL, Spark, Flink, etc.) depending on workload and environment.

  • Provides cost and performance optimization: its optimizer selects the most efficient execution plan across platforms.

  • Supports federated or distributed data scenarios and heterogeneous data infrastructures, useful when data lives in multiple, different storage or processing systems. 
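
The idea behind cost-based platform selection can be shown with a toy model. This is not Wayang's API or its actual optimizer, which works on whole plans and learned cost estimates: the `Platform` class, the `pick_platform` function, the platform names, and every cost number below are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Platform:
    name: str
    startup_cost: float  # fixed overhead per job (hypothetical units)
    cost_per_row: float  # marginal cost per input row (hypothetical units)

    def estimate(self, rows: int) -> float:
        return self.startup_cost + self.cost_per_row * rows


def pick_platform(platforms: list[Platform], rows: int) -> Platform:
    """Return the platform with the lowest estimated cost for this input size."""
    return min(platforms, key=lambda p: p.estimate(rows))


# Made-up numbers: a single node starts fast but scales poorly,
# a cluster pays a large startup cost but is cheap per row.
platforms = [
    Platform("single_node_db", startup_cost=1.0, cost_per_row=0.001),
    Platform("cluster_engine", startup_cost=50.0, cost_per_row=0.00001),
]

print(pick_platform(platforms, rows=1_000).name)
print(pick_platform(platforms, rows=100_000_000).name)
```

With these numbers the small job stays on the single node and the large job moves to the cluster, which is the kind of workload-dependent choice the bullets above describe.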

My contributions / involvement:
I contribute as a committer and PMC member of Wayang. I have worked on development, architecture and integration of Wayang, bridging its core engine with data-processing pipelines (batch/stream), data ingestion integrations, and use cases for data-lake / AI-ready datasets.

You can find a detailed write-up of my experience and discussion of performance implications in my blog post: What are performance implications of distributed data processing across multiple engines
