
Stream IoT data to S3 - the simple way


First, a short introduction to infinimesh, an Internet of Things (IoT) platform that runs completely in Kubernetes.

infinimesh enables the seamless integration of the entire IoT ecosystem, independently of any cloud technology or provider. infinimesh easily manages millions of devices in a compliant, secure, scalable and cost-efficient way without vendor lock-in.

We released some plugins over the last few weeks, a task we had on our roadmap for a while. Here is what we have so far:

Elastic
Connect infinimesh IoT seamlessly to Elastic.

Timeseries
Redis-timeseries with Grafana for time series analysis and rapid prototyping. It can be used in production when configured as a Redis cluster, and it is ready to be hosted via Redis-Cloud.

SAP Hana
All code to connect infinimesh IoT Platform to any SAP Hana instance.

Snowflake
All code to connect infinimesh IoT Platform to any Snowflake instance.

Cloud Connect
All code to connect infinimesh IoT Platform to the public cloud providers AWS, GCP and Azure. This plugin enables customers to use their own cloud infrastructure and extend infinimesh to other services, like Scalytics, using their own cloud native data pipelines and integration tools.

We have chosen Docker as the main technology because it enables users to run their own plugins in their own space, in an environment they control. And since our plugins don't consume many resources, they fit perfectly into the free tier of AWS EC2, which is what I use in this blog post.

The plugin repository is structured with developer friendliness in mind. All code is written in Go, and the configuration is done in the Docker files. Since you need to put credentials into them, we highly advise running the containers in a controlled and secure environment.

[Screenshot: infinimesh UI]

Stream IoT data to S3

Here I would like to show how easy it is to combine IoT with infrastructure already installed in public clouds. The most common task we see is streaming data to S3; most of our customers use S3 either directly with AWS or by running their own object storage that speaks the S3 protocol, like MinIO, which is also Kubernetes native.

Of course, you need either a private installation of infinimesh or, if you use the cloud versions of both, accounts on infinimesh.cloud and AWS. Here is a screenshot of the SMA device I used to write this post:

Preparation

  1. Spin up a Linux EC2 instance in the free tier; a t2.micro instance should fit almost all needs
  2. Log into the VM and install Docker as described in the AWS documentation: Docker basics for Amazon ECS - Amazon Elastic Container Service
  3. Install docker-compose and git:
    sudo curl -L \
    https://github.com/docker/compose/releases/latest/download/docker-compose-$(uname -s)-$(uname -m) \
    -o /usr/local/bin/docker-compose \
    && sudo chmod +x /usr/local/bin/docker-compose \
    && sudo yum install git -y

That’s all the preparation we need. Now log off and log in again so that the permissions set earlier take effect.

Setup and Run

  1. Clone the plugin repo:
    git clone https://github.com/infinimesh/plugins.git
  2. Edit CloudConnect/docker-compose.yml and replace CHANGEME with your credentials
  3. Compose and start the connector (-d detaches from the console and lets the containers run in the background):
    docker-compose -f CloudConnect/docker-compose.yml --project-directory . up --build -d
  4. Check the container logs:
    docker logs plugins_csvwriter_1 -f
We used Go as the development language, so the resource consumption is low.


After about one minute the first CSV file should arrive in S3. That’s all - easy and straightforward.



Some developer internals

We have built some magic around the plugins to make them as easy as possible for customers to use and, at the same time, easy for developers to adapt.

First we iterate over /objects to find all endpoints marked with [device], call the API for each device, and store the data as a sliding window in a local Redis store to buffer network latency. After a few seconds we send the captured data as CSV to the desired endpoints. In our tests we transported data from up to 2 million IoT devices through this plugin, each sending ten key:value pairs as JSON every 15 seconds.
