Posts

Showing posts with the label wayang

What are the performance implications of cross-platform execution within Wayang, and how can these be optimized for each cloud provider?

Apache Wayang is a dataflow and distributed computing framework designed for cross-platform data processing, enabling applications to be decoupled from underlying platforms. This allows for platform-agnostic application development. Wayang's cross-platform optimizer determines the most efficient execution plan across various platforms, such as Apache Flink and Apache Spark. The primary performance challenges in cross-platform execution within Wayang include heterogeneous hardware, network latency and bandwidth, data locality, resource management, vendor-specific optimizations, and abstraction overhead. This report analyzes the performance implications of cross-platform execution in Wayang, focusing on optimization strategies for major cloud providers like AWS, Azure, and GCP, as of June 09, 2025. The key benefit of Wayang is its ability to optimize execution plans across multiple platforms, potentially leading to significant performance gains compared to single-platform execution....
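
To make the trade-off behind cross-platform planning concrete, here is a toy sketch of cost-based platform selection. It is not Wayang's optimizer or API: the operator names, the per-platform cost numbers, and the flat `MOVE_COST` penalty are all invented for illustration, and the search is a brute-force enumeration that a real optimizer would replace with pruning. It only shows why a mixed plan can beat a single-platform plan once data movement is priced in.

```python
from itertools import product

# Hypothetical cost estimates (arbitrary units) for running each operator
# on each platform. These numbers are made up, not measured Wayang figures.
OPERATOR_COST = {
    "read":    {"java": 1.0, "spark": 4.0},
    "map":     {"java": 6.0, "spark": 2.0},
    "reduce":  {"java": 8.0, "spark": 3.0},
    "collect": {"java": 1.0, "spark": 2.0},
}

# Hypothetical flat penalty for handing data from one platform to another.
MOVE_COST = 2.5

PIPELINE = ("read", "map", "reduce", "collect")


def plan_cost(pipeline, assignment):
    """Sum operator costs plus one movement penalty per platform switch."""
    compute = sum(OPERATOR_COST[op][platform]
                  for op, platform in zip(pipeline, assignment))
    switches = sum(1 for a, b in zip(assignment, assignment[1:]) if a != b)
    return compute + switches * MOVE_COST


def best_plan(pipeline, platforms=("java", "spark")):
    """Try every operator-to-platform assignment and return the cheapest."""
    candidates = product(platforms, repeat=len(pipeline))
    return min((plan_cost(pipeline, a), a) for a in candidates)


if __name__ == "__main__":
    cost, assignment = best_plan(PIPELINE)
    print(cost, assignment)
    # prints: 10.5 ('java', 'spark', 'spark', 'spark')
```

With these made-up numbers, running everything on one platform costs 16.0 (java) or 11.0 (spark), while the mixed plan costs 10.5 even after paying for one platform switch. Raising `MOVE_COST` above 3.0 makes the all-spark plan win, which is the network and data-locality effect described above.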

Some fun with Apache Wayang and Spark / Tensorflow

Apache Wayang is an open-source Federated Learning (FL) framework developed by the Apache Software Foundation. It provides a platform for distributed machine learning, with a focus on ease of use and flexibility. It supports multiple FL scenarios and provides a variety of tools and components for building FL systems. It also includes support for various communication protocols and data formats, as well as integration with other Apache projects such as Apache Kafka and Apache Pulsar for data streaming. The project aims to make it easier to develop and deploy machine learning models in decentralized environments. Note that these are just examples, and they may not be the right way for your project to interact with Apache Wayang; check the documentation of the Apache Wayang project ( https://wayang.apache.org ) to see how to interact with it. I just want to point out how easy it is to use different languages to interact between Wayang and Spark. Also, you need to mak...

Get Apache Wayang ready to test within 5 minutes

Hey followers, I often get asked how to get Apache Wayang ( https://wayang.apache.org ) up and running without a full big data processing system behind it. We heard you and built a full-fledged Docker container called BDE (Blossom Development Environment), which is basically Wayang. Here's the repo: https://github.com/databloom-ai/BDE I made a short screencast on how to get it running with Docker on OS X, and we also made two hands-on videos to explain the first steps. Let's start with the basics: Docker. Get the whole platform with: docker pull ghcr.io/databloom-ai/bde:main At the end, the Jupyter notebook address is shown; control-click on it (OS X), and the browser should open and log you in automatically. Voila, done. You now have a fully working Wayang environment, and we prepared three notebooks to make it easier to dive in. Watch our development tutorial video (part 1) to get a better understanding of what Wayang can do, and what it cannot.

Compile Apache Wayang on Mac M1

We will release Apache Wayang v0.6.0 in the next few days, and during release testing I wondered whether we could get Wayang running on M1 (ARM). And yes, with a few small changes, voila! Install Maven, Scala, SQLite and Groovy: brew install maven scala groovy sqlite Download OpenJDK 8 for M1: https://www.azul.com/downloads/?version=java-8-lts&os=macos&architecture=arm-64-bit&package=jdk and install the pkg. Get Apache Wayang either from https://dist.apache.org/repos/dist/dev/wayang/ , or git-clone directly: git clone https://github.com/apache/incubator-wayang.git Start the build process: cd incubator-wayang export JAVA_HOME=/Library/Java/JavaVirtualMachines/zulu-8.jdk/Contents/Home mvn clean install Ready to go: [INFO] Reactor Summary for Apache Wayang 0.6.0-SNAPSHOT: ... [INFO] BUILD SUCCESS [INFO] ------------------------------------------------------------------------ [INFO] Total time: 06:24 min After the build is done, the binaries are located in Maven's ...