Posts

Showing posts with the label AI

SynthLink comparison against Google NQ and FRAMES

Claude 4 is out, and I couldn't resist pushing it to judge and critically review our newly developed open benchmark for Deep Search Multi-Hop systems. Here's the outcome, uncut: Based on my analysis of the SynthLink GitHub repository and research on Google Natural Questions and FRAMES, here's a comprehensive comparison of these three datasets: SynthLink: SynthLink is a collection of complex, multi-hop questions designed for testing deep search / deep research systems, split into categories like Historical Impact Analysis, Economic and Industrial Shifts, Environmental and Ecological Consequences, Scientific and Technological Evolution, Policy and Social Movements, and STEM and Future Tech. Key Features: Uses a scoring system that measures answer accuracy, source relevance, reasoning quality, fact-checking, and search efficiency. Five evaluation metrics: F1 Score (answer accuracy), Precision@5 (source relevance), Reasoning Quality Score (RQS), Fact-Checking Score (FCS), and It...
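Two of the metrics named above, F1 and Precision@5, have widely used textbook definitions. The following is a minimal sketch of those standard definitions, assuming whitespace tokenization and exact-match source identifiers; it is not SynthLink's actual scoring code, and the function names `token_f1` and `precision_at_k` are hypothetical.

```python
from collections import Counter


def token_f1(prediction: str, reference: str) -> float:
    """Token-overlap F1 between a predicted and a reference answer.

    Assumption: lowercase whitespace tokenization; a real benchmark
    may normalize punctuation, articles, etc. differently.
    """
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        # Both empty counts as a match, otherwise no overlap is possible.
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most min(count) times.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


def precision_at_k(retrieved: list[str], relevant: set[str], k: int = 5) -> float:
    """Fraction of the top-k retrieved sources that are relevant.

    Divides by k even if fewer than k sources were retrieved,
    which penalizes short result lists.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = retrieved[:k]
    return sum(1 for source in top_k if source in relevant) / k


if __name__ == "__main__":
    print(token_f1("the treaty of versailles", "treaty of versailles"))
    print(precision_at_k(["a", "b", "c", "d", "e"], {"a", "c"}))
```

The Reasoning Quality and Fact-Checking scores are not sketched here, since the excerpt does not say how they are computed.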

How ChatGPT o1 Helped Build XRPayroll

Up until recently, I was a skeptic about AI’s role in coding. The idea that AI could replace developers? Let’s just say I wasn’t buying it. But after this December, my perspective has shifted. Not only can AI support developers, it can dramatically speed up development. Here’s the story of how XRPayroll, my new app, came to life with the help of OpenAI’s ChatGPT o1. From Simple UI to Functional App: When I started in December, I had a straightforward goal: build a simple XRP UI. Fast forward a few weeks, and XRPayroll is now an app with user management, admin login, and basic role-based access control (RBAC). What’s incredible is that approximately 70% of this app was AI-generated. Using OpenAI’s ChatGPT o1, I managed to implement: Vue.js Views: From basic layouts to dynamic components, o1 helped me structure and write reusable code efficiently. HTML: Generating clean, functional markup without getting bogged down in the details. SQLite Queries: Writing database calls with accura...

What the Heck is Superposition and Entanglement?

If you’ve ever heard the words superposition or entanglement thrown around in conversations about quantum physics, you may have nodded politely while your brain quietly filed them away in the "too confusing to deal with" folder. These aren't just theoretical quirks; they're the foundation of mind-bending tech like Google's latest quantum chip, Willow, with its 105 qubits. Superposition challenges our understanding of reality, suggesting that particles don't have definite states until observed. This principle is crucial in quantum technologies, enabling applications like quantum computing and quantum cryptography. What's in it for us? In short, nothing at the moment. 105 qubits sounds awesome, but it will neither crack encryption nor enhance AI in the next few years. There are some use cases for Willow, like drug (protein) discovery or solving certain mathematical problems when they aren't too complicated. Right now, Google managed to turn physical qubits ...

Can AI Really Code?

My upcoming novel, Catalyst, is set in a world where AI is a major player in shaping the human future. I did some research into how AI is currently being used in software development and found that it has some amazing capabilities, but also some limitations that are a bit concerning. I'd even go so far as to say that those models are a bit of a hoax. They're impressive, but they don't actually solve anything. Yes, AI coding assistants like Devin and Copilot are impressive in demos and demo videos. In reality, they're not as powerful as you'd think, but they're great for simple tasks like crafting email parsing functions or authentication flows. However, I ran into some issues when I tried to use them in more complex situations. When I asked the AI to "write a connector from a database to ingest data into Spark," it didn't understand and made mistakes. And that is a plain, simple, and so well-documented task that every non-coder could do that by sim...

Run Llama3 (or any LLM / SLM) on Your MacBook in 2024

I'm gonna be real with you: the Cloud and SaaS / PaaS are great... until they aren't. When you're elbow-deep in doing something with the likes of ChatGPT or Gemini or whatever, the last thing you need is for your AI assistant to start choking ("It seems that upper network connection was reset") because 5G or the local WiFi crapped out or some server halfway across the world is having a meltdown. That's why I'm all about running large language models (LLMs) like Llama3 locally. Yep, right on your trusty MacBook. Sure, the cloud's got its perks, but here's why local is the way to go, especially for me: Privacy: When you're brainstorming the next big thing, you don't want your ideas floating around on some random server. Keeping your data local means it's yours, and that's a level of control I can get behind. Offline = Uninterrupted Flow: Whether you're on a plane, at a coffee shop with spotty WiFi, or jus...

AI in Product Development?

I like product ideation brainstorming. Done right and focused, it opens my mind to think much more analytically about an idea, its development, and its trajectory. On the other hand, I have often sat through brainstorming sessions that were just a waste of time. And to be honest, can you count how often a session went sideways, got stuck in the same old thought patterns, or was dominated by the loudest voices in the room? I did a test yesterday with GPT-4o, and it blew the lid off my creative potential. I had tried the same exercise with the earlier models, and it was a colossal waste of time and energy. Adding AI to the product team: is it worth it? In short, after the test: yes, it's definitely worth it. Why? For us as startup founders, product managers, or developers, the job isn't just about executing on a roadmap; we have to build the roadmap and come up with the right product idea at the right time in the first place. That means s...

AI's False Reality: Understanding Hallucination

Artificial Intelligence (AI) has leapfrogged to become the poster child of technological innovation, on track to transform industries on a scale similar to the Industrial Revolution of the 1800s. But as a cutting-edge technology, AI presents its own unique challenge, one that exploits our human tendency to trust: AI hallucinations. This phenomenon, where AI models generate outputs that are factually incorrect, misleading, or entirely fabricated, raises complex questions about the reliability and trustworthiness of AI models and larger systems. The tendency for AI to hallucinate comes from several interrelated factors. Overfitting, a condition where models become overly specialized to their training data, can lead to confident but wildly inaccurate responses when presented with novel scenarios (Guo et al., 2017). Moreover, biases embedded within datasets shape the models' understanding of the world; if these datasets are flawed or unreprese...

GPT & GenAI for Startup Storytelling

OpenAI's ChatGPT and Google's Bard are the most used GenAI tools today; the first is backed by a massive Microsoft investment, and the other is an experiment from Google. But did you know that you can also use them to optimize and hack your startup? For startups, creating pitch scripts, sales emails, and elevator pitches with generative AI (GenAI) can help you not only save time but also validate your marketing and wording. Curious? Here are a few prompt hacks for startups to create, improve, and validate buyer personas, your startup's mission/vision statements, and unique selling proposition (USP) definitions. First Step: Introduce yourself and your startup. Introduce yourself, your startup, your website, your idea, your position, and in a few words what you are doing to the chatbot: Prompt: I'm NAME and our startup NAME, with website URL, is doing WHATEVER. With PRODUCT NAME, we aim to change or disrupt INDUSTRY. Bard is able to pull information from y...

Some fun with Apache Wayang and Spark / Tensorflow

Apache Wayang is an open-source Federated Learning (FL) framework developed by the Apache Software Foundation. It provides a platform for distributed machine learning, with a focus on ease of use and flexibility. It supports multiple FL scenarios and provides a variety of tools and components for building FL systems. It also includes support for various communication protocols and data formats, as well as integration with other Apache projects such as Apache Kafka and Apache Pulsar for data streaming. The project aims to make it easier to develop and deploy machine learning models in decentralized environments. It's important to note that these are just examples, and they may not be the right way for your project to interact with Apache Wayang; you may need to check the documentation of the Apache Wayang project ( https://wayang.apache.org ) to see how to interact with it. I just want to point out how easy it is to use different languages to interact between Wayang and Spark. Also, you need to mak...

Combined Federated Data Services with Blossom and Flower

When it comes to Federated Learning frameworks, we typically find two leading open source projects: Apache Wayang [2] (maintained by databloom) and Flower [3] (maintained by Adap). At first glance, both frameworks seem to do the same thing. But, as usual, a second look tells another story. How does Flower differ from Wayang? Flower is a federated learning system written in Python that supports a large number of training and AI frameworks. The beauty of Flower is the strategy concept [4]: the data scientist can define which framework is used and how. Flower delivers the model to the desired framework, watches the execution, gets the calculations back, and starts the next cycle. That makes Federated Learning in Python easy, but at the same time limits its use to platforms supported by Python. Flower has, as far as I could see, no data query optimizer; an optimizer understands the code and splits the model into smaller pieces to use multiple frameworks ...

Compile Apache Wayang on Mac M1

We will release Apache Wayang v0.6.0 in the next few days, and during release testing I wondered whether we could get Wayang running on M1 (ARM). And yes, with a few small changes: voilà!

Install maven, scala, sqlite and groovy:

    brew install maven scala groovy sqlite

Download OpenJDK 8 for M1 from https://www.azul.com/downloads/?version=java-8-lts&os=macos&architecture=arm-64-bit&package=jdk and install the pkg.

Get Apache Wayang either from https://dist.apache.org/repos/dist/dev/wayang/ , or git-clone directly:

    git clone https://github.com/apache/incubator-wayang.git

Start the build process:

    cd incubator-wayang
    export JAVA_HOME=/Library/Java/JavaVirtualMachines/zulu-8.jdk/Contents/Home
    mvn clean install

Ready to go:

    [INFO] Reactor Summary for Apache Wayang 0.6.0-SNAPSHOT:
    ...
    [INFO] BUILD SUCCESS
    [INFO] ------------------------------------------------------------------------
    [INFO] Total time:  06:24 min

After the build is done the binaries are located in mavens ...

Next Internet comes with IoT

The Internet we know is a great space for collaboration, social media and gaming. But when it comes to business or transactions, the power belongs to a few big players. Remember the S3 outage, when half of the North American services were offline? Or the Dyn attack, which knocked out half of the internet for hours? The next internet could be a blockchain-based independent network that uses as many protocols as are available, runs on top of the Internet, and has no single person in control of it. In a nutshell, a blockchain is a decentralized system in which every transaction gets mathematically approved by the members of the system, so every member involved in that transaction knows about it. The information about the transaction is stored on the distributed servers of the blockchain. That makes manipulation nearly impossible, and the transaction record is also highly available at all times. IoT devices are getting more and more intelligent and can now create meshed networks by themselves, switching from a sensor ...
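The tamper resistance described above comes from chaining records by hash: each block stores the hash of its predecessor, so changing an old transaction invalidates everything after it. The following is a minimal sketch of that idea only, assuming a single in-memory list; it leaves out the distributed part (consensus, networking, replication), and the helper names `block_hash`, `append_block`, and `is_valid` are hypothetical.

```python
import hashlib
import json

GENESIS_PREV = "0" * 64  # placeholder "previous hash" for the first block


def block_hash(index: int, transaction: dict, prev_hash: str) -> str:
    """SHA-256 over a canonical JSON encoding of the block contents."""
    payload = json.dumps(
        {"index": index, "tx": transaction, "prev": prev_hash},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_block(chain: list, transaction: dict) -> None:
    """Append a transaction, linking it to the hash of the previous block."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_PREV
    index = len(chain)
    chain.append({
        "index": index,
        "tx": transaction,
        "prev": prev_hash,
        "hash": block_hash(index, transaction, prev_hash),
    })


def is_valid(chain: list) -> bool:
    """Recompute every hash and check each link to its predecessor."""
    prev_hash = GENESIS_PREV
    for i, block in enumerate(chain):
        if block["index"] != i or block["prev"] != prev_hash:
            return False
        if block["hash"] != block_hash(block["index"], block["tx"], block["prev"]):
            return False
        prev_hash = block["hash"]
    return True


if __name__ == "__main__":
    ledger = []
    append_block(ledger, {"from": "sensor-1", "to": "gateway", "value": 3})
    append_block(ledger, {"from": "gateway", "to": "cloud", "value": 3})
    print(is_valid(ledger))       # chain as written
    ledger[0]["tx"]["value"] = 99  # tamper with an old transaction
    print(is_valid(ledger))       # recomputed hash no longer matches
```

In a real network, many members hold copies of the chain and agree on new blocks, which is what makes rewriting history impractical rather than merely detectable.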