
TechXConf 2024: Insights into AI and Cloud Innovations

Attending TechXConf 2024, Asia’s largest AI and Cloud conference, was an eye-opener! Over two packed days, I explored groundbreaking topics and tools that left me eager to dive deeper. Here’s a quick rundown of three standout sessions that I found particularly fascinating.

1. Unleashing the Power of Azure AI with Microsoft Fabric
Speaker: Vinodh Kumar
Session Theme: Extracting Insights from Documents

In this session, Vinodh Kumar demonstrated how Azure Document Intelligence, integrated with Microsoft Fabric, can streamline document processing workflows. Using AI-powered tools, you can extract data from complex PDFs and other document formats, load it seamlessly into data lake solutions, and use it for advanced analytics. ...
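The extract-then-load workflow the session describes can be sketched in a few lines with the Python SDK. This is a minimal sketch, assuming the azure-ai-formrecognizer package; the endpoint, key, and file name are placeholders, and loading the rows onward into Fabric/OneLake is left out.

```python
# Minimal sketch: extract tables from a PDF with Azure Document Intelligence
# (azure-ai-formrecognizer SDK). Endpoint, key, and file name are placeholders.
from azure.ai.formrecognizer import DocumentAnalysisClient
from azure.core.credentials import AzureKeyCredential

endpoint = "https://<your-resource>.cognitiveservices.azure.com/"  # placeholder
key = "<your-key>"                                                 # placeholder

client = DocumentAnalysisClient(endpoint, AzureKeyCredential(key))

with open("sample-invoice.pdf", "rb") as f:
    # "prebuilt-layout" extracts text, tables, and structure from arbitrary PDFs
    poller = client.begin_analyze_document("prebuilt-layout", document=f)
result = poller.result()

# Flatten each detected table into rows, ready to land in a data lake
for table in result.tables:
    rows = [["" for _ in range(table.column_count)] for _ in range(table.row_count)]
    for cell in table.cells:
        rows[cell.row_index][cell.column_index] = cell.content
    for row in rows:
        print(row)
```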

November 17, 2024 · 4 min · mhathesh

Part 2: Exposing and Scaling the Ollama Model on AKS

Ollama is a versatile platform designed for deploying and managing language models like Llama. It’s particularly suited for environments where large models run on Kubernetes clusters, utilizing GPU resources efficiently. In this guide, we’ll deploy the Ollama model server using Docker and the CLI, and reference important configurations and best practices to make the deployment seamless.

Why Ollama?
Ollama offers a streamlined way to serve and scale language models in Kubernetes environments, with built-in support for GPU acceleration. This makes it an ideal choice for deploying large models that demand substantial computational power, such as the Llama models. Ollama’s compatibility with Docker and Kubernetes lets developers and data scientists quickly spin up model-serving instances, ensuring high availability and performance in both development and production setups. ...
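Once the server is exposed through a Kubernetes Service or Ingress, clients talk to Ollama’s REST API. A minimal sketch, assuming a hypothetical service URL and that a model such as llama3 has already been pulled onto the server:

```python
# Minimal sketch: call an Ollama server exposed by the cluster.
# The URL is a hypothetical Service/Ingress address; the model name
# assumes the model was already pulled on the server.
import requests

OLLAMA_URL = "http://ollama.my-cluster.example.com"  # placeholder endpoint

resp = requests.post(
    f"{OLLAMA_URL}/api/generate",
    json={
        "model": "llama3",
        "prompt": "Summarize what Kubernetes node pools are.",
        "stream": False,  # return one JSON object instead of a stream
    },
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["response"])
```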

October 27, 2024 · 5 min · mhathesh

Part 1: Preparing Your Environment and Setting Up AKS for Ollama Models

In this guide, we will walk through the step-by-step process of deploying Ollama models in an Azure Kubernetes Service (AKS) cluster. We will cover the necessary prerequisites, including setting up a GPU node pool to optimize performance for machine learning workloads, and using Helm charts to manage the deployment of the GPU operator, which is crucial for handling GPU resources effectively. This guide assumes that your AKS cluster is already deployed and configured, and provides detailed commands and configurations to ensure a smooth deployment process. ...
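The GPU node pool prerequisite can also be scripted. A minimal sketch, assuming the azure-identity and azure-mgmt-containerservice packages; the subscription ID, resource group, cluster, and pool names are placeholders, the VM size is just one common GPU SKU, and the guide’s own commands and Helm charts remain the canonical path.

```python
# Minimal sketch: add a GPU node pool to an existing AKS cluster with the
# Azure management SDK (azure-mgmt-containerservice). All names and the
# subscription ID are placeholders; the VM size is one common GPU SKU.
from azure.identity import DefaultAzureCredential
from azure.mgmt.containerservice import ContainerServiceClient
from azure.mgmt.containerservice.models import AgentPool

SUBSCRIPTION_ID = "<subscription-id>"  # placeholder
RESOURCE_GROUP = "my-rg"               # placeholder
CLUSTER_NAME = "my-aks"                # placeholder

client = ContainerServiceClient(DefaultAzureCredential(), SUBSCRIPTION_ID)

gpu_pool = AgentPool(
    count=1,
    vm_size="Standard_NC6s_v3",  # NVIDIA V100 SKU; pick one available in your region
    mode="User",                 # keep system pods on the existing system pool
    os_type="Linux",
)

poller = client.agent_pools.begin_create_or_update(
    RESOURCE_GROUP, CLUSTER_NAME, "gpunp", gpu_pool
)
print(poller.result().provisioning_state)
```

With the pool provisioned, the NVIDIA GPU operator is then installed from its Helm chart so the cluster can schedule GPU workloads, which is the part the guide covers in detail.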

October 27, 2024 · 5 min · mhathesh