Tag: Ollama

  • Deploying Deepseek R1 in Ubuntu

    Hello everyone! Hope you all have been well. I’ve been messing around with AI and different models for my job as we are implementing AI in our software.

    I wanted to learn more about this, and it just so happened that Deepseek R1 was announced, so I decided to start there. I originally installed it on my MacBook Pro with a smaller model, and for that hardware it worked well. However, my son needed the laptop to record music, so I restored it to macOS and am now using the old Linux laptop I had when I was at Canonical. This laptop is a beast. It’s a little on the older side, but here’s what it has under the hood:

    • Intel 7th Gen Core i7 processor, 8 cores, 3.8 GHz
    • 32 GB DDR4 Memory
    • 256GB SSD
    • 1TB HDD
    • Nvidia GeForce GTX 1050 with 4GB VRAM

    So, these are the steps I followed to install Deepseek R1 and Open-WebUI as a Docker container on my laptop for testing.

    The first thing I did was install Ollama, an open-source tool for running LLMs locally that works with the Deepseek R1 models.

    To download and install Ollama, all you need to do is run the following command:

    curl -fsSL https://ollama.com/install.sh | sh

    After this, I had to add an environment variable to the systemd service, which is located at /etc/systemd/system/ollama.service

    Under the [Service] section, add the following:

    Environment="OLLAMA_HOST=0.0.0.0"

    This tells Ollama to listen on all network interfaces rather than just localhost. Since I use Docker, this works best; I kept running into issues getting Open-WebUI to connect to my Deepseek model without it.
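
    For reference, here is roughly what the [Service] section looks like after the edit. This is a sketch based on the service file the install script creates; the ExecStart path and the other lines may differ on your system:

    [Service]
    ExecStart=/usr/local/bin/ollama serve
    User=ollama
    Group=ollama
    Restart=always
    Environment="OLLAMA_HOST=0.0.0.0"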

    Next, you need to reload the daemon and the Ollama Service:

    sudo systemctl daemon-reload
    sudo systemctl restart ollama.service
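
    To confirm Ollama is up and listening, you can hit its API with curl (it serves on port 11434 by default):

    curl http://localhost:11434

    If everything is working, this prints "Ollama is running".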

    Now, we need to load the model. I use the 8 billion parameter model since my laptop can handle that fairly easily. To load this model, use the following command:

    ollama run deepseek-r1:8b

    There are other models you can use depending on your system. The 1.5 billion parameter model is the smallest and works fairly well on most systems. I ran it on Raspberry Pis and on my Mac laptop with 16GB of RAM and no GPU, and it ran well. To see the different models, you can check out the details on Ollama’s website here:

    https://ollama.com/library/deepseek-r1
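
    Picking a different size is just a matter of changing the tag on the run command. For example, the smaller model I mentioned running on the Raspberry Pis and the Mac:

    ollama run deepseek-r1:1.5b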

    You will be dropped into the Ollama shell, where you can interact with the model. To exit, just type /bye at the prompt and you will be back at the Linux shell.
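
    A session looks something like this (the question and the output here are just an illustration):

    >>> Why is the sky blue?
    <the model prints its reasoning, then the answer>
    >>> /bye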

    Next, we need to install a nice web front end. I use Open-WebUI since it works like ChatGPT and is super simple to set up.

    I run Open-WebUI as a Docker container on my laptop to keep things clean. If I want to stop using it, I can remove the container and my system is back to the way it was. Plus, updating the web front end is really easy with Docker containers, as I show at the end.

    Make sure you install Docker on your machine. You can use snaps or apt; I followed the instructions on the Docker website, and it’s pretty straightforward. After you install Docker, add yourself to the docker group, then log out and log back in so that the group membership gets applied.
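
    Adding your user to the docker group is a one-liner:

    sudo usermod -aG docker $USER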

    I also had to install the Nvidia Container Toolkit so that I could use the GPU in my containers. To do this, run the following command to add the repo to Ubuntu, and then use apt to install:

    curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
      && curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
        sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
        sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list

    Next, we need to update the sources and install the toolkit:

    sudo apt update
    sudo apt install -y nvidia-container-toolkit
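
    Depending on your setup, you may also need to register the Nvidia runtime with Docker. NVIDIA’s Container Toolkit docs list this step, which writes the runtime entry into /etc/docker/daemon.json:

    sudo nvidia-ctk runtime configure --runtime=docker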

    Next, we need to restart the Docker daemon so it uses the toolkit:

    sudo systemctl restart docker
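
    To verify that containers can see the GPU, you can run nvidia-smi inside a throwaway CUDA container (the image tag here is just an example; any recent CUDA base image works):

    docker run --rm --gpus all nvidia/cuda:12.4.1-base-ubuntu22.04 nvidia-smi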

    Once that has been completed, we need to pull the container:

    docker pull ghcr.io/open-webui/open-webui:cuda

    After this, run the following command to start the container and have it run at startup:

    docker run -d -p 3000:8080 --add-host=host.docker.internal:host-gateway -v open-webui:/app/backend/data --name open-webui --restart always --gpus all ghcr.io/open-webui/open-webui:cuda
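
    A quick rundown of the flags: -p 3000:8080 maps port 3000 on the host to the container’s port 8080, -v open-webui:/app/backend/data keeps your settings and chats in a named volume, --restart always brings the container back up on reboot, and --gpus all passes the GPU through. When a new image is published, updating is just a pull and a recreate (the named volume keeps your data):

    docker pull ghcr.io/open-webui/open-webui:cuda
    docker stop open-webui
    docker rm open-webui
    docker run -d -p 3000:8080 --add-host=host.docker.internal:host-gateway -v open-webui:/app/backend/data --name open-webui --restart always --gpus all ghcr.io/open-webui/open-webui:cuda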

    Now, open your web browser, and point it to:

    http://localhost:3000

    On the landing page, set up a new admin user. Then select your model from the drop-down in the top-left corner, ask your new chatbot a question, and you’re done.