Notes to Self

Alex Sokolsky's Notes on Computers and Programming

Llama

Reference: ubuntu-22-04-from-zero-to-70b-llama.

Prerequsites

sudo apt update
sudo apt install python3-pip

My current OS:

> lsb_release -a
No LSB modules are available.
Distributor ID: Linuxmint
Description:    Linux Mint 22.1
Release:        22.1
Codename:       xia

Install CUDA Toolkit

https://developer.nvidia.com/cuda-downloads

Make your choices, then:

wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2404/x86_64/cuda-keyring_1.1-1_all.deb
sudo dpkg -i cuda-keyring_1.1-1_all.deb
sudo apt-get update
sudo apt-get -y install cuda-toolkit-12-8

Venv

python3 -m venv .venv
source .venv/bin/activate
pip3 install -r requirements.txt

Check it:

(.venv)  alex@exi  ~/Projects/pytorch  python3 -c "import torch; print(torch.cuda.device_count())"
1

Install ollama

https://github.com/ollama/ollama

curl https://ollama.ai/install.sh | sh
>>> Installing ollama to /usr/local
[sudo] password for alex:
>>> Downloading Linux amd64 bundle
######################################################################## 100.0%
>>> Creating ollama user...
>>> Adding ollama user to render group...
>>> Adding ollama user to video group...
>>> Adding current user to ollama group...
>>> Creating ollama systemd service...
>>> Enabling and starting ollama service...
Created symlink /etc/systemd/system/default.target.wants/ollama.service → /etc/systemd/system/ollama.service.
>>> NVIDIA GPU installed.

Then:

ollama run codellama:70b