How To Set Up DeepSeek-R1 LLM AI ChatBot Using Ollama On An Ubuntu Linux GPU Cloud Server (VPS)
In this video, you will learn how to run your own DeepSeek-R1 LLM AI model on an Ubuntu Linux based NVIDIA H100 GPU cloud server, using a single GPU or multiple GPUs depending on the DeepSeek model size you require. The Ollama platform lets us easily find and run the DeepSeek-R1 LLM model from its library of models on our GPU server. DigitalOcean serves as the GPU cloud server host provider in this video demonstration, as they offer a convenient service called GPU Droplets that makes it easy to create GPU servers with all the necessary software and drivers. DeepSeek-R1 is DeepSeek's first-generation reasoning model; it achieves performance comparable to the big AI tech companies' models on code, mathematics, and reasoning tasks at a fraction of the cost. To attain this performance, DeepSeek used a process known as distillation: the reasoning patterns discovered by the larger model through reinforcement learning (RL) were distilled into smaller models, giving those smaller models better performance than they could otherwise reach.
🔵 Free $200 DigitalOcean cloud credits using my referral link: https://digitalocean.pxf.io/c/1245219/1373759/15890
Download PuTTY And PuTTYgen https://www.putty.org/
Ollama https://ollama.com/
1. Go to https://digitalocean.pxf.io/c/1245219/1373759/15890 and create a free DigitalOcean account. The above link is my referral link granting you $200 in free cloud credit for 60 days as a new user
2. On your Projects dashboard, click Create and select GPU Droplets
3. Configure your droplet to your liking, making sure to select DigitalOcean's AI/ML Ready OS image
4. Add a Public SSH Key for authentication and for logging into your GPU droplet via SSH. You will need to download and install PuTTY; you can do so here https://www.putty.org/ or follow this step-by-step video of mine
5. Once you have PuTTY and all its complementary software, open PuTTYgen (PuTTY Key Generator)
6. Click Generate and move your mouse randomly to generate your public and private key pair
7. Copy and paste your public key into DigitalOcean, click Add SSH Key, and select your added public key for use with the droplet you're creating
8. In PuTTYgen, save your Public key on your local machine, and save your Private key too. You will need the private key to log in, so keep a backup of it
9. Go back to your DigitalOcean GPU droplet creation page and click Create GPU Droplet
10. Once your GPU droplet is active, copy its Public IPv4 address from the Overview tab under connection details and open up the PuTTY program
11. Paste your GPU droplet's IP address into the Host Name field of PuTTY
12. Click the + symbol next to the SSH category in PuTTY
13. Click Auth
14. Click Browse and find your Private SSH key that you generated using PuTTYgen and saved on your local device.
15. Click Open to select it and then click Open again on PuTTY
16. The PuTTY command line terminal will then open and you should see a "login as:" prompt at the top left of the terminal
17. Type root and press Enter on your keyboard
18. You have now logged into your GPU droplet. To check your NVIDIA GPU specs, type one of the following commands in the command line terminal:
nvidia-smi
nvidia-smi -L
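Tip: if you prefer a compact summary over the full nvidia-smi table, nvidia-smi also accepts query flags (standard nvidia-smi usage, nothing DigitalOcean-specific):
nvidia-smi --query-gpu=name,memory.total,driver_version --format=csv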
19. In a browser, navigate to https://ollama.com/ and click on Download
20. Click Linux
21. Copy the single install command below to install Ollama on your Ubuntu Linux GPU Droplet:
curl -fsSL https://ollama.com/install.sh | sh
22. Paste the Ollama install command into your command line terminal and press Enter on your keyboard to execute the command
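To confirm the install worked before moving on, you can check the Ollama version, and, since the Linux install script also sets Ollama up as a systemd service, check that the service is running:
ollama --version
systemctl status ollama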
23. Once installed, go back to https://ollama.com/ and click on the search box at the top of the page. Search for the AI LLM model you would like to run on your GPU cloud server, which for this video is going to be DeepSeek-R1. Press Enter to search and select deepseek-r1.
24. Choose one of the following DeepSeek-R1 AI model sizes: 1.5b, 7b, 8b, 14b, 32b, 70b, and 671b. Please note that the largest DeepSeek-R1 model that can be run on a single NVIDIA H100 GPU is 70b. You will need the 8x GPU plan to run the 671b DeepSeek-R1 AI model
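If you would like to download the model weights ahead of time without starting a chat session, you can use ollama pull instead of ollama run; it fetches the model and exits (shown here for 70b, substitute the size you chose):
ollama pull deepseek-r1:70b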
25. In this video I chose 70b. To the right of the model you selected, you will see the command to run the model. Copy the command below if you also chose 70b:
ollama run deepseek-r1:70b
26. Paste the command above into the terminal and press Enter. The first run downloads the model weights before the chat starts, so larger models can take a while
27. You can now begin interacting with your DeepSeek model through the terminal by typing in your message and pressing Enter.
Note: For help options type:
/?
To exit DeepSeek and go back to terminal commands, type:
/bye
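Back at the normal shell, you can see which models Ollama has downloaded, and which models are currently loaded in GPU memory:
ollama list
ollama ps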
When you exit out of DeepSeek, it does not remember your previous conversation, so keep that in mind. To re-run DeepSeek, re-run the ollama command for the model you selected. In my case it was the following:
ollama run deepseek-r1:70b
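If you would rather script your prompts than chat in the terminal, Ollama also exposes a local HTTP API (on port 11434 by default). A minimal sketch using the 70b model from this video; the prompt text is just an example:
curl http://localhost:11434/api/generate -d '{
  "model": "deepseek-r1:70b",
  "prompt": "Why is the sky blue?",
  "stream": false
}'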
Timestamps:
0:00 - Intro & Context
1:18 - Creating GPU Cloud Server
10:59 - Installing Ollama On GPU Server
12:08 - Running & Interacting With DeepSeek-R1
18:20 - Closing & Outro
#DeepSeek #GPU #GpuServer