Self Hosted Code-Copilot

Here i will describe how to install the code llama modell on one of your VM’s. Problem here is that you usually loose the GPU support because MS is still not able to provide GPU-P for Hyper-V. TY MS at this point.

First prepare a vm, I’m choosing ubuntu srv 22.04.03 LTS. Assigning 8 cores and 64GB of RAM.

Check if git is installed (insert “git –version” in the terminal):

Clone the llama.cpp project, insert the following commands in the terminal (“make” w/o anything is cpu only, you may need to install the gcc compiler):

git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
sudo apt install build-essential //gcc compiler
make

Downloading the model can be done with a nice script:

bash <(curl -sSL https://g.bodaay.io/hfd) -m TheBloke/CodeLlama-70B-hf-GGUF --concurrent 8 --storage models/

After downloading the modell, we can start llama.cpp.

./server \
    --model ./models/TheBloke_CodeLlama-70B-hf-GGUF/codellama-70b-hf.Q5_K_M.gguf --host 192.168.50.13 --port 8080 -c 2048

Next step would be to create a plugin for VS e.g. to autofill code from commentblock.

This is a an example on how to use AI in your own environment. I’m using the popular LLM Code Llama from Meta which is open source (also for commercial use) and supports C#, my main programming language. Another advantage is that you can use it in your own environment and your most value asset the sourcecode will not leave your network.

Code Llama comes in different sizes, I’m currently using the 70B Model which returns the best result but requires quite a bit of resources. I’m showing you how to self host a tool which runs the llm locally on a central server and write a simple VSCode extension to access the api to query results.

But first lets try to run it locally on my development machine. For this just install ollama (MIT License) and select the model from meta.

It usually gives you a code snipped and an explanation. It works, but it may be a bit inconvenient. In the next post, I’ll show you how to run the modell on your own virtual environment and access it via web or inside your code editor.