To make it short: I'm now using an existing open-source tool for VSCode that can point to my local server. It's called Open Copilot and is licensed under Apache 2.0, which allows commercial use.
Once you have set up the URL for your local server, you can query the extension "Cody" via the icon in the left sidebar or inline in your code.
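If you want to check that the extension can actually reach your server, you can query the API directly from a terminal. A minimal sketch, assuming the llama.cpp server from the setup below is listening on port 8080 (adjust host, port and prompt to your environment):

# llama.cpp's built-in server exposes a /completion endpoint; n_predict limits the generated tokens
curl http://localhost:8080/completion \
  -H "Content-Type: application/json" \
  -d '{"prompt": "// C# method that reverses a string\n", "n_predict": 128}'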
Here I will describe how to install the Code Llama model on one of your VMs. The problem here is that you usually lose GPU support, because Microsoft is still not able to provide GPU-P for Hyper-V. Thanks, Microsoft.
First, prepare a VM; I'm choosing Ubuntu Server 22.04.3 LTS and assigning 8 cores and 64 GB of RAM.
Check if git is installed (enter "git --version" in the terminal):
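If the command fails, git can be installed from the Ubuntu repositories:

git --version                              # should print something like "git version 2.34.1"
sudo apt update && sudo apt install git    # only needed if git is missing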
Clone the llama.cpp project and run the following commands in the terminal (a plain "make" builds the CPU-only version; you may need to install the gcc compiler first):
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
sudo apt install build-essential    # provides the gcc compiler and make
make
Downloading the model can be done with a nice script:
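As an alternative sketch, in case you don't want to use the script: you can pull a quantized GGUF conversion of Code Llama from Hugging Face and point the server you just built at it. The repository, file name and quantization level below are assumptions; pick whatever fits your RAM (Q4_K_M of the 70B model is roughly 40 GB, so it fits into the 64 GB of this VM):

# assumption: TheBloke's GGUF conversion of Code Llama 70B Instruct, Q4_K_M quantization
wget -P models/ https://huggingface.co/TheBloke/CodeLlama-70B-Instruct-GGUF/resolve/main/codellama-70b-instruct.Q4_K_M.gguf
# start the HTTP server that ships with llama.cpp (named llama-server in newer builds)
# and make it reachable from other machines on the network
./server -m models/codellama-70b-instruct.Q4_K_M.gguf -c 4096 --host 0.0.0.0 --port 8080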
This is an example of how to use AI in your own environment. I'm using the popular LLM Code Llama from Meta, which is open source (also for commercial use) and supports C#, my main programming language. Another advantage is that by running it in your own environment, your most valuable asset, the source code, never leaves your network.
Code Llama comes in different sizes; I'm currently using the 70B model, which returns the best results but requires quite a bit of resources. I'll show you how to self-host a tool that runs the LLM on a central server, and how to write a simple VSCode extension that accesses the API to query results.
But first, let's try to run it locally on my development machine. For this, just install Ollama (MIT license) and select the model from Meta.
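A minimal sketch of that local test (the model tag is the one listed in the Ollama library; the 70B tag needs roughly 40 GB of free memory, and smaller tags such as 7b and 13b exist too):

ollama pull codellama:70b    # downloads the model on first use
ollama run codellama:70b "Write a C# extension method that reverses a string."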
It usually gives you a code snippet and an explanation. It works, but it can be a bit inconvenient. In the next post, I'll show you how to run the model in your own virtual environment and access it via the web or from inside your code editor.
I just found this PCIe card from ASUS (ASUS Hyper M.2 x16 Gen 4) for about €50; it lets you connect 4 additional NVMe drives to your mainboard. It's for my EPYC board (ASRock Rack ROMED8-2T), but it also works with sTRX40 boards. I've put 4x Samsung 970 EVO Plus drives in a RAID 1 configuration. The storage will be used primarily for the VMs and the databases. For the adapter to work, you will need to change the PCIe lane configuration of the slot from x16 to x4+x4+x4+x4 (bifurcation).
This is my old NAS, based on an Intel Core i3-6100 from 2016. The system uses 5 Seagate NAS HDDs (ST4000VN000) in a Storage Spaces configuration in RAID 5 mode. It was reliable and quiet, and it ran for almost 5 years without any issue. However, it was too slow and I needed more space. It will serve as a playground for my future YouTube videos and maybe later as a backup machine.
I finally received my new Microsemi HBA 1100-16i SAS controller. Now I can begin setting up my hyperconverged server. This host bus adapter offers 4×4 internal SATA/SAS channels. Together with the ports on the mainboard, this should be more than enough for any upgrades in the next few years.