2024-07-03 02:45:27 +02:00


# Nix AI
Browse the available options [here](https://search.wavelens.io/).
(c) 2024 Wavelens
## Introducing Nix AI
Enhancing GPU Utilization and Reproducibility in Deep Learning Projects
Nix AI is a project built on the Nix package manager, designed to optimize GPU utilization at runtime for deep learning tasks. By leveraging Nix's capabilities, Nix AI aims to make efficient use of GPU resources, improving the speed and efficiency of AI development and training.
One of the key features of Nix AI is its integration with Hydra, the Nix-based continuous build system, which Nix AI uses as a job scheduler. Hydra manages the training queue and dispatches jobs to build machines in parallel, so that computational resources are used effectively and deep learning experiments run in a timely, resource-efficient manner.
Furthermore, Nix AI emphasizes reproducibility through its declarative approach. By defining the entire computational environment—including software dependencies, configurations, and environment variables—in a declarative manner, Nix AI facilitates easy reproduction of experimental setups. This ensures that research findings can be validated and replicated with confidence, fostering scientific rigor and collaboration within the deep learning community.
In summary, Nix AI offers a robust solution for enhancing GPU utilization, optimizing job scheduling, and promoting reproducibility in deep learning projects. By combining the power of Nix and Hydra with a declarative approach, Nix AI provides researchers and practitioners with a reliable platform for advancing the state-of-the-art in artificial intelligence.
Here is a brief schema of the new training pipeline:
![Work Schema](./docs/work_schema.png)
### Key Features of Nix AI and Nix
#### Contextualizing Machine Learning on NixOS
Nix AI emerges as a specialized toolset tailored explicitly for NixOS, a Linux distribution distinguished by its declarative and reproducible package management system. By aligning with the Nix package manager, Nix AI endeavors to streamline the intricacies of installing, configuring, and deploying ML models within NixOS environments. This initiative addresses a critical need within the NixOS ecosystem, facilitating smoother integration and operationalization of ML workflows.
#### Operational Dynamics of Nix AI
Nix AI is an integrated software library and ecosystem for developing machine learning projects within NixOS environments. It relies on the Nix package manager, a cornerstone of NixOS, to manage the dependencies, environments, and workflows of ML projects.
#### Foundational Role of Nix Flakes in Reproducibility
At the heart of Nix AI lies the concept of Nix flakes, which offer a declarative and reproducible mechanism for defining package dependencies and configurations. With Nix flakes, Nix AI keeps ML projects reproducible across different environments, so collaborators can experiment without worrying about inconsistent dependencies or diverging configurations.
#### Containerization for Isolation and Flexibility
When not using NixOS, Nix AI embraces containerization to afford isolated and reproducible environments for ML development and deployment. By encapsulating ML workflows within lightweight and portable containers, Nix AI facilitates seamless experimentation with diverse libraries, frameworks, and configurations, safeguarding system integrity and ensuring uniformity across the development lifecycle.
#### Automated Setup and Configuration
Nix AI streamlines the initialization and customization process for ML development via automated scripts and utilities. Whether provisioning Hydra clusters for distributed training or configuring build machines for continuous integration, Nix AI automates mundane tasks, enabling practitioners to channel their efforts towards model refinement and innovation.
#### Extensibility and Customization
Designed with modularity and extensibility in focus, Nix AI empowers practitioners to tailor ML environments to suit bespoke requirements. Whether integrating with external libraries, extending existing functionalities, or crafting custom workflows, Nix AI provides the flexibility and modularity requisite for addressing diverse use cases and evolving research imperatives.
## Installation
### By using Nix the Package Manager
#### Install Nix and Systemd Containers
```bash
apt install xz-utils systemd-container
sh <(curl -L https://nixos.org/nix/install) --daemon
```
#### Add or change the following lines in the nix.conf file (`/etc/nix/nix.conf` for a multi-user installation)
```bash
experimental-features = nix-command flakes
substituters = https://attic.wavelens.io/main?priority=5&want-mass-query=true https://cache.nixos.org/
trusted-public-keys = main:3VVGDhOgY/x5hn7XIkVhqjEjHvOnU7o1cPlrWv91Mko= cache.nixos.org-1:6NCHdD59X431o0gWypbMrAURkbJ16ZPMQFGspcDShjY=
```
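The settings above can also be applied from a script. The following is a minimal sketch, assuming a plain `key = value` file layout; `set_nix_option` is a hypothetical helper and not part of Nix or Nix AI. It replaces an existing assignment of the same key instead of appending a duplicate.

```bash
# Sketch: set "key = value" in a nix.conf-style file without duplicating keys.
# set_nix_option is a hypothetical helper name used for illustration only.
set_nix_option() {
  local file="$1" key="$2" value="$3"
  local tmp
  mkdir -p "$(dirname "$file")"
  touch "$file"
  tmp="$(mktemp)"
  # Keep every line except an existing assignment of this key.
  # grep exits non-zero when nothing is left to print, hence "|| true".
  grep -v -E "^[[:space:]]*${key}[[:space:]]*=" "$file" > "$tmp" || true
  # Append the new assignment.
  printf '%s = %s\n' "$key" "$value" >> "$tmp"
  # Write back through the original file to keep its owner and permissions.
  cat "$tmp" > "$file"
  rm -f "$tmp"
}
```

For example, `set_nix_option /etc/nix/nix.conf experimental-features "nix-command flakes"` (run as root) would set the first option. On a systemd-based multi-user installation, the daemon likely needs a restart afterwards, for instance with `systemctl restart nix-daemon`.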
#### Install Nix Extra Containers
```bash
git clone https://github.com/erikarvstedt/extra-container
extra-container/util/install.sh
```
#### Clone Nix AI
```bash
git clone https://git.wavelens.io/public/nix-ai
cd nix-ai
nix flake check
```
#### Setup Nix AI Containers for Hydra
```bash
nix run .#buildContainer_hydra -- create --start
machinectl shell hydra /bin/sh
su hydra-queue-runner
cd ~
```
Edit the `~/.ssh/config` file (create the `~/.ssh` directory first if it does not exist) to include the following lines:
```bash
Host [BUILDER_IP]
HostName [BUILDER_IP]
User builder
Port 65535
IdentityFile ~/.ssh/id_builder
```
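If the host entry should be created by a script rather than by hand, a sketch along these lines may help. `add_ssh_host` is a hypothetical helper, and the address `192.0.2.10` in the usage note is a documentation placeholder for `[BUILDER_IP]`. The helper only appends the block when no `Host` line for that host exists yet.

```bash
# Sketch: append a Host block to an SSH client config unless it is already there.
# add_ssh_host is a hypothetical helper; arguments are
#   config file, host, user, port, identity file.
add_ssh_host() {
  local config="$1" host="$2" user="$3" port="$4" identity="$5"
  mkdir -p "$(dirname "$config")"
  touch "$config"
  # Keep the client config private to the owning user.
  chmod 600 "$config"
  # -x -F: compare whole lines literally, so "Host a" does not match "Host ab".
  if grep -q -x -F -e "Host $host" "$config"; then
    return 0
  fi
  printf 'Host %s\n  HostName %s\n  User %s\n  Port %s\n  IdentityFile %s\n' \
    "$host" "$host" "$user" "$port" "$identity" >> "$config"
}
```

A call such as `add_ssh_host ~/.ssh/config 192.0.2.10 builder 65535 '~/.ssh/id_builder'` would produce the block shown above. The same helper could be reused for the git server entry later in this guide.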
Create a new SSH key for the builder user:
```bash
mkdir -p ~/.ssh
ssh-keygen -t ed25519 -f ~/.ssh/id_builder -C builder
```
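After adding the public key (`~/.ssh/id_builder.pub`) to the `authorized_keys` file of the builder user on the build machine (see the BuildMachine setup below), connect to the build machine once to verify the connection, then switch to the hydra user: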
```bash
ssh [BUILDER_IP]
exit # exit the builder machine
exit # exit the hydra-queue-runner user
su hydra
cd ~
```
Create a new SSH key for the git server:
```bash
mkdir -p ~/.ssh  # make sure the target directory exists for this user
ssh-keygen -t ed25519 -f ~/.ssh/id_git -C git
```
Create the `~/.ssh/config` file with the following content:
```bash
Host [GIT SERVER DOMAIN]
HostName [GIT SERVER DOMAIN]
User git
Port 22
IdentityFile ~/.ssh/id_git
```
Copy the public key to use it as a deploy key for the git repository:
```bash
cat ~/.ssh/id_git.pub
exit # exit the hydra user
```
Create an admin account for Hydra:
```bash
hydra-create-user admin --password-prompt --role admin
```
#### Setup Nix AI Containers for a BuildMachine
```bash
nix run .#buildContainer_builder -- create --start
machinectl shell builder@builder /bin/sh
```
Add the public key to the authorized_keys file:
```bash
mkdir -p ~/.ssh
echo "[BUILDER PUBLIC KEY]" >> ~/.ssh/authorized_keys
```
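Running the `echo` line twice adds the key twice, and OpenSSH in its default configuration may ignore an `authorized_keys` file with overly permissive modes. The sketch below, with the hypothetical helper `authorize_key`, adds a key only once and tightens the permissions. It takes the home directory as an argument so it can be tried against a scratch directory first.

```bash
# Sketch: add a public key to <home>/.ssh/authorized_keys exactly once.
# authorize_key is a hypothetical helper name used for illustration only.
authorize_key() {
  local home_dir="$1" pubkey="$2"
  local ssh_dir="$home_dir/.ssh"
  local auth="$ssh_dir/authorized_keys"
  mkdir -p "$ssh_dir"
  # Restrict access to the owning user.
  chmod 700 "$ssh_dir"
  touch "$auth"
  chmod 600 "$auth"
  # -x -F: match the whole line literally; append only if the key is missing.
  grep -q -x -F -e "$pubkey" "$auth" || printf '%s\n' "$pubkey" >> "$auth"
}
```

Inside the builder container this could be called as `authorize_key "$HOME" "[BUILDER PUBLIC KEY]"`, with the placeholder replaced by the content of `id_builder.pub`.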
#### Finalize by setting up the Nix AI environment
Hydra should now be running on `http://[HYDRA_IP]:4444`. You can access the Hydra web interface by visiting this URL in your browser.
Download the Nix AI Template for your ML project:
```bash
git clone https://git.wavelens.io/public/nix-ai-template
```
You can now start developing your ML project using Nix AI!
### By using NixOS the Linux Distribution
[Coming Soon]
## Updating
### By using Nix the Package Manager
Step 1: Back up the systemd container's `/var/lib` folder and `/etc/ssh` folder.

Step 2: Pull the latest changes from the Nix AI repository.

Step 3: Run the following command to update the Nix AI Hydra container:
```bash
nix run .#buildContainer_hydra -- update
```
Or, to update the BuildMachine container:
```bash
nix run .#buildContainer_builder -- update
```
Step 4: If necessary, restore the backup of the systemd container's `/var/lib` and `/etc/ssh` folders.
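
The backup and restore in steps 1 and 4 could be scripted roughly as follows. This is a sketch under assumptions: both folders are taken to live under the container's root filesystem, whose location is passed in as `ROOT` because it depends on your setup, and the function names are hypothetical. Paths are archived relative to `ROOT` so the archive can be unpacked into the same place later.

```bash
# Sketch: back up and restore var/lib and etc/ssh below a container root.
# Both function names are hypothetical; ROOT is the container's root
# filesystem on the host (an assumption, adjust to your setup).

# backup_container_state ROOT ARCHIVE
backup_container_state() {
  local root="$1" archive="$2"
  # -C switches into ROOT first, so member names stay relative.
  tar -czf "$archive" -C "$root" var/lib etc/ssh
}

# restore_container_state ROOT ARCHIVE
restore_container_state() {
  local root="$1" archive="$2"
  tar -xzf "$archive" -C "$root"
}
```

Both functions would normally be run as root, ideally while the container is stopped, so that files are not modified during the copy.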
You have successfully updated your Nix AI Containers!
### By using NixOS the Linux Distribution
Run the following command to update Nix AI:
```bash
cd [NixOS Configuration Folder]
nix flake update
nixos-rebuild switch --flake .#
```
You have successfully updated your NixOS System!