184 lines
7.8 KiB
Markdown
184 lines
7.8 KiB
Markdown
# Nix AI
|
|
|
|
Look through options [here](https://search.wavelens.io/)
|
|
|
|
(c) 2024 Wavelens
|
|
|
|
## Introducing Nix AI
|
|
Enhancing GPU Utilization and Reproducibility in Deep Learning Projects
|
|
|
|
Nix AI is a novel project built upon the Nix package manager, designed to optimize GPU utilization at runtime for deep learning tasks. By leveraging Nix's capabilities, Nix AI ensures efficient utilization of GPU resources, thereby enhancing the speed and efficency of ai development and training.
|
|
|
|
One of the key features of Nix AI is its integration with Hydra, a powerful job scheduling system. Hydra enables the efficient management of training queues, allowing for parallel execution of tasks and effective utilization of computational resources. This ensures that deep learning experiments are conducted in a timely and resource-efficient manner.
|
|
|
|
Furthermore, Nix AI emphasizes reproducibility through its declarative approach. By defining the entire computational environment—including software dependencies, configurations, and environment variables—in a declarative manner, Nix AI facilitates easy reproduction of experimental setups. This ensures that research findings can be validated and replicated with confidence, fostering scientific rigor and collaboration within the deep learning community.
|
|
|
|
In summary, Nix AI offers a robust solution for enhancing GPU utilization, optimizing job scheduling, and promoting reproducibility in deep learning projects. By combining the power of Nix and Hydra with a declarative approach, Nix AI provides researchers and practitioners with a reliable platform for advancing the state-of-the-art in artificial intelligence.
|
|
|
|
Here is a brief schema of the new training pipeline:
|
|
|
|
![Work Schma](./docs/work_schema.png)
|
|
|
|
### Key Features of Nix AI and Nix
|
|
|
|
#### Contextualizing Machine Learning on NixOS
|
|
|
|
Nix AI emerges as a specialized toolset tailored explicitly for NixOS, a Linux distribution distinguished by its declarative and reproducible package management system. By aligning with the Nix package manager, Nix AI endeavors to streamline the intricacies of installing, configuring, and deploying ML models within NixOS environments. This initiative addresses a critical need within the NixOS ecosystem, facilitating smoother integration and operationalization of ML workflows.
|
|
#### Operational Dynamics of Nix AI
|
|
|
|
Nix AI operates as an integrated software library and ecosystem meticulously engineered to facilitate the development of machine learning projects within NixOS environments. Fundamentally, Nix AI leverages the capabilities of the Nix package manager, a cornerstone of NixOS, to furnish a seamless and efficient platform for managing dependencies, environments, and workflows associated with ML endeavors.
|
|
#### Foundational Role of Nix Flakes in Reproducibility
|
|
|
|
At the heart of Nix AI lies the concept of Nix flakes, offering a declarative and reproducible mechanism for defining package dependencies and configurations. Leveraging Nix flakes, Nix AI ensures the reproducibility of ML projects across disparate environments, thereby fostering collaboration and experimentation sans concerns pertaining to inconsistent dependencies or divergent configurations.
|
|
#### Containerization for Isolation and Flexibility
|
|
|
|
When not using NixOS, Nix AI embraces containerization to afford isolated and reproducible environments for ML development and deployment. By encapsulating ML workflows within lightweight and portable containers, Nix AI facilitates seamless experimentation with diverse libraries, frameworks, and configurations, safeguarding system integrity and ensuring uniformity across the development lifecycle.
|
|
#### Automated Setup and Configuration
|
|
|
|
Nix AI streamlines the initialization and customization process for ML development via automated scripts and utilities. Whether provisioning Hydra clusters for distributed training or configuring build machines for continuous integration, Nix AI automates mundane tasks, enabling practitioners to channel their efforts towards model refinement and innovation.
|
|
#### Extensibility and Customization
|
|
|
|
Designed with modularity and extensibility in focus, Nix AI empowers practitioners to tailor ML environments to suit bespoke requirements. Whether integrating with external libraries, extending existing functionalities, or crafting custom workflows, Nix AI provides the flexibility and modularity requisite for addressing diverse use cases and evolving research imperatives.
|
|
|
|
## Installation
|
|
### By using Nix the Package Manager
|
|
|
|
#### Install Nix and Systemd Containers
|
|
```bash
|
|
apt install xz-utils systemd-container
|
|
sh <(curl -L https://nixos.org/nix/install) --daemon
|
|
```
|
|
|
|
#### Add/Change the following lines to the nix.conf file
|
|
```bash
|
|
experimental-features = nix-command flakes
|
|
substituters = https://attic.wavelens.io/main?priority=5&want-mass-query=true https://cache.nixos.org/
|
|
trusted-public-keys = main:3VVGDhOgY/x5hn7XIkVhqjEjHvOnU7o1cPlrWv91Mko= cache.nixos.org-1:6NCHdD59X431o0gWypbMrAURkbJ16ZPMQFGspcDShjY=
|
|
```
|
|
|
|
#### Install Nix Extra Containers
|
|
```bash
|
|
git clone https://github.com/erikarvstedt/extra-container
|
|
extra-container/util/install.sh
|
|
```
|
|
|
|
#### Clone Nix AI
|
|
```bash
|
|
git clone https://git.wavelens.io/public/nix-ai
|
|
cd nix-ai
|
|
nix flake check
|
|
```
|
|
|
|
#### Setup Nix AI Containers for Hydra
|
|
```bash
|
|
nix run .#buildContainer_hydra -- create --start
|
|
machinectl shell hydra /bin/sh
|
|
su hydra-queue-runner
|
|
cd ~
|
|
```
|
|
|
|
Edit the .ssh/config file to include the following lines:
|
|
```bash
|
|
Host [BUILDER_IP]
|
|
HostName [BUILDER_IP]
|
|
User builder
|
|
Port 65535
|
|
IdentityFile ~/.ssh/id_builder
|
|
```
|
|
|
|
Create a new SSH key for the builder user:
|
|
```bash
|
|
mkdir ~/.ssh
|
|
ssh-keygen -t ed25519 -f ~/.ssh/id_builder -C builder
|
|
```
|
|
|
|
After adding the public key to the authorized_keys file, you can now connect to the builder machine and then switch to the hydra user:
|
|
```bash
|
|
ssh [BUILDER_IP]
|
|
exit # exit the builder machine
|
|
exit # exit the hydra-queue-runner user
|
|
su hydra
|
|
cd ~
|
|
```
|
|
|
|
Create a new SSH key for the git server:
|
|
```bash
|
|
ssh-keygen -t ed25519 -f ~/.ssh/id_git -C git
|
|
```
|
|
|
|
Create ~/.ssh/config file with the following content:
|
|
```bash
|
|
Host [GIT SERVER DOMAIN]
|
|
HostName [GIT SERVER DOMAIN]
|
|
User git
|
|
Port 22
|
|
IdentityFile ~/.ssh/id_git
|
|
```
|
|
|
|
Copy the public key to use it as a deploy key for the git repository:
|
|
```bash
|
|
cat ~/.ssh/id_git.pub
|
|
exit # exit the hydra user
|
|
```
|
|
|
|
Create an Admin account for the Hydra:
|
|
```bash
|
|
hydra-create-user admin --password-prompt --role admin
|
|
```
|
|
|
|
#### Setup Nix AI Containers for a BuildMachine
|
|
```bash
|
|
nix run .#buildContainer_builder -- create --start
|
|
machinectl shell builder@builder /bin/sh
|
|
```
|
|
|
|
Add the public key to the authorized_keys file:
|
|
```bash
|
|
mkdir ~/.ssh
|
|
echo "[BUILDER PUBLIC KEY]" >> ~/.ssh/authorized_keys
|
|
```
|
|
|
|
#### Finalize by setting up the Nix AI environment
|
|
|
|
The Hydra should be running on `http://[HYDRA_IP]:4444`. You can access the Hydra interface by visiting this URL in your browser.
|
|
|
|
Download the Nix AI Template for your ML project:
|
|
```bash
|
|
git clone https://git.wavelens.io/public/nix-ai-template
|
|
```
|
|
You can now start developing your ML project using Nix AI!
|
|
|
|
### By using NixOS the Linux Distribution
|
|
|
|
[Coming Soon]
|
|
|
|
## Updating
|
|
### By using Nix the Package Manager
|
|
|
|
Step 1: Backup the current Systemd Container /var/lib folder and the /etc/ssh folder
|
|
|
|
Step 2: Pull the latest changes from the Nix AI repository
|
|
|
|
Step 3: Run the following commands to update the Nix AI Containers
|
|
```bash
|
|
nix run .#buildContainer_hydra -- update
|
|
```
|
|
or if you want to update the BuildMachine
|
|
```bash
|
|
nix run .#buildContainer_builder -- update
|
|
```
|
|
|
|
Step 4: If necessary apply the Backup to the Systemd Container /var/lib folder and the /etc/ssh folder
|
|
|
|
You have successfully updated your Nix AI Containers!
|
|
|
|
### By using NixOS the Linux Distribution
|
|
|
|
Run the following command to update Nix AI:
|
|
```bash
|
|
cd [NixOS Configuration Folder]
|
|
nix flake update
|
|
nixos-rebuild switch --flake .#
|
|
```
|
|
You have successfully updated your NixOS System!
|