Nix AI
Look through options here
(c) 2024 Wavelens
Introducing Nix AI
Enhancing GPU Utilization and Reproducibility in Deep Learning Projects
Nix AI is a project built on the Nix package manager and designed to optimize GPU utilization at runtime for deep learning tasks. By leveraging Nix's capabilities, Nix AI makes efficient use of GPU resources, improving the speed and efficiency of AI development and training.
One of the key features of Nix AI is its integration with Hydra, the Nix-based continuous build system, which Nix AI uses as a job scheduler. Hydra manages the training queue, runs tasks in parallel, and distributes them across the available computational resources, so that deep learning experiments are conducted in a timely and resource-efficient manner.
Furthermore, Nix AI emphasizes reproducibility through its declarative approach. By defining the entire computational environment—including software dependencies, configurations, and environment variables—in a declarative manner, Nix AI facilitates easy reproduction of experimental setups. This ensures that research findings can be validated and replicated with confidence, fostering scientific rigor and collaboration within the deep learning community.
In summary, Nix AI offers a robust solution for enhancing GPU utilization, optimizing job scheduling, and promoting reproducibility in deep learning projects. By combining the power of Nix and Hydra with a declarative approach, Nix AI provides researchers and practitioners with a reliable platform for advancing the state-of-the-art in artificial intelligence.
Here is a brief schema of the new training pipeline:
Key Features of Nix AI and Nix
Contextualizing Machine Learning on NixOS
Nix AI emerges as a specialized toolset tailored explicitly for NixOS, a Linux distribution distinguished by its declarative and reproducible package management system. By aligning with the Nix package manager, Nix AI endeavors to streamline the intricacies of installing, configuring, and deploying ML models within NixOS environments. This initiative addresses a critical need within the NixOS ecosystem, facilitating smoother integration and operationalization of ML workflows.
Operational Dynamics of Nix AI
Nix AI is an integrated software library and ecosystem for developing machine learning projects within NixOS environments. It builds on the Nix package manager, a cornerstone of NixOS, to manage the dependencies, environments, and workflows of ML projects.
Foundational Role of Nix Flakes in Reproducibility
At the heart of Nix AI lies the concept of Nix flakes, which offer a declarative and reproducible mechanism for defining package dependencies and configurations. By using Nix flakes, Nix AI makes ML projects reproducible across different environments, so collaborators can experiment without worrying about inconsistent dependencies or divergent configurations.
Containerization for Isolation and Flexibility
When not using NixOS, Nix AI embraces containerization to afford isolated and reproducible environments for ML development and deployment. By encapsulating ML workflows within lightweight and portable containers, Nix AI facilitates seamless experimentation with diverse libraries, frameworks, and configurations, safeguarding system integrity and ensuring uniformity across the development lifecycle.
Automated Setup and Configuration
Nix AI streamlines the initialization and customization process for ML development via automated scripts and utilities. Whether provisioning Hydra clusters for distributed training or configuring build machines for continuous integration, Nix AI automates mundane tasks, enabling practitioners to channel their efforts towards model refinement and innovation.
Extensibility and Customization
Designed with modularity and extensibility in mind, Nix AI lets practitioners tailor ML environments to their own requirements. Whether integrating external libraries, extending existing functionality, or building custom workflows, Nix AI provides the flexibility needed for diverse use cases and evolving research needs.
Installation
By using Nix the Package Manager
Install Nix and Systemd Containers
apt install xz-utils systemd-container
sh <(curl -L https://nixos.org/nix/install) --daemon
Add or change the following lines in the nix.conf file (/etc/nix/nix.conf for a multi-user install):
experimental-features = nix-command flakes
substituters = https://attic.wavelens.io/main?priority=5&want-mass-query=true https://cache.nixos.org/
trusted-public-keys = main:3VVGDhOgY/x5hn7XIkVhqjEjHvOnU7o1cPlrWv91Mko= cache.nixos.org-1:6NCHdD59X431o0gWypbMrAURkbJ16ZPMQFGspcDShjY=
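To confirm that the three settings above are actually present, the file can be checked key by key. The following is a minimal sketch, not part of Nix or Nix AI: the helper names `conf_has_key` and `check_nix_conf` are hypothetical, and the default path `/etc/nix/nix.conf` is an assumption that holds for a standard multi-user (daemon) install.

```shell
# Hypothetical helper (not part of Nix or Nix AI): succeed if the file $1
# contains an uncommented "KEY = ..." line for the key $2.
conf_has_key() {
    grep -Eq "^[[:space:]]*$2[[:space:]]*=" "$1"
}

# Print every setting from the list above that is still missing.
# Returns 0 if all three are present, 1 otherwise.
check_nix_conf() {
    conf_file=${1:-/etc/nix/nix.conf}   # assumed default location
    missing=0
    for key in experimental-features substituters trusted-public-keys; do
        if ! conf_has_key "$conf_file" "$key"; then
            echo "missing: $key"
            missing=1
        fi
    done
    return "$missing"
}
```

Calling `check_nix_conf` with no argument inspects the assumed default path; pass another path as the first argument if your nix.conf lives elsewhere. The check only looks for the keys, not for the specific substituter URL or public key values.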
Install Nix Extra Containers
git clone https://github.com/erikarvstedt/extra-container
extra-container/util/install.sh
Clone Nix AI
git clone https://git.wavelens.io/public/nix-ai
cd nix-ai
nix flake check
Setup Nix AI Containers for Hydra
nix run .#buildContainer_hydra -- create --start
machinectl shell hydra /bin/sh
su hydra-queue-runner
cd ~
Edit the ~/.ssh/config file (create the ~/.ssh directory first if it does not exist) to include the following lines:
Host [BUILDER_IP]
HostName [BUILDER_IP]
User builder
Port 65535
IdentityFile ~/.ssh/id_builder
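If you would rather script this step than edit the file by hand, the entry above can be generated and appended once. This is a minimal sketch under stated assumptions: the helper names `ssh_host_stanza` and `add_ssh_host` are hypothetical and not part of Nix AI, and the address in the usage note is a documentation placeholder standing in for [BUILDER_IP].

```shell
# Hypothetical helper: print an ssh_config entry for one host.
# Arguments: host, user, port, identity file.
ssh_host_stanza() {
    printf 'Host %s\n' "$1"
    printf '    HostName %s\n' "$1"
    printf '    User %s\n' "$2"
    printf '    Port %s\n' "$3"
    printf '    IdentityFile %s\n' "$4"
}

# Hypothetical helper: append the entry to the config file given as the
# first argument, unless a "Host <host>" line is already there.
add_ssh_host() {
    config=$1
    shift
    mkdir -p "$(dirname "$config")"
    touch "$config"
    chmod 600 "$config"
    # -F: literal match, -x: whole line, so repeated runs add nothing.
    if ! grep -Fxq "Host $1" "$config"; then
        ssh_host_stanza "$@" >> "$config"
    fi
}
```

For example, `add_ssh_host ~/.ssh/config 192.0.2.10 builder 65535 '~/.ssh/id_builder'` writes the same five lines as shown above, with 192.0.2.10 in place of [BUILDER_IP]. The identity path is quoted so that the literal `~` is written to the file and expanded by ssh itself.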
Create a new SSH key for the builder user:
mkdir -p ~/.ssh
ssh-keygen -t ed25519 -f ~/.ssh/id_builder -C builder
After adding the public key (~/.ssh/id_builder.pub) to the authorized_keys file on the builder machine, connect to the builder once to verify the setup, then switch to the hydra user:
ssh [BUILDER_IP]
exit # exit the builder machine
exit # exit the hydra-queue-runner user
su hydra
cd ~
Create a new SSH key for the git server:
mkdir -p ~/.ssh
ssh-keygen -t ed25519 -f ~/.ssh/id_git -C git
Create the ~/.ssh/config file with the following content:
Host [GIT SERVER DOMAIN]
HostName [GIT SERVER DOMAIN]
User git
Port 22
IdentityFile ~/.ssh/id_git
Copy the public key to use it as a deploy key for the git repository:
cat ~/.ssh/id_git.pub
exit # exit the hydra user
Create an admin account for Hydra:
hydra-create-user admin --password-prompt --role admin
Set Up Nix AI Containers for a BuildMachine
nix run .#buildContainer_builder -- create --start
machinectl shell builder@builder /bin/sh
Add the public key to the authorized_keys file:
mkdir -p ~/.ssh
echo "[BUILDER PUBLIC KEY]" >> ~/.ssh/authorized_keys
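The two commands above append the key every time they are run and leave file permissions at their defaults. A slightly more defensive variant is sketched below; it is an illustration, not part of Nix AI, and the helper name `add_authorized_key` is hypothetical. It sets the conventional restrictive permissions, since sshd with StrictModes enabled (the default) refuses key files that other users can write to.

```shell
# Hypothetical helper: add a public key to <ssh_dir>/authorized_keys once.
# Arguments: ssh directory, public key line.
add_authorized_key() {
    ssh_dir=$1
    pub_key=$2
    mkdir -p "$ssh_dir"
    chmod 700 "$ssh_dir"
    touch "$ssh_dir/authorized_keys"
    chmod 600 "$ssh_dir/authorized_keys"
    # -F: literal match, -x: whole line, so a repeated run adds nothing.
    if ! grep -Fxq -e "$pub_key" "$ssh_dir/authorized_keys"; then
        printf '%s\n' "$pub_key" >> "$ssh_dir/authorized_keys"
    fi
}
```

Inside the builder container this would be called as `add_authorized_key ~/.ssh "[BUILDER PUBLIC KEY]"`, with the placeholder replaced by the contents of id_builder.pub from the Hydra container.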
Finalize by setting up the Nix AI environment
Hydra should now be running at http://[HYDRA_IP]:4444. You can access the Hydra interface by visiting this URL in your browser.
Download the Nix AI Template for your ML project:
git clone https://git.wavelens.io/public/nix-ai-template
You can now start developing your ML project using Nix AI!
By using NixOS the Linux Distribution
[Coming Soon]
Updating
By using Nix the Package Manager
Step 1: Back up the current Systemd Container /var/lib folder and the /etc/ssh folder
Step 2: Pull the latest changes from the Nix AI repository
Step 3: Run the following commands to update the Nix AI Containers
nix run .#buildContainer_hydra -- update
or, if you want to update the BuildMachine:
nix run .#buildContainer_builder -- update
Step 4: If necessary, restore the backup to the Systemd Container /var/lib folder and the /etc/ssh folder
You have successfully updated your Nix AI Containers!
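The backup in Step 1 can be taken with a plain tar archive. The following is a minimal sketch, assuming tar with gzip support is available; the helper name `backup_dirs` and the archive naming scheme are hypothetical and not part of Nix AI, and the exact location of the container's /var/lib folder depends on your setup, so it is left as a placeholder in the usage note.

```shell
# Hypothetical helper: archive the given directories into
# <dest_dir>/nix-ai-backup-<timestamp>.tar.gz and print the archive path.
# Arguments: destination directory, then one or more directories to back up.
backup_dirs() {
    dest_dir=$1
    shift
    mkdir -p "$dest_dir"
    archive="$dest_dir/nix-ai-backup-$(date +%Y%m%d-%H%M%S).tar.gz"
    tar -czf "$archive" "$@"
    printf '%s\n' "$archive"
}
```

Run as root, a call such as `backup_dirs /root/backups [CONTAINER VAR LIB FOLDER] /etc/ssh` would cover both folders named in Step 1. For Step 4, an archive created this way can be unpacked with `tar -xzf <archive> -C /`, because tar stores absolute paths relative to the filesystem root.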
By using NixOS the Linux Distribution
Run the following commands to update Nix AI:
cd [NixOS Configuration Folder]
nix flake update
nixos-rebuild switch --flake .#
You have successfully updated your NixOS System!