.. _getting_started:

###############
Getting Started
###############

Deploy Ragger Duck
==================

To ease the deployment, we rely on `pixi`. Refer to the `pixi` documentation
for installation instructions. In short, on the currently supported platforms,
the following should be enough::

    curl -fsSL https://pixi.sh/install.sh | bash

In the later stages, `pixi` is in charge of creating the Python environments
used to build the scikit-learn documentation, train the retrievers, and launch
the Web Console. We have already set up several environments for you, depending
on the platform and hardware at your disposal:

- `cpu`: a cross-platform environment (i.e. Linux and macOS on x86_64 and
  arm64);
- `mps`: an environment for macOS on M1/M2/M3 chips;
- `cuda-12-1`: an environment for Linux on x86_64 machines with GPU support. We
  used it to run experiments on a Scaleway instance that provides an L4 GPU;
- `cuda-11-7`: similar to `cuda-12-1` but relying on CUDA 11.7 instead of 12.1.

Note that you can modify `pixi.toml` to create your own environments, since the
CUDA version used in the `cuda-12-1` or `cuda-11-7` environment might not suit
your needs.

Cloning the project
-------------------

The GitHub repository contains all the source files needed to build the RAG.
You need to clone the repository recursively to get the scikit-learn source
files as a submodule::

    git clone --recursive git@github.com:probabl-ai/sklearn-ragger-duck.git

Install dependencies using `pixi`
---------------------------------

The subsequent steps require some dependencies to be installed. They are
defined in the `pixi.lock` file and can be installed using `pixi install`.
However, you need to specify which of the environments listed above you want to
use.
Here, we will use the `cpu` environment::

    pixi install --frozen -e cpu

Build the scikit-learn documentation
------------------------------------

First, we need to build the scikit-learn documentation since some of the
retrievers rely on the generated HTML pages. You can build the documentation by
running the following command::

    pixi run --frozen build-doc-sklearn

Train the semantic and lexical retrievers
-----------------------------------------

We need to train a set of lexical and semantic retrievers on the API
documentation, the user guide, and the gallery of examples. There are separate
retrievers for each of these types of documentation. Refer to
:ref:`user_guide` for more details on the strategy used to train the
retrievers.

You can launch the training of the retrievers by running the following
command::

    pixi run --frozen train-retrievers

`pixi` might prompt you to select a specific environment for the training. You
can also specify the environment explicitly by running the following command::

    pixi run --frozen -e cpu train-retrievers

Download the Large Language Model
---------------------------------

You need to get a Large Language Model (LLM). For testing purposes, you can get
the Mistral 7B model by running the following command::

    pixi run --frozen fetch-mistral-7b

Launch the Web Console
----------------------

Now, you are all set to start the Web Console. Launch it by running the
following command::

    pixi run --frozen start-ragger-duck

You will also be asked to select an environment, depending on the hardware to
which you want to offload the LLM. Then, you can access the Web Console at the
following address::

    http://127.0.0.1:8123

Use the Ragger Duck library
===========================

When using `pixi` as discussed earlier, Ragger Duck is installed in editable
mode in the environment. Ragger Duck can also be installed via `pip`::

    pip install -e .

Note that this does not install any of the dependencies, since they are
hardware dependent and are better handled with `pixi`.
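
As a recap of the `pixi` workflow described in the deployment section, the
commands can be collected in a small shell helper. This is only a sketch:
`ragger_duck_steps` is a hypothetical function name that is not part of Ragger
Duck or `pixi`, and it assumes that every task can be run with the `-e` option
in the chosen environment. The helper only prints the commands, it does not
execute them::

    #!/usr/bin/env bash
    # Sketch only: print the deployment commands from this guide, in order,
    # for a single pixi environment. Nothing is executed here.

    # Hypothetical helper; the name is an assumption made for illustration.
    ragger_duck_steps() {
        # Environment name: cpu (default), mps, cuda-12-1, or cuda-11-7.
        local env_name="${1:-cpu}"
        local task

        # Install the locked dependencies for the chosen environment.
        printf 'pixi install --frozen -e %s\n' "$env_name"

        # Tasks in the order used in this guide. Assumption: each task
        # is defined for the chosen environment.
        for task in build-doc-sklearn train-retrievers fetch-mistral-7b start-ragger-duck; do
            printf 'pixi run --frozen -e %s %s\n' "$env_name" "$task"
        done
    }

For example, `ragger_duck_steps cpu` prints the five commands for the `cpu`
environment so that they can be reviewed before being run one by one.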