AutoTrain Advanced

AutoTrain Advanced: faster and easier training and deployment of state-of-the-art machine learning models. AutoTrain Advanced is a no-code solution that lets you train machine learning models in just a few clicks. Please note that you must upload data in the correct format for the project to be created. For help with the correct data format and with pricing, see the documentation.

NOTE: AutoTrain is free! You only pay for the resources you use if you decide to run AutoTrain on Hugging Face Spaces. When running locally, you only pay for the resources you use on your own infrastructure.

Supported Tasks

Task	Status	Python Notebook	Example Configs
LLM SFT Finetuning	✅		llm_sft_finetune.yaml
LLM ORPO Finetuning	✅		llm_orpo_finetune.yaml
LLM DPO Finetuning	✅		llm_dpo_finetune.yaml
LLM Reward Finetuning	✅		llm_reward_finetune.yaml
LLM Generic/Default Finetuning	✅		llm_generic_finetune.yaml
Text Classification	✅		text_classification.yaml
Text Regression	✅		text_regression.yaml
Token Classification	✅	Coming Soon	token_classification.yaml
Seq2Seq	✅	Coming Soon	seq2seq.yaml
Extractive Question Answering	✅	Coming Soon	extractive_qa.yaml
Image Classification	✅	Coming Soon	image_classification.yaml
Image Scoring/Regression	✅	Coming Soon	image_regression.yaml
VLM	?	Coming Soon	vlm.yaml

Running the UI on Colab or Hugging Face Spaces

Deploy AutoTrain on Hugging Face Spaces:
Run the AutoTrain UI on Colab via ngrok:

Local Installation

You can install the AutoTrain Advanced Python package via pip. Please note that you need Python >= 3.10 for AutoTrain Advanced to work properly.

pip install autotrain-advanced

Please make sure that you have git lfs installed. See the instructions here: https://github.com/git-lfs/git-lfs/wiki/Installation

You also need to install torch, torchaudio, and torchvision.
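
The prerequisites listed above (Python >= 3.10, git lfs, and the three PyTorch packages) can be checked before installing anything else. The following is a minimal sketch using only the standard library; the function name `check_prerequisites` is hypothetical and not part of AutoTrain, and the check assumes git lfs is installed as a `git-lfs` executable on your PATH.

```python
import importlib.util
import shutil
import sys


def check_prerequisites(min_python=(3, 10)):
    """Return a dict mapping each prerequisite name to True (found) or False."""
    results = {
        # sys.version_info compares element-wise against a plain tuple
        "python>=%d.%d" % min_python: sys.version_info >= min_python,
        # Assumption: git lfs is available as a `git-lfs` executable on PATH
        "git-lfs": shutil.which("git-lfs") is not None,
    }
    for module in ("torch", "torchaudio", "torchvision"):
        # find_spec locates a top-level package without importing it
        results[module] = importlib.util.find_spec(module) is not None
    return results


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{'ok' if ok else 'MISSING':8}{name}")
```

Anything reported as MISSING can then be installed with the commands in this section.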

The best way to run AutoTrain is in a conda environment. You can create a new conda environment with the following commands:

conda create -n autotrain python=3.10
conda activate autotrain
pip install autotrain-advanced
conda install pytorch torchvision torchaudio pytorch-cuda=12.1 -c pytorch -c nvidia
conda install -c "nvidia/label/cuda-12.1.0" cuda-nvcc

Once done, you can start the application with:

autotrain app --port 8080 --host 127.0.0.1
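
If the app fails to start, one possible cause is that another process already listens on the chosen port. The sketch below is not part of AutoTrain; `port_is_free` is a hypothetical helper that tests whether anything accepts connections on the host and port used in the command above.

```python
import socket


def port_is_free(host="127.0.0.1", port=8080):
    """Return True if nothing is currently accepting connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1.0)
        # connect_ex returns 0 when the connection succeeds, i.e. the port is taken
        return sock.connect_ex((host, port)) != 0


if __name__ == "__main__":
    if port_is_free():
        print("Port 8080 looks free.")
    else:
        print("Port 8080 is in use; pass a different value to --port.")
```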

If you prefer not to use the UI, you can train from the command line using AutoTrain configs, or simply use the AutoTrain CLI.

To train with a config file, use the following command:

autotrain --config <path_to_config_file>

Example config files can be found in the configs directory of this repository.

Example config file for finetuning SmolLM2:

task: llm-sft
base_model: HuggingFaceTB/SmolLM2-1.7B-Instruct
project_name: autotrain-smollm2-finetune
log: tensorboard
backend: local

data:
  path: HuggingFaceH4/no_robots
  train_split: train
  valid_split: null
  chat_template: tokenizer
  column_mapping:
    text_column: messages

params:
  block_size: 2048
  model_max_length: 4096
  epochs: 2
  batch_size: 1
  lr: 1e-5
  peft: true
  quantization: int4
  target_modules: all-linear
  padding: right
  optimizer: paged_adamw_8bit
  scheduler: linear
  gradient_accumulation: 8
  mixed_precision: bf16
  merge_adapter: true

hub:
  username: ${HF_USERNAME}
  token: ${HF_TOKEN}
  push_to_hub: true
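
To see how batch_size and gradient_accumulation in the params section interact, the following sketch computes the effective batch size per optimizer step. It assumes a single device and that block_size caps the number of tokens per training sequence; `effective_batch_size` and its `num_devices` argument are hypothetical names used only for this illustration.

```python
# Values copied from the `params` section of the example config
params = {"batch_size": 1, "gradient_accumulation": 8, "block_size": 2048}


def effective_batch_size(batch_size, gradient_accumulation, num_devices=1):
    """Sequences that contribute to one optimizer step."""
    return batch_size * gradient_accumulation * num_devices


sequences_per_step = effective_batch_size(
    params["batch_size"], params["gradient_accumulation"]
)
# Upper bound on tokens per step, assuming block_size limits each sequence
max_tokens_per_step = sequences_per_step * params["block_size"]

print(sequences_per_step)   # 8
print(max_tokens_per_step)  # 16384
```

With these settings, gradients from 8 sequences are accumulated before each weight update, even though only one sequence is processed at a time.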

To fine-tune a model using the config file above, use the following commands:

$ export HF_USERNAME=<your_hugging_face_username>
$ export HF_TOKEN=<your_hugging_face_write_token>
$ autotrain --config <path_to_config_file>
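
The same three steps can be driven from a Python script. This is a minimal sketch rather than an AutoTrain API: the helper names `build_autotrain_run` and `run_autotrain` are hypothetical, and the only CLI call used is the `autotrain --config` command shown above. The environment variables match the `${HF_USERNAME}` and `${HF_TOKEN}` placeholders in the example config.

```python
import os
import subprocess


def build_autotrain_run(config_path, username, token, base_env=None):
    """Return (argv, env) for running `autotrain --config <config_path>`."""
    # Copy the environment so the caller's os.environ is left untouched
    env = dict(os.environ if base_env is None else base_env)
    env["HF_USERNAME"] = username
    env["HF_TOKEN"] = token
    argv = ["autotrain", "--config", str(config_path)]
    return argv, env


def run_autotrain(config_path, username, token):
    """Run the CLI and raise CalledProcessError on a non-zero exit status."""
    argv, env = build_autotrain_run(config_path, username, token)
    return subprocess.run(argv, env=env, check=True)
```

Read the token from a secret store or an existing environment variable rather than hardcoding it in the script.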

Documentation

Documentation is available at https://hf.co/docs/autotrain/

Citation

 @inproceedings{thakur-2024-autotrain,
    title = "{A}uto{T}rain: No-code training for state-of-the-art models",
    author = "Thakur, Abhishek",
    booktitle = "Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing: System Demonstrations",
    month = nov,
    year = "2024",
    address = "Miami, Florida, USA",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2024.emnlp-demo.44",
    pages = "419--423",
    abstract = "With the advancements in open-source models, training(or finetuning) models on custom datasets has become a crucial part of developing solutions which are tailored to specific industrial or open-source applications. Yet, there is no single tool which simplifies the process of training across different types of modalities or tasks.We introduce AutoTrain(aka AutoTrain Advanced){---}an open-source, no code tool/library which can be used to train (or finetune) models for different kinds of tasks such as: large language model (LLM) finetuning, text classification/regression, token classification, sequence-to-sequence task, finetuning of sentence transformers, visual language model (VLM) finetuning, image classification/regression and even classification and regression tasks on tabular data. AutoTrain Advanced is an open-source library providing best practices for training models on custom datasets. The library is available at https://github.com/huggingface/autotrain-advanced. AutoTrain can be used in fully local mode or on cloud machines and works with tens of thousands of models shared on Hugging Face Hub and their variations.",
}
