Hugging Face Transformers Pipeline

Posted on Thu, Feb 10, 2022 NLP MLDL Framework

Generation Options

Options for generative models such as GPT and T5
Pipelines

The pipelines are a great and easy way to use models for inference. These pipelines are objects that abstract most of the complex code from the library, offering a simple API dedicated to several tasks, including Named Entity Recognition, Masked Language Modeling, Sentiment Analysis, Feature Extraction and Question Answering.

Generation: options passed when creating the Pipeline object

from transformers import pipeline

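pipe = pipeline(
    task,         # the first positional argument is the task name, e.g. 'text-generation'
    model=model,  # <-- arguments given here apply when the pipeline object is created
    # ...
)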

Generation: typical usage example

pipe = pipeline(
    'text2text-generation',
    model='KETI-AIR/ke-t5-base',
    # tokenizer='KETI-AIR/ke-t5-base', # <- can be omitted when it is the same as the model
)

Generation: options passed when calling the Pipeline object

pipe(
    'text',  # <-- arguments given here apply each time the pipeline object is called
    do_sample=True,
    max_length=128,
    # ...
)
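
Because call-time options are plain keyword arguments, a set of defaults can be bound once and overridden per call. The following is a minimal sketch, assuming `pipe` is any callable that accepts keyword arguments (a pipeline object would qualify); `with_defaults` is a hypothetical helper, not part of the transformers library.

```python
def with_defaults(pipe, **defaults):
    """Return a callable that forwards to `pipe` with default call-time options.

    `with_defaults` is a hypothetical helper, not a transformers API.
    """
    def call(text, **overrides):
        # Per-call overrides win over the bound defaults.
        options = {**defaults, **overrides}
        return pipe(text, **options)
    return call


# Usage sketch (requires transformers and a model download):
# from transformers import pipeline
# pipe = pipeline('text2text-generation', model='KETI-AIR/ke-t5-base')
# generate = with_defaults(pipe, do_sample=True, max_length=128)
# generate('text')                 # uses the bound defaults
# generate('text', max_length=32)  # overrides max_length for this call only
```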

Generation: typical usage example

from transformers import pipeline

pipe = pipeline('text-generation', model='beomi/kcgpt2')

args = ...  # parsed options (elided here); x below is the input text

res = pipe(
    x,
    # Sample only when at least one sampling option is set.
    do_sample=bool(args.g_top_p or args.g_top_k or args.g_temperature),
    top_p=args.g_top_p if args.g_top_p else None,
    top_k=args.g_top_k if args.g_top_k else None,
    temperature=args.g_temperature if args.g_temperature else None,
    no_repeat_ngram_size=2,
    early_stopping=True,
    max_new_tokens=args.g_max_length,
    num_return_sequences=args.g_num_return_sequences,  # had no effect with text2text-generation
)
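
The conditional expressions above can be collected in one place. The following is a minimal sketch, assuming unset options arrive as `None` or `0`; `build_generation_kwargs` is a hypothetical helper, its default values are placeholders, and it leaves unset sampling options out of the call instead of passing `None`.

```python
def build_generation_kwargs(top_p=None, top_k=None, temperature=None,
                            max_new_tokens=64, num_return_sequences=1):
    """Build call-time options for a text-generation pipeline call.

    Hypothetical helper with placeholder defaults: sampling is switched on
    only when at least one sampling option is set.
    """
    sampling = {'top_p': top_p, 'top_k': top_k, 'temperature': temperature}
    # Keep only the sampling options that were actually set (truthy).
    sampling = {name: value for name, value in sampling.items() if value}
    kwargs = {
        'do_sample': bool(sampling),
        'no_repeat_ngram_size': 2,
        'max_new_tokens': max_new_tokens,
        'num_return_sequences': num_return_sequences,
    }
    kwargs.update(sampling)
    return kwargs


# Usage sketch: res = pipe(x, **build_generation_kwargs(top_p=0.9))
print(build_generation_kwargs(top_p=0.9, max_new_tokens=32))
```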

Classifier Options

Options for classifiers such as BERT for Sequence Classification

from transformers import pipeline

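device = -1  # CPU
# device = 0  # GPU 0
toxic_classifier = pipeline(
    "text-classification",
    model='beomi/beep-KcELECTRA-base-hate',
    device=device,
    # Return a score for every label, not only the best one.
    # Newer transformers releases deprecate this in favor of top_k=None.
    return_all_scores=True,
)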
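
With `return_all_scores=True`, the classifier returns a score for every label rather than only the best one. The following is a minimal sketch of picking the top label, assuming the result for one input is a list of `{'label': ..., 'score': ...}` dicts; `top_label` is a hypothetical helper, and the sample labels and scores are made up for illustration.

```python
def top_label(result):
    """Return (label, score) of the highest-scoring label.

    Hypothetical helper. `result` is assumed to be a list of
    {'label': str, 'score': float} dicts for ONE input; if it is wrapped
    in an extra outer list, that wrapper is removed first.
    """
    if result and isinstance(result[0], list):
        result = result[0]
    best = max(result, key=lambda item: item['score'])
    return best['label'], best['score']


# Made-up sample output, for illustration only (not real model scores).
sample = [[
    {'label': 'none', 'score': 0.10},
    {'label': 'offensive', 'score': 0.25},
    {'label': 'hate', 'score': 0.65},
]]
print(top_label(sample))  # ('hate', 0.65)
```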

Options passed when calling the Pipeline object