detector_benchmark.generation¶

Submodules¶

Classes¶

ArticleGenerator

Helper class that provides a standard way to create an ABC using inheritance.

GenParamsAttack

Helper class that provides a standard way to create an ABC using inheritance.

LLMGenerator

PromptAttack

Helper class that provides a standard way to create an ABC using inheritance.

PromptParaphrasingAttack

Helper class that provides a standard way to create an ABC using inheritance.

GenLoader

AttackLoader

Package Contents¶

class detector_benchmark.generation.ArticleGenerator(gen_model: detector_benchmark.generation.generator.LLMGenerator, gen_config: detector_benchmark.utils.configs.ModelConfig, gen_prompt_config: detector_benchmark.utils.configs.PromptConfig, max_sample_len: int, watermarking_scheme: detector_benchmark.watermark.auto_watermark.AutoWatermark = None)¶

Bases: abc.ABC

Helper class that provides a standard way to create an ABC using inheritance.

gen_model¶
gen_prompt_config¶
gen_model_config¶
max_sample_len¶
watermarking_scheme¶
attack_name = ''¶
watermarking_scheme_name = ''¶
gen_name¶
generate_text(prefixes, batch_size=1) → list[str]¶

Takes a list of input contexts and generates text using the model.

Parameters:¶

prefixes: list

A list of input contexts for text generation.

batch_size: int

The batch size to use for generation.

Returns:¶

fake_articles: list

A list of generated text.

set_attack_name(attack_name: str) → None¶

Public setter for the attack name.

Parameters:¶

attack_name: str

The name of the attack.

set_watermarking_scheme_name(watermarking_scheme_name: str) → None¶

Public setter for the watermarking scheme name.

Parameters:¶

watermarking_scheme_name: str

The name of the watermarking scheme.

abstract generate_adversarial_text(prefixes: list, batch_size: int = 1) → list[str]¶

Adversarial version of text generation. Every attack must generate text at some point, either by generating it in a specific way or by modifying text that has already been generated.

Parameters:¶

prefixes: list

A list of input contexts for text generation.

batch_size: int

The batch size to use for generation.
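A minimal sketch of how a concrete attack might implement this abstract method. The classes below are stand-ins written for illustration and are not the package's own code: `ToyArticleGenerator` and `UppercaseAttack` are hypothetical names, no model is involved, and only the method names and signatures mirror the ones documented above.

```python
from abc import ABC, abstractmethod


class ToyArticleGenerator(ABC):
    """Stand-in for ArticleGenerator (hypothetical, no model involved)."""

    attack_name = ""

    def generate_text(self, prefixes: list[str], batch_size: int = 1) -> list[str]:
        # A real generator would call the language model here; this toy
        # version only appends a fixed continuation to each prefix.
        return [prefix + " ..." for prefix in prefixes]

    def set_attack_name(self, attack_name: str) -> None:
        self.attack_name = attack_name

    @abstractmethod
    def generate_adversarial_text(
        self, prefixes: list[str], batch_size: int = 1
    ) -> list[str]:
        """Subclasses decide how the adversarial text is produced."""


class UppercaseAttack(ToyArticleGenerator):
    """Hypothetical attack that modifies the text after generation."""

    attack_name = "uppercase_attack"

    def generate_adversarial_text(
        self, prefixes: list[str], batch_size: int = 1
    ) -> list[str]:
        generated = self.generate_text(prefixes, batch_size=batch_size)
        return [text.upper() for text in generated]


attack = UppercaseAttack()
adversarial = attack.generate_adversarial_text(["Breaking news:"])
```

Because `generate_adversarial_text` is abstract, the base class cannot be instantiated directly; only subclasses that override it can.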

class detector_benchmark.generation.GenParamsAttack(gen_model: detector_benchmark.generation.generator.LLMGenerator, gen_config: detector_benchmark.utils.configs.ModelConfig, gen_prompt_config: detector_benchmark.utils.configs.PromptConfig, adversarial_gen_params: dict, max_sample_len: int, watermarking_scheme: detector_benchmark.watermark.auto_watermark.AutoWatermark = None)¶

Bases: detector_benchmark.generation.article_generator.ArticleGenerator

Helper class that provides a standard way to create an ABC using inheritance.

adversarial_gen_params¶
attack_name = 'gen_parameters_attack'¶
generate_adversarial_text(prefixes: list[str], batch_size: int = 1) → list[str]¶

Generate text with adversarial generation parameters.

Parameters:¶

prefixes: list

A list of input contexts for text generation.

batch_size: int

The batch size to use for generation.

Returns:¶

fake_articles: list

A list of generated text.
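The page does not say how `adversarial_gen_params` is combined with the generator's default parameters. The sketch below assumes the simplest policy, where adversarial values override the defaults key by key. `merge_gen_params` is a hypothetical helper, and the parameter names are common Hugging Face generation arguments chosen only as examples.

```python
def merge_gen_params(base_params: dict, adversarial_gen_params: dict) -> dict:
    """Return a new dict in which adversarial values override the defaults.

    Assumption: a plain key-by-key override. The actual class may combine
    the two dictionaries differently.
    """
    merged = dict(base_params)  # copy so the defaults are left untouched
    merged.update(adversarial_gen_params)
    return merged


base = {"temperature": 0.8, "top_p": 0.95, "max_new_tokens": 200}
adversarial_params = {"temperature": 1.2, "repetition_penalty": 1.3}
merged = merge_gen_params(base, adversarial_params)
```

Copying before updating keeps the default parameters intact, so the same generator could still be used for non-adversarial generation afterwards.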

class detector_benchmark.generation.LLMGenerator(model: transformers.AutoModelForCausalLM, model_config: detector_benchmark.utils.configs.ModelConfig)¶

Bases: torch.nn.Module

generator¶
tokenizer¶
device¶
gen_params¶
forward(samples: list, batch_size: int = 1, watermarking_scheme: detector_benchmark.watermark.auto_watermark.AutoWatermark | None = None) → list[str]¶

Generate text from a list of input contexts.

Parameters:¶

samples: list

A list of input contexts for text generation.

batch_size: int

The batch size to use for generation. Defaults to 1.

watermarking_scheme: AutoWatermark

The watermarking scheme to use for generation. If provided, it should be an instance of LogitsProcessor. Defaults to None.

Returns:¶

list[str]

A list of generated texts.
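To illustrate what `batch_size` controls, here is a minimal sketch of batched generation. It assumes the input list is split into consecutive chunks and the outputs are concatenated in order; `generate_in_batches` and `generate_batch` are hypothetical names, and the callable stands in for the tokenization and model call, which are not shown on this page.

```python
from typing import Callable


def generate_in_batches(
    samples: list[str],
    batch_size: int,
    generate_batch: Callable[[list[str]], list[str]],
) -> list[str]:
    """Run generate_batch on consecutive chunks of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    outputs: list[str] = []
    for start in range(0, len(samples), batch_size):
        batch = samples[start:start + batch_size]
        outputs.extend(generate_batch(batch))
    return outputs


seen_batches = []


def fake_model(batch: list[str]) -> list[str]:
    # Stand-in for the real model: records the batch and echoes each input.
    seen_batches.append(list(batch))
    return [text + " [generated]" for text in batch]


texts = generate_in_batches(["a", "b", "c"], batch_size=2, generate_batch=fake_model)
```

With three samples and a batch size of 2, the last batch holds a single sample, and the output order matches the input order.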

forward_debug(samples: list, batch_size: int = 1, watermarking_scheme: detector_benchmark.watermark.auto_watermark.AutoWatermark | None = None) → list[str]¶

Takes a list of input contexts and generates text using the model.

Parameters:¶

samples: list

A list of input contexts for text generation.

batch_size: int

The batch size to use for generation.

watermarking_scheme: LogitsProcessor

The watermarking scheme to use for generation.

Returns:¶

tuple

A tuple containing the generated texts, the raw logits, and the processed logits.

class detector_benchmark.generation.PromptAttack(gen_model: detector_benchmark.generation.generator.LLMGenerator, gen_config: detector_benchmark.utils.configs.ModelConfig, gen_prompt_config: detector_benchmark.utils.configs.PromptConfig, adversarial_prompt_config: detector_benchmark.utils.configs.PromptConfig, max_sample_len: int, watermarking_scheme: detector_benchmark.watermark.auto_watermark.AutoWatermark = None)¶

Bases: detector_benchmark.generation.article_generator.ArticleGenerator

Helper class that provides a standard way to create an ABC using inheritance.

adversarial_prompt_config¶
attack_name = 'prompt_attack'¶
generate_adversarial_text(prefixes: list[str], batch_size: int = 1) → list[str]¶

Generate text with an (adversarial) prompt.

Parameters:¶

prefixes: list[str]

A list of input contexts for text generation.

batch_size: int

The batch size to use for generation.

Returns:¶

list[str]

A list of generated text.

class detector_benchmark.generation.PromptParaphrasingAttack(gen_model: detector_benchmark.generation.generator.LLMGenerator, gen_config: detector_benchmark.utils.configs.ModelConfig, gen_prompt_config: detector_benchmark.utils.configs.PromptConfig, paraphraser_model: detector_benchmark.generation.generator.LLMGenerator, paraphraser_config: detector_benchmark.utils.configs.ModelConfig, paraphraser_prompt_config: detector_benchmark.utils.configs.PromptConfig, max_sample_len: int, watermarking_scheme: detector_benchmark.watermark.auto_watermark.AutoWatermark = None)¶

Bases: detector_benchmark.generation.article_generator.ArticleGenerator

Helper class that provides a standard way to create an ABC using inheritance.

paraphraser_model¶
paraphraser_prompt_config¶
model_config¶
attack_name = 'paraphrasing_attack'¶
paraphrase(texts: list[str], nb_paraphrasing: int = 1, batch_size: int = 1) → list[str]¶

Paraphrasing function used after the initial text generation.

Parameters:¶

texts: list

Initial generated texts to be paraphrased.

nb_paraphrasing: int

Number of recursive paraphrasing rounds to apply.

batch_size: int

The batch size to use for generation.

Returns:¶

list

A list of paraphrased generated texts.
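Recursive paraphrasing can be read as feeding each round's output into the next round. The sketch below illustrates that reading under the assumption that each round processes the whole list; `paraphrase_recursively` and `paraphrase_once` are hypothetical names, and the callable stands in for a call to the paraphraser model.

```python
from typing import Callable


def paraphrase_recursively(
    texts: list[str],
    paraphrase_once: Callable[[list[str]], list[str]],
    nb_paraphrasing: int = 1,
) -> list[str]:
    """Apply paraphrase_once nb_paraphrasing times, chaining the outputs."""
    current = list(texts)
    for _ in range(nb_paraphrasing):
        current = paraphrase_once(current)
    return current


def toy_paraphraser(batch: list[str]) -> list[str]:
    # Stand-in for the paraphraser model: marks each text as rewritten once.
    return [f"rewritten({text})" for text in batch]


result = paraphrase_recursively(["hello"], toy_paraphraser, nb_paraphrasing=2)
```

With `nb_paraphrasing=0` the input texts come back unchanged, which gives a convenient baseline for comparing detection results.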

generate_adversarial_text(prefixes: list[str], batch_size: int = 1) → list[str]¶

Generate text with paraphrasing.

Parameters:¶

prefixes: list

A list of input contexts for text generation.

batch_size: int

The batch size to use for generation.

Returns:¶

list

A list of generated text.

class detector_benchmark.generation.GenLoader(model_name: str, gen_params: dict, device: str, gen_tokenizer_only: bool = False)¶
model_name¶
gen_params¶
device¶
gen_tokenizer_only¶
load() → tuple[torch.nn.Module, detector_benchmark.generation.generator.LLMGenerator, detector_benchmark.utils.configs.ModelConfig]¶

Load the specified generator model (from init) and tokenizer.

Returns:¶

torch.nn.Module: The loaded generator model.

LLMGenerator: The loaded generator model.

ModelConfig: The configuration of the generator model.

class detector_benchmark.generation.AttackLoader(cfg: omegaconf.DictConfig, attack_type: str, gen_model: detector_benchmark.generation.generator.LLMGenerator, model_config: detector_benchmark.utils.configs.ModelConfig, max_sample_len: int, watermarking_scheme: detector_benchmark.watermark.auto_watermark.AutoWatermark | None = None, paraphraser_model: detector_benchmark.generation.generator.LLMGenerator | None = None, paraphraser_config: detector_benchmark.utils.configs.ModelConfig | None = None)¶
cfg¶
attack_type¶
gen_model¶
model_config¶
max_sample_len¶
watermarking_scheme¶
paraphraser_model¶
paraphraser_config¶
load() → detector_benchmark.generation.article_generator.ArticleGenerator¶

Load the attack.

Returns:¶

ArticleGenerator: The attack.
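A loader of this kind typically maps `attack_type` to one of the attack classes. The sketch below shows one possible dispatch using a registry of factories. It is an assumption, not the package's implementation: `build_attack` is a hypothetical helper, the accepted `attack_type` strings are not listed on this page (the keys below simply reuse the documented `attack_name` values), and strings stand in for the constructed attack objects.

```python
from typing import Callable


def build_attack(attack_type: str, registry: dict[str, Callable[[], object]]) -> object:
    """Look up attack_type in the registry and build the matching attack."""
    try:
        factory = registry[attack_type]
    except KeyError:
        known = ", ".join(sorted(registry))
        raise ValueError(
            f"Unknown attack type {attack_type!r}; expected one of: {known}"
        ) from None
    return factory()


# Hypothetical registry: a real loader would construct GenParamsAttack,
# PromptAttack or PromptParaphrasingAttack with the arguments it was given.
registry = {
    "gen_parameters_attack": lambda: "GenParamsAttack instance",
    "prompt_attack": lambda: "PromptAttack instance",
    "paraphrasing_attack": lambda: "PromptParaphrasingAttack instance",
}

attack = build_attack("prompt_attack", registry)
```

Raising a `ValueError` that lists the known keys makes a misspelled attack type in the configuration easy to diagnose.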