detector_benchmark.detector¶

Submodules¶

Classes¶

BertDetector

Detector

Helper class that provides a standard way to create an ABC using inheritance.

FastDetectGPT

WatermarkDetector

DetectorLoader

GPTZero

Package Contents¶

class detector_benchmark.detector.BertDetector(model: torch.nn.Module, tokenizer: transformers.PreTrainedTokenizerBase, device: str)¶

Bases: detector_benchmark.detector.detector.Detector

model¶
tokenizer¶
device¶
detect(texts: list[str], batch_size: int, detection_threshold: float = 0.0) tuple[list[int], list[float], list[int]]¶

Detect whether the input texts are AI-generated (label 1) or human-written (label 0). Returns the labels predicted with argmax, the logits of the positive class, and the labels predicted with the given detection threshold instead of argmax.

Parameters:¶

texts: list[str]

The texts to run detection on

batch_size: int

The batch size to use for detection

detection_threshold: float

The threshold to use for detection. Default is 0.0.

Returns:¶

tuple[list[int], list[float], list[int]]

The labels predicted with argmax, the logits of the positive class, and the labels predicted with the given detection threshold instead of argmax.
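The difference between the first and third return values can be illustrated with a minimal sketch. It assumes two-class logits ordered as [human, AI] and a strict `>` comparison against the threshold; neither detail is confirmed by this page, and `labels_from_logits` is a hypothetical helper, not part of the package.

```python
def labels_from_logits(logits, detection_threshold=0.0):
    """Derive the three outputs from per-text [human_logit, ai_logit] pairs.

    Hypothetical helper: the class order and the strict comparison are
    assumptions made for illustration.
    """
    # Label chosen by argmax over the two class logits.
    preds_argmax = [max(range(len(row)), key=row.__getitem__) for row in logits]
    # Logit of the positive (AI-generated) class.
    pos_logits = [row[1] for row in logits]
    # Label chosen by comparing the positive logit to the threshold.
    preds_threshold = [int(score > detection_threshold) for score in pos_logits]
    return preds_argmax, pos_logits, preds_threshold


batch = [[2.0, -1.0], [0.3, 0.5], [1.5, 0.2]]
preds, pos_logits, preds_at_threshold = labels_from_logits(batch, detection_threshold=0.0)
print(preds)               # [0, 1, 0]
print(preds_at_threshold)  # [0, 1, 1]
```

The third text shows why both label lists are returned: argmax picks the human class because its logit is larger, while the threshold rule flags the text because the positive logit alone exceeds 0.0.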

class detector_benchmark.detector.Detector¶

Bases: abc.ABC

Helper class that provides a standard way to create an ABC using inheritance.

abstract detect(texts: list[str], batch_size: int, detection_threshold: float = 0.0) tuple[list[int], list[float], list[int]]¶

Run detection on the texts. What is detected (for example AI-generated text or a watermark) depends on the concrete subclass.

Parameters:¶

texts: list[str]

The texts to run detection on

batch_size: int

The batch size

detection_threshold: float

The threshold to use for the detection

Returns:¶

tuple[list[int], list[float], list[int]]

The predictions, the logits, and the predictions at the threshold
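A custom detector only has to implement `detect` with this signature. The sketch below defines a local stand-in for `Detector` so that it runs without the package installed; `WordCountDetector` and its scoring rule are invented purely for illustration.

```python
from abc import ABC, abstractmethod


class Detector(ABC):
    """Local stand-in mirroring the documented abstract interface."""

    @abstractmethod
    def detect(self, texts: list[str], batch_size: int,
               detection_threshold: float = 0.0) -> tuple[list[int], list[float], list[int]]:
        ...


class WordCountDetector(Detector):
    """Toy detector (hypothetical): the score is the word count minus 5."""

    def detect(self, texts, batch_size, detection_threshold=0.0):
        preds, scores, preds_at_threshold = [], [], []
        # Walk over the texts in chunks of batch_size, as a real model would.
        for start in range(0, len(texts), batch_size):
            for text in texts[start:start + batch_size]:
                score = float(len(text.split()) - 5)
                scores.append(score)
                preds.append(int(score > 0.0))
                preds_at_threshold.append(int(score > detection_threshold))
        return preds, scores, preds_at_threshold


detector = WordCountDetector()
result = detector.detect(
    ["short text", "a considerably longer piece of text with many words"],
    batch_size=1,
    detection_threshold=5.0,
)
print(result)  # ([0, 1], [-3.0, 4.0], [0, 0])
```

Because `detect` is abstract, instantiating `Detector` directly raises `TypeError`; only subclasses that implement the method can be created.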

class detector_benchmark.detector.FastDetectGPT(ref_model, scoring_model, ref_tokenizer, scoring_tokenizer, device)¶

Bases: detector_benchmark.detector.detector.Detector

ref_model¶
scoring_model¶
ref_tokenizer¶
scoring_tokenizer¶
device¶
get_samples(logits, labels) torch.Tensor¶

Get the samples from the logits.

Parameters:¶

logits: torch.Tensor

The logits

labels: torch.Tensor

The labels

get_likelihood(logits, labels) torch.Tensor¶

Get the likelihood from the logits.

Parameters:¶

logits: torch.Tensor

The logits

labels: torch.Tensor

The labels
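As a rough illustration of what a likelihood computed from logits looks like, the sketch below takes the log-softmax of each position's logits, picks the entry for the observed token, and averages over positions. Averaging rather than summing, and the plain-list inputs in place of tensors, are assumptions; `mean_log_likelihood` is a hypothetical name.

```python
import math


def log_softmax(row):
    """Numerically stable log-softmax over one list of logits."""
    m = max(row)
    log_z = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - log_z for x in row]


def mean_log_likelihood(logits, labels):
    """Average log-probability of the observed tokens.

    logits: one list of vocabulary logits per position.
    labels: the observed token id at each position.
    """
    token_log_probs = [log_softmax(row)[label] for row, label in zip(logits, labels)]
    return sum(token_log_probs) / len(token_log_probs)


# Two positions, vocabulary of size 2.
logits = [[0.0, 0.0], [math.log(3.0), 0.0]]
labels = [0, 0]
likelihood = mean_log_likelihood(logits, labels)
print(round(likelihood, 4))  # -0.4904
```

The first position is uniform (probability 0.5) and the second assigns 0.75 to the observed token, so the result is the mean of log 0.5 and log 0.75.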

get_sampling_discrepancy(logits_ref, logits_score, labels) torch.Tensor¶

Get the sampling discrepancy from the logits.

Parameters:¶

logits_ref: torch.Tensor

The logits of the reference model

logits_score: torch.Tensor

The logits of the scoring model

labels: torch.Tensor

The labels

get_sampling_discrepancy_analytic(logits_ref, logits_score, labels) torch.Tensor¶

Get the analytic sampling discrepancy from the logits.

Parameters:¶

logits_ref: torch.Tensor

The logits of the reference model

logits_score: torch.Tensor

The logits of the scoring model

labels: torch.Tensor

The labels
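The following sketch follows the analytic criterion from the published Fast-DetectGPT method. The observed log-likelihood under the scoring model is compared with its expectation under the reference distribution and normalised by the standard deviation. Whether this package implements it exactly this way is an assumption, and plain lists stand in for tensors.

```python
import math


def log_softmax(row):
    """Numerically stable log-softmax over one list of logits."""
    m = max(row)
    log_z = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - log_z for x in row]


def analytic_discrepancy(logits_ref, logits_score, labels):
    """Normalised gap between observed and expected log-likelihood.

    Each argument holds one entry per token position. Sketch only: the
    variance is assumed to be non-zero.
    """
    ll_sum = mean_sum = var_sum = 0.0
    for ref_row, score_row, label in zip(logits_ref, logits_score, labels):
        lprobs_score = log_softmax(score_row)
        probs_ref = [math.exp(x) for x in log_softmax(ref_row)]
        # Expected log-prob and its variance when tokens are drawn from the reference model.
        mean = sum(p * lp for p, lp in zip(probs_ref, lprobs_score))
        var = sum(p * lp * lp for p, lp in zip(probs_ref, lprobs_score)) - mean * mean
        ll_sum += lprobs_score[label]
        mean_sum += mean
        var_sum += var
    return (ll_sum - mean_sum) / math.sqrt(var_sum)


row = [math.log(3.0), 0.0]  # probabilities 0.75 and 0.25
likely = analytic_discrepancy([row], [row], [0])
unlikely = analytic_discrepancy([row], [row], [1])
print(round(likely, 4), round(unlikely, 4))  # 0.5774 -1.7321
```

A token the model considers likely yields a positive discrepancy and an unlikely one a negative discrepancy, which is the signal the detector thresholds.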

class ProbEstimatorFastDetectGPT(args=None, ref_path=None)¶

Probability estimator for the FastDetectGPT detector.

real_crits = []¶
fake_crits = []¶
crit_to_prob(crit) float¶

Convert the criterion to probability.

Parameters:¶

crit: float

The criterion

Returns:¶

float

The probability
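One plausible way to turn a criterion into a probability is a local frequency estimate over reference criteria for human-written and machine-generated texts, which is the role `real_crits` and `fake_crits` appear to play. The sketch below is an assumption about the approach, not the package's implementation; the window parameter `k` and the fallback value 0.5 are hypothetical.

```python
def crit_to_prob_sketch(crit, real_crits, fake_crits, k=5):
    """Estimate P(machine-generated) from reference criteria near `crit`.

    Hypothetical helper: the window half-width is the distance to the
    k-th closest reference value (0-indexed).
    """
    distances = sorted(abs(c - crit) for c in real_crits + fake_crits)
    offset = distances[min(k, len(distances) - 1)]
    # Count reference values of each kind strictly inside the window.
    cnt_real = sum(1 for c in real_crits if crit - offset < c < crit + offset)
    cnt_fake = sum(1 for c in fake_crits if crit - offset < c < crit + offset)
    total = cnt_real + cnt_fake
    # With an empty window there is no evidence either way.
    return cnt_fake / total if total else 0.5


real_crits = [-1.0, -0.5, 0.0, 0.5]  # made-up criteria of human-written texts
fake_crits = [1.0, 1.5, 2.0, 2.5]    # made-up criteria of machine-generated texts
high = crit_to_prob_sketch(1.25, real_crits, fake_crits)
low = crit_to_prob_sketch(-0.25, real_crits, fake_crits)
print(high, low)  # 0.75 0.2
```

A criterion that sits among the machine-generated reference values maps to a high probability, and one that sits among the human-written values maps to a low one.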

detect(texts: list[str], batch_size: int, detection_threshold: float = 0.5) tuple[list[int], list[float], list[int]]¶

Detect whether the texts are AI-generated.

Parameters:¶

texts: list[str]

The texts to run detection on

batch_size: int

The batch size

detection_threshold: float

The detection threshold

Returns:¶

tuple[list[int], list[float], list[int]]

The predictions, the probabilities and the predictions at the threshold

class detector_benchmark.detector.WatermarkDetector(watermarking_scheme: detector_benchmark.watermark.auto_watermark.AutoWatermark, detection_threshold: float)¶

Bases: detector_benchmark.detector.detector.Detector

watermarking_scheme¶
detection_threshold¶
detect(texts: list[str], batch_size: int, detection_threshold: float) tuple[list[int], list[float], list[int]]¶

Detect whether the input texts are watermarked (label 1) or not (label 0).

Parameters:¶

texts: list[str]

The texts to run detection on

batch_size: int

The batch size

detection_threshold: float

The threshold to use for the detection

Returns:¶

tuple[list[int], list[float], list[int]]

The predictions, the logits for the positive class, and the predictions at the threshold

class detector_benchmark.detector.DetectorLoader(cfg: dict, detector_name: str, device: str, weights_checkpoint: str = None, local_weights: bool = False)¶
cfg¶
detector_name¶
device¶
weights_checkpoint¶
local_weights¶
load() detector_benchmark.detector.detector.Detector¶

Load the detector specified by the configuration passed to the constructor.

Returns:¶

Detector

The loaded detector
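The loader pattern (store the settings at construction, build the detector on `load()`) can be sketched with a name-to-builder lookup. Everything in the sketch is hypothetical, including `LoaderSketch`, `DummyDetector`, and the "dummy" key. It does not reproduce how `DetectorLoader` resolves names or reads `cfg`.

```python
class DummyDetector:
    """Placeholder for a real detector (hypothetical)."""

    def __init__(self, device):
        self.device = device


class LoaderSketch:
    """Hypothetical loader: keeps the settings and dispatches on the name."""

    def __init__(self, cfg, detector_name, device):
        self.cfg = cfg
        self.detector_name = detector_name
        self.device = device
        # Map each supported name to the method that builds the detector.
        self._builders = {"dummy": self._build_dummy}

    def _build_dummy(self):
        return DummyDetector(self.device)

    def load(self):
        try:
            builder = self._builders[self.detector_name]
        except KeyError:
            raise ValueError(f"Unknown detector: {self.detector_name!r}") from None
        return builder()


loaded = LoaderSketch(cfg={}, detector_name="dummy", device="cpu").load()
print(type(loaded).__name__, loaded.device)  # DummyDetector cpu
```

Keeping construction separate from `load()` means an invalid name fails at load time with a clear error instead of surfacing later during detection.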

class detector_benchmark.detector.GPTZero(api_key, debug_mode=False)¶

Bases: detector_benchmark.detector.detector.Detector

api_key¶
debug_mode¶
predict_gpt_zero(text, api_key, debug_mode=False) dict¶

Predict the GPT-Zero score for the text.

Parameters:¶

text: str

The text to predict

api_key: str

The API key

debug_mode: bool

Whether to print the debug information

Returns:¶

dict

The prediction result

detect(texts: list, batch_size: int, detection_threshold: float = 0.5) tuple[list[int], list[float], list[int]]¶

Score the texts with GPTZero and derive labels from the scores.

Parameters:¶

texts: list

The texts to run detection on

batch_size: int

The batch size

detection_threshold: float

The threshold to use for the detection

Returns:¶

tuple[list[int], list[float], list[int]]

The predictions, the logits for the positive class, and the predictions at the threshold