detector_benchmark.detector¶

Classes¶

`BertDetector`
`Detector`	Helper class that provides a standard way to create an ABC using inheritance.
`FastDetectGPT`
`WatermarkDetector`
`DetectorLoader`
`GPTZero`

Package Contents¶

class detector_benchmark.detector.BertDetector(model: torch.nn.Module, tokenizer: transformers.PreTrainedTokenizerBase, device: str)¶

Bases: detector_benchmark.detector.detector.Detector

model¶

tokenizer¶

device¶

detect(texts: list[str], batch_size: int, detection_threshold: float = 0.0) → tuple[list[int], list[float], list[int]]¶

Detect whether the input texts are AI-generated (label 1) or human-written (label 0). Returns the labels predicted with argmax, the logits of the positive class, and the labels predicted with the given detection threshold instead of argmax.

Parameters:¶

texts: list[str]
The texts to detect

batch_size: int
The batch size to use for detection

detection_threshold: float
The threshold to use for detection. Default is 0.0.

Returns:¶

tuple[list[int], list[float], list[int]]
The labels predicted with argmax, the logits of the positive class, and the labels predicted with the given detection threshold instead of argmax.
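
How the three returned lists relate to each other can be sketched in plain Python. This is a minimal illustration, assuming each text yields a pair of logits `(human, ai)`; the helper name `labels_from_logits`, the input layout, and the strict `>` comparisons are assumptions made for this example and are not taken from the package.

```python
def labels_from_logits(logit_pairs, detection_threshold=0.0):
    """Derive the three outputs from per-text (human_logit, ai_logit) pairs.

    Hypothetical helper for illustration only.
    """
    # Argmax over the two classes: label 1 when the AI logit is the larger one.
    preds = [1 if ai > human else 0 for human, ai in logit_pairs]
    # Logit of the positive (AI-generated) class for each text.
    pos_logits = [ai for _, ai in logit_pairs]
    # Labels obtained by thresholding the positive-class logit instead.
    preds_at_threshold = [1 if ai > detection_threshold else 0 for ai in pos_logits]
    return preds, pos_logits, preds_at_threshold


pairs = [(2.0, -1.0), (-0.5, 1.5), (0.2, 0.4)]
preds, pos_logits, preds_at_threshold = labels_from_logits(pairs, detection_threshold=1.0)
```

The third text shows why both label lists are returned: argmax labels it 1, while a stricter threshold of 1.0 labels it 0.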

class detector_benchmark.detector.Detector¶

Bases: abc.ABC

Helper class that provides a standard way to create an ABC using inheritance.

abstract detect(texts: list[str], batch_size: int, detection_threshold: float = 0.0) → tuple[list[int], list[float], list[int]]¶

Run detection on the texts.

Parameters:¶

texts: list[str]
The texts to run detection on

batch_size: int
The batch size

detection_threshold: float
The threshold to use for the detection

Returns:¶

tuple[list[int], list[float], list[int]]
The predictions, the logits, and the predictions at the threshold
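
A subclass only has to implement `detect` with this signature and return the three lists. The sketch below uses a local stand-in for `Detector` so that it runs without the package installed; `LengthDetector` and its word-count scoring are hypothetical and exist only to show the shape of the interface.

```python
from abc import ABC, abstractmethod


class Detector(ABC):
    """Local stand-in that mirrors the documented interface."""

    @abstractmethod
    def detect(self, texts: list[str], batch_size: int,
               detection_threshold: float = 0.0) -> tuple[list[int], list[float], list[int]]:
        ...


class LengthDetector(Detector):
    """Toy detector (hypothetical): scores a text by its word count."""

    def detect(self, texts, batch_size, detection_threshold=0.0):
        scores = []
        # Process the texts in chunks of batch_size, as a real detector would.
        for start in range(0, len(texts), batch_size):
            batch = texts[start:start + batch_size]
            scores.extend(float(len(text.split())) for text in batch)
        preds = [1 if s > 3.0 else 0 for s in scores]  # fixed toy decision rule
        preds_at_threshold = [1 if s > detection_threshold else 0 for s in scores]
        return preds, scores, preds_at_threshold


detector = LengthDetector()
preds, scores, preds_at_threshold = detector.detect(
    ["short text", "a somewhat longer piece of text"], batch_size=1, detection_threshold=1.0
)
```

Because `detect` is abstract, instantiating `Detector` directly raises `TypeError`; only concrete subclasses can be created.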

class detector_benchmark.detector.FastDetectGPT(ref_model, scoring_model, ref_tokenizer, scoring_tokenizer, device)¶

Bases: detector_benchmark.detector.detector.Detector

ref_model¶

scoring_model¶

ref_tokenizer¶

scoring_tokenizer¶

device¶

get_samples(logits, labels) → torch.Tensor¶

Get the samples from the logits.

Parameters:¶

logits: torch.Tensor
The logits

labels: torch.Tensor
The labels

get_likelihood(logits, labels) → torch.Tensor¶

Get the likelihood from the logits.

Parameters:¶

logits: torch.Tensor
The logits

labels: torch.Tensor
The labels

get_sampling_discrepancy(logits_ref, logits_score, labels) → torch.Tensor¶

Get the sampling discrepancy from the logits.

Parameters:¶

logits_ref: torch.Tensor
The logits of the reference model

logits_score: torch.Tensor
The logits of the scoring model

labels: torch.Tensor
The labels

get_sampling_discrepancy_analytic(logits_ref, logits_score, labels) → torch.Tensor¶

Get the sampling discrepancy from the logits, computed analytically.

Parameters:¶

logits_ref: torch.Tensor
The logits of the reference model

logits_score: torch.Tensor
The logits of the scoring model

labels: torch.Tensor
The labels
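
The page does not spell out the computation. A minimal sketch follows, assuming the method uses the analytic formulation from the Fast-DetectGPT paper: the log-likelihood the scoring model assigns to the observed tokens is standardised by its expectation and variance under the reference model's distribution. The sketch uses plain Python lists for a single sequence, whereas the documented method takes `torch.Tensor` inputs and may differ in detail; the function names are hypothetical.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over one vocabulary-sized list."""
    peak = max(logits)
    log_norm = peak + math.log(sum(math.exp(v - peak) for v in logits))
    return [v - log_norm for v in logits]


def sampling_discrepancy_analytic(logits_ref, logits_score, labels):
    """Sketch of an analytic discrepancy for one sequence.

    logits_ref, logits_score: one list of vocabulary logits per token position.
    labels: the observed token id at each position.
    """
    log_likelihood = mean_total = var_total = 0.0
    for ref_row, score_row, token in zip(logits_ref, logits_score, labels):
        lprobs_score = log_softmax(score_row)
        probs_ref = [math.exp(lp) for lp in log_softmax(ref_row)]
        # Log-probability the scoring model assigns to the observed token.
        log_likelihood += lprobs_score[token]
        # Expected log-probability, and its variance, under the reference model.
        mean = sum(p * lp for p, lp in zip(probs_ref, lprobs_score))
        second_moment = sum(p * lp * lp for p, lp in zip(probs_ref, lprobs_score))
        mean_total += mean
        var_total += second_moment - mean * mean
    # Standardise the observed log-likelihood against the expectation.
    return (log_likelihood - mean_total) / math.sqrt(var_total)


logits = [[2.0, 0.5, -1.0], [0.0, 1.5, -0.5]]
likely = sampling_discrepancy_analytic(logits, logits, labels=[0, 1])
unlikely = sampling_discrepancy_analytic(logits, logits, labels=[2, 2])
```

A sequence made of the model's most probable tokens scores above zero, and one made of its least probable tokens scores below zero, which is the signal a threshold can then separate.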

class ProbEstimatorFastDetectGPT(args=None, ref_path=None)¶

Probability estimator for the FastDetectGPT detector.

real_crits = []¶

fake_crits = []¶

crit_to_prob(crit) → float¶

Convert the criterion to probability.

Parameters:¶

crit: float
The criterion

Returns:¶

float
The probability
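
One plausible way to turn a criterion into a probability from the stored `real_crits` and `fake_crits` is a nearest-neighbour window estimate: take the reference criteria closest to `crit` and report the fraction that came from AI-generated text. The sketch below is an assumption about the approach, not the package's confirmed implementation; the standalone function signature, the parameter `k`, and the 0.5 fallback are hypothetical.

```python
def crit_to_prob(crit, real_crits, fake_crits, k=3):
    """Estimate P(AI-generated) from labelled reference criteria near `crit`."""
    distances = sorted(abs(c - crit) for c in real_crits + fake_crits)
    if not distances:
        return 0.5  # no reference data: uninformative estimate
    # Distance from crit to its (k+1)-th nearest reference value sets the window.
    offset = distances[min(k, len(distances) - 1)]
    cnt_real = sum(1 for c in real_crits if crit - offset < c < crit + offset)
    cnt_fake = sum(1 for c in fake_crits if crit - offset < c < crit + offset)
    total = cnt_real + cnt_fake
    # With no neighbours in the window, fall back to an uninformative 0.5.
    return cnt_fake / total if total else 0.5


real_crits = [-1.0, -0.5, 0.0, 0.5, 1.0]  # criteria of human-written references
fake_crits = [2.0, 2.5, 3.0, 3.5, 4.0]    # criteria of AI-generated references
p_fake_region = crit_to_prob(3.0, real_crits, fake_crits)
p_real_region = crit_to_prob(0.0, real_crits, fake_crits)
p_boundary = crit_to_prob(1.5, real_crits, fake_crits)
```

A criterion deep in the AI-generated range maps to a probability near 1, one in the human range maps near 0, and a value between the two groups lands in the middle.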

detect(texts: list[str], batch_size: int, detection_threshold: float = 0.5) → tuple[list[int], list[float], list[int]]¶

Detect whether the texts are AI-generated.

Parameters:¶

texts: list[str]
The texts to classify

batch_size: int
The batch size

detection_threshold: float
The detection threshold

Returns:¶

tuple[list[int], list[float], list[int]]
The predictions, the probabilities and the predictions at the threshold

class detector_benchmark.detector.WatermarkDetector(watermarking_scheme: detector_benchmark.watermark.auto_watermark.AutoWatermark, detection_threshold: float)¶

Bases: detector_benchmark.detector.detector.Detector

watermarking_scheme¶

detection_threshold¶

detect(texts: list[str], batch_size: int, detection_threshold: float) → tuple[list[int], list[float], list[int]]¶

Detect whether the input texts are watermarked (label 1) or not (label 0).

Parameters:¶

texts: list[str]
The texts to detect

batch_size: int
The batch size

detection_threshold: float
The threshold to use for the detection

Returns:¶

tuple[list[int], list[float], list[int]]
The predictions, the logits for the positive class, and the predictions at the threshold

class detector_benchmark.detector.DetectorLoader(cfg: dict, detector_name: str, device: str, weights_checkpoint: str = None, local_weights: bool = False)¶

cfg¶

detector_name¶

device¶

weights_checkpoint¶

local_weights¶

load() → detector_benchmark.detector.detector.Detector¶

Load the detector specified by the configuration given at initialization.

Returns:¶

Detector
The loaded detector

class detector_benchmark.detector.GPTZero(api_key, debug_mode=False)¶

Bases: detector_benchmark.detector.detector.Detector

api_key¶

debug_mode¶

predict_gpt_zero(text, api_key, debug_mode=False) → dict¶

Predict the GPT-Zero score for the text.

Parameters:¶

text: str
The text to predict

api_key: str
The API key

debug_mode: bool
Whether to print the debug information

Returns:¶

dict
The prediction result

detect(texts: list, batch_size: int, detection_threshold: float = 0.5) → tuple[list[int], list[float], list[int]]¶

Detect whether the texts are AI-generated using their GPT-Zero scores.

Parameters:¶

texts: list
The texts to detect

batch_size: int
The batch size

detection_threshold: float
The threshold to use for the detection

Returns:¶

tuple[list[int], list[float], list[int]]
The predictions, the logits for the positive class, and the predictions at the threshold