detector_benchmark.pipeline
===========================

.. py:module:: detector_benchmark.pipeline


Submodules
----------

.. toctree::
   :maxdepth: 1

   /autoapi/detector_benchmark/pipeline/create_dataset_pipeline/index
   /autoapi/detector_benchmark/pipeline/experiment_pipeline/index
   /autoapi/detector_benchmark/pipeline/experiment_test_detector_pipeline/index
   /autoapi/detector_benchmark/pipeline/pipeline_utils/index
   /autoapi/detector_benchmark/pipeline/text_quality_pipeline/index


Classes
-------

.. autoapisummary::

   detector_benchmark.pipeline.CreateDatasetPipeline
   detector_benchmark.pipeline.ExperimentTestDetectorPipeline
   detector_benchmark.pipeline.TextQualityPipeline


Package Contents
----------------

.. py:class:: CreateDatasetPipeline(cfg: dict, dataset_loader: detector_benchmark.dataset_loader.FakeTruePairsDataLoader, attack: detector_benchmark.generation.ArticleGenerator, experiment_path: str, batch_size: int = 1, skip_cache: bool = False, skip_train_split: bool = False)

   Bases: :py:obj:`detector_benchmark.pipeline.experiment_pipeline.ExperimentPipeline`

   Pipeline that creates the fake/true dataset for an experiment by generating fake articles with the given generator.

   .. py:attribute:: cfg

   .. py:attribute:: dataset_loader

   .. py:attribute:: attack

   .. py:attribute:: experiment_path

   .. py:attribute:: batch_size

   .. py:attribute:: generator_name

   .. py:attribute:: experiment_name

   .. py:attribute:: skip_cache

   .. py:attribute:: skip_train_split

   .. py:attribute:: log

   .. py:method:: create_experiment_dataset() -> datasets.Dataset

      Create the fake true dataset for the experiment by generating fake articles using the generator.

      Returns:
      -------
      dataset: Dataset
          The generated fake true dataset.

   .. py:method:: run_pipeline()

      Main function of the class, runs the pipeline: creates the dataset for the experiment and saves it.


.. py:class:: ExperimentTestDetectorPipeline(cfg: dict, detector: detector_benchmark.detector.Detector, experiment_path: str, non_attack_experiment_path: str, dataset_experiment_path: str, batch_size: int = 1)

   Bases: :py:obj:`detector_benchmark.pipeline.experiment_pipeline.ExperimentPipeline`

   Pipeline that evaluates a detector on an experiment dataset at a target false positive rate (FPR).

   .. py:attribute:: cfg

   .. py:attribute:: experiment_path

   .. py:attribute:: non_attack_experiment_path

   .. py:attribute:: batch_size

   .. py:attribute:: detector_name

   .. py:attribute:: dataset_experiment_path

   .. py:attribute:: dataset_experiment_name

   .. py:attribute:: experiment_name

   .. py:attribute:: target_fpr

   .. py:attribute:: log

   .. py:attribute:: detector

   .. py:method:: find_threshold(eval_set: datasets.Dataset, target_fpr: float) -> float

      Find the detection threshold for the target false positive rate (FPR) on the evaluation set.
      Uses the FPR and threshold values returned by sklearn's ``roc_curve`` function.

      Parameters:
      ----------
      eval_set: Dataset
          The evaluation set to use for finding the threshold.
      target_fpr: float
          The target false positive rate to find the threshold for.

      Returns:
      -------
      threshold: float
          The detection threshold for the target FPR.
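   For illustration, here is a minimal sketch of how a threshold for a target FPR can be
   derived from ``roc_curve``. The selection rule, function name, and the use of raw
   labels/logits below are assumptions for the sketch, not this package's implementation:

   .. code-block:: python

      # Illustrative sketch only (not detector_benchmark's code): pick the
      # smallest threshold whose FPR still does not exceed the target.
      import numpy as np
      from sklearn.metrics import roc_curve

      def find_threshold_sketch(labels: list[int], logits: list[float], target_fpr: float) -> float:
          fpr, _, thresholds = roc_curve(labels, logits)
          valid = np.where(fpr <= target_fpr)[0]
          # fpr is increasing and thresholds decreasing, so the last valid index
          # is the most permissive threshold that respects the target FPR.
          return float(thresholds[valid[-1]])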
   .. py:method:: evaluate_detector(preds: list[int], logits: list[float], preds_at_threshold: list[int], labels: list[int], dataset: datasets.Dataset, detection_threshold: float, data_split: str = 'test')

      Use the predictions and labels to compute the metrics of the detector and save them.

      Parameters:
      ----------
      preds: list[int]
          The predictions of the detector.
      logits: list[float]
          The logits of the detector.
      preds_at_threshold: list[int]
          The predictions of the detector at the given threshold.
      labels: list[int]
          The true labels of the dataset.
      dataset: Dataset
          The dataset used for testing the detector.
      detection_threshold: float
          The detection threshold used for the predictions at the given threshold.
      data_split: str
          The data split used for testing the detector. Default is "test".

   .. py:method:: run_pipeline()

      Main function of the pipeline. First finds the detection threshold for the target FPR
      on the evaluation set, then tests the detector on the test set using that threshold
      and saves the results.


.. py:class:: TextQualityPipeline(scorer: detector_benchmark.text_quality_evaluation.Scorer, dataset_path: str, dataset_path_compare: Optional[str] = None, batch_size: int = 64, return_loss_lists: bool = False, eval_human: bool = False)

   Bases: :py:obj:`detector_benchmark.pipeline.experiment_pipeline.ExperimentPipeline`

   Pipeline to evaluate the quality of text.

   .. py:attribute:: scorer

   .. py:attribute:: dataset

   .. py:attribute:: batch_size

   .. py:attribute:: return_loss_lists

   .. py:attribute:: eval_human

   .. py:method:: run_pipeline()
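As a usage illustration, here is a hypothetical invocation of ``TextQualityPipeline``. The
keyword names come from the signature above; the scorer instance and dataset paths are
placeholders, not values defined by the package:

.. code-block:: python

   # Hypothetical usage sketch: the scorer and the paths are placeholders.
   from detector_benchmark.pipeline import TextQualityPipeline

   scorer = ...  # any concrete detector_benchmark.text_quality_evaluation.Scorer
   pipeline = TextQualityPipeline(
       scorer=scorer,
       dataset_path="experiments/generated_dataset",  # placeholder path
       dataset_path_compare=None,  # optional second dataset to compare against
       batch_size=64,
       return_loss_lists=False,
       eval_human=False,
   )
   pipeline.run_pipeline()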